Lecture 2

Quantitative Methods
Bivariate Regression (OLS)
We’ll start with OLS regression.
Stands for Ordinary Least Squares regression. A relatively basic form of
multivariate regression.
Purpose: Prediction? Explanation? Assessing the effects of various
independent variables on a dependent variable.
Limits: Consider the construction of the dependent variable. There are
more appropriate methods to predict or explain the number of things
that happen, or when something happens, or the likelihood that
something will happen, and so on.
(but robustness is a plus, and the discussion can serve as a foundation for
other methods)
Bivariate Regression (OLS)
How many pieces of information do we need to
define a line?
Look at Figure A.
Slope? Intercept? Different ways to interpret
“slope”.
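As a quick illustration (the function name and points are mine, not from the lecture), two pieces of information pin down a line: from any two points you can recover both the slope and the intercept.

```python
# Sketch: recover the two pieces of information that define a line
# (slope and intercept) from two points. Names are illustrative.

def line_from_points(p1, p2):
    """Return (intercept, slope) of the line through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)   # "rise over run"
    intercept = y1 - slope * x1     # where the line crosses x = 0
    return intercept, slope

intercept, slope = line_from_points((1, 3), (3, 7))
```

One way to interpret the slope: the change in y associated with a one-unit change in x.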
Bivariate Regression (OLS)
Figure A is a “perfect linear relationship”.
Is this likely what our data look like?
Error....
Perfect Linear Relationships—and error
What about Figure B?
So, in OLS regression, we “fit” a line to the data.
What is the equation for a line represented in
Figure A?
How do we indicate that there is some error?
Notation
A detour about standard notation....
predicted versus population or “true”
slope, intercept, error
Apply this to figure C.
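The notation distinction the slide points to (predicted versus population or "true" quantities) is conventionally written with Greek letters for population parameters and hats for estimates:

```latex
% Population ("true") model:
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i

% Fitted (estimated) line:
\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i

% Residual (the observed error):
e_i = y_i - \hat{y}_i
```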
Fitting a Line
So, we fit a line to our data, in order to predict,
to explain / assess the effects of different
variables....
But how can we decide which “line” represents
the best fit?
Fitting a Line
Minimize sample errors (see Figure D1 and D2)
Minimize absolute errors (see Figure D3 and D4)
Minimize Squared Errors
(we’ll talk next week about estimating a
“principal components line”, which is what
can be used in other multivariate methods)
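Minimizing squared errors has a closed-form solution in the bivariate case. A minimal sketch (variable names are mine):

```python
def ols_fit(xs, ys):
    """Fit y = b0 + b1*x by minimizing the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # slope = sum of cross-deviations / sum of squared x-deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar   # the line passes through (x_bar, y_bar)
    return b0, b1

b0, b1 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])  # perfectly linear data
```

With perfectly linear data, the fitted line reproduces the true slope and intercept exactly, since every error can be driven to zero.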
OLS Assumptions
There are generally one of two consequences to violating any
OLS assumption.
Biased estimates—and what’s important here is to begin to
understand what is meant by “bias”.
And incorrect confidence intervals (standard errors that are
inflated or deflated). In other words, you are more or less
sure of your results than you really should be.
OLS Assumptions
Measurement error -- consequences
Specification error
1. Linear Relationship between X and Y. See
Graph E1 and E2. What will happen if we fit a
line to those data?
Non-linear relationships are very common,
and can be easily addressed.
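One common way a non-linear relationship is "easily addressed" is to transform the predictor before fitting the line. A hypothetical sketch with exactly quadratic data (the data and names are mine):

```python
def ols_fit(xs, ys):
    """Bivariate OLS: returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]          # y = x^2: clearly non-linear in x

# Fit on raw x: the straight line misses the curvature.
b0_raw, b1_raw = ols_fit(xs, ys)
sse_raw = sum((y - (b0_raw + b1_raw * x)) ** 2 for x, y in zip(xs, ys))

# Fit on the transformed predictor x^2: the relationship is now linear.
zs = [x ** 2 for x in xs]
b0_sq, b1_sq = ols_fit(zs, ys)
sse_sq = sum((y - (b0_sq + b1_sq * z)) ** 2 for z, y in zip(zs, ys))
```

The squared-error sum drops to zero after the transformation, because y is exactly linear in x².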
OLS Assumptions
2. Include all relevant variables in the model.
3. Do not include irrelevant variables in the
model. (Think of this as a degrees of freedom
or information issue—but it’s also just useful to
have parsimonious models.)
(but how do we decide what variables to include
in our model?)
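Omitted-variable bias (assumption 2) can be seen with a tiny made-up dataset: the true model below is y = 2x + 3z, but z is left out of the regression, so the estimated slope on x absorbs part of z's effect because x and z are correlated. All names and numbers here are illustrative.

```python
def ols_slope(xs, ys):
    """Slope from a bivariate OLS of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

x = [1, 2, 3, 4, 5, 6]
z = [2, 1, 4, 3, 6, 5]                           # correlated with x
y = [2 * xi + 3 * zi for xi, zi in zip(x, z)]    # true effect of x is 2

b1_short = ols_slope(x, y)   # "short" regression, omitting z
# b1_short = 2 + 3 * cov(x, z) / var(x), not 2: the omitted variable
# biases the estimated effect of x upward in this example.
```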
OLS Assumptions
Error Term Assumptions
The error terms should average out to zero, in the long run.
Note that the average residual value will always be
zero, as an artifact of the OLS regression calculations.
The variance of the error terms is constant across
observations. That is, we have homoskedastic errors; we
do not have heteroskedasticity. See Graphs F1 through
F4. (Note, however, that what appears to be
heteroskedasticity could be specification error).
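The mean-zero point on this slide is worth seeing numerically: OLS residuals always average to zero (up to floating-point error) by construction, so a zero average residual tells us nothing about whether the error-term assumption holds. A minimal sketch with made-up data:

```python
def ols_fit(xs, ys):
    """Bivariate OLS: returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]                 # noisy, not perfectly linear
b0, b1 = ols_fit(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)
# Individual residuals are nonzero, but their mean is zero by construction.
```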
OLS Assumptions
Error Term Assumptions
Any one residual is not correlated with any other residual.
That is, our error terms are not autocorrelated.
When is this most common?
And, we assume that the error term is uncorrelated with
each independent variable. (Again, this often reflects
specification error).