
Lecture Multivariate Regression I

2/7/24
MULTIVARIATE REGRESSION I

TODAY
1. The general linear model
2. Nonlinear variables and interactions
3. Empirical example
WE SAW THE SIMPLE LINEAR MODEL UNDER THE CLASSICAL ASSUMPTIONS

Yᵢ = α + βXᵢ + μᵢ , i = 1, …, n

1. E[μᵢ | Xᵢ] = 0, i = 1, …, n
2. V(μᵢ) = constant ≡ σ², i = 1, …, n
3. Cov(μᵢ, μⱼ) = 0 ∀ i ≠ j
4. The Xᵢ are not random and they are not all the same

MULTIPLE LINEAR REGRESSION MODEL

• We simply add explanatory variables:

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ⋯ + βₖXₖᵢ + μᵢ

• Now the regressors are X₁, …, Xₖ
• From now on, for convenience of notation, we will call the y-intercept of the model β₀.
• It is as if there were an explanatory variable X₀ = 1 for all observations.
• We will maintain all the classical assumptions except for a small modification in the second part of assumption 4.
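As a concrete illustration (not part of the lecture), the multiple regression above can be fit by OLS on simulated data; the coefficient values, sample size, and use of NumPy are my own assumptions:

```python
import numpy as np

# Simulated data for Y_i = b0 + b1*X1_i + b2*X2_i + mu_i
# (the true coefficient values are made up for illustration)
rng = np.random.default_rng(0)
n = 500
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
mu = rng.normal(size=n)            # error term with E[mu | X] = 0
Y = 1.0 + 2.0 * X1 - 0.5 * X2 + mu

# Design matrix: the column of ones plays the role of X0 = 1
X = np.column_stack([np.ones(n), X1, X2])
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # estimates should be close to (1.0, 2.0, -0.5)
```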
DEFINITION OF THE MULTIPLE LINEAR REGRESSION MODEL

“Explains variable y in terms of variables x1, x2, …, xk”

MOTIVATION FOR MULTIPLE REGRESSION

• Incorporate more explanatory factors into the model
• Explicitly hold fixed other factors that otherwise would end up in the error term
• Allow for more flexible functional forms
• Example: Wage equation
EXAMPLE: AVERAGE TEST SCORES AND PER STUDENT SPENDING

• Per student spending is likely to be correlated with average family income at a given high school because of school financing.
• Omitting average family income from the regression would lead to a biased estimate of the effect of spending on average test scores.
• In a simple regression model, the effect of per student spending would partly include the effect of family income on test scores.

INTERPRETATION OF THE PARAMETERS OF THE GENERAL LINEAR MODEL

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ⋯ + βₖXₖᵢ + μᵢ

• With E[μᵢ] = 0 and non-random regressors we have:

E[Yᵢ] = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ⋯ + βₖXₖᵢ

• The marginal effect of Xₖ is given by:

∂E[Yᵢ]/∂Xₖᵢ = βₖ
INTERPRETATION OF THE PARAMETERS OF THE GENERAL LINEAR MODEL

∂E[Yᵢ]/∂Xₖᵢ = βₖ

• βₖ measures the change in E[Yᵢ] associated with a marginal change in the kth explanatory variable, while keeping all other variables constant (partial derivative).
• The meaning of “marginal change” is tied to the units of measurement of the explanatory variable (1 cent, $1, $1 thousand, $1 million, etc.)

INTERPRETATION OF THE MULTIPLE REGRESSION MODEL

• The multiple linear regression model manages to hold the values of other explanatory variables fixed even if, in reality, they are correlated with the explanatory variable under consideration.
• “Ceteris paribus” interpretation
• It still has to be assumed that unobserved factors do not change when the explanatory variables are changed.
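The units point can be made concrete: rescaling X rescales its estimated coefficient inversely, leaving the implied effect unchanged. A small simulated sketch (variable names and values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x_dollars = rng.uniform(0, 50_000, size=n)        # X measured in dollars
y = 2.0 + 0.0004 * x_dollars + rng.normal(size=n)

def ols_slope(x, y):
    # Slope from a simple regression of y on x (with intercept)
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

b_dollars = ols_slope(x_dollars, y)               # effect of $1 more
b_thousands = ols_slope(x_dollars / 1000, y)      # effect of $1,000 more
print(b_thousands / b_dollars)  # 1000: same effect, different units
```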
LINEARITY

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ⋯ + βₖXₖᵢ + μᵢ

Note that this model is linear in the variables and in the parameters:
• Y is a linear function of the variables X₁, X₂, …, Xₖ
  ⇒ Linear model in the variables
• Y is a linear function of β₁, β₂, …, βₖ
  ⇒ Linear model in the parameters. OLS only requires linearity in the parameters.

EXAMPLE: DETERMINANTS OF COLLEGE GPA

• Interpretation:
  • Holding ACT fixed, another point on high school grade point average is associated with another .453 points of college grade point average.
  • Or: if we compare two students with the same ACT, but the hsGPA of student A is one point higher, we predict student A to have a colGPA that is .453 higher than that of student B.
  • Holding high school grade point average fixed, another 10 points on the ACT are associated with less than one point on college GPA.
WE SAW TWO MODELS BEFORE THE MIDTERM

Variables in logarithms:
• Logarithmic (log-log) model:
  ln Yᵢ = β₀ + β₁ ln X₁ᵢ + μᵢ → β₁ is an elasticity
• Semi-logarithmic (log-lin) model:
  ln Yᵢ = β₀ + β₁X₁ᵢ + μᵢ → β₁ is a semi-elasticity

EXAMPLE: CEO SALARY, SALES AND CEO TENURE

• Model assumes a constant elasticity relationship between CEO salary and the sales of his or her firm.
• Model assumes a quadratic relationship between CEO salary and his or her tenure with the firm.
• Meaning of “linear” regression: the model has to be linear in the parameters (not in the variables).
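A quick numeric check of the log-lin interpretation (the coefficient values here are illustrative, not estimates): 100·β₁ is only an approximation to the exact percentage change in Y, accurate when β₁ is small.

```python
import math

b0, b1 = 1.0, 0.5  # illustrative parameters of ln(Y) = b0 + b1*X

def y_loglin(x):
    # Implied level of Y in the log-lin model
    return math.exp(b0 + b1 * x)

# Exact % change in Y when X rises by one unit
pct = 100 * (y_loglin(1) - y_loglin(0)) / y_loglin(0)
print(round(pct, 1))  # 64.9 -- the exact change is 100*(e^b1 - 1),
                      # while the rule of thumb 100*b1 gives 50
```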
INTERPRETATION

• Important: we always have to think about which model we are in and what the units of measurement of the variables are.
• Suppose that Y and X are measured in pesos, and that β₁ = 0.5:

Model   | Equation                   | Marginal effect
Linear  | Yᵢ = β₀ + β₁Xᵢ + μᵢ        | If X increases by $1, Y increases by $0.50
Log-log | ln Yᵢ = β₀ + β₁ ln Xᵢ + μᵢ | If X increases by 1%, Y increases by 0.50%
Log-lin | ln Yᵢ = β₀ + β₁Xᵢ + μᵢ     | If X increases by $1, Y increases by 50% (0.50 × 100%)
Lin-log | Yᵢ = β₀ + β₁ ln Xᵢ + μᵢ    | If X increases by 1%, Y increases by $0.005 (0.5/100)

WHAT DO THESE DATA SUGGEST?
QUADRATIC VARIABLES

• Quadratic model in X:

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₁ᵢ² + μᵢ

• The marginal effect of X is now given by:

∂E[Yᵢ]/∂X₁ᵢ = β₁ + 2β₂X₁ᵢ

• The marginal effect is no longer constant: it depends on β₁, β₂ and the value we assign to X₁ᵢ.
• Not only does the magnitude of the effect depend on X: when β₁ and β₂ have different signs, the sign of the marginal effect also depends on the value we assign to X₁ᵢ.
• By construction, the linear model predicts a constant marginal effect of X on Y.
• Note that the marginal effect of the quadratic model changes with the value of X (here the marginal effect is increasingly negative).
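The sign change can be seen directly; this sketch uses made-up coefficients with β₁ > 0 and β₂ < 0:

```python
# Marginal effect in the quadratic model Y = b0 + b1*X + b2*X^2 + mu
b1, b2 = 0.8, -0.1  # illustrative values with opposite signs

def marginal_effect(x):
    # dE[Y]/dX = b1 + 2*b2*x
    return b1 + 2 * b2 * x

print(marginal_effect(0))   # 0.8: positive at X = 0
print(marginal_effect(4))   # 0.0: turning point at X = -b1/(2*b2)
print(marginal_effect(10))  # negative: the sign of the effect has flipped
```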
EXAMPLE: FAMILY INCOME AND FAMILY CONSUMPTION

• Model has two explanatory variables: income and income squared
• Consumption is explained as a quadratic function of income
• One has to be very careful when interpreting the coefficients

PROPERTIES OF OLS ON ANY SAMPLE OF DATA

• Fitted values and residuals
• Algebraic properties of OLS regression
GOODNESS-OF-FIT

• Decomposition of total variation
• R squared
• Alternative expression for R squared

EXAMPLE: EXPLAINING ARREST RECORDS

• Interpretation:
  • If the proportion of prior arrests increases by 0.5, the predicted fall in arrests is 7.5 arrests per 100 men.
  • If the months in prison increase from 0 to 12, the predicted fall in arrests is 0.408 arrests for a particular man.
  • If the quarters employed increase by 1, the predicted fall in arrests is 10.4 arrests per 100 men.
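The decomposition of total variation and both expressions for R squared can be verified on simulated data (a sketch; the data-generating values are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
resid = y - y_hat

sst = np.sum((y - y.mean()) ** 2)       # total variation
sse = np.sum((y_hat - y.mean()) ** 2)   # explained variation
ssr = np.sum(resid ** 2)                # residual variation

r2 = sse / sst          # R squared
r2_alt = 1 - ssr / sst  # alternative expression
print(r2, r2_alt)       # identical up to rounding, since SST = SSE + SSR
```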
STANDARD ASSUMPTIONS FOR THE MULTIPLE REGRESSION MODEL

• Assumption MLR.1 (Zero conditional mean)
• In a multiple regression model, the zero conditional mean assumption is much more likely to hold because fewer things end up in the error.
• Example: Average test scores

EXAMPLE: EXPLAINING ARREST RECORDS (CONT.)

• An additional explanatory variable is added.
• Interpretation:
  • Average prior sentence increases number of arrests (?)
  • Limited additional explanatory power, as R-squared increases by little
• General remark on R-squared:
  • Even if R-squared is small (as in the given example), regression may still provide good estimates of ceteris paribus effects.
ESTIMATION

• Including irrelevant variables in a regression model
• Omitting relevant variables: the simple case

MULTIPLE REGRESSION ANALYSIS: ESTIMATION

• Standard assumptions for the multiple regression model
• Assumption: Linear in parameters
• Assumption MLR.2 (Random sampling)
STANDARD ASSUMPTIONS FOR THE MULTIPLE REGRESSION MODEL (CONT.)

• Assumption MLR.4 (No perfect collinearity)
  • In the sample (and therefore in the population), none of the independent variables is constant and there are no exact linear relationships among the independent variables.
• Remarks on MLR.4
  • The assumption only rules out perfect collinearity/correlation between explanatory variables; imperfect correlation is allowed.
  • If an explanatory variable is a perfect linear combination of other explanatory variables, it is superfluous and may be eliminated.
  • Constant variables are also ruled out (collinear with the intercept).

NON-PERFECT MULTICOLLINEARITY IN THE GENERAL LINEAR MODEL

• It requires that there is no linear dependence between the explanatory variables.
• In other words, none of the explanatory variables can be expressed as a linear combination of the others.
  • It cannot be the case that there are constants aⱼ, not all equal to zero, such that Xₖ = Σ_{j≠k} aⱼXⱼ
• Non-perfect multicollinearity ≠ no correlation between the explanatory variables
  • There can be correlation, as long as it is not perfect
  • It is not enough that the correlation between pairs of variables is not perfect
• None of the variables X₁, …, Xₖ can be a constant. Why?
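A minimal demonstration of why this assumption matters: if one regressor is an exact linear function of another (plus the constant), the design matrix loses rank, so the OLS normal equations have no unique solution. The data here are simulated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = 3 * x1 + 2   # perfect linear function of x1 and the constant
X = np.column_stack([np.ones(n), x1, x2])

# With perfect collinearity X has rank 2 instead of 3,
# so X'X is singular and the OLS coefficients are not uniquely determined
print(np.linalg.matrix_rank(X))  # 2
```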
MORE EXAMPLES

• Example of perfect collinearity: small sample
• Example of perfect collinearity: relationships between regressors

HOW DO RESEARCHERS USE MVR?

• To interpret differences as causal, we’re always looking for an all-other-things-equal comparison
• Ideally, everything should be equivalent across observations except for the variable of interest
• This is why RCTs and natural experiments were so useful
• But we can’t always answer questions of interest with experiments
DALE AND KRUEGER (2002)

• In multivariate regression, coefficients are interpreted as the change in Y associated with a one-unit change in X, holding all other right-hand-side variables constant
• In some cases, it may be possible to control for the right RHS variables to have a plausibly all-other-things-equal comparison
• “Regression-based causal inference is predicated on the assumption that when key observed variables have been made equal across treatment and control groups, selection bias from the things we can’t see is also mostly eliminated” –MM
• Caution: while this assumption may be satisfied in some cases, it is not in many others!
DALE AND KRUEGER (2002)

• Question: How do returns to college differ for public v. private colleges?
• Why not just compare earnings for people who went to public v. private colleges?
• Ideal Experiment: Randomly assign students to colleges
  • Feasible? Ethical?
• We can’t do the ideal experiment, so we have to find another way!
• Dale and Krueger’s idea: Students who applied to the same set of schools and had the same acceptances/rejections but attend different schools may be similar enough for an all-else-equal comparison
• “many decisions and choices, including those related to college attendance, involve a certain amount of serendipitous variation generated by financial considerations, personal circumstances, and timing. Serendipity can be exploited in a sample of applicants on the cusp, who could easily go one way or the other”
DALE AND KRUEGER (2002)

Data: College and Beyond (C&B)
• Includes more than 14,000 students who enrolled in a group of moderately- to highly-selective U.S. colleges (enrolled 1976)
  • Prestigious private schools like UPenn, Princeton, Yale
  • Smaller private schools like Swarthmore, Williams, Oberlin
  • Four public universities: Michigan, UNC, Penn State, Miami of Ohio
• Survey data collected when these students took the SAT (1975)
• Follow-up data long after most completed college (1996)
DALE AND KRUEGER (2002)

Model:

ln Yᵢ = α + βPᵢ + Σ_{j=1}^{150} γⱼ GROUPᵢⱼ + δ₁SATᵢ + δ₂ ln(PIᵢ) + εᵢⱼ

• Yᵢ is earnings for individual i in 1995
• Pᵢ = 1 if individual i attended private school, and = 0 if attended public
• GROUPᵢⱼ = 1 if individual i is in college application/acceptance group j, and = 0 otherwise (together called selectivity-group fixed effects)
• SATᵢ is individual i’s SAT score
• PIᵢ is parental income for individual i
• Also includes some other controls not written out
• What is the coefficient of interest?
• How will we interpret it?
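A toy sketch of the selectivity-group fixed-effects idea on simulated data (the data, group counts, and coefficient values are entirely made up; this is not the C&B data): when private-school attendance is correlated with selectivity group, the naive regression is biased, while adding one dummy per group recovers the true zero effect.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_groups = 2000, 5
group = rng.integers(0, n_groups, size=n)
# Students in more selective groups are more likely to attend private school
private = (rng.random(n) < 0.2 + 0.15 * group).astype(float)
# Simulated world: earnings rise with selectivity group,
# but the true private-school effect is zero
ln_y = 10.0 + 0.1 * group + rng.normal(scale=0.1, size=n)

def slope_on_private(controls):
    # OLS coefficient on `private`, with any extra control columns appended
    X = np.column_stack([np.ones(n), private] + controls)
    beta, *_ = np.linalg.lstsq(X, ln_y, rcond=None)
    return beta[1]

# Naive regression: `private` picks up the group effect
naive = slope_on_private([])
# With group fixed effects (one dummy per group, dropping group 0)
dummies = [(group == j).astype(float) for j in range(1, n_groups)]
fe = slope_on_private(dummies)
print(naive, fe)  # naive is biased upward; fe is close to 0
```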
DALE AND KRUEGER (2002)

• For causality, we need an other-things-equal comparison
• Experiments are the ideal but not always feasible
• Clever multivariate regressions that control for the right thing may be an alternative path
• Dale and Krueger (2002) compares students who applied for and were accepted to a set of schools with the same selectivity rankings but attended private v. public school, and finds that there is no estimated earnings gain from attending a private school (controlling for selectivity-group fixed effects)
QUESTIONS?