Chapter 9: Dummy Variables

9.1
A dummy variable is a variable that can take on only two possible
values, for example:
yes, no
up, down
male, female
union member, non-union member
Dummy variables provide a method for “quantifying” a “qualitative” variable:
D = 1 if yes, D = 0 if no
It doesn’t matter which category gets the 0 and which gets the 1; the choice only changes how the coefficient on D is interpreted.
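As a small illustration of the coding rule above, the following sketch turns a qualitative variable into a 0/1 dummy. The responses are hypothetical values made up for the example.

```python
# Hypothetical survey responses; the labels are illustrative only.
responses = ["union", "non-union", "union", "non-union", "non-union"]

# D = 1 for union members, D = 0 for non-union members.
D = [1 if r == "union" else 0 for r in responses]

# The reverse coding carries the same information: D_alt = 1 - D.
D_alt = [1 - d for d in D]

print(D)
print(D_alt)
```

Either coding can be used in a regression; only the reference category changes.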
Estimation with Dummy Variables
9.2
If the dummy variable is the only independent variable:
Yt = 1 + 2Dt + et
If D = 0  Yt = 1 + et
If D = 1  Yt = (1 + 2) + et
Example: Wage data (See class handout)
FE = 0 if the person is male
FE = 1 if the person is female
Waget = 1 + 2FEt + et
Least squares regression will produce a b1 and b2 value such that
b1 = the mean of the Wage values for the FE=0 values
b1 + b2 = the mean of the Wage values for the FE=1 values
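The group-mean result can be checked numerically. The sketch below uses a handful of hypothetical wages (not the class-handout data) and the usual simple-regression least squares formulas; the variable names are chosen for this example only.

```python
# Hypothetical hourly wages; NOT the class-handout data.
wage = [10.0, 12.0, 14.0, 9.0, 11.0, 7.0]
fe = [0, 0, 0, 1, 1, 1]  # FE = 1 if female, FE = 0 if male

n = len(wage)
mean_w = sum(wage) / n
mean_d = sum(fe) / n

# Simple-regression least squares: slope = Sxy / Sxx, intercept from the means.
sxy = sum((d - mean_d) * (w - mean_w) for d, w in zip(fe, wage))
sxx = sum((d - mean_d) ** 2 for d in fe)
b2 = sxy / sxx
b1 = mean_w - b2 * mean_d

# Group means computed directly, for comparison.
mean_male = sum(w for w, d in zip(wage, fe) if d == 0) / fe.count(0)
mean_female = sum(w for w, d in zip(wage, fe) if d == 1) / fe.count(1)

print(b1, mean_male)         # intercept vs. mean wage when FE = 0
print(b1 + b2, mean_female)  # intercept + slope vs. mean wage when FE = 1
```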
Estimation with Dummy Variables
9.3
If there is one continuous explanatory variable and one dummy
variable:
Yt = 1 + 2Xt + Dt + et
If D = 0  Yt = 1 + 2Xt + et
Suppose that
If D = 1  Yt = (1 + ) + 2Xt + et
1 >0, 2 >0,  > 0
 It is as though we
Y
have two regression
lines that have the

same slope
2
coefficient but have
1 + 
2
difference intercepts.
1
X
Estimation with Dummy Variables
9.4
Example: Wage data (See class handout)
FE = 0 if the person is male
FE = 1 if the person is female
Waget = 1 + 2EDt + 3FEt + et
We estimate this model as an ordinary multiple regression model.
Our estimate b3 will measure the difference in wages for males vs.
females, after controlling for differences in education.
See class handout.
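A minimal sketch of estimating a model of this form, assuming made-up data rather than the handout's. The helper `ols` is a hypothetical function written for this example; it solves the normal equations directly. The wages are constructed without noise from intercept 2.0, education slope 1.5, and female shift -3.0, so the estimates should recover those values up to rounding.

```python
def ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical data: years of education and a female dummy.
ed = [10, 12, 14, 16, 10, 12, 14, 16]
fe = [0, 0, 0, 0, 1, 1, 1, 1]
wage = [2.0 + 1.5 * e - 3.0 * f for e, f in zip(ed, fe)]

# Columns of X: constant, ED, FE.
X = [[1.0, e, f] for e, f in zip(ed, fe)]
b1, b2, b3 = ols(X, wage)
print(round(b1, 4), round(b2, 4), round(b3, 4))
```

Here b3 is the vertical gap between the female and male lines at any given level of education.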
Interaction Terms
9.5
An interaction term is an independent variable that is the product
of two other independent variables. These independent
variables can be continuous or dummy variables.
$Y_t = \beta_1 + \beta_2 X_t + \beta_3 Z_t + \beta_4 X_t Z_t + e_t$
In this model, the effect of X on Y depends on the level of Z, and the effect of Z on Y depends on the level of X.
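To see why each effect depends on the other variable, one can differentiate the expected value of Y in the interaction model, assuming the error has mean zero and treating X and Z as continuous:

```latex
\frac{\partial E(Y_t)}{\partial X_t} = \beta_2 + \beta_4 Z_t,
\qquad
\frac{\partial E(Y_t)}{\partial Z_t} = \beta_3 + \beta_4 X_t
```

If the coefficient on the product term is zero, both effects are constant and the model reduces to one without an interaction.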
Interaction Terms Involving Dummy Variables
9.6
Yt = 1 + 2Xt + 3Dt + 4DtXt + et
If D = 0  Yt = 1 + 2Xt + et
If D = 1  Yt = (1 + 3 ) + (2+ 4 )Xt + et
Y
2+4
 1 + 3
1
2
X
Suppose that
1 >0, 2 >0, 3 >0,
4 >0
 It is as though we
have two regression
lines that have
different slope
coefficients and
different intercepts.
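Because this model gives each group its own intercept and its own slope, its least squares coefficient estimates coincide with fitting a separate line to each group. The sketch below illustrates that on hypothetical, noise-free data; `fit_line` is a helper written for this example.

```python
def fit_line(x, y):
    """Simple-regression least squares; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: group D = 0 follows Y = 1 + 2X, group D = 1 follows Y = 4 + 3X.
x0, y0 = [1, 2, 3, 4], [3, 5, 7, 9]
x1, y1 = [1, 2, 3, 4], [7, 10, 13, 16]

b1, b2 = fit_line(x0, y0)        # intercept and slope when D = 0
int1, slope1 = fit_line(x1, y1)  # intercept and slope when D = 1

b3 = int1 - b1    # shift in the intercept for D = 1
b4 = slope1 - b2  # shift in the slope for D = 1
print(b1, b2, b3, b4)
```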
9.7
Dependent Variable: lnwage

Analysis of Variance
                              Sum of         Mean
Source              DF       Squares       Square    F Value    Pr > F
Model                1      36.74586     36.74586     122.43    <.0001
Error              532     159.67604      0.30014
Corrected Total    533     196.42191

Root MSE             0.54785    R-Square    0.1871
Dependent Mean       2.80361    Adj R-Sq    0.1855
Coeff Var           19.54101

Parameter Estimates
                    Parameter    Standard
Variable      DF     Estimate       Error    t Value    Pr > |t|
Intercept      1      1.33757     0.13460       9.94      <.0001
ed             1      0.10673     0.00965      11.06      <.0001
******************************************************************************
Dependent Variable: lnwage

Analysis of Variance
                              Sum of         Mean
Source              DF       Squares       Square    F Value    Pr > F
Model                3      45.65281     15.21760      53.49    <.0001
Error              530     150.76910      0.28447
Corrected Total    533     196.42191

Root MSE             0.53336    R-Square    0.2324
Dependent Mean       2.80361    Adj R-Sq    0.2281
Coeff Var           19.02397

Parameter Estimates
                    Parameter    Standard
Variable      DF     Estimate       Error    t Value    Pr > |t|
Intercept      1      1.54743     0.17989       8.60      <.0001
ed             1      0.09993     0.01306       7.65      <.0001
feed           1      0.02233     0.01885       1.18      0.2368
female         1     -0.56090     0.26344      -2.13      0.0337
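The estimates in the second regression (Intercept 1.54743, ed 0.09993, feed 0.02233, female -0.56090) can be turned into one fitted line per group, following the interaction model with a dummy variable. This sketch assumes that feed is the product female × ed, which the variable name suggests but the output does not state; `predicted_lnwage` is a helper written for this example.

```python
# Estimates copied from the second "Parameter Estimates" table.
b_intercept = 1.54743
b_ed = 0.09993
b_feed = 0.02233      # ASSUMPTION: feed = female * ed (interaction term)
b_female = -0.56090

# Fitted line for males (female = 0).
male_intercept, male_slope = b_intercept, b_ed

# Fitted line for females (female = 1): both intercept and slope shift.
female_intercept = b_intercept + b_female
female_slope = b_ed + b_feed

def predicted_lnwage(ed, female):
    """Fitted ln(wage) for a given education level and female dummy (0 or 1)."""
    return b_intercept + b_ed * ed + b_female * female + b_feed * female * ed

print(male_intercept, male_slope)
print(round(female_intercept, 5), round(female_slope, 5))
```

Under that assumption the female line starts lower but is slightly steeper in education, although the slope difference (feed) has a p-value of 0.2368 in the output.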