„Implementing Restricted Least Squares in Linear Models“ Dr. John P. Haisken-DeNew

Rheinisch-Westfälisches Institut für Wirtschaftsforschung
jhaiskendenew@rwi-essen.de
Haisken-DeNew / Stata 2006 Mannheim
March 31, 2006
1a. Background
 Inter-Industry Wage Differentials
- Why do secretaries in the steel industry earn more than otherwise
observably identical secretaries in the services industry?
- Calculating „wage differentials“: are wages in steel higher than in services?
- Dummy variables: 0 or 1
 Starting Point
Krueger/Summers (1988) „Efficiency Wages and the Inter-Industry Wage
Structure“, Econometrica, 56, pp. 259-93.
- Aim: interpret differentials as deviations from a weighted average
- This removes the arbitrary selection of a reference category
- An excellent seminal paper, but with technical problems:
- It attempts to implement Restricted Least Squares (RLS), but
- standard errors are incorrect: t-values systematically biased downward
- overall inference is incorrect: variation systematically biased downward
1b. Background
 Technical Contribution (in Handout)
Haisken-DeNew/Schmidt (1997) „Inter-Industry and Inter-Regional
Differentials: Mechanics and Interpretation“, Review of Economics and
Statistics, 79(3), p. 517-21.
- How to implement Restricted Least Squares (RLS) correctly
- How to implement RLS after any linear model (OLS, FE, RE…)
- RLS was implemented in GAUSS, LIMDEP and Stata (crudely)
 RLS is now implemented in Stata as a flexible ado-file <hds97.ado>
- What does the syntax look like?
2a. RLS <hds97.ado> - One Dummy Set
 Run a linear regression
reg/xtreg depvar indepvars
 Standard Syntax (only ONE dummy set)
hds97 indepvars [, options]
options             description
refname(string)     name of the "reference" category
realname(string)    descriptive name for the set of dummy variables
weight(varname)     name of the weighting variable
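As an illustration of the standard syntax, a call for a single dummy set might look like the following. The variable names lhw, age and genderm, and the names genderf and gender, are borrowed from the handout example; the sketch assumes hds97.ado is installed on the ado-path and accepts the options exactly as listed above.

```stata
* Sketch only: assumes hds97.ado is installed and that lhw, age and
* genderm exist in the data set in memory (names from the handout example)
regress lhw age genderm

* genderm is the included dummy; genderf names the omitted (reference)
* category and gender is the descriptive label for the set
hds97 genderm, refname(genderf) realname(gender)
```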
2b. RLS <hds97.ado> - Many Dummy Sets
 Run a linear regression
reg/xtreg depvar x* Xvar_1 Zvar_1 Zvar_2 Dvar_* XXLvar_*
 Advanced Syntax (MANY dummy variable sets)
global hds97_1 Xvar_1 Xvar_ref descriptive_name_for_X
global hds97_2 Zvar_1 Zvar_2 Zvar_ref descriptive_name_for_Z
global hds97_3 Dvar_* Dvar_ref descriptive_name_for_D
...
global hds97_50 XXLvar_* XXLvar_ref descriptive_name_for_XXL
(up to 50 globals/constraints can be set)
Xvar_1              a regressor used previously in regress or xtreg
Xvar_ref            a text name for the reference category
descriptive_name    a descriptive text name for the dummy set
hds97 [, weight(wgt_var_name)]
2c. RLS <hds97.ado>
 Output created by <hds97.ado>
(A) Original regression (OLS, RE, FE etc.) repeated
(B) RLS coefficients calculated for each dummy variable group
- From “k-1” dummy variables, “k” coefficients are reported
(C) Weighted standard deviation (corrected for sampling error) of the RLS betas
- A measure of overall variation
(D) F-tests of joint significance
- Are the dummy variables significant as a group?
(E) Sample shares of each dummy
- The sample shares used to form the weighted average
- The deviations in (B) are calculated from this weighted average
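The quantities in (B) and (C) can be written out as follows. This is a sketch in notation assumed here rather than taken from the slides: the vector of k coefficients of one dummy set is written with the reference category's coefficient set to zero, s_j are the sample shares from (E) and sum to one, and the last line reflects a reading of the sampling correction in Haisken-DeNew/Schmidt (1997), which should be checked against the handout.

```latex
\begin{aligned}
\beta^{*}_j &= \hat\beta_j - \sum_{i=1}^{k} s_i \hat\beta_i, \qquad j = 1,\dots,k,
  \qquad \text{so that } \sum_{j=1}^{k} s_j \beta^{*}_j = 0, \\
\beta^{*} &= (I - \iota s')\,\hat\beta, \qquad
  V(\beta^{*}) = (I - \iota s')\, V(\hat\beta)\, (I - \iota s')', \\
SD(\beta^{*}) &= \sqrt{\sum_{j=1}^{k} s_j \,\beta^{*2}_j \;-\; \sum_{j=1}^{k} s_j \operatorname{Var}(\beta^{*}_j)} .
\end{aligned}
```

Here \iota is a k-vector of ones. The first line shows why k coefficients come out of k-1 estimated dummies: the reference category receives the negative of the weighted average as its own deviation.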
3. Illustrative Example (in Handout)
 American Current Population Survey (CPS)
- Use freely available January 2004 CPS sample
- http://www.nber.org/morg/annual/morg04.dta
 Run simple wage regression (age 18-65)
- log hourly wages = f (age, gender, race, marital status, state)
 Dummy Indicators
- gender: male, female
- race: white, black, other
- marital status: married, divorced, separated, single
- states: AK, AL… WY
 Selecting an arbitrary dummy variable as reference
- Which one? The choice makes no difference to the estimation, only to the interpretation
 With RLS, the dummy variables are interpreted as deviations from a weighted
average rather than from an arbitrary reference category
 With logged wages, the deviations read approximately as percentage deviations from the average
 Use <hds97.ado> to implement RLS
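Strictly speaking, a coefficient in a log-wage regression is a difference in log points, and the percentage reading is an approximation that is close for small values. As a worked value, using the genderm coefficient from the handout regression output:

```latex
100\,(e^{\beta} - 1)\,\% \;\approx\; 100\,\beta\,\% \quad \text{for small } \beta;
\qquad \beta = 0.1738:\quad 100\,(e^{0.1738} - 1) \approx 19.0\,\% \ \text{ versus } \ 100\,\beta = 17.4\,\% .
```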
3. Sample Regression Output (in Handout)

. regress lhw age genderm raceb raceo msmar msdiv mssep

      Source |       SS       df       MS              Number of obs =    8417
-------------+------------------------------           F(  7,  8409) =  181.36
       Model |  242.712792     7   34.673256           Prob > F      =  0.0000
    Residual |  1607.68867  8409  .191186665           R-squared     =  0.1312
-------------+------------------------------           Adj R-squared =  0.1304
       Total |  1850.40146  8416  .219867093           Root MSE      =  .43725

------------------------------------------------------------------------------
         lhw |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |     .00861   .0004585    18.78   0.000     .0077112    .0095088
     genderm |   .1737988   .0095849    18.13   0.000     .1550101    .1925876
       raceb |  -.0730053   .0162526    -4.49   0.000    -.1048645   -.0411462
       raceo |  -.0131488   .0193254    -0.68   0.496    -.0510315    .0247338
       msmar |   .1365145   .0125807    10.85   0.000     .1118532    .1611758
       msdiv |   .1014927   .0180303     5.63   0.000     .0661489    .1368365
       mssep |   .0237369   .0341694     0.69   0.487    -.0432435    .0907174
       _cons |     6.5783    .016593   396.45   0.000     6.545774    6.610826
------------------------------------------------------------------------------
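To make the idea of deviations from a weighted average concrete, the race set from this regression can be transformed by hand with built-in commands. This is a minimal sketch, not the hds97 implementation: it assumes unweighted sample shares taken over the estimation sample and uses lincom for the standard errors; the local macro names sb, so, cb and co are introduced only for this illustration.

```stata
* Sketch only: hand-computed deviations for the race set, using
* unweighted sample shares; hds97 itself may compute things differently.
regress lhw age genderm raceb raceo msmar msdiv mssep

* Sample shares of the two included race dummies in the estimation sample
quietly summarize raceb if e(sample)
local sb = r(mean)
quietly summarize raceo if e(sample)
local so = r(mean)

* The omitted category (white) has coefficient 0, so the share-weighted
* average of the three race effects is sb*_b[raceb] + so*_b[raceo].
* Each lincom reports one deviation from that average with its standard error.
local cb = 1 - `sb'
local co = 1 - `so'

* black: (1 - sb)*b_black - so*b_other
lincom `cb'*raceb - `so'*raceo

* other: (1 - so)*b_other - sb*b_black
lincom `co'*raceo - `sb'*raceb

* white (reference): minus the weighted average
lincom -`sb'*raceb - `so'*raceo
```

By construction the three deviations, weighted by their sample shares, sum to zero, and the result does not depend on which race dummy was omitted from the regression.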
. global hds97_1 genderm genderf gender
. global hds97_2 raceb raceo racew race
. global hds97_3 msmar msdiv mssep mssgl marital
. hds97
In each global, the second-to-last entry is the name of the reference category and the last entry is the description.
3a. Gender (2-Way)
[Figure: Gender Wage Differentials, 2-Way (SD=0.0867). Vertical axis: Wage Differential, from -0.20 to 0.20. Panels: Ref=Female, Ref=Male, Restricted Least Squares. Categories plotted: male, female.]
3b. Race (3-Way)
[Figure: Race Wage Differentials, 3-Way (SD=0.0205). Vertical axis: Wage Differential, from -0.20 to 0.20. Panels: Ref=Black, Ref=White, Ref=Other, Restricted Least Squares. Categories plotted: white, black, other.]
3c. Marital Status (4-Way)
[Figure: Marital Status Wage Differentials, 4-Way (SD=0.0609). Vertical axis: Wage Differential, from -0.20 to 0.20. Panels: Ref=Single, Ref=Married, Ref=Divorced, Ref=Separated, Restricted Least Squares. Categories plotted: married, divorced, separated, single.]
3d. State of Residence (51-Way) Ref=Hi
[Figure: State Wage Differential, 51-Way (Reference=Alaska). Vertical axis: Wage Differential, from -0.6 to 0.6. Horizontal axis: American States (Ordinary Least Squares), AK to WY.]
3d. State of Residence (51-Way) Ref=Lo
[Figure: State Wage Differential, 51-Way (Reference=Arkansas). Vertical axis: Wage Differential, from -0.6 to 0.6. Horizontal axis: American States (Ordinary Least Squares), AK to WY.]
3d. State of Residence (51-Way)
[Figure: State Wage Differentials, 51-Way (SD=0.0684). Vertical axis: Wage Differential, from -0.6 to 0.6. Horizontal axis: American States (Restricted Least Squares), AK to WY.]
4. Conclusions
 RLS: Interpretation of Dummy Variables
- Even with few categories, RLS gives an intuitive interpretation
- Removes the arbitrariness of the reference category
- Allows importance weighting of each category
 Easily implemented with <hds97.ado>
- Can be used after regress or xtreg to calculate the RLS coefficients
- Useful additional statistics calculated
 Flexible use
- Transform a single set of dummy variables
- Transform up to 50 sets of dummy variables at once
 Areas of Application
- Wage Differentials by: Region, Industry, Occupation, Education,
Marital Status, Race, etc…