Regression Project
Introduction
The Human Capital Theory of economics states that each labor market participant
has a certain and distinct set of skills and abilities called human capital. Workers with
valuable human capital increase their probability of earning high wages in the labor
market. The skills acquired in college greatly increase a worker’s human capital and
therefore the worker’s expected earnings (Borjas 250). Colleges, especially elite private
colleges, are usually very expensive to attend. Presumably, students who go to these
schools expect to acquire more human capital at a private school, which would increase
their productivity, thereby increasing their wage in the labor market. Students attending
private colleges expect the value of a private education to be worth more in the labor
market than a public college education. That is, the present value of a private school
education is greater than the present value of a public education.
PVPrivate > PVPublic
The present value of a private college education equals the present value of the
wages the student expects to receive each year in the labor market minus the cost of
attending a private college:
$$PV_{Private} = \sum_{t} \frac{W_{Private}}{(1+r)^{t}} - \sum_{t}^{3} \frac{H}{(1+r)^{t}}$$
where r is the discount rate, t stands for the year (there are only 46 years to account for
retirement), and H represents the tuition of a private school for one year (Borjas 258).
The present value of a public school education equals the present value of the
wages the student expects to receive each year in the labor market minus the cost of
attending a public college:
$$PV_{Public} = \sum_{t} \frac{W_{Public}}{(1+r)^{t}} - \sum_{t}^{3} \frac{H}{(1+r)^{t}}$$
where r is the discount rate, t stands for the year, and H represents the tuition of a public
school for one year (Borjas 258).
Based on these equations, the wages of a private school graduate must be higher
than the wages of a public school graduate in order for PVPrivate > PVPublic.
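The comparison of present values can be illustrated with a minimal sketch. The timing is an assumption not stated in the text: tuition is paid in years 0 through 3, and wages are earned from year 4 until the 46-year horizon. The function name and every dollar figure below are hypothetical and are not drawn from the NLS data.

```python
def present_value(annual_wage, annual_tuition, r, college_years=4, horizon=46):
    """Discounted wages minus discounted tuition.

    Assumed timing (not from the paper): tuition is paid in years
    0..college_years-1, wages are earned in years college_years..horizon-1.
    """
    cost = sum(annual_tuition / (1 + r) ** t for t in range(college_years))
    wages = sum(annual_wage / (1 + r) ** t for t in range(college_years, horizon))
    return wages - cost


# Hypothetical inputs chosen only to show the mechanics.
pv_private = present_value(annual_wage=45_000, annual_tuition=25_000, r=0.05)
pv_public = present_value(annual_wage=42_000, annual_tuition=8_000, r=0.05)

print(f"PV private: {pv_private:,.0f}")
print(f"PV public:  {pv_public:,.0f}")
print("Private worth more:", pv_private > pv_public)
```

Changing the wage gap or the tuition gap shows how large a wage premium is needed before the private option has the higher present value.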
Expected higher wages, however, may not be the only reason students attend
private colleges. Private colleges tend to have smaller classes and allow more access to
faculty, which benefits some students more than others. Students who need more personal
attention from faculty to acquire valuable human capital may be willing to pay the high
tuition costs of a private college even if they do not expect a higher wage in the labor
market. Other students would prefer the lower tuition of public schools.
Another theory about the effect of education on labor market outcome is the
signaling theory. This theory is based on the idea that certain levels of education, or a
degree from a specific college, signals a worker’s qualifications. According to the
signaling theory, there is no rate of social return. These schools don’t teach more skills or
increase a worker’s human capital. Rather, a degree from certain colleges signals worker
skills. Employers value credentials from elite colleges because they signal a high level of
human capital (Borjas 241).
Both of these economic theories predict that students attending elite colleges or
universities can expect higher wages than students attending other universities. However,
education is not the only factor which determines a worker’s wage in the labor market.
Personal characteristics, occupation and labor market conditions also affect wages.
Other Studies
Many economists have tried to measure the effect of college quality on earnings.
Brewer, Eide, and Ehrenberg used data from the National Longitudinal Study of the High
School Class of 1972 to gauge the effect of college choice on wages. They obtained their
information on the characteristics of colleges from the Higher Education General
Information Survey. The dependent variable used by Brewer, Eide, and Ehrenberg was
the natural log of annual earnings. They controlled for many factors, including individual
characteristics and college selection process. They found a statistically significant and
large return to attending an elite school relative to a low ranked public school. They
found weaker evidence of a premium to attending an elite public university. Evidence
presented by Brewer, Eide, and Ehrenberg also suggests that the return to attending an
elite private school in 1980 was greater than the return to attending an elite private school
in 1972 (Brewer, Eide, and Ehrenberg 11).
In “College Quality and Future Earnings: Where Should You Send Your Child to
College,” James, Alsalam, Conaty and To also use data from the National Longitudinal
Study of the High School Class of 1972 to measure the effects of college quality on
wages. Their regression equation controls for individual characteristics, college
characteristics, college major, occupation, and other labor market variables. James,
Alsalam, Conaty, and To found that attending an Eastern elite private college or
university had a positive effect on earnings. However, they also found that taking many
math and science courses in college had a more substantial effect on earnings (James,
Alsalam, Conaty, and To 251-252).
Wales used NBER-Thorndike data for his regression analysis. The NBER-Thorndike
data was collected from a sample of men who took a test for the Air Force, measuring
their math and reasoning skills, physical coordination, spatial perception, and reaction to
stress. Wales used the Gourman report as his measure of college quality. Wales found
that the quality of college or university had the most profound effect at the graduate
school level. Specifically, Wales found that a student who had attended an undergraduate
school and a graduate school, both in the top fifth of all schools, earned 60 percent more
than the average high school graduate. However, a student who attended undergraduate
and graduate schools in the bottom four fifths of all schools earned only 15 percent more
than the average high school graduate (Wales 314).
Solmon and Wachtel also used NBER-Thorndike data and the Gourman rating of
school quality. Controlling for experience, personal intelligence, years of schooling,
occupation, and college type, Solmon and Wachtel found that college type does influence
earnings. However, their results suggest that the effects of college type differ for different
types of students (Solmon and Wachtel 89).
Unlike the studies conducted by Solmon and Wachtel and Wales, who studied
only men, I used National Longitudinal Study (NLS) young women data for my analysis.
The National Longitudinal Study of the Labor Market observed over 5000 young women,
ages 14 to 25, from 1968 to 1991.
I performed two regressions using these data. The first tests whether higher levels of
education, such as college, increase a woman’s earnings. The second measures the effect
of college quality on wages. The dependent variable in both of the regressions was the
natural log of wages in 1992. I took the natural log of wages because it lets me interpret
the explanatory variables in terms of percent change: multiplied by 100, the coefficient
of an independent variable approximates the percent change in wages due to a one-unit
increase in that variable.
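As a brief illustration of this interpretation, the sketch below compares the usual approximation (coefficient times 100) with the exact percent change implied by a log-wage coefficient, which is (e^b - 1) times 100. The function name and the three coefficient values are illustrative assumptions; the approximation is close for small coefficients and understates the exact change for large ones.

```python
import math


def pct_change_from_log_coef(b):
    """Exact percent change in the wage implied by coefficient b when the
    dependent variable is the natural log of the wage."""
    return (math.exp(b) - 1) * 100


# Illustrative coefficient values only.
for b in (0.02, 0.20, 0.61):
    approx = b * 100
    exact = pct_change_from_log_coef(b)
    print(f"b = {b:.2f}: approximate {approx:.0f}%, exact {exact:.1f}%")
```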
Like the other studies, I took other variables besides college quality into account
when performing my regression. In each regression I used a set of personal
characteristics, labor market conditions, and education.
First Regression
My first regression measures the effect of level of education on hourly wages. I
control for age, race, region, field of study in college (if the subject went to college),
college quality (was the college attended in the top quartile of all the colleges in the data
set), and occupation. Usually experience is included as a control variable in studies
measuring effects on wages. However, many subjects in the NLS data set did not report
their experience in the labor market, so using experience in my regression would
limit the number of observations. Therefore, in this regression, age is used as a proxy for
experience. Age is a better proxy for experience for men than for women. However, age
is an adequate substitute for experience for women.
The regression equation includes age as well as age squared because the effect of age
on wages is not constant. Wages tend to increase as workers grow older, but as age
increases, wages increase at a decreasing rate.
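A short sketch of why a squared term captures this pattern: with log wage = ... + b1(age) + b2(age squared), the marginal effect of one more year of age is b1 + 2·b2·age, which shrinks with age when b2 is negative and reaches zero at age = -b1 / (2·b2). The coefficient values and function names below are hypothetical, chosen only to make the shape visible; they are not the estimates from this study.

```python
def marginal_effect_of_age(age, b_age, b_age_sq):
    """Change in log wage from one more year of age at a given age."""
    return b_age + 2 * b_age_sq * age


def turning_point(b_age, b_age_sq):
    """Age at which the age-wage profile stops rising (needs b_age_sq != 0)."""
    return -b_age / (2 * b_age_sq)


# Hypothetical coefficients: positive on age, negative on age squared.
b_age, b_age_sq = 0.05, -0.0005

for age in (25, 35, 45):
    effect = marginal_effect_of_age(age, b_age, b_age_sq)
    print(f"age {age}: marginal effect on log wage = {effect:.4f}")

print("profile peaks at age", round(turning_point(b_age, b_age_sq), 1))
```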
Some individual characteristics are left out of the regression as well. Although
mother’s and father’s education most likely do affect a worker’s wage, they are omitted
from the regression for the same reason as experience. The data set is missing
information on parents’ level of education for many observations; therefore, in order to
retain a high number of observations, the variables are omitted.
The regression equation looks like:
Ln Wage = b0 + b1(Age) + b2(Age squared) + b3(South) + b4(Black) + b5(Occ-Prof) +
b6(Occ-Man) + b7(Occ-Cler) + b8(Occ-Sal) + b9(Occ-Cra) + b10(Occ-HH) + b11(Occ-Svc) +
b12(Field-SC) + b13(Field-SS) + b14(Field-Co) + b15(Field-Bu) + b16(Field-Ed) +
b17(HsGrad) + b18(SomeCol) + b19(Bach) + b20(GradSchl) + b21(top25) + E
E is a normally distributed error term.
South, Black, Occupations, Fields of Study, Level of Education, and Quality of School
are all dummy variables. The occupations were measured against farm workers, and the
fields of study were measured in terms of a humanities major. Levels of Education were
measured against high school dropouts. For a complete list and definitions of the dummy
variables, see Appendix A.
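The idea of measuring each dummy variable against an omitted category can be sketched as follows. This is a minimal illustration that assumes the education levels are coded as mutually exclusive "highest level attained" categories; the paper does not state how the NLS variables were actually coded, and the function name is hypothetical.

```python
# Education dummies that enter the regression; HSDROP is the omitted
# reference category, so it gets no column of its own.
EDUCATION_DUMMIES = ["HSGRAD", "SOMECOL", "BACH", "GRADSCHL"]
REFERENCE = "HSDROP"


def education_dummies(level):
    """Return the 0/1 dummy values for one subject's education level."""
    if level != REFERENCE and level not in EDUCATION_DUMMIES:
        raise ValueError(f"unknown education level: {level}")
    return {name: int(name == level) for name in EDUCATION_DUMMIES}


print(education_dummies("HSDROP"))  # all zeros: the baseline group
print(education_dummies("BACH"))    # only BACH is 1
```

Because the reference group has zeros everywhere, each estimated coefficient is read as the difference in log wage between that group and the reference group, holding the other variables fixed.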
First Regression Results
Many occupations and all levels of education are statistically significant at the 99
percent level. According to this regression, education does have an effect on wages.
Relative to a high school dropout, a high school diploma increases wages by 20 percent,
attending some college increases wages by 29 percent, a bachelor’s degree increases
wages by 40 percent, and going to graduate school increases wages by 61 percent. The
p-value for the F test of the regression is <0.0001, indicating at the 99 percent level that
the coefficients of the independent variables are jointly different from zero. In other
words, the independent variables taken together do have an effect on wages. Using a correlation
matrix of estimates I detected some multicollinearity with the education levels. However,
this is to be expected because in order to graduate from college, a person must first
graduate from high school. In order to graduate from graduate school, a person must have
completed college and so on. Multicollinearity also exists in the relationships between
certain occupations, but this probably did not affect the results greatly. Heteroscedasticity
does not seem to be a problem. The R-squared of the regression is 0.2995, meaning that the
data points do not fit the regression line very well. Only 30 percent of the variation in
log wages can be explained by the regression. This is most likely because there are
other variables that influence wages, such as personal drive, productivity, and luck, that
are extremely difficult to measure. For the actual regression coefficients and summary
statistics, see Appendix C.
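The fit statistics quoted above can be checked from the sums of squares and degrees of freedom reported in the analysis of variance in Appendix C. The sketch below uses the standard definitions of R-squared, adjusted R-squared, and the F statistic; only the variable names are my own.

```python
# Values taken from the analysis of variance table in Appendix C.
ss_model, ss_error = 250.29674, 585.33384
df_model, df_error = 22, 2320

ss_total = ss_model + ss_error
df_total = df_model + df_error

# Share of the variation in log wages explained by the regression.
r_squared = ss_model / ss_total

# Adjusted R-squared penalizes the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * df_total / df_error

# F statistic: mean square of the model over mean square of the error.
f_value = (ss_model / df_model) / (ss_error / df_error)

print(f"R-squared:          {r_squared:.4f}")
print(f"Adjusted R-squared: {adj_r_squared:.4f}")
print(f"F value:            {f_value:.2f}")
```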
Second Regression
So it seems that education does affect wages. But does the quality of
college education affect wages, too? The previous regression found a slightly positive but
statistically insignificant effect. In order to obtain a better answer to this question, I
performed a second regression. In this regression, I reduce the observation group to only
those who graduated from college.
To test whether the quality of college affects wages among college graduates, I
control for race, age, region, college field of study, and graduate school attendance. Again
I include a variable for whether the college attended was in the top quartile of colleges
attended by the people in the data set (ranked by the average SAT score of the college).
As in the last regression, age is used as a proxy for experience. In this regression,
however, occupation is omitted to preserve a higher number of degrees of freedom.
The regression equation looks like:
Ln Wage=b0+ b1(Age)+ b2(Age squared)+ b3(Black)+ b4(South)+ b5(GRADSCHL)+
b6(top 25)+ b7(FIELD-SC)+ b8(FIELD-SS)+ b9(FIELD-CO)+ b10(FIELD-BU)+
b11(FIELD-ED)+E
Again, E is a normally distributed error term. For a list and definition of the dummy
variables refer to Appendix A.
Second Regression Results
Nearly all of the coefficients in the second regression are statistically
insignificant. Only the coefficients for Graduate School and a Science or Math major in
college are statistically different from zero at the 99 percent level. There is no evidence
of multicollinearity or heteroscedasticity. The R-squared value is 0.1038; the data points
fit the regression line very poorly. Again, this is most likely because of some
unobservable characteristics that also affect wages. The p-value for the F test is again
<.0001, meaning that at the 99 percent level the independent variables jointly affect the
dependent variable. For complete regression results and summary statistics, see Appendix D.
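The significance of individual coefficients can be read from the t values, each of which is the parameter estimate divided by its standard error. The sketch below recomputes four of them from the estimates and standard errors in Appendix D. The cutoffs 1.96 and 2.576 are the two-sided normal critical values, used here as an approximation on the assumption that 629 error degrees of freedom is large enough for the t and normal distributions to nearly coincide; the function names are hypothetical.

```python
# (estimate, standard error) pairs copied from Appendix D.
coefficients = {
    "GRADSCHL": (0.24144, 0.03908),
    "FIELD_SC": (0.24321, 0.05884),
    "SOUTH": (-0.09784, 0.04163),
    "top25": (0.00924, 0.04559),
}


def t_value(estimate, std_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


def significance(t):
    """Classify a t value using large-sample two-sided critical values."""
    if abs(t) > 2.576:
        return "significant at the 99 percent level"
    if abs(t) > 1.96:
        return "significant at the 95 percent level"
    return "not significant"


for name, (estimate, std_error) in coefficients.items():
    t = t_value(estimate, std_error)
    print(f"{name:9s} t = {t:6.2f}  {significance(t)}")
```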
Hypothesis Tests
In the second regression, like the first, the coefficient for the top 25 percent of
colleges by quality is small, slightly positive, and statistically insignificant. So is there
any real difference in wages from going to a top school? In order to answer this question I
performed a hypothesis test for the difference of two means. The mean hourly wage of
subjects with a college degree who attended colleges in the bottom 75 percent of this
study is 1585.41. The mean wage for those who attended a college in the top 25 percent
of schools is 1687.86. For summary statistics of variables used in the hypothesis test,
refer to Appendix B.
Hypothesis test for the difference between the two means:
H0: $\mu_{Top25} - \mu_{College} = 0$. The mean wage of top 25 graduates minus the mean
wage of all other college graduates equals zero.
H1: $\mu_{Top25} - \mu_{College} \neq 0$. The mean wage of top 25 graduates minus the mean
wage of all other college graduates does not equal zero.
Level of alpha: .05; $z_{.025} = 1.96$
Test statistic: $z = (\bar{x}_{Top25} - \bar{x}_{College}) / \sqrt{s_{Top25}^2/n_{Top25} + s_{College}^2/n_{College}}$
$= (1687.86 - 1585.41) / \sqrt{1050.91^2/173 + 804.36^2/469} = 1.16$
Decision rule: Reject H0 if z > 1.96 or z < -1.96
-1.96 < 1.16 < 1.96
I cannot reject the null hypothesis. There is no statistically significant difference between
the mean wage of a top 25 percent college graduate and a bottom 75 percent college
graduate. There seems to be no return to attending a school in the top 25 percent of
colleges in the NLS study.
So the regressions and the hypothesis test suggest that there is no real
difference in the labor market returns to attaining a college education at a college in the
top 25 percent of all schools by quality relative to a bottom 75 percent college education.
Perhaps schools that are even more elite will have a higher return. To check, I performed
a hypothesis test for the difference between the mean wage of graduates of the top 10
percent of colleges and that of all other college graduates.
The mean wage of the college graduates who attended a college ranked in the top 10
percent by SAT score is 1977.75. The mean for all other college graduates (bottom 90
percent) is 1569.81.
Hypothesis test:
H0: $\mu_{Top10} - \mu_{College} = 0$. The mean wage of top 10 graduates minus the mean
wage of all other college graduates equals zero.
H1: $\mu_{Top10} - \mu_{College} \neq 0$. The mean wage of top 10 graduates minus the mean
wage of all other college graduates does not equal zero.
Level of alpha: .05; $z_{.025} = 1.96$
Test statistic: $z = (\bar{x}_{Top10} - \bar{x}_{College}) / \sqrt{s_{Top10}^2/n_{Top10} + s_{College}^2/n_{College}}$
$= (1977.75 - 1569.81) / \sqrt{1397.21^2/68 + 785.08^2/574} = 2.36$
Decision rule: Reject H0 if z > 1.96 or z < -1.96
2.36 > 1.96
I can reject the null hypothesis. The difference between the wages of a top 10 percent
college graduate and a bottom 90 percent college graduate is significantly different from
zero.
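Both hypothesis tests use the same large-sample z statistic for the difference of two means, so they can be reproduced from the group sizes, means, and standard deviations listed in Appendix B. The sketch below is one way to do that; the function name is my own, and the inputs are the Appendix B figures.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for H0: the two population means are equal,
    using each group's own sample variance (no pooling)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


CRITICAL_Z = 1.96  # two-sided test at alpha = .05

# Inputs from Appendix B: (mean, std. dev., N) for each group.
z_top25 = two_sample_z(1687.86, 1050.91, 173, 1585.41, 804.36, 469)
z_top10 = two_sample_z(1977.75, 1397.21, 68, 1569.81, 785.08, 574)

for label, z in (("Top 25 vs. bottom 75", z_top25), ("Top 10 vs. bottom 90", z_top10)):
    decision = "reject H0" if abs(z) > CRITICAL_Z else "cannot reject H0"
    print(f"{label}: z = {z:.2f}, {decision}")
```

The top 25 comparison stays inside the critical values, while the top 10 comparison exceeds 1.96, matching the decisions reached in the text.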
Conclusion
In the hypothesis test as well as the regressions, I have found no significant effect
on wages of attending a school in the top 25 percent of elite colleges versus the bottom
75 percent. This result differs from previous studies, perhaps because this study was
limited to women, while the other studies were either co-ed or limited to just men.
According to economists Lawrence Mishel, Jared Bernstein, and John Schmitt, the ratio of
female to male wages in 1997 was about .79, meaning that women make 79 percent of
what men make (134-135). This could cause the different results in the return to college
quality for men and women. Further, there is evidence that because of socialization
processes or the importance of child rearing, women often cluster into certain occupations,
which are often low-wage.
I also found some results similar to those of other studies. Like Wales, I found a
large return to attending graduate school. According to my regression, graduates of
graduate school earn 61 percent higher wages than high school dropouts and 24 percent
higher wages than college graduates. Similar to James, Alsalam, Conaty, and To, I found
that some fields of study, particularly science and math, affect wages. In the regression
including solely college graduates, the coefficient in front of the science and math field of
study was .24, significant at the 99 percent level. Consistent with the results of James,
Alsalam, Conaty, and To, in the second hypothesis test, I found that there is a significant
return to attending an elite college in the top 10 percent of schools.
However, most of my regression coefficients in both regressions were statistically
insignificant. This was probably due, in part, to the small number of female subjects in
the NLS data set who attended elite schools. Perhaps an analysis of a larger group of
college graduates, accounting for more variables such as high school grades,
college grades, and individual SAT scores, may produce more significant results
about the return to college quality for women.
Works Cited
Borjas, George J. Labor Economics. The McGraw-Hill Companies, 1996.
Brewer, Dominic J., Eric R. Eide, and Ronald G. Ehrenberg. “Does it Pay to Attend an
Elite Private College? Cross-Cohort Evidence on the Effects of College Type on
Earnings.” Journal of Human Resources 34 (Winter 1999).
James, Estelle, Nabeel Alsalam, Joseph C. Conaty, and Duc-Le To. “College Quality and
Future Earnings: Where Should You Send Your Child to College.” American
Economic Association Papers and Proceedings, May.
Mishel, Lawrence, Jared Bernstein, and John Schmitt. The State of Working America.
Ithaca: Cornell University Press, 1999.
Solmon, Lewis C. and Paul Wachtel. “The Effects on Income of Type of College
Attended.” Sociology of Education 48 (Winter 1975):75-90.
Wales, Terence J. “The Effect of College Quality on Earnings: Results from the
NBER-Thorndike Data.” Journal of Human Resources 8: 306-315.
Appendix A
Dummy Variables
Fields of Study-For all fields the observation equals 1 if the subject studied that field in
college, otherwise the observation equals zero.
Field-SC=Sciences and Math
Field-En=Engineering
Field-SS=Social Sciences
Field-CO=Computers
Field-BU=Business
Field-ED=Education
The omitted variable is Field-HU=Humanities. All of the coefficients in front of the Field
dummy variables compare the specific field to a humanities field.
Occupations-For all occupations the observation equals 1 if the subject works in that
occupation, otherwise the observation equals zero.
OCC-PROF=Professional, technical, and kindred
OCC-MAN=Managers, Officials, and Proprietors
OCC-SAL=Sales Workers
OCC-CRA=Craftsmen, Foremen, Operatives and Kindred
OCC-HH=Private Household Workers
OCC-SVC=Service Workers except Private Household
The Omitted Variable is OCC-FARM=Farm workers, Farm Managers, Farm Laborers
and Foremen. All of the coefficients in front of the occupational dummy variables
compare the specific occupation to Farm workers.
Black
The observation is equal to 1 if the subject is black, 0 if otherwise.
South
The observations take on different values depending on the location of the home of the
subject.
Levels of Education
For all levels of education the observation equals 1 if the subject has obtained the specific
level of education, otherwise the observation equals zero.
HSGRAD=High School Graduate
SOMECOL=The subject has had some college
BACH=The subject has graduated college
GRADSCHL=Graduate School
The omitted variable is HSDROP=High School Dropout. The coefficients in front of the
level of education variables compare the specific level of education to a high school dropout.
Top 25
The colleges were ranked by average SAT score of the student body. The observation is equal
to one if the subject attended a college in the top quartile of schools, zero if otherwise.
Top 10
The colleges were ranked by average SAT score of the student body. The observation
equals one if the subject attended a college in the top tenth of schools, zero if otherwise.
Appendix B
Summary Statistics for hourly wages for subjects who graduated college
Group        N     Mean      Std. Dev   Minimum   Maximum
Top 10       68    1977.75   1397.21    275.00    9622.00
Bottom 90    574   1569.81    785.08    231.00    9622.00
Top 25       173   1687.86   1050.91    275.00    9622.00
Bottom 75    469   1585.41    804.36    231.00    9623.00
Appendix C
The First Regression Analysis
Dependent Variable: hourly_wage_log
Analysis of Variance

Source            DF     Sum of Squares   Mean Square   F Value   Pr > F
Model             22     250.29674        11.37712      45.09     <.0001
Error             2320   585.33384         0.25230
Corrected Total   2342   835.63058

Root MSE         0.50229   R-Square   0.2995
Dependent Mean   6.89650   Adj R-Sq   0.2929
Coeff Var        7.28331

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   1     5.94265             2.34114            2.54    0.0112
SOUTH       1    -0.00833             0.00387           -2.15    0.0315
AGE93       1     0.01577             0.10709            0.15    0.8829
agesqu      1    -0.00017423          0.00122           -0.14    0.8865
FIELD_SC    1     0.15518             0.03773            4.11    <.0001
FIELD_EN    1     0.30953             0.29236            1.06    0.2898
FIELD_SS    1     0.04913             0.05059            0.97    0.3316
FIELD_CO    1     0.15363             0.09845            1.56    0.1188
FIELD_BU    1     0.05904             0.03844            1.54    0.1247
FIELD_ED    1    -0.07414             0.03958           -1.87    0.0612
BLACK       1    -0.03558             0.02556           -1.39    0.1641
top25       1     0.02438             0.03915            0.62    0.5335
OCC_PROF    1     0.46105             0.08362            5.51    <.0001
OCC_MAN     1     0.57039             0.08499            6.71    <.0001
OCC_CLER    1     0.24549             0.08166            3.01    0.0027
OCC_SAL     1     0.33536             0.09546            3.51    0.0005
OCC_CRA     1     0.17051             0.08440            2.02    0.0435
OCC_HH      1    -0.37108             0.12537           -2.96    0.0031
OCC_SVC     1     0.05746             0.08402            0.68    0.4941
HSGRAD      1     0.20231             0.03653            5.54    <.0001
SOMECOL     1     0.29010             0.04357            6.66    <.0001
BACH        1     0.40975             0.05173            7.92    <.0001
GRADSCHL    1     0.61356             0.05269           11.64    <.0001
The First Regression Summary Statistics
The MEANS Procedure
Variable      N      Mean          Std Dev       Minimum   Maximum   Miss
hourly_wage   2343   1174.91       780.5697857     10.00   9623.00   2816
SOUTH         5159   -48.7045939    62.4078255   -128.00      1.00      0
AGE93         5159    43.7220392     3.0198196     39.00     49.00      0
FIELD_SC      5159     0.0773406     0.2671570         0      1.00      0
FIELD_EN      5159     0.0013569     0.0368140         0      1.00      0
FIELD_SS      5159     0.0346966     0.1830281         0      1.00      0
FIELD_CO      5159     0.0065904     0.0809213         0      1.00      0
FIELD_BU      5159     0.0738515     0.2615545         0      1.00      0
FIELD_HU      5159     0.0407056     0.1976264         0      1.00      0
FIELD_ED      5159     0.0750145     0.2634403         0      1.00      0
BLACK         5159     0.2828067     0.4504069         0      1.00      0
top25         5159     0.0792789     0.2701998         0      1.00      0
DROPOUT       5159     0.0924598     0.2897020         0      1.00      0
HSGRAD        5159     0.2333786     0.4230221         0      1.00      0
SOMECOL       5159     0.1316147     0.3381041         0      1.00      0
BACH          5159     0.0744330     0.2624997         0      1.00      0
GRADSCHL      5159     0.0858694     0.2801982         0      1.00      0
OCC_PROF      5159     0.1515798     0.3586478         0      1.00      0
OCC_MAN       5159     0.0730762     0.2602867         0      1.00      0
OCC_CLER      5159     0.1703819     0.3760044         0      1.00      0
OCC_SAL       5159     0.0294631     0.1691170         0      1.00      0
OCC_CRA       5159     0.0703625     0.2557817         0      1.00      0
OCC_HH        5159     0.0137624     0.1165143         0      1.00      0
OCC_SVC       5159     0.0827680     0.2755579         0      1.00      0
OCC_FARM      5159     0.0120178     0.1089757         0      1.00      0
Appendix D
The Second Regression Analysis
Dependent Variable: hourly_wage_log
Analysis of Variance

Source            DF    Sum of Squares   Mean Square   F Value   Pr > F
Model             12     17.37674        1.44806       6.07      <.0001
Error             629   149.95519        0.23840
Corrected Total   641   167.33192

Root MSE         0.48826   R-Square   0.1038
Dependent Mean   7.26062   Adj R-Sq   0.0867
Coeff Var        6.72483

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   1    11.18661             4.44167            2.52    0.0120
SOUTH       1    -0.09784             0.04163           -2.35    0.0191
AGE93       1    -0.19267             0.20314           -0.95    0.3432
agesqu      1     0.00226             0.00231            0.98    0.3289
FIELD_SC    1     0.24321             0.05884            4.13    <.0001
FIELD_EN    1     0.27962             0.28521            0.98    0.3273
FIELD_SS    1     0.10904             0.06486            1.68    0.0932
FIELD_CO    1     0.00202             0.22281            0.01    0.9928
FIELD_BU    1     0.07671             0.07366            1.04    0.2981
FIELD_ED    1    -0.02312             0.05217           -0.44    0.6578
BLACK       1     0.01876             0.05364            0.35    0.7266
GRADSCHL    1     0.24144             0.03908            6.18    <.0001
top25       1     0.00924             0.04559            0.20    0.8394
The Second Regression Summary Statistics
Variable      N     Mean           Std Dev       Minimum   Maximum   Miss
Hourly wage   642   1613.0200000   877.9523162    231.00   9623.00    185
SOUTH         827      0.3712213     0.4834239         0      1.00      0
AGE93         827     43.5054414     2.9876107     39.00     49.00      0
FIELD_SC      827      0.1958888     0.3971235         0      1.00      0
FIELD_EN      827      0.0036276     0.0601563         0      1.00      0
FIELD_SS      827      0.1269649     0.3331352         0      1.00      0
FIELD_CO      827      0.0084643     0.0916670         0      1.00      0
FIELD_BU      827      0.1003628     0.3006649         0      1.00      0
FIELD_HU      827      0.1318017     0.3384798         0      1.00      0
FIELD_ED      827      0.2889964     0.4535705         0      1.00      0
BLACK         827      0.1523579     0.3595849         0      1.00      0
GRADSCHL      827      0.5356711     0.4990278         0      1.00      0
Top25         827      0.2877872     0.4530054         0      1.00      0