AN ECONOMETRIC ANALYSIS OF THE EFFECTS OF INTERNET USE AT WORK ON HOURLY WAGES

Written By Nicolae Cristea
Submitted to Professors H. Barreto and F. Howland
In Partial Completion of the Requirements for Economics 31
April 17, 2000

Abstract

This paper uses Current Population Survey data to examine whether workers who use the Internet at work earn a higher wage rate than otherwise similar workers who do not use the Internet at work. A multivariate regression model is used to control for variables that might be correlated with Internet use and earnings. The estimates show that Internet users earn on average 16 to 18 percent higher wages, depending on the worker characteristics. This study also offers some support for the technology-based hypothesis of increasing wage inequality in the 1980s and 1990s.

Table of Contents

I. Introduction
II. Literature Review
III. Theoretical Analysis
IV. Empirical Results
   A. The Data
   B. Presentation and Interpretation of Empirical Analyses
V. Conclusion
Bibliography

I. Introduction

An increase in wage inequality since the 1970s has been documented by a large number of researchers. This increase has been labeled a platitude of labor economics and found to have been sharp in the early 1980s, to have tapered off in the late 1980s, and to have reaccelerated in the 1990s (Bernstein and Mishel, 3)¹. Alan B. Krueger (1993), in his paper How Computers Have Changed the Wage Structure: Evidence From Microdata, 1984-1989, lists two leading hypotheses that have emerged to explain this increase in wage inequality. The first hypothesis is that increased international competition and trade have hurt the economic position of low-skilled and less-educated workers in the United States. The second hypothesis is that rapid, skill-biased technological change in the 1980s caused profound changes in the relative productivity of various types of workers. This paper focuses on the technological hypothesis for the increase in wage inequality.
Current Population Survey (CPS) data sets will be used to explore the impact of technology on wages. In particular, I will consider whether employees who use the Internet at work earn higher wages. The analysis is similar to Krueger’s (1993) work and uses the same methodology. He uses the “computer revolution” as a prototypical example of technological change in the 1980s. I use the “Internet revolution” as a prototypical example of technological change in the 1990s. This study is intended to explore the relationship between Internet use at work and wages.

The remainder of the paper is organized as follows. In section II I will discuss the important research already done on this topic. In section III I will give some theoretical background for the claims made in this paper. Section IV will be the main part of the paper, where I will perform the empirical analysis, followed by the conclusion in section V.

¹ Bernstein, Jared and Mishel, Lawrence, "Has Wage Inequality Stopped Growing?", Monthly Labor Review, December 1997

II. Literature Review

Although extensive work has been done to explain the increasing wage inequality of recent decades, almost no attention has been paid to the emergence of the Internet as a major medium in the 1990s. Research supporting or questioning the technological hypothesis for wage inequality has focused on the more general aspect of computer use. By far the most important analysis of how computer use has influenced wages is Alan B. Krueger’s How Computers Have Changed the Wage Structure: Evidence From Microdata, 1984-1989. His paper focuses on whether employees who use computers at work earn more as a result of applying their computer skills, and whether the premium for using a computer can account for much of the change in the wage structure.
After controlling for a number of variables such as experience, race, and sex, Krueger finds that computer use accounts for about 10-15% higher earnings, depending on the kind of worker, year, and control variables included. In addition to answering the main question, the author addresses issues of possible bias and analyzes the impact of computer use on other wage differentials.

Krueger (1993) uses a semi-log form for his multivariate regression model. This is a functional form that is very frequently used in models for wages. In particular, in the model that he estimates, the natural log of observation i's wage rate, $\ln W_i$, is assumed to depend on $C_i$ (a dummy variable that equals one if the ith individual uses a computer at work, and zero otherwise), a vector of observed characteristics $X_i$, and an error term $\varepsilon_i$:

$$\ln W_i = X_i \beta + C_i \alpha + \varepsilon_i$$

where $\beta$ and $\alpha$ are parameters to be estimated. I have summarized Krueger's (1993) findings in the table below.

                                   October 1984             October 1989
Independent Variable               Coefficient   SE         Coefficient   SE
Intercept                          0.75          0.023      0.905         0.024
Uses computer at work (1 = yes)    0.17          0.008      0.188         0.008
Years of Education                 0.069         0.001      0.075         0.002
Experience                         0.027         0.001      0.027         0.001
Black (1 = yes)                    -0.098        0.013      -0.121        0.013
Female (1 = yes)                   -0.162        0.012      -0.172        0.012
Married (1 = yes)                  0.156         0.011      0.159         0.011
Union Member (1 = yes)             0.181         0.009      0.182         0.010

Note: Not all independent variables are included in the table

Table II.1 Krueger's (1993) OLS Regression Estimates of the Effect of Computer Use on Pay

The author concludes that within the framework of his analysis, the computer dummy variable (Uses Computer at Work) has a sizable and statistically significant effect on wages. I have replicated Krueger’s analysis using data from the 1997 CPS Computer Ownership/Internet Supplement.
The following table gives the estimates that I have obtained:

                                   October 1997             October 1998
Independent Variable               Coefficient   SE         Coefficient   SE
Intercept                          0.877         0.023      Computer Use at Work
Uses computer at work (1 = yes)    0.225         0.008      not Surveyed!
Years of Education                 0.076         0.001
Experience                         0.025         0.001
Black (1 = yes)                    -0.098        0.013
Female (1 = yes)                   -0.152        0.012
Married (1 = yes)                  0.148         0.012
Union Member (1 = yes)             0.162         0.011

Note: Not all independent variables are included in the table

Table II.2 October 1997 OLS Regression Estimates of the Effect of Computer Use on Pay

My findings provide evidence that the computer use wage premium that Krueger calculated for 1984 and 1989 was still in place in 1997 and, if anything, had increased.

Concerning omitted variable bias, Krueger (1993) tries a number of empirical strategies to probe whether the computer pay differential is a real consequence of computer use or is caused by some omitted variable. First, he looks at more homogeneous groups of workers. He considers a sample of secretaries, one of the occupational groups defined in the CPS. He finds that for secretaries, the estimated computer use wage premium is at least as large as the return to one additional year of schooling. In addition to the CPS data set, Krueger examined data from a second data set, the High School and Beyond Survey, which allowed him to consider a more comprehensive set of personal characteristics. His conclusion was similar to the one from the CPS estimates: "Computer use at work is an important determinant of earnings" (Krueger, 50). It is theoretically possible that Krueger is not considering some variable that is correlated with computer use and, thus, is responsible for the observed wage premium. However, such a variable is far from obvious, and the author has considered nearly everything available that could be correlated with computer use.
A replication of Krueger’s analysis, using a different data set, is “The Returns to Computer Use Revisited: Have Pencils Changed the Wage Structure Too?” by John E. DiNardo and Jörn-Steffen Pischke. In addition to estimating the wage differential associated with the use of a computer at work, the authors use data on German workers to estimate wage differentials associated with the use of a calculator, a telephone, writing materials, or sitting on the job. They find that the wage differentials associated with these “white-collar” tools are almost as large as those measured for computer use. DiNardo and Pischke conclude that “the results seem to suggest that computer users possess unobserved skills which might have little to do with computers, but which are rewarded in the labor market, or that computers were first introduced in higher paying occupations or jobs” (DiNardo and Pischke, 292). In interpreting their results, the authors determined that a direct link between the effect of computer use on wages and changes in the wage structure is weak at best.

Krueger's study dealt with this problem by estimating premiums for computer use outside the workplace. If computer users in general, and not only computer users at work, possessed skills that are rewarded in the labor market, then the wage premium for computer use outside the workplace should have been very close to the wage premium for computer use at work. Krueger found that this was not the case. In my study, I will similarly deal with the possibility that Internet use in general, rather than Internet use at work, is rewarded in the labor market².

Another study that challenges the validity of the technology-based hypothesis of the growth of earnings inequality is Computers and the Wage Structure by Michael J. Handel.
By finding that most of the growth of the inequality in wages took place in the early 1980s and that computer use at work had equalizing impacts on the gender wage gap, as well as decreasing the wage gaps between education groups, Handel (1999) questions the validity of Krueger’s (1993) findings. However, the increase in wage inequality did not stop in the early 1980s. Bernstein and Mishel (1997) found that "the sharpest increase was in the early 1980s, followed by a flattening in the second half of the 1980s and a reacceleration in the 1990s." This roughly coincides with the development of new technologies and the introduction of computers at work in the past two decades.

John Bound’s and George Johnson’s Changes in the Structure of Wages in the 1980s: An Evaluation of Alternative Explanations reports important findings as well. After thoroughly analyzing a number of alternative explanations, the authors conclude that the primary cause of increasing wage inequality is technological advances. Furthermore, Stephen Machin and John Van Reenen, in their paper Technology and Changes in Skill Structure: Evidence From Seven OECD Countries, find that there have been shifts in relative labor demand that have favored skilled workers on an international level. They conclude that changes in the wage and employment distribution are closely tied to technical changes.

There is considerable support from researchers in the field for the technological hypothesis for wage inequality. Krueger (1993) went a step further by pointing out the return to a more specific part of technology: computers. He found a positive, statistically significant wage premium for computer use. Although questions about possible omitted variable bias remain, Krueger made a thorough effort to consider the relevant variables available. In a similar fashion I will look at Internet use at work. Bound and Johnson (1992) pointed out that technology is the reason behind increasing wage inequality.
In addition, Bernstein and Mishel (1997) provide evidence for increasing wage inequality in the 1990s. The development of the Internet in the 1990s has symbolized technological change in this period. My analysis will link this development of the Internet (as part of the technological development) and the increase in wage inequality.

² See Internet Use at Home and at Work in section IV.B

III. Theoretical Analysis

The development of the Internet as a major communications medium started in the early 1990s. Ever since then, the growth of the Internet has been astounding. The Internet Software Consortium estimates the number of Internet hosts to have grown from 1,313,000 in January 1993 to 72,398,092 in January 2000. A graph of the “learning curve” type growth pattern during this period is given below in Figure III.1.

Figure III.1 Number of Internet Hosts 1993-2000³

³ Source: Internet Software Consortium (http://www.isc.org/)

Today, in the US, the Internet has penetrated nearly every sphere of human activity. In the workplace, the Internet is used for a variety of tasks, ranging from instant communications and file exchange via email to actual monetary transactions. Such a large shock to the traditional workplace must have had significant implications for the parties involved. More precisely, this paper investigates the effects of Internet usage in the workplace on hourly wages.

In the context of the broader idea of increased wage inequality in the 1990s, the Internet use at work explanation fits the technology hypothesis described in the introduction of this paper. It can be claimed that Internet use at work increases the productivity of the workers who use it and, hence, increases their wages. The problem with such a claim is that a simple comparison of users and non-users may be confounded by other variables such as education, experience, race, or gender. Running a multivariate regression model that controls for confounding variables is necessary.
The linear functional form of the multivariate regression model that I will use to estimate the coefficient on the independent variable that I am interested in is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$$

where $Y$ is the dependent variable, $\beta_i$ is the coefficient on the independent variable $X_i$, and $\varepsilon$ is an error term.

Special attention needs to be paid to the origin of the error term. The error term in the model above represents the influence of omitted variables that cannot be measured, as well as pure measurement error. In order for the above model to be a good tool for estimating slope coefficients, the error term needs to be generated in a very specific way. Namely, the error term has to represent a draw from the Standard Econometric Gaussian Error Box.

In order for hypothesis testing using the estimates obtained with the model above to be valid, certain assumptions about the Standard Econometric Gaussian Error Box Model have to hold. Firstly, the average of the “tickets” in the box has to be zero. This means that the measurement process that generated the data has to be unbiased. Secondly, each measurement has to be independent of every other measurement. A prediction of the next draw based on any other draw has to be impossible. Thirdly, each measurement has to face the same array of possible errors. In other words, the errors need to be identically distributed.

The third assumption is a problem for the linear functional form of the multivariate regression model for wages, because the spread of the errors is unlikely to be the same for all workers. More precisely, heteroscedasticity will cause the reported OLS SEs to be biased, and OLS will no longer be BLUE (Best Linear Unbiased Estimator). In order to deal with heteroscedasticity, I will use a semi-log functional form for the model. Such a functional form is very frequently used in models for wages. In addition, this is the form that Krueger (1993) uses in his analysis of computer usage at work and wages.
Thus, the regression equation will take the following form:

$$\ln Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$$

where $Y$ is the dependent variable, $\beta_i$ is the coefficient on the independent variable $X_i$, and $\varepsilon$ is an error term. I will use the semi-log functional form for the majority of the models that I estimate in this paper.

IV. Empirical Results

A. The Data

In my empirical analysis I will use data from the 1998 and 1997 Current Population Survey Computer/Internet Use Supplements, which I obtained using the data extraction tool available at http://ferret.bls.census.gov/. The sample is restricted to the respondents who reported earnings data, who represent about one fourth of the initial data set. The sample is further restricted to respondents who were in the civilian labor force and reported the number of hours usually worked during the week. In order to minimize possible confounding, I also restricted the sample to workers who held one main job. This lets me look at a more homogeneous population. It is also necessary because no separate data on Internet use is available for the respondents who worked multiple jobs. In other words, it is not clear at which of these jobs they are or are not using the Internet. After these requirements were imposed, the sample size was 11983 for 1997 and 11388 for 1998.

The dependent variable in my analysis is Hourly Wage. I have built this variable using a number of Earnings and Labor Force variables from the data set. As mentioned above, only one fourth of the respondents were asked earnings questions. Among these, hourly workers reported their hourly wage, and that is the value that I assigned to Hourly Wage for those respondents. Non-hourly workers reported weekly earnings. For these respondents, I divided the weekly earnings reported by the number of hours worked during the week at the main job.
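The construction of Hourly Wage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical and are not the actual CPS variable names, and the respondents in the example are made up.

```python
from typing import Optional


def hourly_wage(paid_hourly: bool,
                reported_hourly_wage: Optional[float],
                weekly_earnings: Optional[float],
                usual_weekly_hours: Optional[float]) -> Optional[float]:
    """Return an hourly wage in dollars, or None if it cannot be built.

    Argument names are hypothetical, not CPS field names.
    """
    if paid_hourly:
        # Hourly workers: use the hourly wage they reported.
        return reported_hourly_wage
    if weekly_earnings is None or not usual_weekly_hours:
        # Missing earnings, or missing/zero hours: no wage can be computed.
        return None
    # Non-hourly workers: weekly earnings divided by weekly hours at the main job.
    return weekly_earnings / usual_weekly_hours


# Two made-up respondents: one hourly worker and one salaried worker.
print(hourly_wage(True, 12.50, None, 40.0))
print(hourly_wage(False, None, 800.0, 40.0))
```

Returning None for respondents without usable earnings or hours mirrors the sample restrictions described above, under which such respondents are excluded rather than assigned a wage.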
Following Krueger’s example, I set a lower limit for the hourly wage at $3.00 (Krueger's limit was $1.50, but his analysis is of the 1980s). This further reduced the sample to 11266 for 1998 and 11860 for 1997. It is also worth mentioning that the “hourly wage” reported by hourly workers in the data sets was top coded at $99.99 and weekly earnings were top coded at $2884.61. In order to partially address this, I deleted 6 entries from the 1997 sample and 5 entries from the 1998 sample for which the Hourly Wage was greater than or equal to $99.

The independent variable that will be the focus of my analysis is Uses Internet at Work, which is a dummy variable equaling 1 if the respondent reported using the Internet at work and 0 otherwise. There are a number of issues about this variable that need to be discussed:

1) The universe for the 1998 Uses Internet at Work variable is “Internet use outside the home”. Thus, the “out of universe” values are respondents that do not use the Internet outside their homes. For the purposes of this paper, these individuals were coded as not using the Internet at work.

2) The universe for the 1997 Uses Internet at Work variable is “computer use at work”. Thus those who did not report using a computer at work were also coded as not using the Internet. This was valid to a great extent in 1997, as ways of accessing the Internet other than a computer were not widely available. However, this might become a valid concern in the future.

The variables that I will use in my analysis are given in the table below:

Nr.  Variable Name                         Variable Description
1    Hourly Wage                           Dollars earned per hour. Hourly wage for hourly workers; weekly earnings divided by number of hours worked at main job for non-hourly workers.
2    Uses Internet at Work                 A dummy variable equaling 1 if the respondent uses the Internet at work and zero otherwise.
3    Education Less than 8th Grade         A dummy variable equaling 1 if the respondent's highest level of school completed was less than 8th grade and zero otherwise.
4    Education High School No Diploma      A dummy variable equaling 1 if the respondent's highest level of school completed was less than 12th grade, or 12th grade but no diploma, and zero otherwise.
5    Education Some College No Degree      A dummy variable equaling 1 if the respondent's highest level of school completed was some college but no degree and zero otherwise.
6    Education Associate Degree            A dummy variable equaling 1 if the respondent's highest level of school completed was an associate degree and zero otherwise.
7    Education Bachelor's Degree           A dummy variable equaling 1 if the respondent's highest level of school completed was a bachelor's degree and zero otherwise.
8    Education Master's Degree             A dummy variable equaling 1 if the respondent's highest level of school completed was a master's degree and zero otherwise.
9    Education Prof. School or Doctorate   A dummy variable equaling 1 if the respondent's highest level of school completed was professional school or a doctorate degree and zero otherwise.
10   Experience                            Experience in years as of the year surveyed. Calculated as Age - Education in Years - 6.
11   Experience Squared                    Experience^2
12   Black                                 Dummy variable. Demographics - race of respondent. 1 = Black, 0 = Other race
13   American Indian                       Dummy variable. Demographics - race of respondent. 1 = American Indian, 0 = Other race
14   Asian                                 Dummy variable. Demographics - race of respondent. 1 = Asian, 0 = Other race
15   Married                               Dummy variable. Demographics - marital status. 1 = Married, 0 = otherwise
16   Female                                Dummy variable. Demographics - sex. 1 = Female, 0 = Male
17   Married*Female                        Dummy variable. Interaction term.
18   Part-time Worker                      Dummy variable. 1 = Part-time Worker, 0 = Full-time Worker
19   Lives in the North-East               Dummy variable. 1 = Lives in the North-East, 0 = Lives outside of the North-East region
20   Lives in the South                    Dummy variable. 1 = Lives in the South, 0 = Lives outside of the South region
21   Lives in the West                     Dummy variable. 1 = Lives in the West, 0 = Lives outside of the West region
22   Private Sector Worker                 Dummy variable. 1 = Works in the Private Sector of the Economy, 0 = Works in the Government Sector
23   Years of Education                    Constructed variable based on the highest level of schooling variable.
24   Uses Internet at Home                 A dummy variable equaling 1 if the respondent uses the Internet at home and zero otherwise.

Table IV.A.1 Variable Description

The summary statistics for the variables considered are given below:

                                      October 1997 (n = 11854)         October 1998 (n = 11261)
Variable Name                         Mean    SD      Max     Min      Mean    SD      Max     Min
Hourly Wage                           13.1    8.10    92.3    3        14.16   9.23    96.15   3
Uses Internet at Work                 0.16    0.37    1       0        0.2     0.4     1       0
Education Less than 8th Grade         0.03    0.17    1       0        0.03    0.17    1       0
Education High School No Diploma      0.09    0.29    1       0        0.1     0.3     1       0
Education Some College No Degree      0.2     0.4     1       0        0.2     0.4     1       0
Education Associate Degree            0.08    0.28    1       0        0.09    0.29    1       0
Education Bachelor's Degree           0.19    0.39    1       0        0.18    0.38    1       0
Education Master's Degree             0.06    0.23    1       0        0.07    0.25    1       0
Education Prof. School or Doctorate   0.02    0.15    1       0        0.02    0.15    1       0
Experience                            19.31   12.70   77      0        19.64   12.69   77      0
Experience Squared                    534.3   599.1   5929    0        546.8   597.5   5929    0
Black                                 0.1     0.3     1       0        0.1     0.3     1       0
American Indian                       0.01    0.1     1       0        0.01    0.11    1       0
Asian                                 0.04    0.19    1       0        0.04    0.2     1       0
Married                               0.59    0.49    1       0        0.59    0.49    1       0
Female                                0.49    0.5     1       0        0.49    0.5     1       0
Married*Female                        0.28    0.45    1       0        0.28    0.45    1       0
Union Member                          0.14    0.34    1       0        0.14    0.35    1       0
Part-time Worker                      0.82    0.38    1       0        0.82    0.38    1       0
Lives in the North-East               0.21    0.41    1       0        0.61    0.49    1       0
Lives in the South                    0.3     0.46    1       0        0.22    0.42    1       0
Lives in the West                     0.24    0.43    1       0        0.23    0.42    1       0
Private Sector Worker                 0.84    0.37    1       0        0.84    0.37    1       0
Years of Education                    13.06   2.74    20      0        13.12   2.81    20      0
Uses Internet at Home                 0.17    0.38    1       0        0.33    0.47    1       0

Table IV.A.2 Summary Statistics

In 1997, 16% of the respondents were using the Internet at work. In 1998 this number rose to 20%. The respective numbers for Internet use at home are 17% and 33%. The number of home Internet users was growing more quickly than the number of "at work" Internet users. In both years, 10% of the respondents were African American and 49% were female; 19% and 18%, respectively, held a bachelor's degree as their highest level of education.

B. Presentation and Interpretation of Empirical Analyses

General Findings and the Validity of the Box Model

I will initially run a linear multivariate regression equation relating the observed independent variables to the dependent variable:

$$
\begin{aligned}
\text{Hourly Wage} = {} & \beta_0 + \beta_1\,\text{Uses Internet at Work} + \beta_2\,\text{Education Less than 8th Grade} \\
& + \beta_3\,\text{Education High School No Diploma} + \beta_4\,\text{Education Some College} + \beta_5\,\text{Education Associate Degree} \\
& + \beta_6\,\text{Education Bachelor's Degree} + \beta_7\,\text{Education Master's Degree} + \beta_8\,\text{Education Professional School or Doctorate} \\
& + \beta_9\,\text{Experience} + \beta_{10}\,\text{Experience Squared} + \beta_{11}\,\text{Black} + \beta_{12}\,\text{American Indian} + \beta_{13}\,\text{Asian} \\
& + \beta_{14}\,\text{Female} + \beta_{15}\,\text{Married*Female} + \beta_{16}\,\text{Married} + \beta_{17}\,\text{Lives in the North East} \\
& + \beta_{18}\,\text{Lives in the South} + \beta_{19}\,\text{Lives in the West} + \beta_{20}\,\text{Union Member} \\
& + \beta_{21}\,\text{Private Sector Worker} + \beta_{22}\,\text{Part-time Worker} + \varepsilon
\end{aligned}
$$

where $\beta_0$ to $\beta_{22}$ are the coefficient terms for the independent variables and $\varepsilon$ is an error term. JMP outputs the following parameter estimates:

                                      October 1997                   October 1998
Variable Name                         Estimate   SE      t-stat      Estimate   SE      t-stat
Intercept                             4.903      0.304   16.110      4.922      0.366   13.460
Uses Internet at Work                 3.124      0.171   18.300      2.787      0.190   14.670
Education Less than 8th Grade         -3.866     0.357   -10.820     -3.516     0.428   -8.210
Education High School No Diploma      -1.058     0.224   -4.730      -1.217     0.268   -4.530
Education Some College No Degree      1.145      0.167   6.850       1.636      0.202   8.100
Education Associate Degree            2.758      0.229   12.040      2.700      0.266   10.150
Education Bachelor's Degree           6.452      0.178   36.270      7.222      0.219   32.970
Education Master's Degree             8.444      0.280   30.180      9.856      0.315   31.260
Education Prof. School or Doctorate   11.794     0.413   28.540      14.086     0.483   29.160
Experience                            0.347      0.016   22.280      0.403      0.019   21.030
Experience Squared                    -0.005     0.000   -17.050     -0.007     0.000   -16.460
Black                                 -1.351     0.202   -6.700      -1.201     0.244   -4.920
American Indian                       0.327      0.591   0.550       -0.324     0.638   -0.510
Asian                                 -0.511     0.317   -1.610      -0.921     0.354   -2.600
Married                               2.103      0.178   11.780      1.789      0.217   8.260
Female                                -1.458     0.185   -7.900      -2.173     0.222   -9.810
Married*Female                        -2.043     0.239   -8.540      -1.621     0.288   -5.630
Union Member                          1.775      0.183   9.680       1.202      0.217   5.540
Part-time Worker                      1.271      0.170   7.500       1.243      0.205   6.060
Lives in the North-East               0.939      0.173   5.430       0.935      0.204   4.570
Lives in the South                    -0.313     0.162   -1.940      -0.134     0.193   -0.690
Lives in the West                     0.416      0.169   2.460       0.381      0.204   1.870
Private Sector Worker                 0.540      0.173   3.120       1.014      0.208   4.870

Table IV.B.1 Parameter Estimates for the linear regression model

In this case I am assuming that the error terms in the equation that I have estimated are generated by the Standard Econometric Gaussian Error Box Model. The error terms account for the influences of omitted variables and measurement error that took place during data generation. In order for the results obtained with the aid of this model to be useful, however, the requirements of the Standard Econometric Gaussian Error Box Model have to hold. I have already listed the requirements of this model in the theoretical analysis section of this paper. The problem here is that the error terms are not identically distributed. We cannot see the error terms, but we can see the residuals.
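Because the residuals stand in for the unobserved error terms, one informal check is to compare their spread across groups of workers. The snippet below is a minimal sketch of that idea, assuming the residuals have already been exported from the regression software; the function name, group labels, and residual values are made up for illustration and are not taken from the CPS samples.

```python
from collections import defaultdict
from statistics import stdev


def residual_sd_by_group(groups, residuals):
    """Return the sample SD of the residuals within each group label."""
    buckets = defaultdict(list)
    for label, resid in zip(groups, residuals):
        buckets[label].append(resid)
    # A sample SD needs at least two observations per group.
    return {label: stdev(values) for label, values in buckets.items()
            if len(values) > 1}


# Made-up residuals for two hypothetical education groups.
education = ["High School"] * 3 + ["Bachelor's Degree"] * 3
residuals = [-1.0, 0.0, 1.0, -4.0, 0.0, 4.0]
print(residual_sd_by_group(education, residuals))
```

If the SDs differ markedly between groups, as they do in this made-up example, the identically-distributed-errors requirement of the box model is in doubt.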
The table below gives the SD's of the Hourly Wage residuals for the different levels of education in 1997 and 1998.

SD's of Hourly Wage Residuals
Education                                    1997     1998
Educ. <= 8th Grade                           3.44     4.19
Educ. = High School, No Diploma              5.21     3.86
Educ. = High School                          5.02     5.07
Educ. = Some College, No Degree              5.73     6.37
Educ. = Associate Degree                     6.42     7.17
Educ. = Bachelor's Degree                    8.28     10.03
Educ. = Master's Degree                      8.54     10.71
Educ. = Professional School or Doctorate     10.70    13.93

Table IV.B.2 SD's of Hourly Wage Residuals, linear functional form

As can be seen, the spread of the residuals generally increases as education increases. This is strong visual evidence that heteroscedasticity is present. To be certain, however, I will conduct the Goldfeld-Quandt test for detecting heteroscedasticity. I will use the continuous Years of Education variable to create "high" and "low" dispersion groups. For 1997, I have obtained a G-Q statistic of 9.08 with a P-value extremely close to 0. Similarly, for 1998, I have obtained a G-Q statistic of 15.99 with a P-value extremely close to 0. This clearly demonstrates the presence of heteroscedasticity in both samples.

In order to ameliorate the problem of heteroscedasticity, I will turn to a different functional form for my regression equation. A semi-log specification would be a more appropriate way to estimate the coefficients. Krueger (1993) uses a semi-log specification in his analysis.
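The logic of the Goldfeld-Quandt test used above can be sketched as follows. This is a simplified illustration and not the JMP procedure behind the reported statistics: it assumes a single regressor that also serves as the sorting variable, drops a middle share of the observations (the 20% default is an arbitrary choice), and uses made-up data whose error spread grows with the regressor. The function names are hypothetical.

```python
def ols_rss(x, y):
    """Fit y = a + b*x by OLS and return the residual sum of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)  # must be nonzero: x needs variation
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, drop_frac=0.2):
    """Return (G-Q statistic, degrees of freedom of each group).

    Observations are sorted by x, the middle drop_frac share is dropped,
    and separate regressions are fitted to the low-x and high-x groups.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    n_each = (n - int(n * drop_frac)) // 2
    low, high = pairs[:n_each], pairs[n - n_each:]
    df = n_each - 2  # two estimated parameters: intercept and slope
    rss_low = ols_rss([p[0] for p in low], [p[1] for p in low])
    rss_high = ols_rss([p[0] for p in high], [p[1] for p in high])
    return (rss_high / df) / (rss_low / df), df


# Made-up data: the size of the error grows with years of education.
years = list(range(1, 21))
wages = [2.0 + 0.5 * x + 0.1 * x * (1 if x % 2 else -1) for x in years]
gq_stat, df = goldfeld_quandt(years, wages)
print(gq_stat, df)
```

Under the null hypothesis of equal error variances the statistic follows an F distribution with (df, df) degrees of freedom, so values well above 1 point to a larger error spread in the high group.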
The equation that I will estimate for 1997 and 1998 is the following:

$$
\begin{aligned}
\ln(\text{Hourly Wage}) = {} & \beta_0 + \beta_1\,\text{Uses Internet at Work} + \beta_2\,\text{Education Less than 8th Grade} \\
& + \beta_3\,\text{Education High School No Diploma} + \beta_4\,\text{Education Some College} + \beta_5\,\text{Education Associate Degree} \\
& + \beta_6\,\text{Education Bachelor's Degree} + \beta_7\,\text{Education Master's Degree} + \beta_8\,\text{Education Professional School or Doctorate} \\
& + \beta_9\,\text{Experience} + \beta_{10}\,\text{Experience Squared} + \beta_{11}\,\text{Black} + \beta_{12}\,\text{American Indian} + \beta_{13}\,\text{Asian} \\
& + \beta_{14}\,\text{Female} + \beta_{15}\,\text{Married*Female} + \beta_{16}\,\text{Married} + \beta_{17}\,\text{Lives in the North East} \\
& + \beta_{18}\,\text{Lives in the South} + \beta_{19}\,\text{Lives in the West} + \beta_{20}\,\text{Union Member} \\
& + \beta_{21}\,\text{Private Sector Worker} + \beta_{22}\,\text{Part-time Worker} + \varepsilon
\end{aligned}
$$

where $\beta_0$ to $\beta_{22}$ are the coefficient terms for the independent variables and $\varepsilon$ is an error term. The JMP output for this equation is the following:

                                      October 1997                   October 1998
Variable Name                         Estimate   SE      t-stat      Estimate   SE      t-stat
Intercept                             1.788      0.019   92.660      1.836      0.021   89.360
Uses Internet at Work                 0.198      0.011   18.340      0.184      0.011   17.200
Education Less than 8th Grade         -0.346     0.023   -15.270     -0.304     0.024   -12.660
Education High School No Diploma      -0.143     0.014   -10.110     -0.152     0.015   -10.080
Education Some College No Degree      0.086      0.011   8.080       0.100      0.011   8.830
Education Associate Degree            0.214      0.015   14.750      0.191      0.015   12.800
Education Bachelor's Degree           0.439      0.011   38.980      0.431      0.012   35.060
Education Master's Degree             0.535      0.018   30.190      0.556      0.018   31.400
Education Prof. School or Doctorate   0.670      0.026   25.560      0.701      0.027   25.830
Experience                            0.027      0.001   26.850      0.028      0.001   25.610
Experience Squared                    0.000      0.000   -21.140     0.000      0.000   -20.440
Black                                 -0.115     0.013   -8.960      -0.090     0.014   -6.570
American Indian                       -0.004     0.037   -0.120      -0.022     0.036   -0.610
Asian                                 -0.034     0.020   -1.710      -0.052     0.020   -2.620
Married                               0.144      0.011   12.730      0.109      0.012   8.920
Female                                -0.106     0.012   -9.050      -0.145     0.012   -11.620
Married*Female                        -0.115     0.015   -7.550      -0.072     0.016   -4.420
Union Member                          0.159      0.012   13.630      0.123      0.012   10.120
Part-time Worker                      0.177      0.011   16.450      0.185      0.012   16.100
Lives in the North-East               0.063      0.011   5.740       0.056      0.011   4.850
Lives in the South                    -0.023     0.010   -2.250      -0.016     0.011   -1.510
Lives in the West                     0.016      0.011   1.450       0.021      0.011   1.810
Private Sector Worker                 0.014      0.011   1.310       0.036      0.012   3.110

Table IV.B.3 Parameter estimates for the semi-log regression model

The table below gives the new SD's for the Hourly Wage residuals.

SD's of Hourly Wage Residuals
Education                                    1997     1998
Educ. <= 8th Grade                           0.35     0.40
Educ. = High School, No Diploma              0.30     0.34
Educ. = High School                          0.36     0.32
Educ. = Some College, No Degree              0.37     0.42
Educ. = Associate Degree                     0.41     0.44
Educ. = Bachelor's Degree                    0.41     0.48
Educ. = Master's Degree                      0.38     0.47
Educ. = Professional School or Doctorate     0.42     0.56

Table IV.B.3 SD's of Hourly Wage Residuals, semi-log functional form

Visually, there is not enough evidence that heteroscedasticity has been completely eliminated. However, it has clearly been ameliorated. The increase in the spread of residuals with the increase in education is not nearly as dramatic as with the previous functional form. I will turn once again to the Goldfeld-Quandt test for determining the presence of heteroscedasticity. For 1997, I have obtained a G-Q statistic of 3.13 with a P-value close to 0. Similarly, for 1998, I have obtained a G-Q statistic of 3.2 with a P-value extremely close to 0.
Heteroscedasticity is still present, although it has been greatly decreased. In a more accurate analysis, correcting the heteroscedasticity should be considered.

Hypothesis Testing

Based on the estimates of the coefficients of the independent variables that I have obtained using a multivariate semi-log regression model (Table IV.B.3), I can perform the following hypothesis test for 1997 and 1998:

1) NULL: $\beta_1 = 0$. Holding the other independent variables constant, Uses Internet at Work has no effect on Hourly Earnings.
ALTERNATIVE: $\beta_1 \neq 0$. Holding the other independent variables constant, Uses Internet at Work has an effect on Hourly Earnings.

The t-statistic reported by JMP for 1997 is 18.34 and for 1998 is 17.20. The JMP reported P-value for both years is less than .0001. This means that if the null hypothesis were true, the probability of getting a result like the one in the samples above or even more extreme is less than .0001. Thus, the regression shows a statistically significant relationship between Uses Internet at Work and Hourly Wages.

To assess the economic importance of the estimated coefficients, I will look at two individuals who are identical except that one of them uses the Internet at work. For this purpose I will consider a hypothetical female from the 1997 sample who has a college degree, has 19.31 years of experience (the sample mean), is white, married, not a union member, lives in the North East, works in the private sector of the economy, and is a full-time worker. If she happened to be using the Internet at work, my model predicts that she was making an hourly wage of $14.80. The same person, if she was not using the Internet, is predicted to make $12.40, which is substantially lower (about 16% less). To emphasize the importance of the Internet wage premium, I will look into the following situation.
If I assume that the person who is using the Internet did not have a 4 year bachelor's degree, but only a 2 year associate degree, she would still be making $11.88, which is close to what the person with a 4 year degree who is not using the Internet is making ($12.40). This suggests that the return to Internet use at work may be nearly as important as two years of college education.

In the model above I found the differential in hourly pay between workers who use the Internet at work and those who do not to be 21.9% (exp(0.1984)-1) in 1997 and 20.1% (exp(0.1835)-1) in 1998.

In addition, I should point out the negative estimates for a number of coefficients of the independent variables. Namely, in both years, the coefficients for Education Less than 8th Grade, Education High School No Diploma, Black, Asian, Female, Married*Female, and Lives in the South were negative. Individuals with these characteristics are predicted to be disadvantaged in the labor market. Most of these coefficient estimates are statistically significant, with JMP reporting P-values of less than 0.001. The exceptions are Asian and Lives in the South, whose t-statistics in Table IV.B.3 are too small in absolute value to support significance at that level.

Internet Use at Home and at Work

A concern that has been pointed out in the literature review section of this paper is that Internet users may possess unobserved skills that are correlated with Internet use and are rewarded in the labor market. This is not an issue of omitted variable bias, but rather an issue of whether it is Internet use at work that is rewarded and not some skill that Internet users in general have. In order to clarify this issue, I am going to estimate another model in which I will include two additional independent variables: Internet Use at Home, and Internet Use at Home*Internet Use at Work.
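Because the model is semi-log, a coefficient b translates into a proportional wage differential of exp(b)-1, and the effect for someone who uses the Internet in both places is the sum of the two main coefficients plus the interaction. The short Python sketch below illustrates that arithmetic with the 1997 estimates reported for this model in this section (0.187 at work, 0.097 at home, -0.028 for the interaction). The function names are illustrative only, and the calculation is a reading aid rather than part of the original analysis.

```python
import math


def percent_premium(log_points):
    """Convert a log-wage differential into a proportional wage differential."""
    return math.exp(log_points) - 1


def internet_premiums(b_work, b_home, b_both):
    """Premiums relative to a worker who uses the Internet in neither place."""
    return {
        "work only": percent_premium(b_work),
        "home only": percent_premium(b_home),
        # Both dummies equal 1, so the interaction coefficient is added as well.
        "work and home": percent_premium(b_work + b_home + b_both),
    }


# 1997 coefficient estimates: at work, at home, interaction term.
premiums_1997 = internet_premiums(0.187, 0.097, -0.028)
for group, premium in premiums_1997.items():
    print(f"{group}: {premium:.1%}")
```

The same `percent_premium` conversion applied to 0.1984 and 0.1835 gives the 1997 and 1998 differentials quoted for the model without the home-use variables.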
If workers are indeed rewarded for unobserved skills that are associated with Internet use, then one would expect the coefficient for Internet Use at Home*Internet Use at Work to be the largest, followed by Internet Use at Home and Internet Use at Work at roughly equal values. In other words, if it is not Internet use at work but the unobserved skills of Internet users that are rewarded, then it does not matter where a person uses the Internet: the coefficient will be positive either way. Below is JMP's output for this particular model:

                                        October 1997                 October 1998
Variable Name                      Estimate    SE    t-stat     Estimate    SE    t-stat
Intercept                            1.773   0.019   91.610       1.798   0.021   86.840
Uses Internet at Work                0.187   0.013   13.890       0.176   0.014   12.820
Education Less than 8th Grade       -0.343   0.023  -15.160      -0.289   0.024  -12.070
Education High School No Diploma    -0.143   0.014  -10.100      -0.149   0.015   -9.930
Education Some College No Degree     0.078   0.011    7.360       0.088   0.011    7.800
Education Associate Degree           0.206   0.015   14.210       0.176   0.015   11.840
Education Bachelor's Degree          0.427   0.011   37.720       0.407   0.012   32.910
Education Master's Degree            0.514   0.018   28.810       0.522   0.018   29.400
Education Prof. School or Doctorate  0.653   0.026   24.930       0.672   0.027   24.860
Experience                           0.027   0.001   27.040       0.028   0.001   25.830
Experience Squared                   0.000   0.000  -21.070       0.000   0.000  -20.290
Black                               -0.108   0.013   -8.470      -0.070   0.014   -5.080
American Indian                      0.004   0.037    0.110      -0.017   0.036   -0.470
Asian                               -0.029   0.020   -1.430      -0.051   0.020   -2.590
Married                              0.140   0.011   12.430       0.102   0.012    8.410
Female                              -0.103   0.012   -8.820      -0.137   0.012  -11.060
Married*Female                      -0.114   0.015   -7.520      -0.077   0.016   -4.820
Union Member                         0.159   0.012   13.730       0.123   0.012   10.140
Part-time Worker                     0.183   0.011   17.060       0.198   0.011   17.230
Lives in the North-East              0.061   0.011    5.610       0.050   0.011    4.410
Lives in the South                  -0.024   0.010   -2.360      -0.019   0.011   -1.760
Lives in the West                    0.013   0.011    1.180       0.017   0.011    1.500
Private Sector Worker                0.013   0.011    1.220       0.036   0.012    3.100
Uses Internet at Home                0.097   0.012    7.740       0.116   0.010   11.650
Use Int. @Home*Use Int. @Work       -0.028   0.022   -1.230      -0.001   0.020   -0.070
Table IV.B.3 Parameter estimates for the semi-log regression model, including Internet use at home

The coefficients that are important for this particular analysis are those for Uses Internet at Work, Uses Internet at Home, and their interaction (Use Int. @Home*Use Int. @Work). The results show that the wage premium is greatest for those who use the Internet at work, followed at a considerable margin by those who use it at home. The coefficient on the interaction term, which applies to those who use the Internet both at work and at home, is negative and statistically insignificant. This provides evidence that if our estimated coefficient for Internet use is close to the true one, then it is Internet use at work that is rewarded and not some skill that Internet users have. This particular analysis does not say anything about the confounding that might be in place due to any omitted variables, but rather comments on the meaning of the coefficient already obtained, be it biased or not.

V. Conclusion

The notion of the Internet is readily associated with technology, change, and productivity.
In this framework of ideas, Internet use at work is expected to have an important, positive impact on hourly wages. This paper provides empirical evidence for this claim, obtained from the Current Population Survey Computer/Internet Use Supplements of 1997 and 1998. I conducted hypothesis tests and examined the economic importance of the effect of Internet use at work on wages.

First, I estimated an Ordinary Least Squares (OLS) linear multivariate regression model to find the extent of the effect of Internet use at work on wages, after controlling for a number of variables that are generally perceived to have an impact on wages. I had to turn to a different functional form because the linear model did not satisfy the requirements of the Standard Econometric Gaussian Error Box Model. Namely, the results were affected by heteroscedasticity. I therefore estimated an OLS semi-log functional form multivariate regression model. The regression estimate of this model finds a statistically significant relationship between Internet use at work and wages. The estimate also has economic importance, as I found the differential in hourly pay associated with Internet use at work to be around 20 percent.

Next, I addressed the issue of whether it is Internet use at work, and not some other skill associated with Internet use in general, that is rewarded. Here, again, I estimated an OLS semi-log functional form multivariate regression model, this time including two new independent variables for Internet use at home. I find Internet use at home to have a statistically significant effect on wages as well, but the coefficient estimate is considerably lower than the coefficient for Internet use at work. Therefore, I conclude that, biased or not, the coefficient for Internet use at work stands for returns on Internet use at work and not on skills related to Internet use in general.

A main concern in my analysis is the presence of heteroscedasticity.
Although I have been able to reduce its extent by turning to a different functional form, it is still present in the model that I am using to do hypothesis testing. An improvement of my analysis would thus be to eliminate heteroscedasticity and comment on the new estimates.

Although I have found statistically and economically significant relationships between Internet use at work and wages, I cannot make any strong claims about the validity of the technology-based hypothesis of increasing wage inequality in the 1990s. I have merely shown that Internet use at work positively influences wages. Whether the great technological advances of the 1990s have increased the wage gap remains to be seen in a much broader, more inclusive study.

Bibliography

Krueger, Alan B. "How Computers Have Changed the Wage Structure: Evidence from Microdata, 1984-1989." Quarterly Journal of Economics, February 1993, 108:33-60

DiNardo, John, and Pischke, Jörn-Steffen. "The Returns to Computer Use Revisited: Have Pencils Changed the Wage Structure Too?" Quarterly Journal of Economics, February 1997, 112:291-303

Handel, Michael J. "Computers and the Wage Structure." Working Paper No. 285, The Jerome Levy Economics Institute, October 1999

Bernstein, Jared, and Mishel, Lawrence. "Has Wage Inequality Stopped Growing?" Monthly Labor Review, December 1997, 3-16

Bound, John, and Johnson, George. "Changes in the Structure of Wages in the 1980's: An Evaluation of Alternative Explanations." The American Economic Review, June 1992, 82:371-392

Machin, Stephen, and Van Reenen, John. "Technology and Changes in Skill Structure: Evidence From Seven OECD Countries." The Quarterly Journal of Economics, November 1998, 113:1215-1244