Large Women Small Pay: An Empirical Study on the Impact of Obesity on White Women
Mario Halasa
May 8th, 2013

I. Introduction
II. Methods and Previous Studies
III. A Model of Weight and Wages
IV. Data
V. Empirical Results
VI. Conclusion
References
Appendix:
- Figure (1) OLS Estimates
- Figures (2 & 3) The Reg Procedure
- Figures (4 & 5) The Means Procedure
- Figure (6) Oaxaca Decomposition
- SAS Code

ABSTRACT

This paper tests the hypothesis that obese white women experience a lower wage than non-obese white women. I first estimate the difference using an OLS regression and then apply an Oaxaca decomposition. I fail to reject the hypothesis that obese white women experience a lower wage than non-obese white women. I find that obese women on average earn an hourly wage that is 8.67% lower, or roughly $1.46 less an hour. The Oaxaca decomposition shows a 16.81% gap in wages between obese and non-obese white women, of which 8.29% is unexplained. [1]

[1] A special thanks to Dr. Renna and Dr. Fang for helping me construct a well written paper.

I. Introduction

In 2011, more than one-third of U.S. adults (35.7%) and approximately 17% of children and adolescents were obese. Of the 75 million obese people in the US, 40 million are women (CDC 2013). Not only does being overweight or obese affect your health, but recent studies have also shown that being obese may have a negative impact on wages (Cawley 2004). This paper tests the hypothesis that obese white women experience a lower wage than non-obese white women. Cawley (2004) concludes that, discrimination aside, there are three main explanations for why obesity and lower wages are correlated. The first explanation is that being obese itself lowers your wages; for example, obese people might be less productive than non-obese people, resulting in a lower wage.
For example, if you are working in retail, a worker's attractiveness may be correlated with how productive they can be. The second is that lower wages cause obesity: if you are poor, the quality of the food you consume may be lower than that of a non-poor person. The third is that unobserved variables cause both obesity and low wages.

It could also be that white women are simply discriminated against because of their weight. Employers and co-workers may hold stereotypes under which working with more slender women fits their preferences, in which case a lower wage might ensue.

My paper uses an OLS regression followed by an Oaxaca decomposition in an attempt to show a wage penalty for obese white women. I chose the Oaxaca-Blinder decomposition because, according to Oaxaca (1973), it is well suited to looking at inter-group differences (such as obese/non-obese) in mean outcomes, separating the part due to differences in the characteristics of the groups from the part due to differences in the coefficients.

Recent literature such as Cawley (2004) and Han et al. (2011) uses data from the National Longitudinal Survey of Youth to show that obesity has a negative impact on wages. This is an appropriate dataset for this study because it follows the same respondents over time, gathering labor market information such as occupation, schooling, and wages for different groups of women from 1979 on.

II. Methods and Previous Studies

Several studies have linked obesity to labor market outcomes, mostly wages. Averett & Korenman (1996) look at economic differences by body mass index for a sample of men and women aged 23-31.
To test the hypothesis that obesity has a negative impact on wages, the authors use cross-sectional data from the NLSY 1979 cohort. They find that obese women have lower family incomes and lower hourly wages than women who fall in the recommended weight-for-height range. Averett & Korenman suggest that women who were obese or overweight at ages 16-24 have, at ages 23-31, lower spousal income and are less likely to be married. For men aged 16 to 24, they find a negative correlation between weight and wages; however, unlike the results for women, their results are not statistically significant. One problem with Averett & Korenman (1996) is that the weight variable could be endogenous, in which case OLS would produce biased estimates of the relationship between obesity and wages.

Pagan & Davila (1997) use OLS to find that obese females make less than their more slender coworkers. Using a Hausman specification test to determine whether their OLS estimate is biased, they fail to reject the hypothesis that weight is uncorrelated with the error term of the wage equation. However, Pagan & Davila call their test into question because their "instruments" (family poverty level, health limitations, and indicator variables about self-esteem) are likely correlated with the error term in the wage equation. They further state that, given that their IV test is probably hindered by the same kind of bias as their OLS, it is not surprising that they fail to reject the hypothesis that the weight variable is exogenous (Pagan & Davila 1997).

Cawley (2000) found that weight lowers the wages of white women, and that a difference in weight of two standard deviations (which he equates to 65 pounds) is associated with a wage penalty of 7%. Cawley (2004) extends the Mincer equation by including a measurement of body weight.
Using data collected from the NLSY, Cawley (2004) finds that weight lowers wages for white females; OLS estimates indicate that a difference of 2 standard deviations in weight (65 pounds) for white women is associated with a 9% decrease in wages. This is equivalent to the wage effect of 1.5 years of education, or what Cawley equates to three years of work experience. When looking at the relationship between weight and wages for other gender-ethnic groups, he finds a negative correlation, which he attributes to unobserved heterogeneity. One thing the paper lacks is an explanation of why the negative effect of obesity on wages seems to differ so dramatically across gender and race (Cawley 2004).

Han et al. (2009) explore the extent to which the effects of obesity on labor market outcomes vary with age, skills, and occupation. Using data from the NLSY along with 1970-1990 census information, they find that women who were obese, relative to underweight or normal weight, had a lower likelihood of employment, except in the case of blacks. Han et al. (2009) end their paper by stating that their results may be biased because endogeneity may exist in the marital status variable and in the number of children a woman has.

Han et al. (2011) examine the direct and indirect effects of body weight in the late-teenage years on wages, taking into account education and occupation choices. Using the Mincer equation accompanied by an IV equation, they conclude that a one-unit increase in BMI is associated with a 1.83% decrease in hourly wage for women; however, after controlling for education and then occupation, the decrease was .67% and .53% respectively (Han et al. 2011). Han et al. (2011) also conclude that there is limited evidence that weight lowers wages for Hispanic women and no evidence that weight lowers wages for black women.

III.
A Model of Weight and Wages

Developed in 1974, the Mincerian wage regression looks at the statistical relationship between market wages, education, and experience. For my model, assume that wages W and body mass index B have the following relationship for individual i at time t:

\[ \ln W_{it} = \beta B_{it} + \gamma X_{it} + \varepsilon_{it} \]

In the above equation, X is a vector of variables that affect wages, such as education and experience, and $\varepsilon$ is the error term. Furthermore, $\beta$ represents the effect of BMI on log wages. As long as body mass index is exogenous, OLS is an appropriate estimator of $\beta$. Estimating the model first through OLS, I set up my equation using a dummy variable to represent obesity. Then, to extend my model and to try something that, as far as I know, has not been done before in this literature, I evaluate the results with an Oaxaca decomposition.

Introduced in 1973, this model explains the gap in the means of an outcome variable between two groups, here obese and non-obese (Oaxaca 1973). As previously stated, this gap is broken down into the part that is due to group differences in the variables and the part that is due to group differences in the coefficients.

\[ \overline{\mathrm{LogWage}}^{NO} = \hat{\alpha}^{NO} + \bar{X}^{NO}\hat{\beta}^{NO} \]
\[ \overline{\mathrm{LogWage}}^{O} = \hat{\alpha}^{O} + \bar{X}^{O}\hat{\beta}^{O} \]

Here the superscripts NO and O denote non-obese and obese women, $\hat{\alpha}$ is the estimated intercept, and $\hat{\beta}$ is the estimated vector of coefficients on X for each group (the "beta" columns of Figure 6), not the BMI coefficient of the first equation.

I ran the log wage equation separately for non-obese and then obese females. It can be shown that the percentage gap is approximately equal to the difference between the mean log wages of non-obese and obese females, so this difference tells you the percentage difference in wages between non-obese and obese white females.

\[ \text{gap}\% = \frac{\mathrm{Wage}^{NO} - \mathrm{Wage}^{O}}{\mathrm{Wage}^{NO}} \]

From here I was able to decompose the percentage gap into two parts, explained and unexplained. The unexplained part is interpreted as the discrimination percentage: of the total gap, a portion is explained by differences in characteristics, and what is left over is attributed to discrimination.

Decomposition:

\[ \overline{\mathrm{LogW}}^{NO} - \overline{\mathrm{LogW}}^{O} = (\bar{X}^{NO} - \bar{X}^{O})\hat{\beta}^{NO} + (\hat{\alpha}^{NO} - \hat{\alpha}^{O}) + (\hat{\beta}^{NO} - \hat{\beta}^{O})\bar{X}^{O} \]

The first term on the right-hand side is the part of the equation that explains the difference in variables. The remaining two terms are the part of the equation that explains the difference in coefficients.

IV. Data: National Longitudinal Survey of Youth

The data used in this study are from the National Longitudinal Survey of Youth (NLSY 79). All respondents were between the ages of 14 and 21 as of December 1979, when the first interviews were being conducted. At the start of the survey, participants reported their gender, age, and ethnicity. For this paper I use weight from the year 2000 along with height, which was recorded in 1985, when the respondents were between the ages of 20 and 27. The measure of weight for this paper is BMI. The CDC defines obesity as a BMI of 30 or higher; I classify a respondent as obese when her BMI is above 30 (CDC 2013). Wage outliers were also removed: any hourly wage under $1 or above $500 is deleted. The variables used in this model are as follows:

Figure 1 - OLS Procedure, Parameter Estimates Included

Variable        Definition                                    Estimate
HGCM            Highest grade completed by mother             .00644
HGCF            Highest grade completed by father             .01081*
Managerial      Occupation                                    .22720*
AFQT            Armed Forces Qualification Test (intelligence) .00176**
Attending       Attending school                              -.02226
Technical       Occupation                                    .22081
Full time       Works more than 20 hours/week                 -.06573
Sales           Occupation                                    .04042
Tenure          Job tenure                                    .00434**
Administrative  Occupation                                    .00270
Service         Occupation                                    -.21050
HGC             Highest grade completed                       .06093**
Weeks worked    Number of weeks worked since past interview   .00633**
Farming         Occupation                                    -.00123
Repair          Occupation                                    .17878
Assemblers      Occupation                                    .04288
Transportation  Occupation                                    .14016
NE              North East Region                             -.04945
NC              North Central Region                          -.15807**
S               Southern Region                               -.11965**
NM              Never Married                                 -.09487
M               Married                                       -.04071
Obese           Dummy Variable                                -.08677**

* Significant at 95%; ** significant at 99%

V.
Empirical Results

When evaluating my results, I fail to reject my hypothesis that obese white women experience a lower wage than non-obese white women. Looking at Figure 1, you can see that obese white women on average earn 8.67% less an hour than their non-obese counterparts, and the coefficient is statistically significant at the 99% confidence level. Converted to a dollar figure, this 8.67% is about $1.46 less an hour.

Other variables that are statistically significant include the Armed Forces Qualification Test (AFQT) score, whose coefficient is .00176, meaning that for every additional point on your AFQT score you make .17% more an hour. The coefficient for highest grade completed (HGC) is .06093, which implies that for every additional grade completed you earn 6.1% more an hour. If you live in the North Central region, you make on average 15.8% less an hour than if you lived in the Western region. The coefficient for the Managerial variable is .22720, which implies that you make on average 23% more an hour than if you worked as a laborer. The last variable I would like to point out is highest grade completed by father (HGCF), whose coefficient is .01081: for each additional year of education your father completed, you make about 1.1% more an hour, holding the other variables constant. What is also interesting is that highest grade completed by mother (HGCM) is not statistically significant.

When evaluating my results through the Oaxaca decomposition, I find that 8.58% of the 16.81% gap in wages is explained and 8.29% is unexplained, or deemed the discriminatory percentage. Referring back to my OLS regression, I found an 8.67% difference between the wage of an obese white woman and that of a non-obese white woman.

VI. Conclusion

This paper measures the correlation between weight and wages of obese white females in the United States.
My hypothesis that obese white females are discriminated against, earning a lower wage than their non-obese counterparts, is not rejected, with statistical significance, when tested through an OLS regression followed by an Oaxaca decomposition. I chose to look specifically at obese white women because past research has found no statistically significant wage differences for obese black or Hispanic women.

As previously stated, if BMI is strictly exogenous, then the OLS estimate of $\beta$ can be regarded as an unbiased estimate of the true effect of BMI on wages. Cawley (2004) suggests otherwise, as variation in BMI may be linked to non-genetic factors such as individual choices and environment. Further, instead of obesity affecting wages, it could be that wages affect obesity, so there could be an issue of simultaneity. Therefore, a two-stage least squares or IV estimation could be run to obtain more reliable results; because of time and preference, it was not.

Looking at my variable selection, next time I would include an experience variable to represent the work experience women had prior to the wage recorded in the NLSY 79. I also need to examine whether reproductive history has an effect on women's wages. To further control for discrimination by obesity, I could separate occupations into those that involve physical labor, like farming, and those that do not, like sales, to see whether there is a pay difference between those jobs. If there is, maybe we can relate obesity more convincingly to productivity and bring more concrete answers to this field of study in economics.

References

Averett, Susan; Korenman, Sanders. The Economic Reality of the Beauty Myth. The Journal of Human Resources. Vol. 31 No. 2. p. 304-330. 1996.
Cawley, John. The Impact of Obesity on Wages. Journal of Human Resources. Vol. 39 No. 2. p. 451-474. 2004.
Centers for Disease Control and Prevention (CDC). http://www.cdc.gov/. 2013.
Comuzzie, A. G., and D. B.
Allison. 1998. "The Search for Human Obesity Genes." Science 280: 1374-77.
Han, Euna; Norton, Edward C.; Stearns, Sally C. Weight and Wages: Fat versus Lean Paychecks. Health Economics. Vol. 18 No. 5. p. 535-548. 2009.
Han, Euna; Norton, Edward C.; Powell, Lisa M. Direct and Indirect Effects of Body Weight on Adult Wages. Economics and Human Biology. Vol. 9 No. 4. p. 381-392. 2011.
Oaxaca, R. 1973. "Male-Female Wage Differentials in Urban Labor Markets." International Economic Review 14: 693-709.
Pagan, Jose A., and Alberto Davila. 1997. "Obesity, Occupational Attainment, and Earnings." Social Science Quarterly 78(3): 756-70.
Wada, Roy; Tekin, Erdal. Body Composition and Wages. NBER Working Paper Series. 2007.

Appendix:

Figure 2 - The Reg Procedure: Obese = 0
Number of Observations - 1131

Variable        DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept       1   1.18166             0.19111         6.18     <.0001
HGCM            1   0.01259             0.00852         1.48     0.1398
HGCF            1   0.00710             0.00623         1.14     0.2545
AFQT            1   0.00210             0.00079453      2.64     0.0084
Attending       1   -0.00130            0.07922         -0.02    0.9869
Managerial      1   0.21428             0.14067         1.52     0.1280
Technical       1   0.25493             0.15967         1.60     0.1106
Sales           1   0.05313             0.14568         0.36     0.7154
Administrative  1   -0.00847            0.14072         -0.06    0.9520
Service         1   -0.25331            0.14326         -1.77    0.0773
Farming         1   -0.14482            0.24062         -0.60    0.5474
Repair          1   0.20608             0.16575         1.24     0.2140
Assemblers      1   0.05031             0.15599         0.32     0.7471
Transportation  1   0.06386             0.18239         0.35     0.7263
NE              1   -0.06519            0.05384         -1.21    0.2262
NC              1   -0.19212            0.04700         -4.09    <.0001
S               1   -0.16101            0.04676         -3.44    0.0006
NM              1   -0.10932            0.06724         -1.63    0.1043
M               1   -0.04989            0.03906         -1.28    0.2017
Full_time       1   -0.06718            0.05242         -1.28    0.2002
Tenure          1   0.00032997          0.00005749      5.74     <.0001
HGC             1   0.05843             0.00924         6.33     <.0001
Weeks_worked    1   0.00660             0.00118         5.59     <.0001

Figure 3 - The Reg Procedure: Obese = 1
Number of Observations - 315

Variable        DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept       1   0.99329             0.29175         3.40     0.0008
HGCM            1   -0.00957            0.01259         -0.76    0.4480
HGCF            1   0.02067             0.00989         2.09     0.0374
AFQT            1   0.00081590          0.00130         0.63     0.5313
Attending       1   -0.14287            0.17201         -0.83    0.4069
Managerial      1   0.23282             0.17903         1.30     0.1945
Technical       1   0.13645             0.19303         0.71     0.4802
Sales           1   -0.02189            0.18406         -0.12    0.9054
Administrative  1   0.02802             0.17668         0.16     0.8741
Service         1   -0.08047            0.18007         -0.45    0.6553
Farming         1   0.06193             0.23079         0.27     0.7886
Repair          1   0.04211             0.22484         0.19     0.8516
Assemblers      1   -0.02943            0.21079         -0.14    0.8890
Transportation  1   0.41005             0.28190         1.45     0.1469
NE              1   -0.02062            0.09784         -0.21    0.8332
NC              1   -0.04284            0.07743         -0.55    0.5805
S               1   0.01800             0.09285         0.23     0.8167
NM              1   -0.04266            0.09285         -0.46    0.6463
M               1   0.00229             0.06857         0.03     0.9733
Full_time       1   -0.02894            0.09335         -0.31    0.7568
Tenure          1   0.00039658          0.00009148      4.34     <.0001
HGC             1   0.06671             0.01474         4.53     <.0001
Weeks_worked    1   0.00475             0.00198         2.40     0.0172

Figure 4 - The Means Procedure: Obese = 0

Variable        N     Mean          Std Dev       Minimum     Maximum
ID              1131  3484.29       2673.26       14.00       12140.00
HGCM            1131  12.0433245    2.2766372     0           20.00
HGCF            1131  12.2166225    3.1373621     0           20.00
Race            1131  3.00          0             3.00        3.00
Gender          1131  2.00          0             2.00        2.00
AFQT            1131  56.6523979    25.9574777    1.0330      100.00
Height          1131  64.8328912    2.5015776     58.00       72.00
Attending       1131  0.0415561     0.1996609     0           1.00
Occupation      1131  271.5402299   216.3108320   5.00        889.00
Full_time       1131  0.8938992     0.3081028     0           1.00
Weight          1131  141.8479222   21.2444650    84.00       215.00
Tenure          1131  306.1679929   290.6697303   1.00        1202.00
Wage            1131  16.8985500    14.7889623    1.00        192.30
Region          1131  2.5075155     0.9724816     1.00        4.00
Marital         1131  1.3492485     0.9724916     1.00        4.00
HGC             1131  13.9071618    2.3685349     6.00        20.00
Weeks_worked    1131  45.9460654    14.0473822    0           52.00
Lnheight        1131  4.1710698     0.0385712     4.0604430   4.2766661
Lnweight        1131  4.9436129     0.1494941     4.4308168   5.3706380
Female          1131  1.00          0             1.00        1.00
Lnwage          1131  2.6065148     0.6397782     0.0099503   5.2590567
Bmi             1131  23.6945400    3.0640512     14.4169922  29.9831383
White           1131  1.00          0             1.00        1.00
NE              1131  0.1724138     0.3779068     0           1.00
NC              1131  0.3227233     0.467727      0           1.00
S               1131  0.3297966     0.4703471     0           1.00
W               1131  0.1750663     0.3801919     0           1.00
NM              1131  0.0751547     0.2637575     0           1.00
M               1131  0.7038019     0.4567814     0           1.00
Managerial      1131  0.4235190     0.4943346     0           1.00
Technical       1131  0.0380195     0.1913278     0           1.00
Sales           1131  0.0946065     0.2928002     0           1.00
Administrative  1131  0.2166225     0.4121254     0           1.00
Service         1131  0.1220159     0.3274490     0           1.00
Farming         1131  0.0061892     0.0784624     0           1.00
Repair          1131  0.0274094     0.1633453     0           1.00
Assemblers      1131  0.0415561     0.1996609     0           1.00
Transportation  1131  0.0167993     0.1285756     0           1.00
Laborers        1131  0.0132626     0.1144477     0           1.00

Figure 5 - The Means Procedure: Obese = 1

Variable        N     Mean          Std Dev       Minimum     Maximum
ID              315   3467.26       2716.60       3.00        12135.00
HGCM            315   11.4603175    2.5565290     0           18.00
HGCF            315   11.5142857    3.1026420     1.00        20.00
Race            315   3.00          0             3.00        3.00
Gender          315   2.00          0             2.00        2.00
AFQT            315   52.8785524    25.9718977    0.2490      99.8190
Height          315   64.3238095    2.8165635     53.00       73.00
Attending       315   0.0222        0.1476401     0           1.00
Occupation      315   305.1460317   207.0388074   8.00        888.00
Full_time       315   0.9111        0.2850361     0           1.00
Weight          315   210.1492063   43.0173268    150.00      600.00
Tenure          315   321.2317460   304.0358682   1.00        1181.0
Wage            315   13.3238703    9.2683291     1.790       102.560
Region          315   2.55          0.9060        1.00        4.00
Marital         315   1.222         0.9975199     0           6.00
HGC             315   13.4190       2.3162990     7.00        20.00
Weeks_worked    315   46.133        2.31639       0           52.00
Lnheight        315   4.1629693     0.0429705     3.9702919   4.2904594
Lnweight        315   5.3311550     0.1752794     5.0106353   6.3969297
Female          315   1.00          0             1.00        1.00
Lnwage          315   2.4384262     0.5242298     0.5822156   4.6304480
Bmi             315   35.6074312    6.2682797     30.0354004  91.2197232
White           315   1.00          0             1.00        1.00
NE              315   0.1238095     0.3298882     0           1.00
NC              315   0.3587302     0.4803909     0           1.00
S               315   0.3587302     0.4803909     0           1.00
W               315   0.1619048     0.3689495     0           1.00
NM              315   0.1492063     0.3568586     0           1.00
M               315   0.6634921     0.4732667     0           1.00
Managerial      315   0.2984127     0.4582896     0           1.00
Technical       315   0.0730159     0.2605765     0           1.00
Sales           315   0.1142857     0.3186642     0           1.00
Administrative  315   0.2317460     0.4226190     0           1.00
Service         315   0.1555556     0.3630101     0           1.00
Farming         315   0.0253960     0.1575775     0           1.00
Repair          315   0.0285714     0.1668637     0           1.00
Assemblers      315   0.0380952     0.1917308     0           1.00
Transportation  315   0.0126984     0.1121476     0           1.00
Laborers        315   0.01904776    0.1369099     0           1.00

Figure 6 - Oaxaca Decomposition

Variable        Mean obese=0  Mean obese=1  Beta obese=0  Beta obese=1  Explained      Unexplained
ID              0             0             0             0
HGCM            12.0433       11.4603       0.01259       -0.00957      0.00733997     0.253960248
HGCF            12.2166       11.5142       0.0071        0.02067       0.00498704     -0.156247694
race            0             0             0             0
gender          0             0             0             0
AFQT            56.6523       52.8785       0.0021        0.0008159     0.00792498     0.067901282
height          64.8328       64.3238       0             0
attending       0.0416        0.0222        -0.0013       -0.14287      -0.00002522    0.003142854
occupation      271.5402      305.146       0             0
full time       0.8939        0.9111        -0.06718      -0.02894      0.001155496    -0.034840464
weight          141.8479      210.1492      0             0
tenure          306.167       321.2317      0.0003299     0.0003965     -0.004970602   -0.021397244
wage            16.8986       13.3239       0             0
region          2.5075        2.5555        0             0
marital         1.3492        1.2222        0             0
HGC             13.9072       13.419        0.05843       0.06671       0.028525526    -0.11110932
weeks worked    45.9461       46.1333       0.0066        0.00475       -0.00123552    0.085346605
male            0             0             0             0
female          0             0             0             0
lnwage          2.6065        2.4384        0             0
bmi             23.6945       35.0674       0             0
hsp             0             0             0             0
blk             0             0             0             0
white           0             0             0             0
NE              0.1724        0.1238        -0.06519      -0.2062       -0.003168234   0.017457038
NC              0.3227        0.3587        -0.19212      -0.04284      0.00691632     -0.053546736
S               0.3298        0.3555        -0.16101      0.018         0.004137957    -0.063638055
W               0.1751        0.1619        0             0
NM              0.0752        0.1492        -0.10932      -0.04266      0.00808968     -0.009945672
M               0.7038        0.6634        -0.04989      0.00229       -0.002015556   -0.034616212
married         0             0             0             0
nottogether     0             0             0             0
managerial      0.4235        0.2984        0.21428       0.23282       0.026806428    -0.005532336
technical       0.038         0.0731        0.25493       0.13645       -0.008948043   0.008660888
sales           0.0946        0.1143        0.05313       -0.02189      -0.001046661   0.008574786
administrative  0.2166        0.2371        -0.00847      0.02802       0.000173635    -0.008651779
service         0.122         0.1555        -0.25331      -0.08047      0.008485885    -0.02687662
farming         0.0062        0.0234        -0.14482      0.06193       0.002490904    -0.00483795
repair          0.0274        0.0286        0.20608       0.04211       -0.000247296   0.004689542
assemblers      0.0416        0.0381        0.05031       -0.02943      0.000176085    0.003038094
transportation  0.0168        0.0127        0.06386       0.41005       0.000261826    -0.004396613
laborers        0.0132        0.019
Total                                                                   0.0858146      -0.082865358

SAS Coding

data one;
  set mario;

  /* Drop invalid weight and height, then take logs. */
  if weight <= 0 then delete;
  if height <= 0 then delete;
  lnheight = log(height);
  lnweight = log(weight);

  male = 0;
  female = 0;

  /* Rescale wage and AFQT, then drop wages under $1 or over $500 an hour. */
  if wage < 0 then delete;
  wage = wage/100;
  AFQT = AFQT/1000;
  if wage < 1 then delete;
  if wage > 500 then delete;
  lnwage = log(wage);

  /* Keep women only. */
  if gender = 1 then male = 1;
  if male = 1 then delete;
  if gender = 2 then female = 1;

  /* Negative values are NLSY missing-data codes. */
  if gender < 0 then delete;
  if HGCM < 0 then delete;
  if HGCF < 0 then delete;
  if AFQT < 0 then delete;
  if attending = -4 then attending = 0;
  if attending = -5 then delete;
  if occupation < 0 then delete;
  if full_time < 0 then delete;
  if tenure < 0 then delete;
  if region < 0 then delete;
  if marital < 0 then delete;
  if HGC < 0 then delete;
  if weeks_worked < 0 then delete;

  /* BMI from pounds and inches; obese if BMI is above 30. */
  bmi = (weight*703)/(height*height);
  if bmi > 30 then obese = 1; else obese = 0;

  /* Race, region, and marital status dummies. */
  if race = 1 then hsp = 1; else hsp = 0;
  if race = 2 then blk = 1; else blk = 0;
  if race = 3 then white = 1; else white = 0;
  if region = 1 then NE = 1; else NE = 0;
  if region = 2 then NC = 1; else NC = 0;
  if region = 3 then S = 1; else S = 0;
  if region = 4 then W = 1; else W = 0;
  if marital = 0 then NM = 1; else NM = 0;
  if marital = 1 then M = 1; else M = 0;
  /* Marital codes 2, 3, and 6 are grouped as not together. */
  if marital in (2, 3, 6) then nottogether = 1; else nottogether = 0;

  /* Occupation dummies from occupation codes; laborers are the omitted group. */
  if 3 <= occupation <= 199 then managerial = 1; else managerial = 0;
  if 203 <= occupation <= 235 then technical = 1; else technical = 0;
  if 243 <= occupation <= 285 then sales = 1; else sales = 0;
  if 303 <= occupation <= 389 then administrative = 1; else administrative = 0;
  if 403 <= occupation <= 469 then service = 1; else service = 0;
  if 473 <= occupation <= 499 then farming = 1; else farming = 0;
  if 503 <= occupation <= 699 then repair = 1; else repair = 0;
  if 703 <= occupation <= 799 then assemblers = 1; else assemblers = 0;
  if 803 <= occupation <= 859 then transportation = 1; else transportation = 0;
  if 863 <= occupation <= 889 then laborers = 1; else laborers = 0;
run;

/* Restrict the sample to white women. */
data white;
  set one;
  if white = 0 then delete;
run;

/* Pooled OLS with the obese dummy (Figure 1). */
proc reg data=white;
  model lnwage = HGCM HGCF AFQT attending managerial technical sales
                 administrative service farming repair assemblers
                 transportation NE NC S NM M full_time tenure HGC
                 weeks_worked obese;
run;

/* Group means by obesity status (Figures 4 and 5). */
proc sort data=white;
  by obese;
run;

proc means data=white;
  by obese;
run;
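
The decomposition terms in Figure 6 and the percentages quoted in Section V can be checked by hand. The following is a minimal Python sketch, not the SAS code used for the paper. The helper names (oaxaca_terms, log_points_to_percent, bmi_from_pounds_inches) are hypothetical and introduced only for illustration, and the inputs are copied from Figures 1, 4, and 6. It assumes that the dollar figure comes from applying the obesity penalty to the non-obese mean wage, which the paper does not state explicitly.

```python
import math


def oaxaca_terms(mean_no, mean_o, beta_no, beta_o):
    """Per-variable Oaxaca terms, using non-obese coefficients as the reference.

    explained   = (mean_no - mean_o) * beta_no   (difference in variables)
    unexplained = (beta_no - beta_o) * mean_o    (difference in coefficients)
    """
    explained = (mean_no - mean_o) * beta_no
    unexplained = (beta_no - beta_o) * mean_o
    return explained, unexplained


def log_points_to_percent(coef):
    """Exact percentage change implied by a coefficient in a log-wage equation."""
    return (math.exp(coef) - 1.0) * 100.0


def bmi_from_pounds_inches(weight_lb, height_in):
    """BMI from pounds and inches, the same formula as the SAS data step."""
    return weight_lb * 703.0 / (height_in * height_in)


# AFQT row of Figure 6: means and betas for obese=0 and obese=1.
afqt_explained, afqt_unexplained = oaxaca_terms(56.6523, 52.8785, 0.0021, 0.0008159)

# Obese dummy coefficient from Figure 1.
obese_coef = -0.08677
exact_percent = log_points_to_percent(obese_coef)

# Assumption: the penalty is applied to the non-obese mean wage from Figure 4.
dollar_gap = abs(obese_coef) * 16.89855

if __name__ == "__main__":
    print(f"AFQT explained:   {afqt_explained:.8f}")
    print(f"AFQT unexplained: {afqt_unexplained:.9f}")
    print(f"Exact percent effect of obesity: {exact_percent:.2f}%")
    print(f"Approximate dollar gap per hour: {dollar_gap:.2f}")
    print(f"BMI at 150 lb and 65 in: {bmi_from_pounds_inches(150, 65):.2f}")
```

With the Figure 6 inputs, the AFQT terms come out at about 0.0079 and 0.0679, matching the explained and unexplained columns. The exact percentage implied by a log-wage coefficient b is exp(b) - 1, so the obese coefficient corresponds to roughly -8.3%, slightly smaller in magnitude than the 8.67% obtained by reading the coefficient directly as a percentage.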