Do Former College Athletes Earn Higher Wages?

by Daniel J. Henderson, Alex Olbrecht, and Solomon Polachek [1]
Department of Economics
State University of New York at Binghamton
May 2, 2004

Abstract: In this paper, we apply the Li-Racine Generalized Kernel Estimation procedure to measure how participation in college athletics affects earnings. We find no effect on earnings but some effect on occupational choice.

JEL Classification: Semiparametric and Nonparametric Methods, Labor and Demographic Economics, Wage Determination

[1] Special thanks to James Long for providing the data set and original SAS code.

Introduction

Approximately 350,000 individuals participate in NCAA sports every academic year, of whom a select few will become professional athletes.[2] Colleges and universities have faced a budget crunch in recent years and have had to make funding choices between supporting these athletes and supporting academics. Understanding the effects of athletic participation will allow administrators to make more informed funding decisions.

Long and Caudill (1991) argued that time spent playing sports can be a form of human capital investment. Athletes are said to learn discipline, maintain better health, acquire teamwork skills, gain a stronger drive to succeed, and develop a better work ethic. If athletic time is an investment in human capital, then this cohort should earn a wage premium after graduation. However, following Becker’s (1965) allocation of time model, athletes may receive this benefit only if athletic time replaces leisure time rather than study time when they choose among the competing alternatives of studying, leisure, and athletic training. This assumes that academic investments at least outweigh what is learned during athletic time. Long and Caudill’s is the only study that has focused on the monetary effects of athletic participation in college. Using Nelson’s maximum likelihood approach, they obtained a wage benefit for males of about $650 from participation in college athletics, measured six years after graduation.
We use a newer and more flexible approach to analyze this data set and answer the same question. The advantage of generalized kernel estimation is that the data drive the estimates, so no strong assumptions on the underlying functional form are required. Since assumptions about the underlying distribution are not required, the model is better able to conform to the data, creating a better fit. This nonparametric technique also leads to more insightful results (a full discussion follows in the next section). As far as we know, this is the only study to use this approach to analyze the impact of intercollegiate athletics. We also investigate whether athletic participation in college has any effect on occupational choice.

The paper is organized as follows. We first replicate Long and Caudill’s approach. Next, we apply an ordered logistic model to the data to make the nonparametric results easier to follow for readers unfamiliar with this approach. We then discuss the nonparametric results. Finally, we investigate occupational choices made by athletes.

Methodology

The nonparametric technique used is the Li-Racine Generalized Kernel Estimation procedure. For more information see Racine and Li (2003) and Li and Racine (2003).[3] This procedure employs three different kernel estimators, depending upon whether a regressor is a continuous, ordered, or unordered variable.

[2] NCAA.org reports that in 1998-1999 207,592 men and 145,832 women participated in intercollegiate sports.
[3] The software used was n©, available from http://faculty.maxwell.syr.edu/jracine.

Local linear estimation can be thought of in the same light as ordinary least squares. The major difference is that OLS estimates the best fit regression line through the entire data set. The local linear approach estimates the best fit regression line at each observation. More specifically, one orders points on a vector, from least to greatest. Next, consider a point xi.
The nonparametric approach estimates the best fit line through xi using the points located between xi-h and xi+h. The variable h is known as the window width. The optimal window widths are calculated by minimizing a cross validation function, which the reader can loosely think of as similar to a likelihood function. The optimal selection of the window widths gives the best fit local estimators. The procedure assumes that the window width is constant for all observations. Since these regression lines can differ across windows, the net effect is that, over the entire data set, a very nonlinear looking estimate is created. If, however, the linear parametric specification is the underlying model, generalized kernel estimation will yield the same results. Ultimately, this procedure yields a coefficient for every observation, whereas OLS assumes one coefficient per variable for all points.

A more technical explanation now follows. First consider a nonparametric regression model:

$$ y_i = m(x_i) + u_i, \qquad i = 1, 2, \ldots, NT \tag{1} $$

where $m$ is assumed to be a smooth function whose form is unknown. Note that if $m$ is a linear function in its parameters, this yields the linear parametric model. Define $x_i = (x_i^c, x_i^o, x_i^u)$, where $x_i^c$ is a continuous random vector of dimension $q$, $x_i^o$ is a $p \times 1$ vector of regressors that take ordered discrete values (e.g. number of kids), and $x_i^u$ is an $r \times 1$ vector of regressors that take discrete but unordered values (e.g. two digit occupation codes). Taking a first order Taylor expansion of equation (1) around $x_j^c$ yields:

$$ y_i \approx m(x_j) + (x_i^c - x_j^c)' \beta(x_j) + u_i \tag{2} $$

where $\beta(x_j)$ is defined to be the partial derivative of $m(x)$ with respect to $x^c$, evaluated at $x_j$.
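The intuition above can be sketched in a few lines of Python for the simplest case of a single continuous regressor. This is only an illustrative sketch, not the n© software used in the paper: the function names (`local_linear_fit`, `cv_score`, `select_bandwidth`) are hypothetical, the bandwidth is chosen over a user-supplied grid rather than by numerical optimization, and discrete regressors are ignored.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the continuous kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def local_linear_fit(x0, x, y, h, exclude=None):
    """Kernel-weighted least squares of y on (1, x - x0).

    Returns (m_hat, beta_hat): the fitted level and slope at x0.
    If `exclude` is an index, that observation gets zero weight
    (the leave-one-out version of the estimator).
    """
    w = gaussian_kernel((x - x0) / h)
    if exclude is not None:
        w = w.copy()
        w[exclude] = 0.0
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = X.T * w                      # each column of X.T scaled by its weight
    delta = np.linalg.solve(XtW @ X, XtW @ y)
    return delta[0], delta[1]

def cv_score(h, x, y):
    """Leave-one-out least squares cross validation criterion for bandwidth h."""
    resid = [
        y[j] - local_linear_fit(x[j], x, y, h, exclude=j)[0]
        for j in range(len(x))
    ]
    return float(np.mean(np.square(resid)))

def select_bandwidth(x, y, grid):
    """Pick the bandwidth on `grid` with the smallest cross validation score."""
    scores = [cv_score(h, x, y) for h in grid]
    return grid[int(np.argmin(scores))]
```

Calling `local_linear_fit(x[j], x, y, h)` at every observation `j` returns one slope per observation, which is the sense in which the procedure yields a coefficient for every data point. When the data are exactly linear, the fitted slope is the same at every point for any bandwidth, matching the remark that the kernel estimator reproduces the linear parametric model when that model is correct.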
To obtain the window widths, or bandwidths, we first compute the leave-one-out local linear kernel estimator, which is defined as:

$$ \delta_{-j}(x_j) = \left[ \sum_{i \neq j} K_h \begin{pmatrix} 1 & (x_i^c - x_j^c)' \\ (x_i^c - x_j^c) & (x_i^c - x_j^c)(x_i^c - x_j^c)' \end{pmatrix} \right]^{-1} \sum_{i \neq j} K_h \begin{pmatrix} 1 \\ (x_i^c - x_j^c) \end{pmatrix} y_i \tag{3} $$

where

$$ K_h = \prod_{s=1}^{q} h_s^{-1}\, w\!\left( \frac{x_{si}^c - x_{sj}^c}{h_s} \right) \prod_{s=1}^{r} l^u(x_{si}^u, x_{sj}^u, \lambda_s^u) \prod_{s=1}^{p} l^o(x_{si}^o, x_{sj}^o, \lambda_s^o) \tag{4} $$

Here $K_h$ is the product kernel function (see Pagan and Ullah 1999) with bandwidth $h_s = h_s(NT)$ associated with the $s$th component of $x^c$. The function $w$ is the standard normal kernel function, $l^u$ is a version of Aitchison and Aitken’s (1976) kernel function,[4] and $l^o$ is the Wang and Van Ryzin (1961) kernel function.[5] The first element of $\delta_{-j}(x_j)$ is the leave-one-out estimate $\hat{m}_{-j}(x_j)$. Next, using the leave-one-out estimator, we select $(h, \lambda^u, \lambda^o)$ to minimize the least squares cross validation function

$$ CV(h, \lambda^u, \lambda^o) = \frac{1}{NT} \sum_{j} \left[ y_j - \hat{m}_{-j}(x_j) \right]^2 \tag{5} $$

[4] Takes the value one if $x_{si}^u = x_{sj}^u$, otherwise $\lambda_s^u$.
[5] Takes the value one if $x_{si}^o = x_{sj}^o$, otherwise $(\lambda_s^o)^{|x_{si}^o - x_{sj}^o|}$.

Finally, the optimal window widths are used to estimate $\hat{\delta}(x)$ by:

$$ \hat{\delta}(x) = \begin{pmatrix} \hat{m}(x) \\ \hat{\beta}(x) \end{pmatrix} = \left[ \sum_{i} K_{\hat{h}} \begin{pmatrix} 1 & (x_i^c - x^c)' \\ (x_i^c - x^c) & (x_i^c - x^c)(x_i^c - x^c)' \end{pmatrix} \right]^{-1} \sum_{i} K_{\hat{h}} \begin{pmatrix} 1 \\ (x_i^c - x^c) \end{pmatrix} y_i \tag{6} $$

where

$$ K_{\hat{h}} = \prod_{s=1}^{q} \hat{h}_s^{-1}\, w\!\left( \frac{x_{si}^c - x_{s}^c}{\hat{h}_s} \right) \prod_{s=1}^{r} l^u(x_{si}^u, x_{s}^u, \hat{\lambda}_s^u) \prod_{s=1}^{p} l^o(x_{si}^o, x_{s}^o, \hat{\lambda}_s^o) \tag{7} $$

Data

The data come from the Cooperative Institutional Research Program (CIRP) and were collected at two points in time. The survey covered college freshmen in 1971 and had one follow-up in 1980, six years after expected graduation. Details can be found in Astin (1982). Information was collected on income in 1980, family background, activities such as drug use, and athletic participation. Definitions of all variables can be found in Appendix A. This investigation used virtually the same data set as the Long and Caudill (1991) study. Observations were dropped for a variety of reasons. Following the previous study, individuals reporting no income were dropped.
Additionally, since outliers can have a devastating impact on bandwidth selection, observations reporting an ACT score of zero were dropped. Zero scores were interpreted to mean that the individual did not take the ACT exam. This left 4,209 males in the sample, of whom 646 (or about 15 percent) earned a varsity letter in college. Specifically, the question asked on the follow-up survey was whether an individual earned a varsity letter in a sport; the sport itself is not identified. The athletic division is also unknown, since a respondent’s college is unreported. Therefore the athletic participation variable, athlete, is equal to one if an individual participated in a varsity sport in college for four years, and zero otherwise. No distinction is made between Divisions I-A, I-AA, I-AAA, II and III, but Long and Caudill (1991) made a rather convincing argument that most athletes probably did not come from schools specializing in “big-time college athletics.”

To account for other factors that may influence earnings, many other independent variables were used, following the lead of Long and Caudill (1991). College majors and occupations were used as controls. Differences in grades, intelligence, and the will to succeed were included.

Our empirical model modifies Mincer’s earnings function in several ways. Mincer’s model was estimated using ordinary least squares and is defined as follows:

$$ \ln W = \beta_0 + \beta_1 S + \beta_2 E + \beta_3 E^2 + \varepsilon \tag{8} $$

where W is the wage, S refers to years of schooling and E refers to years of experience. The timing of data collection proved to be problematic. The experience variable would be equal for all individuals who did not attend a professional or graduate program. For those who did not immediately enter the workforce, it is unknown how many years they had worked before the follow-up questionnaire.
The three educational categorical variables are included to capture the specific returns to schooling for each degree. In addition, vintage effects are not a concern since the time after graduation is relatively short.

The other reason why Mincer’s function needed to be adjusted in our model was the manner in which the dependent variable was reported. Income was reported as follows: 1 = no income, 2 = $1 to $6,999, 3 = $7,000 to $9,999, 4 = $10,000 to $14,999, 5 = $15,000 to $19,999, 6 = $20,000 to $24,999, 7 = $25,000 to $29,999, 8 = $30,000 to $34,999, 9 = $35,000 to $39,000, and 10 = $40,000 or more. Reporting income in this manner creates two separate problems. First, the true dependent variable is unobserved. Second, there is an open ended upper bound and the divisions between ranges are unequal. Thus ordinary least squares will produce biased results, although the estimated coefficients do provide a good starting point. We use two solutions to this problem. Following the work of Long and Caudill (1991), we use Nelson’s maximum likelihood procedure.[6] Next we use an ordered logistic model. Both models produce results that support each other’s conclusions.

Parametric Results

Two parametric models were estimated. First, Nelson’s maximum likelihood function is fitted to the data.[7] The advantage of this approach is that the coefficients have a direct interpretation. Specifically, a one unit increase in an independent variable is associated with an average increase (or decrease) in the wage equal to the value of the coefficient. However, the interpretation of each coefficient of the nonparametric model is different. It is most similar to the coefficients generated by an ordered logit or probit model. That is, in the simplest interpretation, each coefficient indicates whether a variable has a positive or negative effect on the dependent variable. The main difference between the two models is how the dependent variable enters.
In the likelihood function approach, the income boundaries are used, so the dependent variable is measured in dollars. In the ordered logit model, the income categories are used. Both models provide evidence that athletic participation increases wages. Table 1 displays the results from the maximum likelihood function. The coefficient of 730.937 can be interpreted as follows: if an individual was an athlete in college, six years after graduation he can expect a wage about $731 greater than an individual who was not a collegiate athlete. Table 2 displays the results of the other parametric model. The coefficient of .1784 on the athlete variable indicates that participation in sports will on average lead to a higher wage.

These two models lend evidence to the human capital model. Apparently, athletics can be seen as creating human capital and can serve as a source of investment for individuals in addition to school. Additionally, this seems to suggest that individuals are substituting time on the field of play for leisure, rather than for academic time.

[6] See “On a General Computer Algorithm for the Analysis of Models with Limited Dependent Variables,” by Forrest Nelson, Annals of Economic and Social Measurement, 1975, pages 493-509.
[7] In this model, the unobserved income, $W^*$, is assumed to be distributed $N(X\beta, \sigma^2)$, and the model can be written as $W_i^* = X_i\beta + \varepsilon_i$. However, the boundaries of $W^*$ are known such that $W_L \le W^* \le W_H$. This implies that the statement can be rewritten as $(W_L - X_i\beta)/\sigma \le (W^* - X_i\beta)/\sigma \le (W_H - X_i\beta)/\sigma$. The probability of this event can be written as $P_i(W_L \le W^* \le W_H) = F\{(W_H - X_i\beta)/\sigma\} - F\{(W_L - X_i\beta)/\sigma\}$. The likelihood function maximized is $L = \prod_i P_i$.

While the primary point of the paper is to discuss the relationship between wages and athletics, other coefficients should be mentioned. If other coefficients provide intuitive results, then one may feel more comfortable with the variable of interest.
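The interval likelihood described in footnote 7 can be illustrated with a short Python sketch. It is not Nelson’s algorithm or the original SAS code: the names `BRACKETS`, `norm_cdf` and `interval_loglik` are hypothetical, F is taken to be the standard normal distribution function, and the bracket edges are an assumption read off the income coding reported above (edges treated as contiguous, the lowest bracket starting at zero, and the top bracket open ended).

```python
import math

# Assumed dollar edges for income codes 2..10 (code 1, no income, was dropped).
# Contiguous edges and an infinite upper bound for code 10 are assumptions.
BRACKETS = {
    2: (0.0, 7000.0),
    3: (7000.0, 10000.0),
    4: (10000.0, 15000.0),
    5: (15000.0, 20000.0),
    6: (20000.0, 25000.0),
    7: (25000.0, 30000.0),
    8: (30000.0, 35000.0),
    9: (35000.0, 40000.0),
    10: (40000.0, math.inf),
}

def norm_cdf(z):
    """Standard normal distribution function F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_loglik(beta, sigma, X, codes):
    """Log of L = prod_i P_i, with P_i = F((W_H - X_i b)/s) - F((W_L - X_i b)/s).

    beta  : list of coefficients
    sigma : standard deviation of the latent income error
    X     : list of regressor rows (each row the same length as beta)
    codes : reported income category for each row
    """
    total = 0.0
    for row, code in zip(X, codes):
        xb = sum(b * v for b, v in zip(beta, row))
        lo, hi = BRACKETS[code]
        p = norm_cdf((hi - xb) / sigma) - norm_cdf((lo - xb) / sigma)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total
```

Because the bracket edges enter in dollars, the coefficients that maximize this function are also in dollars, which is why the athlete coefficient in Table 1 can be read directly as a dollar difference. Maximizing the function over `beta` and `sigma` would require a numerical optimizer, which is omitted here.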
Specifically, in both models the coefficient on the race variable is negative. In the likelihood model the coefficient has the expected sign but is not significant. The ordered logit model, though, has a strongly negative coefficient. Additionally, each education variable predicts a higher wage. A higher ACT score, which acts as a proxy for intelligence, predicts higher wages, as do grades. Married men and men with kids also earn a higher wage.

Nonparametric Results

The advantage of using a nonparametric approach is that one is not limited to analyzing what happens on average. Additionally, this approach leads to a better fit of the data and thus, as some would argue, more precise results. No standard approach exists for reporting nonparametric results, but the graphical approach is one of the easiest and clearest methods. The interpretation of the results is rather straightforward. A positive coefficient on the athlete variable would indicate that an athlete would be more likely to earn a higher wage, ceteris paribus. Figure 1 shows the distribution of the coefficients generated for individuals listed as athletes.[8] The mean is just slightly positive. The median is negative. The histogram indicates that the distribution is fairly symmetric around zero.

This leads to a rather simple conclusion. In some instances, there seems to be a benefit associated with having been a college athlete. In some instances, it seems not to have mattered. And for others, athletic participation may have had a negative effect. If a student athlete substitutes athletic time for leisure, then that individual will graduate from college with a higher human capital stock, assuming academic time remains fixed. A higher level of human capital will lead to higher earnings, and a positive coefficient for that variable.
However, if a student-athlete substitutes sports time for studying and academic time produces more human capital than playing time, then that individual will graduate with less human capital. An athlete may also substitute playing time for both leisure and academic time. If done in the proper ratio, this would lead to a coefficient close to zero for the athlete. If athletes substitute away from leisure time, participation in sports can lead to higher wages.

While it would be nice to have information on how students in this sample spent their time at college, those data were not available. Had this information been available, one could test the relationship between leisure time spent and the returns on earnings. The results are, however, consistent with this theory. At the very least, the claim that athletes earn systematically higher wages can be refuted.

Figure 2 shows the distribution of the coefficients of athlete against occupations. Occupations were coded between one and 45, with no apparent pattern as to how occupations were assigned to a value. If a relationship exists between an occupation and a return on the wage, then one could expect athletes to be more likely to enter that profession in order to gain a wage advantage. However, each occupation shows a fairly symmetric distribution of coefficients around a mean of zero. Athletes do not have a systematic wage advantage in any particular field. Therefore, if we find athletes are more likely to enter a profession in the next section, we can safely argue that the reasons will be non-monetary.

[8] The coefficients for athlete for individuals who weren’t athletes are equal to zero by definition.

Occupational Choices

Five occupational choice models were estimated. The choice of occupation was based upon the number of athletes selecting an occupation. In this case, any group with 20 or more athletes was included.[9] Some occupations, such as business sales and business management, were combined into one group.
Table 3 reports the results from the five occupational choice models. In models two through five, athletes are not more or less likely to select a particular occupation. They are, however, more likely to become teachers.

Why are athletes systematically more likely to become teachers? Figure 3 plots the histogram of the coefficients for the athlete variable for teachers only and is a specific portion of Figure 2. As one can more clearly see, there seems to be no wage premium for teachers who participated in collegiate athletics. The graph does show a premium for some athletes, but the advantage is not systematic to the group.

The question then becomes why athletes pick the teaching profession. Participation in sports creates a greater sense of belonging to a school. Athletes may feel a unique connection with an academic institution and want to return to their “home.” Additionally, these individuals may want to continue being involved with sports. They may simply want to remain active in the sport they love, or they may have aspirations of coaching. Since most school coaches are also teachers, this desire to continue involvement with sports may be reflected in an increased likelihood of teaching. However, since coaches at this level are not well paid, one would not expect to see any significant wage premium for athletes who chose a teaching profession. In fact, if the individual does want to pursue a college coaching career, they may view their time spent coaching as on-the-job training or an investment in human capital, and thus may be willing to accept no wage premium for supervising athletics.

But is there any evidence to support this claim? Table 3 states that 91 college athletes chose a teaching occupation. Of those individuals, 81 were also high school athletes.
We also found that high school athletes were more likely to choose teaching as a career.[10] If these individuals want to be coaches, one would expect them to be high school teachers as opposed to elementary school teachers, who do not have coaching jobs associated with their positions. If this is the case, college athletes should be more likely to become high school teachers, and participation in sports should have a non-positive effect on choosing an elementary school teaching job.

[9] The one exception was the category of semi-skilled labor. Only 17 athletes reported this as an occupation.
[10] A logistic model with teacher as the dependent variable reveals a coefficient of .3696 (Wald Chi Sq of 7.0856) on a high school athletic participation variable that replaced ATHLETE in model one of the occupational choice models.

Table four displays the results from the two models estimated. The variable HSTeacher is defined as one if an individual listed “Teacher: secondary” as an occupation and zero otherwise. The variable EleTeacher is defined as one if an individual reported “Teacher: elementary” as an occupation and zero otherwise. Two results are evident. High school athletes are more likely to become high school teachers, and high school athletes are less likely to become elementary school teachers.[11] This supports the hypothesis that the desire to coach may be the non-monetary driving force behind this occupational selection.

Conclusion

This paper reached two conclusions. First, some college athletes earned a wage premium unrelated to occupational choice after entering the workforce. Whether an individual earns that wage benefit results from a time allocation decision between academic investment, leisure time and athletic training time. Previous approaches treated all athletes as a collective group, while this approach focused on each individual. In this case, the nonparametric approach yielded a conclusion in direct contrast with the parametric approach.
Second, athletic participation made it more likely for an individual to become a high school teacher, but didn’t affect other occupational choices.

[11] The same teaching models were estimated using college athletic participation on the independent side. College athletes were more likely to become high school teachers (coefficient of .9695 and Wald Chi Sq of 29.0727). College athletes were not more or less likely to become elementary school teachers.

Bibliography

Aitchison, J. and C. G. G. Aitken, “Multivariate Binary Discrimination by the Kernel Method,” Biometrika, Vol. 63, Number 3, December 1976.
Astin, Alan, “Minorities in American Higher Education,” Jossey-Bass, San Francisco, 1982.
Becker, Gary, “A Theory of the Allocation of Time,” Economic Journal, September 1965, pages 493-517.
Becker, Gary, “Human Capital: A Theoretical and Empirical Analysis with Special References to Education,” NBER, Columbia University Press, New York, 1975.
Ben-Porath, Yoram, “The Production of Human Capital and the Life Cycle of Earnings,” Journal of Political Economy, Volume 75, Issue 4, August 1967, pages 352-365.
Li, Q. and J. Racine, “Cross-Validated Local Linear Nonparametric Regression,” Statistica Sinica, forthcoming.
Li, Q. and J. Racine, “Nonparametric Estimation of Distributions with Categorical and Continuous Data,” Journal of Multivariate Analysis, Volume 86, Issue 2, August 2003, pages 266-292.
Long, James and Steven Caudill, “The Impact of Participation in Intercollegiate Athletics on Income and Graduation,” Review of Economics and Statistics, Volume 73, Issue 3, August 1991, pages 525-531.
Mincer, Jacob, “Schooling, Experience and Earnings,” Columbia University Press for the NBER, 1974.
Nelson, Forrest, “On a General Computer Algorithm for the Analysis of Models with Limited Dependent Variables,” Annals of Economic and Social Measurement, 1975, pages 493-509.
Polachek, Solomon, and Stanley Siebert, “The Economics of Earnings,” Cambridge University Press, New York, 1993.
Racine, J. and Q. Li, “Nonparametric Estimation of Regression Functions with Both Categorical and Continuous Data,” Journal of Econometrics, forthcoming.
Wang, M.C. and J. Van Ryzin, “A Class of Smooth Estimators for Discrete Estimation,” Biometrika, 1961.

Appendix

List of Variables

Variable       Meaning
ACT            Score on American College Test
Athlete        1 if earned a varsity letter in college, 0 otherwise
BA             1 if holds bachelor's degree, 0 otherwise
Business       1 if individual reported occupation as accountant, business clerical, business management or business sales, 0 otherwise
Cgrades        Self reported average grades
CR             College region
Collacad       Self reported academic achievement
Drivedum       1 if individual rates themselves in the highest 10 percent on “drive to achieve”
Enroll         Total enrollment of college, reported in categories
Famdum         1 if respondent indicates that raising a family is an essential goal, 0 otherwise
Firmsz         Number of employees in firm individual works for, reported in categories
Graddeg        1 if MS = 1 or Phdprof = 1, else 0
Hsathlete      1 if earned a varsity letter in high school, 0 otherwise
Hsgrades       Categorical variable denoting high school grades
Kids           Number of offspring
Labor          1 if individual reported occupation as skilled, semi-skilled or unskilled labor, 0 otherwise
Lastmaj        Last declared major respondent reported while in college
Lawyer         1 if occupation reported as a lawyer, 0 otherwise
Lowasp         1 if person did not aspire to earning a bachelor’s degree or higher, 0 otherwise
MajXX          Represents various college majors
Military       1 if individual reported occupation as military career, 0 otherwise
Msp            1 if married, 0 otherwise
MS             1 if holds master's degree, 0 otherwise
MW             1 if college located in midwest region, 0 otherwise
Occ or OccXX   Represents various occupations
Peduc          1 if parents graduated from college, 0 otherwise
Parinc         Parents’ income before taxes in 1970, reported the same way as income
Part           1 if the job worked was part-time, 0 otherwise
Phdprof        1 if holds PhD or advanced professional degree, 0 otherwise
Private        1 if college attended was a privately owned institution, 0 otherwise
Race           1 if African American, 0 otherwise
Runbus         1 if respondent indicated running their own business was an essential goal, 0 otherwise
S              1 if college located in south region, 0 otherwise
Selfemp        1 if individual was self-employed, 0 otherwise
Teacher        1 if individual reported occupation as secondary or elementary teacher
Vet            1 if military veteran, 0 otherwise
W              1 if college located in west region, 0 otherwise
Welldum        1 if “be well off financially” is an essential goal, 0 otherwise
Yrscomp        Number of academic years completed

Appendix

Table One: Maximum Likelihood Estimates

Variable     Coefficient    T-Statistic
Constant     2864.8691      2.2912125
Athlete      730.937        2.2184249
Race         -605.708       -1.695354
Msp          1216.9208      4.880334
Kids         1013.112       5.4995927
Veteran      1387.7445      1.9515873
Selfemp      -508.1706      -1.278199
Part         -7108.123      -15.77781
Firmsz       672.67571      9.1702043
Act          108.17337      2.6735568
Cgrades      314.2632       2.7280173
Ba           100.5809       0.3173899
Ms           1409.5356      3.2609143
Phdprof      1989.0313      2.9962647
Private      547.68059      1.7489869
Enroll       231.5471       3.0368675
Mw           355.9974       1.2357973
S            -526.4403      -1.379601
W            818.61659      2.5492991
Occ1         4863.9709      5.6950228
Occ2         4783.3142      5.5014503
Occ3         2782.7371      4.1044833
Occ4         6100.9545      6.6337379
Occ5         1293.6319      1.4065517
Occ6         5661.228       8.5292046
Occ7         2299.3772      2.6710531
Occ8         2672.508       0.8915418
Occ9         4495.1524      6.1026487
Occ10        3369.1996      3.6995479
Occ11        1824.3997      2.1728581
Occ12        -1746.835      -1.242023
Occ13        -724.2067      -0.653447
Occ14        4876.3915      1.6813597
Occ15        3118.4827      4.0994386
Occ16        1797.0261      2.049067
Maj1         561.72997      0.448171
Maj2         -1372.355      -2.522944
Maj3         -1385.387      -2.288997
Maj4         874.31622      1.6700445
Maj5         2136.8772      3.3657653
Maj6         -39.36995      -0.066616
Maj7         -280.263       -0.282811
Maj8         1609.6144      1.6787699
Maj9         -902.2263      -1.753461
Maj10        97.70968       0.1053729
Maj11        -205.1038      -0.205684
Drivedum     2065.6263      8.536258
Welldum      668.48076      2.1645151
Runbus       1536.8381      5.437688
Famdum       -197.7013      -0.696708
Sigma        42208480       667919.25

Table Two: Ordered Logit Model

Parameter       DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept 10    1    -7.8063    0.3176           604.0532          <.0001
Intercept 9     1    -7.4322    0.3093           577.5724          <.0001
Intercept 8     1    -6.6105    0.2982           491.3582          <.0001
Intercept 7     1    -5.5964    0.2915           368.6209          <.0001
Intercept 6     1    -4.3025    0.2863           225.7863          <.0001
Intercept 5     1    -2.9437    0.2822           108.772           <.0001
Intercept 4     1    -1.2409    0.2802           19.6194           <.0001
Intercept 3     1    -0.2908    0.282            1.0635            0.3024
ATHLETE         1    0.1784     0.0792           5.0677            0.0244
ACT             1    0.0289     0.00927          9.7592            0.0018
RACE            1    -0.3958    0.0869           20.7313           <.0001
CGRADES         1    0.0713     0.0291           6.0021            0.0143
CR              1    0.0134     0.0271           0.2447            0.6208
DRIVEDUM        1    0.51       0.0646           62.2478           <.0001
ENROLL          1    0.1108     0.0195           32.2966           <.0001
FAMDUM          1    -0.0697    0.0681           1.0457            0.3065
FIRMSZ          1    0.2869     0.0181           251.3977          <.0001
HIGHDEG         1    0.066      0.0202           10.6497           0.0011
KIDS            1    0.2952     0.0456           41.9651           <.0001
LASTMAJ         1    0.00119    0.00149          0.6368            0.4249
MSP             1    0.3997     0.0635           39.6032           <.0001
OCC             1    -0.0218    0.00189          133.0272          <.0001
PART            1    -3.1121    0.1367           517.9777          <.0001

Table Three: Occupational Choices

Dependent variables: TEACHER, LABOR, BUSINESS

                  TEACHER                    LABOR                      BUSINESS
Variable          Coefficient  Wald Chi sq   Coefficient  Wald Chi sq   Coefficient  Wald Chi sq
Intercept         -1.6761      7.7721        1.4452       14.7145       -2.7879      71.7676
ATHLETE           0.6838       18.0175       -0.2056      1.8871        0.0741       0.4742
RACE              0.1238       0.3356        -0.2771      3.3134        -0.2522      4.4912
KIDS              -0.2293      3.7774        0.0279       0.1631        0.0319       0.282
MSP               0.1783       1.4617        -0.0744      0.4629        0.1352       2.5152
VETERAN           -0.225       0.2328        0.376        2.2087        -0.1453      0.3567
CGRADES           0.1518       3.8974        -0.1658      14.0343       -0.0514      1.6577
ACT               -0.0671      9.6032        -0.0506      9.7814        -0.00381     0.0898
PRIVATE           -0.0619      0.1048        0.0608       0.2173        0.5056       22.5445
FIRMSZ            -0.0891      4.2336        -0.066       5.2333        0.0718       9.7118
ENROLL            -0.00971     0.0401        0.00486      0.0212        0.0783       8.4014
BA                2.6505       58.2938       -1.2797      139.2531      0.5517       27.823
MS                2.8666       58.4242       -2.6237      97.3693       0.1389       0.9205
PHDPROF           0.7218       1.5129        -3.0224      49.1646       -1.6709      40.3562
DRIVEDUM          -0.0639      0.1699        -0.2887      5.5715        0.2677       9.6834
RUNBUS            -0.911       12.7523       -0.3452      6.1132        0.3377       10.4575
MAJ1              -1.6176      16.9205       0.7911       6.7349        -0.1952      0.3157
MAJ2              -1.8671      83.1966       0.161        0.719         0.3187       3.2063
MAJ3              -1.7992      45.1273       0.1712       0.5017        0.2357       1.1755
MAJ4              -4.2301      95.4725       -0.2903      2.4839        1.8685       129.1567
MAJ5              -4.0655      45.9274       -0.0182      0.0068        -0.5488      5.8469
MAJ6              -2.0123      63.137        0.0632       0.0725        -0.1176      0.3041
MAJ7              -2.6608      18.8064       -0.6007      2.4524        -0.1046      0.0902
MAJ8              -15.5082     0.0012        -0.8919      3.1249        -0.8849      3.2116
MAJ9              -2.331       133.8938      0.0692       0.1445        0.8746       28.2819
MAJ10             -2.1569      8.2476        0.478        2.5652        0.3522       1.0845
MAJ11             -2.2505      21.1587       -0.1227      0.1451        0.0388       0.0146
N in Occ          303                        605                        1144
N Ath in Occ      91                         67                         187
N Total Ath       646                        646                        646
N Total Sample    4209                       4209                       4209

Dependent variables: MILITARY, LAWYER

                  MILITARY                   LAWYER
Variable          Coefficient  Wald Chi sq   Coefficient  Wald Chi sq
Intercept         -10.0624     125.6549      -10.258      30.3076
ATHLETE           0.1125       0.2079        0.3829       0.9953
RACE              0.9757       19.2652       0.0962       0.0559
KIDS              0.387        8.9391        -0.3525      0.937
MSP               -0.2672      1.6193        0.315        1.0743
VETERAN           -0.9673      2.1202        -0.6963      0.2224
CGRADES           0.00834      0.0077        0.387        4.7815
ACT               0.163        34.3737       0.1103       4.7243
PRIVATE           -2.0251      72.0675       -0.3212      0.5544
FIRMSZ            0.9931       155.3231      -0.4354      21.0948
ENROLL            -0.3676      35.9257       0.0465       0.2456
BA                0.3709       2.1579        -1.1456      2.4808
MS                -0.9941      4.6976        -1.3858      1.4656
PHDPROF           -0.9163      2.2866        5.4756       79.6478
DRIVEDUM          0.6697       11.9931       -0.1389      0.2168
RUNBUS            0.0336       0.0147        -0.4424      1.2271
MAJ1              0.0879       0.0132        2.3171       1.9408
MAJ2              0.02         0.0018        1.9485       2.6759
MAJ3              0.035        0.0037        -3.1801      4.3184
MAJ4              -0.2676      0.4034        2.7771       4.8993
MAJ5              0.39         0.8183        0.1434       0.0079
MAJ6              0.275        0.3519        -1.1659      0.7947
MAJ7              -0.2782      0.1324        -11.9309     0.0004
MAJ8              0.3516       0.1589        -13.5871     0.001
MAJ9              0.00626      0.0002        2.4915       4.604
MAJ10             0.1294       0.0376        -9.3769      0.0001
MAJ11             0.1745       0.0562        3.4515       5.2018
N in Occ          169                        178
N Ath in Occ      35                         38
N Total Ath       646                        646
N Total Sample    4209                       4209

* The model predicts the probabilities that the dependent variable will be equal to one.
Table Four: Teaching Occupation Models

                  HSTeacher                    EleTeacher
Parameter         Estimate   Wald Chi-Square   Estimate   Wald Chi-Square
Intercept         -3.0969    17.5106           -2.0825    5.1614
HSATHLETE         0.8456     24.2204           -0.5298    5.9123
RACE              -0.0398    0.0234            0.3847     1.5747
KIDS              -0.1812    1.7444            -0.26      2.0305
MSP               0.122      0.4922            0.1853     0.6623
VETERAN           -0.178     0.106             -0.3843    0.2502
CGRADES           0.1277     1.933             0.0985     0.7337
ACT               -0.0587    5.2095            -0.0573    3.092
PRIVATE           0.11       0.2367            -0.192     0.4586
FIRMSZ            -0.1298    6.383             0.0134     0.0368
ENROLL            0.0199     0.1219            -0.077     1.0693
BA                2.4239     31.1689           2.5628     22.3754
MS                2.4666     28.0748           3.0313     27.3461
PHDPROF           0.2567     0.1194            1.277      1.9165
DRIVEDUM          0.1537     0.7684            -0.4941    3.6009
RUNBUS            -0.9375    9.4929            -0.7203    3.0004
MAJ1              -0.8778    3.5575            -1.9934    10.1747
MAJ2              -1.0936    21.1189           -2.4453    57.7751
MAJ3              -0.8508    8.3861            -3.0465    25.0096
MAJ4              -3.336     48.8746           -5.1597    25.8174
MAJ5              -3.3081    20.4398           -4.4235    18.7298
MAJ6              -1.2628    17.9867           -2.4074    35.2693
MAJ7              -1.9366    6.8325            -3.0543    8.8295
MAJ8              14.8905    0.0007            -15.5217   0.0005
MAJ9              -1.6943    48.6523           -2.5444    75.6857
MAJ10             -1.0943    2.0647            -14.987    0.0006
MAJ11             -1.4553    7.155             -3.132     9.3485
N HS ATH          132                          51
N Teachers        190                          113
N Total Sample    4209                         4209

* The model predicts the probabilities that the dependent variable will be equal to one.

Figure One: Nonparametric Model. Histogram of β(Athlete). [Histogram; horizontal axis: β(Athlete), vertical axis: Frequency.] Q1: -0.11479, Q2: -0.00687, Q3: 0.148423, Mean: 0.028134

Figure Two: Occupations and Returns to Athletic Participation. [Plot; horizontal axis: occ, vertical axis: b(athlete).]

Figure Three: Wage Returns for Athletic Participation for Teachers. [Histogram; horizontal axis: β(Athlete), vertical axis: Frequency.]