High School Athletics Participation and Future Health Behavior

Vasilios D. Kosteas
Cleveland State University
2121 Euclid Avenue, RT 1719
Cleveland, OH 44115-2214
b.kosteas@csuohio.edu
Tel: 216-687-4526, fax: 216-687-9206

Abstract

This paper investigates whether participation in high school athletics has a positive impact on health behaviors for US adults in their early 30s to mid-50s, using data from the National Longitudinal Surveys of Youth 1979 cohort. We ask whether being a high school athlete makes someone more likely to exercise regularly and less likely to smoke or drink excessively at those ages. The results show that participation in athletics during high school leads to a higher probability of being physically active later in life, a lower probability of smoking on a daily basis, but a higher probability of exceeding consumption guidelines for alcoholic beverages. Furthermore, these effects are fairly consistent for the same cohort over a ten-to-fourteen year period.

Key Words: exercise, smoking, alcohol, high school athletics
JEL Codes: I12

1 Introduction

A lack of adequate physical activity for a large segment of the adult US population, in spite of numerous attempts to raise overall activity levels, remains a source of concern for policy makers. So do the prevalence of binge drinking and the fact that nearly one in five non-elderly Americans smoke on a regular basis.1 There is a sizable literature examining the determinants of health behaviors such as smoking and drinking. While smoking and alcohol consumption have traditionally received greater attention, the literature examining the determinants of regular exercise has been growing. These studies have focused on the importance of demographic characteristics, education, income and community level amenities, such as the availability of parks and recreation centers.
However, relatively little attention has focused on the potential role of participation in high school athletics in promoting healthy behaviors throughout life. The present paper combines the literature examining the determinants of healthy behavior with the literature examining the impacts of participation in high school athletics on future outcomes. In the latter, existing studies have looked into the effect of participation in high school extracurricular activities on human capital formation and labor market outcomes, with a particular emphasis on participation in high school athletics. These studies show a wide array of positive outcomes associated with participation in high school extracurricular activities, including greater educational attainment, higher wages and benefits, and a greater likelihood of being a supervisor and holding higher level job responsibilities such as hiring and firing and deciding pay increases. Surprisingly, there are few studies linking these two areas of research. What literature exists tends to be outside of economics and focuses simply on establishing correlations between high school activities and future health behaviors.

In this paper, we examine the effect of participation in high school athletics on health related behaviors for US adults in their early 30s to mid-50s. We hypothesize that participation in high school athletics helps to foster a preference for healthy behavior. We postulate that this relationship works (at least partially) through establishing those healthy behaviors during the high school years. In an effort to tease out the extent to which the established correlations between these variables represent causal relationships, we employ propensity score matching estimation and a relatively new instrumental variables approach developed by Lewbel (2012). All analyses are conducted using data from the National Longitudinal Surveys of Youth 1979 cohort. Specifically, we ask whether being a high school athlete makes someone more likely to exercise regularly and less likely to smoke or drink excessively at those ages. The results show that participation in athletics during high school leads to a higher probability of being physically active later in life, a lower probability of smoking on a daily basis, but a higher probability of exceeding consumption guidelines for alcoholic beverages. These effects are fairly consistent for the same cohort over a ten-to-fourteen year period.

1 See the CDC website for information on binge drinking: http://www.cdc.gov/alcohol/fact-sheets/bingedrinking.htm and smoking statistics: http://www.cdc.gov/tobacco/data_statistics/fact_sheets/adult_data/cig_smoking/.

Background

The theoretical motivation for the empirical models is straightforward. Participation in high school (HS) athletics helps to encourage future physical activity by fostering a love of sports and/or through habit formation that can continue for decades into the future. Previous studies have established the importance of habit formation and/or addiction in exercise (Acland and Levy 2012, Charness and Gneezy 2009, Royer et al 2012), smoking (e.g. Jones 2014), and drinking (e.g. Williams 2005). Participation in HS athletics may have an impact on whether the individual drinks alcohol or smokes tobacco during high school. High school athletes may be less likely to smoke since it can interfere with athletic performance. On the other hand, they may be more likely to drink alcohol since this is often a group activity in high school and participation in athletics may increase the opportunities to drink socially. Evidence indicates that HS athletes are indeed less likely to smoke compared with non-athletes (Naylor et al 2001, Terry-McElrath and O'Malley 2011) but more likely to drink alcohol (Terry-McElrath and O'Malley 2011, Wetherill and Fromme 2007).
By impacting the decision to smoke or drink during high school, participating in HS athletics may affect future smoking and drinking behavior through habit formation (which may also be interpreted as addiction in the case of cigarette smoking and, for some individuals, in the case of alcohol consumption). Participating in HS athletics may also affect future smoking and drinking by nurturing a desire to lead a healthy lifestyle. We could test the first mechanism if we had good information on smoking and drinking during the HS years in our dataset. Unfortunately, we do not. Thus, we will not directly test either of these mechanisms. Instead, we simply examine whether there is any effect of participation in HS athletics on future health behaviors and defer to earlier studies on the question of whether participation in HS athletics affects drinking and smoking during the high school years.

The existing empirical literature in economics examining the long term effects of participation in high school athletics has focused on educational and labor market outcomes. Barron, Ewing and Waddell (2000) analyze the relationship between participation in high school athletics and high school rank, educational attainment, future employment and weekly wages. Eide and Ronan (2001) estimate the relationship between athletics participation in high school and educational attainment and future wages. Their results are mixed, finding a negative correlation with educational attainment for white men, a positive correlation for black men and white women, and no correlation for Hispanics or black women. They do not find a significant correlation between athletics and wages for any of these groups. Ewing (2007) finds that, in addition to earning higher wages, former high school athletes also receive more benefits. Anderson (2001) finds a positive link between participation in sports and educational outcomes for white athletes.
Stevenson (2010) finds that women who go to high school in states with higher female participation rates in athletics have greater educational attainment. Kosteas (2011) shows that former high school athletes are more likely to be supervisors at work and hold high level responsibilities, such as setting pay and making hiring and firing decisions. Pfeifer and Cornelisen (2010) find a positive link between childhood sports and educational attainment for Germany. There are also several empirical papers in the sociology literature which investigate the link between high school athletics and various educational outcomes (see Troutman and Dufur 2007 for a summary of papers in this literature).

There are several papers in the applied psychology and public health literatures examining the impact of playing high school sports on future health behaviors. However, these studies tend to use relatively small samples which are often not representative of the broader population. Furthermore, these papers tend to employ relatively limited models and do not go beyond establishing a correlation between participation in HS athletics and future outcomes. For recent studies in this literature see Dohle and Wansink (2013), Geisner et al (2012), Hartmann and Massoglia (2007), and Wichstrom and Wichstrom (2008).

The literature examining the determinants of exercise, smoking and drinking is voluminous. Cabane and Lechner (2014) provide a nice review of the literatures on both the determinants and long term effects of physical activity, including papers in economics, epidemiology, sports science, and other fields. In this section we highlight some of the key studies in the economics literature examining the determinants of physical activity. Existing studies can be grouped according to whether they focus on the effect of individual characteristics (education, income, age) or aspects of the local environment (access to public parks, weather conditions) on physical activity.
With respect to individual characteristics, studies have found relationships between levels of physical activity and age (Breuer and Wicker 2009, Downward et al 2011, Eberth and Smith 2010, Garcia et al 2011, Humphreys and Ruseski 2015, Stamatakis and Chaudhury 2008), marital status and the presence of young children in the household (the latter for women only) (Eberth and Smith 2010, Garcia et al 2011), and education and income (Downward and Rasciute 2010, Fridberg 2010, Hovemann and Wicker 2009, Humphreys and Ruseski 2015, Lechner 2009, Meltzer and Jena 2010). Regarding environmental factors, existing research has uncovered a positive link between physical activity for children and both a state physical education requirement and state spending on parks and recreation (Cawley et al 2007). Studies focusing on adults have established connections between levels of physical activity and education (Mullahy and Robert 2010) and the number of gyms, parks and other recreational areas per capita (McInnes and Shinogle 2011, Humphreys and Ruseski 2007). Multiple studies have uncovered a connection between poor weather conditions and physical activity (Eisenberg and Okeke 2009, Humphreys and Ruseski 2011, Witham et al 2014). A few studies in the experimental economics literature have established the importance of peer effects (Babock and Hardman 2010, Carrel et al 2011, Leslie and Norton 2012). Finally, a couple of studies have investigated the tradeoff between exercise intensity and duration, finding that high wage Americans substitute intensity for duration (Meltzer and Jenna 2012) while high income Australians are more likely to exercise more frequently and with greater intensity (Maruyama and Shin 2012).

The literature on the correlates and determinants of smoking and alcohol consumption is vast. Here, we will highlight some of the more recent papers in the health economics literature.
As with the exercise literature, we can group these studies into those which look at individual/household characteristics and those which look at economic/environmental factors. In spite of the strong correlation between education and both drinking and smoking, Park and Kang (2008) do not find a causal link running from the former to the latter for Korean men. Dohmen et al (2011) show that greater preference for risk is a strong predictor of smoking. Several papers have found important peer effects on drinking for US teens (Powell et al 2005), Irish college students (Delaney et al 2007) and Swedish teens (Lundborg 2006). Delaney et al (2013) find a strong association between drinking for Irish college students and the alcohol consumption of their fathers and older siblings. Gohlmann et al (2010) also find a significant impact of parental smoking on children’s smoking in Germany. Argys et al (2006) show that birth order has a significant correlation with smoking, drinking and other behaviors for US teenagers. Participation in Head Start leads to reduced smoking probability in young adulthood (Anderson et al 2010). Finally, perceptions of risk are important determinants of smoking (Lundborg and Lindgren 2004) and drinking (Lundborg and Lindgren 2002). Economic factors such as significant stock market crashes (Cotti et al 2015) also lead to greater smoking and drinking, while income is negatively related to smoking and positively related to drinking alcohol (Costa-Font et al 2014). Cigarette prices are a robust determinant of youth smoking, while the effect of cigarette taxes is mixed (Nonnemaker and Farrelly 2011), and adults who are current smokers are more likely to quit in response to rising prices (Goel and Naretta 2011).

The current paper adds to these literatures by examining the effect of participation in high school athletics on three health behaviors: physical activity, smoking, and consumption of alcohol.
To our knowledge, this is the first paper to empirically investigate these relationships. Furthermore, we attempt to push beyond merely establishing correlations between participation in HS athletics and health behaviors to estimating causal relationships by using propensity score matching and a relatively new instrumental variables estimator developed by Lewbel (2012).

Data

The data come from the 1998-2002 and 2012 waves of the NLSY79 cohort, unless indicated otherwise. The NLSY79, which conducted surveys annually from 1979 through 1994 and in even-numbered years thereafter, began with an initial sample of 12,686 individuals. The initial sample contained oversamples of poor white individuals and members of the armed forces. The military and poor white oversamples were dropped in 1985 and 1991, respectively. Data for each of our three health behaviors are not available for all years. We use data on exercise/physical activity from the 2000, 2002 and 2012 waves, data on smoking from the 1998 and 2012 waves, and data on consumption of alcohol from the 2002 and 2012 waves. We chose the earliest and most recent years for which data on these activities are available in order to examine whether the impact of participation in HS athletics on these activities changes as individuals get older.

The NLSY is a good source of labor market data for individuals, containing information for a variety of background variables in addition to current information. One drawback of this dataset is the narrow age range of the respondents; they were between the ages of fourteen and twenty-one in 1979. However, by taking advantage of the longitudinal nature of the data and estimating the models using data from different years, we are able to examine these relationships for individuals ranging in age from 33-57 years for the smoking models, 35-57 years for the exercise/physical activity models, and 37-57 years for the consumption of alcohol models.
We examine three different health behaviors: physical activity, smoking, and drinking alcohol. The NLSY79 used two different approaches to gathering information on physical activity (PA). In 1998 and 2000, respondents were asked to answer the following question: “How often do you participate in vigorous physical exercise or sports - such as aerobics, running, swimming, or bicycling?” Responses were placed in the following categories: never, less than once a month, one to three times each month, once or twice a week, three times or more each week. Using this information, we construct two different indicator variables: one reflecting whether the individual exercises at least once per week, and another reflecting regular exercise, which takes a value of one if the individual exercises at least three times per week and zero otherwise. The surveys also ask about participation in light-to-moderate physical activity. However, for the sake of the present analysis, the exercise models using the 2000 wave data focus on vigorous exercise.2

A strong point of these measures is that they focus on leisure time physical activity (LTPA), over which individuals exert significant control, unlike work-related physical activity (WRPA) and, to a lesser extent, physical activity related to household work. However, they also suffer from a significant drawback in that they do not capture variations in the total amount of time spent exercising or the intensity of the activities, but simply exercise frequency. We chose to work with the 2000 data on exercise frequency in order to provide a closer-in-time comparison of results with the models estimated using the data from the 2002 wave of the surveys.

2 Estimates show a positive effect of participation in high school sports on the probability an individual engages in light-to-moderate physical activity at least three times per week. These results are available upon request.
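The two frequency-based indicators described above can be sketched as follows. This is only an illustration under stated assumptions: the function and variable names are hypothetical, and the category strings are taken from the question wording quoted in the text rather than from the NLSY79 codebook.

```python
# Response categories for the 1998/2000 vigorous-exercise question,
# ordered from least to most frequent (wording as quoted in the text).
CATEGORIES = [
    "never",
    "less than once a month",
    "one to three times each month",
    "once or twice a week",
    "three times or more each week",
]


def exercise_indicators(response):
    """Return (weekly, regular) 0/1 indicators for one categorical response."""
    rank = CATEGORIES.index(response)  # raises ValueError for unknown text
    weekly = 1 if rank >= 3 else 0     # exercises at least once per week
    regular = 1 if rank == 4 else 0    # exercises at least three times per week
    return weekly, regular


print(exercise_indicators("once or twice a week"))  # (1, 0)
```

Because the categories are ordered, both indicators reduce to a threshold on the category rank, which is why neither can capture duration or intensity.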
Beginning in 2002, the NLSY79 employed a different set of questions to measure PA. Respondents were asked how frequently they participate in vigorous as well as light to moderate physical activities and the average duration of each episode. For vigorous activity, respondents are first asked: “How often do you do vigorous activities for at least 10 minutes that cause heavy sweating or large increases in breathing or heart rate?” Respondents then provide the number of times they engage in vigorous activity and the time unit for their response (per day, per week, per month, per year). Similarly, regarding participation in light to moderate physical activity, respondents are asked the following question: “How often do you do light or moderate activities for at least 10 minutes that cause only light sweating or slight to moderate increase in breathing or heart rate?” As with the question on vigorous physical activity, respondents report a frequency (number of times) and a time unit as well as information on the duration of a typical episode of light-to-moderate physical activity.

Using this information, we calculate the number of minutes spent engaging in both vigorous and light-to-moderate PA on a weekly basis. Then we construct indicator variables capturing whether the individual meets the guidelines for physical activity established by the Office of Disease Prevention and Health Promotion. These are also the same guidelines espoused by the World Health Organization (http://www.who.int/dietphysicalactivity/factsheet_adults/en/). The guidelines call for at least 150 minutes a week of moderate-intensity, or 75 minutes a week of vigorous-intensity aerobic physical activity, “or an equivalent combination of moderate- and vigorous intensity aerobic activity.” To achieve even greater health benefits, the Office of Disease Prevention and Health Promotion recommends doubling these numbers.
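The conversion to weekly minutes and the guideline check can be sketched as follows. All names are hypothetical, and the conversion factors for monthly and yearly responses (52 weeks and 12 months per year) are our own assumption, since the text only names the time units. The 2:1 weighting of vigorous minutes follows from the guideline itself, under which 75 vigorous minutes count the same as 150 moderate minutes.

```python
# Multipliers that turn "times per unit" into "times per week".
# Assumed factors: 7 days per week, 52 weeks and 12 months per year.
TIMES_PER_WEEK = {"day": 7, "week": 1, "month": 12 / 52, "year": 1 / 52}


def weekly_minutes(times, unit, minutes_per_episode):
    """Convert a reported frequency and episode length into minutes per week."""
    return times * TIMES_PER_WEEK[unit] * minutes_per_episode


def guideline_indicators(vigorous_min, moderate_min):
    """Return (basic, higher) 0/1 indicators from weekly minutes of activity."""
    equivalent = 2 * vigorous_min + moderate_min  # vigorous minutes count double
    return int(equivalent >= 150), int(equivalent >= 300)


# Example: 3 vigorous sessions per week of 30 minutes, no moderate activity.
vigorous = weekly_minutes(3, "week", 30)
print(guideline_indicators(vigorous, 0))  # (1, 0)
```

Working with the two thresholds rather than the raw minutes means that implausibly large reported totals affect only which side of a cutoff an observation falls on.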
We create a variable measuring the number of “equivalent minutes” of activity per week:

(1) equivalent minutes = 2 × (minutes of vigorous activity) + (minutes of light-to-moderate activity),

and construct an indicator variable for the basic guideline which takes a value of one if the number of equivalent minutes is greater than or equal to 150 and zero otherwise, and an indicator for the higher guideline which takes a value of one if the number of equivalent minutes per week is greater than or equal to 300 and zero otherwise.

Unlike the exercise variables contained in the 1998 and 2000 waves of the NLSY79, the questions and responses in the more recent waves of the survey capture total physical activity, including WRPA and PA associated with household work (such as gardening and cleaning). Working with a more complete measure of PA is important since many individuals may be able to meet the PA guidelines through their normal daily activities, including walking to work (which individuals may not be likely to report as exercise) or physical demands on the job. Additionally, these measures give a better sense of total activity compared to the categorical responses provided in the 1998 and 2000 waves. On the other hand, aside from switching occupations (which is less likely for workers in their forties and fifties), individuals may not have much control over their WRPA. Thus, we are less likely to see a relationship between individual characteristics and these measures of PA.

The indicator variables are preferred to measures of total time spent engaged in PA because the latter are subject to significant measurement error. In the 2002 wave, the median number of minutes spent engaged in vigorous physical activity per week is 70, while the 75th percentile value is 240 minutes (or four hours per week). These are very reasonable numbers, but the 95th percentile value is 2,160 minutes.
While 36 hours per week may not seem like an excessive amount of time for a triathlete or other professional athletes (even here the number is questionable), it is highly unlikely that 5 percent of the population spends at least this much time per week engaged in vigorous physical activity. Focusing on the indicator variables should minimize the measurement error issues without requiring ad hoc decisions regarding capping the minutes of PA or excluding observations with large values. Using the frequency of exercise measure in 2000 and the indicator variables based on total PA in 2002 allows us to compare results using very different measures of PA. Obtaining similar estimates using these different measures will give us greater confidence in the robustness of the estimated relationship between PA and HS athletics.

We use two measures of smoking: whether the individual currently smokes daily, and whether the individual has smoked at least 100 cigarettes in her lifetime. The NLSY first asks respondents whether they have smoked 100 cigarettes in their lifetime. Those who answer positively are asked a follow up question: “Do you now smoke daily, occasionally or not at all?” We could use the responses to these two questions to create a categorical variable taking three values: one for does not smoke, one for occasionally smokes and one for smokes daily. However, the smokes occasionally category is rather vague. Instead we construct an indicator variable for whether the individual smokes daily, since this is the best measure of whether the person is a regular smoker.

We develop two variables measuring alcohol consumption: the first is an indicator variable for whether the individual consumed any alcohol in the past thirty days, while the second is an indicator variable taking a value of one if the individual averages one (two) or fewer drinks per day if female (male) on the days the individual drinks and zero otherwise.
Both variables are constructed from the same series of questions. First, respondents are asked whether they have consumed any alcoholic beverages in the last 30 days. Those who respond in the affirmative are asked a series of follow up questions, starting with the number of days over the past 30 on which they consumed any alcohol and then an additional question asking how many drinks they have on average on the days they drink. The second variable attempts to capture whether an individual tends to keep their alcohol consumption in the zero to moderate range. The CDC defines moderate consumption as up to one drink per day for women and two drinks per day for men (http://www.cdc.gov/alcohol/faqs.htm). The CDC’s guidelines are meant to apply to each day, while our consumption variable measures average consumption on a typical drinking day, creating a potential discrepancy between the guidelines and the guideline variable. It is the closest approximation to the CDC’s guidelines given the available information.

Methodology

Our primary models examine the relationship between participation in HS athletics and health behaviors for US adults, regardless of work status. As a robustness check, additional results are presented for models estimated for working individuals and controlling for wages and hours worked per week. Since each of our outcome variables is binary, exploratory results are obtained via logit estimation for all models.
Each model includes control variables which fall into three categories: time invariant demographics (indicators for being female, black, and Hispanic), contemporaneous control variables (age, highest grade completed, log family income, and whether the individual is married), and background variables (Armed Forces Qualification Test (AFQT) score percentile from 1980, mother and father’s highest grade completed, a measure of the frequency of religious services attendance in 1979, the Rosenberg self-esteem score in 1979, the high school athletics participation rate in 1974 for the respondent’s reported state of residence in 1979, and indicator variables for whether the individual participated in youth organizations, hobby clubs, student government, yearbook or newspaper, performing arts clubs, or national honors society).

Several papers have also shown a link between various measures of time preference and exercise (Kosteas 2015), smoking (Adams and Nettle 2009, Anderson and Mellor 2008, Brown et al 2014, Khwaja et al 2007), alcohol consumption (Anderson and Mellor 2008, Richards and Hamilton 2012), risky sexual behavior (Chesson et al 2006) and body composition (Adams and Nettle 2009, Anderson and Mellor 2008, Smith et al 2005, Borghans and Golsteyn 2006, Chabris et al 2008, Zhang and Rashad 2008, Richards and Hamilton 2012). In addition to serving as a determinant of these health behaviors, time preference may also be correlated with participation in HS sports. Thus, the omission of a measure of or proxy for time preference may bias our estimates. Unfortunately, the NLSY79 does not contain a measure of time preference for any of the years of data used in the primary analysis. However, there are some questions aimed at eliciting information on time preference in the 2006 wave. As a robustness check, we estimate the models for PA and drinking using the 2006 data and include two different measures of time preference.3

The logit estimates likely suffer from two significant issues: measurement error in the high school athletics participation variable and bias introduced by unobserved characteristics affecting both participation in high school athletics and future health behaviors. In particular, the key unobservable for the exercise/PA models is the individual’s baseline taste for exercise. It is quite likely that individuals who are more inclined towards exercising regularly are also more likely to participate in high school athletics. Likewise, these individuals may also be more health conscious and thus less likely to undertake unhealthy behaviors such as smoking and heavy consumption of alcohol. Thus, for the exercise/PA and smoking models, the estimates based on logistic regression are likely to overstate the causal effect of participation in HS athletics on these future behaviors. In contrast, unobserved taste for exercise/healthy lifestyle is likely to lead to underestimation of the causal effect of HS athletics participation on future consumption of alcohol if the former causes an increase in the latter. In all models, assuming the measurement error is random, it will lead to attenuation bias. Because omitted variables push the exercise and smoking estimates away from zero while measurement error pulls them towards zero, it is not possible to sign the direction of the bias for these models without more information on the relative importance of the two issues. However, both omitted variables and measurement error lead to underestimation of the effect of HS athletics on drinking. In an attempt to address these issues, we employ two alternative estimation routines: propensity score matching (PSM) and Lewbel’s (2012) instrumental variables estimator (LIV).

3 The 2006 wave did not contain information on cigarette smoking.

In order to estimate the average treatment effect, propensity score matching takes place in three stages. First, a logit model is estimated for the probability of belonging to the treatment group and the propensity score is estimated.
Next, observations from the treatment group are matched to those not in the treatment group based on their propensity scores, and the matched sample is tested for balance. Rubin (2001) proposes that the difference in the mean of the propensity scores for the treated and matched samples should be less than half a standard deviation and that the ratio of the variances of the two samples’ propensity scores should be between 0.5 and 2.0. If the balancing requirement is satisfied, observations undergoing treatment are matched with observations that did not undergo the treatment, and the effect of treatment is obtained by comparing the mean difference of the dependent variable between each treated observation and its matched observations. We employ nearest neighbor matching where each treated observation is matched to three non-treated observations.4

While PSM estimation provides an alternative to regression analysis, there are potential drawbacks to this approach. In order to produce unbiased estimates of the treatment effect, PSM requires large sample sizes (not a problem in the present study), substantial overlap between the treatment and comparison groups, and a rich set of covariates to estimate the propensity score. PSM rests on the assumption that assignment to treatment and control groups is random after conditioning on observable characteristics. Omitting from the first stage variables which affect both assignment to the treatment (participation in high school athletics) and the outcome variable (engagement in regular exercise, regular smoking, or drinking alcohol) can lead to biased estimates (Heckman et al 1997). Thus, estimates obtained via PSM may eliminate some, but not all, of the bias present when estimating treatment effects using more traditional estimators (logistic regression in the present case).
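The balance check and the nearest neighbor comparison described above can be sketched as follows. This is a simplified illustration rather than the estimation routine used for the reported results: the function names are hypothetical, the propensity scores are assumed to have been estimated already by a first-stage logit, matching is done with replacement, and the standardized difference uses a pooled standard deviation, which is one of several possible conventions.

```python
from statistics import mean, variance


def balance_check(ps_treated, ps_control):
    """Apply the two Rubin (2001) thresholds to two lists of propensity scores."""
    # Assumption: standardize the mean difference by the pooled standard deviation.
    pooled_sd = ((variance(ps_treated) + variance(ps_control)) / 2) ** 0.5
    std_diff = abs(mean(ps_treated) - mean(ps_control)) / pooled_sd
    var_ratio = variance(ps_treated) / variance(ps_control)
    return std_diff < 0.5 and 0.5 <= var_ratio <= 2.0


def matched_effect(treated, control, k=3):
    """Mean outcome gap between each treated unit and its k nearest controls.

    `treated` and `control` are lists of (propensity_score, outcome) pairs.
    """
    gaps = []
    for score, outcome in treated:
        # Controls closest in propensity score to this treated observation.
        nearest = sorted(control, key=lambda c: abs(c[0] - score))[:k]
        gaps.append(outcome - mean(y for _, y in nearest))
    return mean(gaps)
```

With a binary outcome such as daily smoking, the value returned by `matched_effect` is a difference in proportions, which can be read in percentage points after multiplying by 100.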
In particular, the unobservable characteristics of greatest concern in the present study are baseline taste for exercise/healthy behaviors (predating participation in HS athletics) and discount rates. The dataset does provide variables which measure/proxy for the discount rate, but these data are only available for 2006. Furthermore, the dataset does not have any measures of the baseline preference for health behaviors (predating participation in HS athletics). Lack of an adequate proxy means that PSM may not eliminate all of the bias found when using logit estimation. The inclusion of several variables representing household/family characteristics during adolescence and participation in other HS club activities should considerably strengthen the first stage estimation and improve the performance of the PSM estimation. For each model, statistics show the samples (treated vs. not-treated) are sufficiently balanced for PSM estimation to perform properly.5

4 Changing the number of matching observations in a range from 1-5 does not substantially alter the estimated average treatment effects.

We employ a two-stage instrumental variables estimation technique developed by Lewbel (2012). This technique uses heterogeneity in the residuals from the first stage estimates to construct instruments for the second stage equation. Specifically, the instruments are constructed by multiplying the residuals from each first stage regression (one for each endogenous regressor) by the exogenous variables’ deviations from their means. Thus, the first stage regression generates one instrument for each explanatory variable in the first stage equation plus each traditional instrumental variable (we do not specify any traditional IVs, so the number of instruments generated from each first stage regression is equal to the number of exogenous explanatory variables in the second stage).
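The instrument construction step can be sketched as follows. This illustrates only the multiplication described above, not the full estimator: the function name is hypothetical, and the first-stage residuals are assumed to have been obtained already from a regression of HS athletics participation on the exogenous variables.

```python
def lewbel_instruments(exog_rows, residuals):
    """Build one generated instrument per exogenous variable.

    exog_rows: list of observations, each a list of exogenous values.
    residuals: first-stage residuals, one per observation (assumed to be
               estimated elsewhere; this sketch does not run the regression).
    """
    n = len(exog_rows)
    k = len(exog_rows[0])
    # Sample mean of each exogenous variable.
    means = [sum(row[j] for row in exog_rows) / n for j in range(k)]
    # Instrument j for observation i: (x_ij - mean_j) * residual_i.
    return [
        [(row[j] - means[j]) * e for j in range(k)]
        for row, e in zip(exog_rows, residuals)
    ]


# Two observations, two exogenous variables.
print(lewbel_instruments([[1.0, 0.0], [3.0, 1.0]], [0.5, -0.5]))
# [[-0.5, -0.25], [-0.5, -0.25]]
```

The output has one column per exogenous variable, which matches the instrument count stated in the text when no traditional instruments are specified.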
Identification requires the residuals from the first stage regressions to be heteroskedastic. This technique may generate less reliable estimates compared to standard IV approaches, but it serves as a reasonable alternative when other valid instruments are not available, and it can augment traditional IVs, leading to less inflation of the standard errors in the second stage regression.

Footnote 5: In general, the difference between the means of the propensity scores is around 10% while the ratio of the variances is very close to one.

Before turning to the results, we briefly discuss some of the key summary statistics for the data. Table 1 presents the mean for each of the dependent variables and for the participation in HS athletics variable for each sample. In 2000, 39.1 percent of the sample reported exercising at least once per week while 20.4 percent exercised three or more times per week. In 2002, we see that 57.3 percent of the respondents met the basic PA guidelines espoused by the federal government and the WHO, while 45.8 percent met or exceeded the higher guidelines. The difference between the two estimates can be attributed to WRPA and PA associated with commuting to work. Data from the 2011-2012 round of the National Health and Nutrition Examination Surveys (NHANES) show that only 25 percent of US adults engage in at least 40 minutes of vigorous LTPA per week, while median total PA per week is 4.67 hours. The numbers for the 2012 wave are similar, with somewhat lower fractions of the sample meeting either guideline. The smoking data show a reduction in the rate of daily smoking from 24.9 percent in 1998 to 18.4 percent in 2012. The data on consumption of alcohol show fairly steady rates of consumption, with just over 56 percent of individuals consuming any alcohol in both 1998 and 2012 and roughly 70 percent meeting the guidelines for the consumption of alcohol.
The reported rate of participation in HS athletics is fairly consistent over the different samples, hovering around 40 percent.

Results

Exercise/Physical Activity

Table 2 presents the estimates for the exercise/PA models. All models contain the full set of covariates listed in the methodology section. However, for the sake of brevity, only the estimates for the HS athletics variable are presented. The logit estimates for frequency of vigorous exercise (Panel A) show that individuals who participated in HS athletics are 7.9 percentage points more likely to exercise at least one time per week and 5.7 percentage points more likely to exercise three or more times per week. These represent a 20 percent and 27.9 percent increase over the corresponding exercise rates for the sample. The PSM estimates are highly similar, showing a 7 (5.7) percentage point increase in the probability of exercising one (three) day(s) or more per week. The Lewbel IV estimates show even larger effects of participation in HS athletics on future exercise frequency. However, while the J-statistic supports validity of the instruments, the Kleibergen-Paap (KP) statistic indicates the instruments are weak, possibly introducing bias into the estimates.

Panel B presents the estimates for the impact on meeting the guidelines for PA in 2002. The results from logit estimation show participation in HS athletics increases the probability of meeting the basic (higher) PA guidelines by 5 (6) percentage points. The PSM estimates show a smaller, but still statistically significant, effect. The Lewbel estimates vary considerably. The basic guideline model shows no significant relationship between PA and HS athletics, while the higher guideline model shows an effect of HS athletics similar to the frequent exercise model estimates presented in panel A.
The consistency in the estimates between the frequent exercise models in panel A and the higher guideline models in panel B gives us greater confidence that these relationships are robust to different (in this case drastically different) measures of PA.

Finally, the logit and PSM based estimates for both the basic PA guideline and higher PA guideline models using the 2012 data are very similar to those using the 2002 data. In fact, the estimates show an even larger effect of participation in HS athletics on PA as individuals get older. Overall, the results indicate that participation in HS athletics has a positive, significant (both statistically and economically), and long lasting effect on future levels of PA.

Cigarette Smoking

The cigarette smoking models (table 3) also show a significant effect of participation in HS athletics. Being an athlete lowers the probability that an individual smokes daily in 1998 by 2.9 percentage points according to the logit model and by 3.2 percentage points according to the PSM routine, and the probability the individual has smoked 100-plus cigarettes in her lifetime by 4.4 and 5.0 percentage points, respectively. The Lewbel IV estimates are similar to those obtained via logistic regression and PSM, but are no longer statistically significant due to the marked increase in the standard error, which is a result of the instruments’ weakness. HS athletics continues to have a negative effect on smoking in 2012, but with a reduced magnitude. The PSM estimate for being a daily smoker is no longer statistically significant, while the estimate for ever having smoked 100-plus cigarettes is only significant at the ten-percent level. Part of the decline may be due to the decline in overall rates of current smoking. Overall, there does appear to be a significant, negative impact of HS athletics on smoking, at least for individuals in their thirties.
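To put the marginal effects reported above in relative terms, each one can be divided by the corresponding sample mean from Table 1. The short sketch below reproduces the 20 percent and 27.9 percent figures quoted for the 2000 exercise models and applies the same arithmetic to the 1998 logit smoking estimates; the smoking percentages are our own illustration and are not reported in the text, and the helper name `relative_effect` is hypothetical.

```python
def relative_effect(marginal_effect, baseline_mean):
    """Marginal effect expressed as a percent of the sample mean."""
    return 100.0 * marginal_effect / baseline_mean


# (logit marginal effect, sample mean from Table 1)
models = {
    "Exercise 1+ times per week, 2000": (0.079, 0.391),
    "Exercise 3+ times per week, 2000": (0.057, 0.204),
    "Currently smokes daily, 1998": (-0.029, 0.249),
    "Has smoked 100+ cigarettes, 1998": (-0.044, 0.479),
}

for name, (effect, mean) in models.items():
    print(f"{name}: {relative_effect(effect, mean):.1f} percent")
```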
Consumption of Alcohol

Participation in HS athletics results in a higher probability of consuming alcohol and a lower probability of staying within the CDC guidelines for alcohol consumption on a typical day during which the individual consumes alcohol. As with the exercise/PA models, the logit and PSM estimates are very similar for most of the models. According to the PSM estimates, participation in HS athletics leads to a 2.8 percentage point increase in the probability of consuming any alcohol during the previous thirty days and a 4.8 percentage point decrease in the probability of meeting the guideline for alcohol consumption on a typical day. These effects are even stronger in 2012. HS athletics results in a 5.1 percentage point increase in the probability of consuming any alcohol and a 5.4 percentage point reduction in the probability of meeting the guideline. In general, the Lewbel IV estimates are not significant, as tests again show the instruments to be valid, but weak. The one exception is the guideline model in 2012, which shows a large, negative effect of participation in HS athletics on the probability of maintaining the guidelines on a given day. However, given the weakness of the instruments, this estimate should be viewed with some hesitation. Overall, the results show a significant effect of participation in HS athletics on consumption of alcohol later in life. As with the PA models, these effects are long-lasting and may actually grow stronger over time.

Robustness check - controlling for a measure of time preference

As discussed in the methodology section, it has been shown that time preference is correlated with a variety of health behaviors and may also be correlated with participation in HS athletics. However, the NLSY79 only contains questions aimed at eliciting measures of time preference in the 2006 wave. Thus, we recreate each model using data from 2006 for the physical activity and drinking models (the NLSY79 does not contain information on smoking for the full sample in 2006).
Following Smith et al (2005), who used an indicator variable representing whether the individual saved or dis-saved to proxy for time preference, we use information on a hypothetical savings question as an alternative to the time preference variables. The savings variable is constructed from a series of questions aimed at eliciting information on the individual’s preferences over risk. First, individuals were asked “Suppose you have been given an item that is either worth nothing or worth $1,000. Tomorrow you will learn what it is worth. There is a 50-50 chance it will be worth $1,000 and a 50-50 chance it will be worth nothing. You can wait to find out how much the item is worth, or you can sell it before its value is determined. What is the lowest price that would lead you to sell the item now rather than waiting to see what it is worth?” Then, they were asked the follow up question “If you received [$ (value in RISK3)/your selling price], what percentage (0-100) of this would you save for the future rather than spend in the next 12 months?” We use the response to the second question, recoded to take values between zero and one, as our proxy for time preference.

The NLSY79 data provide an alternative to the hypothetical savings rate variable described above. Respondents were asked: “Suppose you have won a prize of $1000, which you can claim immediately. However you have the alternative of waiting one month to claim the prize. If you do wait, you will receive more than $1000. What is the smallest amount of money in addition to the $1000 you would have to receive one month from now to convince you to wait rather than claim the prize now?” While this information might seem to provide a more direct measure of time preference, discount rates constructed from this information do not show a significant correlation with any of the PA or drinking variables.
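The two candidate time preference measures described above can be made concrete with a small sketch. The recoding of the savings percentage to the unit interval follows the text directly. The discount rate formulas are an assumption on our part, since the text does not state how discount rates were constructed from the one-month prize question: here the monthly rate is taken to be the additional amount demanded divided by the $1000 prize, and it is annualized by compounding over twelve months. All function names are hypothetical.

```python
def savings_share(percent_saved):
    """Recode the 0-100 savings response to a value between zero and one."""
    if not 0 <= percent_saved <= 100:
        raise ValueError("percent_saved must lie between 0 and 100")
    return percent_saved / 100.0


def implied_monthly_discount_rate(extra_amount, prize=1000.0):
    """Assumed construction: extra dollars demanded for waiting one month,
    expressed as a fraction of the prize."""
    if extra_amount < 0:
        raise ValueError("extra_amount must be non-negative")
    return extra_amount / prize


def annualized_rate(monthly_rate):
    """Compound a monthly rate over twelve months."""
    return (1.0 + monthly_rate) ** 12 - 1.0


# A respondent who would save 25 percent of the proceeds and who requires
# an extra $50 to wait one month for the $1000 prize.
proxy = savings_share(25)
monthly = implied_monthly_discount_rate(50)
print(proxy, monthly, round(annualized_rate(monthly), 4))
```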
Kosteas (2015) also found the savings variable to be a stronger predictor of PA compared to discount rates calculated from the responses to this question.

Table 5 presents the estimates for the PA and drinking models using the 2006 data. Each model is first estimated without the time preference proxy (columns 1-3) and then again including the time preference measure (columns 4-6). When excluding the time preference proxy from the list of control variables, the estimated effect of participation in HS athletics on PA in 2006 is qualitatively similar to the estimates obtained using the 2002 and 2012 samples; however, the magnitudes of the coefficients from the logit models are somewhat smaller. The PSM estimates are very similar to those obtained using the 2002 data. It is important to note here that the PSM estimates are actually slightly larger in magnitude compared with the logit estimates. Interestingly, the Lewbel IV estimates of the effect of participation in HS athletics on current PA levels are large and statistically significant. However, as before, some caution is warranted as the KP tests indicate the instruments are weak.

Including the hypothetical savings variable has very little impact on the coefficient estimates. The proxy behaves exactly as we would expect. Individuals who are more future oriented (i.e. would save a larger fraction of the hypothetical award) are more likely to meet both the basic and the higher PA guidelines. Overall, the estimates for the PA models are robust to the inclusion of the time preference proxy.

Switching to the models for consumption of alcohol, we see similar patterns in the models where we do not control for time preference. As with the PA models, the logit estimates are smaller in magnitude for 2006 compared with the 2002 and 2012 estimates. However, the PSM estimates are similar in magnitude to those for 2002.
Consistent with the results for the 2012 data, the Lewbel IV estimates show a strong, negative effect of participation in HS athletics on the probability of adhering to the guidelines for alcohol consumption on a typical day. Inclusion of the savings variable does not have a significant impact on either the logit or IV estimates; however, the PSM estimates decline substantially and are no longer statistically significant. Thus, the estimates for the alcohol consumption models are not as robust to the inclusion of the savings variable as are the PA models.

Robustness check - controlling for log wage and hours worked per week

As an additional robustness check, we estimate the models including the log hourly wage and average hours worked per week as additional control variables. The inclusion of these variables restricts the sample to working individuals. Given the weakness of the generated instruments in the Lewbel IV approach, we focus here on the logit and PSM estimates. Generally, the results (Table 6) are highly consistent with those presented in tables 2-4. It does not appear that the effect of participation in HS athletics on health behaviors depends on labor market status. Furthermore, the coefficient estimate on the hours worked per week variable is not statistically significant in any of the models.

The consistency of our results when conducting these specification checks increases our confidence in the primary findings presented in tables 2-4. In general, there is strong evidence suggesting participation in HS athletics leads to higher levels of PA and lower rates of smoking in middle age, with a positive, but less robust, effect on consumption of alcohol.

Conclusions

We estimate the effect of participation in high school athletics on several health behaviors during adulthood, through middle age.
Specifically, we examine whether playing sports in high school affects the probability of meeting established guidelines for physical activity, being a daily smoker or ever having been a regular smoker, and whether and how much alcohol an individual consumes. We look at these relationships over a twelve year period, allowing us to determine whether these effects are persistent over time. We find that participation in HS athletics raises the level of PA and leads to lower rates of smoking, but higher rates of drinking alcohol. These findings are consistent with existing studies showing that high school and college athletes drink more but are less likely to smoke cigarettes when compared with their classmates who do not play sports, suggesting that these results may at least partly be the result of habit formation. Furthermore, the estimated effects are long-lasting and may grow stronger over time for certain behaviors.

This study adds to our understanding of the many potential benefits of participation in athletics during high school. In addition to greater educational attainment and improved labor market outcomes, it appears former athletes also invest more in their health capital. The results provide additional justification for the continuing financial support for high school athletics by school districts. By using multiple estimation techniques, including propensity score matching and a relatively new instrumental variables routine, the present paper attempts to push beyond simply establishing correlation and towards the estimation of causal effects. The results are consistent with these relationships being causal. However, more work using other estimators is needed in order to more confidently assert the causal nature of these correlations. Additionally, future work is needed to assess the extent of participation in high school athletics that is needed to generate these benefits.
Unfortunately, the limited information on participation available in the NLSY79 does not allow us to do so in the present study.

References

Acland, D., and M. Levy (2012). Habit formation and naïveté in gym attendance: evidence from a field experiment, WP LSE Research Online.
Adams, J., & Nettle, D. (2009). Time perspective, personality and smoking, body mass, and physical activity: An empirical study. British Journal of Health Psychology, 14(1), 83-105.
Anderson, L. R., & Mellor, J. M. (2008). Predicting health behaviors with an experimental measure of risk preference. Journal of Health Economics, 27(5), 1260-1274.
Anderson, K. H., Foster, J. E., & Frisvold, D. E. (2010). Investing in health: the long-term impact of head start on smoking. Economic Inquiry, 48(3), 587-602.
Anokye, Nana Kwame, Subhash Pokhrel, Martin Buston, and Julia Fox-Rushby (2012). “The demand for sports and physical activity: results from an illustrative survey,” European Journal of Health Economics, Vol. 13: 277-287.
Argys, L. M., Rees, D. I., Averett, S. L., & Witoonchart, B. (2006). Birth order and risky adolescent behavior. Economic Inquiry, 44(2), 215-233.
Babcock, P. S., and J. L. Hartmann (2010). Networks and workouts: treatment size and status specific peer effects in a randomized field experiment, NBER WP 16581.
Bailey, R., Hillman, C., Arent, S., & Petitpas, A. (2013). Physical activity: an underestimated investment in human capital. J Phys Act Health, 10(3), 289-308.
Barron, J. M., Ewing, B. T., & Waddell, G. R. (2000). “The effects of high school athletics participation on education and labor market outcomes.” The Review of Economics and Statistics, 82, 409-421.
Borghans, Lex, and Bart HH Golsteyn. "Time discounting and the body mass index: Evidence from the Netherlands." Economics & Human Biology 4.1 (2006): 39-61.
Breuer, Christoph, and Pamela Wicker. "Decreasing sports activity with increasing age? Findings from a 20-year longitudinal and cohort sequence analysis."
Research quarterly for physical activity and sport 80.1 (2009): 22-31.
Brown, H., & Pol, M. (2014). The Role of Time Preferences in the Intergenerational Transfer of Smoking. Health Economics, 23(12), 1493-1501.
Cabane, C., & Lechner, M. (2014). Physical activity of adults: A survey of correlates, determinants, and effects. ZEW-Centre for European Economic Research Discussion Paper, (14088).
Carrell, S. E., M. Hoekstra, and J. E. West (2011). Is poor fitness contagious? Evidence from randomly assigned friends, Journal of Public Economics, 95, 657-663.
Cawley, J. (2004). “An economic framework for understanding physical activity and eating behaviors.” American Journal of Preventive Medicine, 27(3): 117-125.
Charness, G., and U. Gneezy (2009). Incentives to exercise, Econometrica, 77, 909-931.
Costa-Font, J., Hernández-Quevedo, C., & Jiménez-Rubio, D. (2014). Income inequalities in unhealthy life styles in England and Spain. Economics & Human Biology, 13, 66-75.
Cotti, C., Dunn, R. A., & Tefft, N. (2014). The Dow is killing me: risky health behaviors and the stock market. Health Economics.
Chesson, H. W., Leichliter, J. S., Zimet, G. D., Rosenthal, S. L., Bernstein, D. I., & Fife, K. H. (2006). Discount rates and risky sexual behaviors among teenagers and young adults. Journal of Risk and uncertainty, 32(3), 217-230.
Delaney, L., Harmon, C., & Wall, P. (2008). Behavioral economics and drinking behavior: preliminary results from an Irish college study. Economic Inquiry, 46(1), 29-36.
Delaney, L., Kapteyn, A., & Smith, J. P. (2013). Why do some Irish drink so much? Family, historical and regional effects on students’ alcohol consumption and subjective normative thresholds. Review of Economics of the Household, 11(1), 1-27.
Dohle, S., & Wansink, B. (2013). Fit in 50 years: participation in high school sports best predicts one’s physical activity after Age 70. BMC public health, 13(1), 1100.
Dohmen, T., Falk, A., Huffman, D., Sunde, U., Schupp, J., & Wagner, G. G.
(2011). Individual risk attitudes: Measurement, determinants, and behavioral consequences. Journal of the European Economic Association, 9(3), 522-550.
Downward, P. M., F. Lera-Lopez, and S. Rasciute (2011). The Economic Analysis of Sports Participation in Robinson, L., Bodet, G., and Downward, P. (eds.), International Handbook of Sports Management, London: Routledge, 331-353.
Downward, P. M., and S. Rasciute (2011). Does Sport Make You Happy? An Analysis of the Well-being Derived from Sports Participation, International Review of Applied Economics, 25 (3), 331–348.
Downward, Paul, and Joseph Riordan. "Social interactions and the demand for sport: An economic analysis." Contemporary Economic Policy 25.4 (2007): 518-537.
Eberth, Barbara and Murray D. Smith (2010). Modelling the participation decision and duration of sporting activity in Scotland. Economic Modelling, Vol. 27: 822-834.
Eide, E. R., & Ronan, N. (2001). “Is participation in high school athletics an investment or a consumption good?” Economics of Education Review, 20, 431-442.
Eisenberg, Daniel and Okeke Edward. "Too cold for a jog? Weather, physical activity, and socioeconomic status." The BE Journal of Economic Analysis & Policy 9.1 (2009): 1-32.
Ewing, B. T. (2007). The labor market effects of high school athletic participation. Journal of Sports Economics, 8, 255-265.
Farrell, Lisa, and Michael A. Shields. "Investigating the economic and demographic determinants of sporting participation in England." Journal of the Royal Statistical Society: Series A (Statistics in Society) 165.2 (2002): 335-348.
Fridberg, T. (2010). Sport and exercise in Denmark, Scandinavia and Europe, Sport in Society, 13, 583-592.
García, J., F. Lera-López, and M. J. Suárez (2011). Estimation of a structural model of the determinants of the time spent on physical activity and sport: Evidence for Spain, Journal of Sports Economics, 12, 515-537.
Geisner, I. M., Grossbard, J., Tollison, S., & Larimer, M. E. (2012).
Differences between athletes and non-athletes in risk and health behaviors in graduating high school seniors. Journal of Child & Adolescent Substance Abuse, 21(2), 156-166.
Goel, R. K., & Naretta, M. A. (2011). Determinants of various aspects of smoking behaviour in the United States. Applied Economics Letters, 18(17), 1671-1675.
Göhlmann, S., Schmidt, C. M., & Tauchmann, H. (2010). Smoking initiation in Germany: the role of intergenerational transmission. Health Economics, 19(2), 227-242.
Harrison, G.W., M. Lau, and E.E. Rutstrom (2010). Individual Discount Rates and Smoking: Evidence from a Field Experiment in Denmark. Journal of Health Economics 29 (5): 708–17.
Hartmann, D., & Massoglia, M. (2007). Reassessing the relationship between high school sports participation and deviance: Evidence of enduring, bifurcated effects. The Sociological Quarterly, 48(3), 485-505.
Heckman, James J., Hidehiko Ichimura and Petra E. Todd (1997). “Matching as an Econometric Evaluation Estimator.” The Review of Economics and Statistics, Vol. 65(2): 261-294.
Henderson, D. J., Olbrecht, A., & Polachek, S. W. (2006). Do former college athletes earn more at work? A nonparametric assessment. Journal of Human Resources, 41(3), 558-577.
Hovemann, G., and P. Wicker, (2009). Determinants of sport participation in the European Union. European Journal for Sport and Society, 6, 51-59.
Humphreys, Brad R., and Jane E. Ruseski. "An economic analysis of participation and time spent in physical activity." The BE Journal of Economic Analysis & Policy 11.1 (2011).
Humphreys, B. R., & Ruseski, J. E. (2015). The Economic Choice of Participation and Time Spent in Physical Activity and Sport in Canada. International Journal of Sport Finance, 10(2), 138.
Humphreys, Brad R., and Jane E. Ruseski. "Participation in physical activity and government spending on parks and recreation." Contemporary Economic Policy 25.4 (2007): 538-552.
Jones, A. M. (1994).
Health, addiction, social interaction and the decision to quit smoking. Journal of Health Economics, 13(1), 93-110.
Kari, J. T., Pehkonen, J., Hirvensalo, M., Yang, X., Hutri-Kähönen, N., Raitakari, O. T., & Tammelin, T. H. (2015). Income and Physical Activity among Adults: Evidence from Self-Reported and Pedometer-Based Physical Activity Measurements. PloS one, 10(8), e0135651.
Khwaja, A., Silverman, D., & Sloan, F. (2007). Time preference, time discounting, and smoking decisions. Journal of Health Economics, 26(5), 927-949.
Kosteas, V. D. (2011). High school clubs participation and future supervisory status. British Journal of Industrial Relations, 49(s1), s181-s206.
Kosteas, V. D. (2015). Physical activity and time preference. International Journal of Health Economics and Management, 1-26.
Lechner, Michael. "Long-run labour market and health effects of individual sports activities." Journal of Health Economics 28.4 (2009): 839-854.
Lechner, M., & Sari, N. (2015). Labor market effects of sports and exercise: Evidence from Canadian panel data. Labour Economics, 35, 1-15.
Leslie, K. J., and M. I. Norton (2012). Exercising to the lowest common denominator, mimeo.
Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics, 30(1), 67-80.
Lundborg, P. (2006). Having the wrong friends? Peer effects in adolescent substance use. Journal of Health Economics, 25(2), 214-233.
Lundborg, P., & Lindgren, B. (2004). Do they know what they are doing? Risk perceptions and smoking behaviour among Swedish teenagers. Journal of Risk and Uncertainty, 28(3), 261-286.
Lundborg, P., & Lindgren, B. (2002). Risk perceptions and alcohol consumption among young people. Journal of Risk and Uncertainty, 25(2), 165-183.
Maruyama, Shiko and Qing Yin (2012). “The opportunity cost of exercise: Do higher-earning Australians exercise longer, harder, or both?” Health Policy, Vol.
106: 187-194.
Meltzer, David O. and Anupam B. Jena (2010). “The economics of intense physical activity,” Journal of Health Economics, Vol. 29: 347-352.
Mullahy, John, and Stephanie A. Robert. "No time to lose: time constraints and physical activity in the production of health." Review of Economics of the Household 8.4 (2010): 409-432.
Naylor, A. H., Gardner, D., & Zaichkowsky, L. (2001). Drug use patterns among high school athletes and nonathletes. Adolescence, 36(144), 627.
Nonnemaker, J. M., & Farrelly, M. C. (2011). Smoking initiation among youth: the role of cigarette excise taxes and prices by race/ethnicity and gender. Journal of Health Economics, 30(3), 560-567.
Park, C., & Kang, C. (2008). Does education induce healthy lifestyle? Journal of Health Economics, 27(6), 1516-1531.
Powell, L. M., Tauras, J. A., & Ross, H. (2005). The importance of peer effects, cigarette prices and tobacco control policies for youth smoking behavior. Journal of Health Economics, 24(5), 950-968.
Richards, T. J., & Hamilton, S. F. (2012). Obesity and hyperbolic discounting: an experimental analysis. Journal of Agricultural and Resource Economics, 37(2), 181.
Rooth, D. O. (2011). Work out or out of work—The labor market return to physical fitness and leisure sports activities. Labour Economics, 18(3), 399-409.
Royer, H., M. F. Stehr, and J. R. Sydnor (2015). Incentives, commitments and habit formation in exercise: evidence from a field experiment with workers at a fortune-500 company, American Economic Journal: Applied Economics, Vol. 7 No. 3: 51-84.
Rubin, D. B. (2001). Using propensity scores to help design observational studies: application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2(3-4), 169-188.
Ruseski, Jane E., and Katerina Maresova. "Economic Freedom, Sport Policy, and Individual Participation In Physical Activity: An International Comparison." Contemporary Economic Policy 32.1 (2014): 42-55.
Smith, Patricia K., Barry Bogin and David Bishai (2005). “Are time preference and body mass index associated? Evidence from the National Longitudinal Survey of Youth,” Economics and Human Biology, Vol. 3: 259-270.
Stamatakis, E., and M. Chaudhury (2008). Temporal trends in adults’ sports participation patterns in England between 1997 and 2006: the Health Survey for England, British Journal of Sports Medicine, 42, 901-908.
Stevenson, Betsey (2010). Beyond the classroom: using Title IX to measure the return to high school sports. Review of Economics and Statistics, Vol. 92(2): 284-301.
Terry‐McElrath, Y. M., & O'Malley, P. M. (2011). Substance use and exercise participation among young adults: parallel trajectories in a national cohort‐sequential study. Addiction, 106(10), 1855-1865.
Troutman, K. P., & Dufur, M. J. (2007). From High School Jocks to College Grads Assessing the Long-Term Effects of High School Sport Participation on Females' Educational Attainment. Youth & Society, 38(4), 443-462.
Wetherill, Reagan R., and Kim Fromme. "Alcohol use, sexual activity, and perceived risk in high school athletes and non-athletes." Journal of Adolescent Health 41, no. 3 (2007): 294-301.
Williams, J. (2005). Habit formation and college students' demand for alcohol. Health Economics, 14(2), 119-134.
Witham, M. D., P. T. Donnan, T. Vadiveloo, F. F. Sniehotta, I. K. Crombie, Z. Feng, and M. E. T. McMurdo (2014). Association of day length and weather conditions with PA levels in older community dwelling people, PLoS ONE, 9, e85331.
Zhang, L. E. I., & Rashad, I. (2008). Obesity and time preference: the health consequences of discounting the future. Journal of Biosocial Science, 40(01), 97-113.
Table 1: Summary Statistics

                                        Mean     Std Dev

2000 Exercise Models (N = 5,538)
  Exercise 1+ times per week            0.391    0.488
  Exercise 3+ times per week            0.204    0.403
  Participated in HS athletics          0.403    0.491

2002 Physical Activity Models (N = 4,900)
  Met Basic PA Guideline                0.573    0.495
  Met Higher PA Guideline               0.458    0.498
  Participated in HS athletics          0.399    0.49

2012 Physical Activity Models (N = 5,198)
  Met Basic PA Guideline                0.565    0.496
  Met Higher PA Guideline               0.422    0.494
  Participated in HS athletics          0.402    0.49

1998 Smoking Models (N = 5,702)
  Currently Smokes Daily                0.249    0.432
  Has Smoked 100+ Cigarettes            0.479    0.5
  Participated in HS athletics          0.405    0.491

2012 Smoking Models (N = 5,617)
  Currently Smokes Daily                0.184    0.387
  Has Smoked 100+ Cigarettes            0.579    0.494
  Participated in HS athletics          0.398    0.49

1998 Alcohol Models (N = 5,671)
  Consumed any alcohol past 30 days     0.562    0.496
  Meets Guidelines for Alcohol          0.686    0.464
  Participated in HS athletics          0.407    0.491

2012 Alcohol Models (N = 5,588)
  Consumed any alcohol past 30 days     0.564    0.496
  Meets Guidelines for Alcohol          0.708    0.454
  Participated in HS athletics          0.399    0.49


Table 2: Determinants of Exercise/Physical Activity

                                Logit             PSM               Lewbel IV

Panel A: 2000 (N=5,538)
Exercise 1+ Times Per Week
  Athletics                     0.079** (0.013)   0.07** (0.016)    0.129+ (0.068)
  Hansen's J-statistic                                              13.46 (0.76)
  Kleibergen-Paap statistic                                         6.7
  Shea Partial R-squared                                            0.0426
Exercise 3+ Times Per Week
  Athletics                     0.057** (0.011)   0.057** (0.013)   0.115+ (0.061)
  Hansen's J-statistic                                              11.02 (0.89)
  Kleibergen-Paap statistic                                         6.7
  Shea Partial R-squared                                            0.0426

Panel B: 2002 (N=4,900)
Met Basic PA Guideline
  Athletics                     0.05** (0.015)    0.039* (0.017)    0.006 (0.069)
  Hansen's J-statistic
  Kleibergen-Paap statistic                                         6.03
  Shea Partial R-squared                                            0.0462
Met Higher PA Guideline
  Athletics                     0.06** (0.015)    0.048* (0.017)    0.125+ (0.072)
  Hansen's J-statistic                                              17.89 (0.46)
  Kleibergen-Paap statistic                                         6.03
  Shea Partial R-squared                                            0.0462

Panel C: 2012 (N=5,198)
Met Basic PA Guideline
  Athletics                     0.048** (0.015)   0.049** (0.018)   0.04 (0.066)
  Hansen's J-statistic                                              18.7 (0.41)
  Kleibergen-Paap statistic                                         7.26
  Shea Partial R-squared                                            0.0503
Met Higher PA Guideline
  Athletics                     0.058** (0.015)   0.058** (0.018)   0.081 (0.068)
  Hansen's J-statistic                                              13.8 (0.74)
  Kleibergen-Paap statistic                                         7.26
  Shea Partial R-squared                                            0.0503

Table presents marginal effects for logit models and average treatment effects for PSM models. All models contain the full set of control variables. Robust standard errors are in parentheses. +, *, ** denote significance at the 10%, 5%, 1% level, respectively. KP critical values: 20% = 6.31, 10% = 11.46, 5% = 21.36.


Table 3: Determinants of Smoking

                                Logit             PSM               Lewbel IV

Panel A: 1998 (N=5,707)
Currently Smokes Daily
  Athletics                     -0.029* (0.012)   -0.032* (0.014)   0.036 (0.056)
  Hansen's J-statistic                                              21.3 (0.26)
  Kleibergen-Paap statistic                                         6.06
  Shea Partial R-squared                                            0.038
Has Smoked 100+ Cigarettes
  Athletics                     -0.044** (0.014)  -0.05** (0.016)   0.042 (0.07)
  Hansen's J-statistic                                              13.9 (0.74)
  Kleibergen-Paap statistic                                         6.06
  Shea Partial R-squared                                            0.038

Panel B: 2012 (N=5,617)
Currently Smokes Daily
  Athletics                     -0.028* (0.011)   -0.021 (0.014)    -0.022 (0.041)
  Hansen's J-statistic                                              16.0 (0.59)
  Kleibergen-Paap statistic                                         8.31
  Shea Partial R-squared                                            0.0526
Has Smoked 100+ Cigarettes
  Athletics                     -0.036** (0.014)  -0.031+ (0.017)   0.058 (0.063)
  Hansen's J-statistic                                              27.1 (0.08)
  Kleibergen-Paap statistic                                         8.31
  Shea Partial R-squared                                            0.0526

Table presents marginal effects for logit models and average treatment effects for PSM models. All models contain the full set of control variables. Robust standard errors are in parentheses. +, *, ** denote significance at the 10%, 5%, 1% level, respectively. KP critical values: 20% = 6.31, 10% = 11.46, 5% = 21.36.


Table 4: Determinants of Alcohol Consumption

                                Logit             PSM               Lewbel IV

Panel A: 2002 (N=5,671)
Consumed any Alcohol Past 30 Days
  Athletics                     0.041** (0.014)   0.028+ (0.015)    -0.017 (0.067)
  Hansen's J-statistic                                              12.2 (0.84)
  Kleibergen-Paap statistic                                         6.39
  Shea Partial R-squared                                            0.0411
Meets Guidelines for Alcohol
  Athletics                     -0.051** (0.013)  -0.048** (0.015)  -0.032 (0.064)
  Hansen's J-statistic                                              21.3 (0.26)
  Kleibergen-Paap statistic                                         6.39
  Shea Partial R-squared                                            0.0411

Panel B: 2012 (N=5,588)
Consumed any Alcohol Past 30 Days
  Athletics                     0.046** (0.014)   0.051** (0.016)   0.062 (0.06)
  Hansen's J-statistic                                              16.3 (0.57)
  Kleibergen-Paap statistic                                         8.24
  Shea Partial R-squared                                            0.0525
Meets Guidelines for Alcohol
  Athletics                     -0.056** (0.013)  -0.054** (0.014)  -0.155** (0.056)
  Hansen's J-statistic                                              15.46 (0.63)
  Kleibergen-Paap statistic                                         8.24
  Shea Partial R-squared                                            0.0525

Table presents marginal effects for logit models and average treatment effects for PSM models. All models contain the full set of control variables. Robust standard errors are in parentheses. +, *, ** denote significance at the 10%, 5%, 1% level, respectively. KP critical values: 20% = 6.31, 10% = 11.46, 5% = 21.36.


Table 5: Robustness Check, Controlling for Time Preference

                                Logit             PSM               Lewbel IV

Panel A: Basic PA guideline
Excluding Time Preference
  Athletics                     0.034* (0.015)    0.036* (0.018)    0.141* (0.067)
  Observations                  5,519             5,519             5,519
  Hansen's J-statistic                                              21.3 (0.26)
  Kleibergen-Paap statistic                                         6.48
  Shea Partial R-squared                                            0.0467
Including Time Preference
  Athletics                     0.034* (0.015)    0.035* (0.017)    0.149* (0.071)
  Time Preference               0.067** (0.017)                     0.065** (0.017)
  Observations                  5,278             5,278             5,278
  Hansen's J-statistic                                              24.1 (0.19)
  Kleibergen-Paap statistic                                         5.63
  Shea Partial R-squared                                            0.0441

Panel B: Higher PA guideline
Excluding Time Preference
  Athletics                     0.043** (0.014)   0.048** (0.018)   0.203** (0.068)
  Observations                  5,519             5,519             5,519
  Hansen's J-statistic                                              24.4 (0.14)
  Kleibergen-Paap statistic                                         6.48
  Shea Partial R-squared                                            0.0467
Including Time Preference
  Athletics                     0.039** (0.015)   0.04* (0.017)     0.208** (0.071)
  Time Preference               0.062** (0.017)                     0.061** (0.017)
  Observations                  5,278             5,278             5,278
  Hansen's J-statistic                                              25.0 (0.16)
  Kleibergen-Paap statistic                                         5.63
  Shea Partial R-squared                                            0.0441

Panel C: Any drinks past 30 days
Excluding Time Preference
  Athletics                     0.034* (0.013)    0.034* (0.016)    0.083 (0.062)
  Observations                  6,178             6,178             6,178
  Hansen's J-statistic                                              25.9 (0.08)
  Kleibergen-Paap statistic                                         7.46
  Shea Partial R-squared                                            0.0465
Including Time Preference
  Athletics                     0.032* (0.014)    0.015 (0.016)     0.09 (0.065)
  Time Preference               -0.068** (0.015)                    -0.068** (0.015)
  Observations                  5,889             5,889             5,889
  Hansen's J-statistic                                              27.9 (0.09)
  Kleibergen-Paap statistic                                         6.31
  Shea Partial R-squared                                            0.0428

Panel D: Alcohol guideline
Excluding Time Preference
  Athletics                     -0.045** (0.012)  -0.048** (0.016)  -0.158** (0.056)
  Observations                  6,178             6,178             6,178
  Hansen's J-statistic                                              27.0 (0.08)
  Kleibergen-Paap statistic                                         7.46
  Shea Partial R-squared                                            0.0465
Including Time Preference
  Athletics                     -0.044** (0.013)  -0.02 (0.015)     -0.168** (0.059)
  Time Preference               0.067** (0.015)                     0.067** (0.015)
  Observations                  5,889             5,889             5,889
  Hansen's J-statistic                                              26.5 (0.12)
  Kleibergen-Paap statistic                                         6.31
  Shea Partial R-squared                                            0.0428

Table presents marginal effects for logit models and average treatment effects for PSM models. All models contain the full set of control variables. Robust standard errors are in parentheses. +, *, ** denote significance at the 10%, 5%, 1% level, respectively.
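The significance markers defined in the table notes can be related to a reported estimate and its robust standard error. The sketch below is only an illustration: it assumes a two-sided test based on the standard normal distribution, and the helper names two_sided_p and stars are hypothetical. The inputs are rounded Athletics estimates from Table 2; because the published markers come from unrounded estimates, recomputing them from rounded figures will not always reproduce the tables.

```python
import math


def two_sided_p(estimate, std_error):
    """Two-sided p-value for H0: effect = 0 under a normal approximation (assumed)."""
    z = abs(estimate / std_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))


def stars(p_value):
    """Map a p-value to the markers used in the table notes."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    if p_value < 0.10:
        return "+"
    return ""


# (estimate, robust standard error) pairs as printed in Table 2.
examples = [(0.079, 0.013), (0.039, 0.017), (0.129, 0.068), (0.006, 0.069)]
for est, se in examples:
    p = two_sided_p(est, se)
    print(f"{est} ({se}): p = {p:.3f} {stars(p)}")
```

For instance, the 2000 Lewbel IV estimate of 0.129 with standard error 0.068 gives a z-ratio just under 1.9, which falls between the 10% and 5% thresholds and so carries the + marker.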
Table 6: Robustness Check, Including Only Employed Individuals

                                          Logit (ME)         PSM (ATE)          Observations

Panel A: Exercise
  Met Basic PA Guidelines (2002)          0.048** (0.016)    0.033+ (0.018)     4,245
  Met Higher PA Guidelines (2002)         0.056** (0.016)    0.055** (0.018)    4,245
  Met Basic PA Guidelines (2012)          0.047** (0.017)    0.044* (0.019)     3,974
  Met Higher PA Guidelines (2012)         0.062** (0.017)    0.057** (0.02)     3,974

Panel B: Smoking
  Currently Smokes Daily (1998)           -0.038** (0.012)   -0.048** (0.013)   4,966
  Has Smoked 100+ Cigarettes (1998)       -0.054** (0.015)   -0.055** (0.017)   4,963
  Currently Smokes Daily (2012)           -0.023* (0.012)    -0.0057 (0.012)    4,306
  Has Smoked 100+ Cigarettes (2012)       -0.037* (0.016)    -0.053** (0.017)   4,306

Panel C: Drinking
  Consumed Alcohol Past 30 Days (2002)    0.046** (0.015)    0.043** (0.016)    4,911
  Meets Alcohol Guidelines (2002)         -0.06** (0.014)    -0.06** (0.016)    4,911
  Consumed Alcohol Past 30 Days (2012)    0.048** (0.016)    0.058** (0.018)    4,300
  Meets Alcohol Guidelines (2012)         -0.071** (0.015)   -0.065** (0.017)   4,300

Table presents marginal effects for logit models and average treatment effects for PSM models. All models contain the full set of control variables, including the log hourly wage and hours worked per week. Robust standard errors are in parentheses. +, *, ** denote significance at the 10%, 5%, 1% level, respectively.
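Because the logit columns report marginal effects rather than raw coefficients, it may help to see how a marginal effect for a binary regressor such as athletics participation can be obtained from a fitted logit. The sketch below is a minimal illustration under stated assumptions: it averages the discrete change in predicted probability over the sample (the table notes do not say whether effects are averaged or evaluated at sample means), and every number in it, including the coefficients and the four covariate rows, is hypothetical rather than taken from the paper.

```python
import math


def logistic(z):
    """Logistic CDF, the link function of the logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(rows, beta_controls, beta_treat, intercept):
    """Average change in P(y = 1) when the binary regressor moves from 0 to 1.

    rows holds each person's control values; the binary regressor is switched
    off and on for everyone while the controls stay at their observed values.
    """
    changes = []
    for controls in rows:
        index = intercept + sum(b * x for b, x in zip(beta_controls, controls))
        changes.append(logistic(index + beta_treat) - logistic(index))
    return sum(changes) / len(changes)


# Hypothetical data: two control variables for four individuals.
rows = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]

# Hypothetical logit coefficients (not estimates from the paper).
ame = average_marginal_effect(
    rows, beta_controls=(0.4, -0.3), beta_treat=0.25, intercept=-0.5
)
print(f"average marginal effect: {ame:.4f}")
```

Read this way, an entry such as 0.048 in the Logit (ME) column corresponds to a 4.8 percentage point difference in the predicted probability of the outcome between participants and non-participants with the same values of the controls.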