TILBURG UNIVERSITY - FACULTY OF ECONOMICS AND BUSINESS ADMINISTRATION

Health Care Reform and its Effects on Labour Absenteeism Due to Sick Leave: Evidence from Chile

Master Thesis
August 2011

Name: Isabel Asenjo-Andrews
ANR: 946684
Programme: Master of Science in Economics
Supervisor: Catherine Schaumans
Word Count: 16,500 approx.

This paper studies the effect of the GES (Spanish initials for ‘Explicit Health Guarantees’) or AUGE (Spanish initials for ‘Health Plan of Universal Access with Explicit Guarantees’) plan on labour absenteeism due to sick leave in Chile. The hypothesis is that sick leave rose because of a possible intensification of moral hazard resulting from the more comprehensive coverage and cheaper access to medical treatment for the increasing number of illnesses included in the GES plan. The data are taken from the 2004, 2006 and 2009 waves of the Chilean survey ‘Encuesta de Protección Social’. The effect of the GES plan is estimated using a zero-inflated negative binomial regression model. The results show a positive marginal effect of the reform on the number of days of labour absenteeism that people take. The contribution of this study is to be one of the first to provide empirical evidence of the effect of this major Chilean health care reform on sick leave.

Table of Contents

I. Introduction
II. The Chilean Health Insurance System and Health Care Reform
III. Previous Literature
IV. Theoretical Framework
V. Data
    V.1 Sample Selection
    V.2 Variable Definitions
    V.3 Descriptive Statistics
VI. Estimation Strategy
    VI.1 Poisson Model
    VI.2 Negative Binomial Model
    VI.3 Zero-Inflated Binomial Model
    VI.4 Hurdle Model
    VI.5 Endogeneity
VII. Results and Discussion
    VII.1 Zero Inflated Negative Binomial Model
    VII.2 Hurdle Model
    VII.3 Robustness Checks
        (i) Fit of the Model
        (ii) Potential Endogeneity
        (iii) Comparison with Hurdle Model
        (iv) Alternative Specifications
VIII. Conclusions
IX. Appendix
X. References

I. Introduction

The pursuit of a health care system that satisfies the demand of the population whilst avoiding perverse incentives for its consumers is a task that daunts every government. Providing quality health care is fundamental for equity considerations and also for the survival of the population and the productivity of the economy. Generous health care provision has side effects that must be weighed against its benefits in order to provide an adequate quantity in the most efficient way possible. Individuals’ behaviour is strongly affected by the incentives that they face. Labour supply is determined by those who are part of the workforce, and therefore may be strongly affected by economic incentives. Changes in health insurance coverage can influence individuals’ incentives and consequently their behaviour regarding labour absenteeism due to sick leave. Given the weight of health expenditures in government budgets, it is important to take into account the alteration of incentives caused by reforms to the health care market when designing public policies.
More generous health insurance may not only relieve individuals in cases of sickness or disability, but can also help induce such outcomes. Because insurance mitigates the consequences of risk, the problem of moral hazard is bound to increase with higher insurance coverage. Moral hazard occurs when individuals behave differently when they are insured against a risk than when they are fully exposed to it. Health insurance can slightly encourage behaviours that increase the probability of injury or illness, but it can also alter its consumers’ incentives more subtly, by lowering the threshold of pain that individuals are willing to tolerate. If this is the case, they are likely to go to the doctor more often, increasing their absence from work. This behaviour arises from changes in incentives, not in medical needs. If the incentives of health care consumers are not in line with those of the health care authorities, the expenditures of the government and of private insurers could increase enormously. The government’s need to maintain a balanced budget would necessarily lead to higher taxes or lower expenditures in other important areas; private insurers, in order to prevent losses, would increase their prices. Both consequences are highly undesirable for health care consumers.

In Chile, the GES [1] (or AUGE [2]) plan began in 2005 as part of a major health reform that has been under way since the year 2000. The objective of this component of the reform is to guarantee access to high quality medical attention, in a reasonable amount of time, with good financial support. Both the public and private insurers (FONASA [3] and ISAPRES [4], respectively) in Chile are required to cover the illnesses included in the GES plan.

[1] Spanish initials for ‘Explicit Health Guarantees’.
[2] Spanish initials for ‘Health Plan of Universal Access with Explicit Guarantees’. The GES plan was originally called AUGE and is often referred to by this name.
The number of illnesses included has increased substantially in recent years: the plan began in 2005 with 25 illnesses, covered 56 by 2007 and 69 by 2010. People who previously could not afford to seek medical attention when they actually needed it are now able to do so, which increases their labour absenteeism due to real illness. On the other hand, the plan could also increase the frequency of labour absenteeism due to sick leave simply because it facilitates access to medical attention: the lower costs and increased availability could be an incentive for people to go to the doctor more often when they are not very sick. The demand for health care is fairly inelastic, which might suggest there is little room for moral hazard. However, because health insurance exists, the problem of excessive consumption exists as well, and it could have increased due to the greater health care coverage implemented by the GES plan.

The amount spent on sick leave reimbursement [5] in Chile increased by 109 percent between 2001 and 2009. Up until 2004, the expenditures of the public health care system (FONASA) were lower than those of private insurers (ISAPRES). This situation was reversed afterwards, as the expenditures of FONASA grew faster than those of the ISAPRES. In 2009, FONASA’s sick leave compensations amounted to 61.7 percent of total reimbursements. The increase in these expenditures was due to a higher average daily reimbursement (caused by an increase in wage rates), a larger number of subscribers to the public health care system and an increase in the days of labour absenteeism due to sick leave. This last factor was the main contributor to higher expenditures after 2005: the number of days of sick leave increased by 84 percent between 2001 and 2009, with 80 percent of this increase taking place between 2004 and 2009.
The expenditures on sick leave of ISAPRES increased at a much slower rate than those of FONASA, and the increase was due both to a rise in the number of days of absenteeism and to higher compensations being paid. These figures [6], assuming that health conditions in Chile did not deteriorate systematically over these years [7], suggest that there was a definite change in consumers’ incentives that could have affected their behaviour with respect to labour absenteeism due to sick leave.

The Chilean salary compensation for days of sick leave is extremely generous compared to many other countries [8]. 100 percent of the salary is reimbursed for most illnesses, with a maximum of 60 UF [9] per month, for sick leave episodes longer than ten days. The first three days of sick leave are not reimbursed at all for episodes shorter than ten days, but workers receive their complete salary for the rest of the duration of the sick leave.

[3] Spanish initials for ‘National Health Fund’.
[4] Spanish initials for ‘Social Security Health Institutions’.
[5] Known as ‘Subsidio por Incapacidad Laboral’ in Chile.
[6] Superintendencia de Salud (2010).
[7] Table 4 shows that health conditions did not change substantially between 2006 and 2009.
[8] Refer to Table 13 in the Appendix.
[9] 1 UF ≈ 48 USD (01-08-2011).

The huge health care expenditures that moral hazard is possibly causing for the government and private insurers make it a fundamental area for policy reform, in order to reduce any overconsumption of medical visits that causes unnecessary sick leave. This is debatable, because the increase in the number of illnesses included in the GES plan could also help reduce the expenditures of health insurers, as it will help to prevent many sicknesses.
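The sick leave reimbursement rule described in this section can be written down as a small function, which makes the incentive structure easier to see. This is a minimal sketch, not the statutory formula: the text does not say how an episode of exactly ten days is treated, how the daily rate is derived from the monthly salary, or whether the 60 UF cap applies to short episodes, so the constants `SHORT_EPISODE_MAX` and `DAYS_PER_MONTH` and the uniform use of the cap are assumptions made here for illustration, and all function names are hypothetical.

```python
MONTHLY_CAP_UF = 60.0   # maximum reimbursable monthly salary, from the text
WAITING_DAYS = 3        # unpaid days at the start of a short episode, from the text
SHORT_EPISODE_MAX = 10  # ASSUMPTION: an episode of exactly 10 days counts as short
DAYS_PER_MONTH = 30     # ASSUMPTION: daily rate = monthly salary / 30


def paid_days(episode_days: int) -> int:
    """Number of days of one sick leave episode that are reimbursed."""
    if episode_days < 0:
        raise ValueError("episode_days must be non-negative")
    if episode_days > SHORT_EPISODE_MAX:
        # Long episodes are reimbursed from the first day.
        return episode_days
    # Short episodes lose the first three days.
    return max(episode_days - WAITING_DAYS, 0)


def compensation_uf(monthly_salary_uf: float, episode_days: int) -> float:
    """Compensation in UF for one episode (100 percent of the capped salary)."""
    daily_rate = min(monthly_salary_uf, MONTHLY_CAP_UF) / DAYS_PER_MONTH
    return daily_rate * paid_days(episode_days)


if __name__ == "__main__":
    for days in (2, 5, 10, 11):
        print(days, paid_days(days), compensation_uf(30.0, days))
```

Under these assumptions, a 10-day episode pays 7 days while an 11-day episode pays all 11, so the rule itself rewards slightly longer certificates.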
Using data from the Chilean ‘Encuesta de Protección Social’, the objective of this study is to determine whether the introduction of 35 new illnesses in the GES plan caused a rise in labour absenteeism due to sick leave, which could be generated by an increase in the strategic behaviour of health care consumers. A change in sick leave due to the reform will have important policy implications for the Chilean health care system. Due to lack of appropriate data, it is not possible to determine exactly how much of the change in sick leave is due to real medical needs being fulfilled and how much to a change in moral hazard, although some deductions can be made by looking at data on Chileans’ behaviour and health status during the period analyzed. In the short run, the GES plan could be expected to have a positive impact on sick leave, as people get treatment that they actually needed. If moral hazard does not increase substantially due to the reform, sick leave levels should decrease in the long run, once individuals have received treatment that has been more affordable for a few years.

The results show a negative effect of the reform on the probability of taking a sick leave day during the year and a positive but small effect on the decision of how many days to take. This could be due to Chileans seeking medical attention that they actually needed and that has become more accessible, or to an increase in moral hazard generated by the reform. The data used suggest that levels of sickness have not changed much between 2006 and 2009 and therefore should not be the main factor explaining changes in sick leave behaviour. The marginal effect of the extension of the GES plan between 2006 and 2009 is an increase in sick leave of 0.514 days per year.

The paper is organized as follows. Section II describes the Chilean health care system. The previous research that is relevant to the investigation is summarized in Section III.
The theoretical framework for sick leave behaviour is explained in Section IV. Section V describes the characteristics of the data used, along with descriptive statistics of the variables of interest. The estimation strategy is presented in Section VI. Section VII discusses the empirical results, followed by conclusions in Section VIII.

II. The Chilean Health Insurance System and Health Care Reform

The Chilean health system was entirely public until 1980, financed primarily by social security and fiscal funds. In 1981 a health insurance reform took place which implemented risk adjustment and market instruments, making it more similar to other insurance markets. Health insurance in Chile is mandatory but not universal, as it covers only dependent workers and retirees. Chilean dependent workers and pensioners are obliged to use seven percent of their income to purchase health insurance. Currently a dual system exists in which people can choose between a public health insurance provider (FONASA) and many private providers (ISAPRES).

FONASA charges a premium of seven percent of its affiliates’ salaries, independent of their level of income, their risk and their number of beneficiaries. It offers identical benefits to all its affiliates, but copayments depend on the level of income and range from zero to twenty percent of the cost. FONASA is financed through the payments of its beneficiaries (seven percent of their salaries) and fiscal funds. Most of the health care suppliers available to beneficiaries of FONASA are public. Historically, the public system has not been able to offer services of optimum quality due to the resource constraints that it faces. The motivation for the public sector to set its premium in this way is an equity concern: to ensure that people can have health insurance regardless of their level of income and risk.

Private institutions (ISAPRES) set premiums according to risk factors (age and sex).
This premium consists of seven percent of the affiliate’s income plus any additional payments that must be made to obtain the chosen benefits plan. ISAPRES are financed solely by their affiliates’ payments. A wider and better variety of health care suppliers (both public and private) is offered to beneficiaries of private insurance. As the private system discriminates in this way, many Chileans have no other choice but to acquire public health insurance.

The setup of the Chilean health insurance system results in a paradox where most high-risk individuals are with the public insurer, which is of worse quality. There is a segmentation which causes the richest and/or healthiest people to be concentrated mainly in the private insurance companies and the poorest and/or sickest in public insurance. This is because the higher a person’s income, the higher their FONASA premium, whilst the benefits provided remain the same. On the other hand, the higher the risk of the person, the higher the premium charged by private insurers or the lower the benefits provided for a given premium paid. In FONASA the premium is always seven percent of the salary and the benefits do not vary. The final result is that FONASA will have unhealthier people based on both observable and unobservable risk: on observable risk because riskier people will be charged a higher premium by the private insurer, and on unobservable risk because FONASA does not use any risk adjustment at all and also accepts anyone who applies. The fact that everybody is implicitly covered by FONASA allows certain strategic behaviour, because any individual may always switch to public insurance. This can lead people to purchase private insurance with low catastrophic coverage, as they know they can always fall back on FONASA if a very bad health outcome takes place.
Self-selection operates against public insurance because it sets premiums as a percentage of income, whereas private insurance sets premiums according to risk factors, and because the public insurer offers free implicit insurance.

During the nineties, the Chilean health care system faced this structural segmentation, was inadequate for an aging population and under-invested in preventive care, while ISAPRES were failing to cover certain health conditions for a large segment of their beneficiaries. A reform to improve the coordination of the dual health system was implemented in order to achieve a more efficient allocation of the resources available. The Plan for Universal Access with Explicit Guarantees (AUGE) was implemented in 2005 to guarantee that all Chileans have access to health care regardless of their age, gender and ability to pay. It ensures a certain set of services to everyone, prioritizing health problems according to their epidemiological danger and the feasibility of solutions, and emphasizing preventive medical care. Both ISAPRES and FONASA are obliged to provide their beneficiaries with treatment for the health conditions stipulated in the AUGE Plan. This new legal framework included the Regime of Explicit Health Guarantees (GES), which incorporates the principles of opportunity, quality, access and financial protection. A medical response is outlined for each condition in the AUGE (or GES), as well as a maximum waiting period for receiving medical care and the maximum amount to be spent per year on health (according to the individual’s level of income). The goal of this reform is to ensure the principles of equity, redistribution and inclusion.

The plan was implemented gradually in order to mitigate large fiscal pressures. Medical conditions are continuously being added to the AUGE’s list of priority illnesses. In 2005 it covered 25 illnesses, 40 in 2006, 56 in 2007 and by 2010 it reached 69.
This lowered the cost and increased the availability of medical care for a gradually increasing number of health conditions.

The Chilean salary compensation for days of sick leave is extremely generous [10]. 100 percent of the salary is reimbursed for most illnesses, with a maximum of 60 UF [11] per month, for sick leave episodes longer than ten days. For sick leave episodes shorter than ten days, the first three days are not reimbursed, but the complete salary is paid for the rest. Given this generous reimbursement policy, the reduction in the cost of seeking medical care could have caused an increase in sick leave due to a higher number of doctor visits that can result in the issuance of a medical certificate granting days off work.

[10] Refer to Table 13 in the Appendix.
[11] 1 UF ≈ 48 USD (01-08-2011).

Although the GES Plan could be helping to reduce the expenditures of health insurers in the long term due to the enforcement of preventive medical care, it could also be increasing their burden due to the rise in medical visits and unnecessary sick leave. The latter is caused by the exacerbation of the moral hazard problem which is characteristic of health insurance markets. Those who are able to grant medical certificates for sick leave (medical doctors, midwives and dentists) are likely to do so in most cases, mainly because of the information asymmetry between them and the patient and because their main priority is to improve the health status of their patients. This incentive to over-issue sick leave is exacerbated by the fact that there is little or no (reputational) cost associated with granting sick leave.

III. Previous Literature

Many studies reveal that sick leave behaviour responds sharply to economic incentives. Several analyses find that higher salary compensations and lower costs of work absence are positively correlated with the number of days of labour absenteeism due to sick leave.
Ziebarth and Karlsson (2010) analyze a natural experiment that took place in Germany when the salary reimbursement percentage for sick leave days was decreased from 100 to 80 percent. The effect of this reform on the number of days of work missed in a year due to illness is estimated via a difference-in-differences specification using pooled data from two pre-reform and two post-reform years. A zero-inflated negative binomial model is applied, finding that the mean number of absence days per year decreased by approximately 5 percent and the ratio of employees that took no sick leave days during the year increased by 7.5 percent due to the reform.

Henrekson and Persson (2004) use time series data to measure the effect of a series of changes in the level of sick leave compensation in Sweden between 1955 and 1999. They find that more generous compensations tend to be associated with permanent increases in sick leave. Their findings were later reinforced by a panel study using data between 1983 and 1991. First, the effect of the reforms on the number of sick days per quarter is estimated via a least-squares regression with a White heteroskedasticity-consistent variance-covariance matrix. The results indicate that more generous compensations lead to an increase in days of sick leave. Subsequently, a dynamic panel is used by including a one-period lag of the dependent variable as an explanatory variable. The estimation is done using GLS, revealing that the number of sickness spells fell dramatically after the 1991 reform, which reduced the compensation for the first few days of sick leave.

Using Swedish panel data, Johansson and Palme (2002) analyze daily work absence behaviour for every day during 1990 and 1991. Both a sickness insurance reform and a tax reform took place during this period, increasing workers’ cost of being absent from work. The results reveal that sick leave behaviour is considerably influenced by the cost of being absent.
The higher the cost of labour absence, the lower the number of sick leave episodes and the shorter the duration of each episode. Moreover, it is found that when modeling labour absenteeism decisions it is fundamental to consider the effect of preference heterogeneity amongst individuals. It is assumed that each worker decides daily whether or not to take sick leave, based on his health status and on the cost of being absent. Preferences for work absence are assumed to follow a stochastic process in order to represent the dynamic structure of sick leave behaviour that arises from the gradual variation of individuals’ health status over time. The estimation is done via a fixed effects regression model to control for unobserved heterogeneity and to avoid possible spurious correlation between the cost of being absent from work and work absence. The dynamic structure of work absence behaviour is also estimated, making it possible to distinguish between the transition from working today to being absent tomorrow and the transition from being on sick leave today to returning to work tomorrow.

Although the level of compensation has not changed in Chile, the results of these studies are useful because they suggest why sick leave might have increased due to the GES plan: the plan also lowers the cost of labour absenteeism, since more and more illnesses have much fuller coverage and access than before. They also give an idea of the incentives that arise from such a compensation system for workers on sick leave, and suggest that reducing the compensation might be an effective measure to reduce the moral hazard that prevails in the health care market.

The papers mentioned provide empirical evidence that individuals’ behaviour in the health care market responds to economic incentives.
The contribution of this study is to provide empirical evidence of whether the recent reforms of the Chilean health care system have created an incentive for an increase in health care consumers’ strategic behaviour. This has not yet been done, as the GES plan started quite recently. If this is the case, it will have important policy implications for the Chilean health care system.

IV. Theoretical Framework

The model used follows the setup of Johansson and Palme (2002), based on a consumption-leisure model. It is assumed that each day individuals maximize their utility function subject to a - in most cases binding - budget constraint. The utility function depends positively on the consumption of goods and on leisure time. Leisure time is made up of contracted leisure time and of work absence due to sick leave. The utility function on day $t$ is:

$$U_t = U(x_t, L_t^T; O_t, \varepsilon_t), \qquad L_t^T = L_t^C + L_t^{SL},$$

where $x_t$ is the composite consumption good, $L_t^T$ is the total amount of leisure time, $L_t^C$ is contracted leisure time and $L_t^{SL}$ is work absence due to sick leave. The price of the consumption goods is their market price. The price of contracted leisure is its opportunity cost, which is the salary sacrificed. The price of leisure due to sick leave is the price of seeking medical care (going to the doctor, getting exams done, etc.), which could result in the issuance of a medical certificate granting days off work. $O_t$ is a vector of observable characteristics of the individual, and $\varepsilon_t$ represents the individual’s taste, which changes over time and is unobservable. This last parameter is influenced by the individual’s perceived health status.

Health care reforms change the constraints that individuals face when they are ill (and possibly when they are not ill) and will therefore affect the decisions they make regarding seeking medical care and consequently missing work. Their budget constraint is not as tight if the price of medical care decreases.
If leisure is a normal good and the price of sick leave leisure decreases, its consumption is expected to be higher in order to increase total leisure time. Individuals will therefore choose their level of medical assistance in order to influence their amount of sick leave and maximize their utility function. The constrained maximization problem is therefore:

$$\max_{x_t,\, L_t^C,\, L_t^{SL}} \; U(x_t, L_t^T; O_t, \varepsilon_t)$$

subject to

$$P x_t + M L_t^{SL} = w h, \qquad L_t^T = L_t^C + L_t^{SL}, \qquad x_t,\, L_t^C,\, L_t^{SL} \geq 0,$$

where $P$ is the price of the composite good, $M$ is the price of medical care, $w$ is the wage and $h$ is the hours worked per day. The introduction of the GES plan into the Chilean health care system has increased the access to and lowered the cost of medical attention for a large number of illnesses and health conditions, and can therefore be seen in this model as a decrease in $M$. Using the first order conditions of the maximization problem, the consumption-leisure optimality condition is:

$$\frac{U_x}{U_L} = \frac{P}{w} = \frac{P}{M},$$

where $U_x$ and $U_L$ are the marginal utilities of the composite good and of total leisure time, respectively. At the optimum, the wage will equal the price of medical care. This implies that, according to this model, if $M$ decreases due to the health care reform, the marginal utility of goods consumption relative to the marginal utility of leisure time increases. If the price of medical care falls below the market wage, sick leave leisure time will be relatively cheaper than contracted leisure. As this model assumes that consumers value both types of leisure equally, sick leave related leisure should increase with respect to contracted leisure if $M$ decreases.

The subsequent empirical estimation is based on this theoretical model. The solution of the optimization problem reveals the variation in individuals’ behaviour due to the change in their budget constraint caused by the decrease in the cost of labour absenteeism due to sick leave. The GES plan could also influence individuals’ work absence behaviour through the possible effect it has on moral hazard.
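The consumption-leisure optimality condition discussed above can be traced through the first-order conditions. What follows is a sketch under an assumption that the text does not spell out: earnings are $w(\bar{T} - L_t^C)$, where $\bar{T}$ is a hypothetical daily time endowment introduced here only for illustration, so that contracted leisure costs the wage $w$ per unit (the salary sacrificed) while sick leave leisure costs only the price of medical care $M$. The multiplier $\lambda$ is likewise notation added for this sketch.

```latex
\begin{align*}
\mathcal{L} &= U\!\left(x_t,\, L_t^C + L_t^{SL};\, O_t, \varepsilon_t\right)
  + \lambda\left[\, w\,(\bar{T} - L_t^C) - P x_t - M L_t^{SL} \,\right] \\[4pt]
\frac{\partial \mathcal{L}}{\partial x_t} &= U_x - \lambda P = 0 \\
\frac{\partial \mathcal{L}}{\partial L_t^C} &= U_L - \lambda w = 0 \\
\frac{\partial \mathcal{L}}{\partial L_t^{SL}} &= U_L - \lambda M = 0 \\[4pt]
\Rightarrow\quad \frac{U_x}{U_L} &= \frac{P}{w} = \frac{P}{M}
\end{align*}
```

Because both kinds of leisure enter utility only through their sum, the two leisure conditions can hold at the same time only if $w = M$, which is the statement that the wage equals the price of medical care at an interior optimum. If $M < w$, the condition for $L_t^C$ fails with $U_L - \lambda w < 0$, so under these assumptions contracted leisure is pushed to its lower bound and additional leisure is taken as sick leave, with $U_x / U_L = P / M$ rising as $M$ falls.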
Existing studies on moral hazard in health care consumption, such as Manning et al. (1987), reveal that it is a recurring problem in health care markets. The study reports the results of the RAND Health Insurance Experiment, which randomly assigned people to health insurance programs with different levels of cost sharing in order to find the price elasticity of the demand for medical care. The results show that the lower the out-of-pocket payments, the higher the number of medical contacts. The strongest effect on the quantity of medical visits takes place between zero cost-sharing and 25 percent out-of-pocket payment. As the assignment to each plan was random, it is assumed that the health conditions of the groups of individuals with each insurance plan were the same on average so, in the absence of moral hazard, the consumption of health care should be identical in each group.

Moral hazard behaviour in the health care market is likely to increase due to the extension of the GES plan in the Chilean health care system, leading to a larger than necessary increase in medical care consumption due to the decrease in its price and the increased access to it.

V. Data

The data was obtained from the Chilean ‘Encuesta de Protección Social’ (EPS). This survey contains information about the Chilean labour market and social security system. It was carried out in 2002, 2004, 2006 and 2009. The surveys of 2002 and 2004 do not include information about sick leave. The 2006 and 2009 surveys contain information about sick leave during the previous year. Because sick leave data is available for only two years, and the first differences of the explanatory variables show little variation, panel estimation will not be used. Instead, the data will be treated as a cross-section for estimation purposes. In 2005, the first 25 illnesses were included in the GES plan.
By the end of 2008, an additional 35 were added to the list of conditions covered. As data regarding sick leave prior to 2005 is not available, the total effect of the GES plan implementation cannot be measured. Using the information on respondents’ sick leave behaviour in 2006 and in 2009, it is possible to measure whether the increase in the illnesses covered (from 25 to 60) had an impact on labour absenteeism. The advantage of using data from after 2005 is that, as the legal framework of the GES plan was already in place, the effect measured in this study is not affected by possible temporary changes in health care consumers’ incentives caused by the legalities of the implementation of the plan.

The EPS includes questions on respondents’ awareness and use of the AUGE plan. The answers given [12] show that, although awareness of the existence of the reform was similar in the surveys of 2006 and 2009, the use of the treatments offered was much higher in 2009. This supports the idea that the GES plan is likely to have had an effect on sick leave behaviour over the period analyzed. Using the data from the EPS, the task is to reveal whether there has been an increase in sick leave in 2009 with respect to 2006, which could be due either to necessary medical treatment, to an increase in moral hazard or to a combination of both.

[12] Refer to Table 14 in the Appendix.

V.1 Sample Selection

Work absenteeism is only reported by the heads of household. The usable data sample consists of 18,506 individuals for whom all the information regarding the necessary variables is available. Only individuals surveyed in two subsequent years can be used, because lags of certain variables, which are fundamental for estimation purposes and will be explained in the next subsection, have to be created. Some information on other family members was also used to construct extra explanatory variables that will also be described in the following section.
The sample is restricted to heads of households that work and are eligible for sick leave compensations. Respondents of the survey that are children under the age of 18 (who can legally only work 20 hours a week) and people older than 85 were not included. Table 1 shows the composition of the final sample used in this study.

Table 1: Sample Composition

                                       2006      2009   Difference     TOTAL
Total                                 9,853     8,653       -1,200    18,506
N° of workers who took sick leave     1,011       784         -227     1,795
% of workers who took sick leave     10.26%     9.06%       -1.20%     9.70%
Men                                   6,120     5,343         -777    11,463
% of men in total sample                62%       62%            0       62%
Men who took sick leave                 511       367         -144       878
% of men who took sick leave          8.35%     6.87%       -1.48%     7.66%
Women                                 3,733     3,310         -423     7,043
% of women in total sample              38%       38%            0       38%
Women who took sick leave               500       417          -83       917
% of women who took sick leave       13.39%    12.60%       -0.80%    13.02%

Source: EPS 2006 and EPS 2009

In both years there are fewer women than men in the sample (38 and 62 percent respectively). The 2006 sample is larger than that of 2009 and, accordingly, records more workers who took sick leave days. The data also shows that a higher percentage of women than men took sick leave in both years, which is likely to be due in part to pregnancy related labour absenteeism. The percentage of workers that took at least one day of sick leave decreased from 10.3 to 9.1 percent between 2006 and 2009. This could be due to the fact that the GES plan began in 2005 with its first 25 illnesses [13]. People with these afflictions may have been treated immediately, needing to take fewer days off work in future years. In order to determine whether or not this is the case, data on sick leave prior to 2005 would be needed. This is not available in the EPS or in any other Chilean survey, so the effect of the other 35 illnesses that were included in the plan between 2006 and 2009 will be measured.

[13] Table 15 in the appendix shows illnesses included in the GES plan each year.
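The percentages in Table 1 follow directly from the reported counts, and recomputing them is a quick consistency check on the table. The sketch below is illustrative only: the dictionary layout and function names are choices made here, not part of the original analysis, and the counts are copied from Table 1.

```python
# Counts copied from Table 1: (workers in sample, workers who took sick leave).
COUNTS = {
    "all":   {2006: (9853, 1011), 2009: (8653, 784)},
    "men":   {2006: (6120, 511),  2009: (5343, 367)},
    "women": {2006: (3733, 500),  2009: (3310, 417)},
}


def share_pct(workers: int, took_leave: int) -> float:
    """Percentage of workers who took at least one sick leave day."""
    return 100.0 * took_leave / workers


def summarize(counts: dict) -> dict:
    """Yearly shares, their change in percentage points, and the pooled share."""
    out = {}
    for group, by_year in counts.items():
        s06 = share_pct(*by_year[2006])
        s09 = share_pct(*by_year[2009])
        pooled_workers = by_year[2006][0] + by_year[2009][0]
        pooled_leave = by_year[2006][1] + by_year[2009][1]
        out[group] = {
            "share_2006": s06,
            "share_2009": s09,
            "diff_pp": s09 - s06,
            "share_pooled": share_pct(pooled_workers, pooled_leave),
        }
    return out


table = summarize(COUNTS)
for group, row in table.items():
    print(group, {key: round(value, 2) for key, value in row.items()})
```

The differences are reported in percentage points, which is how the Difference column of Table 1 is constructed.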
Although the percentage of people that took sick leave decreased, the length of their absences increased, causing the mean number of sick leave days in 2009 to be higher. The mean number of sick leave days increased by almost 20 percent, from 2.77 days per year in 2006 to 3.3 in 2009, as can be seen in Table 3. It is interesting that the percentage of men that took sick leave decreased by 1.48 percentage points whereas the percentage of women that did so decreased by only 0.8 percentage points. The first 25 illnesses included in the GES plan may have had a different effect on men and women. Possibly, if more of these illnesses were related to men’s health, men could have received treatment and shown an improved health status in 2009 with respect to 2006. On the other hand, if more illnesses that women suffer from were included at the start of the plan, although both men and women could have been treated for some conditions and needed less sick leave in 2009, women’s moral hazard could have increased, causing their sick leave to decrease less than men’s.

Observing the illnesses included at the start of the GES plan, breast cancer and prematurity of newborn babies are related specifically to females. Diabetes, schizophrenia and the treatment of several types of cancer are equally common for both genders. Treatments for an array of heart conditions were also included. Statistics [14] show that in Chile in 2004 more men than women died from cardiovascular disease. This could indicate that the larger decrease in men’s sick leave is due to men being treated for more of the illnesses included in the first phase of the AUGE than women. Table 2 shows that women’s health status declined more than men’s between 2006 and 2009, revealing that at least part of their smaller decrease in sick leave is likely to be due to real sickness.
Table 2: Composition of Self-Reported Health Status by Gender

                               Men                               Women
Health Status     2006 (%)  2009 (%)  Difference (%)  2006 (%)  2009 (%)  Difference (%)
Good Health          74.77     74.73           -0.04      70.4     68.01           -2.39
Regular Health       21.94     21.56           -0.38     24.08     25.51            1.43
Bad Health            3.27      3.71            0.44      5.49      6.56            1.07
N                    6,120     5,343            -777     3,733     3,310            -423

Source: EPS 2006 and EPS 2009

[14] Statistics obtained from the World Health Organization.

V.2 Variable Definitions

Work Absence: the dependent variable is the number of days the respondent was absent from work in the previous year due to illness [15]. This does not take into account sick leave days granted to take care of babies younger than one year old, as this is recorded separately in the survey and is not included in this study due to the very low number of respondents who took any. Out of the total sample size of 18,506, 1,795 (9.7 percent) reported having at least one episode of sick leave in the previous year. The data regarding this variable has limitations for the purposes of this study. The number of sick leave episodes, instead of the total number of days in the year, would have been a more accurate measure of the possible moral hazard effect of the health reform. It would have provided direct insight into how often individuals visited the doctor before and after the inclusion of the second phase of the reform. This data is not available, so the total number of absent days during the year will be used for the subsequent estimation of the effect of the GES plan on sick leave.

Sickness: the self-reported health status of the head of the household is likely to influence whether or not he/she takes sick leave and the number of days taken. Poorer health is expected to increase the number of sick days, as it is probable that most sick leave absenteeism is due to bad health conditions that do not allow people to perform their jobs properly.
In the EPS survey, self-reported health status is ranked from one to six, where one is excellent health and six is very bad. Dummies were created for each status (excellent, very good, good, regular, bad, and very bad) and then two dummies, one representing ‘good health’ and the other ‘bad health’, were created, taking the ‘regular’ health condition as the base in order to avoid perfect multicollinearity. These two dummies are included in the regressions to control for individuals’ health status.

Endogeneity between health status and work absence is a potential problem for the estimation of the effects of the GES plan on sick leave. It is likely that the number of sick leave days has an effect on future health conditions (e.g. people who take fewer sick leave days when they are ill could end up with a deteriorated health status in the long run). As health status is a self-reported variable, it is a subjective measure. It is reasonable to assume that when respondents are asked about their health status in the survey, they base their answer on the history of their health conditions and not only on their illnesses from the past year, although more recent sickness episodes will probably have a much larger influence on their answer. Transitory health conditions such as the flu and tonsillitis, among others, commonly cause people to miss work each year and are likely to condition people’s perception of their general health status. The existence of a problem of endogeneity between health status and work absence during the same period (year) is a legitimate concern. It could also be assumed that failure to take care of health conditions in the present is most probably going to cause a deteriorated health status in the future and therefore not in the time frame considered in this study. If this is the case, endogeneity would not be an issue affecting the estimation results.

[15] Any number of days reported over a year was adjusted to 365 days.
Due to the possibility of simultaneity between health status and the dependent variable (the number of days of sick leave taken), possible ways to deal with it will be presented in Section VI.

Wage: the results of various studies (e.g. Johansson and Palme, 2002) show that income is negatively related to sick leave. Those who earn higher wages are likely to be more educated and work driven, leading them to take fewer days off work.

Age: is expected to have a positive effect on labour absenteeism due to sickness, because it is closely related to people’s health status. Age squared is also included to capture more accurately the effect of this variable on sick leave.

Male: several studies (e.g. Allebeck and Mastekaasa, 2004 and Moreau et al., 2004) show that females take more days of sick leave than men. This could partly be due to the fact that they are given medical license to miss work if they have a sick child younger than one, but these cases are not considered in this study. It could also be caused by the fact that men and women are affected by different illnesses that require different lengths of sick leave. This is consistent with the data used: Table 1 shows that a higher percentage of women than of men took sick leave in the sample.

Age*Male: in order to measure the form of the effect of age and gender on sick leave, an interaction term between the two is included in the regressions.

Married: it is possible that marital status has an effect on sick leave, as people who are married or live with their partner may have slightly more willingness to miss work if their partner is employed. Single adult households are likely to have no other source of income, so they could be less inclined to be absent from work if this could affect the stability of their job.
On the other hand, it is also possible that people who are married or live with their partner have a more comfortable situation at home that could decrease their labour absenteeism.

Metropolitan Region: this dummy variable indicates whether or not the individual lives in the region of Chile where Santiago is situated. This region is the biggest in the country and concentrates most of the industrial production. Transport costs of going to the doctor could be lower in this region as there is a higher availability of medical services on offer. Due to lower transport costs to obtain medical care, sick leave would tend to increase. The levels of stress in the capital city are also higher than in the rest of the country. Stress could give rise to more labour absenteeism but, on the other hand, a possibly tighter labour market might make people less willing to miss work if it is not strictly necessary.

The explanatory variables that follow were constructed using data from other family members and/or other sources of information, as they are likely to influence individuals’ decisions to take sick leave:

D2009: this dummy variable is equal to one for respondents of the 2009 survey and to zero for those from 2006. It represents the effect of the health care reform on sick leave behaviour, as long as all other aspects that affect sick leave are being controlled for. The hypothesis of this study suggests that, due to an increase in moral hazard behaviour caused by the substantial increase in the number of illnesses in the GES plan over the three years, this variable will have a positive effect on sick leave.

Number of children: variables counting the number of babies younger than one year old and children between one and seven in the household are incorporated because children may be more prone to become infected with a disease at school and could therefore infect their parents at home.
Parents may take more sick leave days as they are more stressed and tired from the additional duties of looking after their children, and in some circumstances it is necessary for them to stay at home to attend to their children’s needs (even when they are not granted sick leave because their child is older than one).

Working hours: the number of hours that the respondent works weekly is used to measure the effect that this may have on sick leave behaviour. Working more hours per week (past a certain threshold) could be expected to have a positive effect on sick leave due to tiredness and fatigue. The opposite effect on labour absenteeism is also a possible outcome, as people who work more are likely to be more work driven and show more commitment to their jobs.

Unemployment: it is necessary to control for macroeconomic indicators, as panel estimation is not being implemented and macroeconomic variables such as the unemployment rate and the rate of GDP growth have been found to have a significant effect on labour absenteeism in previous research [16]. Nordberg and Roed (2009) found that the cost of labour absenteeism is inversely related to the business cycle due to both economic and non-economic incentives. On the one hand, the higher the unemployment rate, the higher the possibility of losing a job and the lower the possibility of finding a new one, which reduces sick leave. On the other hand, periods of expansion can cause workers to be more stressed, worsening their health status. Another non-economic incentive is that healthy workers are expected to have tighter attachments to their jobs. Therefore, in periods of recession, there will be a selection effect where the healthiest are more likely to form part of the labour market.

[16] Johansson and Palme (2002), Nordberg and Roed (2009), Askildsen, Bratberg and Nilsen (2005).
Regional data must be included for each individual in order to control for the macroeconomic situation whilst avoiding perfect collinearity with the dummy representing the year (‘D2009’), which is intended to capture the effect of the reform. The macroeconomic indicators in Figure 1 in the Appendix could be an indication of a tighter labour market in 2009 than in 2006 and could therefore be a reason for a decrease in labour absenteeism due to sick leave, even given an increase in moral hazard due to the reform. On the other hand, higher GDP growth could cause workers to be more stressed, increasing their sick leave.

V.3 Descriptive Statistics [17]

Regarding the dependent variable, Table 3 shows that its mean increased from 2.77 in 2006 to 3.3 in 2009 (almost 20 percent). A t-test was performed on the means of both years in order to determine the significance of this increase. The results of the t-test are recorded in Table 4.

Table 3: Mean and Standard Deviation of Sick Leave Variable

                        2006      2009  Difference (%)  Complete Sample
Mean                    2.77       3.3           19.13             3.02
Standard Deviation     16.93      22.3           31.72            19.62
N                      9,853     8,653          -12.18           18,506

Source: EPS 2006 and EPS 2009

Table 4 indicates that the null hypothesis of equal means can be rejected in favour of the mean being higher in 2009 than in 2006 at a 5 percent level of significance. Possible variation in sick leave conduct between the two analyzed years can be due to changes in moral hazard behaviour favoured by the extension of the GES plan or to a change in the Chilean health situation. A thorough regression analysis is required to reveal whether in fact the GES plan has had the expected effect on sick leave.

[17] Descriptive statistics of the independent variables are shown in Table 15 and their correlation matrix in Table 16 in the Appendix.
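As a rough cross-check, the comparison of means can be approximated from the rounded summary statistics in Table 3 alone. The sketch below assumes an unequal-variance (Welch) form of the test and a normal approximation to the t distribution; the text does not state which variant was used, so the resulting p-values need not coincide exactly with those reported in Table 4, which are presumably based on the individual-level data.

```python
import math

# Summary statistics transcribed from Table 3 (rounded values).
mean_06, sd_06, n_06 = 2.77, 16.93, 9853
mean_09, sd_09, n_09 = 3.30, 22.30, 8653

# Assumption: Welch (unequal-variance) standard error of the difference in means.
se = math.sqrt(sd_06 ** 2 / n_06 + sd_09 ** 2 / n_09)
t_stat = (mean_09 - mean_06) / se

# Assumption: normal approximation, reasonable for samples of this size.
# One-sided p-value for HA: mean of 2009 > mean of 2006.
p_one_sided = 0.5 * math.erfc(t_stat / math.sqrt(2))
p_two_sided = 2 * p_one_sided

# Relative increase of the mean between the two surveys, in percent.
pct_increase = 100 * (mean_09 - mean_06) / mean_06

print(round(t_stat, 3), round(p_one_sided, 4), round(p_two_sided, 4))
print(round(pct_increase, 2))
```

Under these assumptions the one-sided test rejects equal means at the 5 percent level while the two-sided test does not, which is the same qualitative pattern as in Table 4.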
Table 4: T-test of the Mean of the Sick Leave Variable in 2006 and 2009

H0: μ(0) = μ(1)

HA: μ(0) < μ(1)        HA: μ(0) ≠ μ(1)          HA: μ(0) > μ(1)
P(T<t) = 0.0327        P(|T|>|t|) = 0.0655      P(T>t) = 0.9673

Where μ(0) is the mean of the 2006 data and μ(1) is the mean of the 2009 data.

Table 5 shows the composition of the self-reported health status in each year in order to observe whether Chileans’ health conditions have declined or improved over the analyzed period. The sickness variable takes on values between one and six, where one represents excellent health and six very poor health. A dummy variable was made to represent each one of the six categories. After that, the dummies representing excellent, very good and good health were used to make a single dummy representing ‘good health’ status and the dummies for bad and very bad health were used to construct a ‘bad health’ dummy. From Table 5, it can be inferred that there has been little change in Chileans’ health status between 2006 and 2009. There is a slightly lower concentration of people in the category ‘good health’ and a higher one in ‘bad health’ in 2009, but the difference is very small. Therefore, even though the data suggests an increase in sick leave due to the minor deterioration of Chileans’ health status, no inferences can be made by simply looking at the data.

Table 5: Composition of Self-Reported Health Status

                          2006 (%)               2009 (%)          Difference
Health Status   By category   Grouped   By category   Grouped       (grouped)
Excellent              7.22                     5.28
Very good             12.31                    10.09
Good                  53.59                    56.79
  Good Health                   73.12                   72.16          -0.96
Regular               22.75     22.75          23.03    23.03           0.28
Bad                    3.73                     4.29
Very Bad               0.38                     0.51
  Bad Health                     4.11                     4.8           0.69
No answer              0.02      0.02           0.01     0.01          -0.01
N                     9,853                    8,653                  -1,200

Source: EPS 2006 and EPS 2009

Table 6 shows the differences in certain variables between those individuals that did take sick leave and those that did not, revealing important characteristics of the sample and shedding light on future findings of this study. As noted previously, fewer men than women took sick leave.
The data also shows that fewer people who report having excellent, very good or good health took sick leave, whereas more people with bad and very bad health had positive labour absenteeism. Moreover, there is a slightly higher concentration of married people and of those living in the metropolitan region amongst the individuals that took sick leave.

Table 6: Comparison between individuals who do and do not take sick leave

                           SL=0      SL>0  Difference     TOTAL
N                        16,767     1,830     -14,937    18,506
                              %         %           %         %
GENDER
  Male                    63.34     48.91      -14.43     61.94
  Female                  36.66     51.09       14.43     38.06
HEALTH CONDITIONS
  Excellent                 6.5      4.51       -1.99      6.31
  Very good                11.5      9.19       -2.31     11.27
  Good                    55.45      51.7       -3.75     55.08
  Regular                 22.36     27.74        5.38     22.88
  Bad                      3.76      6.18        2.42      3.99
  Very bad                 0.41      0.67        0.26      0.44
  No answer                0.02         0       -0.02      0.02
MARRIED
  Yes                     62.94     64.62        1.68      63.1
  No                      37.06     35.38       -1.68      36.9
METROPOLITAN REGION
  Yes                     37.44     41.39        3.95     37.83
  No                      62.56     58.61       -3.95     62.17

Source: EPS 2006 and EPS 2009

Table 7 shows the differences between the samples of the two years used for this study. The ratio of males to females shows only a very small decrease in 2009. The percentage of people who reported having ‘good health’ (this includes those that reported excellent, very good and good health) decreased in the later year. With respect to the ‘bad health’ variable (including those who reported bad and very bad health), the percentage increased slightly in 2009. These health indicators could suggest a minor increase in sick leave due to real medical needs. Slightly fewer people were married or living with their partner in 2009 than in 2006, and fewer were living in the metropolitan region. If in fact married people and those living in the metropolitan region are more prone to take sick leave, as hinted in Table 6, the decrease of their presence in the 2009 sample could possibly have a negative effect on sick leave.
Table 7: Comparison between 2006 and 2009

                           2006      2009  Difference     TOTAL
N                         9,853     8,653      -1,200    18,506
                              %         %           %         %
GENDER
  Male                    62.11     61.75       -0.36     61.95
HEALTH CONDITIONS
  Good Health*            73.11     72.16       -0.95     72.67
  Bad Health**             4.11       4.8        0.69      4.43
MARRIED
  Yes                      63.4     62.76       -0.64      63.1
METROPOLITAN REGION
  Yes                     40.22      35.1       -5.12     37.83

* Excellent, very good and good health. ** Bad and very bad health.
Source: EPS 2006 and EPS 2009

A difference-in-differences estimation, regressing the change in the number of sick leave days on the change in health status, shows that there are important additional variables behind sick leave behaviour. Although the coefficient of the change in health status is positive (1.24) and significant, the R-squared is very low (0.002), indicating that labour absenteeism is not solely due to medical conditions. This encourages the investigation of the factors that explain the change in sick leave between 2006 and 2009. These include personal and economic characteristics and a change in moral hazard due to the implementation and extension of the GES plan. The first two are likely to have remained similar over the period analyzed, which is indicative of the possibly very important role played by moral hazard in any variation of individuals’ decisions regarding sick leave.

The analysis of the data used for this study reveals that the percentage of workers that took sick leave decreased by 1.2 percentage points in 2009 with respect to 2006, but the average number of sick leave days taken per year increased by almost 20 percent. The self-reported health status of the sample declined very slightly in 2009. The percentage of men who took sick leave decreased more in 2009 with respect to 2006 than the percentage of women who did. This can be explained partly by the fact that women’s health status declined slightly more than men’s. Fewer people who reported having excellent, very good and good health took sick leave with respect to those who reported having regular health.
The opposite is the case for those who reported having bad or very bad health. Amongst those who did take sick leave, there is a higher concentration of married people and of those living in the Metropolitan Region than amongst those who were not absent from work due to illness. Between the 2006 and 2009 samples, the percentage of people who are married and of those living in the Metropolitan Region shows a small decrease. This could hint at the possibility of a slight decrease in sick leave in 2009. A negative effect on the dependent variable in 2009 is also suggested by the fact that men’s sick leave decreased more than women’s between 2006 and 2009 and that the percentage of men in the sample remained the same. These effects could slightly counteract any increase in sick leave due to an intensification of moral hazard following the extension of the GES plan.

All of these findings are only preliminary. It is necessary to estimate an appropriate regression model to see the entire effect of the variable of interest (‘D2009’) on the decision of whether or not to take a sick leave day during the year and how many days to take. The sign, magnitude and level of significance of its coefficient will give a clearer picture of the role it plays in determining individuals’ behaviour regarding sick leave. It would be ideal to estimate directly how much of the change in sick leave behaviour is due to changes in health status and how much to a variation in moral hazard. With the data available, only the total change in sick leave between 2006 and 2009 can be identified. This result, along with the analysis of any variation in Chileans’ health status and other important explanatory variables during the period under study and of the illnesses included in the AUGE plan before and after 2005, can indicate whether it is plausible that a rise in moral hazard has played a role in increasing sick leave.

VI. Estimation Strategy

In this section, the estimation technique used to identify the impact of the GES plan on sick leave behaviour is described. The Poisson model, which is often used for estimation involving count data, is presented first. The adequacy of this model is discussed before presenting a possible alternative, the negative binomial (NB) model. The existence of a dependent variable with a very high count of zeros may make it necessary to use a zero-inflated model for consistent estimation; this option is therefore described next. The hurdle model is then presented as another alternative estimation method and, finally, the method to test for and solve possible endogeneity of certain explanatory variables is introduced.

Because data on sick leave is available only in the EPS surveys of 2006 and 2009, these two datasets will be used. A difference-in-differences estimation is not used because the first differences of most of the explanatory variables show too little variation. This, added to the fact that the necessary data only exists for two years, implies that consistent panel estimation is not possible, and therefore the data is used as a cross-section for estimation purposes.

VI.1 Poisson Model

The dependent variable is the total number of days of sick leave per year. Given the use of count data, an adequate model must be employed. In a Poisson regression, the dependent variable (y) has the following conditional mean (and variance):

$$E[y \mid x] = Var[y \mid x] = \mu = \exp(x'\beta)$$

and the probability of y (dependent variable) given x (explanatory variables) is:

$$P(y \mid x) = \frac{e^{-\mu}\mu^{y}}{y!}, \qquad y = 0, 1, 2, \dots$$

Given that in the Poisson distribution the conditional variance is equal to the conditional mean (μ), if there is overdispersion, causing the variance to be larger than the mean, then estimation using a Poisson regression will be inefficient.
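The equality of mean and variance that the Poisson model imposes can be checked numerically. The following is a minimal sketch that evaluates the Poisson probabilities for a hypothetical conditional mean (the value 3.0 is chosen purely for illustration and is not taken from the data) and recovers the mean and variance from them.

```python
import math


def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson variable with mean mu, computed in logs for stability."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


mu = 3.0  # hypothetical conditional mean exp(x'beta), for illustration only
support = range(0, 200)  # wide enough that the omitted tail is negligible

probs = [poisson_pmf(y, mu) for y in support]
mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(round(sum(probs), 6), round(mean, 6), round(var, 6))
```

A sample whose variance is far above its mean, as is the case for the sick leave variable, is therefore at odds with this restriction.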
The distribution of the dependent variable, the number of sick days taken in a year, is skewed to the right, causing the variance (2,130.7) to be much larger than the mean (4.79). Furthermore, given the large variation across respondents in the number of sick leave days taken in a year, there is likely to be substantial unobserved heterogeneity.

VI.2 Negative Binomial Model

A negative binomial (NB) distribution can be used as it allows for overdispersion and unobserved heterogeneity (not correlated with the explanatory variables) in the data. The model adds an error term to the conditional mean of the Poisson distribution and assumes that the conditional variance function is quadratic in the mean:

$$Var[y \mid x] = \mu + \alpha\mu^{2}, \qquad \text{where } \mu = \exp(x'\beta).$$

The conditional variance is now larger than the conditional mean whenever $\alpha > 0$. The higher the value of alpha, the more dispersed the distribution. When alpha is zero, the dependent variable has a Poisson distribution. When using an NB regression model, the significance of the parameter alpha can be tested to identify whether this model is more appropriate than a Poisson one.

VI.3 Zero-Inflated Negative Binomial Model

The dependent variable also presents a high proportion of zeros (approximately 90 percent), which could create problems for the negative binomial estimation. A modified count model is the zero-inflated negative binomial model (ZINB), which takes the existence of excess zeros into account. With count response variables, the number of zeros is often excessive. This is because there are two processes that generate zero responses. An individual may take zero sick leave days either because he is never sick or because he goes to work even when not feeling well due, for example, to the lack of a medical certificate to justify his absence, fear of deteriorating job stability or too high a work load. Both of these outcomes present an identical zero response, but the process through which they are reached is very different.
Certain individuals were not absent from work during the year for the same reasons that others were absent for a positive number of days (due to illness or strategic moral hazard behaviour), whilst others took no sick leave days for different reasons. Given the two possible processes that give rise to zero outcomes and the excessive number of these, a finite mixture model - such as the ZINB - is appropriate to estimate the effect of the GES plan on the number of days of sick leave, as it models on the one hand the binary process of absence and no absence as a logit model and, on the other, the count process of the number of days of sick leave as an NB model when the binary process takes on value one. If the binary process takes on value one, the count process takes on a discrete value (0 to 365). Zero sick days can therefore be generated in two ways: through the binary process, and also through a zero count given that the binary process takes on value one.

Let g1(.) be the density of the binary process (logit equation) and g2(.) the count density (NB equation). With probability g1(0) the binary process takes value 0 and then y=0. With probability (1-g1(0)) the binary process takes value 1 and y takes on count values following g2(.). Therefore, the density is:

$$f(y) = \begin{cases} g_1(0) + (1 - g_1(0))\, g_2(0) & \text{if } y = 0 \\ (1 - g_1(0))\, g_2(y) & \text{if } y \geq 1 \end{cases}$$

In order to estimate the dependent variable using this specification it is necessary to identify the variables that affect the binary process of being absent or not and also the variables that affect the duration of the sickness spell. In this case the same predictor variables were used for both equations.

Logit Model (Selection Equation)

$$P(d_i = 1) = \Lambda(\gamma_0 + X_i'\gamma_1 + Z_i'\gamma_2 + \gamma_3 M + \gamma_4 D2009)$$

Negative Binomial Model (Duration Equation)

$$E[y_i \mid d_i = 1] = \exp(\beta_0 + X_i'\beta_1 + Z_i'\beta_2 + \beta_3 M + \beta_4 D2009)$$

where: d_i = outcome of the binary process for individual i and Λ(.) = logistic distribution function. X_i = vector of personal characteristics of individual i (age, age2, age*gender, gender, number of babies, number of children, marital status, health status, region where they live). Z_i = vector of job characteristics of individual i (wage, hours worked per week).
M = macroeconomic characteristics (unemployment). D2009 = dummy that takes on value 1 for observations from the 2009 survey.

The dummy variable D2009, which takes on value one if the observation corresponds to an individual surveyed in 2009 (and zero if surveyed in 2006), will indicate whether the introduction of 35 new illnesses in the GES plan after 2005 actually affected labour absenteeism. Assuming that there was no systematic decline in the Chilean population’s medical condition, and controlling for as many variables as possible that affect sick leave, this dummy variable represents the effect of the reform on sick leave behaviour.

The Vuong test is used to compare the negative binomial model to the zero-inflated negative binomial model and reveals which model has a better fit. The Vuong test compares the predicted probabilities of two non-nested models. The null hypothesis is that both models have identical explanatory capacity and the alternative hypothesis is that model 1 is closer than model 2 to the actual model. A large positive statistic reveals that the first model fits the data better than the second model. In this case, model 1 is the ZINB regression and model 2 the NB.

VI.4 Hurdle Model

An alternative model often used in the presence of count data with excess zeros is the two-part or hurdle model. This model is based on the same logic as the ZINB: that the decision of whether or not to take a sick day is generated by a different process from the decision of how many days to take. If the result of the binary process is positive then ‘the hurdle is crossed’ and the conditional distribution of the positive values is generated by a count process truncated at zero. A hurdle model will be estimated in order to check the robustness of the ZINB estimation by comparing the results of the two methods. For this, a logit model will be used for the binary process and a negative binomial for the count process.
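The difference between the two ways of handling excess zeros can be made concrete by writing out both probability functions. The sketch below is only illustrative: it uses the NB2 parameterisation (mean mu, dispersion alpha), the function names are hypothetical, and the parameter values are chosen arbitrarily rather than estimated from the EPS data. In the zero-inflated case a zero can come from either process, whereas in the hurdle case all zeros come from the binary part and positive counts follow a zero-truncated NB.

```python
import math


def nb_pmf(y, mu, alpha):
    """NB2 probability of count y with mean mu and dispersion alpha > 0."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(y, pi_zero, mu, alpha):
    """Zero-inflated NB: pi_zero plays the role of g1(0), the 'always zero' probability."""
    base = (1.0 - pi_zero) * nb_pmf(y, mu, alpha)
    return pi_zero + base if y == 0 else base


def hurdle_pmf(y, p_zero, mu, alpha):
    """Hurdle NB: zeros come only from the binary part; positives are zero-truncated NB."""
    if y == 0:
        return p_zero
    return (1.0 - p_zero) * nb_pmf(y, mu, alpha) / (1.0 - nb_pmf(0, mu, alpha))


# Hypothetical parameter values, for illustration only.
pi_zero, mu, alpha = 0.8, 10.0, 2.0
support = range(0, 5000)  # long range because the NB tail is heavy

zinb_total = sum(zinb_pmf(y, pi_zero, mu, alpha) for y in support)
hurdle_total = sum(hurdle_pmf(y, pi_zero, mu, alpha) for y in support)

print(round(zinb_pmf(0, pi_zero, mu, alpha), 4), hurdle_pmf(0, pi_zero, mu, alpha))
print(round(zinb_total, 6), round(hurdle_total, 6))
```

With the same value for the binary-part parameter, the zero-inflated model assigns a higher probability to zero than the hurdle model, because the count process contributes additional zeros.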
The difference between the ZINB and hurdle models is that the latter can be represented as the sum of two separate models (one that represents the binary process and another the count process) and, therefore, its likelihood function is separable with respect to the parameters that are being estimated. The opposite occurs with the ZINB model, as it allows for a mixing process in the generation of zeros, causing the likelihood function not to be separable.

The decision between using the ZINB or the hurdle model depends mainly on the nature of the variable that is being estimated. The hurdle model requires a clear distinction between the two possible results of the binary process. Behaviour related to labour absenteeism due to sick leave seems to be more complicated than that. It is likely that the decision made by the respondents about whether or not to miss work and the decision about the duration of their absence are integrated rather than isolated resolutions. The simultaneous estimation of the logit and NB models that takes place when using a ZINB regression has the advantage of fitting the coefficients for both estimations at the same time and is therefore likely to achieve a better fit in this case than two separate estimations (the second using the predicted values of the first as inputs). This could indicate that the ZINB is more appropriate, but the hurdle model will also be estimated as an alternative method and to check the robustness of the former model. Another drawback of the hurdle model is that it does not allow for different variables to be included in the selection and duration equations, possibly producing a less accurate estimation.

If the days of sick leave were to be modeled using data only on those who took a positive number of days off work, there would almost certainly be sample selection bias.
The use of the logit and NB equations in the ZINB model is a way of accounting for the existence of two processes that can lead to zero counts of sick leave days. Although the selection equation models the probability of a zero or positive count, the simultaneous estimation of both the logit and NB makes the ZINB less capable of dealing with possible sample selection than the hurdle model. The hurdle model estimates in two steps: the distribution of the count process is conditional on the result of the binary process being one. Thus, it models the selection and then the duration equation separately, accounting for the former equation in the estimation of the latter and therefore controlling for possible sample selection.

Due to the complexity of the calculation of the total marginal effects of the hurdle model (after the logit and NB estimations), the significance of the different variables in both models will be compared in order to shed some light on the robustness of the results of the estimations using a ZINB model.

VI.5 Endogeneity

If in fact there is an endogeneity problem between sick leave in a certain year and health status in that same period, the error term of the regressions will be correlated with the sickness variables. The inclusion of variables representing individuals’ sickness levels would cause a bias in the estimated coefficients if there is simultaneity between the dependent variable and these variables. As dummy variables representing ‘good health’ and ‘bad health’ are included separately in the regression, if they are endogenous, they will cause bias in opposite directions. On the one hand, sick leave would have a negative influence on the ‘good health’ variable during the same period and, on the other, a positive influence on the ‘bad health’ one. The potential endogeneity of the ‘good health’ variable will cause a downward bias of the coefficients, and an upward bias will be generated by the possible endogeneity of ‘bad health’.
The direction of the total bias resulting from ignoring the possible endogeneity of these two variables and including them in the regression is therefore uncertain. If the health status variables were to be excluded from the estimation due to their simultaneity with the dependent variable and the absence of any appropriate instruments, the effect on the results would be the following. If the variables ‘good health’ and ‘bad health’ are correlated with the other explanatory variables, the estimated coefficients will be biased. The bias in the estimation of each coefficient will depend on the correlation between each variable and the omitted ones and between the omitted ones and the dependent variable. The correlation coefficients between the dependent variable and the two possibly problematic ones are very small. The correlation between ‘sick leave’ and ‘good health’ is -0.0731 and statistically significant even at a 1 percent level. The correlation of the dependent variable with ‘bad health’ is 0.0778 and is also significant at a 1 percent level. Table 8 shows the direction of the bias in the estimation of each coefficient when each and when both of the endogenous explanatory variables are excluded. The sign of the bias in the estimation of each coefficient when omitting a variable is calculated by multiplying the sign of the correlation between the dependent variable and the omitted one by the sign of the correlation between the omitted variable and each regressor. The total bias resulting from omitting both ‘good health’ and ‘bad health’ can be clearly identified when the bias caused by omitting each one of the possibly endogenous variables is in the same direction. In the rest of the cases it is necessary to derive the non-separable likelihood function of the ZINB model in order to obtain the sign of these biases and their magnitude.
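The sign rule described above can be sketched as follows. This is only an illustration of the sign arithmetic: the function names are hypothetical, the correlation signs used for ‘D2009’ and ‘married’ are those reported in Table 8, and the rule is a heuristic for direction that says nothing about magnitudes.

```python
def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def bias_sign(corr_dep_omitted, corr_omitted_regressor):
    """Sign of the bias on a regressor's coefficient when one variable is omitted."""
    return sign(corr_dep_omitted) * sign(corr_omitted_regressor)


def total_bias(sign_from_good, sign_from_bad):
    """Combined direction when both variables are omitted; None means uncertain."""
    return sign_from_good if sign_from_good == sign_from_bad else None


# Correlations of sick leave with the two health dummies, as reported in the text.
CORR_SICK_GOOD = -0.0731
CORR_SICK_BAD = 0.0778

# Signs of each regressor's correlation with (good health, bad health), from Table 8.
regressors = {"D2009": (-1, +1), "Married": (-1, -1)}

LABELS = {1: "(+)", -1: "(-)", None: "Uncertain"}

results = {}
for name, (corr_good, corr_bad) in regressors.items():
    from_good = bias_sign(CORR_SICK_GOOD, corr_good)
    from_bad = bias_sign(CORR_SICK_BAD, corr_bad)
    results[name] = total_bias(from_good, from_bad)
    print(name, LABELS[from_good], LABELS[from_bad], LABELS[results[name]])
```

For ‘D2009’ both omissions push in the same direction, so the total bias is signed. For ‘married’ they push in opposite directions, so the direction cannot be determined from signs alone.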
Table 8: Sign of Omitted Variable Bias

Variable | Corr. with good health | Sign of bias (omitting good health) | Corr. with bad health | Sign of bias (omitting bad health) | Total bias of omitting both variables
Sick Leave | (-) | | (+) | |
D2009 | (-) | (-)*(-)=(+) | (+) | (+)*(+)=(+) | (+)
Good Health | 1 | | (-) | (+)*(-)=(-) | -
Bad Health | (-) | (-)*(-)=(+) | 1 | | -
Male | (+) | (-)*(+)=(-) | (-) | (+)*(-)=(-) | (-)
Age | (-) | (-)*(-)=(+) | (+) | (+)*(+)=(+) | (+)
Age*male | (-) | (-)*(-)=(+) | (-) | (+)*(-)=(-) | Uncertain
Age2 | (-) | (-)*(-)=(+) | (+) | (+)*(+)=(+) | (+)
Hours Worked | (+) | (-)*(+)=(-) | (-) | (+)*(-)=(-) | (-)
Number of babies | (+) | (-)*(+)=(-) | (-) | (+)*(-)=(-) | (-)
Number of Children | (+) | (-)*(+)=(-) | (-) | (+)*(-)=(-) | (-)
Married | (-) | (-)*(-)=(+) | (-) | (+)*(-)=(-) | Uncertain
Wage | (-) | (-)*(-)=(+) | (+) | (+)*(+)=(+) | (+)
Unemployment | (-) | (-)*(-)=(+) | (-) | (+)*(-)=(-) | Uncertain
Metropolitan Region | (+) | (-)*(+)=(-) | (+) | (+)*(+)=(+) | Uncertain

In order to deal properly with the potential simultaneity problem, an adequate proxy for health status would be required. The most reliable one that can be generated using this dataset, which is correlated with current health status but not with the current number of sick leave days taken, is the lag of the health status of each individual. For individuals in the 2009 survey, the available lag is their self-reported health status in 2006 and for the individuals in the 2006 survey it is their self-reported health in 2004. Health status in the previous year would be a much more consistent proxy but data is only available for the years 2004, 2006 and 2009. Once again dummies representing each lagged health status were created due to the categorical nature of the variable, and then these dummies were grouped into ‘lagged good health’ and ‘lagged bad health’ variables. Neither of the lagged variables is significantly correlated with the dependent variable. The correlation between the variable ‘good health’ and its lag is 0.2103 and between ‘bad health’ and its lag is 0.2214.
Both of these correlations are significant at a 1 percent level but are not very high. The use of a proxy that is not very accurate could give rise to a bias generated by measurement error of the health status variables, which can in turn bias the estimated coefficients. Omitted variable bias and measurement error are two sources of endogeneity. When a variable that affects the dependent variable and one or more of the independent variables is not included in the regression, the explanatory variable(s) in question will be correlated with the error term. Measurement error in any of the independent variables also causes them to be correlated with the error term. The attempt to avoid the possible problems caused by endogeneity of the health status variables by excluding them from the regression or replacing them with the most accurate proxies available could therefore itself introduce endogeneity due to omitted variables or measurement errors. Due to the count properties, overdispersion and excess zeros of the dependent variable, controlling for this prospective simultaneity problem is not trivial. Estimation including the potentially problematic variables, excluding them and, alternatively, using the available proxies of those variables will be carried out in the next section in order to observe the nature of this potential problem and deal with it in the best possible way.

VII. Results and Discussion

This section first reports and discusses the parameters obtained using a zero-inflated negative binomial model (ZINB), followed by an analysis of the results obtained using a hurdle model.
Subsequently, several robustness checks are presented: a comparison between the observed and predicted sick leave, a comparison with the results of the estimation of a hurdle model, estimations that attempt to deal with potential endogeneity of certain explanatory variables and finally the results of the inclusion of various interactive variables in the original ZINB regression.

VII.1 Zero-Inflated Negative Binomial Model

Table 9 presents the results of the first ZINB regression, which includes a negative binomial (column 1) to model the count process and a logit as the inflation model (column 2). ‘Inflate’ is the estimation of the equation that determines whether the observed count is zero. The model includes the dummies ‘good health’ and ‘bad health’ to control for individuals’ health status and an array of other explanatory variables described previously. Alpha is the dispersion parameter of the NB model. When alpha is zero (and lnα = -infinity) it is appropriate to use a Poisson model for the estimation. In Table 9 (column 3) we can see that a Poisson model would not be suitable as alpha is significantly different from zero. This indicates that unobserved heterogeneity accounts for at least part of the overdispersion of the dependent variable. The Vuong test is used to compare the ZINB to a standard NB model. As the z-value is significant, it shows that the ZINB indeed fits the data better than a NB would. The overdispersion of the number of sick leave days taken in a year is due both to unobserved heterogeneity (significant lnα) and to the excess zero counts (significant Vuong test).

The model reveals that the dummy representing the year 2009 (‘D2009’) is significant at a 5 percent level in the selection equation and at a 1 percent level in the duration equation of the ZINB. The coefficients show that in the year 2009, the probability of individuals taking zero sick leave days increased, but so did the number of days taken off work per year.
The decrease in the probability of individuals taking at least one sick leave day during 2009 could be due to the fact that many people got treated for the first illnesses included in the GES plan at the beginning and therefore took fewer sick leave days in 2009. The coefficient of the variable representing the reform in the NB equation is 0.269, indicating that in the year 2009 the expected number of sick leave days is 1.309 (exp(0.269)) times the number of days expected in 2006. This increase in 2009 with respect to 2006 could be due to people getting treated for their possibly worse health conditions now that treatment for new illnesses is cheaper and more available, or to an exacerbation of moral hazard in the health care market. The data shows that people’s level of sickness remained similar between 2006 and 2009, which should therefore not have caused major changes in labour absenteeism due to real illness. Another possible scenario is that regulation to restrain excessive granting of medical certificates may have been less effective in 2009, but this is unlikely: such granting is very difficult to control given the information asymmetry between doctors and their patients, and the GES plan adds no additional cost to doctors for granting sick leave, so their incentives are not likely to have changed since 2006. The inclusion of depression in the GES plan could also have contributed to the increase in the duration of sick leave in 2009. Between 2005 and 2007 sick leave granted due to mental disorders (depression, anxiety and stress) increased by 82 percent and the average length of each episode was 15 days.18 Treatment of depression was included in the GES plan in 2006, therefore its effect on sick leave is included in this study (as in the 2006 survey respondents reported sick leave in the previous year). Sick leave granted for depressive disorders is usually longer than for other health conditions as the recovery period is extensive.
It is also interesting to consider that the symptoms of depression can be very subtle, increasing the information asymmetry between doctors and patients, and possibly aggravating the moral hazard problems in the health care market. In the long run, the inclusion of depression as part of the AUGE plan could cause a decrease of sick leave as this condition will be treated better, but in the meantime, labour absenteeism is likely to increase in order for patients to recover.

18 Superintendencia de Salud (2008).

The variables representing sickness of the respondents are significant at a 1 percent level in both equations. Having good health decreases the probability of taking a sick leave day in the year, whereas having bad health has the opposite effect in the selection equation. In the duration equation, the variable representing good health has a negative effect on the number of days of absenteeism during the year, and the variable representing bad health has a positive effect. Therefore, the worse the health of the individual, the more likely it is that he/she will be absent from work due to sick leave and the higher the number of total days taken off during the year. The NB equation shows that people who are in the category bad health are expected to take 1.885 (exp(0.634)) times as many days off work as those who report having regular health. Those who report having good health are expected to take 0.665 (exp(-0.408)) times as many days off work due to illness as those with regular health. The dummy variable ‘male’ negatively affects the probability of taking a sick leave day and the duration of absence during the year. This reveals that women are more likely to take a sick leave day and take more of them. The coefficient in the NB equation indicates that men are expected to take 0.459 (exp(-0.778)) times the expected number of days of absence that a woman takes each year. The variables ‘age’ and ‘age2’ were not significant in either equation.
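As a quick check on the figures quoted above, exponentiated NB coefficients are conventionally read as multiplicative factors on the expected count (incidence rate ratios), not as additive numbers of days. The short sketch below recomputes them from the coefficients of the duration equation in Table 9; the dictionary and function names are hypothetical and serve only this illustration.

```python
import math

# Coefficients of the NB (duration) equation as reported in Table 9.
nb_coefficients = {
    "D2009": 0.269,
    "Bad health": 0.634,
    "Good health": -0.408,
    "Male": -0.778,
}


def incidence_rate_ratio(beta):
    """Factor by which the expected count is multiplied for a one-unit change."""
    return math.exp(beta)


irr = {name: incidence_rate_ratio(beta) for name, beta in nb_coefficients.items()}

for name, ratio in irr.items():
    percent_change = 100.0 * (ratio - 1.0)
    print(f"{name}: ratio {ratio:.3f} ({percent_change:+.1f}% expected days)")
```

A ratio above one means more expected days than the reference category and a ratio below one means fewer. The ratio itself is always positive, whatever the sign of the coefficient.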
Table 9: ZINB Estimation

Variables | (1) Sick Leave | (2) Inflate | (3) lnalpha | (4) Marginal Effects
D2009 | 0.269*** (0.0754) | 0.119** (0.0571) | | 0.514* (0.263)
Bad health | 0.634*** (0.142) | -0.316*** (0.114) | | 6.755*** (1.723)
Good health | -0.408*** (0.0757) | 0.446*** (0.0608) | | -2.57*** (0.357)
Male | -0.778*** (0.262) | 0.543*** (0.202) | | -1.934*** (0.274)
Age | -0.0127 (0.0194) | 0.0133 (0.0153) | | -0.0466*** (0.0113)
Age2 | 0.000181 (0.000214) | 0.000176 (0.000169) | |
Age*male | 0.0151** (0.00608) | 0.000735 (0.00462) | |
Hours Worked | 0.00647* (0.00349) | -0.00134 (0.00206) | | 0.0234** (0.0114)
Number of Babies | 0.356*** (0.0732) | 0.0503 (0.0488) | | 0.96*** (0.268)
Number of Children | 0.00482 (0.0391) | 0.0651** (0.0287) | | -0.157 (0.134)
Married | -0.0561 (0.0740) | -0.322*** (0.0576) | | 0.679*** (0.257)
Wage | 6.68e-10** (3.14e-10) | 8.78e-10*** (2.36e-10) | | -2.68e-10 (1.10e-9)
Unemployment | 0.0179 (0.0130) | -0.0105 (0.0106) | | 0.0828* (0.0464)
Metropolitan Region | -0.0808 (0.0669) | -0.168*** (0.0528) | | 0.196 (0.234)
Constant | 3.107*** (0.462) | 0.810** (0.365) | 0.678*** (0.0549) | -

Observations: 18,506. AIC. Vuong z = 18.48, Pr>z = 0.0000.
Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Inflate: Pr(Number of sick leave days) = 0

The interacting term ‘age*male’ is significant and positive in the duration equation, indicating that older men take slightly more sick leave days than older women. As men have a lower life expectancy than women, this could be due to a more rapid deterioration of men’s health. Men are also likely to do physical labour which is more strenuous. This could lead to exhaustion, more injuries and in general a more rapid worsening of their health. Depending on the working conditions, certain jobs could increase the likelihood of contracting certain illnesses. For example, the work environment of miners enhances the chances of suffering from chronic bronchitis, pneumoconiosis and even lung cancer. In general, women do not have jobs which involve such high levels of physical effort.
This could lead to them having better health than men when they are older and thus taking fewer sick leave days. Data from Chilean private health insurers in 2006 show that men are charged a relatively higher premium than women once they reach their sixties.19 This is due to the fact that from this age onwards they generally impose higher costs than women on the insurer, as they have relatively worse health.

19 Superintendencia de Salud (2006).

Individuals that are married or live with their partner show a higher probability of taking at least one sick leave day each year. The effect of this variable is only significant in the selection equation. People who live with their spouse or partner could possibly have an alternative source of income in the household and therefore be more willing to miss work when they feel slightly ill. Their threshold of pain could be lower due to additional monetary support at home. The number of babies is only significant in the duration equation and the number of children in the selection equation. A higher number of babies (younger than 1) in the household causes people to take more sick leave days. Having an additional baby multiplies the expected number of sick leave days per year by 1.428 (exp(0.356)). The number of children (between 1 and 8 years old), on the other hand, has a negative effect on the probability of being absent from work. It is likely that families with children and babies are under more income pressure, which will make them more reluctant to take sick leave days that could eventually jeopardize the stability of their current job and their possibilities of finding a new one (for example, due to bad recommendations of previous employers). Working and taking care of young children can also increase the stress burden on parents, making them take longer sick leave when they are actually ill.
Individuals with higher income levels are expected to take fewer sick leave days due to educational factors and/or the opportunity cost of their time at work. The regression shows that the effect of wage is significant and positive, but very small in both equations. This indicates that those who earn a higher salary are less likely to take a sick leave day. This could be because they are more work driven. On the other hand, when they do take time off, it is for a slightly longer period, possibly because they have access to a larger array of medical treatments that are not yet included in the AUGE plan and therefore are more expensive. The longer sick leave could also be due to stress related conditions suffered by people with higher salaries who are more work driven, and to the fact that sick leave granted for these conditions is generally longer than for other illnesses. Unemployment in the region where the respondent lives is revealed to have no significant effect in either equation of the model. The number of hours worked per week has a significant effect solely in the duration equation. This effect is positive, but very small and significant only at a 10 percent level. Finally, living in the region of Santiago has a positive effect on the probability of taking a sick leave day during the year and no significant effect on the number of sick leave days taken. The increased probability of being absent from work at least once could be due to higher levels of stress in the capital city, the much easier access to doctors and also to the large amount of pollution that causes many people to suffer from respiratory diseases.

Looking at the marginal effects after the ZINB estimation (column 4) gives a broader overview of the results of the model. The marginal effect of the dummy representing the reform is positive and significant at a 10 percent level.
Compared to the year 2006, the model predicts that in the year 2009 people take on average 0.514 more days of sick leave. This result supports the hypothesis of an increase in sick leave due to the reform, although the effect is rather small. This increase could be due to the treatment of real illnesses given a rise in sickness levels in the later year or to an intensification of moral hazard in the health care market. Regarding the health status variables, having good health decreases predicted sick leave days each year by 2.57 days and having bad health increases them by 6.76. Men are predicted to take almost 2 fewer days off work than women each year. The total effect of ‘age’ reveals that older people are predicted to take slightly fewer days off work. Although this effect is statistically significant, its economic significance is very low due to the tiny size of the marginal effect (-0.047). Those who work longer hours and those who have an extra baby are predicted to take slightly more days off work, consistent with the idea that they are likely to be more stressed. People who are married or live with their partner take 0.679 more days of sick leave. The small positive marginal effect of the variable representing unemployment indicates that slightly more sick leave is taken by people living in regions where unemployment is higher. Wage, the number of children in the household and the dummy representing living in the Metropolitan Region have no significant marginal effect on the predicted value of the dependent variable. As suggested by the analysis of the data prior to the estimation, having good health has a negative influence on sick leave and the opposite occurs with bad health. Being married and living in the Metropolitan Region both cause an increased probability of taking at least one day of sick leave during the year. Also, females have a higher probability of taking a sick leave day and take more days per year.
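Because the marginal effect combines both equations of the model, a rough sketch of how it arises may help. In a ZINB the expected count is (1 - pi) * mu, where pi is the logit probability of an excess zero and mu is the NB mean. The snippet below uses the ‘D2009’ coefficients and the constants from Table 9 but evaluates them for a purely hypothetical reference individual with every other covariate set to zero, so it does not reproduce the average marginal effect of 0.514 reported above. It only illustrates that a higher probability of an excess zero and a higher NB mean can coexist with a higher expected count. The function and constant names are assumptions made for this illustration.

```python
import math


def logistic(z):
    """Logit link of the inflate equation: probability of an excess zero."""
    return 1.0 / (1.0 + math.exp(-z))


def zinb_expected_count(index_inflate, index_count):
    """E[y] = (1 - pi) * mu for given linear indices of the two equations."""
    pi_zero = logistic(index_inflate)
    mu = math.exp(index_count)
    return (1.0 - pi_zero) * mu


# Constants and 'D2009' coefficients from Table 9. All other covariates are
# set to zero, which is a hypothetical profile and not the sample average.
CONST_COUNT, CONST_INFLATE = 3.107, 0.810
D2009_COUNT, D2009_INFLATE = 0.269, 0.119

expected_2006 = zinb_expected_count(CONST_INFLATE, CONST_COUNT)
expected_2009 = zinb_expected_count(CONST_INFLATE + D2009_INFLATE,
                                    CONST_COUNT + D2009_COUNT)

print("P(excess zero) 2006 vs 2009:",
      logistic(CONST_INFLATE), logistic(CONST_INFLATE + D2009_INFLATE))
print("Expected days 2006 vs 2009:", expected_2006, expected_2009)
print("Difference for this profile:", expected_2009 - expected_2006)
```

The sign of the difference depends on which of the two opposing effects dominates. For this profile the proportional increase in the NB mean outweighs the proportional fall in the probability of not being an excess zero.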
The effect of the variable ‘D2009’ on the probability of taking a sick leave day is negative but the effect on the total number of days taken is positive. This is consistent with the data analysis that showed a decrease in the number of people that took sick leave in 2009 with respect to 2006 but an increase in the average number of days taken per year.

VII.2 Hurdle Model

The hurdle model is estimated as an alternative to the ZINB for count data with excess zeros. As mentioned earlier, it takes into account the two different decision processes regarding sick leave behaviour. A logit model is used to represent the binary process and a NB the count process. The logit equation models the probability of an individual taking a positive number of sick leave days during the year (the opposite of the inflate equation in the ZINB). In terms of explanatory variables included, the model in Table 10 is analogous to the ZINB model in Table 9. As in the ZINB model, having good health, being male, having children and earning a higher wage decrease the probability of taking at least one sick leave day during the year, whereas having bad health, being married or living with a partner and living in the Metropolitan Region increase this probability. The difference with the results of the ZINB model lies in the variable of interest, as in the selection equation of the hurdle model the variable ‘D2009’ is not significant. In the duration equation, having good health and being male lead to a lower predicted number of days of labour absenteeism during the year, whereas the variables ‘bad health’, ‘age*male’, ‘hours worked’, ‘number of babies’ and ‘wage’ increase the predicted number of sick leave days.
The variable of interest has a positive coefficient in the NB equation that is significant at a 1 percent level, indicating, as in the ZINB model, that in the year 2009, there is an increase in the predicted number of sick leave days taken by individuals, with respect to 2006. The complexity of the calculation of the total marginal effects of the hurdle model is extremely high and beyond the scope of this paper. Observing the coefficients of the hurdle model it can be deduced that the signs and significance levels of the coefficients in the NB equation are the same as those of the duration equation of the ZINB model. In the selection equation, the only difference lies in the coefficient of the variable ‘D2009’. The reasoning regarding this difference will be analyzed in the following subsection.

Table 10: Hurdle Estimation

Variables | (1) Logit | (2) NegBinomial | (3) lnalpha
D2009 | -0.0906 (0.0555) | 0.268*** (0.0752) |
Good health | -0.473*** (0.0591) | -0.407*** (0.0756) |
Bad health | 0.352*** (0.111) | 0.632*** (0.142) |
Male | -0.594*** (0.196) | -0.790*** (0.261) |
Age | -0.0131 (0.0149) | -0.0123 (0.0193) |
Age2 | -0.000169 (0.000165) | 0.000175 (0.000213) |
Age*male | 0.000439 (0.00450) | 0.0154** (0.00607) |
Hours Worked | 0.00189 (0.00198) | 0.00618* (0.00338) |
Number of Babies | -0.0225 (0.0474) | 0.351*** (0.0730) |
Number of Children | -0.0630** (0.0278) | 0.00408 (0.0390) |
Married | 0.310*** (0.0559) | -0.0557 (0.0738) |
Wage | -8.14e-10*** (2.33e-10) | 6.73e-10** (3.14e-10) |
Unemployment | 0.0122 (0.0104) | 0.0166 (0.0130) |
Metropolitan Region | 0.156*** (0.0512) | -0.0773 (0.0669) |
Constant | -1.041*** (0.354) | 3.127*** (0.460) | 0.677*** (0.0548)

Observations: 18,506. AIC: 1.446.
Dependent Variable: Number of sick leave days during the year
Logit Model: Pr(Number of sick leave days) > 0
Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

VII.3 Robustness Checks

(i) Fit of the Model

The ability of the model to capture sick leave behaviour can be seen in Figure 2 in the Appendix.
The variables represented in the figure are the observed sick leave and the sick leave predicted by the model. Each point represents the probability of a given count for the entire sample. The graph shows up to 21 days of sick leave in order for it to be clearer and for the differences between the two series to be appreciated. 97 percent of sick leave lasts less than 21 days so it represents an important part of the dependent variable. The predicted values are almost identical to the observed values for most counts, only underestimating in the case of 7 days and 15 days of sick leave. This could be explained by a tendency of those who grant sick leave to round it up to one or two weeks, which can be seen in the observed counts of the dependent variable. Overall, Figure 2 indicates that the ZINB model fits the data very accurately for up to 21 days of sick leave during the year.

(ii) Potential Endogeneity

Due to the potential simultaneity between the dependent and the explanatory variables representing the sickness of individuals mentioned earlier, it is necessary to find the best way to solve this problem in the estimations. If the endogenous variables were continuous, the problem could be detected and solved as done by Rivers and Vuong (1988). A control function approach and the procedure later used by Roebuck et al. (2004) could be followed to check for endogeneity and solve it when using a ZINB estimation method to model count data with overdispersion and excess zeros. Adhering to their course of action, first of all, the existence of simultaneity between the dependent variable and those endogenous ones is tested. Endogenous variables are estimated with their respective lags and other exogenous explanatory variables using a logistic regression model.
The residuals of these estimations would then be included in both the NB and the inflation equation of the original ZINB model, which incorporated the possibly endogenous variables and the rest of the explanatory ones. The significance of the residuals in the equations would reveal endogeneity of the variables. To solve the simultaneity problem, the predicted values obtained from the logistic estimations of the endogenous variables would replace the endogenous regressors in the ZINB estimation.

Due to the dichotomous nature of the variables ‘good health’ and ‘bad health’, the method initially proposed by Rivers and Vuong (1988) cannot be implemented. Leaving out the possibly endogenous variables could eradicate the problem but at the same time generate the other estimation drawbacks mentioned earlier. As shown in Table 8, omitting the variables ‘good health’ and ‘bad health’ causes the coefficient of the variable of interest to be positively biased. This can be seen in the first model in Table 11, where the marginal effect of ‘D2009’ is slightly higher than in the model in Table 9. When the health status variables are included, the model predicts that on average people take 0.514 more sick leave days in 2009 than in 2006. When the potentially problematic variables are excluded, the marginal effect of the variable representing the year 2009 is 0.596. Another way of controlling for possible endogeneity in this case is to replace the problematic variables with proxies. Variables representing the lagged health status are the most reliable proxies available. Two dummies were created, one representing having had good health in the previous survey and another representing having had bad health. As mentioned earlier, although the lags have no significant correlation with contemporary sick leave, neither do they have very high explanatory power for the contemporary health status variables.
As they are the best proxies available, they will be used, taking into account their potential drawbacks. The results of the two different ZINB estimations, one excluding the potentially problematic current health status dummies and the other including the lagged health statuses as proxies for them, are recorded in Table 11. Both models yield very similar outcomes. The marginal effect of the variable ‘D2009’ is of similar magnitude (0.596 in the first and 0.622 in the second) and significant at a 5 percent level in both models. The dummies ‘male’ and ‘married’ and the variable representing the number of babies in the household show coefficients of similar magnitudes and are significant at the same levels in both models. The variable representing the number of hours worked per week is significant at a 10 percent level in the second model in Table 11, but not in the first one. The marginal effects of the variables ‘age’, ‘number of children’, ‘wage’, ‘unemployment’ and ‘metropolitan region’ are not significant in either model. The lagged health status variables are not significant in either the inflate or NB equation in the second model of Table 11 and neither are their marginal effects. This is likely to be due to the low capacity of these proxies to explain the contemporary health status of individuals.
Table 11: ZINB Estimations

Model 1 (current health status variables excluded)

Variables | (1) Sick Leave | (2) Inflate | (3) lnalpha | (4) Marginal Effects
D2009 | 0.304*** (0.0783) | 0.124** (0.0572) | | 0.596** (0.27)
Male | -0.812*** (0.272) | 0.545*** (0.202) | | -2.176*** (0.288)
Age | -0.00967 (0.0202) | 0.00724 (0.0152) | | -0.00961 (0.0114)
Age2 | 0.000226 (0.000225) | 0.000173 (0.000168) | |
Age*male | 0.0153** (0.00633) | 0.00163 (0.00462) | |
Hours Worked | 0.00469 (0.00349) | -0.00135 (0.00208) | | 0.0179 (0.0112)
Number of Babies | 0.312*** (0.0747) | 0.0473 (0.0485) | | 0.826*** (0.264)
Number of Children | 0.0122 (0.0406) | 0.0596** (0.0287) | | 0.122 (0.136)
Married | -0.0251 (0.0767) | -0.305*** (0.0576) | | 0.736*** (0.263)
Wage | 6.94e-10** (3.25e-10) | 8.59e-10*** (2.36e-10) | | -1.7e-10 (1.12e-10)
Unemployment | 0.00779 (0.0134) | -0.0115 (0.0106) | | 0.0543 (0.0469)
Metropolitan Region | -0.131* (0.0690) | -0.166*** (0.0528) | | 0.0407 (0.237)
Constant | 2.861*** (0.475) | 1.331*** (0.358) | 0.763*** (0.0572) | -

Observations: 18,506. AIC: 1.457. Vuong z = 18.22, Pr>z = 0.0000.

Model 2 (lagged health status variables as proxies)

Variables | (5) Sick Leave | (6) Inflate | (7) lnalpha | (8) Marginal Effects
D2009 | 0.311*** (0.0786) | 0.123** (0.0573) | | 0.622** (0.271)
Lagged Bad health | 0.306 (0.225) | -0.170 (0.172) | | 1.77 (1.233)
Lagged Good health | -0.0521 (0.0706) | 0.0145 (0.0530) | | -0.194 (0.237)
Male | -0.793*** (0.273) | 0.548*** (0.202) | | -2.178*** (0.289)
Age | -0.0107 (0.0201) | 0.00730 (0.0152) | | -0.0139 (0.0116)
Age2 | 0.000228 (0.000224) | 0.000177 (0.000168) | |
Age*male | 0.0148** (0.00637) | 0.00148 (0.00462) | |
Hours Worked | 0.00496 (0.00348) | -0.00136 (0.00208) | | 0.0188* (0.0113)
Number of Babies | 0.320*** (0.0748) | 0.0478 (0.0486) | | 0.848*** (0.265)
Number of Children | 0.00534 (0.0407) | 0.0593** (0.0287) | | -0.142 (0.137)
Married | -0.0290 (0.0768) | -0.306*** (0.0576) | | 0.729*** (0.263)
Wage | 6.93e-10** (3.25e-10) | 8.61e-10*** (2.36e-10) | | -1.79e-10 (1.12e-10)
Unemployment | 0.00794 (0.0134) | -0.0116 (0.0106) | | 0.0551 (0.047)
Metropolitan Region | -0.130* (0.0690) | -0.166*** (0.0528) | | 0.0453 (0.237)
Constant | 2.905*** (0.480) | 1.324*** (0.360) | 0.760*** (0.0571) | -

Observations: 18,506. AIC: 1.457. Vuong z = 18.26, Pr>z = 0.0000.
Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Inflate: Pr(Number of sick leave days) = 0

Comparing the two models presented in Table 11 to the
original ZINB estimation shown in Table 9 reveals that the marginal effect of the variable ‘D2009’ is smaller and significant only at a 10 percent level in the original model (compared to 5 percent in the latter models). The sizes of the marginal effects of the variables ‘male’ and ‘married’ are slightly smaller in Table 9 and that of ‘number of babies’ is larger. The variable ‘age’ no longer has a significant marginal effect in the models in Table 11, whereas in Table 9 it was significant at a 1 percent level. The variables ‘age’ and ‘age2’ continue to be insignificant in both the logit and NB equations and the interactive variable ‘age*male’ is significant only in the NB equation in all three models. The two models in Table 11 are very similar, indicating that the lagged health status is possibly such an inaccurate proxy that it generates similar results to not accounting for health status at all. These two models differ slightly from that in Table 9, as the coefficient of the variable of interest is slightly overestimated and more highly significant in the last two models. When not including health status variables (or including a bad measure of them), the effect of ‘D2009’ is positively biased as it includes part of the effect of health status on sick leave, confirming what is shown in Table 8. The model which omits the health status variables has an overall outcome which is similar to the original model in Table 9, suggesting that the variables ‘good health’ and ‘bad health’ do not in fact contribute much to the explanation of people’s sick leave behaviour. This is likely to be due to the fact that self-reported health status is generally strongly influenced by recent sickness episodes and is therefore likely to be strongly explained by labour absenteeism in the past year.

(iii) Comparison with Hurdle Model

Table 10 shows the results of estimating using a hurdle model.
The coefficients of the hurdle model are not directly comparable to those of the ZINB model, therefore only the significance of the different variables in each model will be contrasted. The significance levels of the variables when using the hurdle model are mostly the same as those of the ZINB in both the logit and NB equations. It is important to note that the only variable whose significance differs is ‘D2009’ as, although it is significant at a 1 percent level in the negative binomial equation, it is not significant in the logit equation when estimating with a hurdle model. This casts doubt on the robustness of the expected positive effect on sick leave of the 35 illnesses added to the GES plan between 2006 and 2009. The difference in the significance level of the variable representing the reform in the logit equation could be due mainly to the specification differences between the hurdle and ZINB models. The effect of the GES reform on the probability of taking sick leave does not appear to be robust to different functional estimation forms as it is not significant in the logit equation of the hurdle model. The effect of the reform on the number of days of sick leave taken is robust to alternative estimation methods as its coefficient is significant at a 1 percent level in the NB equations of both the ZINB and hurdle models. Perhaps the effect of the inclusion of the 35 new illnesses in the reform on the probability of taking sick leave is very small and not captured by the hurdle model due to its specification differences with the ZINB model. As explained in Section VI, due to the characteristics of the decision making process regarding sick leave behaviour, the ZINB model is a more appropriate estimation method in this case. As the marginal effect of the variable ‘D2009’ is only significant at a 10 percent level in the ZINB model it is possible that the effect of the reform on sick leave is not very strong, and is overlooked by the hurdle model.
(iv) Alternative Specifications

Several regressions were estimated including different interaction terms of the variable ‘D2009’ with other variables in order to identify any specific features of the effect of the reform on the number of sick leave days taken each year. It is possible that the effect of the reform is heterogeneous across the population, and this can be examined by estimating alternative specifications. Some of the interaction variables turned out to have significant effects in at least one of the equations of the ZINB, but in most cases their marginal effects were not significant. The marginal effects of the interaction variables added to the regression were calculated separately in order to determine whether they in fact have a significant effect on the predicted number of days of sick leave taken per year. The results that can shed some light on the heterogeneity of the effect of the expansion of the GES plan are presented in Table 12.

The interaction between the variable representing the GES plan and the wage of the individuals is significant at a 1 percent level in both equations of the ZINB model. The coefficients in both equations are negative. This shows that, compared to 2006, in 2009 those with higher wages had a higher probability of taking sick leave but took fewer days of it. Given the decrease in the price of medical care for the 35 new illnesses included in the AUGE plan after 2005, more health care could be purchased with a given income level in 2009, possibly causing a rise in the probability of taking sick leave. This is consistent with the theoretical framework that this study is based on, which predicted an increase in the consumption of medical care due to the decrease in its price, which slightly relaxed individuals’ budget constraints. The marginal effect of the interaction variable is not significant.

Table 12: ZINB Alternative Specifications

Specification with D2009*wage
Variable              (1) sickleave             (2) inflate               (4) Marginal Effects
D2009                 0.479*** (0.0907)         0.373*** (0.0687)         0.493 (0.313)
Bad health            0.670*** (0.143)          -0.327*** (0.114)         7.384*** (1.836)
Good health           -0.393*** (0.0758)        0.486*** (0.0613)         -2.653*** (0.363)
Male                  -0.779*** (0.262)         0.550*** (0.202)          -1.973*** (0.278)
Age                   -0.0168 (0.0195)          0.0150 (0.0153)           -0.0487*** (0.0116)
Age2                  0.000226 (0.000216)       0.000161 (0.000169)       -
Age*male              0.0154** (0.00609)        0.00111 (0.00464)         -
Hours Worked          0.00541 (0.00344)         -0.000892 (0.00207)       0.0193* (0.0114)
Number of Babies      0.360*** (0.0733)         0.0473 (0.0488)           0.997*** (0.273)
Number of Children    -0.00807 (0.0395)         0.0581** (0.0287)         -0.181 (0.137)
Married               -0.0530 (0.0740)          -0.305*** (0.0579)        0.653** (0.261)
Wage                  6.63e-10** (3.15e-10)     8.79e-10*** (2.36e-10)    -2.88e-10 (1.12e-10)
Unemployment          0.0188 (0.0130)           -0.0107 (0.0107)          0.0873* (0.0472)
Metropolitan Region   -0.0729 (0.0670)          -0.154*** (0.0530)        0.186 (0.237)
D2009*wage            -6.37e-07*** (1.39e-07)   -8.84e-07*** (1.27e-07)   3.84e-07 (4.39e-07)
Constant              3.214*** (0.463)          0.701* (0.367)            -
(3) lnalpha           0.682*** (0.0555)
Observations 18,506; AIC 1.445; Vuong z = 18.53, Pr>z = 0.0000

Specification with D2009*unemployment
Variable              (5) sickleave             (6) inflate               (8) Marginal Effects
D2009                 0.482** (0.235)           -0.370** (0.186)          2.454*** (0.830)
Bad health            0.629*** (0.142)          -0.319*** (0.114)         6.719*** (1.716)
Good health           -0.405*** (0.0758)        0.450*** (0.0609)         -2.57*** (0.357)
Male                  -0.790*** (0.262)         0.542*** (0.202)          -1.925*** (0.273)
Age                   -0.0123 (0.0193)          0.0133 (0.0153)           -0.0463*** (0.0112)
Age2                  0.000175 (0.000213)       0.000177 (0.000169)       -
Age*male              0.0153** (0.00608)        0.000717 (0.00462)        -
Hours Worked          0.00638* (0.00349)        -0.00129 (0.00206)        0.023** (0.0114)
Number of Babies      0.358*** (0.0731)         0.0470 (0.0489)           0.974*** (0.267)
Number of Children    0.00303 (0.0391)          0.0627** (0.0287)         -0.156 (0.134)
Married               -0.0548 (0.0740)          -0.323*** (0.0576)        0.684*** (0.256)
Wage                  6.43e-10** (3.15e-10)     8.77e-10*** (2.36e-10)    -3.42e-10 (1.1e-09)
Unemployment          0.0316 (0.0195)           -0.0401*** (0.0152)       0.203*** (0.0684)
Metropolitan Region   -0.0634 (0.0691)          -0.201*** (0.0544)        0.337 (0.241)
D2009*unemployment    -0.0257 (0.0269)          0.0598*** (0.0216)        -0.237** (0.095)
Constant              2.968*** (0.484)          1.100*** (0.381)          -
(7) lnalpha           0.677*** (0.0549)
Observations 18,506; AIC 1.447; Vuong z = 18.50, Pr>z = 0.0000

Specification with D2009*Met. Region
Variable              (9) sickleave             (10) inflate              (12) Marginal Effects
D2009                 0.536*** (0.0998)         0.284*** (0.0751)         0.901** (0.355)
Bad health            0.661*** (0.141)          -0.315*** (0.114)         7.05*** (1.762)
Good health           -0.400*** (0.0754)        0.446*** (0.0608)         -2.55*** (0.357)
Male                  -0.882*** (0.261)         0.529*** (0.201)          -2.00*** (0.276)
Age                   -0.0124 (0.0193)          0.0133 (0.0153)           -0.0481*** (0.0113)
Age2                  0.000164 (0.000213)       0.000175 (0.000169)       -
Age*male              0.0170*** (0.00605)       0.000956 (0.00462)        -
Hours Worked          0.00662* (0.00345)        -0.00140 (0.00206)        0.0242** (0.0114)
Number of Babies      0.366*** (0.0724)         0.0516 (0.0489)           0.994*** (0.27)
Number of Children    0.00814 (0.0386)          0.0660** (0.0286)         -0.151 (0.133)
Married               -0.0851 (0.0739)          -0.322*** (0.0576)        0.596** (0.257)
Wage                  6.57e-10** (3.13e-10)     8.59e-10*** (2.36e-10)    -2.58e-10 (1.11e-09)
Unemployment          0.0328** (0.0133)         -0.00248 (0.0108)         0.108** (0.048)
Metropolitan Region   0.167* (0.0901)           -0.00392 (0.0712)         0.527* (0.319)
D2009*Met. Region     -0.572*** (0.139)         -0.370*** (0.108)         -0.782 (0.495)
Constant              2.897*** (0.462)          0.682* (0.367)            -
(11) lnalpha          0.662*** (0.0545)
Observations 18,506; AIC 1.447; Vuong z = 18.38, Pr>z = 0.0000

Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Inflate: Pr(Number of sick leave days) = 0. A dash (-) marks a marginal effect that is not reported.

The unemployment variable interacted with the year dummy is significant at a 1 percent level in the selection equation. Its positive coefficient indicates that the higher the unemployment in the region where individuals live, the less likely they are to take sick leave in 2009 compared to 2006. The marginal effect associated with this interaction variable is -0.237, significant at a 5 percent level. This reveals that for any given unemployment rate, people take fewer days off work in 2009 than they did in 2006. These findings could be explained by a possibly tighter labour market in 2009 (see footnote 20), which could generate more fear of job loss, making people less willing to miss work for longer than necessary.
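The way an interaction term can be significant in both ZINB equations and still have an insignificant marginal effect can be illustrated with a short Python sketch. It assumes the standard ZINB conditional mean, E[y] = (1 - p0) * exp(count index), with p0 given by a logit on the inflate index. The function names and all index values are hypothetical and are not estimates from Table 12.

```python
import math


def logistic(z: float) -> float:
    """Logit link used for the inflate (always-zero) probability."""
    return 1.0 / (1.0 + math.exp(-z))


def zinb_expected_days(count_index: float, inflate_index: float) -> float:
    """E[y] = (1 - P(always zero)) * exp(count index) for a ZINB model."""
    return (1.0 - logistic(inflate_index)) * math.exp(count_index)


def discrete_effect(count_index: float, inflate_index: float,
                    d_count: float, d_inflate: float) -> float:
    """Change in E[y] when a dummy switches on and shifts both linear indices."""
    before = zinb_expected_days(count_index, inflate_index)
    after = zinb_expected_days(count_index + d_count, inflate_index + d_inflate)
    return after - before


# Hypothetical baseline indices and shifts (both shifts negative, as for an
# interaction with negative coefficients in the NB and inflate equations).
count_only = discrete_effect(2.0, 1.5, -0.2, 0.0)    # fewer days among takers
inflate_only = discrete_effect(2.0, 1.5, 0.0, -0.3)  # more people take leave
both = discrete_effect(2.0, 1.5, -0.2, -0.3)         # the two partly offset
```

A negative shift in the inflate index raises the probability of taking any sick leave, while a negative shift in the count index lowers the expected number of days. In this illustration the combined change in predicted days is much smaller in absolute value than either separate change, which is consistent with a marginal effect that is hard to distinguish from zero.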
The interaction between the variables ‘D2009’ and ‘metropolitan region’ has a significant negative effect in both the inflate and the NB equations, revealing that those who live in the region of Santiago take fewer days of sick leave in 2009 than in 2006. This can be explained by the tightening of the labour market in the later year, which increases reluctance to miss work as doing so could jeopardize job stability. The probability that those living in the metropolitan region take a sick leave day shows an increase in 2009 with respect to 2006. This could be due to the inclusion of depression and similar conditions in the GES plan. A tighter labour market could lead to higher levels of stress in 2009 and a rise in the probability of being absent from work. The marginal effect of this interaction variable is not significant.

20 As seen in Figure 1.

VIII. Conclusions

The aim of this study is to determine whether there has been an increase in labour absenteeism due to sick leave after the extension of the GES plan, which is part of the recent Chilean health care reform. The main hypothesis is an intensification of moral hazard resulting from the more comprehensive coverage and cheaper access for an increasing number of illnesses that the GES plan includes. An increase in individuals’ opportunities to act strategically in the health care market could lead to a rise in the number of days they are absent from work each year. Data about sick leave prior to 2005 (when the plan began) is not available, making it impossible to estimate the total effect of the implementation of the GES plan. The effect of the inclusion of the 35 illnesses incorporated between 2006 and 2009 is measured instead. Data regarding the number of sick leave episodes or doctor visits during the year is not available. This would be more accurate to reveal the change in moral hazard behaviour due to the AUGE plan, but the total days of sick leave during the year is used instead as it is provided in the EPS.
The results of the ZINB model show that the extension of the health care reform has a negative effect on the probability of taking a sick leave day and a positive effect on the number of days of sick leave taken per year. The negative effect in the selection equation could arise because, by 2006, people had already been treated for the first 25 illnesses included in the GES plan. The marginal effect of the variable representing the reform is 0.514, indicating that when ‘D2009’ takes the value one, the predicted days of sick leave per year increase by this amount. This positive effect of the expansion of the GES reform on the number of days of labour absenteeism could have been caused by a series of factors. Possibly, people had certain medical needs that had been neglected before, and with the introduction of the 35 new illnesses they were able to get treated. A deterioration in Chileans’ health conditions between 2006 and 2009 could also have caused an increase in sick leave, but this is unlikely as no significant difference was found in the data. Another possible reason for the increase in sick leave is an intensification of moral hazard caused by the fuller coverage and access to the medical treatment of 35 new illnesses between 2006 and 2009, especially of depression. If regulatory measures put in place to restrain the possibly excessive granting of medical certificates had been more lenient in 2009 this would have caused sick leave to increase, but there was no important change in this regulation and the GES plan did not alter doctors’ incentives regarding the issuance of sick leave, so it is not likely that this is the main cause.

Regarding sick leave behaviour in general, it was found that those with good health take less sick leave, and those with bad health take more. The results also reveal that women, married people, those who have more babies and those who work longer hours per week take more sick leave days.
The significance of most of the explanatory variables in the selection and duration equations of the hurdle and ZINB models is very similar. The variable of interest (‘D2009’) is not significant in the selection equation of the hurdle model. This suggests that, even though the ZINB is allegedly a more appropriate estimation method in this case, the results that reveal a positive effect of the reform on sick leave are not very strong. As sick leave behaviour is not solely explained by health status, personal characteristics of individuals and macroeconomic variables, other factors must be influencing the decision to be or not to be absent from work. In order to determine whether the increase in sick leave in 2009 is entirely due to changes in moral hazard in the health care market, a more specific array of data must be at hand. This would be an interesting area for future research on this topic.

The results obtained, although not as solid as expected, highlight the importance of taking into account possible changes in the incentives of consumers of health care when designing health policies. The lack of effective control over the issuance of medical certificates to grant sick leave and the generous sick leave reimbursement of the Chilean health care system may be allowing important strategic behaviour of the population, which could be intensified by the introduction and expansion of the AUGE plan.

IX. Appendix

Figure 1: Macroeconomic Indicators
Source: INE

Figure 2: Observed and Predicted Sick Leave
[Plot of probability (vertical axis, 0 to 1) against number of days of sick leave (horizontal axis, 0 to 21); series: observed sick leave, predicted sick leave]

Table 13: Sick Leave Reimbursement Rates in Different Countries
Sweden: First sick leave day is not paid. 80 percent of income is paid for the following 364 days and 75 percent for the next 550 days.
Germany: 80 percent of gross wage is paid for the first six weeks of a sickness episode and 70 percent from the seventh week onward.
Spain: 60 percent from the fourth until the twentieth day and 75 percent from the twenty-first day onward.
Belgium: 60 percent of the wage for the first month and 55 percent after that, with a maximum of 118.36 euros daily earnings (January 2009).
France: First three days of sick leave are not reimbursed. After that, 50 percent of average wage is paid for a maximum of 360 days in a three year period. The maximum monthly earnings used in the calculation of the benefits is 2,885 euros (January 2009).
Italy: The first three days are not reimbursed. 50 percent of the average daily earnings are reimbursed for the following 20 days of incapacity and 66.6 percent after that, up to a maximum of 180 days.
Netherlands: 70 percent of the wage is reimbursed for sick leaves of up to 104 weeks.
Argentina: First three days are not paid. 60 percent of average weekly earnings are reimbursed for up to 26 weeks.
Colombia: First four days are not paid. After that, 66.6 percent of the average wage is paid for up to 180 days.
Venezuela: First three days of sick leave are not reimbursed. After that, 66.7 percent of the salary is paid for up to 52 weeks.
Uruguay: First three days are not reimbursed. After that, 70 percent of the wage is paid for up to a year.

Table 14: Respondents’ Awareness and Use of the AUGE plan (percent of respondents)
                                                               2006    2009
Do you know what the AUGE plan is?
  Yes                                                         52.33   52.92
  No                                                          47.67   47.08
Are you aware of the guarantees offered by the AUGE plan?
  Yes                                                         55.56   60.45
  No                                                          44.44   39.55
Have you or any of your family members received medical attention covered by the AUGE plan?
  Yes, the respondent has                                      8.74   17.93
  Yes, a family member of the respondent has                  15.59   17.14
  Yes, both the respondent and another family member have      0.74    1.03
  No                                                          74.93   63.9
Source: EPS 2006 and EPS 2009

Table 15: Illnesses included in AUGE (GES) plan each year

2005
  Primary or essential hypertension in persons 15 years old and over
  Operable congenital heart disease in children under 15 years
  Pain relief and palliative care for advanced cancer
  Acute myocardial infarction
  Type 1 Diabetes Mellitus
  Type 2 Diabetes Mellitus
  Breast cancer for people over 15 years old
  Spinal dysraphism
  Surgical treatment of scoliosis for under 15 year olds
  Surgical treatment of cataract
  Total hip endoprosthesis for over 65 year olds with hip osteoarthritis
  Cleft lip palate
  Cancer of under 15 year olds
  Schizophrenia
  Testicular cancer in people aged 15 and over
  Lymphomas in persons 15 years and over
  Acquired Immunodeficiency Syndrome HIV / AIDS
  Acute respiratory infection (ARI) in children under 5 years of age
  Pneumonia in over 65 year olds acquired as outpatients
  Chronic renal failure
  Refractory epilepsy in people between 1 and 15 years old
  Oral health for 6 year old children
  Prematurity
  Disorders of impulse generation and conduction in over 15 year olds with a pacemaker

2006
  Preventive cholecystectomy for gallbladder cancer in symptomatic people between 35 and 49
  Gastric cancer
  Prostate cancer in people aged 15 and over
  Refractive problems in people 65 years and over
  Strabismus in children up to 9 years old
  Diabetic retinopathy
  Rhegmatogenous retinal detachment
  Hemophilia
  Depression in over 15 year olds
  Orthotics (or aids) for people 65 years old and over
  Surgical treatment of benign prostatic hyperplasia in symptomatic people
  Ischemic stroke in persons 15 years old and over
  Chronic obstructive pulmonary disease in outpatients
  Moderate and severe asthma in children under 15 years old
  Respiratory distress syndrome in newborns

2007
  Treatment of osteoarthritis of the hip and/or knee in over 55 year olds
  Subarachnoid hemorrhage due to ruptured cerebral aneurysms
  Surgical treatment of tumors of the central nervous system in people older than 15
  Surgical treatment of herniated lumbar
  Leukemia in people 15 years old and over
  Ambulatory emergency dental treatment
  Dental care for people over 60
  Severe multiple traumas
  Emergency care of moderate and severe head trauma
  Serious ocular trauma
  Bilateral hearing loss in people 65 and older who require use of hearing aid
  Pancreatic Cystic Fibrosis
  Rheumatoid Arthritis
  Analgesia of childbirth
  Severe burns
  Harmful use of alcohol and drugs dependency in under 20 year olds

2008
  Parkinson's disease
  Epilepsy
  Bronchial asthma
  Hernias
  Rheumatoid arthritis
  Gaucher disease

Table 16: Descriptive Statistics of Independent Variables
Variable              Mean       Std. Dev.   Min   Max        Mean 2006   Mean 2009
D2009                 0.477      0.499       0     1          0           1
Good Health           0.727      0.446       0     1          0.731       0.722
Bad Health            0.0443     0.206       0     1          0.0411      0.048
Male                  0.619      0.486       0     1          0.621       0.617
Age                   44.5       12.1        18    85         43.69       45.44
Hours Worked          45.56      12.69       2     126        45.78       45.32
Number of Babies      0.202      0.558       0     11         0.2         0.21
Number of Kids        0.597      0.985       0     14         0.58        0.62
Married               0.631      0.483       0     1          0.63        0.63
Wage                  2.36E+07   1.51E+08    0     1.00E+09   4.42E+07    2.78E+05
Unemployment          8.18       2.721       0     13.5       9.17        7.05
Metropolitan Region   0.38       0.485       0     1          0.49        0.35
Source: EPS 2006 and EPS 2009

Table 17: Correlation Matrix of Independent Variables
Variables, in order: (1) D2009, (2) Good Health, (3) Bad Health, (4) Male, (5) Age, (6) Age*male, (7) Age2, (8) Hours Worked, (9) Number of babies, (10) Number of Children, (11) Married, (12) Wage, (13) Unemployment, (14) Metropolitan Region, (15) Lagged Good Health, (16) Lagged Bad Health, (17) D2009*Good Health, (18) D2009*Unemp., (19) D2009*Wage, (20) D2009*Met. Reg.
Columns (1) to (10)
                            (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)
(1) D2009                   1
(2) Good Health             -0.0107  1
(3) Bad Health              0.0166   -0.3511  1
(4) Male                    -0.0038  0.0597   -0.0595  1
(5) Age                     0.0726   -0.2549  0.1411   0.0619   1
(6) Age*male                0.0195   -0.0268  -0.0152  0.9145   0.3803   1
(7) Age2                    0.0633   -0.2501  0.1406   0.0644   0.988    0.3824   1
(8) Hours Worked            -0.0179  0.0211   -0.0271  0.132    -0.0227  0.1084   -0.0318  1
(9) Number of babies        0.013    0.0297   -0.0215  -0.0138  -0.1369  -0.0522  -0.1311  0.0184   1
(10) Number of Children     0.0203   0.0364   -0.023   0.0158   -0.189   -0.0417  -0.1831  0.0415   0.2657   1
(11) Married                -0.0066  -0.0126  -0.0133  0.2204   0.1553   0.2684   0.1307   0.0561   0.0532   0.1082
(12) Wage                   -0.1449  -0.0142  0.0153   0.0247   0.0316   0.0332   0.0347   0.0133   -0.0194  -0.0279
(13) Unemployment           -0.3885  -0.0069  -0.0009  -0.0028  -0.006   -0.0049  -0.0027  0.0126   0.0158   0.0357
(14) Metropolitan Region    -0.0527  0.0148   0.001    -0.0404  -0.005   -0.0402  -0.0023  -0.001   -0.0118  0.0078
(15) Lagged Good Health     0.02     0.2103   -0.124   0.0356   -0.1244  -0.0059  -0.1242  0.0046   0.0168   0.0342
(16) Lagged Bad Health      -0.0029  -0.1659  0.2214   -0.0348  0.1113   0.0009   0.1109   -0.0268  -0.018   -0.0151
(17) D2009*Good Health      0.7615   0.4376   -0.1537  0.0295   -0.0539  0.0109   -0.0596  -0.0022  0.0282   0.0368
(18) D2009*Unemp.           0.8895   -0.0195  0.0207   -0.0037  0.0753   0.0204   0.0669   -0.0176  0.0281   0.0445
(19) D2009*Wage             0.4686   0.0664   -0.0271  0.0512   0.0323   0.0562   0.0244   0.0415   -0.0052  -0.0086
(20) D2009*Met. Reg.        0.4728   -0.0049  0.0109   -0.0322  0.0379   -0.0179  0.0346   -0.0164  0.0048   0.0278

Columns (11) to (18)
                            (11)     (12)     (13)     (14)     (15)     (16)     (17)     (18)
(11) Married                1
(12) Wage                   -0.0101  1
(13) Unemployment           0.0168   0.0506   1
(14) Metropolitan Region    -0.019   0.0445   0.0569   1
(15) Lagged Good Health     -0.0026  -0.0169  0.0354   0.0119   1
(16) Lagged Bad Health      -0.0026  0.0173   0.0093   -0.0064  -0.1362  1
(17) D2009*Good Health      -0.0083  -0.1102  -0.3095  -0.04    0.1128   -0.0748  1
(18) D2009*Unemp.           0.0046   -0.1289  -0.0418  0.0278   0.0488   0.0072   0.6679   1