Woods 1

Senior Project
An Exploration of Hazard Functions: Applied to the Analysis of the Displacement Spells of Older Workers
Royce Woods
Fall 2011

Abstract

This study explores duration analysis and applies it to an economic question. More specifically, hazard functions are used to compare the displacement spells of older and younger workers. The models are similar to those employed in Chan and Stevens (2001) and Johnson and Mommaerts (2010). However, instead of the Health and Retirement Study (HRS) or the Survey of Income and Program Participation (SIPP), this study uses the Displaced Worker Supplement (DWS) Census data set. Also, instead of estimating several probability functions, two hazard models are implemented. As a result of these changes, the results of this study diverge from those of the previous literature. The most striking disparity is that age seems to have no effect on the hazard rate of reemployment.

Table of Contents

Introduction
The Hazard Function
Application
Literature Review
Theoretical Framework
Model Specification and Data
Data
Model Interpretation
Conclusion
Works Cited
Appendix

Introduction

Unlike other econometric methods, duration analysis enables one to examine the time elapsed until an event occurs. Several methods fall under this type of analysis. This study focuses primarily on hazard functions. Hazard functions calculate the probability that an event occurs in a given interval of time, given that it has not occurred before that interval.
The hazard function is a practical technique for examining state-change probabilities over successive intervals of time. Perhaps the only alternative would be to estimate a separate probability function for each time period since the initial state. This, however, is not efficient. For that reason, the hazard function is particularly powerful. It is also powerful due to its wide range of applications. Hazard functions can be applied in any instance where the time elapsed until an event of interest occurs is of importance. This can be helpful in finding the failure rate of products, or the death rates of mice in an experiment. Clearly the applications are vast. Here reemployment will be examined using hazard functions. More specifically, this study will focus on the probability that individuals are reemployed, given that they are displaced, over a certain spell of time. Moreover, this study intends to find the hazard rate of reemployment over consecutive intervals of time. For example, if a specific interval is isolated, the probability of reemployment in that interval is divided by the probability that an observation remained in the initial state until the interval of interest, which gives the probability of reemployment during that interval for those still displaced at its start. The hazard function, in its entirety, is a collection of these points. This study primarily explores the effect age has on the duration of displacement. In a duration analysis of this sort, hazard functions are a prime method of estimation.

The Hazard Function

The hazard function is defined as the ratio of the probability density function to the survival function of a random variable. The probability density function describes the chances that an event occurs at a given time. The survival function, on the other hand, is formally defined as one minus the cumulative distribution function. In other words, it gives the probability that an observation survives until the interval of interest.
The cumulative distribution function is essentially the accumulation of the probability density function up to the time in question. In other words, it is the chance that the event occurs prior to the interval of interest. Dividing the probability density function by the survival function yields the hazard function (Alemi 2007). For example, where “t” is the standard interval of time for the analysis and “X” (a random variable) is the time at which an observation leaves the initial state, the formula below forms the hazard function (Alemi 2007).

$$h(t) = \frac{P(X = t)}{P(X > t)}$$

One of the methods used to implement the hazard function is the Kaplan-Meier estimator. It estimates the probability that an observation has not left the initial condition by a given time. Essentially, it is built from the ratio of those surviving each interval of time to all the observations that are at risk in that interval. It is formally defined as:

$$S(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$

Here $t_i$ is the i-th time at which a transition is observed, and $d_i$ is the number of observations that transition from the initial condition at $t_i$. Also, $n_i$ represents the number of observations that are still at risk just before time $t_i$. Each factor in the product is the probability that an observation does not leave the initial state during the interval in question, given that the observation is in the initial state at the start of the interval (Frankin 2006). To move from the survival estimate to the hazard, the equation below is applied (Stats Direct 2011). Strictly speaking, it gives the cumulative hazard up to time t, and the hazard over an interval follows from its increments.

$$H(t) = -\ln(S(t))$$

The primary drawback of this method is that it is not reasonable for use with continuous predictors. This is due to the fact that it produces a separate estimate for each level of the continuous predictor in question. For the same reason, it is limited in the number of regressors that can be added. In other words, a different survival curve is estimated for each level of each covariate.
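The product formula for the Kaplan-Meier estimator can be made concrete with a short sketch. The Python below is illustrative only: the function name `kaplan_meier` and the eight displacement spells are hypothetical and are not taken from the DWS data. It treats a spell as censored when no reemployment is observed, and it converts the survival estimates to cumulative hazards with the negative log.

```python
from math import log

def kaplan_meier(durations, observed):
    """Return [(t_i, S(t_i))] at each distinct time where an event is observed.

    durations: spell lengths (here, weeks displaced)
    observed:  True if the spell ended in the event (reemployment),
               False if the spell was censored
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # n_i: observations still at risk just before t
        n_at_risk = sum(1 for d in durations if d >= t)
        # d_i: observations that leave the initial state at t
        d_events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - d_events / n_at_risk
        curve.append((t, survival))
    return curve

# Hypothetical spells: weeks displaced, and whether reemployment was observed.
weeks = [2, 3, 3, 5, 8, 8, 12, 15]
reemployed = [True, True, False, True, True, True, False, True]

curve = kaplan_meier(weeks, reemployed)

# Cumulative hazard H(t) = -ln S(t); undefined once S(t) reaches zero.
cumulative_hazard = [(t, -log(s)) for t, s in curve if s > 0]

for t, s in curve:
    print(f"week {t:>2}: S(t) = {s:.3f}")
```

Because every covariate level would need its own curve of this kind, the sketch also shows why the approach scales poorly to continuous predictors.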
The Kaplan-Meier estimator is also non-parametric: because no distribution is fitted, it is typically less precise than its parametric and semiparametric counterparts when their assumptions hold. For those reasons, other methods must be explored (Introduction to Survival Analysis). There are other methods which are variants of the accelerated life model (“Accelerated Life Models”). The primary difference between these methods is the distribution assumed for the time to the event of interest and for the error terms. The applicable distributions for the time elapsed until the event of interest are the Weibull, log-logistic, exponential, and log-normal. Each variant of the accelerated failure time model carries different assumptions. For instance, the log-logistic variation has assumptions that are distinct from those of the other versions of this model. It presumes that survival time follows a log-logistic distribution, so that the error term is logistically distributed, and implies that the hazard can vary non-monotonically over time. In other words, it may rise and then fall rather than move in a single direction. The hazard rate for the log-logistic hazard model is represented as (“Parametric Models”):

$$h_i(t, X) = \frac{\lambda^{1/\gamma}\, t^{(1/\gamma) - 1}}{\gamma \left[1 + (\lambda t)^{1/\gamma}\right]}, \qquad \text{where } \lambda_i = e^{-(X_i \beta)}$$

As illustrated above, the log-logistic hazard model has the two parameters “λ” and “γ” which represent its location and shape, respectively. More specifically, “λ” is the rate at which observations leave the initial state during interval “t,” given that they are in the initial state at the beginning of the interval. Here “γ” determines whether the hazard is rising or falling. To be precise, if γ < 1, then the hazard rate rises over time and then falls. On the other hand, if γ ≥ 1, the hazard is declining. The log-logistic hazard can never be monotonically increasing (“Parametric Models”). A similar interpretation applies to “σ,” which plays the role of the “γ” from the log-logistic version, in the log-normal version of this model.
However, the log-normal specification differs because it assumes a log-normal distribution for the survival time and a normal distribution for the error terms. It includes a “Φ” which represents the standard normal cumulative distribution function, and μ = Xβ (“Parametric Models”).

$$h(t) = \frac{\dfrac{1}{t \sigma \sqrt{2\pi}} \exp\left[\dfrac{-\{\ln(t) - \mu\}^2}{2\sigma^2}\right]}{1 - \Phi\left\{\dfrac{\ln(t) - \mu}{\sigma}\right\}}$$

Also, there is the exponential case to consider. This is the simplest case, in which the conditional probability of an event does not change over time. It can be expressed as (“Parametric Models”):

$$h(X) = \lambda_i = e^{X_i \beta}$$

The exponential model is a special case of the Weibull variant of this method: when the Weibull hazard rate is constant, it reduces to the exponential variant. Generally speaking, the Weibull model assumes that the hazard function varies monotonically over time. In simpler terms, it moves in only one direction. Whether the function is monotonically increasing or decreasing depends on whether “p” is greater than or less than one, respectively. If “p” is equal to one then the hazard is constant. It is also clearly different in terms of representation, as shown below (“Parametric Models”).

$$h_i(t, X) = \lambda p (\lambda t)^{p - 1}, \qquad \text{where } \lambda = e^{X_i \beta}$$

In spite of the clear differences among the variants of the accelerated life model, they are predicated on the same idea. That is, they can all be interpreted in a similar fashion to a standard semilog model. This notion is illustrated in the function below (“Parametric Models”).

$$\ln(T) = X\beta + \varepsilon$$

The accelerated time model is complicated by the acceleration factor.
From the semilog equation for ln(T) (“Parametric Models”):

$$T = e^{X\beta} e^{\varepsilon}$$

Therefore, if a variable $X_k$ is altered by $\delta$, then the survival ratio becomes (“Parametric Models”):

$$\frac{T(X_k + \delta)}{T(X_k)} = e^{[(X_k + \delta) - X_k]\beta_k} = e^{\beta_k \delta}$$

Here $e^{\beta_k \delta}$ is the acceleration factor or time ratio, and it pertains to the effect of a variable on the duration that a subject remains in the initial state (“Accelerated Life Models”). In other words, it conveys the effect a variable has on survival time (“Parametric Models”). Moreover, when the non-exponentiated coefficient is negative, there is a negative effect on survival time, and the opposite occurs when it is positive. From there, the coefficients can be exponentiated to produce the time ratio. By subtracting one from the time ratio, the proportional change in expected survival time can be obtained, which can be read as a percentage once multiplied by one hundred (“Parametric Models”). The main drawback to this method is that unless a Weibull or exponential variant is used, the hazard ratio cannot be found, which makes it impossible to determine the effect a regressor has on the hazard rate. The proportional hazard model’s strength is its focus on the effect of the regressors on the hazard rate, which can be obscured in the accelerated life models. Instead of measuring the effect of a regressor by multiplying the predicted event time by an acceleration factor, proportional hazard models measure the effect of a regressor by producing the hazard ratio, which is much more straightforward. The coefficients produced by accelerated time models can be further manipulated to yield the hazard ratio by multiplying the negative scale parameter by the parameter estimates, then exponentiating the result (Crumer 2011). However, this transformation is only applicable to the Weibull and exponential variants of this model. That is because, despite being accelerated time models, these two variants satisfy the proportionality assumption.
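The arithmetic behind the time ratio and the Weibull hazard ratio can be sketched as follows. This is a minimal illustration under stated assumptions, not output from this study: the coefficient and shape values are hypothetical, the function names are invented for the example, and the conversion assumes that `shape_p` is the Weibull shape parameter p from the hazard λp(λt)^(p−1). Software that instead reports a scale equal to 1/p would divide by that scale rather than multiply by p.

```python
from math import exp

def time_ratio(beta_aft, delta=1.0):
    """Acceleration factor e^(beta * delta) for a change of delta in a regressor."""
    return exp(beta_aft * delta)

def percent_change_in_survival_time(beta_aft, delta=1.0):
    """Percentage change in expected survival time: (time ratio - 1) * 100."""
    return 100.0 * (time_ratio(beta_aft, delta) - 1.0)

def weibull_hazard_ratio(beta_aft, shape_p, delta=1.0):
    """Hazard ratio implied by a Weibull accelerated-time coefficient.

    Assumes shape_p is the Weibull shape parameter p; the exponential
    variant corresponds to shape_p = 1.
    """
    return exp(-shape_p * beta_aft * delta)

# Hypothetical values chosen only for illustration.
beta_age = -0.05   # accelerated-time coefficient on a regressor
shape_p = 1.5      # Weibull shape parameter

print(f"time ratio:        {time_ratio(beta_age):.4f}")
print(f"% change in time:  {percent_change_in_survival_time(beta_age):.2f}")
print(f"hazard ratio:      {weibull_hazard_ratio(beta_age, shape_p):.4f}")
```

A negative accelerated-time coefficient shortens the expected duration, so the implied hazard ratio is above one; the two metrics describe the same effect from opposite directions.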
The proportionality assumption means that the hazard function differs from the baseline hazard by a certain proportion. For that reason, hazard ratios cannot be derived from the vast majority of accelerated failure time models (“Parametric Models”). Also, the Cox proportional hazard model does not rely on probability density functions from parametric distributions. Accelerated failure time models, on the other hand, rely on a number of assumptions regarding the distribution of the durations and error terms, which can compromise the results if those assumptions do not hold. Proportional hazard functions use the probability that an observation leaves the initial state given that it is at risk, which frees them of the burdensome assumptions present in accelerated time models (Monogan 2010). For those reasons, in order to implement the hazard function in this study, it is prudent to use the Cox proportional hazard model. It is proportional due to the fact that the hazard function is proportional to the baseline hazard. The baseline hazard refers to the function without the inclusion of the extra explanatory variables. In other words, it is the hazard at any point in time when the regressors are all zero. As a result, for the assumption of proportionality to be met, the effects of the explanatory variables must be constant through time. If this is not the case, the variables must be stratified in the model to accommodate that (Mason 2005). The basic form of the Cox proportional hazard model is (Mason 2005):

$$h(t) = h_0(t)\, e^{X_i \beta}$$

It can also be expressed as (Fox 2002):

$$\log h(t) = \alpha(t) + X_i \beta$$

The baseline hazard is the bolded portion of the equation below (Mason 2005):

$$h(t) = \boldsymbol{h_0(t)}\, e^{X_i \beta}$$

The proportionality condition asserts that the basic form of the Cox model is valid. In other words, the expression below must be true, so that the ratio of the hazard to the baseline hazard does not depend on time (Mason 2005).

$$\frac{h(t)}{h_0(t)} = e^{X_i \beta}$$

Stratification allows for the form of the hazard function to vary with different levels of the stratified variables.
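The proportionality condition can be checked numerically with a small sketch. Everything in it is hypothetical: the baseline hazard, the coefficients, and the two worker profiles are invented for illustration and are not estimates from this study. The point it shows is that, under the basic Cox form, the ratio of two individuals' hazards is the same at every time because the baseline hazard cancels.

```python
from math import exp

def cox_hazard(t, x, beta, baseline_hazard):
    """Hazard h(t) = h0(t) * exp(x . beta) under the basic Cox form."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * exp(linear_predictor)

def hypothetical_baseline(t):
    """An arbitrary declining baseline hazard; any positive function would do."""
    return 0.10 / (1.0 + 0.05 * t)

# Hypothetical coefficients for (age, married) and two worker profiles.
beta = [-0.02, 0.30]
older = [60, 1]
younger = [30, 1]

weeks_checked = (1, 10, 26, 52)
ratios = [
    cox_hazard(t, older, beta, hypothetical_baseline)
    / cox_hazard(t, younger, beta, hypothetical_baseline)
    for t in weeks_checked
]

# The ratio equals exp(beta_age * (60 - 30)) at every week checked.
for t, r in zip(weeks_checked, ratios):
    print(f"week {t:>2}: hazard ratio (older / younger) = {r:.4f}")
```

If a regressor's effect changed over time, these ratios would differ across weeks, which is the situation stratification is meant to handle.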
As an example of stratification, suppose that there is a variable that does not meet the proportionality condition, but has significance in the model. It is necessary to adjust the model for that variable without explicitly estimating its effect on the outcome. In other words, no coefficient is fitted for the stratified variable. Assume the variable is “Z” and its levels are indexed by the subscript “j.” The model would look like this (“Cox Proportional Hazards Model” 2004):

$$h(t \mid X, Z = j) = h_j(t)\, e^{X_i \beta}$$

This is essentially a form of nonlinear regression which is used to explain the nonlinear relationship between variables. This type of estimation is useful for distinguishing the contributions of the various regressors in the context of a hazard function. For that reason, it shall be implemented here.

Application

To examine the issue of the reemployment of older displaced workers, the implementation of the hazard function is appropriate. The hazard function allows one to estimate the probability of a worker gaining reemployment in a certain interval of time, given that he is displaced. This is a powerful tool which will aid in examining the consequences of displacement for older workers. Due to the nature of the procedure, it can help in determining the reasons behind the disparity between the unemployment spells of older and younger displaced workers. In the process of exploring this idea, it is necessary to cover the known facts. In the interest of clarity, it is prudent to define displacement. A worker is displaced if he is unemployed due to the firm closing, downsizing, or moving away. It is important to isolate displaced workers, since non-displaced workers may not be seeking reemployment as vigorously. For instance, a worker is more likely to leave the labor force if he is unemployed and non-displaced. This concern only increases with age due to the increasing possibility of retirement. Also, it is evident that as workers age, their probability of working decreases.
This is especially clear in older workers’ likelihood of being reemployed. Workers aged 50 to 61 who lost their jobs between 2008 and 2009 were 33 percent less likely to be reemployed than their younger counterparts, according to the Urban Institute (Johnson and Park 2011). Furthermore, those older workers that managed to obtain reemployment sustained deep cuts in compensation (Johnson and Park 2011). In fact, the new median wage for older reemployed workers fell 36 percent below the old wage (Johnson and Park 2011). It could be the case that older workers have a more difficult time in the labor market than younger workers. There are various reasons why older labor force participants may spend longer periods of time displaced. It is possible that this fast-paced world dominated by technology has alienated many older workers. In addition, their reservation wages tend to be higher than those of their younger counterparts, due to the perceived increased value of their skills. For these reasons, it is possible that employers may not wish to hire workers of an advanced age. The idea that the probability of reemployment plummets as age increases is prevalent in the media. Schoen, a contributor to MSNBC, cited how the ‘mass layoffs’ often reported are largely composed of older workers (Schoen 2011). These displaced older workers are relatively vocal about their strife and diminished opportunities. On the other hand, some older workers may choose to spend a period of time displaced or out of the work force due to their lower perceived job search cost and higher propensity to retire. It may be the case that reemployment tends to be more difficult for older workers. If this is true, it could be considered problematic. Also, there are many causes that can be considered.
For that reason, it would be interesting to test whether older workers have a more difficult time gaining reemployment by examining their displacement spells using the hazard function.

Literature Review

Because the hazard function is integral to this study, further exploration of its uses is worthwhile. In addition to matters of employment and various other economic ideas (like reemployment), it can be applied to a myriad of areas. Hazard functions are used in quality engineering, biology, and practically every other area where failure rates need to be examined. For instance, hazard functions can be implemented in analyzing the completion rates of older, non-traditional students in higher education. In recent times, older students have comprised a larger portion of total undergraduates in the United States, according to the United States Department of Education’s 2003 study. Various studies have shown that older students have a lower probability of completing their degree or certification for various reasons. Generally, these reasons revolve around the idea that the opportunity cost of further education increases as one ages. However, Calcagno et al (2007) consider the idea that older students' increased dropout rate is a result of their skills rusting after spending significant amounts of time away from the educational system. To test the idea that older students’ hazard rate of degree completion in a given term is lower than that of traditional students because of diminished skills, Calcagno et al (2007) employ a Discrete-Time Hazard model, controlling for cognitive mathematics ability. Once the longitudinal dependent variable, which indicates completion, is satisfied, the student in question is deleted from the sample in future time periods. Students who stop out are also eliminated from the sample.
Generally, their hazard model calculates the probability that a student completes his degree or certification in a given term, if he has not done so in the past. Calcagno et al (2007) found that the erosion of the skills of older students is the primary cause of their lower probability of completing their degree or certificate in a given term. Previous studies have already indicated that older students have a lower probability of completing their degrees or certifications. However, when cognitive mathematics ability is controlled for, older students have a higher hazard rate of completion. This supports the notion that more than age itself is responsible for the lower completion rates of older students. This is just one other area where hazard functions can aid in analysis. Hazard functions can even handle subjects like war. Collier et al (2004) examine the duration of civil wars. They note that civil wars tend to last longer than any other type of warfare. In fact, they tend to last over six times longer than international conflicts. Clearly, the length of civil wars is a problem. In order to understand how this problem can be remedied, it is necessary to understand the cause. In other words, Collier et al (2004) seek to examine the determinants of the duration of civil wars. In order to do this they implement a hazard function. Their hazard function determines the probability that a war ends in a given month. Collier et al (2004) include numerous variables to control for various demographics, amongst other factors. With the information yielded from this study, they were able to derive sound conclusions that could likely lead to solutions. Moreover, through their study, Collier et al (2004) are able to determine the primary determinants of prolonged civil war. These determinants are high income inequality, low per capita income, and ethnic division.
On the other hand, factors that shorten the length of civil wars are a decline in the prices of exports, and external military intervention. In other words, the key to quelling such a conflict is to intervene. Without such intervention, civil wars can persist for, in some cases, generations. This is yet another interesting finding resulting from the implementation of hazard functions. Price (2009) uses hazard functions to analyze the relationship between obesity and crime. Obese individuals are viewed as disadvantaged and, due to the constraints resulting from obesity, are hypothesized to be more likely to commit crimes at a younger age. To exacerbate this, obese people tend to have lower wages and less overall occupational attainment. Generally speaking, individuals that conform to those characteristics are more likely to commit crimes. To test this idea, Price (2009) implements a hazard function. He is essentially attempting to calculate the probability that an obese individual transitions from legitimate labor market activities to a first conviction within a given year. This hazard is the product of the probability that an obese individual is presented the opportunity to commit a crime and the chances that the reservation wages gained from said crime are high enough. To analyze this econometrically, Price (2009) uses a Cox proportional hazard model. His results are consistent with his hypothesis; obese individuals have a higher probability of committing a crime in a given year. The various measures he uses to capture the effect of obesity have a positive relationship with the probability of conviction in his hazard function. In order to develop a basis for this study, it is necessary to consult relevant articles on the subject. For that reason, several articles were gathered to illustrate both the application of relevant economic theory to the subject matter, and the various uses of hazard functions.
As a result, all of the articles gathered use hazard functions, and some of those applications do not strictly pertain to the matter of reemployment. However, the most relevant articles deal primarily with the economic theory and statistical methods associated with the hazard function. For instance, the appropriate theory and relevant statistical methods are more than adequately addressed in an article by Chan and Stevens (2001). They explore the employment consequences of being displaced at an advanced age. Given that job loss rates amongst seniors rose considerably between 1981 and 1993, their prospects for reemployment are of the utmost importance (Farber 1997). Prior to this paper many studies had excluded older workers from post-displacement reemployment studies. This is likely because of the looming shadow of retirement, which could complicate matters due to the possibility of retirement solely as a result of job loss. The issue of retirement is dealt with by employing hazard models on displaced and non-displaced unemployed workers to compare the groups and highlight the effects of displacement on reemployment. If a worker is displaced, the chances are greater that leaving the job was not his choice, meaning he is more likely to be seeking reemployment. On the other hand, the non-displaced unemployed may not necessarily be seeking reemployment. Chan and Stevens (2001) define their hazard functions as probits that give the probability of an individual returning to work in a given month, if they are displaced or non-displaced, to contrast the two groups. This model allows them to examine the effects of displacement on reemployment over time, while controlling for various worker characteristics. They use a sample with 9,668 observations from the Health and Retirement Study (HRS), with 1,668 of those observations being displaced workers. Also, the workers' ages range from 50 to 66+.
Chan and Stevens (2001) found a vast disparity between the probability of a displaced and a non-displaced older worker gaining reemployment. For instance, two years after losing their jobs at the age of 55, just 60% of displaced men were reemployed, compared with more than 80% of non-displaced unemployed men. This gap in reemployment remains through four years of unemployment. Therefore, generally speaking, displacement is a detriment to older workers gaining reemployment, according to their study. Furthermore, these results are theoretically rationalized in Chan and Stevens (2001). In essence they infer that older displaced workers can be forced out of the labor force as a result of their displacement. In other words, they are much more likely to retire. This is due to the fact that older workers face diminished utility from continuing to search for employment or work. Generally their compensation is substantially lower than it was at their previous positions. As a result, older workers' incentive to work falls accordingly. Such outcomes make retirement more appealing to older workers. The unemployment spells of displaced older and younger workers are juxtaposed directly in Johnson and Mommaerts (2010), who examine the determinants of reemployment for displaced workers. Citing the Bureau of Labor Statistics’ 2010 study, they note that displaced older workers tend to remain unemployed longer than their younger counterparts. Given that, in recent times, a greater portion of the labor force is composed of older workers, Johnson and Mommaerts (2010) seek to further explore the duration of unemployment spells for displaced older workers. To explore the manner in which reemployment varies by age, Johnson and Mommaerts (2010) employ a hazard model using the Survey of Income and Program Participation (SIPP) Census panel data set.
With this data, they observed respondents from the time they are displaced to the time they find employment, leave the labor force, leave the survey, or the survey ends. Observations that leave the labor force, leave the survey, or did not find reemployment by the end of the survey are censored. From there, they implement the hazard function by estimating a logit model for the log odds of becoming reemployed, controlling for various demographic and financial factors. Using these methods, Johnson and Mommaerts (2010) are able to conclude that older displaced workers have a lower probability of finding reemployment in a given month than their younger counterparts. According to their findings, men aged 50 to 61 are 39 percent less likely than those aged 25 to 34 to gain reemployment within six months of job loss. Furthermore, similar results were found for the female estimation. However, the authors note that the reality could be much worse for older workers since their sample excludes displaced workers once they cease searching for employment. The findings of Johnson and Mommaerts (2010) are also consistent with those of Chan and Stevens (2001). Johnson and Mommaerts (2010) found that older displaced workers have a more difficult time gaining reemployment. In their study, the unemployment durations of older and younger workers are compared. The difference could be rationalized as the result of older workers having higher reservation wages. This clearly demonstrates the disparity between older and younger workers in the difficulty of finding reemployment, which is paramount to this study. This study builds on these works. Here the duration of displacement will be examined with the hazard function. However, unlike previous studies, this analysis uses a different data set, the Displaced Worker Supplement (DWS), and employs the Cox proportional hazard model as seen in Price (2009).
For the sake of comparison, an accelerated failure time model is also estimated.

Theoretical Framework

For further insight regarding the effect age has on reemployment and its various implications, it is necessary to examine the relevant theory. With that in mind, the job search and retirement models are used to help explain the reasons behind the econometric findings. Generally speaking, the retirement and job search models aid in describing occurrences in the labor market. The job search model deals primarily with the strategy used when seeking employment. Essentially, the strategies one employs in his quest for employment depend upon the reservation wage. The reservation wage is the wage at which a worker chooses to take a job because the opportunity cost of continuing a job search is too high. Some of the relevant variables that alter one's reservation wage are the availability of unemployment insurance, the distribution of job offers, and the minimum wage rate. These variables, amongst others, all alter one's job search strategy. Furthermore, as a worker ages, retirement becomes more and more likely. For that reason, a retirement model is also relevant to this research. Essentially, as one ages, the opportunity cost of working becomes higher. This is especially true for workers of an advanced age, who largely face deep pay cuts upon reemployment, and may have partial social security benefits. This clearly has an impact on their reemployment prospects.

Model Specification and Data

Due to its relevance to this study, the model of Johnson and Mommaerts (2010) is best suited for examining the reemployment of older workers. The use of hazard functions appears to be prudent in the estimation of the probability of reemployment over a given period of time.
However, this study will diverge from Johnson and Mommaerts (2010) in that it will use the Displaced Worker Supplement (DWS) Census data set instead of the Survey of Income and Program Participation (SIPP) Census data set. Also, instead of implementing the hazard function using a logit model of the log odds of reemployment, a Cox proportional hazard function will be implemented here. This will enable the juxtaposition of the two approaches. Based on the model, the DWS data set may suffice in its estimation. However, it may not be possible to control for all of the factors included in the more extensive study of Johnson and Mommaerts (2010). The DWS data set is relatively extensive and has all the variables that are absolutely necessary for this estimation using a hazard function. Another difference is that the SIPP Census data set used for the model in question covers the years 1996-2007. The DWS data set is more recent and updated to the year 2009. For the most part, this difference appears to be negligible, with the exception of the atypical economic environment. It seems as if the changes, on the whole, will be trivial and not affect the previous results. In the end, a myriad of procedures can be carried out to examine the probability of reemployment of older workers. However, a hazard function best suits the needs of this study. It offers a relatively straightforward approach to examining the unemployment spells of displaced older workers. With the aid of the related literature this study may contribute to the pool of knowledge regarding the employment prospects of older workers. This study’s model will estimate:

Reemployment Hazard (found with Weeks without Work) = f(Age, Race, Education, Marital Status, Receiving Unemployment Benefits, Household Income, Job Tenure in Months, Year, Union Member)

Where,

Weeks without Work = The amount of time in weeks a worker has been unemployed.
Age = The age of an individual.
Race = A set of dummy variables that indicate the race of an individual, defined as black, Hispanic, or other.
Education = A set of dummy variables indicating the highest level of educational attainment a worker has achieved.
Marital Status = A dummy variable that shows whether a worker is married or not.
Receiving Unemployment Benefits = A dummy variable indicating whether a respondent is receiving unemployment benefits or not.
Household Income = A continuous variable representing the income a household receives.
Job Tenure = The amount of time in months a worker held their previous job.
Year = A set of dummy variables indicating the year.
Union Member = A dummy variable indicating whether an individual belongs to a union.

Based on the borrowed models and theory applied to this subject matter, the variables present in this model are summarized in the above equation. All of the variables are relevant to assessing one's probability of reemployment over a certain interval of time.

The “Age” variable is the primary variable of interest. It is the main independent variable, since the primary focus of this study is testing whether the probability of being reemployed, over a given interval of time, differs amongst various age groups. Chan and Stevens (2001) and Lahey (2008) have regarded the difference in reemployment prospects amongst various age groups as an effect of statistical discrimination. In other words, older workers are less attractive to employers than their younger counterparts, for a given reservation wage. However, “Age” is also associated with an increase in the reservation wage for older workers, due to the increase in the perceived value of skills that comes with age. This creates an ostensible disparity between the wages older workers are willing to accept and the wages they are offered. As a result, it is very likely that older workers face difficulty when seeking employment.
For that reason, it is expected, from a theoretical perspective, that “Age” has a negative relationship with the probability of reemployment.

The “Race” variable's relationship with the probability of reemployment over a given interval of time depends upon the race in question. For the purposes of this study, “Race” is a set of dummy variables defined as white, black, Hispanic, or other. Basically, different races may have different reservation wages. For instance, Holzer (1986) found that young black males have higher reservation wages than their white counterparts. This notion is assumed to extend to older workers as well. This implies that different races may employ different job search strategies.

The “Education” variable's relationship to reemployment probability is rather straightforward. Generally, it is assumed that the more human capital a worker has, the more marketable he is. As a result, education has a direct effect on one's job search strategy. Depending on how a worker perceives his abilities, he is more likely to have a higher reservation wage. The better educated one is, the more likely he is to place a higher value on his abilities. For that reason, the “Education” variable is likely to have a negative relationship with the likelihood of being reemployed.

Similarly, “Receiving Unemployment Benefits” may be negatively correlated with reemployment. Unemployment benefits decrease the opportunity cost of continuing a job search. If the incentive to gain employment swiftly is diminished, then one will tend to remain unemployed longer. Thus, benefits change a worker's job search strategy. For that reason, receiving UI benefits is expected to negatively impact an individual's chances of being reemployed.

Likewise, “Household Income” is expected to be negatively related to the probability of a worker becoming reemployed. The greater one's household income is, the longer their job search is likely to be.
For instance, if the individual in question is displaced and their household income is high, their job search is likely to be longer (Alexopoulos and Gladden 2006). This is an instance, depending upon the circumstances, where one would consider retirement, especially if they are old enough to receive partial Social Security benefits. As a result, household income is likely inversely related to the likelihood of becoming reemployed.

Another factor altering job search strategy is “Job Tenure.” The longer a worker is on the job, the more human capital he acquires. As a result, the worker in question should have a higher reservation wage, decreasing the cost of rejecting job offers (Mortensen 1988). In this event the worker may choose to stay unemployed for a longer period of time, reducing his reemployment prospects. This makes his job search strategy differ from that of one who has less experience.

One's job search strategy could also be altered by “Marital Status.” It is assumed that a married individual would have more of an incentive to seek reemployment, given his situation. In other words, the fact that an individual is married implies that there is a greater need for income. As a result, his reservation wage is likely to be lower than that of one without a spouse (Franz 1980), and a married individual's job search strategy would differ from that of his unmarried counterparts.

Job search strategies would also differ between union members and non-union members. Theory suggests that a union member would have a lower reservation wage due to various other protections offered by the union, according to Johnson and Mommaerts (2010). This could be a result of the more advantageous bargaining position a union affords its members. As a result, it is likely that a union member would spend a shorter time unemployed than his non-union counterparts.

Lastly, the “Year” variable serves to control for the differences in one's job search strategy that arise over time.
For instance, different years could fall in different parts of the business cycle. Clearly, someone searching in a recessionary period would employ a vastly different job search strategy than one searching in an expansionary period. As a result, the parameter estimate for this variable will differ depending upon the year. More specifically, it is expected to be negative during recessionary years and positive during expansionary years. Here it is likely to be negative because the data set covers a recessionary period.

Data

As previously noted, this study uses the Displaced Worker Supplement (DWS) from the Census. This iteration of the DWS data set is the most current and covers the years 2007-2009. As a result, the data used here are more current than those used in other studies on this topic.

The DWS has the necessary variables to conduct a study of the duration of displacement. The most important variable in such a study is the duration of unemployment. This is captured in the weeks without work variable (WKSWO), which represents the amount of time in weeks a worker was without work. Since individuals who were not displaced were eliminated from the sample, this variable captures the duration of displacement. In this sample, the mean time of unemployment is roughly 14.2 weeks, while the standard deviation, maximum, and minimum are 18.9 weeks, 160 weeks, and 0 weeks, respectively.

Another variable captured in this model is the “AGE” variable, which represents the age of an individual. Here the mean, standard deviation, minimum, and maximum of age are 40.8, 11.8, 20, and 65, respectively. The maximum age is capped at 65 years to minimize the effect of retirement.

The remainder of the variables are various demographic and control variables. They deal with union membership, unemployment benefits, marriage, job tenure, family income, race, and the year.
Perhaps the only notable manipulation that took place amongst these variables concerns the family income variable. It was originally represented by income bands. In order to make it more manageable, the income bands were converted to one continuous variable. Other than that, these variables are unaltered, and their descriptive statistics and explanations are available in Table A-1 of the appendix.

Model Interpretation

In order to effectively interpret the models, it is prudent to understand their constituent parts. For the Cox model, exponentiating a parameter estimate yields the hazard ratio, which is the ratio of the hazard rate produced by a one-unit change in a variable to the baseline hazard. In order to measure the proportional change in the hazard rate caused by a unit change in a variable, one must be subtracted from the hazard ratio. The accelerated failure time model is a bit more complicated. Here, the parameter estimates are time ratios when exponentiated. In order to obtain the hazard ratio, the parameter estimates must be multiplied by the negative of the Weibull shape parameter and then exponentiated.

The results yielded by these models are interesting. This is largely due to the fact that many variables that would be expected to have a significant impact on the reemployment hazard are not significant in either the accelerated failure time or Cox versions of the model. Moreover, this is a stark deviation from the results found in the related literature. For those reasons, the results, reported in Table A-2, are open to doubt, but not completely outside of reason.

This is especially true for the “Age” variable's parameter estimate. In both models the “Age” variable is not significant, as shown by its high p-values. The Cox model does produce a somewhat lower p-value for this variable than the accelerated failure time version.
Clearly, from a statistical perspective, age does not have a significant effect on the probability a worker gains reemployment in a given week, given that they are displaced. This is evidenced by both models.

However, the Cox proportional hazards model demonstrates that having spent some time in college has a significant effect on the reemployment hazard. Spending some time in college increases one's chances of reemployment in a given week by 48.1%. This differs from the accelerated failure time version of the model, where this variable is not significant. Several of the remaining variables also demonstrate disparities between the two models.

In both models, whether one is receiving unemployment benefits is strongly significant. The accelerated failure time model and the Cox model both produce low p-values for this variable. Furthermore, if one is receiving unemployment benefits, their chances of reemployment in a given week decrease by 56.1% in the Cox model and by 44.1% in the accelerated failure time model. In other words, unemployment benefits appear to be a detriment to reemployment.

The non-uniform effects across models carry over to the variable indicating whether one is a member of a union. According to the Cox model, an individual in a union has a 34% greater chance of gaining reemployment in a given week. In the accelerated failure time model, however, union membership is not significant. This is one of the few significant variables currently in the model.

Another significant variable is the one indicating whether an individual is black. In the Cox model, being black decreases one's chances of reemployment in a given week by 28.6%. According to the accelerated failure time model, on the other hand, this variable is not significant.
However, the two models do not deviate vastly from one another. In fact, the year 2009 has a positive, significant effect on the reemployment hazard in both models. Perhaps there was some sort of mild recovery during that year. The differences between the models can most likely be attributed to the assumption regarding the distribution of survival times in the accelerated failure time model. Whether the actual distribution of the survival times is consistent with the Weibull distribution, or at least approaches that form, determines which method is more accurate in this case. The more surprising result is that most of the variables are not significant. This could be a result of the powerful effect the unemployment benefits variable has on the model, or of some multicollinearity.

Conclusion

The results of this study are particularly interesting for various reasons. Perhaps the most interesting aspect of what was revealed is that most of the variables expected to have a major effect on the model are not significant. Also, these results diverge rather starkly from the previous studies cited earlier. Of course, there are logical explanations for these outcomes.

For instance, the “Age” variable was not significant in either model. This is likely a result of the effect the other variables had on the models. It is plausible that a markedly significant variable, like unemployment benefits, could have overpowered it.

Also, it is not entirely implausible that a variable is significant in one model, but not in the other. Each model has different assumptions attached, which could alter the results depending on how accurate those assumptions are in this case. There is a distinct possibility that one model is better suited for this study than the other. As a matter of fact, these models have different specifications, but both are widely applicable.
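To make the difference in specifications concrete, the two models can be written in standard textbook notation. The symbols below ($h_0$, $\beta$, $\gamma$, $\mu$, $\sigma$, $p$, $\varepsilon$) do not appear elsewhere in this study and are assumed here purely for illustration. The accelerated failure time model is assumed to take the Weibull form, with $\varepsilon$ following the standard extreme value distribution.

```latex
\begin{align*}
\text{Cox:}\quad
  & h(t \mid x) = h_0(t)\,\exp(x'\beta) \\
\text{Weibull AFT:}\quad
  & \ln T = \mu + x'\gamma + \sigma\varepsilon, \qquad p = 1/\sigma \\
  & h(t \mid x) = p\,t^{\,p-1}\exp\{-p\,(\mu + x'\gamma)\} \\
\text{Hazard ratio for variable } k:\quad
  & \exp(\beta_k)\ \text{(Cox)}, \qquad \exp(-p\,\gamma_k)\ \text{(Weibull AFT)}
\end{align*}
```

The Cox model leaves the baseline hazard $h_0(t)$ unspecified, whereas the Weibull form fixes its shape through $p$. The two can therefore disagree when the survival times are not well described by a Weibull distribution.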
Not surprisingly, receiving unemployment benefits has a negative impact on an individual's probability of reemployment in a given week in both models. This affirms the hypothesis stated earlier. These results were expected and are not particularly troubling.

The fact that most of the education variables are insignificant is the most troubling aspect of these results. It would seem that education should either increase or decrease one's probability of reemployment. One could assume either that those with more human capital would have a higher reservation wage, lengthening their job search, or that they would be in high demand, which would shorten their job search. It appears that the latter is the case, since the Cox model shows that having some college experience increases the probability of gaining reemployment in a given week.

On the whole, these results could be due to the data only covering the years 2007-2009. During these years there was an atypical economic environment, which could have skewed the results. If this study were conducted again, it would be prudent to control for the business cycle. Other than that, this study was conducted successfully.

Works Cited

"Accelerated Life Models." Web. 1 Nov 2011. <http://www.mas.ncl.ac.uk/~nmf16/teaching/mas3311/handout9.pdf>.
Alemi, Farrokh. "Hazard Functions for Combination of Causes." YouTube. George Mason University. Online. 2007. <http://youtu.be/TT7mdmMAmPg>.
Alexopoulos, Michelle, and Tricia Gladden. "Wealth, Reservation Wages, and Labor Market Transitions in the US." University of Toronto. University of Toronto, 01 Jan 2006. Web. 8 Nov 2011. <http://www.iza.org/iza/en/papers/transatlantic/1_alexopoulos.pdf>.
"BIOST 515." Cox proportional hazards models. N.p., 04 Mar 2004. Web. 1 Nov 2011. <http://courses.washington.edu/b515/l17.pdf>.
Bureau of Labor Statistics. 2010.
“Unemployed persons by Age, Sex, Race, Hispanic or Latino Ethnicity, Marital Status, and Duration of Unemployment.” Washington, DC: U.S. Department of Labor. ftp://ftp.bls.gov/pub/special.requests/lf/aat31.txt.
Calcagno, Juan, Peter Crosta, Thomas Bailey, and Davis Jenkins. "Does Age of Entrance Affect Community College Completion Probabilities? Evidence from a Discrete-Time Hazard Model." Educational Evaluation and Policy Analysis. 29.3 (2007): 218-35. Print.
Chan, Sewin. "Job Loss and Employment Patterns of Older Workers." Journal of Labor Economics (2001): 484-521. Web. 11 Jan 2011.
Collier, Paul, Anke Hoeffler, and Måns Söderbom. "On the Duration of Civil War." Journal of Peace Research. 41.3 (2004): 253-73. Print.
Crumer, Angela. "Comparison between Weibull and Cox proportional hazards models." Kansas State University, May 2011. Web. 18 Nov 2011. <http://krex.k-state.edu/dspace/bitstream/2097/8787/3/AngelaCrumer2011.pdf>.
Farber, Henry S. “The Changing Face of Job Loss in the United States, 1981-1995.” Brookings Papers on Economic Activity: Microeconomics (1997), pp. 55-128.
Fox, John. "Cox Proportional-Hazards Regression for Survival Data." Fox Companion, Feb 2002. Web. 1 Nov 2011. <http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf>.
Franklin, David. "How to Build the Kaplan-Meier Curve from the Ground Up." The Programmers Cabin. N.p., 08 Aug 2006. Web. 1 Nov 2011. <http://www.theprogrammerscabin.com/OT060830.pdf>.
Franz, Wolfgang. United States. National Bureau of Economic Research. Reservation wage of unemployed persons In the Federal Republic of Germany: Theory and Empirical Tests. Cambridge: NBER, 1980. Print. <http://www.nber.org/papers/w0578>.
Holzer, Harry. "Reservation Wages and their Labor Market Effects for Black and White Male Youth." Journal of Human Resources. (1986): 157-177. Print.
Introduction to Survival Analysis. UCLA: Academic Technology Services, Statistical Consulting Group. From http://www.ats.ucla.edu/stat/sas/seminars/sas_survival/default.htm (accessed November 1, 2011).
Johnson, Richard, and Corina Mommaerts. "Age Differences in Job Loss, Job Search, and Reemployment." Urban Institute. (2010).
Johnson, Richard, and Janice Park. "Can Unemployed Older Workers Find Work?" Urban Institute 25.1 (2011): Web. 15 Mar 2011. <http://www.urban.org/url.cfm?ID=412283>.
"Kaplan-Meier survival estimates." Stats Direct. StatsDirect, 2011. Web. 23 Nov 2011. <http://www.statsdirect.com/help/survival_analysis/kaplan.htm>.
Lahey, Joanna. 2008. “Age, Women, and Hiring: An Experimental Study.” Journal of Human Resources 43(1): 30–56.
Mason, Carl. "Cox proportional hazard models." UC Berkeley, 05 Dec 2005. Web. 1 Nov 2011. <http://www.demog.berkeley.edu/213/Week14/welcome.pdf>.
Monogan, Jamie. "The Cox Proportional Hazards Model." Lecture. Washington University in St. Louis. St. Louis. April 6, 2010. <http://monogan.myweb.uga.edu/teaching/pd/16duration2.pdf>.
Mortensen, Dale. "Wages, Separations, and Job Tenure: On-the-job Specific Training or Matching." Journal of Labor Economics. 6.4 (1988): 445-471.
"Parametric Models." New York University, n.d. Web. 1 Nov 2011. <https://files.nyu.edu/mrg217/public/parametric.pdf>.
Price, Gregory. "Obesity and crime: Is there a relationship?." Economics Letters. (2009): 149-52.
Schoen, John. "How are older, laid-off workers faring?." MSNBC 2011: Web. 15 Mar 2011. <http://www.msnbc.msn.com/id/15537917/ns/business-answer_desk/>.
U.S. Department of Education, National Center for Education Statistics. (2003). Integrated postsecondary education data system - Fall enrollment survey; 2002 [Data File], Washington, DC.

Appendix

Table A-1: Variables

| Variable | Definition | Mean | Std | Min | Max |
|---|---|---|---|---|---|
| AGE | The age of an individual. | 40.8 | 11.77 | 20 | 65 |
| FEMALE | A dummy variable indicating gender. | .396 | .49 | 0 | 1 |
| LJTEN | An observation's tenure at their last job in years. | 4.8 | 6.2 | .00274 | 41 |
| MARRIED | A dummy variable indicating whether an individual is married. | .55 | .497 | 0 | 1 |
| UIBENS | A dummy variable indicating whether an individual has received unemployment benefits. | .43 | .495 | 0 | 1 |
| UNION | A dummy variable indicating whether an individual is a member of a union. | .100 | .300 | 0 | 1 |
| LTHS | A dummy variable indicating whether an individual's highest level of educational attainment is less than high school. | .075 | .264 | 0 | 1 |
| HS | A dummy variable indicating whether an individual's highest level of educational attainment is a high school diploma. | .31 | .46 | 0 | 1 |
| SCOL | A dummy variable indicating whether an individual's highest level of educational attainment is some college. | .33 | .47 | 0 | 1 |
| COL | A dummy variable indicating whether an individual's highest level of educational attainment is a college degree. | .206 | .405 | 0 | 1 |
| ADV | A dummy variable indicating whether an individual's highest level of educational attainment is an advanced degree. | .077 | .266 | 0 | 1 |
| WHITE | A dummy variable indicating whether an individual is white. | .725 | .446 | 0 | 1 |
| BLACK | A dummy variable indicating whether an individual is black. | .087 | .281 | 0 | 1 |
| HISP | A dummy variable indicating whether an individual is Hispanic. | .133 | .341 | 0 | 1 |
| OTHERS | A dummy variable indicating whether an individual is not black, white, or Hispanic. | .0535 | .23 | 0 | 1 |
| YearOne | A dummy variable indicating whether an individual was displaced in 2007. | .231 | .421 | 0 | 1 |
| YearTwo | A dummy variable indicating whether an individual was displaced in 2008. | .361 | .48 | 0 | 1 |
| YearThree | A dummy variable indicating whether an individual was displaced in 2009. | .41 | .492 | 0 | 1 |
| FAMILYINC | A variable representing the income of a family. | 58,426 | 40,461 | 0 | 1 |
| WKSWO | A variable representing the amount of time in weeks a worker has been without work. | 14.17 | 18.9 | 0 | 160 |

All of the data used in this analysis were collected from: http://www.ceprdata.org/cps/dws_data.php

Table A-2: Regression Table

Dependent Variable: WKSWO

| Variable | (1) Cox Proportional | Hazard Ratio | (2) AFT | Hazard Ratio |
|---|---|---|---|---|
| Intercept | None (Semi-Parametric) | | .9272 (<.0001) | |
| AGE | .00220 (.5777) | 1.002 | .0006 (.7242) | |
| FEMALE | .01174 (.8917) | 1.012 | -.0158 (.6527) | |
| LJTEN | .0008603 (.9043) | 1.001 | .0016 (.5934) | |
| MARRIED | -.0004816 (.9958) | 1.000 | .0194 (.5996) | |
| UIBENS | -.82323 (<.0001)* | 0.439 | .2104 (<.0001)* | 0.559 |
| UNION | .29569 (.0361)* | 1.344 | -.0709 (.2367) | |
| LTHS | Reference | | Reference | |
| HS | .17976 (.3305) | 1.197 | .00398 (.5854) | |
| SCOL | .39300 (.0362)* | 1.481 | -.0944 (.2056) | |
| COL | .23436 (.2435) | 1.264 | .0495 (.5351) | |
| ADV | .42494 (.0631) | 1.529 | -.1131 (.2201) | |
| WHITE | Reference | | Reference | |
| BLACK | -.33748 (.0222)* | 0.714 | .0954 (.0921) | |
| HISP | -.22052 (.0918) | 0.802 | .0742 (.1542) | |
| OTHERS | -.10798 (.5933) | 0.898 | .1068 (.2303) | |
| YearOne | Reference | | Reference | |
| YearTwo | .00271 (.9804) | 1.003 | -.0125 (.7767) | |
| YearThree | .36295 (.0008)* | 1.438 | -.1264 (.0038)* | 1.417 |
| FAMILYINC | 2.26342E-7 (.8428) | 1.000 | <-.0001 (.7515) | |
| Generalized R² | .999140877 | | Not Applicable | |
| Weibull Shape | Not Applicable | | 2.7561 | |

P-values are in parentheses. An asterisk denotes significant values. Hazard ratios are to the right of each estimate.
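The conversions described in the Model Interpretation section can be sketched in a few lines of Python. This is only a minimal illustration and not the software used to produce the estimates in Table A-2. The function names are hypothetical, while the coefficients and the Weibull shape value are copied from the table.

```python
import math

# Coefficients and Weibull shape copied from Table A-2.
COX_COEFFICIENTS = {
    "UIBENS": -0.82323,
    "UNION": 0.29569,
    "SCOL": 0.39300,
    "BLACK": -0.33748,
}
AFT_COEFFICIENTS = {
    "UIBENS": 0.2104,
    "YearThree": -0.1264,
}
WEIBULL_SHAPE = 2.7561


def cox_hazard_ratio(beta):
    """Hazard ratio for a one-unit change in a Cox covariate."""
    return math.exp(beta)


def aft_time_ratio(gamma):
    """Time ratio for a one-unit change in an AFT covariate."""
    return math.exp(gamma)


def aft_hazard_ratio(gamma, shape):
    """Hazard ratio implied by a Weibull AFT coefficient.

    The coefficient is multiplied by the negative of the shape
    parameter and then exponentiated.
    """
    return math.exp(-shape * gamma)


def percent_change(hazard_ratio):
    """Percentage change in the weekly hazard: (ratio - 1) * 100."""
    return 100.0 * (hazard_ratio - 1.0)


def main():
    for name, beta in COX_COEFFICIENTS.items():
        ratio = cox_hazard_ratio(beta)
        print(f"Cox {name}: ratio={ratio:.3f}, change={percent_change(ratio):+.1f}%")
    for name, gamma in AFT_COEFFICIENTS.items():
        ratio = aft_hazard_ratio(gamma, WEIBULL_SHAPE)
        print(f"AFT {name}: ratio={ratio:.3f}, change={percent_change(ratio):+.1f}%")


if __name__ == "__main__":
    main()
```

Because the published coefficients are rounded, the ratios computed this way may differ from the hazard ratios in Table A-2 in the third decimal place.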