MEMORANDUM Department of Economics University of Oslo

MEMORANDUM
No 20/2010
Identifying Trend and Age Effects in Sickness
Absence from Individual Data:
Some Econometric Problems
Erik Biørn
ISSN: 0809-8786
Department of Economics
University of Oslo
This series is published by the
University of Oslo
Department of Economics
In co-operation with
The Frisch Centre for Economic
Research
P. O.Box 1095 Blindern
N-0317 OSLO Norway
Telephone: + 47 22855127
Fax:
+ 47 22855035
Internet: http://www.sv.uio.no/econ
e-mail:
econdep@econ.uio.no
Gaustadalleén 21
N-0371 OSLO Norway
Telephone:
+47 22 95 88 20
Fax:
+47 22 95 88 25
Internet:
http://www.frisch.uio.no
e-mail:
frisch@frisch.uio.no
IDENTIFYING TREND AND AGE EFFECTS
IN SICKNESS ABSENCE FROM INDIVIDUAL DATA:
SOME ECONOMETRIC PROBLEMS∗)
ERIK BIØRN
Department of Economics, University of Oslo,
P.O. Box 1095 Blindern, 0317 Oslo, Norway
E-mail: erik.biorn@econ.uio.no
Abstract: When using data from individuals who are in the labour force to disentangle the
empirical relevance of cohort, age and time effects for sickness absence, the inference may be biased,
affected by sorting-out mechanisms. One reason is unobserved heterogeneity potentially affecting
both health status and ability to work, which can bias inference because the individuals entering
the data set are conditional on being in the labour force. Can this sample selection be adequately
handled by attaching unobserved heterogeneity to non-structured fixed effects? In the paper we
examine this issue and discuss the econometric setup for identifying from such data time effects
in sickness absence. The inference and interpretation problem is caused, on the one hand, by the
occurrence of time, cohort and age effects also in the labour market participation, on the other
hand by correlation between unobserved heterogeneity in health status and in ability to work.
We show that running panel data regressions, ordinary or logistic, of sickness absence data on
certain covariates, when neglecting this sample selection, is likely to obscure the interpretation of
the results, except in certain, not particularly realistic, cases. However, the fixed individual effects
approach is more robust in this respect than an approach controlling for fixed cohort effects only.
Keywords: Sickness absence, health-labour interaction, cohort-age-time problem, self-selection,
latent heterogeneity, bivariate censoring, truncated binormal distribution, panel data
JEL classification: C23, C25, I38, J22
* This paper is part of the project “Absenteeism in Norway – Causes, Consequences, and Policy
Implications”, financed by the Norwegian Research Council (grant #187924). I am grateful to
Knut Røed for comments.
1 Introduction
During the last two decades, the rate at which workers have been absent from work
due to sickness – absenteeism – has risen in several countries. Norway, for instance,
has seen a sharp increase, from around 4–5 per cent of paid hours in the early 1990s
to around 6.5 per cent in 2010. This rise has occurred despite general improvements
in self-reported health conditions. In a recent paper, Biørn et al. (2010) have, by
exploiting individual data on long-term absence spells for virtually all workers in
Norway over a 13-year period, addressed this problem empirically, attempting, in
particular, to disentangle the empirical relevance of cohort, age and time effects
by means of “fixed effect methods”. It is obvious that the data available for a
study of this kind are potentially affected by sorting-out mechanisms because the
individuals entering the data set are conditional on being in the labour force. It may
be questioned whether this sample selection can be adequately treated by handling
unobserved heterogeneity through fixed effects, and whether suppressing individual
heterogeneity and instead conditioning on cohort or age is likely to accentuate the
bias in the estimation of time effects.
In this paper we examine this issue and discuss more thoroughly the econometric setup for identifying time effects in sickness absence from such data while, as far
as possible, controlling for cohort/age effects and systematic sample selection. The
inference and interpretation problems arise, on the one hand, because time and cohort/age effects also occur in labour market participation and, on the
other hand, because unobserved heterogeneity most likely affects both health
status and ability to work. Specifically, the modelling and inference should account
for these two latent variables being correlated. Running regressions – ordinary or
logistic – of sickness absence data on certain regressors, without accounting for this
sample selection, is likely to obscure the interpretation of the findings and make it
difficult to explain their message to non-specialists.
The content of the paper is organized as follows. In Section 2 a simple basic
model is formulated, explaining jointly degree of ability to work and degree of sickness by time, cohort and age, accounting for the exact collinearity of the latter, as
well as individual heterogeneity. The modelling of unobserved heterogeneity and
its implication for the interpretation of the coefficients are discussed in Section 3.
Derived sickness probabilities are discussed in Section 4, where we emphasize the
distinction between conditioning on individual effects and conditioning on cohort or
age. Next, in Section 5, models treating degree of sickness as an observable quantitative variable are discussed, while in Section 6 models treating it as binary (sick
versus non-sick) are considered. In the two latter sections, selection bias problems
and ways of coming to grips with them are put in focus. Some concluding remarks
follow in Section 7.
2 Notation and basic model: Heterogeneity unmodelled
Let $i$ and $t$ denote individual and time period (year) and let $c_i$ and $a_{it}$ be birth
cohort and age. The three variables are collinear, since by definition
(2.1)  $a_{it} \equiv t - c_i$.
Let $1\{A\}$ equal 1 if event $A$ is true and 0 otherwise, and define
$w_{it} = 1\{\text{Individual } i \text{ belongs to the labour force at time } t\}$,
$s_{it} = 1\{\text{Individual } i \text{ is reported sick at time } t\}$.
Let also
$w_{it}^*$ = Degree of ability to work, individual $i$, time $t$,
$s_{it}^*$ = Degree of sickness, individual $i$, time $t$,
both quantitative and continuous, although not frequently observable in this way.
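Regardless of whether $(w_{it}^*, s_{it}^*)$ are observable or latent, we postulate that they
depend on cohort, time, age, and latent heterogeneity $(\mu_i^w, \mu_i^s)$ as
(2.2)  $w_{it}^* = \beta_w c_i + \gamma_w t + \delta_w a_{it} + \mu_i^w + \varepsilon_{it}^w$,
(2.3)  $s_{it}^* = \beta_s c_i + \gamma_s t + \delta_s a_{it} + \mu_i^s + \varepsilon_{it}^s$,
(2.4)  $\begin{bmatrix} \varepsilon_{it}^w \\ \varepsilon_{it}^s \end{bmatrix} \Big|\, [c_i, t, \mu_i^w, \mu_i^s] \sim \mathrm{IID}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_w^2 & \sigma_{ws} \\ \sigma_{ws} & \sigma_s^2 \end{bmatrix} \right) \equiv \mathrm{IID}(0, \Sigma)$,
where $\varepsilon_{it}^w$ and $\varepsilon_{it}^s$ are genuine disturbances. Whether or not the covariance matrix $\Sigma$
is diagonal, i.e., whether $\sigma_{ws} = 0$ or $\sigma_{ws} \neq 0$, will be important for the selection bias issue.
Ways of modelling the latent individual effects $(\mu_i^w, \mu_i^s)$ and their consequences will
be discussed in Section 3.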
We treat cohort, year and age as quantitative, but the terms involving these
variables in (2.2)–(2.3) and formulae derived from them can easily be replaced by
terms in cohort, year and age dummies, if desired. Specifically, we may extend $c_i, t, a_{it}$
to (column) vectors of cohort, time and age dummies, and extend the scalar coefficients
$(\beta_w, \gamma_w, \delta_w)$ and $(\beta_s, \gamma_s, \delta_s)$ to (row) vectors of dummy coefficients, paying regard
to the definitional relationships between the dummies which correspond to (2.1).
Our primary objective is to identify $\gamma_s$, in combination with $\beta_s$ or $\delta_s$ if possible,
while controlling for observed and unobserved heterogeneity. Because cohort, time
and age are linearly related, confer (2.1), and the equations under consideration are
linear, the dimension of the equations must be reduced accordingly before (2.2) or (2.3) is
confronted with data. As a starting point for the empirical modelling we can therefore
take any of the following versions of the equations:
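(2.5)  $w_{it}^* = (\beta_w - \delta_w) c_i + (\gamma_w + \delta_w) t + \mu_i^w + \varepsilon_{it}^w$
       $\equiv (\gamma_w + \beta_w) t + (\delta_w - \beta_w) a_{it} + \mu_i^w + \varepsilon_{it}^w$
       $\equiv (\beta_w + \gamma_w) c_i + (\delta_w + \gamma_w) a_{it} + \mu_i^w + \varepsilon_{it}^w$,
(2.6)  $s_{it}^* = (\beta_s - \delta_s) c_i + (\gamma_s + \delta_s) t + \mu_i^s + \varepsilon_{it}^s$
       $\equiv (\gamma_s + \beta_s) t + (\delta_s - \beta_s) a_{it} + \mu_i^s + \varepsilon_{it}^s$
       $\equiv (\beta_s + \gamma_s) c_i + (\delta_s + \gamma_s) a_{it} + \mu_i^s + \varepsilon_{it}^s$.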
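The identification problem behind (2.5)–(2.6) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the panel of cohort and year values, the coefficient values and the shift `k` are all hypothetical. It shows that, because of the identity (2.1), shifting $(\beta, \gamma, \delta)$ to $(\beta+k, \gamma-k, \delta+k)$ leaves every fitted value unchanged, while the composite coefficients $\beta-\delta$ and $\gamma+\delta$ in the first line of (2.5)–(2.6) are unaffected.

```python
def linear_index(beta, gamma, delta, cohort, year):
    """beta*c + gamma*t + delta*a, with age fixed by the identity a = t - c of (2.1)."""
    age = year - cohort
    return beta * cohort + gamma * year + delta * age


def reduced_form(beta, gamma, delta):
    """Coefficients on (cohort, year) after substituting a = t - c: first line of (2.5)/(2.6)."""
    return beta - delta, gamma + delta


# Hypothetical balanced panel of (cohort, year) pairs.
panel = [(c, t) for c in range(1940, 1946) for t in range(1994, 1999)]

theta = (0.02, 0.05, 0.03)  # illustrative (beta, gamma, delta)
k = 0.7                     # arbitrary shift along the non-identified direction
theta_shifted = (theta[0] + k, theta[1] - k, theta[2] + k)

# Largest difference in the linear index over the whole panel (zero up to rounding).
max_gap = max(
    abs(linear_index(*theta, c, t) - linear_index(*theta_shifted, c, t))
    for c, t in panel
)
```

Both parameter vectors are observationally equivalent, which is why only two composite coefficients per equation can be confronted with data.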
3 Extensions: Modelling systematic heterogeneity
The latent effects are likely to be correlated with observed regressors, for instance
because norms with respect to labour force participation and absenteeism are correlated with cohort. Econometrically, a ‘norm’ is a latent entity, to be attached to,
‘proxied by’, observable variables to be of relevance. A simple way of formalizing
this is
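(3.1)  $\mu_i^w = \alpha_w + \lambda_w c_i + \nu_i^w \equiv \alpha_w + \lambda_w (t - a_{it}) + \nu_i^w$,
(3.2)  $\mu_i^s = \alpha_s + \lambda_s c_i + \nu_i^s \equiv \alpha_s + \lambda_s (t - a_{it}) + \nu_i^s$,
(3.3)  $\begin{bmatrix} \nu_i^w \\ \nu_i^s \end{bmatrix} \Big|\, [c_i, t] \sim \mathrm{IID}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \omega_w^2 & \omega_{ws} \\ \omega_{ws} & \omega_s^2 \end{bmatrix} \right) \equiv \mathrm{IID}(0, \Omega)$,
and concurrently modify (2.4) to
(3.4)  $\begin{bmatrix} \varepsilon_{it}^w \\ \varepsilon_{it}^s \end{bmatrix} \Big|\, [c_i, t, \nu_i^w, \nu_i^s] \sim \mathrm{IID}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_w^2 & \sigma_{ws} \\ \sigma_{ws} & \sigma_s^2 \end{bmatrix} \right) \equiv \mathrm{IID}(0, \Sigma)$.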
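Inserting (3.1)–(3.2) in (2.5)–(2.6), we obtain
(3.5)  $w_{it}^* = \alpha_w + (\beta_w + \lambda_w - \delta_w) c_i + (\gamma_w + \delta_w) t + \nu_i^w + \varepsilon_{it}^w$
       $\equiv \alpha_w + (\gamma_w + \beta_w + \lambda_w) t + (\delta_w - \beta_w - \lambda_w) a_{it} + \nu_i^w + \varepsilon_{it}^w$
       $\equiv \alpha_w + (\beta_w + \lambda_w + \gamma_w) c_i + (\delta_w + \gamma_w) a_{it} + \nu_i^w + \varepsilon_{it}^w$,
(3.6)  $s_{it}^* = \alpha_s + (\beta_s + \lambda_s - \delta_s) c_i + (\gamma_s + \delta_s) t + \nu_i^s + \varepsilon_{it}^s$
       $\equiv \alpha_s + (\gamma_s + \beta_s + \lambda_s) t + (\delta_s - \beta_s - \lambda_s) a_{it} + \nu_i^s + \varepsilon_{it}^s$
       $\equiv \alpha_s + (\beta_s + \lambda_s + \gamma_s) c_i + (\delta_s + \gamma_s) a_{it} + \nu_i^s + \varepsilon_{it}^s$.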
This stylized modelling of heterogeneity makes $(\beta_w, \beta_s)$ unidentifiable, as it
implies that we in (2.5)–(2.6) must extend $(\beta_w, \beta_s)$ to $(\beta_w + \lambda_w, \beta_s + \lambda_s)$ and replace
$(\mu_i^w, \mu_i^s)$ by $(\nu_i^w, \nu_i^s)$. In view of (3.3)–(3.4), the composite disturbances
$(u_{it}^w, u_{it}^s) = (\nu_i^w + \varepsilon_{it}^w, \nu_i^s + \varepsilon_{it}^s)$
have a vector error components form with components mutually orthogonal ($\varepsilon^z \perp \nu^z$, $z = w, s$)
and orthogonal to both regressors, with standard deviations
$(\tau_w, \tau_s) = [(\sigma_w^2 + \omega_w^2)^{1/2}, (\sigma_s^2 + \omega_s^2)^{1/2}]$, covariance $\tau_{ws} = \sigma_{ws} + \omega_{ws}$ and correlation coefficient
$\kappa_{ws} = \tau_{ws}/[\tau_w \tau_s]$. We will to some extent stick to (3.1)–(3.4) as a way of modelling systematic heterogeneity in the following.
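As a small numerical companion to the error components structure, the sketch below computes $(\tau_w, \tau_s, \tau_{ws}, \kappa_{ws})$ from the elements of $\Sigma$ and $\Omega$. The function name `composite_moments` and the numbers in the example call are illustrative assumptions, not values taken from the paper.

```python
import math


def composite_moments(sigma_w, sigma_s, sigma_ws, omega_w, omega_s, omega_ws):
    """Second-order moments of u = nu + eps when nu and eps are orthogonal.

    Returns (tau_w, tau_s, tau_ws, kappa_ws) as defined below (3.6).
    """
    tau_w = math.sqrt(sigma_w ** 2 + omega_w ** 2)
    tau_s = math.sqrt(sigma_s ** 2 + omega_s ** 2)
    tau_ws = sigma_ws + omega_ws          # covariances of orthogonal components add up
    kappa_ws = tau_ws / (tau_w * tau_s)   # correlation of the composite disturbances
    return tau_w, tau_s, tau_ws, kappa_ws


# Hypothetical values: both covariance components are negative.
tau_w, tau_s, tau_ws, kappa_ws = composite_moments(3.0, 1.0, -0.3, 4.0, 1.0, -0.5)
```

With these inputs the composite correlation is negative but small in absolute value, since the variances of the individual effects inflate $\tau_w$ and $\tau_s$.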
However, unobserved heterogeneity may also be related to observable
variables other than cohort, some of which are time-varying, reflecting (gradual) changes in
‘norms’ (‘norm drift’); (3.1)–(3.2) may be argued to be too ‘simplistic’. Consider a
variant of (2.2)–(2.3) where uni-dimensional heterogeneity $(\mu_i^w, \mu_i^s)$ is generalized to
two-dimensional heterogeneity $(\mu_{it}^w, \mu_{it}^s)$ and (3.1)–(3.2) are extended to
$\mu_{it}^w = \alpha_w + \lambda_w c_i + \gamma_w^{\ddagger} t + \delta_w^{\ddagger} a_{it} + \nu_i^w + \varepsilon_{it}^{\ddagger w}$,
$\mu_{it}^s = \alpha_s + \lambda_s c_i + \gamma_s^{\ddagger} t + \delta_s^{\ddagger} a_{it} + \nu_i^s + \varepsilon_{it}^{\ddagger s}$.
It is easy to show that this essentially implies extending $(\gamma_w, \delta_w, \gamma_s, \delta_s)$ in (3.5)–(3.6)
to include $(\gamma_w^{\ddagger}, \delta_w^{\ddagger}, \gamma_s^{\ddagger}, \delta_s^{\ddagger})$, and $(\varepsilon_{it}^w, \varepsilon_{it}^s)$ to include $(\varepsilon_{it}^{\ddagger w}, \varepsilon_{it}^{\ddagger s})$, respectively.
Obvious, but important, conclusions so far are:
Conclusion 1: The interpretation of ‘time effect in absenteeism’ depends on
which mechanism determines the two kinds of unobserved heterogeneity and
whether cohort or age is the other control variable.
Conclusion 2: The time effects in absenteeism obtained from (2.6), with
(2.4) assumed, and with heterogeneity accounted for, i.e., $\gamma_s + \delta_s$ or $\gamma_s + \beta_s$,
may be a more stable ‘structure’ – the equation has a higher degree of ‘autonomy’ – than the time effects according to (3.6), with (3.3)–(3.4) assumed, or
extensions of it. The latter, unlike the former, change when the parameters
of (3.2) change.
4 Sickness probabilities
4.1 Threshold values for sickness and ability to work
As remarked, wit∗ and s∗it , in particular the former, may not be observable as continuous variables, while their qualitative counterparts – whether or not individual
i is in the labour force and/or is sick at time t – are usually known. Let w̄, s̄ be
unknown critical threshold values for the two continuous variables determining the
status ‘being in the labour force’ and ‘being reported sick’:
wit∗ ≥ w̄ =⇒ Individual i is observed belonging to the labour force.
s∗it ≥ s̄ =⇒ Individual i is observed being declared sick by a doctor.
The work ability threshold $\bar{w}$ may be time invariant or time dependent, in the
latter case capturing, inter alia, (worker) ‘norm drift’. The sickness threshold $\bar{s}$ may,
likewise, be time invariant or time dependent, in the latter case also capturing
(worker) ‘norm drift’ as well as drift in doctors’ norms or attitudes with respect
to issuing sickness certificates. We want to derive expressions for the corresponding
sickness probabilities. Let, as a start, $\psi(u, v)$ be the joint density of the standardized
disturbances in (2.5)–(2.6), or in (3.5)–(3.6), i.e., $(u, v) = (\varepsilon_{it}^w/\sigma_w, \varepsilon_{it}^s/\sigma_s)$, or
$(u, v) = (u_{it}^w/\tau_w, u_{it}^s/\tau_s)$, and define, for arbitrary $a, b$,
(4.1)  $f(a, b) = P(u > a, v > b) = \int_a^\infty \int_b^\infty \psi(u, v)\, dv\, du$,
(4.2)  $g(a, b) = P(v > b \mid u > a) = \dfrac{P(u > a, v > b)}{P(u > a)} = \dfrac{f(a, b)}{f(a, -\infty)}$.
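The functions in (4.1)–(4.2) are easy to evaluate numerically once a form for $\psi$ is chosen. The sketch below assumes that $\psi$ is the standard binormal density with correlation `rho` (the case treated in Section 5.1); this distributional choice, the Simpson integration scheme, and the truncation of the integration range at 8 standard deviations are assumptions made for the illustration only.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def f(a, b, rho, n=2000, upper=8.0):
    """P(u > a, v > b) for a standard binormal pair with correlation rho, |rho| < 1.

    Uses v | u ~ N(rho*u, 1 - rho^2) and composite Simpson integration over u
    (n must be even).
    """
    lower = max(a, -upper)
    if lower >= upper:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(u):
        return phi(u) * (1.0 - Phi((b - rho * u) / scale))

    h = (upper - lower) / n
    total = integrand(lower) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(lower + k * h)
    return total * h / 3.0


def g(a, b, rho):
    """P(v > b | u > a), i.e. f(a, b) / f(a, -infinity) as in (4.2)."""
    return f(a, b, rho) / f(a, -math.inf, rho)
```

With `rho = 0` the function `g(a, b, rho)` no longer depends on `a` and collapses to the marginal tail probability of `v`, which is the recursive case discussed later in the paper.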
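In (2.5)–(2.6), while utilizing (3.1)–(3.2), it is convenient to define
(4.3)  $\mu_i^{w*} = (\beta_w - \delta_w) c_i + \mu_i^w = \alpha_w + (\beta_w + \lambda_w - \delta_w) c_i + \nu_i^w$,
       $\mu_i^{w\dagger} = (\beta_w + \gamma_w) c_i + \mu_i^w = \alpha_w + (\beta_w + \lambda_w + \gamma_w) c_i + \nu_i^w$,
(4.4)  $\mu_i^{s*} = (\beta_s - \delta_s) c_i + \mu_i^s = \alpha_s + (\beta_s + \lambda_s - \delta_s) c_i + \nu_i^s$,
       $\mu_i^{s\dagger} = (\beta_s + \gamma_s) c_i + \mu_i^s = \alpha_s + (\beta_s + \lambda_s + \gamma_s) c_i + \nu_i^s$.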
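They can be interpreted as representing ‘gross individual heterogeneity’ inclusive of
cohort effects. Then (3.5)–(3.6) can be rewritten more simply as
(4.5)  $w_{it}^* = (\gamma_w + \delta_w) t + \mu_i^{w*} + \varepsilon_{it}^w \equiv (\delta_w + \gamma_w) a_{it} + \mu_i^{w\dagger} + \varepsilon_{it}^w$,
(4.6)  $s_{it}^* = (\gamma_s + \delta_s) t + \mu_i^{s*} + \varepsilon_{it}^s \equiv (\delta_s + \gamma_s) a_{it} + \mu_i^{s\dagger} + \varepsilon_{it}^s$.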
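Combining these expressions with (2.5)–(2.6), using the definition of the binary
variables in Section 2, we obtain
$w_{it} = 1 \iff w_{it}^* \geq \bar{w} \iff \varepsilon_{it}^w \geq \bar{w} - (\gamma_w + \delta_w) t - \mu_i^{w*} = \bar{w} - (\gamma_w + \delta_w) a_{it} - \mu_i^{w\dagger}$,
$s_{it} = 1 \iff s_{it}^* \geq \bar{s} \iff \varepsilon_{it}^s \geq \bar{s} - (\gamma_s + \delta_s) t - \mu_i^{s*} = \bar{s} - (\gamma_s + \delta_s) a_{it} - \mu_i^{s\dagger}$.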
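We introduce, in order to simplify notation and to put in focus the kind of parameters
identifiable from binary response data (confer Section 6), two sets of rescaled
parameters, obtained by normalizing coefficients and thresholds against the relevant
disturbance standard deviations. The first is related to (2.5)–(2.6), the second to
(3.5)–(3.6), giving, respectively, ‘σ-normalized’ parameters:
(4.7)  $\gamma_{w\sigma} = \dfrac{\gamma_w + \delta_w}{\sigma_w}$,  $\mu_{iw\sigma} = \dfrac{\mu_i^{w*}}{\sigma_w}$,  $\mu_{iw\sigma}^{\dagger} = \dfrac{\mu_i^{w\dagger}}{\sigma_w}$,  $\bar{w}_\sigma = \dfrac{\bar{w}}{\sigma_w}$,
(4.8)  $\gamma_{s\sigma} = \dfrac{\gamma_s + \delta_s}{\sigma_s}$,  $\mu_{is\sigma} = \dfrac{\mu_i^{s*}}{\sigma_s}$,  $\mu_{is\sigma}^{\dagger} = \dfrac{\mu_i^{s\dagger}}{\sigma_s}$,  $\bar{s}_\sigma = \dfrac{\bar{s}}{\sigma_s}$,
and ‘τ-normalized’ parameters:
(4.9)  $\gamma_{w\tau} = \dfrac{\gamma_w + \delta_w}{\tau_w}$,  $\beta_{w\tau} = \dfrac{\beta_w + \lambda_w - \delta_w}{\tau_w}$,  $\bar{w}_\tau = \dfrac{\bar{w}}{\tau_w}$,
(4.10) $\gamma_{s\tau} = \dfrac{\gamma_s + \delta_s}{\tau_s}$,  $\beta_{s\tau} = \dfrac{\beta_s + \lambda_s - \delta_s}{\tau_s}$,  $\bar{s}_\tau = \dfrac{\bar{s}}{\tau_s}$,
where, obviously, $(\bar{w}_\tau, \bar{s}_\tau, \gamma_{w\tau}, \gamma_{s\tau})$ are smaller (in absolute value) than $(\bar{w}_\sigma, \bar{s}_\sigma, \gamma_{w\sigma}, \gamma_{s\sigma})$.¹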
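We then obtain from (2.5)–(2.6)
(4.11)  $w_{it} = 1 \iff \dfrac{\varepsilon_{it}^w}{\sigma_w} \geq \bar{w}_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma} = \bar{w}_\sigma - \gamma_{w\sigma} a_{it} - \mu_{iw\sigma}^{\dagger}$,
(4.12)  $s_{it} = 1 \iff \dfrac{\varepsilon_{it}^s}{\sigma_s} \geq \bar{s}_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma} = \bar{s}_\sigma - \gamma_{s\sigma} a_{it} - \mu_{is\sigma}^{\dagger}$,
and, likewise, from (3.5)–(3.6)
(4.13)  $w_{it} = 1 \iff \dfrac{u_{it}^w}{\tau_w} \geq \bar{w}_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i = \bar{w}_\tau - (\gamma_{w\tau} + \beta_{w\tau}) t + \beta_{w\tau} a_{it}$,
(4.14)  $s_{it} = 1 \iff \dfrac{u_{it}^s}{\tau_s} \geq \bar{s}_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i = \bar{s}_\tau - (\gamma_{s\tau} + \beta_{s\tau}) t + \beta_{s\tau} a_{it}$.
¹ Possible smooth ‘norm-drift’ in $\bar{w}$ and $\bar{s}$ could be absorbed into $(\gamma_{w\tau}, \gamma_{s\tau})$ or $(\gamma_{w\sigma}, \gamma_{s\sigma})$.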
4.2 Probabilities conditional on individual effects
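Conditioning on individual effects, we can, using (4.1)–(4.2), (4.7)–(4.8) and (4.11)–(4.12),
express the probability of being sick unconditionally and conditional on being
in the labour force, as, respectively,²
(4.15)  $P(s_{it} = 1; t, \mu_{is\sigma}) = f(-\infty, \bar{s}_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma})$
        $= f(-\infty, \bar{s}_\sigma - \gamma_{s\sigma} a_{it} - \mu_{is\sigma}^{\dagger})$,
(4.16)  $P(s_{it} = 1 \mid w_{it} = 1; t, \mu_{iw\sigma}, \mu_{is\sigma}) = g(\bar{w}_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma}, \bar{s}_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma})$
        $= g(\bar{w}_\sigma - \gamma_{w\sigma} a_{it} - \mu_{iw\sigma}^{\dagger}, \bar{s}_\sigma - \gamma_{s\sigma} a_{it} - \mu_{is\sigma}^{\dagger})$.
If $\varepsilon_{it}^w$ and $\varepsilon_{it}^s$ are stochastically independent, then
$g(\bar{w}_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma}, \bar{s}_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma}) \equiv g(-\infty, \bar{s}_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma}) \equiv f(-\infty, \bar{s}_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma})$.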
4.3 Probabilities conditional on cohort or on age
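Conditioning instead on cohort, or equivalently on age, we can, using (4.1)–(4.2),
(4.9)–(4.10) and (4.13)–(4.14), express the probability of being sick unconditionally
and conditional on being in the labour force, as, respectively,³
(4.17)  $P(s_{it} = 1; t, c_i) = f(-\infty, \bar{s}_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i)$
        $= f(-\infty, \bar{s}_\tau - (\gamma_{s\tau} + \beta_{s\tau}) t + \beta_{s\tau} a_{it})$,
(4.18)  $P(s_{it} = 1 \mid w_{it} = 1; t, c_i) = g(\bar{w}_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i, \bar{s}_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i)$
        $= g(\bar{w}_\tau - (\gamma_{w\tau} + \beta_{w\tau}) t + \beta_{w\tau} a_{it}, \bar{s}_\tau - (\gamma_{s\tau} + \beta_{s\tau}) t + \beta_{s\tau} a_{it})$.
If not only $\varepsilon_{it}^w$ and $\varepsilon_{it}^s$, but also $\nu_i^w$ and $\nu_i^s$ are stochastically independent, then
$g(\bar{w}_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i, \bar{s}_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i) \equiv g(-\infty, \bar{s}_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i) \equiv f(-\infty, \bar{s}_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i)$.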
5 Models treating sickness as quantitative
In this section, leaving the probability expressions in Section 4, we return to the
setup presented in Sections 2 and 3 and consider three models with sickness assumed quantitatively observable, say measured as the number of sickness days per
unit of time. All models condition on time or on age; otherwise they differ with
respect to the conditioning assumed: the individual effect (Section 5.1), the birth cohort (Section 5.2), the age (Section 5.3). Conditioning on age and on cohort gives,
however, models which mirror models where the conditioning is on time and cohort.
We assume throughout that the observable variables are $s_{it}^*, w_{it}, t, c_i$.
² Formally, the latter probability is conditional both on being in the labour force, and on unobserved individual-specific heterogeneity in sickness and ability to work.
³ Formally, the latter probability is conditional both on being in the labour force, and on the
observed cohort to which the individual belongs.

5.1 Conditioning on individual effect
Assume first that $(\mu_i^w, \mu_i^s)$ are treated as fixed effects and, accordingly, that the heterogeneity submodel (3.1)–(3.3) is ‘suspended’. It follows from (2.4)–(2.6) and (4.6)
that only $\gamma_s + \delta_s$ and the composite parameters $\mu_i^{s*}$ defined in (4.4) can be identified.
With respect to the sample, we distinguish between two cases:
[A] If the sample were not censored by labour force participation, the sick-leave trend
estimated by regressing $s_{it}^*$ linearly on $(t, \mu_i^{s*})$, would have been $\gamma_s + \delta_s$, since then
(5.1)  $E(s_{it}^* \mid t, \mu_i^{s*}) = (\gamma_s + \delta_s) t + \mu_i^{s*}$.
[B] If the sample is censored by labour force participation, the sick-leave trend we
actually estimate differs from $\gamma_s + \delta_s$. We have
(5.2)  $E(s_{it}^* \mid w_{it} = 1; t, \mu_i^{s*}) = (\gamma_s + \delta_s) t + \mu_i^{s*} + E(\varepsilon_{it}^s \mid w_{it} = 1; t, \mu_i^{s*})$.
This equation exemplifies a bivariate sample selection model, whose last term accounts for the sample selection; see, e.g., Cameron and Trivedi (2005, Section 16.5.3).
This model type is sometimes referred to as Amemiya’s ‘Type 2 Tobit Model’; confer
Amemiya (1985, Section 10.7).
In the binormal case, where
$\psi(u, v) = (2\pi)^{-1} (1 - \rho^2)^{-1/2} \exp\{-\tfrac{1}{2} (u^2 - 2\rho u v + v^2)/(1 - \rho^2)\}$,
we can express $E(\varepsilon_{it}^s \mid w_{it} = 1; t, \mu_i^{s*})$ analytically as follows. Letting $\phi(\cdot)$ and $\Phi(\cdot)$ be
the univariate standard normal density and c.d.f., respectively, we get, by exploiting $\phi'(u) = -u\phi(u)$,
$E(v \mid u) = \rho u$, and $E(u \mid a \leq u \leq b) = [\phi(a) - \phi(b)]/[\Phi(b) - \Phi(a)]$ [see Johnson,
Kotz and Balakrishnan (1994, Section 10.1) or Biørn (2008, Appendix 8A)]:
(5.3)  $E(v \mid a \leq u \leq b) = \rho\, \dfrac{\phi(a) - \phi(b)}{\Phi(b) - \Phi(a)}$,
and also that, for any $a$,
(5.4)  $\lambda(a) \equiv \dfrac{\phi(a)}{\Phi(a)} \equiv \dfrac{\phi(-a)}{1 - \Phi(-a)}$,
(5.5)  $\lambda'(a) \equiv -\xi(a) = -\lambda(a)[\lambda(a) + a]$.
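The quantities in (5.3)–(5.5) can be checked numerically. The sketch below implements $\lambda(\cdot)$ and $\xi(\cdot)$ with the Python standard library; the function names `mills`, `xi` and `truncated_mean_v` are introduced here for illustration. A central finite difference can then be compared with $-\xi(a)$ as a check of (5.5).

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mills(a):
    """lambda(a) = phi(a) / Phi(a), as in (5.4)."""
    return phi(a) / Phi(a)


def xi(a):
    """xi(a) = lambda(a) * [lambda(a) + a], so that lambda'(a) = -xi(a), as in (5.5)."""
    lam = mills(a)
    return lam * (lam + a)


def truncated_mean_v(a, b, rho):
    """E(v | a <= u <= b) for a standard binormal pair with correlation rho, as in (5.3)."""
    return rho * (phi(a) - phi(b)) / (Phi(b) - Phi(a))


# One-sided truncation u >= a is the special case b = +infinity, giving rho * lambda(-a).
example = truncated_mean_v(-0.7, math.inf, -0.4)
```

The one-sided case is the one needed for the selection term: with truncation from below at $a$, the conditional mean of $v$ equals $\rho\lambda(-a)$.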
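Therefore, if $(\varepsilon_{it}^w, \varepsilon_{it}^s)$ are binormal, letting $\rho_{ws} = \sigma_{ws}/(\sigma_w \sigma_s)$ and using the ‘σ-normalized’ parameters (4.7)–(4.8), we obtain
(5.6)  $E(\varepsilon_{it}^s \mid w_{it} = 1; t, \mu_i^{s*}) = E[\varepsilon_{it}^s \mid \varepsilon_{it}^w \geq \bar{w} - (\gamma_w + \delta_w) t - \mu_i^{w*}; t, c_i]$
       $= \sigma_s E\left( \dfrac{\varepsilon_{it}^s}{\sigma_s} \,\Big|\, \dfrac{\varepsilon_{it}^w}{\sigma_w} \geq \bar{w}_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma}; t, c_i \right)$
       $= \rho_{ws} \sigma_s \lambda(\gamma_{w\sigma} t - \bar{w}_\sigma + \mu_{iw\sigma})$.
Inserting (5.6) in (5.2) we obtain
(5.7)  $E(s_{it}^* \mid w_{it} = 1; t, \mu_i^{s*}) = (\gamma_s + \delta_s) t + \mu_i^{s*} + \rho_{ws} \sigma_s \lambda(\gamma_{w\sigma} t - \bar{w}_\sigma + \mu_{iw\sigma})$
       $= \sigma_s [\gamma_{s\sigma} t + \mu_{is\sigma} + \rho_{ws} \lambda(\gamma_{w\sigma} t - \bar{w}_\sigma + \mu_{iw\sigma})]$.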
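Hence, utilizing (5.5), we find that the correct sickness trend, allowing for the systematic censoring, is, in general, non-linear and given by
(5.8)  $\partial E(s_{it}^* \mid w_{it} = 1; t, \mu_i^{s*})/\partial t = \gamma_s + \delta_s + \rho_{ws} \sigma_s\, \partial \lambda(\gamma_{w\sigma} t - \bar{w}_\sigma + \mu_{iw\sigma})/\partial t$
       $= \sigma_s [\gamma_{s\sigma} - \gamma_{w\sigma} \rho_{ws} \xi(\gamma_{w\sigma} t - \bar{w}_\sigma + \mu_{iw\sigma})]$.
If $\rho_{ws} \neq 0$, i.e., if the genuine disturbances in (2.2) and (2.3) are correlated, the
sickness trend (5.8) depends on $\sigma_s$, $\gamma_{w\sigma}$, $\bar{w}_\sigma$ and $\mu_{iw\sigma}$. Hence, when $\rho_{ws} \neq 0$, the
correct trend will be individual-specific.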
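To make the individual-specific trend concrete, the following sketch evaluates the right-hand sides of (5.7) and (5.8) in the binormal case. All parameter values are hypothetical, and the function names `censored_mean` and `censored_trend` are introduced here for illustration only; `trend_s` stands for $\gamma_s+\delta_s$ and `gamma_w_sigma`, `w_bar_sigma`, `mu_w_sigma` for the σ-normalized quantities in (4.7).

```python
import math


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mills(a):
    """lambda(a) of (5.4)."""
    return phi(a) / Phi(a)


def xi(a):
    """xi(a) of (5.5)."""
    lam = mills(a)
    return lam * (lam + a)


def censored_mean(t, trend_s, mu_s_star, gamma_w_sigma, w_bar_sigma, mu_w_sigma,
                  rho_ws, sigma_s):
    """Right-hand side of (5.7): expected degree of sickness given participation."""
    z = gamma_w_sigma * t - w_bar_sigma + mu_w_sigma
    return trend_s * t + mu_s_star + rho_ws * sigma_s * mills(z)


def censored_trend(t, trend_s, mu_s_star, gamma_w_sigma, w_bar_sigma, mu_w_sigma,
                   rho_ws, sigma_s):
    """Right-hand side of (5.8): derivative of (5.7) with respect to t."""
    z = gamma_w_sigma * t - w_bar_sigma + mu_w_sigma
    return trend_s - gamma_w_sigma * rho_ws * sigma_s * xi(z)


# Hypothetical parameter values with rho_ws < 0 and gamma_w_sigma < 0.
params = dict(trend_s=0.05, mu_s_star=0.0, gamma_w_sigma=-0.1, w_bar_sigma=0.0,
              mu_w_sigma=1.0, rho_ws=-0.5, sigma_s=1.0)
trend_at_5 = censored_trend(5.0, **params)
```

With a negative disturbance correlation and a negative participation trend, the computed trend lies below $\gamma_s+\delta_s$ and varies with both `t` and the individual effect `mu_w_sigma`.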
What can be said about the sign of the last component in (5.8)? First, (5.5)
implies that $\xi(\gamma_{w\sigma} t - \bar{w}_\sigma + \mu_{iw\sigma})$ is likely to be positive. Second, assume that some
common unspecified factors lead both to absenteeism and drop-out from the labour
force and hence $\rho_{ws} < 0$. Third, assume that the trend in inclusion into (exclusion
from) the labour market is negative (positive), i.e., $\gamma_{w\sigma} < 0$. Under these assumptions the last component in (5.8) is negative, so (5.8) most likely
implies that $\partial E(s_{it}^* \mid w_{it} = 1; t, \mu_i^{s*})/\partial t < \gamma_s + \delta_s$.
Conclusion 3: If the sample is censored by labour force participation and
$\rho_{ws} \neq 0$, the (theoretical) regression $E(s_{it}^* \mid w_{it} = 1; t, \mu_i^{s*})$ is, in general, non-linear
in $(t, \mu_i^{s*})$. Its form depends on the coefficients of (2.5) and (2.6) as
well as the distribution of $(\varepsilon_{it}^w, \varepsilon_{it}^s)$, as expressed by (5.7) in the binormal case.
A linear regression of $s_{it}^*$ on $(t, \mu_i^{s*})$ will result in biased estimation of the
composite sickness trend coefficient $\sigma_s \gamma_{s\sigma} = \gamma_s + \delta_s$ and the composite individual
effects $\mu_i^{s*}$. If $\rho_{ws} = 0$ the bias disappears: $\partial E(s_{it}^* \mid w_{it} = 1; t, \mu_i^{s*})/\partial t = \sigma_s \gamma_{s\sigma} = \gamma_s + \delta_s$.
In the latter case, (2.5)–(2.6) form a recursive structure, conditional on the
individual effects: first labour market participation is decided, next sickness is
determined. Conditional on the individual effects, there are no latent elements
bringing feedback from the latter to the former.
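The selection mechanism behind Conclusion 3 can also be illustrated by simulation. The sketch below is a hypothetical Monte Carlo experiment for one individual, with assumed parameter values and a fixed seed: it draws binormal disturbances, keeps only the draws for which the ability to work exceeds the threshold, and compares the average degree of sickness among participants with the right-hand side of (5.7). In the code, `trend_w` and `trend_s` stand for $\gamma_w+\delta_w$ and $\gamma_s+\delta_s$.

```python
import math
import random


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mills(a):
    return phi(a) / Phi(a)


def simulated_censored_mean(t, n_draws, rng, *, trend_w, trend_s, mu_w_star,
                            mu_s_star, w_bar, sigma_w, sigma_s, rho_ws):
    """Average of s* over simulated draws with w* >= w_bar (labour force participants)."""
    total, kept = 0.0, 0
    for _ in range(n_draws):
        u = rng.gauss(0.0, 1.0)
        v = rho_ws * u + math.sqrt(1.0 - rho_ws ** 2) * rng.gauss(0.0, 1.0)
        w_star = trend_w * t + mu_w_star + sigma_w * u
        if w_star >= w_bar:
            total += trend_s * t + mu_s_star + sigma_s * v
            kept += 1
    return total / kept


def formula_censored_mean(t, *, trend_w, trend_s, mu_w_star, mu_s_star, w_bar,
                          sigma_w, sigma_s, rho_ws):
    """Right-hand side of (5.7), written in terms of the non-normalized parameters."""
    z = (trend_w * t - w_bar + mu_w_star) / sigma_w
    return trend_s * t + mu_s_star + rho_ws * sigma_s * mills(z)


# Hypothetical parameter values; rho_ws < 0 and a negative participation trend.
params = dict(trend_w=-0.1, trend_s=0.05, mu_w_star=1.0, mu_s_star=0.0,
              w_bar=0.0, sigma_w=1.0, sigma_s=1.0, rho_ws=-0.5)
rng = random.Random(12345)
sim_0 = simulated_censored_mean(0.0, 200_000, rng, **params)
sim_10 = simulated_censored_mean(10.0, 200_000, rng, **params)
naive_slope = (sim_10 - sim_0) / 10.0  # average trend seen in the censored sample
```

Under these assumed values the average slope between the two periods in the censored sample falls short of `trend_s`, in line with the sign argument preceding Conclusion 3.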
5.2 Conditioning on birth-cohort
Assume next that heterogeneity modelled as (3.1)–(3.2) is part of the model, and
let $c_i$ be the conditioning variable in addition to $t$. It follows from (3.3)–(3.6) and
(4.6) that only $\gamma_s + \delta_s$ and $\beta_s + \lambda_s - \delta_s$ (or one-to-one transformations of them) can
be identified. With respect to the sample, we again distinguish between two cases.
[A] If the sample were non-censored, the trend coefficient we would have estimated
by regressing $s_{it}^*$ linearly on $(t, c_i)$ would have been $\gamma_s + \delta_s$, since then
(5.9)  $E(s_{it}^* \mid t, c_i) = \alpha_s + (\gamma_s + \delta_s) t + (\beta_s + \lambda_s - \delta_s) c_i$.
[B] If the sample is censored by labour force participation, the trend coefficient actually
estimated by regressing $s_{it}^*$ on $(t, c_i)$ differs from $\gamma_s + \delta_s$. We have
(5.10)  $E(s_{it}^* \mid w_{it} = 1; t, c_i) = \alpha_s + (\gamma_s + \delta_s) t + (\beta_s + \lambda_s - \delta_s) c_i + E(u_{it}^s \mid w_{it} = 1; t, c_i)$.
This equation exemplifies again a bivariate sample selection model, whose last term
accounts for the effects of the sample selection on the expected response variable.
Now, however, the origin of the selection is the composite disturbance $u_{it}^s = \nu_i^s + \varepsilon_{it}^s$.
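Assume in addition that $(\varepsilon_{it}^w, \varepsilon_{it}^s)$ and $(\nu_i^w, \nu_i^s)$ are binormal, implying that
$(u_{it}^w, u_{it}^s)$ are binormal with standard deviations $(\tau_w, \tau_s)$, covariance $\tau_{ws}$ and correlation coefficient $\kappa_{ws}$. From (5.3), (5.4) and (5.10), introducing the ‘τ-normalized’
parameters, (4.9)–(4.10), we then obtain
(5.11)  $E(s_{it}^* \mid w_{it} = 1; t, c_i) = \alpha_s + (\gamma_s + \delta_s) t + (\beta_s + \lambda_s - \delta_s) c_i + \kappa_{ws} \tau_s \lambda(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)$
        $= \alpha_s + \tau_s [\gamma_{s\tau} t + \beta_{s\tau} c_i + \kappa_{ws} \lambda(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)]$,
where $\kappa_{ws} \neq 0$ if at least one of $\sigma_{ws}$ and $\omega_{ws}$ is non-zero (barring exact cancellation, $\sigma_{ws} = -\omega_{ws}$). We then find, in a similar
way as for (5.8), that the correct trend and the correct cohort effects are, in general,
non-linear and given by, respectively,
(5.12)  $\partial E(s_{it}^* \mid w_{it} = 1; t, c_i)/\partial t = \gamma_s + \delta_s + \kappa_{ws} \tau_s\, \partial \lambda(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)/\partial t$
        $= \tau_s [\gamma_{s\tau} - \gamma_{w\tau} \kappa_{ws} \xi(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)]$,
(5.13)  $\partial E(s_{it}^* \mid w_{it} = 1; t, c_i)/\partial c_i = \beta_s + \lambda_s - \delta_s + \kappa_{ws} \tau_s\, \partial \lambda(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)/\partial c_i$
        $= \tau_s [\beta_{s\tau} - \beta_{w\tau} \kappa_{ws} \xi(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)]$.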
If $\kappa_{ws} \neq 0$, both derivatives depend on $\tau_s$, $\gamma_{w\tau}$, $\beta_{w\tau}$ and $\bar{w}_\tau$, which implies that the
correct trend is cohort-specific, while the correct cohort effect is time-varying.
What can be said about the sign of the last components in (5.12) and (5.13)?
First, (5.5) implies that $\xi(\gamma_{w\tau} t + \beta_{w\tau} c_i - \bar{w}_\tau)$ is likely to be positive. Second, assume [1] that some common latent individual-specific factors lead to absenteeism
and drop-out from the labour force and hence $\omega_{ws} < 0$, or [2] that some unspecified
time-varying factors also lead to absenteeism and drop-out from the labour force and
hence $\sigma_{ws} < 0$. Either [1] or [2] suggests $\kappa_{ws} < 0$. Third, assume that the trend in
inclusion into (exclusion from) the labour force is negative (positive), i.e., $\gamma_{w\tau} < 0$.
Hence, (5.12) implies that $\partial E(s_{it}^* \mid w_{it} = 1; t, c_i)/\partial t < \gamma_s + \delta_s$.
Conclusion 4: If the sample is censored by labour force participation, the
(theoretical) regression $E(s_{it}^* \mid w_{it} = 1; t, c_i)$ is, in general, non-linear in $(t, c_i)$.
Its form depends on the coefficients of both (2.2)–(2.3) and (3.1)–(3.2), as well
as the distribution of $(u_{it}^w, u_{it}^s)$, as expressed by (5.11) in the binormal case. A
linear (empirical) regression of $s_{it}^*$ on $(t, c_i)$ will result in biased estimation of
the adjusted trend coefficient $\gamma_s + \delta_s$. If both $\sigma_{ws} = 0$ and $\omega_{ws} = 0$ hold, implying
$\kappa_{ws} = 0$, the biases disappear: $\partial E(s_{it}^* \mid w_{it} = 1; t, c_i)/\partial t = \tau_s \gamma_{s\tau} = \gamma_s + \delta_s$ and
$\partial E(s_{it}^* \mid w_{it} = 1; t, c_i)/\partial c_i = \tau_s \beta_{s\tau} = \beta_s + \lambda_s - \delta_s$. In the latter case, (3.5)–(3.6)
form a recursive structure, unconditional on the individual effects: first labour
market participation is decided, next sickness is determined. Conditional on
cohort, but unconditional on the individual effects, there is no feedback from
the latter to the former.
5.3 Conditioning on age
Assume again that (3.1)–(3.2) are part of the model, and let $t$ and $a_{it}$ be the conditioning variables. It follows from (3.3)–(3.6) and (4.6) that only $\gamma_s + \beta_s + \lambda_s$ and
$\delta_s - \beta_s - \lambda_s$ (or one-to-one transformations of them) can be identified. With respect
to the sample, we again distinguish between two cases.
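[A] If the sample were non-censored, the trend coefficient we would have estimated
by regressing $s_{it}^*$ linearly on $(t, a_{it})$ would have been $\gamma_s + \beta_s + \lambda_s$, since then
(5.14)  $E(s_{it}^* \mid t, a_{it}) = \alpha_s + (\gamma_s + \beta_s + \lambda_s) t + (\delta_s - \beta_s - \lambda_s) a_{it}$.
[B] If the sample is censored by labour force participation, the trend coefficient actually
estimated by regressing $s_{it}^*$ on $(t, a_{it})$ differs from $\gamma_s + \beta_s + \lambda_s$. We have⁴
(5.15)  $E(s_{it}^* \mid w_{it} = 1; t, a_{it}) = \alpha_s + (\gamma_s + \beta_s + \lambda_s) t + (\delta_s - \beta_s - \lambda_s) a_{it} + E(u_{it}^s \mid w_{it} = 1; t, a_{it})$.
From (5.3), (5.4), (5.15) and (4.9), we obtain, in the binormal case,
(5.16)  $E(s_{it}^* \mid w_{it} = 1; t, a_{it}) = \alpha_s + (\gamma_s + \beta_s + \lambda_s) t + (\delta_s - \beta_s - \lambda_s) a_{it} + \kappa_{ws} \tau_s \lambda[(\gamma_{w\tau} + \beta_{w\tau}) t - \beta_{w\tau} a_{it} - \bar{w}_\tau]$.
The correct trend and the correct age effects therefore become⁵
(5.17)  $\partial E(s_{it}^* \mid w_{it} = 1; t, a_{it})/\partial t = \gamma_s + \beta_s + \lambda_s + \kappa_{ws} \tau_s\, \partial \lambda[(\gamma_{w\tau} + \beta_{w\tau}) t - \beta_{w\tau} a_{it} - \bar{w}_\tau]/\partial t$
        $= \tau_s \{(\gamma_{s\tau} + \beta_{s\tau}) - (\gamma_{w\tau} + \beta_{w\tau}) \kappa_{ws} \xi[(\gamma_{w\tau} + \beta_{w\tau}) t - \beta_{w\tau} a_{it} - \bar{w}_\tau]\}$,
(5.18)  $\partial E(s_{it}^* \mid w_{it} = 1; t, a_{it})/\partial a_{it} = \delta_s - \beta_s - \lambda_s + \kappa_{ws} \tau_s\, \partial \lambda[(\gamma_{w\tau} + \beta_{w\tau}) t - \beta_{w\tau} a_{it} - \bar{w}_\tau]/\partial a_{it}$
        $= \tau_s \{-\beta_{s\tau} + \beta_{w\tau} \kappa_{ws} \xi[(\gamma_{w\tau} + \beta_{w\tau}) t - \beta_{w\tau} a_{it} - \bar{w}_\tau]\}$.
If $\kappa_{ws} \neq 0$, both derivatives depend on $\tau_s$, $\gamma_{w\tau}$, $\beta_{w\tau}$ and $\bar{w}_\tau$, which implies that the
correct trend is age-specific and the correct age effect is time-varying.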
Conclusion 5: If the sample is censored by labour force participation, the
(theoretical) regression $E(s_{it}^* \mid w_{it} = 1; t, a_{it})$ is, in general, non-linear in $(t, a_{it})$.
Its form depends on the coefficients of both (2.2)–(2.3) and (3.1)–(3.2), as
well as the distribution of $(u_{it}^w, u_{it}^s)$, as given by (5.16) in the binormal case.
A linear (empirical) regression of $s_{it}^*$ on $(t, a_{it})$ will result in biased estimation of the actual trend coefficient $\gamma_s + \beta_s + \lambda_s$ and of the adjusted age coefficient $\delta_s - \beta_s - \lambda_s$. If both $\sigma_{ws} = 0$ and $\omega_{ws} = 0$ hold, implying $\kappa_{ws} = 0$, the
biases disappear: $\partial E(s_{it}^* \mid w_{it} = 1; t, a_{it})/\partial t = \tau_s (\gamma_{s\tau} + \beta_{s\tau}) = \gamma_s + \beta_s + \lambda_s$ and
$\partial E(s_{it}^* \mid w_{it} = 1; t, a_{it})/\partial a_{it} = -\tau_s \beta_{s\tau} = \delta_s - \beta_s - \lambda_s$. Then (3.5)–(3.6) form a
recursive structure, unconditional on the individual effects. Conditional on
cohort, but unconditional on the individual effects, there is no feedback from
sickness to labour force participation.
⁴ This equation, of course, mirrors (5.10).
⁵ These equations mirror (5.12)–(5.13).

6 Models treating sickness as dichotomously observable
6.1 General remarks
Having explored the situation where the degree of absenteeism, $s_{it}^*$, is assumed to be
recorded quantitatively, we next consider models where absenteeism is assumed to
be recorded qualitatively (dichotomously). This may sometimes be a more realistic
assumption. Even if continuous observations are available, the analyst may want
to exploit them only dichotomously for ‘institutional’ reasons, because of measurement
problems which may plague the data collection, suggesting a need for ‘robustifying’
the results, etc. This corresponds to the approach of Biørn et al. (2010). With
respect to the sample, we distinguish between cases [A] and [B], as in Section 5.
[A] Data for all individuals, whether in the labour force or outside, are in the sample.
Then we could want to make inference on trend effects in the sickness probability
from (4.15) or (4.17). If we base inference on (4.17) when (3.1)–(3.2) are part of
the data generating mechanism, using standard binomial logit or probit analysis –
and hence conditioning on $c_i$ or $a_{it}$ – we would estimate τ-normalized coefficients.
If we base inference on (4.15), using binomial logit or probit analysis – and hence
conditioning on $\mu_{is\sigma}$ – we would estimate σ-normalized coefficients.⁶ Derivatives of
the (log-)probabilities, ‘marginal effects’, could be estimated from either.
[B] The sample is only from individuals being in the labour force. Then, to obtain
valid inference on trend effects in the sickness probability, we should account for
the implicit censoring. Again, we could only obtain inference on τ- or σ-normalized
coefficients. Since the relevant sickness-absence probabilities underlying our binary
response data are conditional on $w_{it} = 1$, they are of the form (4.16) or (4.18). When
conditioning on $\mu_{is\sigma}$ ($\mu_{is\sigma}^{\dagger}$), we obtain more robust inference on the trend in the
sickness probability than when conditioning on $c_i$ ($a_{it}$).
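To see this we differentiate the relevant expressions for the conditional log-probability of absenteeism with respect to time and the other relevant covariates.
Let $\Psi_u(u; b) = \int_b^\infty \psi(u, v)\, dv$ and $\Psi_v(v; a) = \int_a^\infty \psi(u, v)\, du$, and write (4.1) as
(6.1)  $f(a, b) = \int_a^\infty \Psi_u(u; b)\, du = \int_b^\infty \Psi_v(v; a)\, dv \implies \partial f(a, b)/\partial a = -\Psi_u(a; b), \quad \partial f(a, b)/\partial b = -\Psi_v(b; a)$.
Now differentiation of (4.2) gives
(6.2)  $\dfrac{\partial g(a, b)}{\partial a} = -g(a, b)\, G_a(a, b) \iff \dfrac{\partial \ln g(a, b)}{\partial a} = -G_a(a, b)$,
(6.3)  $\dfrac{\partial g(a, b)}{\partial b} = -g(a, b)\, G_b(a, b) \iff \dfrac{\partial \ln g(a, b)}{\partial b} = -G_b(a, b)$,
where
$G_a(a, b) = \dfrac{\Psi_u(a; b)}{f(a, b)} - \dfrac{\Psi_u(a; -\infty)}{f(a, -\infty)} = \dfrac{\int_b^\infty \psi(a, v)\, dv}{\int_a^\infty \int_b^\infty \psi(u, v)\, dv\, du} - \dfrac{\int_{-\infty}^\infty \psi(a, v)\, dv}{\int_a^\infty \int_{-\infty}^\infty \psi(u, v)\, dv\, du}$,
$G_b(a, b) = \dfrac{\Psi_v(b; a)}{f(a, b)} = \dfrac{\int_a^\infty \psi(u, b)\, du}{\int_a^\infty \int_b^\infty \psi(u, v)\, dv\, du}$.
⁶ In both cases the non-normalized coefficients are non-identifiable when only discrete information is exploited since no metric for $(w_{it}^*, s_{it}^*)$ and $(\bar{w}, \bar{s})$ is exploited.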
6.2 Conditioning on individual effect
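It follows by combining (6.2)–(6.3) with (4.16) that the derivative of the log-probability of absenteeism with respect to time is
$$\begin{aligned}
\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, \mu_{iw\sigma}, \mu_{is\sigma})}{\partial t}
&= \frac{\partial \ln g(\bar w_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma},\; \bar s_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma})}{\partial t} \\
&= \gamma_{s\sigma}\, G_b(\bar w_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma},\; \bar s_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma}) \\
&\quad + \gamma_{w\sigma}\, G_a(\bar w_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma},\; \bar s_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma}).
\end{aligned} \tag{6.4}$$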
The first term after the last equality sign represents the direct effect of the trend
in absenteeism – mirroring the effect of the trend term in (2.6). It is positive when
γsσ > 0 since Gb (w̄σ−γwσ t−µiwσ , s̄σ−γsσ t−µisσ ) is positive. The second term represents
the indirect effect, via the trend in the ability to work and dropping out of the labour
market – mirroring the effect of the trend term in (2.5). It is negative if γwσ < 0,
since Ga (w̄σ −γwσ t−µiwσ , s̄σ −γsσ t−µisσ ) is, most likely, positive.
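The step from (6.2)–(6.3) to (6.4) is an application of the chain rule to the two arguments of g. The following sketch spells it out; the shorthand a(t) and b(t) for the two arguments is introduced here for illustration only and is not the paper's notation.

```latex
\begin{aligned}
a(t) &= \bar w_\sigma - \gamma_{w\sigma} t - \mu_{iw\sigma},
  &\quad \frac{\partial a}{\partial t} &= -\gamma_{w\sigma}, \\
b(t) &= \bar s_\sigma - \gamma_{s\sigma} t - \mu_{is\sigma},
  &\quad \frac{\partial b}{\partial t} &= -\gamma_{s\sigma}, \\
\frac{\partial \ln g(a(t), b(t))}{\partial t}
  &= \frac{\partial \ln g}{\partial a}\,\frac{\partial a}{\partial t}
   + \frac{\partial \ln g}{\partial b}\,\frac{\partial b}{\partial t} \\
  &= \bigl(-G_a\bigr)\bigl(-\gamma_{w\sigma}\bigr)
   + \bigl(-G_b\bigr)\bigl(-\gamma_{s\sigma}\bigr)
   = \gamma_{w\sigma} G_a + \gamma_{s\sigma} G_b .
\end{aligned}
```

The two minus signs cancel in each term, so the sign of each effect is the sign of the trend coefficient times the sign of the corresponding G function.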
6.3 Conditioning on cohort or age
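Combining (6.2)–(6.3) with (4.18), it follows, likewise, that
$$\begin{aligned}
\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, c_i)}{\partial t}
&= \frac{\partial \ln g(\bar w_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i,\; \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i)}{\partial t} \\
&= \gamma_{s\tau}\, G_b(\bar w_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i,\; \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i) \\
&\quad + \gamma_{w\tau}\, G_a(\bar w_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i,\; \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i),
\end{aligned} \tag{6.5}$$
$$\begin{aligned}
\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, c_i)}{\partial c_i}
&= \frac{\partial \ln g(\bar w_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i,\; \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i)}{\partial c_i} \\
&= \beta_{s\tau}\, G_b(\bar w_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i,\; \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i) \\
&\quad + \beta_{w\tau}\, G_a(\bar w_\tau - \gamma_{w\tau} t - \beta_{w\tau} c_i,\; \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i),
\end{aligned} \tag{6.6}$$
or equivalently
$$\begin{aligned}
\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, a_{it})}{\partial t}
&= \frac{\partial \ln g(\bar w_\tau - (\gamma_{w\tau}+\beta_{w\tau}) t + \beta_{w\tau} a_{it},\; \bar s_\tau - (\gamma_{s\tau}+\beta_{s\tau}) t + \beta_{s\tau} a_{it})}{\partial t} \\
&= (\gamma_{s\tau}+\beta_{s\tau})\, G_b(\bar w_\tau - (\gamma_{w\tau}+\beta_{w\tau}) t + \beta_{w\tau} a_{it},\; \bar s_\tau - (\gamma_{s\tau}+\beta_{s\tau}) t + \beta_{s\tau} a_{it}) \\
&\quad + (\gamma_{w\tau}+\beta_{w\tau})\, G_a(\bar w_\tau - (\gamma_{w\tau}+\beta_{w\tau}) t + \beta_{w\tau} a_{it},\; \bar s_\tau - (\gamma_{s\tau}+\beta_{s\tau}) t + \beta_{s\tau} a_{it}),
\end{aligned} \tag{6.7}$$
$$\begin{aligned}
\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, a_{it})}{\partial a_{it}}
&= \frac{\partial \ln g(\bar w_\tau - (\gamma_{w\tau}+\beta_{w\tau}) t + \beta_{w\tau} a_{it},\; \bar s_\tau - (\gamma_{s\tau}+\beta_{s\tau}) t + \beta_{s\tau} a_{it})}{\partial a_{it}} \\
&= -\beta_{s\tau}\, G_b(\bar w_\tau - (\gamma_{w\tau}+\beta_{w\tau}) t + \beta_{w\tau} a_{it},\; \bar s_\tau - (\gamma_{s\tau}+\beta_{s\tau}) t + \beta_{s\tau} a_{it}) \\
&\quad - \beta_{w\tau}\, G_a(\bar w_\tau - (\gamma_{w\tau}+\beta_{w\tau}) t + \beta_{w\tau} a_{it},\; \bar s_\tau - (\gamma_{s\tau}+\beta_{s\tau}) t + \beta_{s\tau} a_{it}).
\end{aligned} \tag{6.8}$$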
Again, the first terms after the last equality signs represent the direct effects, while
the second terms represent the indirect effects, via the ability to work and dropping
out of the labour market.
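The equivalence between the cohort form (6.5)–(6.6) and the age form (6.7)–(6.8) rests on a single substitution. The sketch below assumes, as the arguments of g in (6.7)–(6.8) imply, that age is time minus cohort, and shows the substitution for the sickness argument; the work-ability argument is treated identically with the subscript s replaced by w.

```latex
\begin{aligned}
c_i &= t - a_{it}, \\
\bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau} c_i
  &= \bar s_\tau - \gamma_{s\tau} t - \beta_{s\tau}\,(t - a_{it})
   = \bar s_\tau - (\gamma_{s\tau} + \beta_{s\tau})\, t + \beta_{s\tau}\, a_{it}, \\
\frac{\partial}{\partial t}\Bigl[\bar s_\tau - (\gamma_{s\tau} + \beta_{s\tau})\, t + \beta_{s\tau}\, a_{it}\Bigr]
  &= -(\gamma_{s\tau} + \beta_{s\tau}), \qquad
\frac{\partial}{\partial a_{it}}\Bigl[\bar s_\tau - (\gamma_{s\tau} + \beta_{s\tau})\, t + \beta_{s\tau}\, a_{it}\Bigr]
   = \beta_{s\tau}.
\end{aligned}
```

Holding age fixed, a one-unit increase in t therefore also moves the individual one cohort forward, which is why the composite coefficient γ + β replaces γ in (6.7), and why the age derivatives in (6.8) carry the opposite sign of the cohort derivatives in (6.6).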
6.4 The recursive case and a synthesis
It is illuminating to compare the last five expressions with those obtained when the structure is recursive, i.e., ρws = 0 or τws = 0. We then have
$$g(a,b) = \int_b^\infty \phi(v)\,dv = 1-\Phi(b),$$
which implies (see footnote 7):
$$G_a(a,b) = 0, \qquad G_b(a,b) = \frac{\phi(b)}{1-\Phi(b)} = \frac{\phi(-b)}{\Phi(-b)} = \lambda(-b).$$
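Then (6.4)–(6.8) are simplified to
ρws = 0 (Recursivity conditional on individual effects) =⇒
$$\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, \mu_{iw\sigma}, \mu_{is\sigma})}{\partial t} = \frac{\partial \ln P(s_{it}=1;\, t, \mu_{is\sigma})}{\partial t} = \gamma_{s\sigma}\, \lambda(\gamma_{s\sigma} t + \mu_{is\sigma} - \bar s_\sigma),$$
τws = 0 (Recursivity conditional on cohort) =⇒
$$\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, c_i)}{\partial t} = \frac{\partial \ln P(s_{it}=1;\, t, c_i)}{\partial t} = \gamma_{s\tau}\, \lambda(\gamma_{s\tau} t + \beta_{s\tau} c_i - \bar s_\tau),$$
$$\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, c_i)}{\partial c_i} = \frac{\partial \ln P(s_{it}=1;\, t, c_i)}{\partial c_i} = \beta_{s\tau}\, \lambda(\gamma_{s\tau} t + \beta_{s\tau} c_i - \bar s_\tau),$$
τws = 0 (Recursivity conditional on age) =⇒
$$\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, a_{it})}{\partial t} = \frac{\partial \ln P(s_{it}=1;\, t, a_{it})}{\partial t} = (\gamma_{s\tau}+\beta_{s\tau})\, \lambda((\gamma_{s\tau}+\beta_{s\tau}) t - \beta_{s\tau} a_{it} - \bar s_\tau),$$
$$\frac{\partial \ln P(s_{it}=1 \mid w_{it}=1;\, t, a_{it})}{\partial a_{it}} = \frac{\partial \ln P(s_{it}=1;\, t, a_{it})}{\partial a_{it}} = -\beta_{s\tau}\, \lambda((\gamma_{s\tau}+\beta_{s\tau}) t - \beta_{s\tau} a_{it} - \bar s_\tau).$$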
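The recursive-case formulas can be verified with a few lines of code. The sketch below is an illustration under stated assumptions: φ and Φ are taken to be the standard normal density and c.d.f., λ(x) = φ(x)/Φ(x) as implied by the expression for G_b, and all parameter values are hypothetical. It compares the closed-form derivatives in the cohort and age parametrizations with central finite differences of ln P(sit = 1) = ln Φ(γsτ t + βsτ ci − s̄τ).

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mills(x):
    """lambda(x) = phi(x) / Phi(x)."""
    return phi(x) / Phi(x)

# Hypothetical normalized coefficients and evaluation point.
gamma_s, beta_s, s_bar = 0.05, 0.02, 1.5
t, c = 10.0, 3.0
age = t - c          # assumes age = time - cohort
h = 1e-5

def log_p_cohort(t, c):
    """ln P(s=1; t, c) = ln[1 - Phi(b)], b = s_bar - gamma_s*t - beta_s*c."""
    return math.log(Phi(gamma_s * t + beta_s * c - s_bar))

def log_p_age(t, age):
    """Same probability written in terms of (t, age)."""
    return log_p_cohort(t, t - age)

index = gamma_s * t + beta_s * c - s_bar

# Closed forms from the recursive case.
trend_cohort = gamma_s * mills(index)            # d/dt, cohort held fixed
cohort_eff = beta_s * mills(index)               # d/dc
trend_age = (gamma_s + beta_s) * mills(index)    # d/dt, age held fixed
age_eff = -beta_s * mills(index)                 # d/d(age)

# Central finite differences of the log-probability.
fd_trend_cohort = (log_p_cohort(t + h, c) - log_p_cohort(t - h, c)) / (2 * h)
fd_cohort = (log_p_cohort(t, c + h) - log_p_cohort(t, c - h)) / (2 * h)
fd_trend_age = (log_p_age(t + h, age) - log_p_age(t - h, age)) / (2 * h)
fd_age = (log_p_age(t, age + h) - log_p_age(t, age - h)) / (2 * h)

print(trend_cohort, fd_trend_cohort)
print(cohort_eff, fd_cohort)
print(trend_age, fd_trend_age)
print(age_eff, fd_age)
```

The two trend derivatives differ by exactly the cohort effect: the age-conditional trend equals the cohort-conditional trend plus βsτ λ(·), which is the composite-trend point made in the text.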
Conclusion 6: When we condition on (a) the individual latent heterogeneity
or (b) cohort (or age) only, we should account for sample truncation when formulating the appropriate response probabilities as functions of covariates and the
likelihood functions for estimating trends in absenteeism, except when σws = 0
(in case (a)) and ωws = σws = 0 (in case (b)). The correct form of
the likelihood function will, in the general case, reflect the mixture of discrete
choice and sample truncation.
Footnote 7: If u and v are independent, then ψ(u, v) = φ(u)φ(v), f(a, b) = [1−Φ(a)][1−Φ(b)] and g(a, b) = 1−Φ(b) for all a, where φ(·) and Φ(·) are the univariate density and the c.d.f. of u or v [confer Section 4.1]. Then Ψu(u; b) = [1−Φ(b)]φ(u), Ψv(v; a) = [1−Φ(a)]φ(v), and hence Ga(a, b) = 0, Gb(a, b) = φ(b)/[1−Φ(b)].
7 Concluding remarks
We have in this paper presented a simple model framework for analyzing jointly the degree of sickness and the degree of work ability, with two interacting kinds of latent heterogeneity, one related to absenteeism (sickness absence), the other related to ability to work. Obtaining valid inference on trend effects has been the main focus of the paper. Sometimes cohort effects or age effects can also be uncovered. We have shown that the correlation pattern of the two kinds of latent heterogeneity is important. Treating the two decisions as recursive may not always be the answer, and neglecting the sample selection may obscure the interpretation of the estimated coefficients.
An overall conclusion, somewhat related to and extending conclusions derived for bivariate ‘Tobit models’ in the literature, is that when we stick to linear regression, the conditions which need to be satisfied for estimated composite trends (time effects) to be unbiased are stronger when the other covariate (conditioning variable) is cohort or age than when we condition on individual effects (and, by implication, eliminate any relationship between individual heterogeneity and cohort). When we condition on individual effects, the genuine disturbances in the underlying sickness equation and work ability equation should be uncorrelated. When we condition on cohort or age, a kind of ‘double recursivity’ should hold: both the genuine disturbances and the latent individual effects in the two equations should be uncorrelated. Inference on sickness absence trends obtained by linear regression with fixed individual effects (additive shifts in the intercept) included may therefore be characterized as more robust than that obtained when including only cohort or age as regressors and throwing all heterogeneity into (gross) disturbances. Essentially, these conclusions also carry over to the case where absenteeism is only observed dichotomously.
Natural, and rather straightforward, extensions, not elaborated in the paper,
could be to replace the time, cohort and age variables by corresponding time, cohort
and age dummies. Genuine ‘economic regressors’ could also be included, formally
as extensions of the models’ intercepts, except that no such regressor could be individual specific (that is, constant over time for each individual), since it would then be perfectly collinear with the individual effects.
Neither could, for a similar reason, time specific regressors be included in models
where time dummies replace the continuous time variable.
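The collinearity argument can be made concrete with a toy design matrix. The sketch below is illustrative only: the panel dimensions, the regressor values and all variable names are hypothetical. It shows that an individual-specific (time-invariant) regressor is an exact linear combination of the individual dummies, and that a time-specific regressor is likewise an exact linear combination of time dummies.

```python
# Hypothetical balanced panel: N individuals observed in T periods.
N, T = 3, 4
x_individual = [2.5, -1.0, 4.0]      # hypothetical regressor, constant over time
z_time = [0.1, 0.4, 0.9, 1.6]        # hypothetical regressor, common to all individuals

rows = [(i, t) for i in range(N) for t in range(T)]

# Dummy columns: one indicator per individual, one indicator per period.
ind_dummies = [[1.0 if i == j else 0.0 for j in range(N)] for (i, t) in rows]
time_dummies = [[1.0 if t == s else 0.0 for s in range(T)] for (i, t) in rows]

# Regressor columns as they would enter the regression.
x_column = [x_individual[i] for (i, t) in rows]
z_column = [z_time[t] for (i, t) in rows]

# Linear combinations of the dummies, weighted by the regressor values.
x_from_dummies = [sum(x_individual[j] * d[j] for j in range(N)) for d in ind_dummies]
z_from_dummies = [sum(z_time[s] * d[s] for s in range(T)) for d in time_dummies]

x_gap = max(abs(u - v) for u, v in zip(x_column, x_from_dummies))
z_gap = max(abs(u - v) for u, v in zip(z_column, z_from_dummies))
print("max |x - dummy combination| =", x_gap)
print("max |z - dummy combination| =", z_gap)
```

Both gaps are zero, so the coefficient of such a regressor cannot be separated from the dummy coefficients: any change in it can be offset by shifting the corresponding individual or time effects.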
References
Amemiya, T. (1985): Advanced Econometrics. Cambridge (Ma.): Harvard University Press.
Biørn, E. (2008): Økonometriske emner. En videreføring. Oslo: Unipub.
Biørn, E., Gaure, S., Markussen, S., and Røed, K. (2010): The Rise in Absenteeism: Disentangling the Impacts of Cohort, Age and Time. IZA Discussion Paper No. 5091. IZA, Bonn.
Cameron, A.C. and Trivedi, P.K. (2005): Microeconometrics. Methods and Applications. Cambridge: Cambridge University Press.
Johnson, N.L., Kotz, S., and Balakrishnan, N. (1994): Continuous Univariate Distributions, Volume 1, Second Edition. New York: Wiley.