Applying the Lee-Carter model to countries in
Eastern Europe and the former Soviet Union
DIPLOMA THESIS
HANNA SCHERP
2007-11-27
Contents
Introduction
1 Mortality data
1.1 Mortality history and trends in Eastern Europe and the former Soviet Union
2 Modelling mortality
2.1 Earlier experience of the Lee-Carter model
2.2 Lee-Carter, the model
2.3 Estimating the parameters
2.3.1 Estimating the parameters using Weighted Least Squares
3 Fitting and applying the Lee-Carter model
3.1 Data
3.2 Fitting the parameters
3.3 Eastern Europe; Bulgaria, Czech Republic, Hungary and Slovakia
3.3.1 Comments to the results for the other countries in Eastern Europe
3.4 Former Soviet Union; Baltic States, Ukraine and Russia
3.4.1 Baltic States
3.4.2 Russia and Ukraine
3.5 Increasing time trend
4 Fitting the Lee-Carter model excluding parts of the data
4.1 The Czech Republic
4.2 Latvia
4.3 Ukraine
5 Discussion
References
Appendix 1
Appendix 2
Introduction
In this paper, the Lee-Carter method is used to model the mortality pattern of some
countries in Eastern Europe and the former Soviet Union. The aim is to study how well the
method fits the mortality pattern in this region, where the development in mortality differs
markedly from that in Western Europe. Very few articles apply the Lee-Carter model to
mortality data from Eastern Europe or the former Soviet Union. The reason may be that the
mortality pattern is so complex for some of the countries that the possibility of modelling
and forecasting the mortality with this method seems remote.
The first Section gives a background of the development in mortality and life expectancy for
the studied countries. The data source used is the “Human Mortality Database”
(www.mortality.org). In this database mortality statistics can be found for the following
countries in Eastern Europe and the former Soviet Union: Bulgaria, Czech Republic, Estonia,
Hungary, Latvia, Lithuania, Russia, Slovakia and Ukraine.
In Section 2 the Lee-Carter model is described and in Section 3 the model is applied to the
mortality data. To get a better fit of the model some extreme observations are excluded from
the data, and the results are shown in Section 4. A summary and discussion are found in
Section 5.
1 Mortality Data
The Human Mortality Database provides detailed mortality data for around 30 countries:
live birth counts, death counts, population size, population exposed to risk, death
rates and life tables. In this paper we use the death rates for all countries in Eastern
Europe and the former Soviet Union represented in the Human Mortality Database. The
database was launched in May 2002 and is a collaborative project involving research teams in
the Department of Demography at the University of California, Berkeley (USA), and at the
Max Planck Institute for Demographic Research in Rostock (Germany). It enjoys financial
support from the National Institute on Aging (USA), and receives technical advice and
assistance from a long list of international collaborators. The database is freely available
on the internet at www.mortality.org.
Mortality data for the countries studied are available in the Human Mortality Database
from the late 40s or the 50s onwards, so the mortality trends are studied from the end of
World War II. There is a major divergence in mortality between Western and Eastern Europe.
1.1 Mortality history and trends in Eastern Europe and the former
Soviet Union
Mortality in the whole of Europe declined after the war, mostly due to the use of antibiotics,
general immunisation and an overall improvement in health. The countries in Eastern Europe,
which had earlier had very high mortality rates, were approaching the mortality of the
Nordic countries and Western Europe, where mortality had been the lowest in Europe.
But in the mid 60s the decrease in mortality rates stagnated. This was mainly caused
by an increase in circulatory diseases, traffic accidents and deaths caused by alcoholism. The
stagnation was observed in all European countries, but in the Western part the
trend did not last: in the early 70s mortality rates were decreasing again. For the countries in
Eastern Europe the decrease in mortality levelled out or even turned into an increase. This
divergence between Western and Eastern Europe lasted from the 70s for some 30 years.
There are differences among the countries in Eastern Europe and the former Soviet Union,
but also many related patterns.
Life expectancy in Estonia, Latvia, Lithuania, Ukraine and Russia followed a similar
pattern from 1965 until the mid 90s. For some years in the mid 80s the mortality rates
decreased in the former Soviet Union as a result of Gorbachev’s anti-alcohol campaign, but
this did not last. Mortality increased again at the beginning of the 90s because of the
economic crisis that followed the transition to a market economy. When the economic crisis
was over, the mortality rate decreased again, but in Russia and Ukraine this improvement was
only temporary, while in the Baltic States the positive trend continues (France Meslé, INED,
Paris 2002). One interesting thing to notice is the ethnic differences in Estonia. The two main
ethnic groups in Estonia are Estonians and Russians. In the late 80s, just before Estonia
became independent (1991), the ethnic differences were not statistically significant, but by
2000 mortality among Estonians had fallen by 4% for males and 11% for females, while
mortality among Russians had increased by 24% for men and 1% for women. The causes of
death where the differences are largest are alcoholic liver cirrhosis, alcohol poisoning,
homicide, influenza and pneumonia (Mall Leinsalu, 2004).
Figure 1 shows the development in life expectancy at birth in the Baltic States, Russia and
Ukraine, together with Swedish data for comparison.
Until the mid 80s the life expectancy trends in Bulgaria, Czech Republic, Hungary and
Slovakia were very similar. In contrast to life expectancy in the Baltic States and the rest of
the former Soviet Union, life expectancy has increased since 1988 in the Czech Republic, since
1992 in Slovakia and since 1993 in Hungary. In Bulgaria, however, the trend did not turn
upwards until the end of the 90s. This holds for the total population. Looking at females
and males separately, one can see that the trend for females is increasing for almost the whole
period (France Meslé, INED, Paris 2002).
Figure 2 presents the life expectancy for females and males respectively, compared with the
Swedish data.
Figure 1 Life expectancy at birth in the Former Soviet Union compared to Sweden.
Figure 2 Life expectancy at birth in Eastern Europe compared to Sweden.
2 Modelling mortality
In 1992 Ronald Lee and Lawrence Carter [1] proposed a method for modelling and forecasting
U.S. mortality. This method has been widely used for modelling and forecasting mortality in
several countries. One reason for its popularity is probably the simplicity of the model and the
straightforward way of forecasting mortality once the model is fitted. The model is based
on the past experience of age-specific mortality data. It is used in this paper
to fit the historical mortality rates in Eastern Europe and the former Soviet Union for those
countries where statistics are found in the Human Mortality Database. Due to the complexity of
the long-term trends in mortality for those countries we can expect some problems when using
the model.
2.1 Earlier experience of the Lee-Carter model
The Lee-Carter method has been used for fitting and forecasting mortality in a large number
of developed countries, for example the G7 countries [2] (Canada, France,
Germany, Italy, Japan, UK and US), Australia [3], Portugal [4], Canada [5], Chile [6], Japan [7],
Brazil [8], Austria [9], the Nordic countries [10] and England and Wales [11]. For most of these
countries the Lee-Carter model fits very well, but for some it does not. Examples of nations
where the model does not give a good fit are Australia, the UK and Austria. The problem arises
when the historical age pattern of mortality rates varies over time, although the model can
still give a good fit where the mortality pattern has fluctuated or mortality has decreased at
a varying pace.
A number of articles use different extensions of the model to obtain a better fit.
However, there is also criticism of the Lee-Carter model.
In 2005 Girosi and King criticised the model for “insufficiently appreciated properties”.
Despite the criticism, the model, which was originally designed for US mortality,
is now used by the US Bureau of the Census.
There is one article [12] in which the model is applied to Hungarian data. The authors
extend the model, but their conclusion is that this does not solve the problem and they
are not able to achieve a good fit for the whole studied period 1949-2003.
One has to keep in mind that the Lee-Carter model was designed for U.S.
population mortality, and one cannot demand that the model work for every other
country.
[1] Lee & Carter (1992)
[2] Shripad Tuljapurkar, Nan Li & Carl Boe (2000)
[3] Heather Booth, John Maindonald & Len Smith (2002)
[4] E. Coelho (2001)
[5] Lee & Nault (1993)
[6] Lee & Rofman (1994)
[7] Wilmoth (1998)
[8] Fígoli (1998)
[9] Carter & Prskawetz (2001)
[10] Koissi, Shapiro and Högnäs (2004)
[11] Renshaw & Haberman (2003b)
[12] S. Baran, J. Gáll, M. Ispány & G. Pap (2006)
2.2 Lee-Carter, the model
The model that Lee and Carter suggested in 1992 is
$$\log q_x(t) = a_x + b_x k_t + \varepsilon_x(t),$$
where
$q_x(t)$ is the probability of death at age x in year t.
$a_x$ is the average of $\log q_x(t)$ over time t, which describes the general pattern of mortality by age.
$k_t$ is the time trend for the general mortality.
$b_x$ indicates the sensitivity of $\log q_x(t)$ at age x as $k_t$ varies.
$\varepsilon_x(t)$ is the residual term at age x and time t.
2.3 Estimating the parameters
The estimation of $a_x$ is straightforward because it is the average of $\log q_x(t)$ over time t. The
parameters $b_x$ and $k_t$ cannot be solved for explicitly, and the model cannot be fitted with
ordinary regression methods because the right-hand side contains no observed regressors.
Several methods can be used. In the original paper (Lee, Carter 1992) the singular value
decomposition (SVD) is used to find a least squares solution.
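As an illustration of the SVD approach mentioned above, the following is a minimal Python sketch, not the code used in this paper. It assumes that the log death probabilities are stored in a NumPy array `log_q` with one row per age and one column per calendar year. The function name `fit_lee_carter_svd` is hypothetical, and the scaling (sum of squared $b_x$ equal to one, as in the restrictions of Section 2.3.1) is our own choice for the sketch.

```python
import numpy as np

def fit_lee_carter_svd(log_q):
    """Rank-one least squares fit of log_q[x, t] ~ a[x] + b[x] * k[t]."""
    a = log_q.mean(axis=1)                  # a_x: average of log q_x(t) over time
    z = log_q - a[:, None]                  # centred matrix, each row sums to zero
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    b = u[:, 0]                             # first left singular vector, sum(b**2) == 1
    k = s[0] * vt[0]                        # time trend, sums to zero like the rows of z
    if b.sum() < 0:                         # the sign of an SVD pair is arbitrary
        b, k = -b, -k
    return a, b, k
```

The sign convention in the last step is only one possible choice; the fitted values $a_x + b_x k_t$ are the same whichever sign is kept.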
The estimation approach used here is weighted least squares. Another approach is
maximum likelihood estimation, which makes it possible to use a weight equal to the
observed number of deaths at age x in year t. Koissi, Shapiro and Högnäs [13] compared these
methods in their paper “Fitting and forecasting mortality rates for Nordic countries using the
Lee-Carter method”. They found that the different methods gave nearly identical results. In
this paper the aim is to see whether the Lee-Carter model can be applied to the data at all, so
the choice of method should be of minor importance.
2.3.1 Estimating the parameters using Weighted Least Squares [14]
The one-year death probabilities are observed for n calendar years and m ages or age groups.
The model is
$$\log q_x(t) = a_x + b_x k_t + \varepsilon_x(t),$$
and we need to estimate $a_x$, $b_x$ and $k_t$.
To achieve a unique solution the following restrictions are used,
[13] Koissi, Shapiro and Högnäs (2004)
[14] ”Skatta parametrarna i Lee-Carters modell för dödlighet” (“Estimating the parameters in the Lee-Carter model for mortality”), Bengt von Bahr. Translated, with own derivations.
$$\sum_{x=x_{\min}}^{x_m} b_x^2 = 1 \qquad \text{(a)}$$
$$\sum_{t=t_{\min}}^{t_n} k_t = 0 \qquad \text{(b)}$$
These restrictions do not modify the model. If we replace $b_x$ with $b_x^* = c\,b_x$, then
$k_t$ can be replaced by $k_t^* = k_t/c$ and the model is unchanged.
To estimate the parameters we choose the values that minimize Q,
$$Q = \sum_{x,t} (a_x + b_x k_t - m_{xt})^2,$$
where $m_{xt} = \log q_x(t)$ and Q is subject to the constraints (a) and (b) above.
To find values that minimize Q we introduce the Lagrange multipliers $\lambda$ and $\mu$ and minimize
$$R = Q + \lambda \sum_t k_t + \mu \sum_x b_x^2.$$
Thus we first take the derivative of R with respect to $a_x$, $k_t$ and $b_x$ respectively:
$$\frac{dR}{da_x} = 2\sum_t (a_x + b_x k_t - m_{xt}) \quad \text{for every } x,$$
$$\frac{dR}{dk_t} = 2\sum_x b_x (a_x + b_x k_t - m_{xt}) + \lambda \quad \text{for every } t,$$
$$\frac{dR}{db_x} = 2\sum_t k_t (a_x + b_x k_t - m_{xt}) + 2\mu b_x \quad \text{for every } x.$$
The derivatives are set equal to zero:
$$\frac{dR}{da_x} = 0, \qquad \frac{dR}{dk_t} = 0 \qquad \text{and} \qquad \frac{dR}{db_x} = 0.$$
We solve the first equation:
$$2\sum_t (a_x + b_x k_t - m_{xt}) = 0 \;\Rightarrow\; n a_x + b_x \sum_t k_t - \sum_t m_{xt} = 0.$$
Since $\sum_{t=t_{\min}}^{t_n} k_t = 0$, this gives
$$n a_x - \sum_t m_{xt} = 0 \;\Rightarrow\; a_x = \frac{1}{n}\sum_t m_{xt}. \qquad (1)$$
The estimate for $a_x$ is thus computed as the average over time of the logarithm of the
death probabilities, which corresponds to the definition of $a_x$ in Section 2.2.
We now define $z_{xt} = m_{xt} - a_x$, so that
$$\sum_t z_{xt} = 0 \quad \text{for every } x.$$
We can now rewrite the second equation:
$$2\sum_x b_x (a_x + b_x k_t - m_{xt}) + \lambda = 2\sum_x b_x (b_x k_t - z_{xt}) + \lambda.$$
If we set the new expression of the derivative with regard to $k_t$ equal to zero we have
$$2\sum_x b_x (b_x k_t - z_{xt}) + \lambda = 0 \;\Rightarrow\; \sum_x b_x (b_x k_t - z_{xt}) = -\lambda/2 \;\Rightarrow\; k_t \sum_x b_x^2 - \sum_x b_x z_{xt} = -\lambda/2.$$
Since $\sum_{x=x_{\min}}^{x_m} b_x^2 = 1$, this gives
$$k_t = \sum_x b_x z_{xt} - \lambda/2. \qquad (2)$$
Taking the sum over t in equation (2) above we get
$$\sum_t k_t = \sum_t \Big( \sum_x b_x z_{xt} - \lambda/2 \Big) = \sum_x b_x \sum_t z_{xt} - n\lambda/2.$$
Since $\sum_{t=t_{\min}}^{t_n} k_t = 0$ and $\sum_t z_{xt} = 0$, it follows that $\lambda = 0$.
We now have an expression for $k_t$ by putting $\lambda = 0$ in equation (2). Thus
$$k_t = \sum_x b_x z_{xt} \quad \text{for every } t. \qquad (3)$$
The constraint for $k_t$ is now fulfilled.
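If we use the expression $z_{xt}$ in the third and last equation,
$$2\sum_t k_t (a_x + b_x k_t - m_{xt}) + 2\mu b_x = 2\sum_t k_t (k_t b_x - z_{xt}) + 2\mu b_x,$$
and set this equal to zero we have
$$2\sum_t k_t (k_t b_x - z_{xt}) + 2\mu b_x = 0 \;\Rightarrow\; b_x \Big( \sum_t k_t^2 + \mu \Big) = \sum_t k_t z_{xt} \quad \text{for every } x. \qquad (4)$$
We take the square of both sides and sum over x and get
$$\sum_x \Big( b_x \Big( \sum_t k_t^2 + \mu \Big) \Big)^2 = \sum_x \Big( \sum_t k_t z_{xt} \Big)^2 \;\Rightarrow\; \Big( \sum_t k_t^2 + \mu \Big)^2 \sum_x b_x^2 = \sum_x \Big( \sum_t k_t z_{xt} \Big)^2.$$
Since $\sum_{x=x_{\min}}^{x_m} b_x^2 = 1$, this gives
$$\Big( \sum_t k_t^2 + \mu \Big)^2 = \sum_x \Big( \sum_t k_t z_{xt} \Big)^2,$$
and taking the square root of both sides,
$$\sum_t k_t^2 + \mu = \sqrt{\sum_x \Big( \sum_t k_t z_{xt} \Big)^2} \;\Rightarrow\; \mu = -\sum_t k_t^2 + \sqrt{\sum_x \Big( \sum_t k_t z_{xt} \Big)^2}.$$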
We are now able to get an expression for $b_x$ by inserting the equation for $\mu$ in (4):
$$b_x \Big( \sum_t k_t^2 + \mu \Big) = \sum_t k_t z_{xt} \;\Rightarrow\; b_x \Big( \sum_t k_t^2 - \sum_t k_t^2 + \sqrt{\sum_x \Big( \sum_t k_t z_{xt} \Big)^2} \Big) = \sum_t k_t z_{xt}$$
$$\Rightarrow\; b_x = \frac{\sum_t k_t z_{xt}}{\sqrt{\sum_x \big( \sum_t k_t z_{xt} \big)^2}} \quad \text{for every } x. \qquad (5)$$
The estimate of $a_x$ is easily calculated from the observed one-year death probabilities. The
equations for estimating $b_x$ and $k_t$, (3) and (5), are complicated to solve explicitly, but it is
possible to find a solution with a rather low number of iterations. To get a starting value for
$b_x$, we assume that $b_x$ is independent of x and equal to $1/m$, where m is the number of ages
or age groups.
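The iteration described above can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the implementation used for the results in this paper: the input `log_q` is an (ages x years) NumPy array of log death probabilities, all observations carry equal weight in Q, and the function name, tolerance and iteration cap are hypothetical.

```python
import numpy as np

def fit_lee_carter_iterative(log_q, tol=1e-10, max_iter=1000):
    """Alternate between equations (3) and (5) until b_x stops changing."""
    m, n = log_q.shape                      # m ages (or age groups), n calendar years
    a = log_q.mean(axis=1)                  # equation (1): a_x = (1/n) sum_t m_xt
    z = log_q - a[:, None]                  # z_xt = m_xt - a_x
    b = np.full(m, 1.0 / m)                 # starting value: b_x = 1/m for every age
    for _ in range(max_iter):
        k = b @ z                           # equation (3): k_t = sum_x b_x z_xt
        s = z @ k                           # numerator of (5): sum_t k_t z_xt
        b_new = s / np.sqrt(np.sum(s ** 2)) # equation (5), so sum_x b_x^2 = 1
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    k = b @ z                               # final k_t consistent with the final b_x
    return a, b, k
```

Each pass multiplies the current $b_x$ by the matrix of cross products of the $z_{xt}$ and rescales it, so when one pattern dominates the data the sketch settles after a few passes. The constraint on $k_t$ holds automatically because every row of $z_{xt}$ sums to zero.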
In the original paper, Lee and Carter re-estimate $k_t$ so that the
observed number of deaths equals the fitted number of deaths, i.e.
$$D_t = \sum_x \exp(a_x + b_x k_t)\, N_{x,t},$$
where $D_t$ is the total number of deaths in year t and $N_{x,t}$ is the population of age x in year t.
No analytic solution is available, so it can only be done by searching over a range of values of $k_t$.
However, this second stage of estimation does not have any impact on whether the Lee-Carter
model can be fitted to the data or not.
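Although the second stage is not used in this paper, the search it requires can be sketched as follows. The sketch solves the equation above for one year at a time by bisection. The function name, the search interval and the array layout (`exposure` with one row per age and one column per year, `deaths` with one total per year) are assumptions made for the illustration, and the bisection is only valid when the fitted number of deaths increases with $k_t$, that is when all $b_x$ are non-negative.

```python
import numpy as np

def reestimate_k(a, b, exposure, deaths, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve sum_x exp(a_x + b_x * k) * N_xt = D_t for k, year by year."""
    n_years = exposure.shape[1]
    k_out = np.empty(n_years)
    for t in range(n_years):
        n_x, d_t = exposure[:, t], deaths[t]
        left, right = lo, hi
        # Bisection: assumes fitted deaths are increasing in k (all b_x >= 0).
        for _ in range(200):
            mid = 0.5 * (left + right)
            fitted = np.sum(np.exp(a + b * mid) * n_x)
            if fitted < d_t:
                left = mid                  # too few fitted deaths: k must be larger
            else:
                right = mid                 # too many fitted deaths: k must be smaller
            if right - left < tol:
                break
        k_out[t] = 0.5 * (left + right)
    return k_out
```

With negative $b_x$ values the fitted number of deaths need not be monotone in $k_t$, and a search over a grid of values would be a safer choice than bisection.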
3 Fitting and applying the Lee-Carter model
3.1 Data
The mortality database contains death rates by age and calendar year for each country.
However, for Estonia there are no observed deaths for some ages in some calendar years,
so 5-year age groups have been used for Estonia. Because of this, the values of $k_t$ and $b_x$
and the size of the residuals may not be comparable between Estonia and the other countries,
which are fitted with 1-year age groups.
The data available for the studied countries are:
Country           Time period    Ages used in the model
Bulgaria          1947-2003      0 – 100, 1-year age groups
Czech Republic    1950-2004      0 – 98, 1-year age groups
Hungary           1950-2001      0 – 98, 1-year age groups
Estonia           1959-2005      0 – 105, 5-year age groups
Latvia            1959-2003      0 – 98, 1-year age groups
Lithuania         1959-2003      0 – 100, 1-year age groups
Russia            1959-2005      0 – 100, 1-year age groups
Slovakia          1950-2005      0 – 98/97 (f/m), 1-year age groups
Ukraine           1959-2005      0 – 100, 1-year age groups
3.2 Fitting the parameters
The parameters are fitted using the method presented in Section 2.3. We do not
re-estimate the $k_t$ factor, because the aim of this paper is only to see how well the
Lee-Carter model can be fitted to the countries studied. We are not going to make any forecasts
or use the fitted models. The model is applied to males and females separately. Because of the
different trends in mortality history discussed earlier, we look at the results for Eastern
Europe (Bulgaria, Czech Republic, Hungary and Slovakia) separately from the countries in the
former Soviet Union (the Baltic States, Ukraine and Russia).
We apply the Lee-Carter model to the data and start by analysing the $k_t$ values and the
$b_x$ values. The residuals are then studied. For countries where the model fits very well, the
estimated time trend is linearly decreasing and the $b_x$ values are decreasing with age. To get a
time trend, a $k_t$-curve, that declines linearly it is not necessary that the historical life
expectancy is linear. One criterion for forecasting mortality is linearity in the fitted
mortality trend $k_t$. When forecasting mortality the time trend is extrapolated into the future,
and if we do not have a linear time trend this straightforward approach does not work. In the
following Section 3.3 the outcome of the applied models for the countries in Eastern Europe
is analysed. In Section 3.4 the models fitted for the countries in the former Soviet Union
are studied.
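The straightforward extrapolation referred to above can be sketched as follows; no forecasts are made in this paper, so this is an illustration only. We assume, as is common for the Lee-Carter method, that $k_t$ follows a random walk with drift, so that the central forecast is a straight line continuing from the last fitted value. The function names are hypothetical.

```python
import numpy as np

def forecast_k(k, horizon):
    """Central forecast of k_t under a random walk with drift."""
    k = np.asarray(k, dtype=float)
    drift = (k[-1] - k[0]) / (len(k) - 1)   # average yearly change over the fit period
    steps = np.arange(1, horizon + 1)
    return k[-1] + drift * steps            # straight line from the last fitted value

def forecast_log_q(a, b, k_future):
    """Projected log death probabilities a_x + b_x * k_t, ages x future years."""
    return a[:, None] + np.outer(b, k_future)
```

Because the drift depends only on the first and the last fitted value, a $k_t$-curve that bends within the fitting period, as several of the curves below do, gives a forecast that is very sensitive to the choice of starting year.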
3.3 Eastern Europe; Bulgaria, Czech Republic, Hungary and
Slovakia
We start the analysis of the fitted model by looking at the estimated parameters. Figure 3
presents the estimated general pattern of mortality, $a_x$, for females and males
separately, and we see that the shape of $a_x$ is almost the same for all countries.
Figure 3 General pattern of mortality, ax, for the countries in Eastern Europe.
Because of the similarity in the mortality trend history in these countries we study only
Bulgaria in detail. The corresponding graphs for the rest of Eastern Europe are found in
Appendix 1.
The estimated mortality time trend $k_t$ for Bulgaria in Figure 4 looks similar for males and
females. The $k_t$-curve is very steep from 1947 until the mid 60s, and then it levels out. In the
mid 90s the curve becomes steep again. These curves are not linear, which does not necessarily
indicate a poor fit of the model. However, if one would like to
forecast the mortality trend there is no continuous historical pattern on which to build the
forecast. How do we forecast a mortality trend that has changed within only 5-10 years and
does not look at all like it did during the previous 40 years? The $b_x$ values are decreasing with
age, which means that the mortality rates at younger ages decline more rapidly than at older
ages.
Figure 4 The mortality trend, $k_t$, and the age-specific constant $b_x$.
For ages over 80 the $b_x$ values are negative for both sexes. For the male population the $b_x$
values are very close to zero from the age of 40. The estimated mortality rates for these ages
are almost unchanged during the studied period. A negative $b_x$ value gives an increasing
mortality rate if the time trend is negative and declining.
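The effect of the sign of $b_x$ can be illustrated with a small numerical sketch. The values below are made up for the illustration and are not estimates from the Bulgarian data; the function name is hypothetical.

```python
import numpy as np

def fitted_q(a_x, b_x, k):
    """Fitted death probability exp(a_x + b_x * k_t) for a single age."""
    return np.exp(a_x + b_x * np.asarray(k, dtype=float))

k = np.array([10.0, 0.0, -10.0])            # hypothetical declining time trend
a_x = np.log(0.10)                          # hypothetical average level for one age

for b_x in (0.02, -0.02):
    # Positive b_x: the fitted rate falls as k_t declines.
    # Negative b_x: the fitted rate rises as k_t declines.
    print(b_x, np.round(fitted_q(a_x, b_x, k), 4))
```

With a $b_x$ close to zero, as for Bulgarian males from the age of 40, the fitted rate hardly moves whatever the time trend does.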
In Figures 5 (female data) and 6 (male data) the mortality rates for some ages are shown
together with the rates obtained from the fitted model. The model has captured the mortality
trend well for the female population in the 30-80 year age span. In contrast to the female data,
the male data seem to create a problem for ages between 40 and 60 years. For a 50-year-old
male, the model gives too low an estimate of the mortality rates in the first half of the studied
period and too high an estimate after 1980. We also find that the model shows a slight
increase in the mortality trend at this age, while the observed mortality seems to have
decreased in the latest years. The model does not capture this trend at all, and if the trend
continues the model will give misleading results.
Figure 5 Estimated mortality rates (dotted coloured lines) compared to actual mortality rates (black solid
lines). Bulgarian female data.
Figure 6 Estimated mortality rates (dotted coloured lines) compared to actual mortality rates (black solid
lines). Bulgarian male data.
Figure 7 plots the sum-of-squares residuals per age and per calendar year. As we have already
seen in the comparison of the fitted model and the actual mortality rates, the male residuals
are very high in the broad age band 35-60 years. For both males and females the sum-of-squares
residuals are high for the youngest and oldest ages. To get a better fit of the model we could
try to exclude the population under 20 years and over 80 years. The sum-of-squares residuals are
higher for males around the 60s and 70s and also in the mid 90s. This corresponds to the
observed deviation of the estimated rates from the actual rates in Figure 6. The female
sum-of-squares residuals follow the pattern of the male residuals but they are not as large.
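The quantities plotted in Figure 7 can be computed along the following lines. This is a sketch under our own assumptions about the data layout (an ages x years array `log_q` and fitted parameter vectors `a`, `b` and `k`); the function name is hypothetical and the residual is taken as observed minus fitted.

```python
import numpy as np

def residual_summaries(log_q, a, b, k):
    """Residual matrix and its sums of squares per age and per calendar year."""
    fitted = a[:, None] + np.outer(b, k)    # a_x + b_x * k_t
    resid = log_q - fitted                  # residual at age x and year t
    ss_by_age = np.sum(resid ** 2, axis=1)  # one value per age: sum over years
    ss_by_year = np.sum(resid ** 2, axis=0) # one value per year: sum over ages
    return resid, ss_by_age, ss_by_year
```

The sums of squares show where the fit is poor but not in which direction; for that the signed residuals in `resid` have to be inspected, as is done in Figures 8 and 9.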
Figure 7 Sum of squares age residuals and sum of squares calendar year residuals.
The spread of the size of the residuals is shown in Figures 8 (females) and 9 (males). For a
correctly specified model the residuals should not show any structure; they should be random.
We do not want a model that consistently over-estimates the rates for some calendar
years or age groups and under-estimates them for other periods or age bands. However, the
calendar year residual pattern shows a clear systematic effect in the Bulgarian male model. We
see that the rates are over-estimated in the first part of the studied period and
under-estimated in the latter part.
One question that may arise when looking at the comparison of the estimated and the actual
mortality rates in Figure 6 is why the model is not able to fit the data by simply decreasing
the fitted mortality in the early years of the period and increasing it in the latter part. The
probable reason is the negative $b_x$ values.
If we increase the $k_t$ values in the early years, this does not imply that the mortality increases
for all ages, because the $b_x$ values have different signs in different age groups. We found that
the Lee-Carter model could not be applied to the Bulgarian male mortality data with
satisfactory results.
Figure 8 Plot of the size of the calendar year residuals and the age residuals for Bulgarian females.
Figure 9 Plot of the size of the calendar year residuals and the age residuals for Bulgarian males.
3.3.1 Comments to the results for the other countries in Eastern Europe
The fitted model for the Czech Republic mortality data shows results very similar to those for
the Bulgarian data, except that the fit of the model looks much better for middle-aged males. As
for Bulgaria, the sum-of-squares residuals per age are high for the youngest and the oldest. We
also have a systematic calendar year effect for the fitted male data, though it is not as
pronounced as for the Bulgarian male model. The Hungarian models have a slight
systematic effect even in the female model. The model does not give a good fit of the data. In
Figure 10, where the fitted and actual mortality rates for Hungarian males are shown, we even
find for some ages that the estimated trend is increasing while the actual trend is decreasing.
Another problem with the Hungarian data is the deviation in the age profile for the male
data in the broad age span 30 to 75. The $b_x$ values are negative at the same time as the $k_t$
values are negative. If the fitted model were used for forecasting future
mortality, the $k_t$ values would be projected as a straight line with decreasing values. The $k_t$
time trend after the year 1970 is negative. This means that if $b_x$ is negative, the estimated
mortality, $\ln q_x(t) = a_x + b_x k_t$, would increase each year. The $b_x k_t$ term is positive for
x > 35 and t > 1975. This may result in mortality curves crossing each other, with higher
mortality for a 60-year-old male than for an 80-year-old male in the future. The problem with
negative $b_x$ values is not found for the female data. It is also
found when fitting Slovakian data. As for all other countries, the systematic calendar year
effect can be found in the male residuals.
Figure 10 The solid black line shows the actual mortality rates for Hungarian males at different ages and
the dotted lines are the estimated rates given by the fitted Lee-Carter model.
To summarise the analysis of fitting the Lee-Carter model to mortality data in
Eastern Europe, the model fits female data better than male data.
The sum-of-squares residuals are very high for the youngest and oldest ages. The
sum-of-squares residuals per calendar year are highest in the most recent years, and we would not
want to use a model that does not fit the data today, especially if we would like to use the
model for forecasting.
For Hungarian and Slovakian data we obtain negative $b_x$ values, which might cause problems
in the long run if we would like to forecast the mortality. The $k_t$ curves are quite
linear for all countries except Bulgaria. The country where we got the best fit of the model
is the Czech Republic; the $k_t$ curves are linear and the $b_x$ values are positive for all ages. The
Czech Republic is the only country with decreasing mortality for all ages. In Section 4 we will
apply the model to the same data but exclude the population under the age of 20 and over the
age of 80.
All estimated parameters and residual graphs for the studied countries in Eastern Europe are
presented in Appendix 1.
3.4 Former Soviet Union; Baltic States, Ukraine and Russia
In the previous Section we analysed the fit of the Lee-Carter model to the countries in Eastern
Europe, and as we saw in the development of life expectancy in Figures 1 and 2, mortality
in the former Soviet Union differs a lot from that in Eastern Europe. In this Section
we start by looking at the model fitted to the Baltic States mortality data. There is a
divergence in the mortality trend from the mid 90s for the Baltic States compared to the
trends in Russia and Ukraine. The fitted parameter $a_x$, the general pattern of mortality, is
shown in Figure 11 for all the studied countries in the former Soviet Union.
Figure 11 General pattern of mortality, ax, for the countries in the Former Soviet Union.
3.4.1 Baltic States
The fit of the Lee-Carter model to Estonian data differs from the fitted models of
Latvia and Lithuania. When applying the model to Estonia we used 5-year age groups, and this
may contribute to the different results, but we will not analyse this further. The
estimated parameters and residual plots for the Estonian model are presented in Appendix 2.
We start by looking at the fitted $k_t$ values, the time trend. We choose to study the results for
the fitted Lithuanian Lee-Carter model. The time trend is decreasing for the female data, but for
the male data we get an increasing $k_t$-curve. The reason why we get an ascending time trend
is probably that ages with increasing mortality rates are in the majority in the male population.
This is in contrast to the countries in Eastern Europe, where the trend is increasing for some ages
but does not outweigh the decreasing trends for other ages. In Figure 12 the $k_t$ curves and
$b_x$ curves are presented. The $b_x$ values for ages 40-60 years are slightly negative for females
and clearly positive for males. In both cases the conclusion is that the mortality rates are
increasing in these age bands, because the time trend is negative and declining for females
while the male curve is ascending and positive. The graph with the $k_t$-curves is nevertheless
very illustrative: the overall female mortality trend is decreasing while the overall male trend is
increasing. Due to the nature of the model (if we replace $b_x$ with $b_x^* = c\,b_x$ then $k_t$ can be
replaced by $k_t^* = k_t/c$ and the model is unchanged), we could easily change the sign of $k_t$
and $b_x$ to get a decreasing male $k_t$-curve. This is discussed in Section 3.5.
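The rescaling argument can be checked numerically. The short Python sketch below uses made-up parameter values (they are not estimates from this paper) and shows that multiplying $b_x$ by c and dividing $k_t$ by c, here with c = -1, leaves the fitted values unchanged while turning an increasing $k_t$-curve into a decreasing one.

```python
import numpy as np

# Hypothetical parameters for three ages and four years (illustration only).
a = np.array([-5.0, -3.0, -1.5])
b = np.array([0.8, 0.5, 0.33])
k = np.array([-2.0, -0.5, 0.5, 2.0])        # increasing time trend

def fitted_log_q(a, b, k):
    """Fitted values a_x + b_x * k_t as an (ages x years) array."""
    return a[:, None] + np.outer(b, k)

c = -1.0
b_star, k_star = c * b, k / c               # both parameters change sign

# True when the two parameterisations give identical fitted values.
unchanged = np.allclose(fitted_log_q(a, b, k), fitted_log_q(a, b_star, k_star))
# True when the rescaled time trend is decreasing.
k_star_decreasing = bool(np.all(np.diff(k_star) < 0))
```

The direction of a $k_t$-curve therefore has no meaning on its own; it has to be read together with the sign of the $b_x$ values.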
Figure 12 The time trend, $k_t$, and the sensitivity of $\log q_x$, $b_x$, for Lithuanian mortality.
In Figures 13 (female) and 14 (male) the actual mortality rates are shown together with the
fitted Lee-Carter values for the Lithuanian population. There was an upward trend for all
ages in the male population until the mid 90s, when the rates began to decrease. For the female
population the mortality trend has been decreasing since 1960, except for ages between 40 and
60 years, which have the negative $b_x$ values discussed earlier.
Figure 13 The solid black line shows the actual mortality rates for Lithuanian females at different ages
and the dotted lines are the estimated rates given by the fitted Lee-Carter model.
Figure 14 The solid black line shows the actual mortality rates for Lithuanian males at different ages and
the dotted lines are the estimated rates given by the fitted Lee-Carter model.
The sum-of-squares residuals are, as in Eastern Europe, high at the youngest and the oldest
ages. The female sum-of-squares residuals are a bit larger than the male ones, both per
age and per calendar year. Some kind of systematic effect can be observed in the
size of the residuals per calendar year. The difference from the pattern we saw for the
countries in Eastern Europe is that here we have the same effect in the female model. In
the plots of the size of the residuals for the models of Eastern Europe, the residuals had the
shape of two “waves”, while for the Baltic States we have a pattern of four “waves”. The
sum-of-squares residuals and the size of the residuals can be found in Figures 15, 16 and 17.
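As an illustration of how the two summaries above can be computed, the sketch below sums the squared residuals per age (over calendar years) and per calendar year (over ages). It assumes that a residual is the difference between the actual and the fitted log mortality rate, which is an assumption about the scale used; the function name and the toy numbers are hypothetical.

```python
def sum_of_squares(actual, fitted):
    """Return (per_age, per_year) sums of squared residuals.

    actual and fitted are lists of rows, one row per age and one
    column per calendar year, holding log mortality rates.
    """
    residuals = [[a - f for a, f in zip(row_a, row_f)]
                 for row_a, row_f in zip(actual, fitted)]
    # Sum over calendar years for each age.
    per_age = [sum(r * r for r in row) for row in residuals]
    # Sum over ages for each calendar year.
    per_year = [sum(row[j] ** 2 for row in residuals)
                for j in range(len(residuals[0]))]
    return per_age, per_year


# Toy example with 2 ages and 3 calendar years (hypothetical numbers).
actual = [[-5.0, -5.1, -5.3], [-2.0, -1.9, -2.2]]
fitted = [[-5.0, -5.2, -5.2], [-2.1, -2.0, -2.1]]
per_age, per_year = sum_of_squares(actual, fitted)
print(per_age, per_year)
```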
Figure 15 Sum-of-squares residual, Lithuania.
Figure 16 Plot of the size of the calendar year residuals and the age residuals for Lithuanian females.
Figure 17 Plot of the size of the calendar year residuals and the age residuals for Lithuanian males.
To summarise the goodness of fit in the Latvian and Lithuanian models, the female models do
not capture the peak in mortality in 1993 at all. The declining trend in recent years is
not found in the male models either. In contrast to the results for Eastern Europe, the male
models give a much better fit than the female ones. A big problem with the male models is
that the estimated Lithuanian trend for ages around 30-50 is upward, while the actual
mortality appears to be decreasing, even though there is some uncertainty in this statement.
This trend does, however, seem to have been captured by the Latvian male model.
In Section 4 we will fit the model again, this time after excluding the years 1993-1995 with
their extreme mortality rates; the ages below 25 and the ages 80 and above will also be
excluded.
All graphs of the estimated parameters, in comparison with the actual mortality and the fitted
mortality, sum-of-squares residuals and size of the residuals for Estonia, Latvia and Lithuania
are found in Appendix 2.
3.4.2 Russia and Ukraine
There are not many differences between the fitted models of Ukraine and Russia. The shapes
of the curves of the estimated parameters and the sizes of the residuals are very similar. The
only thing that stands out for Russia is the sum-of-squares residual for the fit of the extreme
year 1993. This residual is twice as high as the corresponding Ukrainian residual.
Figure 18 The mortality trend, κt, and the age-specific constant, bx.
Figure 19 The solid black line shows the actual mortality rates for Russian females at different ages and
the dotted lines are the estimated rates given by the fitted Lee-Carter model.
Figure 20 The solid black line shows the actual mortality rates for Russian males at different ages and the
dotted lines are the estimated rates given by the fitted Lee-Carter model.
The κt-curves, the time trend, are found in Figure 18. Here we have an increasing trend, both
for females and males. The curves have two bumps, before and after the time of the economic
crisis. The female and male curves have the same shape. In the same figure the bx-curves
show that the values are positive for all ages over 14 years (except for one observation).
The plots of the estimated mortality compared to the actual mortality show that the model follows
the mortality very well for most ages. The fit of the model is better for the male data than
for the female data. The fitted female model has difficulty capturing the rise and fall
around the years of the economic crisis. These results are presented in Figures 19 and 20.
The sum-of-squares residuals, found in Figure 21, tell us the same thing we saw when
studying the fitted mortality rates. The female sum-of-squares residuals are much higher than
the male residuals for younger ages. If we look at the sum-of-squares residuals per calendar
year, the female residuals are higher in the earliest years of the studied period and also a little
higher during the extreme years. We will re-estimate the model after
excluding the population below 40 years and over 80 years and also the mortality history
for the years with the highest mortality rates, 1988-1990 and 1993-1995. The results can be
found in Section 4.
The spread of the residuals is shown in Figures 22 and 23, and it cannot be argued that they
look random per calendar year. There is some kind of systematic error in the estimation of the
1960s, we see a major over-estimation in the late 1980s and, as we already noticed, the years
1993-1995 are under-estimated for all ages. We will see what the effects are when we re-fit the model.
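The judgement above is visual. One simple way to put a number on "does not look random" is to count how often the average residual per calendar year changes sign: long runs of the same sign (the "waves") give few sign changes, while unsystematic residuals switch sign often. This diagnostic is not used in the thesis; the sketch below is only an illustration with a hypothetical function name and made-up residuals.

```python
def sign_changes(values):
    """Count how often consecutive non-zero values switch sign."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for first, second in zip(signs, signs[1:]) if first != second)


# Hypothetical mean residuals per calendar year.
wave_like = [0.3, 0.2, 0.1, -0.1, -0.3, -0.2, 0.1, 0.2]    # long runs of one sign
noise_like = [0.1, -0.2, 0.1, 0.3, -0.1, 0.2, -0.3, 0.1]   # frequent switches

print(sign_changes(wave_like), sign_changes(noise_like))
```

A count that is small relative to the number of calendar years points to the kind of systematic pattern described for the Russian residuals.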
Figure 21 Sum of squares age residuals and sum of squares calendar year residuals.
Figure 22 Plot of the size of the calendar year residuals and the age residuals for Russian females
Figure 23 Plot of the size of the calendar year residuals and the age residuals for Russian males
3.5 Increasing time trend
It is interesting that the Lee-Carter model gives an ascending κt-curve and positive bx values
for those populations with a predominantly increasing trend in mortality.
The Lee-Carter model is
$$\log q_x(t) = a_x + b_x \kappa_t + \varepsilon_x(t),$$
so for any constant c ≠ 0 the pair bx · c and κt / c is also a solution to the equation and
gives the same estimate of log qx(t). If the κt-curve is divided by -1, the bx-curve can be
multiplied by -1 and we still have the same estimated mortality. The graphs of the estimated
parameters in the Ukrainian model would then look as in Figure 24. The original curves can be
found in Figure 88.
Figure 24 The Ukrainian model fitted with bx' = bx · (-1) and κt' = κt / (-1).
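The invariance used for Figure 24 can be verified numerically. The sketch below is a minimal Python illustration with made-up parameter values (not the Ukrainian estimates) and a hypothetical helper name: it multiplies bx by c = -1, divides κt by the same c, and checks that the fitted values ax + bx κt are unchanged.

```python
def fitted_log_q(a, b, kappa):
    """Fitted log q_x(t) = a_x + b_x * kappa_t, as a list of rows (one per age)."""
    return [[a_x + b_x * k_t for k_t in kappa] for a_x, b_x in zip(a, b)]


# Illustrative parameter values, not estimates from the thesis.
a = [-6.0, -4.0, -2.0]
b = [0.02, 0.05, 0.03]
kappa = [-8.0, -2.0, 3.0, 7.0]          # ascending time trend

c = -1.0
b_flipped = [c * b_x for b_x in b]           # all b_x become negative
kappa_flipped = [k_t / c for k_t in kappa]   # time trend becomes descending

original = fitted_log_q(a, b, kappa)
flipped = fitted_log_q(a, b_flipped, kappa_flipped)
same = all(abs(o - f) < 1e-12
           for row_o, row_f in zip(original, flipped)
           for o, f in zip(row_o, row_f))
print(same)
```

The same check works for any non-zero c, which is why the sign and scale of the curves carry meaning only through the product bx κt.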
The κt-curves are now descending, as they usually are when applying the Lee-Carter model
(most countries have a population in which life expectancy is improving). The bx-curve is
turned upside down and almost all bx values become negative. We can now compare
this with the Hungarian model in Figure 52, for example, where the κt-curve is declining but
the bx values are negative for the age band 35-75 years, so that the trend for those ages is
increasing. It is nevertheless more illustrative to keep the κt-curve ascending when the
trend really is increasing, as in Ukraine, where the model gives this appearance. The reason is
that for those countries the overall mortality time trend is what the κt-curve shows, namely
increasing.
However, when a model is fitted to a population and the κt-curves for the female and male
data go in opposite directions, one decreasing and the other increasing, it would be
easier to compare the results if we “turned” one of the curves upside down. In Figure 25 we
see the Latvian κt- and bx-curves for males and the original curves for the female model.
Figure 25 The male κt- and bx-curves compared to the original κt- and bx-curves obtained with the
female model.
4 Fitting the Lee-Carter model excluding parts of the data
In the previous Section we applied the Lee-Carter model to all the mortality data available
in the Human Mortality Database, without reflecting on whether this is the
best way of fitting the Lee-Carter model. One could argue that the trends might be different
for different age groups and also that the quality of the data for the oldest ages might not be
satisfactory. The model might therefore fit better if we exclude parts of the
population. There could also be some time periods that we would prefer not to influence the
model. Looking at the sum-of-squares residuals per age in all our fitted models, we see
that for every country they increase markedly from the age of 90. The sum-of-squares residuals
at ages under 30 are even higher. We will now apply the model to the Czech Republic, Latvia
and Ukraine after excluding some parts of the population and some time periods.
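The exclusion step can be sketched as follows. The code keeps an age interval, drops a set of calendar years and then re-fits the parameters. Note the assumptions: the function names `subset` and `fit_lee_carter` are hypothetical, the data is synthetic, and the estimator is a simple two-step least-squares approximation (row means for ax, column sums for κt under the constraint that the bx sum to 1), not the Weighted Least Squares method of Section 2.3.1.

```python
def subset(log_q, ages, years, age_min, age_max, excluded_years):
    """Keep ages in [age_min, age_max] and drop the excluded calendar years."""
    keep_rows = [i for i, x in enumerate(ages) if age_min <= x <= age_max]
    keep_cols = [j for j, y in enumerate(years) if y not in excluded_years]
    data = [[log_q[i][j] for j in keep_cols] for i in keep_rows]
    return data, [ages[i] for i in keep_rows], [years[j] for j in keep_cols]


def fit_lee_carter(log_q):
    """Simple two-step estimate of a_x, b_x and kappa_t (illustration only).

    a_x is the row mean, kappa_t the column sum of the centred matrix
    (which corresponds to sum(b_x) = 1 and sum(kappa_t) = 0), and b_x the
    least-squares slope of each centred row on kappa_t.
    """
    n_years = len(log_q[0])
    a = [sum(row) / n_years for row in log_q]
    centred = [[v - a_x for v in row] for row, a_x in zip(log_q, a)]
    kappa = [sum(row[j] for row in centred) for j in range(n_years)]
    denom = sum(k * k for k in kappa)
    b = [sum(z * k for z, k in zip(row, kappa)) / denom for row in centred]
    return a, b, kappa


# Synthetic data with an exact Lee-Carter structure (hypothetical values).
ages = [20, 40, 60, 80, 90]
years = list(range(1990, 1998))
true_a = [-7.0, -6.0, -4.5, -2.5, -1.5]
true_b = [0.1, 0.3, 0.3, 0.2, 0.1]
true_kappa = [4.0, 3.0, 1.0, 6.0, 7.0, 5.0, -2.0, -4.0]
log_q = [[a_x + b_x * k for k in true_kappa] for a_x, b_x in zip(true_a, true_b)]

# Keep ages 25-79 and drop 1993-1995, as in the Latvian re-fit.
data, kept_ages, kept_years = subset(log_q, ages, years, 25, 79, {1993, 1994, 1995})
a, b, kappa = fit_lee_carter(data)
print(kept_ages, kept_years)
```

Because the parameters are re-normalised on the reduced data, the re-fitted bx and κt are not directly comparable in level to those from the full data; only the fitted values and residuals are.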
4.1 The Czech Republic
The model fitted to the original Czech Republic mortality data shows high sum-of-squares
residuals for the youngest ages. The sum-of-squares residuals in the most recent years are also
comparatively high. In Figure 26 the new sum-of-squares residuals per calendar year are
presented as a percentage of the residuals in the model fitted to the original data. The
residuals are reduced significantly because the ages with the highest residuals
are excluded. The sum-of-squares residuals per age show only a minor decrease.
The female model even gets higher residuals for ages 60 and above.
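The percentage comparison described above amounts to dividing each new sum-of-squares residual by the corresponding original one. A minimal Python sketch with hypothetical numbers and a hypothetical function name:

```python
def percent_of_original(new_ss, original_ss):
    """New sum-of-squares residuals as a percentage of the original ones."""
    return [100.0 * new / old for new, old in zip(new_ss, original_ss)]


# Hypothetical sums of squares for three calendar years.
original_ss = [0.40, 0.25, 0.50]
new_ss = [0.10, 0.20, 0.05]
pct = percent_of_original(new_ss, original_ss)
print([round(p, 1) for p in pct])
```

As noted in the text, the original per-calendar-year sums include the excluded ages, so part of any reduction comes from dropping those ages rather than from a better fit at the retained ones.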
Figure 26 Sum-of-squares residuals after re-fitting the model to the population of 20-79 years old.
(Percentage of the original)
Figure 27 Sum-of-squares residuals per age for the original model and the new model.
We also see that the residuals that have decreased the most are the ones in the most recent
years. In the model fitted to all data the female sum-of-squares residuals are worryingly high
for the most recent years, but after re-fitting the model these are much lower and more in line
with the residuals in the other years. Small residuals matter more in the
most recent years than at the beginning of the period, because we want to capture the present
trend.
The systematic calendar year effect can still be seen in the male model. In Figures 28 and 29
below, the plots of the size of the residuals per calendar year are presented both for the model
fitted to the original data and for the model fitted to the population aged 20-79 years. Note
that the scale differs between the old and new residual plots.
Figure 28 The original plot of the residuals compared with the new in the right hand picture. Males
Figure 29 The original plot of the residuals compared with the new in the right hand picture. Females
4.2 Latvia
The Lee-Carter model is applied to the Latvian population aged 25-79 years, with the
time period 1993-1995 excluded. The residuals are not extremely high for this period and
the model captures it rather well, although the death rates are extreme between the years
1993 and 1999 (see Figure 1). As for the re-fitted Czech Republic model, we eliminate the high
sum-of-squares residuals for the most recent calendar year. The result is lower sum-of-squares
residuals in the female model in every age group except for the 70-year-olds, which have slightly
higher new residuals. In the male model we get higher residuals for age groups 50 and 55.
Figure 30 Sum-of-squares residuals after re-fitting the model to the population of 25-79 years old.
(Percentage of the original)
Figure 31 Sum-of-squares residuals per age, for the original model and the new model.
The plots of the size of the residuals per calendar year are shown in Figures 32 and 33. Note
that the scale differs between the old and new residual plots. The reason for the comparison is
to see what has happened to the pattern after the re-estimation. In the original model we
could see that the residuals did not look random in the second half of the estimated period. In
the new model this systematic effect seems to have disappeared in the male model. In the
female model there was no systematic effect in the original model.
Figure 32 The original plot of the residuals compared with the new in the right hand picture. Males
Figure 33 The original plot of the residuals compared with the new in the right hand picture. Females
4.3 Ukraine
The population we use when re-fitting the Lee-Carter model to the Ukrainian mortality
data is the age interval 40-79 years, and we also exclude the calendar years 1988-1990 and
1993-1995. The sum-of-squares residuals per age decrease dramatically, most of all in the female
model. The new sum-of-squares residuals are below 20% of the original for the latter part of the
studied time period. This is one way of getting a better fit of the model if we think that some
age groups or time periods of the estimated period are of minor interest to the study. For
example, one reason for excluding the high and low peaks in mortality due to the economic crisis
is that we think it could be seen as an isolated event that only influenced the mortality during a
limited period. The comparisons of the plots of the size of the residuals in the original model
and the new model are shown in Figures 36 and 37. The residuals of the re-fitted model look more
random than those of the original model, where we had a clear systematic effect.
Figure 34 Sum-of-squares residuals per age, for the original model and the new model.
Figure 35 Sum-of-squares residuals after re-fitting the model to the population of 40-79 years old.
(Percentage of the original)
Figure 36 The original plot of the residuals compared with the new in the right hand picture. Males
Figure 37 The original plot of the residuals compared with the new in the right hand picture. Females
5 Discussion
The model gave a comparatively good fit to both Russian and Ukrainian data, especially after
excluding some part of the data as we did in Section 4, and there is a clear trend in the κt
values. The question, however, is whether we could use the model. Is it reasonable to build a
forecast on the κt trend? Do we really think that the rates in these countries will keep on
increasing forever? Would it not be possible that this trend will stop and the life expectancy in
Russia and Ukraine will start to improve? The conclusion is that the Lee-Carter model could be
fitted for some of the studied countries with satisfactory results, although one should probably be
very reluctant to use the model for forecasting mortality.
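To make the concern concrete: in the standard Lee-Carter approach (Lee and Carter, 1992), κt is projected as a random walk with drift, so the point forecast simply continues the average historical change indefinitely. No forecast is produced in this paper; the sketch below is only an illustration with a hypothetical function name and made-up κt values, showing that an ascending fitted κt yields a forecast that keeps rising at the same pace.

```python
def drift_forecast(kappa, horizon):
    """Point forecast of kappa_t from a random walk with drift.

    The drift is the average yearly change over the fitted period, and the
    forecast adds that drift to the last fitted value once per future year.
    """
    drift = (kappa[-1] - kappa[0]) / (len(kappa) - 1)
    return [kappa[-1] + drift * h for h in range(1, horizon + 1)]


# Hypothetical ascending kappa_t, loosely mimicking an increasing mortality trend.
kappa = [-6.0, -3.0, -1.0, 2.0, 3.0, 5.0]
forecast = drift_forecast(kappa, horizon=3)
print([round(value, 1) for value in forecast])
```

With positive bx values, such a forecast implies mortality rates that rise without limit, which is the implausible outcome questioned above.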
There are strengths in the Lee-Carter model that we could see. One is that the model is able to
fit the mortality rates for a population, as in the Russian case, despite the fluctuating mortality
pattern. Another is that the model parameter κt really shows the predominant trend in
the population, with a descending or ascending curve.
We have found that one criterion for getting a good fit of the model is that the historical mortality
in the population has to be homogeneous within each gender (if we fit models for females and
males separately). If we get both negative and positive bx values, the population is not
homogeneous and most likely we get a systematic pattern in the size of the residuals. The
Lee-Carter model has difficulty achieving a good fit when the mortality has an increasing trend for
some ages and a decreasing trend for others. In the Western countries where the model has been
applied with good results, the overall trend has been decreasing mortality rates, probably
within the whole population but at different paces.
For the countries studied in this paper the model gave a good fit to those where the
trend is either overall increasing (Russia and Ukraine) or decreasing (Czech Republic). In
Eastern Europe the female population is more homogeneous than the male, while in the Baltic
States we have a more homogeneous male mortality development.
When forecasting mortality, one question is how long a period should be used in the
projection. The period should not be too long, to avoid including causes of mortality improvement
or impairment whose effects are already exhausted. This is one reason why it is
impossible to apply a model to the countries studied in this paper. There have been major
changes in mortality in recent years due to a single cause that is unlikely to have any impact on
future mortality trends. Looking at the Baltic States, the trend in mortality is very different
after they became self-governed in 1991, and maybe the mortality statistics before that date
do not tell us anything about future mortality. One thing that can be seen is that the
Baltic States are becoming more West-oriented every day, and a better way of forecasting
mortality in those countries could be to look at other countries’ statistics with similar trends,
if they exist, and build the forecast on that.
We have applied the Lee-Carter model to 9 different countries for males and females
separately, so there have been several models to analyse. Of course one could analyse each
country’s models more deeply and find more explanations for the results and other
underlying causes of the outcomes. In this paper we are only making a first (and second)
attempt to fit the Lee-Carter model. Because we are not using the models
further, it is not easy to state when a model gives a good fit or not. How small does the size of
the residuals have to be for the fit to be good? In our case, a visual judgement often feels
sufficient to draw a conclusion. The analysis in this paper can be seen as a first study of the
Lee-Carter model’s ability to fit the mortality data of the countries in Eastern Europe and the
former Soviet Union.
REFERENCES
Bahr, Bengt von (2006). “Skatta parametrarna I Lee-Carters modell för dödlighet”, working
paper, DUS.
Baran S., Gáll J., Ispány M., Pap G. (2006). ”Forecasting Hungarian Mortality rates using the
Lee-Carter method”, Acta Oeconomica, Vol 57 (1) pp. 21-34 (2007)
Booth, Heather, Maindonald, John and Smith, Len (2002) “Applying Lee-Carter under
conditions of variable mortality decline”, Population Studies, 56(3), pp. 325-336.
Carter, Lawrence R. and Prskawetz, Alexia (2001). “Examining Structural Shifts in Mortality
Using the Lee-Carter Method”, Max Planck Institute for Demographic Research WP 2001-007, Germany.
Coelho E., (2001). “The Lee-Carter method for forecasting mortality, the Portuguese
experience”, Departamento de Estatísticas Sociais, Instituto Nacional de Estatística – Portugal.
Fígoli, Moema G. Bueno (1998). “Modelando e projectando a mortalidade no Brasil”, Revista
Brasileira de Estudos de População, Vol. 15, n.º 1.
France Meslé, (2002). “Mortality in Eastern Europe and the former Soviet Union :
long-term trends and recent upturns”, Paper presented at IUSSP/MPIDR Workshop
"Determinants of Diverging Trends in Mortality"
Girosi F., King G. (2005). “A Reassessment of the Lee-Carter Mortality Forecasting Method”,
http://gking.harvard.edu/files/lc.pdf.
Koissi, Shapiro, Högnäs. (2004). “Fitting and forecasting mortality rates for Nordic countries
using the Lee-Carter model”, Proceeding of the 39th Actuarial Research Conference, ARCH.
Lee R. D., Carter L. (1992). “Modeling and Forecasting U.S. Mortality”, Journal of the
American Statistical Association, September 1992, Vol 87, No. 419.
Lee, R. D. and Nault F., (1993). “Modeling and forecasting provincial mortality in
Canada”, paper presented at the World Congress of the International Union for Scientific
Study of Population, Montreal, 1993.
Lee, Ronald D. and Rafael Rofman (1994) “Modelacion y Proyeccion de la Mortalidad
en Chile,” NOTAS 22, no 59, pp. 182-213.
Leinsalu M. (2004). “Troubled transitions, Social variation and long-term trends in health and
mortality in Estonia”, Doctoral thesis in sociology, Health Equity Studies No 2.
Renshaw, A., & Haberman, S. (2003b). “Lee-Carter mortality forecasting: a parallel
generalized linear modelling approach for England and Wales mortality projections”, Applied
Statistics 52, 119-137.
Tuljapurkar S., Li N., Boe C. (2000). “A universal pattern of mortality decline in the G7
countries”, Nature, 405: 789-792.
Wilmoth, J. R. (1998). “Is the pace of Japanese mortality decline converging toward
international trends?”, Population and Development Review 24(3), 593-600.