Applying the Lee-Carter model to countries in Eastern Europe and the former Soviet Union

Diploma thesis (Diplomarbete)
Hanna Scherp
2007-11-27

INTRODUCTION
1 MORTALITY DATA
1.1 Mortality history and trends in Eastern Europe and the former Soviet Union
2 MODELLING MORTALITY
2.1 Earlier experience of the Lee-Carter model
2.2 Lee-Carter, the model
2.3 Estimating the parameters
2.3.1 Estimating the parameters using Weighted Least Squares
3 FITTING AND APPLYING THE LEE-CARTER MODEL
3.1 Data
3.2 Fitting the parameters
3.3 Eastern Europe; Bulgaria, Czech Republic, Hungary and Slovakia
3.3.1 Comments to the results for the other countries in Eastern Europe
3.4 Former Soviet Union; Baltic States, Ukraine and Russia
3.4.1 Baltic States
3.4.2 Russia and Ukraine
3.5 Increasing time trend
4 FITTING THE LEE-CARTER MODEL EXCLUDING PARTS OF THE DATA
4.1 The Czech Republic
4.2 Latvia
4.3 Ukraine
5 DISCUSSION
REFERENCES
APPENDIX 1
APPENDIX 2

Introduction

In this paper, the Lee-Carter method is used to model the mortality pattern of a number of countries in Eastern Europe and the former Soviet Union. The aim is to study how well the method fits the mortality pattern of the countries in this region, where the development in mortality differs markedly from that in Western Europe. Very few articles can be found in which the Lee-Carter model has been applied to mortality data from Eastern Europe or the former Soviet Union.
The reason may be that the mortality pattern is so complex for some of the countries that the possibility of modelling and forecasting mortality with this method seems remote.

Section 1 gives a background to the development in mortality and life expectancy for the studied countries. The data source is the “Human Mortality Database” (www.mortality.org), which holds mortality statistics for the following countries in Eastern Europe and the former Soviet Union: Bulgaria, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Russia, Slovakia and Ukraine. In Section 2 the Lee-Carter model is described, and in Section 3 the model is applied to the mortality data. To get a better fit of the model, some extreme observations are excluded from the data; the results are shown in Section 4. A summary and discussion are found in Section 5.

1 Mortality Data

The Human Mortality Database provides detailed mortality data for around 30 countries: live birth counts, death counts, population size, population exposed to risk, death rates and life tables. In this paper we use the death rates for all countries in Eastern Europe and the former Soviet Union represented in the Human Mortality Database. The database was launched in May 2002 and is a collaborative project involving research teams in the Department of Demography at the University of California, Berkeley (USA), and at the Max Planck Institute for Demographic Research in Rostock (Germany). It enjoys financial support from the National Institute on Aging (USA), and receives technical advice and assistance from a long list of international collaborators. The database is open to everyone and can be found on the internet at www.mortality.org.

For the countries studied, the Human Mortality Database holds mortality data from the late 40s or from the 50s onwards, so the mortality trends are studied from the end of World War II.
There is a major divergence in mortality between Western and Eastern Europe.

1.1 Mortality history and trends in Eastern Europe and the former Soviet Union

Mortality in the whole of Europe declined after the war, mostly due to the use of antibiotics, general immunisation and an overall improvement in health. The countries in Eastern Europe, which earlier had had very high mortality rates, were approaching the mortality of the Nordic countries and Western Europe, where mortality had been the lowest in Europe. But in the mid 60s the decrease in mortality rates stagnated. This was mainly caused by an increase in circulatory diseases, traffic accidents and deaths caused by alcoholism. The stagnation was observed in all European countries, but in the Western part the trend did not last: in the early 70s mortality rates were decreasing again. For the countries in Eastern Europe, the decrease in mortality began to level out or even turned into an increase. This divergence between Western and Eastern Europe lasted for some 30 years from the 70s.

There are distinct differences among the countries. The trends differ slightly between the countries in Eastern Europe and the former Soviet Union, but there are many related patterns. Life expectancy in Estonia, Latvia, Lithuania, Ukraine and Russia followed a similar pattern from 1965 until the mid 90s. For some years in the mid 80s the mortality rates decreased in the former Soviet Union as a result of Gorbachev's anti-alcohol campaign, but this did not last. Mortality increased again at the beginning of the 90s because of the economic crisis that followed the transition to a market economy. When the economic crisis was over, the mortality rate decreased again, but in Russia and Ukraine this improvement was only temporary, while in the Baltic States the positive trend continues (France Meslé, INED, Paris 2002).
One interesting thing to notice is the ethnic differences in Estonia. The two main ethnic groups in Estonia are Estonians and Russians. In the late 80s, just before Estonia became independent (1991), the ethnic differences were not statistically significant. By 2000, however, mortality among Estonians had fallen by 4% for males and 11% for females, while mortality among Russians had increased by 24% for men and 1% for women. The causes of death with the largest differences are alcoholic liver cirrhosis, alcohol poisoning, homicide, influenza and pneumonia (Mall Leinsalu, 2004).

Figure 1 shows the development in life expectancy at birth in the Baltic States, Russia and Ukraine, together with a comparison with Swedish data.

Until the mid 80s the life expectancy trends in Bulgaria, Czech Republic, Hungary and Slovakia were very similar. In contrast to the Baltic States and the rest of the former Soviet Union, life expectancy has increased since 1988 in the Czech Republic, since 1992 in Slovakia and since 1993 in Hungary. In Bulgaria, however, the trend did not turn upwards until the end of the 90s. This holds for the total population. Looking at females and males separately, one can see that the trend for females is increasing for almost the whole period (France Meslé, INED, Paris 2002). Figure 2 presents the life expectancy for females and males respectively, compared with the Swedish data.

Figure 1 Life expectancy at birth in the Former Soviet Union compared to Sweden.

Figure 2 Life expectancy at birth in Eastern Europe compared to Sweden.

2 Modelling mortality

In 1992 Ronald Lee and Lawrence Carter [1] proposed a method for modelling and forecasting U.S. mortality. This method has been widely used for modelling and forecasting mortality in several countries.
One reason for its popularity is probably the simplicity of the model and the straightforward way of forecasting mortality once the model is fitted. The model is based on the past experience of age-specific mortality data. In this paper it is used to fit the historical mortality rates of those countries in Eastern Europe and the former Soviet Union for which statistics are available in the Human Mortality Database. Due to the complexity of the long-term mortality trends in those countries, we can expect some problems when using the model.

2.1 Earlier experience of the Lee-Carter model

The Lee-Carter method has been used for fitting and forecasting mortality for almost every developed country in the world, for example the G7 countries [2] (Canada, France, Germany, Italy, Japan, UK and US), Australia [3], Portugal [4], Canada [5], Chile [6], Japan [7], Brazil [8], Austria [9], the Nordic countries [10] and England and Wales [11]. For most of the countries the Lee-Carter model applies very well, but for some it does not. Examples of nations where the model does not give a good fit are Australia, the UK and Austria. The problem is that the historical age pattern of mortality rates varies over time. Nevertheless, the model can give a good fit even when the mortality pattern has fluctuated or mortality has decreased at a varying pace. A number of articles use different extensions of the model to obtain a better fit.

There are also criticisms of the Lee-Carter model. In 2005 Girosi and King criticised the model for “insufficiently appreciated properties”. Despite the criticism, the model is now used by the US Bureau of the Census for US mortality, the application it was originally designed for.

There is one article [12] in which the model is applied to Hungarian data.
The authors extend the model, but conclude that this does not solve the problem, and they are not able to achieve a good fit for the whole studied period 1949-2003. One has to keep in mind that the Lee-Carter model was designed for U.S. population mortality, and one cannot demand that the model should work for every other country.

[1] Lee & Carter (1992)
[2] Shripad Tuljapurkar, Nan Li & Carl Boe (2000)
[3] Heather Booth, John Maindonald & Len Smith (2002)
[4] E. Coelho (2001)
[5] Lee & Nault (1993)
[6] Lee & Rofman (1994)
[7] Wilmoth (1998)
[8] Fígoli (1998)
[9] Carter & Prskawetz (2001)
[10] Koissi, Shapiro and Högnäs (2004)
[11] Renshaw & Haberman (2003b)
[12] S. Baran, J. Gáll, M. Ispány & G. Pap (2006)

2.2 Lee-Carter, the model

The model that Lee and Carter suggested in 1992 is

$$\log q_x(t) = a_x + b_x k_t + \varepsilon_x(t),$$

where $q_x(t)$ is the probability of death at age $x$ in year $t$. $a_x$ is the average of $\log q_x(t)$ over time $t$, which describes the general pattern of mortality by age. $k_t$ is the time trend of general mortality. $b_x$ indicates the sensitivity of $\log q_x(t)$ at age $x$ as $k_t$ varies. $\varepsilon_x(t)$ is the residual term at age $x$ and time $t$.

2.3 Estimating the parameters

The estimation of $a_x$ is straightforward because it is the average of $\log q_x(t)$ over time $t$. The estimates of $b_x$ and $k_t$ cannot be obtained explicitly, and the model cannot be fitted with ordinary regression methods. There are several methods that can be used. In the original paper (Lee, Carter 1992) the singular value decomposition (SVD) method is used to find a least squares solution. The estimation approach used here is weighted least squares. Another approach is maximum likelihood estimation, which makes it possible to use a weight equal to the observed number of deaths for age $x$ and year $t$. Koissi, Shapiro and Högnäs [13] compared these methods in their paper “Fitting and forecasting mortality rates for Nordic countries using the Lee-Carter method”.
They found that the different methods gave very similar results. In this paper the aim is to see whether the Lee-Carter model can be applied to the data at all, so the choice of method should be of minor importance.

2.3.1 Estimating the parameters using Weighted Least Squares [14]

The one-year death probabilities are observed for $n$ calendar years and $m$ ages or age groups. The model is

$$\log q_x(t) = a_x + b_x k_t + \varepsilon_x(t),$$

and we need to estimate $a_x$, $b_x$ and $k_t$. To achieve a unique solution the following restrictions are used:

$$\sum_{x=x_{\min}}^{x_m} b_x^2 = 1 \quad (a) \qquad\qquad \sum_{t=t_{\min}}^{t_n} k_t = 0 \quad (b)$$

[13] Koissi, Shapiro and Högnäs (2004)
[14] Bengt von Bahr, ”Skatta parametrarna i Lee-Carters modell för dödlighet” (“Estimating the parameters in the Lee-Carter model for mortality”). Translated, with own derivations.

These restrictions make no modifications to the model. If we replace $b_x$ with $b_x^* = c\,b_x$, then $k_t$ can be replaced by $k_t^* = k_t / c$ and the model is unchanged.

To estimate the parameters we choose the values that minimize $Q$,

$$Q = \sum_{x,t} (a_x + b_x k_t - m_{xt})^2,$$

where $m_{xt} = \log q_x(t)$ and $Q$ is subject to the constraints (a) and (b) above. To find the values that minimize $Q$ we introduce Lagrange multipliers $\lambda$ and $\mu$ and minimize

$$R = Q + \lambda \sum_t k_t + \mu \sum_x b_x^2 .$$

We first take the derivative of $R$ with respect to $a_x$, $k_t$ and $b_x$ respectively:

$$\frac{dR}{da_x} = 2 \sum_t (a_x + b_x k_t - m_{xt}) \quad \text{for every } x,$$

$$\frac{dR}{dk_t} = 2 \sum_x b_x (a_x + b_x k_t - m_{xt}) + \lambda \quad \text{for every } t,$$

$$\frac{dR}{db_x} = 2 \sum_t k_t (a_x + b_x k_t - m_{xt}) + 2 \mu b_x \quad \text{for every } x.$$

The derivatives are set equal to zero:

$$\frac{dR}{da_x} = 0, \qquad \frac{dR}{dk_t} = 0 \qquad \text{and} \qquad \frac{dR}{db_x} = 0 .$$

We solve the first equation:

$$2 \sum_t (a_x + b_x k_t - m_{xt}) = 0 \;\Rightarrow\; n a_x + b_x \sum_t k_t - \sum_t m_{xt} = 0 .$$

Since $\sum_t k_t = 0$, this gives $n a_x - \sum_t m_{xt} = 0$, and therefore

$$a_x = \frac{1}{n} \sum_t m_{xt} \qquad (1)$$

The estimate of $a_x$ is thus computed as the average over time of the logarithm of the observed death probabilities, which corresponds to the definition of $a_x$ in Section 2.2. We now define

$$z_{xt} = m_{xt} - a_x \;\Rightarrow\; \sum_t z_{xt} = 0 \quad \text{for every } x.$$
We can now rewrite the second equation, where $\lambda$ is the Lagrange multiplier of constraint (b):

$$2 \sum_x b_x (a_x + b_x k_t - m_{xt}) + \lambda = 2 \sum_x b_x (b_x k_t - z_{xt}) + \lambda .$$

If we set the new expression of the derivative with regard to $k_t$ equal to zero we have

$$2 \sum_x b_x (b_x k_t - z_{xt}) + \lambda = 0 \;\Rightarrow\; \sum_x b_x (b_x k_t - z_{xt}) = -\lambda/2 \;\Rightarrow\; k_t \sum_x b_x^2 - \sum_x b_x z_{xt} = -\lambda/2 .$$

Since $\sum_x b_x^2 = 1$, this gives

$$k_t = \sum_x b_x z_{xt} - \lambda/2 \qquad (2)$$

Taking the sum over $t$ in equation (2) we get

$$\sum_t k_t = \sum_t \Big( \sum_x b_x z_{xt} - \lambda/2 \Big) = \sum_x b_x \sum_t z_{xt} - n\lambda/2 .$$

Since $\sum_t k_t = 0$ and $\sum_t z_{xt} = 0$, it follows that $\lambda = 0$. We now have an expression for $k_t$ by putting $\lambda = 0$ in equation (2). Thus

$$k_t = \sum_x b_x z_{xt} \quad \text{for every } t \qquad (3)$$

The constraint for $k_t$ is now fulfilled. If we use $z_{xt}$ in the third and last equation, where $\mu$ is the Lagrange multiplier of constraint (a),

$$2 \sum_t k_t (a_x + b_x k_t - m_{xt}) + 2 \mu b_x = 2 \sum_t k_t (k_t b_x - z_{xt}) + 2 \mu b_x ,$$

and set this equal to zero, we have

$$2 \sum_t k_t (k_t b_x - z_{xt}) + 2 \mu b_x = 0 \;\Rightarrow\; b_x \Big( \sum_t k_t^2 + \mu \Big) = \sum_t k_t z_{xt} \quad \text{for every } x \qquad (4)$$

We take the square of both sides and sum over $x$:

$$\sum_x b_x^2 \Big( \sum_t k_t^2 + \mu \Big)^2 = \sum_x \Big( \sum_t k_t z_{xt} \Big)^2 .$$

Since $\sum_x b_x^2 = 1$, this is

$$\Big( \sum_t k_t^2 + \mu \Big)^2 = \sum_x \Big( \sum_t k_t z_{xt} \Big)^2 ,$$

and taking the square root of both sides,

$$\sum_t k_t^2 + \mu = \sqrt{ \sum_x \Big( \sum_t k_t z_{xt} \Big)^2 } .$$

We are now able to get an expression for $b_x$ by inserting this into (4):

$$b_x = \frac{ \sum_t k_t z_{xt} }{ \sqrt{ \sum_{x'} \Big( \sum_t k_t z_{x't} \Big)^2 } } \quad \text{for every } x \qquad (5)$$

The estimate of $a_x$ is easily calculated from the observed one-year death probabilities. The equations for estimating $b_x$ and $k_t$, (3) and (5), are complicated to solve explicitly, but a solution can be found with a rather low number of iterations. To get a starting value for $b_x$, we assume that $b_x$ is independent of $x$ and equal to $1/m$, where $m$ is the number of ages or age groups.

In the original paper, Lee and Carter re-estimate $k_t$ so that the observed number of deaths equals the fitted number of deaths, i.e.

$$D_t = \sum_x \exp(a_x + b_x k_t)\, N_{x,t},$$

where $D_t$ is the total number of deaths in year $t$ and $N_{x,t}$ is the population of age $x$ in year $t$.
No analytic solution is available, so this can only be done by searching over a range of values of $k_t$. However, this second stage of estimation has no impact on whether the Lee-Carter model can be fitted to the data or not.

3 Fitting and applying the Lee-Carter model

3.1 Data

The mortality database contains death rates per age and calendar year for each country. For Estonia, however, there are no observed deaths for some ages in some calendar years, so 5-year age groups have been used for Estonia. Because of this, the values of $k_t$ and $b_x$ and the size of the residuals may not be comparable between Estonia and the other countries, which are fitted with 1-year age groups. The data available for the studied countries are:

Country          Time period   Ages used in the model
Bulgaria         1947-2003     0-100, 1-year age groups
Czech Republic   1950-2004     0-98, 1-year age groups
Hungary          1950-2001     0-98, 1-year age groups
Estonia          1959-2005     0-105, 5-year age groups
Latvia           1959-2003     0-98, 1-year age groups
Lithuania        1959-2003     0-100, 1-year age groups
Russia           1959-2005     0-100, 1-year age groups
Slovakia         1950-2005     0-98/97 (f/m), 1-year age groups
Ukraine          1959-2005     0-100, 1-year age groups

3.2 Fitting the parameters

The parameters are fitted using the method presented in Section 2.3. We do not re-estimate $k_t$, because the aim of this paper is only to see how well the Lee-Carter model can be fitted to the countries studied. We are not going to make any forecasts or use the fitted models. The model is applied to males and females separately. Because of the different trends in mortality history discussed earlier, we will look at the results for Eastern Europe (Bulgaria, Czech Republic, Hungary and Slovakia) separately from the countries in the former Soviet Union (the Baltic States, Ukraine and Russia).

We apply the Lee-Carter model to the data and start by analysing the $k_t$ values and the $b_x$ values. The residuals are then studied.
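The iteration between equations (3) and (5) of Section 2.3.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the results in this paper. The function name `fit_lee_carter`, the nested-list layout `m[x][t]` for $m_{xt} = \log q_x(t)$, the iteration cap and the stopping tolerance are all hypothetical choices. The second-stage re-estimation of $k_t$ is left out, as in the text.

```python
import math


def fit_lee_carter(m, n_iter=100, tol=1e-12):
    """Fit log q_x(t) = a_x + b_x * k_t by iterating equations (3) and (5).

    m[x][t] holds m_xt = log q_x(t) for age index x and year index t.
    Returns (a, b, k) with sum(b_x^2) = 1 and sum(k_t) = 0.
    The layout, iteration cap and tolerance are illustrative assumptions.
    """
    n_ages, n_years = len(m), len(m[0])

    # Equation (1): a_x is the average of m_xt over time.
    a = [sum(row) / n_years for row in m]

    # Centred values z_xt = m_xt - a_x, so each row sums to zero.
    z = [[m[x][t] - a[x] for t in range(n_years)] for x in range(n_ages)]

    # Starting value from the text: b_x independent of x and equal to 1/m.
    b = [1.0 / n_ages] * n_ages

    for _ in range(n_iter):
        # Equation (3): k_t = sum_x b_x * z_xt.
        k = [sum(b[x] * z[x][t] for x in range(n_ages)) for t in range(n_years)]

        # Equation (5): b_x = s_x / sqrt(sum_x s_x^2), with s_x = sum_t k_t * z_xt.
        s = [sum(k[t] * z[x][t] for t in range(n_years)) for x in range(n_ages)]
        norm = math.sqrt(sum(v * v for v in s))
        if norm == 0.0:
            break  # degenerate case: no variation picked up by the current b_x

        new_b = [v / norm for v in s]
        change = max(abs(u - v) for u, v in zip(new_b, b))
        b = new_b
        if change < tol:
            break

    # Recompute k_t from the final b_x so that equation (3) holds exactly.
    k = [sum(b[x] * z[x][t] for x in range(n_ages)) for t in range(n_years)]
    return a, b, k
```

When the input is generated exactly from the model, the iteration reproduces the generating parameters after very few passes. The overall sign of the pair ($b_x$, $k_t$) depends on the starting value, which is consistent with the scaling remark in Section 2.3.1.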
For the countries where the model fits very well, the estimated time trend is linearly decreasing and the $b_x$ values decrease with age. To get a time trend, a $k_t$-curve, that declines linearly, it is not necessary that the historical life expectancy is linear. One criterion for forecasting mortality is linearity in the fitted mortality trend $k_t$. When forecasting mortality, the time trend is extrapolated into the future, and if the time trend is not linear this straightforward approach does not work.

In Section 3.3 the outcome of the applied models for the countries in Eastern Europe is analysed. In Section 3.4 the models fitted for the countries in the former Soviet Union are studied.

3.3 Eastern Europe; Bulgaria, Czech Republic, Hungary and Slovakia

We start the analysis of the fitted model by looking at the estimated parameters. In Figure 3 the estimated general pattern of mortality, $a_x$, is presented for females and males separately, and we see that the shape of $a_x$ is almost the same for all countries.

Figure 3 General pattern of mortality, $a_x$, for the countries in Eastern Europe.

Because of the similarity in the mortality trend history of these countries, we will only study Bulgaria in detail. The corresponding graphs for the rest of Eastern Europe are found in Appendix 1.

The estimated mortality time trend $k_t$ for Bulgaria in Figure 4 looks similar for males and females. The $k_t$-curve is very steep from 1947 until the mid 60s, and then it levels out. In the mid 90s the curve becomes steep again. These curves are not linear, which does not necessarily indicate that the model fits poorly. However, if one would like to forecast the mortality trend, there is no continuous historical pattern on which the forecast can be built. How do we forecast a mortality trend that has changed only in the last 5-10 years and does not look at all like it did during the previous 40 years?
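One simple way to put a number on the linearity criterion is to fit a straight line to the $k_t$ values by ordinary least squares and look at the share of variance it explains. The sketch below is only an illustration: using $R^2$ as a linearity measure is an assumption made here, not a method from this paper, and the function name `linear_trend` is hypothetical.

```python
def linear_trend(years, k):
    """Least-squares line through (year, k_t) pairs.

    Returns (slope, intercept, r_squared). An r_squared close to 1 means
    the k_t-curve is close to a straight line. Using r_squared for this
    purpose is an illustrative assumption, not the paper's own criterion.
    """
    n = len(years)
    mean_year = sum(years) / n
    mean_k = sum(k) / n

    # Sums of squares and cross products around the means.
    s_yy = sum((y - mean_year) ** 2 for y in years)
    s_yk = sum((y - mean_year) * (v - mean_k) for y, v in zip(years, k))

    slope = s_yk / s_yy
    intercept = mean_k - slope * mean_year

    # Share of the variation in k_t explained by the straight line.
    ss_total = sum((v - mean_k) ** 2 for v in k)
    ss_resid = sum((v - (intercept + slope * y)) ** 2 for y, v in zip(years, k))
    r_squared = 1.0 - ss_resid / ss_total if ss_total > 0 else 1.0
    return slope, intercept, r_squared
```

A curve that is steep at first and then levels out, like the Bulgarian $k_t$-curve described above, gives a clearly lower value than a curve that declines at a constant pace. The fitted line `intercept + slope * year` is what a straightforward extrapolation would project forward.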
The $b_x$ values decrease with age, which means that the mortality rates at younger ages decline more rapidly than at older ages.

Figure 4 The mortality trend, $k_t$, and the age-specific constant $b_x$.

For ages over 80 the $b_x$ values are negative for both sexes. For the male population the $b_x$ values are very close to zero from the age of 40, so the estimated mortality rates for these ages are almost unchanged during the studied period. Negative $b_x$ values give an increasing mortality rate if the time trend is negative and declining.

In Figure 5 (female data) and Figure 6 (male data) the mortality rates for some ages are shown together with the rates obtained from the fitted model. The model has captured the mortality trend well for the female population in the 30-80 year age span. In contrast to the female data, the male data seem to create a problem for the ages between 40 and 60 years. For a 50-year-old male, the model gives too low an estimate of the mortality rates in the first half of the studied period and too high an estimate after 1980. We also find that the model shows a slight increase in the mortality trend at this age, while the observed mortality seems to have decreased in the latest years. The model does not capture this trend at all, and if the trend continues the model will be misleading.

Figure 5 Estimated mortality rates (dotted coloured lines) compared to actual mortality rates (black solid lines). Bulgarian female data.

Figure 6 Estimated mortality rates (dotted coloured lines) compared to actual mortality rates (black solid lines). Bulgarian male data.

Figure 7 plots the sum-of-squares residuals per age and per calendar year. As we have already seen in the comparison of the fitted model and the actual mortality rates, the male residuals are very high in the broad age band 35-60 years. For both males and females the sum-of-squares residuals are high for younger and older ages.
To get a better fit of the model we could try to exclude the population under 20 years and over 80 years. The sum-of-squares residuals are higher for males around the 60s and 70s and also in the mid 90s. This corresponds to the observed deviation of the estimated rates from the actual rates in Figure 6. The female sum-of-squares residuals follow the pattern of the male residuals, but they are not as large.

Figure 7 Sum of squares age residuals and sum of squares calendar year residuals.

The spread of the size of the residuals is found in Figure 8 (females) and Figure 9 (males). For a correctly specified model the residuals should not show any structure; they should be random. We do not want a model that consistently over-estimates the rates for some calendar years or age groups and under-estimates them for other periods or age bands. However, the calendar year residual pattern shows a clear systematic effect in the Bulgarian male model. We see that the rates are over-estimated for the first part of the studied period and under-estimated in the latter part.

One question that may arise when looking at the comparison of the estimated and the actual mortality rates in Figure 6 is why the model cannot fit the data by simply decreasing the mortality in the early years of the period and increasing the fitted rates in the latter part. The probable reason is the negative $b_x$ values. If we increase the $k_t$ values in the early years, this does not imply that mortality increases for all ages, because the $b_x$ values have different signs in different age groups. We found that the Lee-Carter model could not be applied to the Bulgarian male mortality data with satisfactory results.

Figure 8 Plot of the size of the calendar year residuals and the age residuals for Bulgarian females.

Figure 9 Plot of the size of the calendar year residuals and the age residuals for Bulgarian males.
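The residual summaries used in this section can be sketched as follows. The sketch assumes that a residual is the observed minus the fitted value on the log scale, $m_{xt} - (a_x + b_x k_t)$; the text does not state the scale used for the figures, so this is an assumption. The function name `residual_summaries` and the nested-list layout are hypothetical. The signed mean per calendar year is added here only as one possible way to make a systematic over- or under-estimation visible.

```python
def residual_summaries(m, a, b, k):
    """Residuals of a fitted Lee-Carter model and their summaries.

    m[x][t] is the observed log death probability, and a, b, k are fitted
    parameters. Residuals are taken as observed minus fitted on the log
    scale, which is an assumption made for this illustration.
    Returns (resid, ss_per_age, ss_per_year, mean_per_year).
    """
    n_ages, n_years = len(m), len(m[0])

    # Residual matrix: a positive value means the model under-estimates.
    resid = [
        [m[x][t] - (a[x] + b[x] * k[t]) for t in range(n_years)]
        for x in range(n_ages)
    ]

    # Sum of squared residuals per age (over all years).
    ss_per_age = [sum(r * r for r in row) for row in resid]

    # Sum of squared residuals per calendar year (over all ages).
    ss_per_year = [
        sum(resid[x][t] ** 2 for x in range(n_ages)) for t in range(n_years)
    ]

    # Signed mean residual per year: a run of equal signs over consecutive
    # years points to a systematic calendar year effect.
    mean_per_year = [
        sum(resid[x][t] for x in range(n_ages)) / n_ages for t in range(n_years)
    ]
    return resid, ss_per_age, ss_per_year, mean_per_year
```

If the model captured the data correctly, the signed means would scatter around zero without long runs of the same sign.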
3.3.1 Comments to the results for the other countries in Eastern Europe

The model fitted to the Czech Republic mortality data gives results very similar to those for the Bulgarian data, except that the fit looks much better for middle-aged males. As for Bulgaria, the sum-of-squares residuals per age are high for the youngest and the oldest. We also have a systematic calendar year effect for the fitted male data, though it is not as pronounced as in the Bulgarian male model.

The Hungarian models have a slight systematic effect even in the female model. The model does not give a good fit to the data. In Figure 10, where the fitted and actual mortality rates for Hungarian males are shown, we even find for some ages that the estimated trend is increasing while the actual trend is decreasing.

Another problem with the Hungarian data is the deviating age profile of the male data in the broad age span 30 to 75. The $b_x$ values are negative at the same time as the $k_t$ values are negative. If the fitted model were used for forecasting future mortality, the $k_t$ values would be projected as a straight line with decreasing values. The $k_t$ time trend after the year 1970 is negative. This means that if $b_x$ is negative, the estimated mortality,

$$\ln q_x(t) = a_x + b_x k_t,$$

would increase each year. The $b_x k_t$ term is positive for $x > 35$ and $t > 1975$. This may result in mortality curves crossing each other, with higher mortality for a 60-year-old male than for an 80-year-old male in the future. The problem with negative $b_x$ values is not found for female data.

The problem with negative $b_x$ values is also found when fitting Slovakian data. As for all other countries, the systematic calendar year effect can be found in the male residuals.

Figure 10 The solid black line shows the actual mortality rates for Hungarian males at different ages and the dotted lines are the estimated rates given by the fitted Lee-Carter model.
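The crossing effect described above can be illustrated with a small numerical sketch. All parameter values in the usage note are made up for illustration and are not estimates from the Hungarian data. The function names `projected_log_q` and `first_crossing` are hypothetical, and $k_t$ is assumed to be extrapolated as a straight line with a constant step per year.

```python
def projected_log_q(a_x, b_x, k_t):
    """Fitted log death probability ln q_x(t) = a_x + b_x * k_t."""
    return a_x + b_x * k_t


def first_crossing(a_young, b_young, a_old, b_old, k_start, k_step, horizon):
    """First projection year in which the younger age has the higher fitted mortality.

    k_t is extrapolated linearly as k_start + k_step * year, which is an
    assumption of this sketch. Returns the number of years after the
    start, or None if the curves do not cross within the horizon.
    """
    for year in range(horizon + 1):
        k_t = k_start + k_step * year
        young = projected_log_q(a_young, b_young, k_t)
        old = projected_log_q(a_old, b_old, k_t)
        if young > old:
            return year
    return None
```

With the illustrative values `a_young=-4.0`, `b_young=-0.10`, `a_old=-2.0`, `b_old=0.05` and a time trend that falls by one unit per year from zero, the younger age overtakes the older one after 14 years. If both $b_x$ values are positive, for example 0.10 and 0.05 with the same $a_x$ values, the curves do not cross within a 100-year horizon.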
To summarise the outcome of fitting the Lee-Carter model to mortality data in Eastern Europe: the model fits the female data better than the male data. The sum-of-squares residuals are very high for the youngest and oldest ages. The sum-of-squares residuals per calendar year are highest in the most recent years, and we would not want to use a model that does not fit the data today, especially if we would like to use the model for forecasting. For Hungarian and Slovakian data we obtain negative $b_x$ values, which might cause problems in the long run if we would like to make a forecast of the mortality. The $k_t$ curves are quite linear for all countries except Bulgaria. The country where we got the best fit of the model is the Czech Republic; the $k_t$ curves are linear and the $b_x$ values are positive for all ages. The Czech Republic is the only country with decreasing mortality for all ages. In Section 4 we will apply the model to the same data but exclude the population under the age of 20 and over the age of 80.

All estimated parameters and residual graphs for the studied countries in Eastern Europe are presented in Appendix 1.

3.4 Former Soviet Union; Baltic States, Ukraine and Russia

In the previous Section we analysed the fit of the Lee-Carter model to the countries in Eastern Europe. As we saw in the development of life expectancy in Figures 1 and 2, the mortality in the former Soviet Union differs a lot from that in Eastern Europe. In this Section we start by looking at the model fitted to the mortality data of the Baltic States. From the mid 90s there is a divergence in the mortality trend for the Baltic States compared to the trends in Russia and Ukraine. The fitted parameter $a_x$, the general pattern of mortality, is found in Figure 11 for all the studied countries in the former Soviet Union.

Figure 11 General pattern of mortality, $a_x$, for the countries in the Former Soviet Union.
3.4.1 Baltic States

The fit of the Lee-Carter model to Estonian data differs from the fitted models of Latvia and Lithuania. When applying the model to Estonia we used 5-year age groups, and this may contribute to the different results, but we will not analyse this further. The estimated parameters and residual plots for the Estonian model are presented in Appendix 2.

We start by looking at the fitted $k_t$ values, the time trend. We choose to study the results for the fitted Lithuanian Lee-Carter model. The time trend is decreasing for the female data, but for the male data we get an increasing $k_t$-curve. The reason why we get an ascending time trend is probably that increasing mortality rates are in the majority in the male population. This is in contrast to the countries in Eastern Europe, where the trend is increasing for some ages but does not outweigh the decreasing trends for other ages.

In Figure 12 the $k_t$ curves and $b_x$ curves are presented. The $b_x$ values for ages 40-60 years are slightly negative for females and clearly positive for males. In both cases the conclusion is that the mortality rates are increasing in these age bands, because the time trend is negative and declining for females while the male curve is ascending and positive. The graph with the $k_t$-curves is nevertheless very illustrative: the overall female mortality trend is decreasing while the overall male trend is increasing. Due to the nature of the model (if we replace $b_x$ with $b_x^* = c\,b_x$ then $k_t$ can be replaced by $k_t^* = k_t / c$ and the model is unchanged), we could easily change the sign of $k_t$ and $b_x$ to get a decreasing male $k_t$-curve. This is discussed in Section 3.5.

Figure 12 The time trend, $k_t$, and the sensitivity of $\log q_x$, $b_x$, Lithuanian mortality.

In Figure 13 (female) and Figure 14 (male) the actual mortality rates together with the fitted Lee-Carter values are found for the Lithuanian population.
There was an upward trend for all ages in the male population until the mid 90s, when the rates began to decrease. For the female population the mortality trend has been decreasing since 1960, except for ages between 40 and 60 years, where the $b_x$ values are negative, as discussed earlier.

Figure 13 The solid black line shows the actual mortality rates for Lithuanian females at different ages and the dotted lines are the estimated rates given by the fitted Lee-Carter model.

Figure 14 The solid black line shows the actual mortality rates for Lithuanian males at different ages and the dotted lines are the estimated rates given by the fitted Lee-Carter model.

The sum-of-squares residuals are, as in Eastern Europe, high for the youngest and the oldest ages. The female sum-of-squares residuals are a bit larger than the male ones, both per age and per calendar year. Some kind of systematic effect can be observed in the size of the residuals per calendar year. The difference from the pattern we saw for the countries in Eastern Europe is that here we have the same effect in the female model. In the plots of the size of the residuals for the models of Eastern Europe, the residuals had the shape of two “waves”, while for the Baltic States we have a pattern of four “waves”. The sum-of-squares residuals and the size of the residuals are found in Figures 15, 16 and 17.

Figure 15 Sum-of-squares residual, Lithuania.

Figure 16 Plot of the size of the calendar year residuals and the age residuals for Lithuanian females.

Figure 17 Plot of the size of the calendar year residuals and the age residuals for Lithuanian males.

To summarise the goodness of fit of the Latvian and Lithuanian models: the female models do not capture the peak in mortality in 1993 at all. The declining trend in recent years is not found in the male models either. In contrast to the results for Eastern Europe, the male models give a much better fit than the female ones.
A big problem with the male models is that the estimated Lithuanian trend for ages around 30-50 is rising, while the actual mortality appears to be decreasing, although this observation is somewhat uncertain. However, this trend seems to have been captured by the Latvian male model. In Section 4 we will fit the model again, this time after excluding the years 1993-1995, which have extreme mortality rates, as well as the ages below 25 and the ages 80 and above.

All graphs of the estimated parameters, the actual mortality compared with the fitted mortality, the sum-of-squares residuals and the size of the residuals for Estonia, Latvia and Lithuania are found in Appendix 2.

3.4.2 Russia and Ukraine

There are not many differences between the fitted models of Ukraine and Russia. The shapes of the curves of the estimated parameters and the sizes of the residuals are very similar. The only outstanding fact about Russia is the sum-of-squares residual for the fit of the extreme year 1993. This residual is twice as high as the corresponding Ukrainian residual.

Figure 18 The mortality trend $\kappa_t$ and the age-specific constant $b_x$.

Figure 19 The solid black line shows the actual mortality rates for Russian females at different ages and the dotted lines are the estimated rates given by the fitted Lee-Carter model.

Figure 20 The solid black line shows the actual mortality rates for Russian males at different ages and the dotted lines are the estimated rates given by the fitted Lee-Carter model.

The $\kappa_t$-curves, i.e. the time trend, are found in Figure 18. Here we have an increasing trend for both females and males. The curves have two bumps, before and after the time of the economic crisis. The female and male curves have the same shape. In the same figure, the $b_x$-curves show that the values are positive for all ages over 14 years (except for one observation).
The plots of the estimated mortality compared to the actual mortality show that the model follows the mortality very well for most ages. The fit of the model is better for the male data than for the female data. The fitted female model has difficulty capturing the rise and fall around the years of the economic crisis. These results are presented in Figures 19 and 20.

The sum-of-squares residuals, found in Figure 21, tell us the same thing we saw when studying the fitted mortality rates. The female sum-of-squares residuals are much higher than the male residuals for younger ages. If we look at the sum-of-squares residuals per calendar year, the female residuals are higher in the earliest years of the studied period and also a little higher during the extreme years. We will re-estimate the model after excluding the population below 40 years and over 80 years, and also the mortality history for the years with the highest mortality rates, 1988-1990 and 1993-1995. The results can be found in Section 4.

The spread of the residuals is shown in Figures 22 and 23, and it cannot be argued that they look random per calendar year. There is a systematic error in the estimation of the 1960s, we see a major over-estimation in the late 1980s and, as we already noticed, the years 1993-1995 are under-estimated for all ages. We will see what the effects are when we refit the model.

Figure 21 Sum of squares age residuals and sum of squares calendar year residuals.

Figure 22 Plot of the size of the calendar year residuals and the age residuals for Russian females.

Figure 23 Plot of the size of the calendar year residuals and the age residuals for Russian males.

3.5 Increasing time trend

It is interesting that the Lee-Carter model gives an ascending $\kappa_t$-curve and positive $b_x$ values for those populations with a predominantly increasing trend in mortality.
Because of the nature of the Lee-Carter model,

$$\log q_x(t) = a_x + b_x \kappa_t + \varepsilon_x(t),$$

the pair $b_x \cdot c$ and $\kappa_t / c$ is, for any constant $c \neq 0$, also a solution to the equation, and we obtain the same estimate of $\log q_x(t)$. If the $\kappa_t$-curve is divided by -1, the $b_x$-curve can be multiplied by -1 and we still have the same estimated mortality. The graphs of the estimated parameters in the Ukrainian model would then look as in Figure 24. The original curves can be found in Figure 88.

Figure 24 The Ukrainian model fitted with $b_x' = b_x \cdot (-1)$ and $\kappa_t' = \kappa_t / (-1)$.

The $\kappa_t$-curves are now descending, as they usually are when applying the Lee-Carter model. (Most countries have a population where life expectancy is improving.) The $b_x$-curve is turned upside down and almost all $b_x$ values become negative. We can now compare this to, for example, the Hungarian model in Figure 52, where the $\kappa_t$-curve is declining even though, for the age band 35-75 years where the $b_x$ values are negative, the trend is increasing. It is, however, more illustrative to keep the $\kappa_t$-curve ascending in cases where the trend really is increasing, as in Ukraine, where the model gives this appearance. The reason is that for those countries the overall mortality time trend is as the $\kappa_t$-curve shows, namely increasing.

However, when a model is fitted to a population and the $\kappa_t$-curve is decreasing for one sex and increasing for the other, it would be easier to compare the results if we “turned” one of the curves upside down. In Figure 25 we see the turned Latvian $\kappa_t$- and $b_x$-curves for males together with the original curves for the female model.

Figure 25 The male $\kappa_t$- and $b_x$-curves compared to the original $\kappa_t$- and $b_x$-curves obtained with the female model.
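The rescaling argument in Section 3.5 can be checked numerically. The following is a minimal sketch under stated assumptions: the data are synthetic (made-up numbers, not taken from the Human Mortality Database), and the parameters are estimated with an unweighted least-squares fit via singular value decomposition rather than the weighted least squares used in this paper. The function names `fit_lee_carter_svd` and `fitted` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def fit_lee_carter_svd(log_rates):
    """Unweighted least-squares fit of a_x + b_x * kappa_t via SVD.

    log_rates has shape (ages, years). Illustrative only: the paper
    itself estimates the parameters with weighted least squares.
    """
    a = log_rates.mean(axis=1)                  # a_x: average log rate per age
    centred = log_rates - a[:, None]
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    b, kappa = u[:, 0], s[0] * vt[0]            # leading rank-one component
    scale = b.sum()                             # normalise so that sum(b_x) = 1
    return a, b / scale, kappa * scale

def fitted(a, b, kappa):
    """Fitted log rates a_x + b_x * kappa_t as an (ages, years) array."""
    return a[:, None] + np.outer(b, kappa)

# Synthetic population with rising mortality (assumed values).
rng = np.random.default_rng(0)
a_true = np.linspace(-6.0, -2.0, 5)
b_true = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
kappa_true = np.linspace(-1.0, 1.0, 10)
log_rates = fitted(a_true, b_true, kappa_true) + rng.normal(0, 0.01, (5, 10))

a, b, kappa = fit_lee_carter_svd(log_rates)

# Turn the curves upside down: b_x * c and kappa_t / c with c = -1.
c = -1.0
b_flipped, kappa_flipped = b * c, kappa / c

same_fit = np.allclose(fitted(a, b, kappa), fitted(a, b_flipped, kappa_flipped))
print(same_fit, kappa[-1] > kappa[0], kappa_flipped[-1] < kappa_flipped[0])
```

With the normalisation that the $b_x$ values sum to one, the estimated $\kappa_t$-curve is ascending for this rising-mortality example, and the turned pair gives a descending curve with exactly the same fitted values. Which of the two presentations to show is therefore a matter of convention, not of fit.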
4 Fitting the Lee-Carter model excluding parts of the data

In the previous section we applied the Lee-Carter model to all available mortality data found in the Human Mortality Database, without reflecting on whether this is the best way of fitting the Lee-Carter model. One could argue that the trends might be different for different age groups and that the quality of the data for the oldest population might not be satisfactory. The model might therefore fit better if we exclude parts of the population. There could also be some time periods that we would prefer not to influence the model. Looking at the sum-of-squares residuals per age in all our fitted models, we see that for every country they increase markedly from the age of 90. The sum-of-squares residuals at ages under 30 are even higher. We will now apply the model to the Czech Republic, Latvia and Ukraine after excluding some parts of the population and some time periods.

4.1 The Czech Republic

The model fitted to the original Czech Republic mortality data shows high sum-of-squares residuals for the youngest ages. The sum-of-squares residuals in the most recent years are also comparatively high. In Figure 26 the new sum-of-squares residuals per calendar year are presented as a percentage of the residuals in the model fitted to the original data. The residuals are reduced significantly because the ages with the highest residuals are excluded. The sum-of-squares residuals per age show only a minor decrease. The female model even gets higher residuals for ages 60 and above.

Figure 26 Sum-of-squares residuals after re-fitting the model to the population of 20-79 years old. (Percentage of the original)

Figure 27 Sum-of-squares residuals per age for the original model and the new model.

We also see that the residuals that have decreased the most are those in the most recent years.
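The comparison behind Figures 26 and 27 can be sketched as follows. This is only an illustration under stated assumptions: the log rates are synthetic (deliberately noisier at the youngest and oldest ages), the fit is an unweighted rank-one least-squares fit via SVD and not the weighted least squares used in this paper, and `fit_rank_one` and `ss_residuals` are hypothetical helper names.

```python
import numpy as np

def fit_rank_one(log_rates):
    """Fitted values a_x + b_x * kappa_t from an unweighted SVD fit (illustrative)."""
    a = log_rates.mean(axis=1)
    u, s, vt = np.linalg.svd(log_rates - a[:, None], full_matrices=False)
    return a[:, None] + s[0] * np.outer(u[:, 0], vt[0])

def ss_residuals(actual, fitted):
    """Sums of squared residuals per age (rows) and per calendar year (columns)."""
    sq = (actual - fitted) ** 2
    return sq.sum(axis=1), sq.sum(axis=0)

# Synthetic log rates, noisier at the youngest and oldest ages (made-up data).
rng = np.random.default_rng(1)
ages = np.arange(0, 100, 5)
years = np.arange(1960, 2005)
b_true = np.full(ages.size, 1.0 / ages.size)
kappa_true = np.linspace(10.0, -10.0, years.size)
noise_sd = np.where((ages < 20) | (ages >= 80), 0.30, 0.05)
log_rates = (-8.0 + 0.07 * ages)[:, None] + np.outer(b_true, kappa_true)
log_rates += noise_sd[:, None] * rng.normal(size=log_rates.shape)

# Fit once on all ages and once on the restricted age band 20-79.
keep = (ages >= 20) & (ages <= 79)
age_all, year_all = ss_residuals(log_rates, fit_rank_one(log_rates))
age_sub, year_sub = ss_residuals(log_rates[keep], fit_rank_one(log_rates[keep]))

# New calendar-year residuals as a percentage of the original ones.
pct_per_year = 100.0 * year_sub / year_all
pct_total = 100.0 * year_sub.sum() / year_all.sum()
print(round(pct_total, 1))
```

In this unweighted setting the total sum of squares of the re-fitted model cannot exceed that of the original model, because the original fit restricted to the retained ages is itself a candidate solution. The percentage for an individual calendar year or age carries no such guarantee, which is consistent with some age groups showing higher residuals after re-fitting.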
In the model applied to all data the female sum-of-squares residuals are worryingly high for the most recent years, but after re-fitting the model they are much lower and more in line with the residuals in the other years. It seems more important to have small residuals in the most recent years than at the beginning of the period, because we want to capture the present trend. The systematic calendar year effect can still be seen in the male model. In Figures 28 and 29 below, the plots of the size of the residuals per calendar year are presented both for the model fitted to the original data and for the model fitted to the population aged 20-79 years. Note that the scale differs between the old and new residual plots.

Figure 28 The original plot of the residuals compared with the new in the right hand picture. Males

Figure 29 The original plot of the residuals compared with the new in the right hand picture. Females

4.2 Latvia

The Lee-Carter model is applied to the Latvian population aged 25-79 years, with the time period 1993-1995 excluded. The residuals are not extremely high for this period and the model captures it rather well, although the death rates are extreme between the years 1993 and 1999; see Figure 1. As with the re-fitted Czech Republic model, we eliminate the high sum-of-squares residual for the most recent calendar year. The result is lower sum-of-squares residuals in the female model in every age group except the 70-year-olds, who have slightly higher new residuals. In the male model we get higher residuals for age groups 50 and 55.

Figure 30 Sum-of-squares residuals after re-fitting the model to the population of 20-79 years old. (Percentage of the original)

Figure 31 Sum-of-squares residuals per age, for the original model and the new model.

The plots of the size of the residuals per calendar year are shown in Figures 32 and 33. Note that the scale differs between the old and new residual plots.
The reason for the comparison is to see what has happened to the pattern after the re-estimation. In the original model we could see that the residuals did not look random in the second half of the estimated period. In the new model this systematic effect seems to have disappeared in the male model. In the female model we did not have any systematic effect in the original model.

Figure 32 The original plot of the residuals compared with the new in the right hand picture. Males

Figure 33 The original plot of the residuals compared with the new in the right hand picture. Females

4.3 Ukraine

When re-fitting the Lee-Carter model to the Ukrainian mortality data we use the age interval 40-79 years and also exclude the calendar years 1988-1990 and 1993-1995. The sum-of-squares residuals per age decrease dramatically, most of all in the female model. The new sum-of-squares residuals are below 20% of the original for the latter part of the studied time period. This is one way of getting a better fit of the model if we think that some age groups or time periods are of minor interest to the study. For example, one reason for excluding the high and low peaks in mortality due to the economic crisis is that we may regard the crisis as an isolated event that only influenced mortality during a limited period. The comparisons of the plots of the size of the residuals in the original model and the new model are shown in Figures 36 and 37. The residuals of the re-fitted model look more random than those of the original model, where we had a clear systematic effect.

Figure 34 Sum-of-squares residuals per age, for the original model and the new model.

Figure 35 Sum-of-squares residuals after re-fitting the model to the population of 20-79 years old. (Percentage of the original)

Figure 36 The original plot of the residuals compared with the new in the right hand picture. Males

Figure 37 The original plot of the residuals compared with the new in the right hand picture. Females

5 Discussion

The model gave a comparatively good fit to both Russian and Ukrainian data, especially after excluding some parts of the data as we did in Section 4, and there is a clear trend in the $\kappa_t$ values. The question, however, is whether we can use the model. Is it reasonable to build a forecast on the $\kappa_t$ trend? Do we really think that the rates in these countries will keep on increasing forever? Is it not possible that this trend will stop and that life expectancy in Russia and Ukraine will start to improve? The conclusion is that the Lee-Carter model can be fitted to some of the studied countries with satisfactory results, although one should probably be very reluctant to use the model for forecasting mortality.

The Lee-Carter model has strengths that we have been able to see. One is that the model is able to fit the mortality rates of a population such as the Russian one despite the fluctuating mortality pattern. Another is that the model parameter $\kappa_t$ really shows the predominant trend in the population, with a descending or ascending curve.

We have found that one criterion for a good fit of the model is that the historical mortality in the population has to be homogeneous within each gender (if we fit models for females and males separately). If we get both negative and positive $b_x$ values, the population is not homogeneous and we will most likely get a systematic pattern in the size of the residuals. The Lee-Carter model has difficulty achieving a good fit when mortality has an increasing trend for some ages and a decreasing trend for others. In the Western countries where the model has been applied with good results, the overall trend has been decreasing mortality rates, probably within the whole population but at different paces. For the countries studied in this paper the model gave a good fit where the trend is either increasing overall (Russia and Ukraine) or decreasing overall (Czech Republic).
In Eastern Europe the female population is more homogeneous than the male, while in the Baltic States the male mortality development is the more homogeneous.

When forecasting mortality, one question is how long a period should be used in the projection. The period should not be too long, so as to avoid including causes of mortality improvement or deterioration whose effects are already exhausted. This is one reason why it is impossible to apply a model to the countries studied in this paper. In recent years there have been major changes in mortality due to a single cause that is unlikely to have any impact on future mortality trends. In the Baltic States the trend in mortality is very different after they became self-governing in 1991, and the mortality statistics before that date may not tell us anything about future mortality. One thing that can be seen is that the Baltic States are becoming more West-oriented by the day, and a better way of forecasting mortality in those countries could be to look at the statistics of other countries with similar trends, if they exist, and build the forecast on those.

We have applied the Lee-Carter model to nine different countries, for males and females separately, so there have been several models to analyse. Of course one could analyse each country’s models more deeply and find more explanations for the results and other underlying causes of the outcomes. In this paper we are only making a first (and second) attempt to fit the Lee-Carter model. Because we are not using the models further, it is not easy to state whether a model gives a good fit or not. How small do the residuals have to be for the fit to be good? In our case, a visual judgement often feels sufficient to draw a conclusion.

The analysis in this paper can be seen as a first study of the Lee-Carter model’s ability to fit the mortality data of the countries in Eastern Europe and the former Soviet Union.
REFERENCES

Bahr, Bengt von (2006). “Skatta parametrarna i Lee-Carters modell för dödlighet”, working paper, DUS.

Baran S., Gáll J., Ispány M., Pap G. (2006). “Forecasting Hungarian Mortality rates using the Lee-Carter method”, Acta Oeconomica, Vol 57 (1), pp. 21-34 (2007).

Booth, Heather, Maindonald, John and Smith, Len (2002). “Applying Lee-Carter under conditions of variable mortality decline”, Population Studies, 56(3), pp. 325-336.

Carter, Lawrence R. and Prskawetz, Alexia (2001). “Examining Structural Shifts in Mortality Using the Lee-Carter Method”, Max Planck Institute for Demographic Research WP 2001-007, Germany.

Coelho E. (2001). “The Lee-Carter method for forecasting mortality, the Portuguese experience”, Departamento de Estatísticas Sociais, Instituto Nacional de Estatística, Portugal.

Fígoli, Moema G. Bueno (1998). “Modelando e projectando a mortalidade no Brasil”, Revista Brasileira de Estudos de População, Vol. 15, n.º 1.

Girosi F., King G. (2005). “A Reassessment of the Lee-Carter Mortality Forecasting Method”, http://gking.harvard.edu/files/lc.pdf.

Koissi, Shapiro, Högnäs (2004). “Fitting and forecasting mortality rates for Nordic countries using the Lee-Carter model”, Proceeding of the 39th Actuarial Research Conference, ARCH.

Lee R. D., Carter L. (1992). “Modeling and Forecasting U.S. Mortality”, Journal of the American Statistical Association, September 1992, Vol 87, No. 419.

Lee, R. D. and Nault F. (1993). “Modeling and forecasting provincial mortality in Canada”, paper presented at the World Congress of the International Union for Scientific Study of Population, Montreal, 1993.

Lee, Ronald D. and Rofman, Rafael (1994). “Modelacion y Proyeccion de la Mortalidad en Chile”, NOTAS 22, no 59, pp. 182-213.

Leinsalu M. (2004). “Troubled transitions, Social variation and long-term trends in health and mortality in Estonia”, Doctoral thesis in sociology, Health Equity Studies No 2.

Meslé, France (2002). “Mortality in Eastern Europe and the former Soviet Union: long-term trends and recent upturns”, paper presented at IUSSP/MPIDR Workshop “Determinants of Diverging Trends in Mortality”.

Renshaw, A., & Haberman, S. (2003b). “Lee-Carter mortality forecasting: a parallel generalized linear modelling approach for England and Wales mortality projections”, Applied Statistics 52, 119-137.

Tuljapurkar S., Li N., Boe C. (2000). “A universal pattern of mortality decline in the G7 countries”, Nature, 405: 789-792.

Wilmoth, J. R. (1998). “Is the pace of Japanese mortality decline converging toward international trends?”, Population and Development Review 24(3), 593-600.