Documentation for Cohort Survival Excel Workbook

These worksheets were adapted from sheets created by Dr. Tim Chapin [1] at Florida State
University to work with Canadian data for use in the lab section of Geography 3235:
Quantitative Methods for Geographic Analysis offered at the University of Lethbridge.
The modifications were made by Chris Jackson [2] as part of an applied study under the
supervision and instruction of Dr. Ian MacLachlan [3], Associate Professor of Geography at
the University of Lethbridge.
While spreadsheet layout is similar to that of the Chapin model, most formulas
have been modified to produce results consistent with the Cohort Component Quick
Basic program described in Ottensman (1986). Calculations pertaining to incoming
births were further modified to improve the accuracy of the model according to
Klosterman (1990).
The intent of this document is to provide the necessary background to the
“Lethbridge 65+” and “Lethbridge 85+” spreadsheet models. The “Lethbridge 65+”
model uses two preceding five-year intervals in calculating births and migration but has
no data beyond the open-ended 65+ cohort. The “Lethbridge 85+” model uses only one
preceding five-year period to predict migration, but has added resolution in the older
five-year cohorts, up to the open-ended 85+ cohort. These compromises are necessary because
Canada’s 1991 census population was published with five-year cohorts from 0 to 64
years and two ten-year cohorts for 65-74 and 75-84 before the open-ended 85+ cohort.
The 1996 and 2001 censuses were published with five-year cohorts all the way to the
open-ended 85+ cohort.
To facilitate an 18 cohort model with two five year periods used to average
migration and births, a special tabulation of the census would be required for every
region the model is applied to. With these modifications, these models can be used for
any region in Canada provided that five-year cohort-sex structure data are available from
the 1991, 1996 and 2001 censuses.
Part A of this document outlines the basic organisation of the model and the
purpose of each page. Part B describes exactly how to operate the model to obtain a
projection using Canadian population data.
Part A
For each of the eleven sheets, the function of the sheet, the required inputs (if any) and
the outputs will be described.
Census Population
Requires the input of age-sex cohort data for the 2001 and 1996 census years, with the
additional 1991 census year in the ‘Lethbridge 65+’ model. This data will be accessed by
sheets relating to migration, births and the projections. Input locations are highlighted
with light turquoise.
Projections
No inputs are required. This sheet summarizes the model once all the other sheets have
been completed. Observed data and projections for each of the six projection years are
displayed both by sex and as a total for the region.
2001 – 2011, 2011 – 2021, and 2021 – 2031
No inputs required. These sheets take the data entered (census, migration, fertility,
survival, etc.) and perform the actual projections.
Survival Summary
No inputs required. This sheet summarizes the male and female cohort specific survival
rates as calculated on the male and female survival rates sheets.
Fertility Rates
This sheet requires the input of population and births data by cohort for the region for
three consecutive years, as suggested by Klosterman (1990). The five year fertility rates
will be summed and divided by three to compute average five year age-specific fertility
rates for use in the Births sheet and in the projections. This sheet is also where the sex at
birth ratio must be entered, expressed as males per 100 female births. Inputs are
highlighted in light turquoise.
Births ’91-’96, ’96-’01 (65+) or Births ’96-’01 (85+)
Here the model calculates the number of births expected during the inter-censal period
for use as the first cohort in the Migration sheet. The 65+ model uses two preceding
five-year periods to estimate migration, hence two calculations of expected births: one
for the period ending in 1996 and one for the period ending in 2001.
Migration Rates
Age-sex specific migration rates are computed for use in the projections and are assumed
to remain constant throughout the projection period. Because the 65+ model uses the two
preceding five-year periods, the migration rates for the two periods are averaged to
produce age-sex specific average five-year migration rates.
F and M Survival Rates
The survival sheets require the input of the lx column from the life table. The
calculations used by Statistics Canada are followed to reproduce the published life table
values. Five year survival rates for each sex are calculated using the Tx values and the
equations described in Klosterman (1990) and summarized in Table 5.2 (Klosterman
1990, 74). These survival rates are summarized on the Survival Summary sheet and used
in the projections, births and migration rate calculations.
Part B
The model is described in-depth, with further information on where to obtain the
necessary data and how to format it for use with the model. While part A was organized
sheet-by-sheet in the same order as the model, part B is arranged in the order that the user
would logically follow in entering the data required for the model. Thus certain pages
(i.e. summary pages with no inputs or calculations) will not have specific headings, but
will be mentioned in the sections they summarize.
Census Population
All population data are readily available from Statistics Canada through the E-Stat
website: http://estat.statcan.ca/. To run both models, sex-specific census data by
five-year cohort for the region are required for the census years 1991, 1996, and 2001.
The “Lethbridge 85+” model requires census data from 1996 and 2001 only, and no further
formatting of the data is needed. To run the “Lethbridge 65+” model, the census data need
to be condensed into 14 cohorts, with five-year intervals from 0 to 64 years and an
open-ended 65+ cohort. Total population by cohort is calculated automatically for each
census year.
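The condensing step for the “Lethbridge 65+” model can be sketched outside Excel as
follows. This is only an illustration under stated assumptions: the function name and the
sample counts are hypothetical, and the input is assumed to be in the 18-cohort layout of
the 1996 and 2001 censuses (0-4 through 80-84, then 85+).

```python
def collapse_to_65_plus(cohorts):
    """Collapse 18 cohorts (0-4, ..., 80-84, 85+) into 14 (0-4, ..., 60-64, 65+)."""
    if len(cohorts) != 18:
        raise ValueError("expected 18 cohorts, 0-4 through 85+")
    closed = list(cohorts[:13])      # 0-4 through 60-64 are kept as they are
    open_ended = sum(cohorts[13:])   # 65-69, 70-74, 75-79, 80-84 and 85+ are summed
    return closed + [open_ended]


# Hypothetical counts, for illustration only (not Lethbridge data).
males_18 = [100] * 13 + [50, 40, 30, 20, 10]
females_18 = [110] * 13 + [60, 50, 40, 30, 20]

males_14 = collapse_to_65_plus(males_18)
females_14 = collapse_to_65_plus(females_18)

# Total population by cohort, which the sheet computes automatically.
total_14 = [m + f for m, f in zip(males_14, females_14)]
```

The 1991 census uses a different layout for the older ages (65-74, 75-84, 85+), so its
last three cohorts would be summed instead of the last five.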
Male and Female Survival Rates
The purpose of these pages is to recreate the values published by Statistics Canada in the
Alberta Life Tables, 1995 – 1997. To do so, the user must input the lx column from the
published life tables [1]. These tables should be copied into an intermediate worksheet
(available in the data workbook) from Adobe Acrobat (or another PDF reader) by using
either the ‘select text’ or the ‘select table’ tool.
1. If the ‘select text’ tool is used, the entire table can be copied and pasted into the
blank Excel book. The data will need to be formatted into usable columns by
selecting the entire column that contains the data and clicking Data > Text to
Columns to open the Text to Columns Wizard. At the first step, select ‘Delimited’,
as the data are not suited for ‘Fixed Width’ columns; click next. Select ‘Space’ as
the delimiter and click finish. There should be separate columns for the data, but
two columns for the year because of the space between ‘0’ and ‘Years’.
2. If the ‘Select Table’ tool is used, each page must be copied one at a time. After
highlighting and copying a page, select the destination cell in Excel, right-click and
‘Paste Special’. In the dialog that opens, highlight the ‘CSV’ option and click OK.
The table should appear just as it does in the original life table, but these steps need
to be repeated for each page of the life table.
[1] Available online in English for Canada at http://statcan.ca/english/freepub/84-537-XIE/tables.htm
After the life table has been copied into the intermediate worksheet, there may be
some unnecessary data present, such as extra column headers where each page separator
would have been in the original document. Delete the rows without data, and then
highlight only the lx column of the life table. Right-click on the highlighted column and
select ‘Format Cells’; format these cells as ‘numbers’ with zero decimal places, if not
done automatically by Excel. The lx column should now be ready for copying into the
model workbook.
It should be noted that the lx column of the model extends down to the 109th year,
even though some life tables (e.g. the Alberta Male Life Table) may not have data for
these later years. For the Tx column to be calculated properly, it is crucial that any empty
cells at the bottom of the lx column (i.e. those not filled with life table data) are
assigned a value of zero.
In calculating Lx, Statistics Canada applies a separation factor (Fx) to account for
the unequal distribution of deaths in the youngest cohorts. This factor is necessary
because there is a higher risk of death during the earlier part of the first four cohorts,
coupled with a reduced risk as the interval progresses. The factor is derived from the
number of individuals who died in year x before their birthday and the number of
individuals in year x who died after their birthday. For ages five and over, the factor is
set to 0.500, under the assumption that deaths are then distributed evenly throughout
the interval [2]. Using this separation factor produces Lx values identical to those
published by Statistics Canada, apart from rounding error.
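As a rough sketch of the life table columns described above, the code below pads lx with
zeros, computes Lx from lx and a separation factor, and accumulates Tx. The formula
Lx = l(x+1) + Fx * (lx - l(x+1)) is an assumption about how the factor is applied (Fx read
as the average fraction of the year lived by those who die at age x); the Statistics Canada
methods document should be checked for the exact convention. The function names are
hypothetical.

```python
def pad_lx(lx, last_age=109):
    """Extend lx with zeros so that it covers ages 0 through last_age."""
    return list(lx) + [0] * (last_age + 1 - len(lx))


def person_years(lx, fx):
    """Lx by single year of age, using the assumed separation-factor formula."""
    result = []
    for age in range(len(lx)):
        # Nobody is alive beyond the last row of the table.
        next_lx = lx[age + 1] if age + 1 < len(lx) else 0
        deaths = lx[age] - next_lx
        result.append(next_lx + fx[age] * deaths)
    return result


def total_years_remaining(Lx):
    """Tx: the sum of Lx from age x to the end of the table."""
    Tx = [0.0] * len(Lx)
    running = 0.0
    for age in reversed(range(len(Lx))):
        running += Lx[age]
        Tx[age] = running
    return Tx


# Tiny made-up table, for illustration only.
lx = pad_lx([100, 80], last_age=3)   # ages 2 and 3 are filled with zeros
fx = [0.25, 0.5, 0.5, 0.5]           # hypothetical separation factors
Lx = person_years(lx, fx)
Tx = total_years_remaining(Lx)
```

With Fx = 0.5 the formula reduces to the simple average of lx and l(x+1), which matches the
even distribution of deaths assumed for ages five and over.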
Having the model recreate the life table allows the user to check that the model is
working correctly by comparing the calculated life table values with the published life
table values.
The Tx values calculated (using the adjusted Lx) are then used to calculate five-year
survival rates as shown in Table 5.2 of Klosterman (1990, 74). Tx values for each
cohort number (e.g. 0-1 for cohort 1, 5-6 for cohort 2 and so on) are reported in column
T. For cohort 0, Dx values equal the stationary population (100,000) multiplied by the
length of a cohort (five years). For the rest of the cohorts, Dx for cohort n equals the
difference between Tx n and Tx n+1. Survival rates for each cohort, with the exception of
the final open-ended cohort, are equal to Dx n+1 divided by Dx n, or the stationary
population that has survived into cohort n+1 divided by the population in cohort n.
Five-year age-sex specific survival rates are summarized on the ‘Survival Summary’ tab of
the workbook.
[2] Separation factor described in detail in Life Table Methods, available online in PDF
format at http://statcan.ca/english/freepub/84-537-XIE/method.pdf, p. 18 and Appendix 1, p. 67
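The survival rate calculation can be sketched as below. It assumes that Tx is indexed by
single year of age, that cohort 0 stands for births (so its Dx is 100,000 × 5), and that
cohort n covers ages 5(n-1) to 5n-1. Only transitions between closed cohorts are covered;
the rates involving the open-ended cohort are left out because their formula is not spelled
out here. The function name and the toy numbers are hypothetical.

```python
def five_year_survival_rates(Tx, radix=100_000, width=5, n_closed=17):
    """Return rates[n] = D[n+1] / D[n] for n = 0 .. n_closed - 1.

    rates[0] is the survival of births into cohort 1 (ages 0-4);
    rates[n] is the survival of cohort n into cohort n + 1.
    """
    D = [radix * width]                    # cohort 0: births over five years
    for n in range(1, n_closed + 1):
        start_age = (n - 1) * width
        D.append(Tx[start_age] - Tx[start_age + width])
    return [D[n + 1] / D[n] for n in range(n_closed)]


# Toy example with a radix of 100 and two closed cohorts (made-up Tx values).
toy_Tx = [0.0] * 11
toy_Tx[0], toy_Tx[5], toy_Tx[10] = 1400.0, 910.0, 430.0
toy_rates = five_year_survival_rates(toy_Tx, radix=100, width=5, n_closed=2)
```

With the default arguments, Tx must extend at least to age 85 so that the 80-84 cohort can
be formed.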
Fertility Rates
To calculate five year fertility rates, the user must input the female population by five
year cohort, and the number of births for each cohort. As noted in Klosterman (1990,
82), the variability of fertility estimates for small regions can be reduced by using the
average number of births for three consecutive years, ideally centered on a census year.
For use with Lethbridge data, average births can only be determined for three years prior
to the census year 2001 because Alberta Vital Statistics Annual Reviews have only been
published up to 2000.
Because births data are not available for the 2001 census year, and because population
by cohort is not published by Statistics Canada for the years between censuses, the
female population for each year with births data is interpolated from the 1996 and 2001
cohort populations. To estimate the females in cohort n, the difference between the
population of cohort n in 1996 and in 2001 is first calculated. This difference is then
spread evenly over the five intervening years. In 1998, for example, cohort n is assumed
to equal the 1996 population plus 40% of the difference between 1996 and 2001, because
1998 is two-fifths (40%) of the way from 1996 to 2001.
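The interpolation can be written as a short sketch. The function name and the sample
populations are hypothetical; the arithmetic follows the 1998 example in the text.

```python
def interpolate_population(pop_1996, pop_2001, year):
    """Linear interpolation of a cohort's population between the two censuses."""
    if not 1996 <= year <= 2001:
        raise ValueError("year must lie between 1996 and 2001")
    fraction = (year - 1996) / 5           # 1998 gives 2/5 = 0.4
    return pop_1996 + fraction * (pop_2001 - pop_1996)


# Hypothetical cohort of 1000 women in 1996 and 1100 in 2001.
women_1998 = interpolate_population(1000, 1100, 1998)
```

For these made-up figures the 1998 estimate is the 1996 count plus 40% of the increase of
100, or about 1040 women.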
These population estimates will then be used to calculate the number of live births
per 1000 women (births / population * 1000), one year fertility rates (births / population),
and five year fertility rates (one year fertility * 5). The three five year fertility rates
calculated will be averaged to produce average five year fertility rates for use in the
model.
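A minimal sketch of these rate calculations for a single cohort follows. The function
names and the sample figures are hypothetical and are not taken from the workbook.

```python
def births_per_1000_women(births, women):
    """Live births per 1000 women in the cohort."""
    return births / women * 1000


def five_year_rate(births, women):
    """One-year fertility rate scaled to a five-year rate."""
    one_year_rate = births / women
    return one_year_rate * 5


def average_five_year_rate(births_by_year, women_by_year):
    """Average of the five-year rates over the years supplied (three in the model)."""
    rates = [five_year_rate(b, w) for b, w in zip(births_by_year, women_by_year)]
    return sum(rates) / len(rates)


# Hypothetical births and interpolated female populations for three years.
births = [50, 60, 40]
women = [1000, 1000, 1000]
average_rate = average_five_year_rate(births, women)
```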
The model requires the input of three years’ worth of population and births data,
and will compute and summarize the age-specific average five year fertility rates on the
‘Fertility Rates’ worksheet. The ‘Fertility Rates’ worksheet also requires the input of the
region’s sex ratio at birth, expressed as number of male births per 100 female births, for
use on other pages relating to births. Average sex at birth ratio for Alberta from 1991 to
2001 may be calculated from Births by sex by Province, available from CANSIM table
051-0013.
Births
In calculating the number of births expected from each five-year female cohort, the
age-specific fertility rate is applied to the average population of the cohort. An average
population is required because the number of women in each cohort changes dramatically
over the five-year period due to mortality or migration (Klosterman 1990, 59). The total
projected births over the five-year period are separated into expected males and females
using the sex ratio at birth, and are then subjected to the sex-specific survival rate for
individuals entering year 0, as presented on the ‘Survival Summary’ sheet.
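The births calculation might be sketched as follows. The average population is taken
here as the mean of the start-of-period and end-of-period counts, which is an assumption
about how the average is formed; the function names and sample values are hypothetical.

```python
def expected_births(fertility_rates, women_start, women_end):
    """Sum of five-year fertility rate times average female population, by cohort."""
    total = 0.0
    for rate, start, end in zip(fertility_rates, women_start, women_end):
        average_women = (start + end) / 2    # assumed: mean of the two end points
        total += rate * average_women
    return total


def split_by_sex(total_births, males_per_100_females):
    """Split births using a sex ratio expressed as males per 100 female births."""
    male_share = males_per_100_females / (males_per_100_females + 100)
    male_births = total_births * male_share
    return male_births, total_births - male_births


def surviving_births(total_births, males_per_100_females, male_survival, female_survival):
    """Apply the sex-specific survival rates for individuals entering year 0."""
    male_births, female_births = split_by_sex(total_births, males_per_100_females)
    return male_births * male_survival, female_births * female_survival


# Hypothetical two-cohort example.
total = expected_births([0.5, 0.25], [100, 200], [120, 200])
males, females = split_by_sex(total, 105)
```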
Migration Rates
The residual method of determining net migration is used in the calculation of age-sex
specific migration rates. Net migration is assumed to be the residual between the
expected number of survivors into a cohort and the actual number of cohort members as
determined by a census count. Net migration, then, is equal to the actual population of
cohort n at the present time t minus the expected number of survivors into cohort n, where
the expected survivors are the population of cohort n-1 at time t-1 multiplied by the
survival rate of cohort n-1.
Migration for the first cohort is estimated using the projected number of births as
described above and the 0 to 4 year survival rate. For the open-ended cohort, the
expected survivors available to migrate are represented by the sum of the last two cohorts
multiplied by their respective survival rates. To illustrate with the 85+ cohort in the
“Lethbridge 85+” model, the expected survivors in that cohort in 2001 are calculated as
the sum of the 1996 80-84 cohort population multiplied by the cohort 17 survival rate and
the 1996 85+ cohort population multiplied by the cohort 18 survival rate.
Migration rates are calculated by dividing the number of net migrants by total
population in that age range that is available to migrate. For instance, cohort 4 migration
rate is equal to the number of net migrants in time t divided by the population in cohort 4
in time t-1.
For the “Lethbridge 65+” model, the migration rates from ’91 to ’96 are averaged
with the rates from ’96 to ’01 to produce average age-specific migration rates used for the
projections.
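The residual method described in this section can be sketched for one sex as below. Lists
are indexed from zero, so index 0 is the first cohort (ages 0-4) and the last index is the
open-ended cohort. Dividing every cohort's net migrants by that same cohort's population at
time t-1 follows the cohort 4 example; applying the same denominator to the first and
open-ended cohorts is an assumption. All names and numbers are hypothetical.

```python
def residual_migration_rates(pop_prev, pop_now, survival, surviving_births):
    """Net migration rates by cohort for one sex.

    pop_prev, pop_now -- census counts at t-1 and t, same cohort order
    survival          -- survival[i] is the rate applied to pop_prev[i]
    surviving_births  -- expected survivors into the first cohort at time t
    """
    last = len(pop_now) - 1
    rates = []
    for i in range(len(pop_now)):
        if i == 0:
            expected = surviving_births
        elif i == last:
            # The open-ended cohort receives survivors from the last two cohorts.
            expected = (pop_prev[i - 1] * survival[i - 1]
                        + pop_prev[i] * survival[i])
        else:
            expected = pop_prev[i - 1] * survival[i - 1]
        net_migrants = pop_now[i] - expected
        rates.append(net_migrants / pop_prev[i])
    return rates


def average_rates(rates_a, rates_b):
    """Average two sets of rates, as the 65+ model does for its two periods."""
    return [(a + b) / 2 for a, b in zip(rates_a, rates_b)]


# Made-up three-cohort example.
example_rates = residual_migration_rates(
    pop_prev=[100, 100, 100],
    pop_now=[110, 95, 150],
    survival=[0.9, 0.8, 0.5],
    surviving_births=100,
)
```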
Five Year Projections
The ten-year ranges on the three projection tabs represent the years from which data are
used: the 2001 – 2011 tab uses population data from 2001 to 2011, with data from earlier
years used to calculate the birth and migration rates. Each tab contains two five-year
projections.
To project cohort n at time t+1, the population of cohort n-1 at time t is
multiplied by the survival rate for cohort n-1. To these survivors are added the projected
migrants, which are the product of the cohort n migration rate and the population of
cohort n at time t.
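One projection step for the closed cohorts might look like the sketch below. It covers
only the cohorts between the first (which is filled from births) and the open-ended one,
and it uses zero-based list indices. The function name and the sample values are
hypothetical.

```python
def project_closed_cohorts(pop_t, survival, migration):
    """Project the cohorts between the first and the open-ended one to time t+1.

    pop_t     -- population by cohort at time t (last entry is open-ended)
    survival  -- survival[i] is the rate applied to pop_t[i]
    migration -- migration[i] is the net migration rate for cohort i
    Returns projected counts for list indices 1 .. len(pop_t) - 2.
    """
    projected = []
    for i in range(1, len(pop_t) - 1):
        survivors = pop_t[i - 1] * survival[i - 1]   # aged forward from the cohort below
        migrants = migration[i] * pop_t[i]           # rate applied to cohort i at time t
        projected.append(survivors + migrants)
    return projected


# Made-up four-cohort example: the two middle cohorts are projected.
example_projection = project_closed_cohorts(
    pop_t=[100, 200, 300, 50],
    survival=[0.9, 0.8, 0.7, 0.4],
    migration=[0.0, 0.1, -0.1, 0.0],
)
```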
The number of births is calculated for each time period using the average age-specific
female population from time t to t+1. The average number of females in each
cohort is multiplied by the age-specific fertility rate, and the resulting births are
summed across cohorts. Expected males and females are calculated using the sex ratio at
birth from the ‘Fertility Rates’ sheet, and then multiplied by the sex-specific year 0
survival rate to give the number of survivors into the 0 to 4 year cohort.
For the open-ended cohorts, expected survivors include those surviving into the
open cohort from a five-year cohort and those who were previously in the open-ended
cohort and survive into the projected time period. Thus, the 85+ survivors at time t+1
include the survivors from the 80-84 year cohort and from the 85+ cohort at time t. These
survivors will then be multiplied by the migration rate for the open-ended cohort to find
the projected population of the open-ended cohort in t+1. The projected population is
then used as the base population for projecting the next (t+2) period.
Other Notes
The next step for this model would be to implement macros that let the user choose
whether or not to include the migration component, adding this dimension of flexibility
to the model. The macro would prompt the user to select either no migration or the
calculation of migration rates, and would supply either a series of zeroes or the
migration rates to the reference cells for the projection sheets.
Bibliography
Klosterman, R. E. 1990. Community Analysis and Planning Techniques. Savage, MD:
Rowman & Littlefield Publishers, Inc.
Ottensman, J.R. 1986. BASIC Microcomputer Programs for Urban Analysis and
Planning. New York: Chapman and Hall.
[1] Dr. Tim Chapin: information and original model available at
http://garnet.acns.fsu.edu/~tchapin/urp5261/exercise/Generic%20CC%20Model.xls
[2] Chris Jackson, contact: chris.jackson@uleth.ca
[3] Dr. Ian MacLachlan, contact: maclachlan@uleth.ca