Documentation for Cohort Survival Excel Workbook

These worksheets were adapted from sheets created by Dr. Tim Chapin [1] at Florida State University to work with Canadian data for use in the lab section of Geography 3235: Quantitative Methods for Geographic Analysis, offered at the University of Lethbridge. The modifications were made by Chris Jackson [2] as part of an applied study under the supervision and instruction of Dr. Ian MacLachlan [3], Associate Professor of Geography at the University of Lethbridge. While the spreadsheet layout is similar to that of the Chapin model, most formulas have been modified to produce results consistent with the Cohort Component Quick Basic program described in Ottensman (1986). Calculations pertaining to incoming births were further modified to improve the accuracy of the model according to Klosterman (1990).

The intent of this document is to provide the necessary background to the “Lethbridge 65+” and “Lethbridge 85+” spreadsheet models. The “Lethbridge 65+” model uses the two preceding five-year intervals in calculating births and migration but has no data beyond the open-ended 65+ cohort. The “Lethbridge 85+” model uses only one preceding five-year period to predict migration, but has added resolution in the older five-year cohorts up to the open-ended 85+ cohort. These compromises are necessary because Canada’s 1991 census population was published with five-year cohorts from 0 to 64 years and two ten-year cohorts for 65-74 and 75-84 before the open-ended 85+ cohort. The 1996 and 2001 censuses were published with five-year cohorts all the way to the open-ended 85+ cohort. An 18-cohort model that averages migration and births over two five-year periods would require a special tabulation of the census for every region to which the model is applied. With these modifications, the models can be used for any region in Canada, provided that five-year cohort-sex structure data are available from the 1991, 1996 and 2001 censuses.
Part A of this document outlines the basic organisation of the model and the purpose of each page. Part B describes exactly how to operate the model to obtain a projection using Canadian population data.

Part A

For each of the eleven sheets, the function of the sheet, the required inputs (if any) and the outputs are described.

Census Population

Requires the input of age-sex cohort data for the 2001 and 1996 census years, with the additional 1991 census year in the ‘Lethbridge 65+’ model. These data are accessed by the sheets relating to migration, births and the projections. Input locations are highlighted in light turquoise.

Projections

No inputs are required. This sheet summarizes the model once all the other sheets have been completed. Observed data and projections for each of the six projection years are displayed both by sex and as a total for the region.

2001 – 2011, 2011 – 2021, and 2021 – 2031

No inputs are required. These sheets are where the model takes the data entered (census, migration, fertility, survival, etc.) and performs the actual projections.

Survival Summary

No inputs are required. This sheet summarizes the male and female cohort-specific survival rates as calculated on the male and female survival rates sheets.

Fertility Rates

This sheet requires the input of population and births data by cohort for the region for three consecutive years, as suggested by Klosterman (1990). The three five-year fertility rates are summed and divided by three to compute average five-year age-specific fertility rates for use in the Births sheet and in the projections. This sheet is also where the sex ratio at birth must be entered, expressed as male births per 100 female births. Inputs are highlighted in light turquoise.

Births '91-'96, '96-'01 (65+) or Births ’96 – 01 (85+)

Here the model calculates the number of births expected during the inter-censal period for use as the first cohort in the Migration sheet.
The 65+ model uses the two preceding five-year periods to estimate migration, hence two calculations for expected births: one for 1996 and one for 2001.

Migration Rates

Sex-specific migration rates are computed for use in the projections and are assumed to be constant throughout the projection period. Because the 65+ model uses the two preceding five-year periods, the migration rates for each of those periods are averaged to produce age-sex specific average five-year migration rates.

F and M Survival Rates

The survival sheets require the input of the lx column from the life table. The calculations used by Statistics Canada are followed to reproduce the published life table values. Five-year survival rates for each sex are calculated using the Tx values and the equations described in Klosterman (1990) and summarized in Table 5.2 (Klosterman 1990, 74). These survival rates are summarized on the Survival Summary sheet and used in the projections, births and migration rate calculations.

Part B

The model is described in depth, with further information on where to obtain the necessary data and how to format it for use with the model. While Part A was organized sheet by sheet in the same order as the model, Part B is arranged in the order that the user would logically follow in entering the data required for the model. Thus certain pages (i.e. summary pages with no inputs or calculations) do not have specific headings, but are mentioned in the sections they summarize.

Census Population

All population data are readily available from Statistics Canada through the E-Stat website: http://estat.statcan.ca/. To run both models, sex-specific census data by five-year cohort for the region are required for the census years 1991, 1996, and 2001. The “Lethbridge 85+” model requires census data from 1996 and 2001 only, and no further formatting of the data is required.
To run the “Lethbridge 65+” model, the census data need to be condensed into 14 cohorts, with five-year intervals from 0 to 64 years and an open-ended 65+ cohort. Total population by cohort is calculated automatically for each census year.

Male and Female Survival Rates

The purpose of these pages is to recreate the values published by Statistics Canada in the Alberta Life Tables, 1995 – 1997. To do so, the user must input the lx column from the published life tables (available online in English for Canada at http://statcan.ca/english/freepub/84-537-XIE/tables.htm). These tables should be copied into an intermediate worksheet (available in the data workbook) from Adobe Acrobat (or another PDF reader) using either the ‘select text’ or the ‘select table’ tool.

1. If the ‘select text’ tool is used, the entire table can be copied and pasted into the blank Excel book. The data then need to be split into usable columns by selecting the entire column that contains the data and clicking Data > Text to Columns to open the Text to Columns Wizard. At the first step, select ‘Delimited’, as the data are not suited to ‘Fixed Width’ columns, and click Next. Select ‘Space’ as the delimiter and click Finish. There should now be a separate column for each data field, but two columns for the year because of the space between ‘0’ and ‘Years’.

2. If the ‘select table’ tool is used, each page must be copied one at a time. After highlighting and copying a page, select the destination cell in Excel, right-click and choose ‘Paste Special’. In the dialog that opens, highlight the ‘CSV’ option and click OK. The table should appear just as it does in the original life table, but these steps need to be repeated for each page of the life table.

After the life table has been copied into the intermediate worksheet, there may be some unnecessary data present, such as extra column headers where each page separator would have been in the original document.
Delete the rows without data, and then highlight only the lx column of the life table. Right-click on the highlighted column and select ‘Format Cells’; format these cells as ‘Number’ with zero decimal places, if Excel has not done so automatically. The lx column should now be ready for copying into the model workbook.

Note that the lx column of the model extends down to the 109th year, even though some life tables (e.g. the Alberta Male Life Table) may not have data for these later years. For the Tx column to be calculated properly, it is crucial that any empty cells at the bottom of the lx column (i.e. those not filled with life table data) are assigned a value of zero.

In calculating Lx, Statistics Canada applies a separation factor (Fx) to account for the unequal distribution of deaths in the youngest cohorts. This factor is necessary because there is a higher risk of death during the earlier part of each of the first four cohorts, coupled with a reduced risk as the interval progresses. The factor is derived from the number of individuals who died in year x before their birthday and the number of individuals in year x who died after their birthday. For ages five and over, the factor is set to 0.500, under the assumption that deaths are then distributed evenly throughout the interval (see footnote 2). Using this separation factor produces Lx values identical to those published by Statistics Canada, apart from rounding error. Having the model recreate the life table allows the user to check that the model is working correctly by comparing the calculated life table values with the published life table values.

The Tx values calculated (using the adjusted Lx) are then used to calculate five-year survival rates as shown in Table 5.2 of Klosterman (1990, 74). Tx values for each cohort number (e.g. 0-1 for cohort 1, 5-6 for cohort 2 and so on) are reported in column T.
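As a rough illustration of how the Lx and Tx columns follow from lx, the Python sketch below pads the lx column with zeros, applies a separation factor, and accumulates Tx from the bottom of the table. It is not the workbook's formula: the function name, the `separation` dictionary and the form Lx = l(x+1) + Fx * (lx - l(x+1)) are assumptions made for illustration only, and the exact Statistics Canada calculation should be taken from the Life Table Methods document.

```python
def life_table_columns(lx, separation, last_age=109):
    """Return (Lx, Tx) lists built from an lx column.

    lx         -- survivors at each exact age, starting at age 0
    separation -- hypothetical dict {age: Fx} for the youngest ages;
                  every age not listed uses 0.5
    last_age   -- final age row of the table (109 in the workbook)
    """
    # Pad missing rows at the bottom with zeros, as the text requires.
    lx = list(lx) + [0] * (last_age + 1 - len(lx))

    Lx = []
    for x in range(len(lx)):
        next_l = lx[x + 1] if x + 1 < len(lx) else 0
        f = separation.get(x, 0.5)
        # Assumed form: survivors to x+1 live the whole year, and those
        # dying between x and x+1 live a fraction f of it on average.
        Lx.append(next_l + f * (lx[x] - next_l))

    # Tx is the sum of Lx from age x to the end of the table.
    Tx = [0.0] * len(Lx)
    running = 0.0
    for x in reversed(range(len(Lx))):
        running += Lx[x]
        Tx[x] = running
    return Lx, Tx
```

Comparing the resulting Lx and Tx values against the published life table is the same check the workbook is designed to support.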
For cohort 0, Dx equals the stationary population (100,000) multiplied by the length of a cohort (five years). For the rest of the cohorts, Dx for cohort n equals the difference between Tx n and Tx n+1. Survival rates for each cohort, with the exception of the final open-ended cohort, are equal to Dx n+1 divided by Dx n, that is, the stationary population that has survived into cohort n+1 divided by the population in cohort n. Five-year age-sex specific survival rates are summarized on the ‘Survival Summary’ tab of the workbook.

Footnote 2: The separation factor is described in detail in Life Table Methods, available online in PDF format at http://statcan.ca/english/freepub/84-537-XIE/method.pdf (p. 18 and Appendix 1, p. 67).

Fertility Rates

To calculate five-year fertility rates, the user must input the female population by five-year cohort and the number of births for each cohort. As noted in Klosterman (1990, 82), the variability of fertility estimates for small regions can be reduced by using the average number of births for three consecutive years, ideally centered on a census year. For use with Lethbridge data, average births can only be determined for the three years prior to the 2001 census year because Alberta Vital Statistics Annual Reviews have only been published up to 2000. Because these fertility data are not available for the census year, and because these data are not published by Statistics Canada, population for the years for which births data are available is interpolated from the 1996 and 2001 cohort populations.

To calculate females in cohort n, the difference between the population of cohort n in 1996 and in 2001 is first calculated. This difference is then spread evenly over the five intervening years. In 1998, for example, cohort n is assumed to equal the 1996 population plus 40% of the difference between 1996 and 2001, because 1998 is two-fifths (40%) of the way from 1996 to 2001.
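The interpolation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the workbook's cell formulas; the function name and the sample figures are hypothetical.

```python
def interpolate_population(pop_1996, pop_2001, year):
    """Linearly interpolate cohort populations between the two censuses.

    pop_1996, pop_2001 -- female population by cohort, in the same order
    year               -- a year from 1996 to 2001 inclusive
    """
    # Share of the five-year gap that has elapsed, e.g. 0.4 for 1998.
    fraction = (year - 1996) / 5
    return [p96 + fraction * (p01 - p96)
            for p96, p01 in zip(pop_1996, pop_2001)]


# Hypothetical cohort counts, for illustration only.
females_1998 = interpolate_population([1000, 2400], [1100, 2300], 1998)
```

With these made-up inputs, the first cohort gains 40% of its 100-person increase and the second loses 40% of its 100-person decrease.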
These population estimates are then used to calculate the number of live births per 1000 women (births / population * 1000), one-year fertility rates (births / population), and five-year fertility rates (one-year fertility rate * 5). The three five-year fertility rates calculated are averaged to produce average five-year fertility rates for use in the model. The model requires the input of three years' worth of population and births data, and computes and summarizes the age-specific average five-year fertility rates on the ‘Fertility Rates’ worksheet. The ‘Fertility Rates’ worksheet also requires the input of the region’s sex ratio at birth, expressed as the number of male births per 100 female births, for use on other pages relating to births. The average sex ratio at birth for Alberta from 1991 to 2001 may be calculated from Births by sex by Province, available from CANSIM table 051-0013.

Births

In calculating the number of births expected during each five-year interval, the age-specific fertility rate is applied to the average female population by age cohort. An average population is required because the number of women in each cohort can change dramatically over the five-year period due to mortality or migration (Klosterman 1990, 59). The total projected births over the five-year period are separated into expected males and females using the sex ratio at birth, and then subjected to the sex-specific survival rate for individuals entering year 0, as presented on the ‘Survival Summary’ sheet.

Migration Rates

The residual method of determining net migration is used in the calculation of age-sex specific migration rates. Net migration is assumed to be the residual between the expected number of survivors into a cohort and the actual number of cohort members as determined by a census count.
Net migration, then, is equal to the actual population of cohort n at the present time t minus the expected number of survivors into cohort n, which is the population of cohort n-1 at time t-1 multiplied by the survival rate of cohort n-1. Migration for the first cohort is estimated using the projected number of births as described above and the 0 to 4 year survival rate. For the open-ended cohort, the expected survivors available to migrate are the sum of the last two cohorts, each multiplied by its respective survival rate. To illustrate with the 85+ cohort in the “Lethbridge 85+” model, the expected survivors in that cohort in 2001 are calculated as the sum of the 80-84 cohort population multiplied by the cohort 17 survival rate and the 85+ cohort population multiplied by the cohort 18 survival rate.

Migration rates are calculated by dividing the number of net migrants by the total population in that age range that is available to migrate. For instance, the cohort 4 migration rate is equal to the number of net migrants in time t divided by the population in cohort 4 in time t-1. For the “Lethbridge 65+” model, the migration rates from ’91 to ’96 are averaged with the rates from ’96 to ’01 to produce the average age-specific migration rates used for the projections.

Five Year Projections

The ten-year ranges on the three projection tabs represent the years covered: the 2001 – 2011 tab uses population data from 2001 to 2011, with data from earlier years used to calculate the birth and migration rates. Each tab contains two five-year projections. To project cohort n in time t+1, the population of cohort n-1 in time t is multiplied by the survival rate for cohort n-1. To these survivors are added the projected migrants, the product of the cohort n migration rate and the population of cohort n at time t.

The number of births is calculated for each time period using the average age-specific female population from time t to t+1.
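The residual migration rate and the survival-plus-migration step for the ordinary five-year cohorts, as described above, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the workbook's formulas: the function names and sample figures are hypothetical, lists are indexed from 0 so that index 0 is the youngest cohort, and the first cohort (which depends on births) and the open-ended cohort are deliberately left out.

```python
def residual_migration_rates(pop_prev, pop_now, survival):
    """Net migration rates for the interior five-year cohorts.

    pop_prev -- population by cohort at time t-1
    pop_now  -- population by cohort at time t (census count)
    survival -- survival[n] is the rate of surviving from cohort n to n+1
    """
    rates = {}
    for n in range(1, len(pop_now) - 1):
        expected = pop_prev[n - 1] * survival[n - 1]  # expected survivors
        net_migrants = pop_now[n] - expected          # the residual
        rates[n] = net_migrants / pop_prev[n]         # per person at t-1
    return rates


def project_interior(pop_now, survival, rates):
    """Project the interior cohorts one five-year step ahead."""
    projected = {}
    for n, rate in rates.items():
        survivors = pop_now[n - 1] * survival[n - 1]
        migrants = rate * pop_now[n]
        projected[n] = survivors + migrants
    return projected


# Hypothetical figures for four cohorts, for illustration only.
pop_t_minus_1 = [1000, 1000, 800, 700]
pop_t = [1100, 1040, 910, 760]
surv = [0.99, 0.95, 0.90, 0.50]

rates = residual_migration_rates(pop_t_minus_1, pop_t, surv)
next_period = project_interior(pop_t, surv, rates)
```

In the 65+ model the rates from the two inter-censal periods would be averaged before the projection step; that averaging is omitted here to keep the sketch short.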
The average number of females in each cohort is then multiplied by the age-specific fertility rate, and the resulting births are summed across cohorts. Expected males and females are calculated using the sex ratio at birth from the ‘Fertility Rates’ sheet, and then multiplied by the sex-specific year 0 survival rate to give the number of survivors into the 0 to 4 year cohort.

For the open-ended cohorts, expected survivors include those surviving into the open cohort from a five-year cohort and those previously in the open-ended cohort who survive into the projected time period. Thus, the 85+ survivors in time t+1 include the survivors from the 80-84 year cohort and the 85+ cohort in time t. These survivors are then multiplied by the migration rate for the open-ended cohort to find the projected population of the open-ended cohort in t+1. The projected population is then used as the base population to project the next (t+2) period.

Other Notes

The next step for this model would be to implement macros that let the user specify whether or not to include the migration component, giving the model this added flexibility. Such a macro would prompt the user to select either no migration or the calculated migration rates, and would supply either a series of zeroes or the migration rates to the reference cells for the projection sheets.

Bibliography

Klosterman, R. E. 1990. Community Analysis and Planning Techniques. Savage, MD: Rowman & Littlefield Publishers, Inc.
Ottensman, J. R. 1986. BASIC Microcomputer Programs for Urban Analysis and Planning. New York: Chapman and Hall.

[1] Dr. Tim Chapin: information and original model available at http://garnet.acns.fsu.edu/~tchapin/urp5261/exercise/Generic%20CC%20Model.xls
[2] Chris Jackson, contact: chris.jackson@uleth.ca
[3] Dr. Ian MacLachlan, contact: maclachlan@uleth.ca