MAXIMUM UTILIZATION OF THE LIFE TABLE METHOD IN ANALYZING SURVIVAL

SIDNEY J. CUTLER, M.A., AND FRED EDERER, B.S.
BETHESDA, MD.

(Received for publication Aug. 18, 1958)

MEASUREMENT of patient survival is necessary for the evaluation of treatment of usually fatal chronic diseases. This is particularly true for cancer. The American College of Surgeons, recognizing this, requires the maintenance of a cancer case registration and follow-up program for approval of a hospital cancer program (ref. 1). Acceptance of survival as a criterion for measuring the effectiveness of cancer therapy is also attested to by the very large number of papers published every year reporting on the survival experience of cancer patients.

Although the proportion of patients alive 5 years after diagnosis (5-year survival rate) is the most frequently used index for measuring the efficacy of therapy in cancer, an increasing number of investigators are reporting on the manner in which patient populations are depleted during a period of time, e.g., survival curves. A popular and relatively simple technique for describing survival experience over time is known as the actuarial or life table method. Whereas the method and its uses have been admirably described by a number of authors (refs. 2-6), one important aspect has received relatively little attention. A principal advantage of the life table method is that it makes possible the use of all survival information accumulated up to the closing date of the study. Thus, in computing a 5-year survival rate one need not restrict the material to only those patients who entered observation 5 or more years prior to the closing date. We will show that patients who entered observation 4, 3, 2, and even one year prior to the closing date contribute much useful information to the evaluation of 5-year survival.

Let us consider a group of patients entering observation continuously beginning with Jan. 1, 1946.
Sometime early in 1952, we decide to analyze the survival experience of these patients to obtain a 5-year survival rate. We choose Dec. 31, 1951, as the closing date, i.e., the follow-up status and survival time of each patient is recorded as of that date.

J. Chron. Dis., Volume 8, Number 6, December, 1958

Of the patients entering the study during the 6 years ended in 1951, only those diagnosed in 1946 were exposed to the risk of dying for at least 5 years.* The exposure time for patients entering in each of the calendar years is shown in Table I.

  *In this example, date of entry into the study is defined as date of diagnosis. In practice, other reference dates, such as date of initiation of a particular course of therapy, may be used.

TABLE I

CALENDAR YEAR OF DIAGNOSIS    YEARS OF EXPOSURE TO RISK OF DYING
1946                          5 to 6
1947                          4 to 5
1948                          3 to 4
1949                          2 to 3
1950                          1 to 2
1951                          Less than 1

It might be supposed, intuitively, that the patients who entered observation from 1947 to 1951 are of no value in computing a 5-year survival rate as of Dec. 31, 1951, since each of these patients was under observation for less than 5 years. This, however, is not true. Merrell and Shulman (ref. 5) have pointed out that patients for whom less than the required number of years' survival information is available should not be discarded from the analysis. Wilder (ref. 7) has demonstrated that it is possible to compute reliable 5-year survival rates through maximum utilization of the life table method, even when the longest possible exposure time for a large series is just short of 5 years.† The primary objective of this paper is to show how partial survival information can be included in the life table and to show how much is gained by doing so. Data from the Connecticut Cancer Register are used for illustrative purposes.‡

  †In the series reported by Wilder, the range of exposure time was from one day to, but not including, 5 years.
  ‡We wish to thank Dr. Matthew H. Griswold, Director, Division of Cancer and Other Chronic Diseases, Connecticut State Department of Health, for his courtesy in making these data available.

THE ANATOMY OF THE LIFE TABLE§

  §We borrowed the phrase "anatomy of the life table" from Pearl's (ref. 2) excellent textbook Biometry and Medical Statistics.

Table II provides the basic facts concerning 126 male patients with localized cancer of the kidney, diagnosed during the period 1946 through 1951. The cases are divided into 6 cohorts, one for each year of diagnosis. The columns of Table II are described here.

Column 1. Years After Diagnosis ($x$ to $x + 1$). This gives the time elapsed from the date of diagnosis in intervals of one year, i.e., 0-1, 1-2, etc. For example, a patient who was diagnosed Jan. 20, 1946, and died on Oct. 5, 1948, died during the third year after diagnosis, i.e., during interval 2-3. The number of patients that left observation during each interval is entered in the appropriate column (3, 4, or 5), according to the reason for removal from observation.

Column 2. Alive at Beginning of Interval ($l_x$). The entry on the first line of this column indicates the number of cases alive at diagnosis, i.e., the initial number of patients in the cohort.

Column 3. Died During Interval ($d_x$). In this column we enter the number of patients who died during each interval.

Column 4. Lost to Follow-up During Interval ($u_x$).* In this column we enter the number of patients whose survival status as of the closing date, Dec. 31, 1951, was unknown. The interval during which a patient is recorded as lost is the time elapsed from date of diagnosis to date last known alive. Thus, a patient observed for 3 years and 4 months is entered on the fourth line, i.e., interval 3-4. In applying the life table method it is usually assumed that the survival experience of lost cases, subsequent to date of last contact, is similar to that of cases remaining under observation. In contrast, omission of lost cases from the analysis is equivalent to assuming that from date of diagnosis the survival experience of lost cases was similar to that for cases with complete follow-up information.

  *We are using the letter "u" to represent "untraced" cases, rather than the letter "l" which comes to mind as a symbol for "lost" cases, because "l" is a standard life table notation for "alive at beginning of interval."

Column 5. Withdrawn Alive During Interval ($w_x$). In this column we enter the number of patients known to be alive on the closing date, Dec. 31, 1951. The interval during which these patients are recorded as withdrawals depends on their length of observation as of the closing date. Thus, all patients diagnosed in 1949 and alive on Dec. 31, 1951, are recorded as withdrawals during the third year after diagnosis, interval 2-3. Note that, in Table II, the entries in this column for each cohort are zeros (symbolized by dashes) for all intervals but the last.

TABLE II. SURVIVAL DATA FOR SINGLE YEAR COHORTS
(126 Male Connecticut Residents With Localized Kidney Cancer Diagnosed 1946-1951 and Followed Through Dec. 31, 1951)

Columns of the table: (1) years after diagnosis, $x$ to $x + 1$; (2) alive at beginning of interval, $l_x$; (3) died during interval, $d_x$; (4) lost to follow-up during interval, $u_x$; (5) withdrawn alive during interval,* $w_x$.

COHORT (YEAR OF DIAGNOSIS)    INTERVALS COVERED    ALIVE AT DIAGNOSIS    WITHDRAWN ALIVE IN LAST INTERVAL*
1946                          0-1 through 5-6        9                     4
1947                          0-1 through 4-5       18                     6
1948                          0-1 through 3-4       21                     7
1949                          0-1 through 2-3       34                    15
1950                          0-1 through 1-2       19                    11
1951                          0-1                   25                    15

  *Alive at closing date of study.

Although only one cohort (1946) was under observation for a full 5 years, each cohort contributes some information to our knowledge of patient survival during a period of 5 years after diagnosis. By pooling the data contributed by all 6 cohorts, we obtained the information shown in Table III. Columns 2 through 5 of Table III were obtained by summing, cell by cell, the entries for the yearly cohorts in Table II. For example, the total of 47 cases on the first line of Column 3 (Table III), the patients who died within one year after diagnosis, was obtained by summing the entries on the first line of Column 3 for each yearly cohort in Table II. In practice, the information for the pooled cohorts would be obtained directly, as in Table III, rather than by summing tabulations for 6 individual cohorts. We used the latter procedure to show how much information each cohort contributes. For example, by comparing Tables II and III, we find that the 5 patients known to have died in the second year after diagnosis (Table III, Line 2, Column 3) were diagnosed in several different years. A statistical measure of the gain in precision resulting from this procedure will be discussed later. First, however, we will explain how the basic data summarized in Columns 1 through 5 of Table III are used to compute survival rates.

TABLE III. COMBINED LIFE TABLE AND COMPUTATION OF 5-YEAR SURVIVAL RATE
(126 Male Connecticut Residents With Localized Kidney Cancer Diagnosed 1946-1951 and Followed Through Dec. 31, 1951)

Columns:* (1) years after diagnosis, $x$ to $x + 1$; (2) alive at beginning of interval, $l_x$; (3) died during interval, $d_x$; (4) lost to follow-up during interval, $u_x$; (5) withdrawn alive during interval, $w_x$; (6) effective number exposed to the risk of dying, $l'_x$ (Col. 2 - 1/2 [Col. 4 + Col. 5]); (7) proportion dying, $q_x$ (Col. 3 ÷ Col. 6); (8) proportion surviving, $p_x$ (1 - Col. 7); (9) cumulative proportion surviving from diagnosis through end of interval, $P_x$ ($p_1 \times p_2 \times \cdots \times p_x$).

(1)      (2)     (3)     (4)     (5)     (6)       (7)      (8)      (9)
0-1      126     47      4       15      116.5     0.40     0.60     0.60
1-2       60      5      6       11       51.5     0.10     0.90     0.54
2-3       38      2      -       15       30.5     0.07     0.93     0.50
3-4       21      2      2        7       16.5     0.12     0.88     0.44
4-5       10      -      -        6        7.0     0.00     1.00     0.44‡
5-6†       4      -      -        4

  *Columns 2 through 5 of this table were obtained by summing, cell by cell, the data for the 6 yearly cohorts of Table II.
  †This line is not needed for computing the 5-year survival rate; it is included here merely to complete the account of the initial 126 patients.
  ‡Five-year survival rate.

COMPUTATION OF SURVIVAL RATES

The first step in preparing a life table is to distribute the deaths, losses, and withdrawals with respect to the interval in which they left observation.† This information is summarized in Columns 3, 4, and 5 of Table III. The sum of the entries in Columns 3, 4, and 5 equals the total number of cases in the study, which is entered on the first line of Column 2 (126 cases). Successive entries in this column are obtained by subtracting the deaths, losses, and withdrawals during an interval from the number alive at the beginning of that interval, according to the formula:

$$l_{x+1} = l_x - (d_x + u_x + w_x).$$

For example, the number alive at the beginning of the second year (60) was obtained by subtracting from the number alive at the beginning of the first year (126) the sum of the deaths, losses, and withdrawals during the first year (47 + 4 + 15). The life table is completed by a series of four computations (Columns 6 through 9) for each follow-up year.

  †For a detailed account of the mechanics of recording and tabulating survival data, see Berkson and Gage (ref. 4), pp. 4-5.

Column 6. Effective Number Exposed to Risk of Dying ($l'_x$). It is assumed that patients lost or withdrawn from observation during an interval were exposed to the risk of dying, on the average, for one-half the interval.* For example, of the 25 patients diagnosed in 1951, 15 were alive on Dec. 31, 1951 (withdrawn alive). It is reasonable to assume that the date of diagnosis for these 15 patients was roughly equally distributed during the calendar year 1951 and that, on the average, each patient was observed for one-half year. The effective number exposed to risk is obtained by subtracting from the number alive at the beginning of the interval one-half the sum of the number lost and withdrawn during the interval. Thus,

$$l'_x = l_x - \frac{u_x + w_x}{2}.$$

  *The computing procedure given here is based on the assumption that, for cases withdrawn alive and cases lost to follow-up, survival subsequent to date of last contact is similar to that for cases with complete follow-up information. For cases withdrawn alive, this assumption introduces no bias, because there is no reason to believe that patients alive on the closing date are different from patients observed for a longer period. However, for cases lost to follow-up, this assumption may introduce a bias. Patients lost to follow-up were alive when last observed, and whether their survival experience is better than, worse than, or equal to the survival of patients remaining under follow-up is highly speculative. For example, cancer patients may be lost to follow-up for a variety of reasons. Far-advanced cases may leave their usual place of residence to enter the household of a relative; successfully treated patients may stop reporting to the tumor clinic, because they feel that no further medical care is required. It is therefore important to keep the proportion of cases lost to follow-up at a minimum. Survival rates based on a series in which a substantial proportion of patients have been lost to follow-up are of highly questionable value, because it is impossible to determine the extent to which they are biased.
  Some investigators, such as Paterson and Tod (ref. 8), recommend that lost cases be counted as dead "to avoid undesirable uncertainty, although (it) may result in a slight bias against the efficacy of treatment." Other investigators, such as Ryan and his colleagues (ref. 9), omit lost cases from the analysis of survival. The latter procedure involves the assumption that from date of diagnosis the survival experience of lost cases is similar to that of cases with complete follow-up. We prefer the first of the several possible assumptions regarding lost cases, namely that subsequent to date of last contact their survival is similar to that for cases with complete follow-up. The omission of lost cases from the computation of survival rates discards available information. The assumption that lost cases died immediately after the date of last contact is contrary to fact. Registry experience with intensive field investigation of lost cases, which resulted in recovery of some, indicates that such patients often live for several years beyond the initial date of last contact (ref. 10).
  Although cases withdrawn alive and cases lost to follow-up are treated alike in the computations described here, we distinguish between the two in the life table for two reasons: (1) it is important to be aware of the number of cases lost to follow-up because of their potential bias, and (2) other computational methods may treat the two groups differently.

Column 7. Proportion Dying During Interval ($q_x$). This is also referred to as the probability of dying during the interval. It is obtained by dividing the number of deaths by the effective number exposed to risk:

$$q_x = \frac{d_x}{l'_x}.$$

To express as a percentage, multiply by 100.

Column 8. Proportion Surviving the Interval ($p_x$). This is alternately referred to as the probability of surviving the interval, or the survival rate during the interval. It is obtained by subtracting the proportion dying during the interval from unity:

$$p_x = 1 - q_x.$$

To express as a percentage, multiply by 100.

Column 9. Cumulative Proportion Surviving From Diagnosis to End of Interval ($P_x$). This is generally referred to as the cumulative survival rate. It is obtained by cumulatively multiplying the proportion surviving each interval:

$$P_x = p_1 \times p_2 \times \cdots \times p_x.*$$

Note that successive entries in this column give the 1-year, 2-year, 3-year, 4-year, and 5-year survival rates (Table III). The successive cumulative rates are plotted in drawing a survival curve.

Although the computations illustrated in Table III were carried out in intervals of one year after the date of diagnosis, the life table may be set up in terms of days, weeks, months, years, etc. In fact, the life table may be organized in intervals of varying length. For example, one might record experience during the first year in monthly intervals, and experience thereafter in annual intervals. This type of presentation may be desirable when a large proportion of deaths occur during the first year. The method of computing survival rates described here may be used whatever the size of the intervals.

GAIN IN UTILIZING EXPERIENCE OF COHORTS WITH PARTIAL FOLLOW-UP

The standard error provides a measure of the confidence with which one may interpret a statistical result. Thus, the standard error of the survival rate indicates the extent to which the computed rate may have been influenced by sampling variation.† For example, by adding and subtracting twice the standard error to and from the computed survival rate, one obtains an approximate 95 per cent confidence interval.
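The interval construction just described can be sketched in a few lines. This is only an illustration, not part of the original paper: the function name `approx_95_interval` is hypothetical, the inputs are the 5-year rate of 44 per cent and standard error of 6 per cent reported for the localized kidney cancer series, and the clipping to the range 0 to 1 is an added safeguard that the text does not mention.

```python
def approx_95_interval(rate, std_error):
    """Approximate 95 per cent interval: the rate plus and minus two standard errors."""
    lower = rate - 2 * std_error
    upper = rate + 2 * std_error
    # A proportion cannot lie outside 0..1; clipping is an assumption added here.
    return max(0.0, lower), min(1.0, upper)


# Localized kidney cancer series: rate 0.44, standard error 0.06.
low, high = approx_95_interval(0.44, 0.06)
print(f"approximate 95 per cent interval: {low:.2f} to {high:.2f}")
```

With these inputs the limits come out at 0.32 and 0.56, the range quoted in the text for the true 5-year survival rate.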
This means that in repeated observations under the same conditions the true survival rate will lie within a range of two standard errors on either side of the computed rate, an average of 95 times in 100. Thus, the computed 5-year survival rate for male patients with localized cancer of the kidney is 44 per cent. The standard error, computed according to the method explained in the Appendix, is 6 per cent. It is therefore likely that the true 5-year survival rate is not smaller than 32 per cent and not larger than 56 per cent.

  *This formula (the cumulative survival rate, Column 9) is based on the assumption that the various interval survival probabilities are statistically independent.
  †The 126 cases of localized cancer of the kidney are in effect a sample from a population of male patients with localized kidney cancer. An illustration of sampling variation may be drawn from baseball. A 0.250 hitter may, in four times "at bat," get one hit. Frequently, though, he will get no hits or two hits. And not too infrequently he will get three hits. If we watch a game and see a player get two hits in four times "at bat," it is difficult for us to judge how good a hitter this player really is. We have to watch this player for many games before we can get a reliable estimate of his batting average. Survival rates are similar to batting averages in the sense that they are relative frequencies, i.e., the numerator is part of the denominator. For each hit there must be at least one time "at bat," and for each death there must be at least one case exposed to the risk of dying.

Admittedly, the computed rate does not yield a very firm estimate of the true survival rate, but we must bear in mind that it was based on a series of only 126 patients and only 9 of these patients were diagnosed a full 5 years prior to the date of study.
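The sampling variation described in the baseball footnote above can be made concrete with a small calculation. The sketch below assumes, as the footnote does not state explicitly, that the four times "at bat" are independent and that each carries the same 0.250 chance of a hit (a binomial model); the function name `binomial_pmf` is illustrative only.

```python
from math import comb


def binomial_pmf(hits, at_bats, average):
    """Chance of exactly `hits` hits in `at_bats` independent tries (binomial model)."""
    return comb(at_bats, hits) * average**hits * (1 - average) ** (at_bats - hits)


# A 0.250 hitter in four times at bat.
for hits in range(5):
    print(hits, "hits:", round(binomial_pmf(hits, 4, 0.250), 3))
```

Under this assumption one hit is the single most likely result (about 42 per cent), no hits and two hits are also common (about 32 and 21 per cent), and three hits occur roughly once in twenty games, which is in line with the footnote's wording.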
Furthermore, whereas the survival rate based on all information available on these 126 patients provides at least a rough idea of the true rate (one-third to one-half), discarding the information on the cohorts with partial follow-up information would result in an extremely unreliable estimate. This is explained in the discussion that follows.

The computing method illustrated in Table III, applied to the total series of 126 cases, can be applied to any selected portion of the group. We have therefore used it to compute a series of 5-year survival rates based on successively larger cohorts.

A 5-year survival rate was computed for the 9 patients diagnosed in 1946, all of whom had a 5-year exposure time. We then added the 18 patients diagnosed in 1947, who had a 4-year exposure time, and computed a 5-year survival rate based on the available information known for these 27 patients. This procedure was continued until the experience of all 126 patients was utilized in estimating the 5-year survival rate. The successive rates and their corresponding standard errors are shown in the uppermost section of Table IV.

The 1946 cohort of 9 cases yielded a 5-year survival rate of 53 per cent, with a standard error of 17 per cent. The large standard error tells us that this is a very unreliable estimate; the true rate is probably between 19 and 87 per cent,* a very wide range. The combined experience of the 1946 and 1947 cohorts yielded a survival rate of 46 per cent, with a standard error of 10 per cent. Thus, the addition of information on cases with 4 full years of exposure reduced the standard error from 17 per cent to 10 per cent, a relative decrease of 43 per cent. The addition of the available information on cohort 1948 (3 full years of exposure) reduces the standard error to 7.5 per cent, etc. The utilization of all available information on all the cohorts results in a standard error of 6.0 per cent.
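The relative decreases quoted here can be checked with a short calculation. The sketch below is illustrative only: the function name `percent_reduction` is hypothetical, and it uses the three-decimal standard errors listed for localized kidney cancer in Table IV (0.171, 0.098, 0.075, 0.060) rather than the rounded percentages in the paragraph, since the rounded figures (17 and 10) would give 41 rather than 43 per cent.

```python
def percent_reduction(baseline, current):
    """Relative decrease of `current` from `baseline`, in per cent."""
    return 100 * (baseline - current) / baseline


# Standard errors for localized kidney cancer as cohorts are added (Table IV values).
std_errors = {
    "1946": 0.171,
    "1946-1947": 0.098,
    "1946-1948": 0.075,
    "1946-1951": 0.060,
}

baseline = std_errors["1946"]
for cohort, se in std_errors.items():
    print(cohort, f"SE={se:.3f}", f"reduction={percent_reduction(baseline, se):.0f} per cent")
```

The 1946-1947 line gives the 43 per cent decrease stated above, and the 1946-1951 line gives about 65 per cent.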
Thus, the standard error of the survival rate based on all available information is 65 per cent less than the standard error based on cases with a full 5 years of exposure.

  *These are the 95 per cent confidence limits: 53 ± 2(17).

We then computed survival rates and corresponding standard errors for a series of successively enlarged cohorts for each of four additional groups of patients: kidney cancer with regional involvement, in men; localized breast cancer, in women; breast cancer with regional involvement, in women; and cancer of the lip, both sexes combined (Table IV). We did this in order to illustrate the advantage of utilizing all available experience for patient groups of varying size and with varying mortality experience. The results are shown graphically in Fig. 1. In every instance, the standard error of the 5-year survival rate based on the combined experience of cohorts 1946 through 1951 is smaller than the corresponding standard error for the 1946 cohort by at least one-third.

TABLE IV. FIVE-YEAR SURVIVAL RATES AND THEIR STANDARD ERRORS FOR FIVE GROUPS OF CANCER PATIENTS, SHOWING THE REDUCTION IN STANDARD ERROR WITH INCREASE IN COHORT SIZE

COHORT        NUMBER OF CASES    5-YEAR SURVIVAL    STANDARD ERROR OF 5-YEAR    PER CENT REDUCTION IN STANDARD
              DIAGNOSED          RATE               SURVIVAL RATE               ERROR OF 5-YEAR SURVIVAL RATE

Kidney, localized
1946              9              0.53               0.171                       -
1946-1947        27              0.46               0.098                       43
1946-1948        48              0.43               0.075                       56
1946-1949        82              0.43               0.064                       63
1946-1950       101              0.45               0.063                       63
1946-1951       126              0.44               0.060                       65

Kidney, regional
1946             11              0.18               0.116                       -
1946-1947        23              0.33               0.101                       13
1946-1948        30              0.28               0.091                       22
1946-1949        39              0.25               0.071                       36
1946-1950                        0.23               0.069                       41
1946-1951                        0.24               0.070                       40

Breast, localized
1946            225              0.64               0.033                       -
1946-1947       454              0.64               0.025                       24
1946-1948       695              0.64               0.023                       30
1946-1949       963              0.64               0.022                       33
1946-1950     1,227              0.65               0.022                       33
1946-1951     1,490              0.65               0.021                       36

Breast, regional
1946            208              0.42               0.035                       -
1946-1947       443              0.38               0.025                       29
1946-1948       708              0.39               0.021                       40
1946-1949                                           0.020                       43
1946-1950                                           0.020                       43
1946-1951     1,531                                 0.020                       43

Lip
1946             61              0.71               0.060                       -
1946-1947       109              0.65               0.048                       20
1946-1948       169              0.68               0.042                       30
1946-1949       224              0.68               0.040                       33
1946-1950       283              0.68               0.040                       33
1946-1951       332              0.67               0.039                       35

The advantage of utilizing information on patient cohorts with less than 5 years of exposure was greater for localized kidney cancer than for the other groups. This is because: (1) particularly few cases (9) of localized kidney cancer were diagnosed in the first year (1946), compared with an average of 23 cases in each of the subsequent years; and (2) the mortality rate during the first year of follow-up (0.40) was much larger than in succeeding years (annual average of 0.07). Thus, because of this mortality pattern, the information on survival during the first year after diagnosis contributed very substantially to the information on survival over a 5-year period. Five of the 6 annual cohorts (1946-1950) contributed complete information regarding survival during the first year after diagnosis.

In general, the relative gain in utilizing survival information on patient cohorts with partial follow-up information will vary directly with: (1) the relative increase in the size of the cohort*; (2) the relative completeness of added follow-up information; and (3) the magnitude of the mortality rates during the first few follow-up intervals. In cancer, and in other diseases, mortality is often relatively high shortly after diagnosis and tends to taper off thereafter. In some instances, such as lung or stomach cancer, the patient group may be so depleted within one year that little is gained by waiting more than one year before evaluating therapeutic results. Therefore, it may frequently be unnecessary to wait until a 5-year survival rate can be computed to evaluate the effects of therapy. A 1-year, 2-year, or 3-year rate may provide important information. In other diseases, survival data for only 5 years may be inadequate, because significant changes in the mortality pattern may occur at a later time (ref. 11).

  *See the Appendix for a discussion of effective sample size.

DISCUSSION

A category of patients with relatively few new cases per year was intentionally chosen as the principal example to illustrate the advantage of utilizing all available information for the computation of survival rates: only 126 cases of localized kidney cancer in men were diagnosed in Connecticut in 6 years. This was done because it is frequently desirable to describe the survival experience of relatively small groups of patients. For example, if we were interested in evaluating the survival experience of patients with localized breast cancer treated by surgery in combination with radiation, we would find that in any one year the number of patients receiving the combined therapy is small. As an illustration, only 25 of the 225 cases diagnosed in 1946 were treated by the combined therapy. Similarly, if survival is to be evaluated for a specific subgroup with respect to age, the number of cases per year would usually be small. Therefore, in order to increase the reliability of survival rates computed for various patient groups of clinical interest, it is important to utilize all available information.

It is of paramount importance to use all available survival information in computing survival rates if the rates are going to be used as criteria in a clinical trial. For example, a 3-year survival rate may have been selected as a criterion in a clinical trial. It may be possible to determine which of the several treatments being tested yields the best survival before all patients have been followed to death or for a full 3 years. Thus, inferior treatments would be discontinued at the earliest possible point.

Fig. 1. Decrease in the 95 per cent confidence interval for the 5-year survival rate as cases with less than 5 years' exposure to the risk of dying were added. (The 95 per cent confidence interval is obtained by adding ±2 standard errors to the survival rate.) Panels: Kidney, localized; Kidney, regional; Breast, localized; Breast, regional; Lip. Horizontal axis: cohorts 1946, 1946-1947, 1946-1948, 1946-1949, 1946-1950, 1946-1951. Vertical axis: 5-year survival rate. Source: Table IV.

We have illustrated in this paper the life table method for computing 5-year survival rates for cancer patients, emphasizing the advantage gained by including survival information on cases which entered the series too late to have had the opportunity to survive a full 5 years. The advantage is measured in terms of reduction in standard error of the survival rate. For the five series of patients in this paper, the reduction in standard error ranged from one-third to two-thirds.

REFERENCES

1. Manual for Cancer Programs, Bull. Am. Coll. Surgeons 38:149, 1953.
2. Pearl, R.: Introduction to Medical Biometry and Statistics, Philadelphia and London, 1923, W. B. Saunders Company.
3. Hill, A. B.: Principles of Medical Statistics, New York, 1956, Oxford University Press.
4. Berkson, J., and Gage, R. P.: Calculation of Survival Rates for Cancer, Proc. Staff Meet., Mayo Clin. 25:270, 1950.
5. Merrell, M., and Shulman, L. E.: Determination of Prognosis in Chronic Disease, Illustrated by Systemic Lupus Erythematosus, J. Chron. Dis. 1:12, 1955.
6. Griswold, M. H., Wilder, C. S., Cutler, S. J., and Pollack, E. S.: Cancer in Connecticut, 1935-1951, Hartford, 1955, Connecticut State Department of Health, pp. 112-113.
7. Wilder, C. S.: Estimated Cancer Survival Rates Confirmed, Connecticut Health Bulletin 70:217, 1956.
8. Paterson, R., and Tod, M. C.: Presentation of the Results of Cancer Treatment, Brit. J. Radiol. 23:146, 1950.
9. Ryan, A. J., et al.: Breast Cancer in Connecticut, 1935-1953, J.A.M.A. 167:298, 1958.
10. Griswold, M. H.: Personal communication.
11. Cutler, S. J., Griswold, M. H., and Eisenberg, H.: An Interpretation of Survival Rates: Cancer of the Breast, J. Nat. Cancer Inst. 19:1107, 1957.
12. Greenwood, M.: Reports on Public Health and Medical Subjects, No. 33, Appendix 1, The "Errors of Sampling" of the Survivorship Tables, London, 1926, H. M. Stationery Office.
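APPENDIX

Computing the Standard Error of the 5-Year Survival Rate. The method for computing the standard error of the 5-year survival rate was developed by Greenwood (see ref. 12) and is also described by Merrell and Shulman (see ref. 5). The formula is

$$s_5 = P_5 \sqrt{\sum_{x=1}^{5} \frac{q_x}{l'_x - d_x}},$$

where $s_5$ is the standard error of the 5-year survival rate. In general, the standard error of the $k$-year survival rate is

$$s_k = P_k \sqrt{\frac{q_1}{l'_1 - d_1} + \frac{q_2}{l'_2 - d_2} + \cdots + \frac{q_k}{l'_k - d_k}}.$$

Columns 10 and 11 of Appendix Table I show how the calculation of the standard error of the 5-year survival rate is carried out as a continuation of the computation of the survival rate.* The first 9 columns are a replica of Table III. (1) Subtract $d_x$ from $l'_x$ for each line (Column 10); (2) divide $q_x$ by $l'_x - d_x$ for each line (Column 11); (3) total the entries in the first 5 lines of Column 11: 0.0187; (4) take the square root of this number: $\sqrt{0.0187} = 0.137$; (5) multiply the result by $P_5$: $0.137 \times 0.44 = 0.060$. This is the standard error of the 5-year survival rate.

  *The standard error computed in this illustration is, itself, only an estimate of the true standard error. And, since it is based on relatively small numbers of cases, it is not a very reliable estimate. For example, had there been, due to sampling variation, one death in the last interval, rather than none, the computed standard error would be 0.0216 rather than 0.0187.

The standard error of survival rates for end-points other than 5 years is computed similarly. For example, to compute the standard error of the 3-year survival rate, the first three entries in Column 11 must be totaled, the square root taken, and multiplied by $P_3$.

APPENDIX TABLE I. COMPUTATION OF THE 5-YEAR SURVIVAL RATE AND ITS STANDARD ERROR
(Data from Table III)

Columns: (1) years after diagnosis, $x$ to $x + 1$; (2) alive at beginning of interval, $l_x$; (3) died during interval, $d_x$; (4) lost to follow-up during interval, $u_x$; (5) withdrawn alive during interval, $w_x$; (6) effective number exposed to the risk of dying, $l'_x$ (Col. 2 - 1/2 [Col. 4 + Col. 5]); (7) proportion dying, $q_x$ (Col. 3 ÷ Col. 6); (8) proportion surviving, $p_x$ (1 - Col. 7); (9) cumulative proportion surviving from diagnosis through end of interval, $P_x$ ($p_1 \times p_2 \times \cdots \times p_x$); (10) $l'_x - d_x$ (Col. 6 - Col. 3); (11) $q_x / (l'_x - d_x)$ (Col. 7 ÷ Col. 10).

(1)      (2)     (3)     (4)     (5)     (6)       (7)      (8)      (9)       (10)     (11)
0-1      126     47      4       15      116.5     0.40     0.60     0.60      69.5     0.0058
1-2       60      5      6       11       51.5     0.10     0.90     0.54      46.5     0.0022
2-3       38      2      -       15       30.5     0.07     0.93     0.50      28.5     0.0024
3-4       21      2      2        7       16.5     0.12     0.88     0.44      14.5     0.0083
4-5       10      -      -        6        7.0     0.00     1.00     0.44*      7.0     0.0000
5-6‡       4      -      -        4
Total                                                                                   0.0187†

  *Five-year survival rate.
  †This is the sum of the five entries in Column 11. The square root of this number, when multiplied by the 5-year survival rate, yields the standard error of the 5-year survival rate: $s_5 = (0.44)\sqrt{0.0187} = (0.44)(0.137) = 0.060$.
  ‡See footnote †, Table III.

Effective Sample Size. The concept, effective sample size, provides another way of assessing the benefit of including in the life table cases with partial survival information. The concept relates to the fact that the reliability of a statistical result depends on the size of the sample, i.e., the number of cases observed. For example, the standard error of a survival rate, $P$, when all cases have been followed until death or for the required time interval (i.e., no losses from observation or withdrawals alive prior to the cut-off date) is given by the binomial formula

$$s = \sqrt{\frac{P(1 - P)}{l}}, \qquad (1)$$

where $l$ is the sample size, i.e., the initial number of cases. In formula (1), the standard error is inversely proportional to the square root of the sample size.

Let us consider the 1946-1951 localized kidney cancer cohort (Appendix Table I), for which the survival rate is 0.44, and its standard error, 0.060. Of the initial 126 cases in this cohort, a substantial number were withdrawn alive less than 5 years after diagnosis. We now ask how large a cohort, with a 5-year survival rate of 0.44 and with all cases followed to death or for a full 5 years, would have a standard error equal to 0.060. To answer this question, we solve equation (1) for $l$, placing a circumflex over the $l$ to indicate that this is a hypothetical value:

$$\hat{l} = \frac{P(1 - P)}{s^2}.$$

Substituting $P = 0.44$ and $s = 0.060$, we obtain

$$\hat{l} = \frac{(0.44)(0.56)}{(0.060)^2} = 68.$$

The result, 68, is the effective sample size, which we interpret as follows. Had we started with about 68 cases (instead of 126) and followed them all until death or survival for 5 years and found that 44 per cent survived 5 years, then the standard error would have been equal to that we actually obtained in our cohort of 126 cases. Thus, the survival rate we obtained is as reliable as one based on 68 cases. This is in sharp contrast to 9 cases which were eligible for 5 years of observation.

These three values are compared for the five cancer groups discussed in the text. In each instance, the effective sample size based on the 1946-1951 cohort is substantially larger than the number of cases eligible for 5 years of observation (1946 cohort).

APPENDIX TABLE II

                      SAMPLE SIZE,          SAMPLE SIZE,       EFFECTIVE
                      1946-1951 COHORT*     1946 COHORT†       SAMPLE SIZE
Kidney, localized       126                   9                  68
Kidney, regional                             11                  37
Breast, localized     1,490                 225                 516
Breast, regional      1,531                 208                 595
Lip                     332                  61                 145

  *Since the cut-off date was Dec. 31, 1951, cases diagnosed in 1947 or later were eligible for less than 5 years of observation.
  †Actual number of cases eligible for 5 years of observation.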
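The appendix computations can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' own worksheet: the names `ROWS`, `life_table`, and `effective_sample_size` are hypothetical, the counts are Columns 2 through 5 of Table III for intervals 0-1 through 4-5, and dashes in the table are entered as zeros. Because the code carries unrounded values of $q_x$ through the calculation, the Column 11 total comes out near 0.0186 rather than the 0.0187 obtained in Appendix Table I from rounded entries.

```python
from math import sqrt

# (alive at beginning l_x, died d_x, lost u_x, withdrawn alive w_x), intervals 0-1 to 4-5.
ROWS = [(126, 47, 4, 15), (60, 5, 6, 11), (38, 2, 0, 15), (21, 2, 2, 7), (10, 0, 0, 6)]


def life_table(rows):
    """Return one (l'_x, q_x, p_x, P_x, s_x) tuple per interval."""
    cumulative = 1.0       # running product of p_x (Column 9)
    column_11_total = 0.0  # running sum of q_x / (l'_x - d_x)
    result = []
    for alive, died, lost, withdrawn in rows:
        exposed = alive - (lost + withdrawn) / 2   # Column 6
        q = died / exposed                         # Column 7
        p = 1 - q                                  # Column 8
        cumulative *= p                            # Column 9
        column_11_total += q / (exposed - died)    # Columns 10 and 11
        std_error = cumulative * sqrt(column_11_total)  # Greenwood's formula
        result.append((exposed, q, p, cumulative, std_error))
    return result


def effective_sample_size(rate, std_error):
    """Solve the binomial formula s = sqrt(P(1 - P) / l) for l."""
    return rate * (1 - rate) / std_error**2


table = life_table(ROWS)
five_year_rate, five_year_se = table[-1][3], table[-1][4]
print(f"5-year rate {five_year_rate:.2f}, standard error {five_year_se:.3f}")
print(f"effective sample size {effective_sample_size(five_year_rate, five_year_se):.0f}")
```

Run on the Table III counts, the sketch gives a 5-year rate of about 0.44, a standard error of about 0.060, and an effective sample size of about 68, in agreement with the appendix. The standard error for any earlier end-point is the last element of the corresponding row, e.g. `table[2][4]` for the 3-year rate.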
31, rligihla for 1951, cases diagnosed the standard error I I SAMPLE 1946-1951 COHORT* (l), SIZE 1946 EFFECTIVE SAMPLE SIZE COHORTj 9 11 225 208 61 68 37 516 595 14.5 in 1947 or later were eligible for less Lhan of observation. tActua1 number of cases 5 years of observation. Let us consider the 1946-1951 localized kidney cancer cohort (.\ppendix Table I), for which the survival rate is 0.44, and its standard error, 0.060. Of the initial 126 cases in this cohort, IVe now ask how a substantial number were withdrawn alive less than 5 years after diagnosis. large a cohort, with a S-year survival rate of 0.44 and with all cases followed to death or for a full 5 years, would have a standard error equal to 0.060. To answer this question, we solve equation (1) for I,, placing a circumflex over the I, to indicate that this is a hypothetical value: Substituting P = 0.44 and s = 0.060, we obtain Had we started with The result, 68, is the eyective sample size, which we interpret as follows. about 68 cases (instead of 126) and followed them all until death or survival for 5 years and found that 44 per cent survived 5 years, then the standard error would have been equal to that we actually obtained in our cohort of 126 cases. Thus, the survival rate we obtained is as reliable as one based on 68 cases. This is in sharp contrast to 9 cases which were eligible for 5 years of observation. These three values are compared for the five cancer groups discussed in the text. In each instance, the effective sample size based on the 1946-1951 cohort is substantially larger than the number of cases eligible for 5 years of observation (1946 cohort).