Number: 4

Indian Health Service (IHS) Statistical Note
Prepared by the IHS Division of Program Statistics

Date: September 24, 1993

SUBJECT: Problems in Calculating Rates for a Small Number of Events

Introduction

The Indian Health Service (IHS) is composed of regional administrative units called Area Offices. The Area Offices are divided into basic administrative units called service units. The size of the service units varies considerably, from a portion of a county with a population under 1,000 to multiple counties with a combined population of over 80,000. It is often necessary to conduct planning activities at the service unit level and sometimes for smaller geographic areas. This requires calculating morbidity and vital event rates at these levels. As a result, the number of events involved may be quite small. The purpose of this statistical note is to describe the problems in calculating rates for a small number of events and to suggest ways of overcoming those problems.

Description of Problem

Calculating morbidity and mortality rates at the service unit level and below should be done with caution. For small geographic areas, the number of events (e.g., hospital discharges, births, and deaths) is often quite small. As a result, the usefulness of morbidity and mortality rates at the service unit level and below may be limited by purely random yearly fluctuations in the rates.

Infant mortality rates can be used to illustrate this problem. A service unit's infant mortality rate for any one year should be considered an estimate of the "true" infant mortality rate. (The "true" infant mortality rate is considered to be the real probability of an infant death. It is assumed that the observed number of infant deaths in a geographic area varies by chance each year. Therefore, as the number of births increases, less chance is involved and the observed rate is a better estimate of the "true" rate.)
If a service unit has very few births, the observed infant mortality rate in any one year may be very different from the true rate.

Addressing the Problem

The confidence interval is the most common method used to assess the adequacy of an observed rate as an estimate of its true value. Basically a 95 percent confidence interval is defined so that the probability is 95 percent that the true rate is included in the interval. If the interval is very wide, the estimate of the true rate is not very useful. The interval generally becomes narrower as the number of occurrences (e.g., births) upon which the rate is based increases. The formula and an example for a 95 percent confidence interval are shown below.

Let
  r = rate (e.g., infant mortality rate)
  n = denominator upon which rate is based (e.g., number of live births)
  p = per population factor (e.g., if rate is per 1,000 live births, p = 1,000; if rate is per 100,000 population, p = 100,000)

The limits of the 95% confidence interval are:

  upper limit (ul): $r + 1.96\sqrt{p \cdot r / n}$
  lower limit (ll): $r - 1.96\sqrt{p \cdot r / n}$

For example, the confidence interval for an infant mortality rate based on 10 infant deaths and 300 live births is:

  r = 10/300 * 1,000 = 33.3 per 1,000 live births
  ul: $33.3 + 1.96\sqrt{1{,}000 \cdot 33.3 / 300} = 33.3 + 20.6 = 53.9$
  ll: $33.3 - 1.96\sqrt{1{,}000 \cdot 33.3 / 300} = 33.3 - 20.6 = 12.7$

The interval (12.7, 53.9) shows that the area's true rate is not known with much precision.

Suppose the numbers of births and infant deaths increased tenfold. Then:

  r = 100/3,000 * 1,000 = 33.3 per 1,000 live births
  ul: $33.3 + 1.96\sqrt{1{,}000 \cdot 33.3 / 3{,}000} = 33.3 + 6.5 = 39.8$
  ll: $33.3 - 1.96\sqrt{1{,}000 \cdot 33.3 / 3{,}000} = 33.3 - 6.5 = 26.8$

The interval (26.8, 39.8) is much narrower than the one in the first situation.

Two common methods of increasing the number of occurrences are to combine years and to combine smaller geographic areas into larger ones.
The IHS Division of Program Statistics normally uses three years of data to calculate rates, and places more emphasis on Area-level rates than on service unit-level rates. Care should be taken not to combine too many years of data in attempting to calculate a stable rate. The resulting rate will be misleading if conditions changed during the time period covered by the rate.

A useful rule is that any rate based on fewer than 20 events in the numerator will have a 95 percent confidence interval about as wide as the rate itself (i.e., from 0.5 times the rate to 1.5 times the rate). Therefore, it is preferable not to calculate rates based on fewer than 20 events, a situation that often arises for service units. The Indian population size needed to produce 20 events varies with the likelihood of the event occurring. Since maternal deaths occur relatively infrequently in the Indian population, no service unit would have 20 such deaths even over a three-year period. Infant mortality occurs at a greater rate, and therefore some service units would have at least 20 infant deaths during a three-year period.

If it is necessary to publish a rate based on fewer than 20 events, then there should be a note cautioning the reader that the rate is imprecise.
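
The confidence interval formula and the 20-event rule described in this note can be checked with a short script. The following is a minimal sketch in Python and is not part of the original note; the function names `rate_ci` and `relative_half_width` are hypothetical and chosen only for illustration. It assumes the same formula given in this note, r +/- 1.96 * sqrt(p*r/n). Because r = p*d/n for d events, the half-width of that interval works out to 1.96 * r / sqrt(d), which is roughly 0.44 times the rate at 20 events and larger for fewer events.

```python
import math

Z_95 = 1.96  # normal multiplier for a 95 percent confidence interval


def rate_ci(events, denominator, per=1000):
    """Return (rate, lower, upper) using the note's formula r +/- 1.96*sqrt(p*r/n).

    `per` is the per-population factor p (assumed default: 1,000).
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    rate = events / denominator * per
    half_width = Z_95 * math.sqrt(per * rate / denominator)
    return rate, rate - half_width, rate + half_width


def relative_half_width(events):
    """Half-width of the interval as a fraction of the rate: 1.96 / sqrt(events)."""
    if events <= 0:
        raise ValueError("events must be positive")
    return Z_95 / math.sqrt(events)


if __name__ == "__main__":
    # The two infant mortality examples from the note.
    for deaths, births in [(10, 300), (100, 3000)]:
        r, ll, ul = rate_ci(deaths, births)
        print(f"{deaths} deaths / {births} births: "
              f"r = {r:.1f}, CI = ({ll:.1f}, {ul:.1f})")

    # How the interval width shrinks as the number of events grows.
    for d in (5, 10, 20, 100):
        flag = " (fewer than 20 events: flag as imprecise)" if d < 20 else ""
        print(f"{d} events: interval is about +/- "
              f"{relative_half_width(d):.0%} of the rate{flag}")
```

The script carries unrounded values through the calculation, so its limits can differ in the last digit from the worked examples in this note, which round the rate to 33.3 before applying the formula.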