Small Number Note

Number: 4
Indian Health Service (IHS) Statistical Note
Prepared by the IHS Division of Program Statistics
Date: September 24, 1993
SUBJECT: Problems in Calculating Rates for a Small Number of Events
Introduction
The Indian Health Service (IHS) is composed of regional administrative units called Area
Offices. The Area Offices are divided into basic administrative units called service units.
The size of the service units varies considerably, from a portion of a county with a
population under 1,000 to multiple counties with a combined population of over 80,000. It
is often necessary to conduct planning activities at the service unit level and sometimes for
smaller geographic areas. This requires calculating morbidity and vital event rates at these
levels. As a result, the number of events involved may be quite small. The purpose of
this statistical note is to describe the problems in calculating rates for a small number of
events and suggest ways of overcoming those problems.
Description of Problem
Calculating morbidity and mortality rates at the service unit level and below should be done
with caution. For small geographic areas, the number of events (e.g., hospital discharges,
births, and deaths) is often quite small. As a result, the usefulness of morbidity and mortality
rates at the service unit level and below may be limited by yearly fluctuations in the rates
that are purely random.
Infant mortality rates can be used to illustrate this problem. A service unit's infant mortality
rate for any one year should be considered an estimate of the "true" infant mortality rate.
(The "true" infant mortality rate is the underlying probability of an infant death.
It is assumed that the observed number of infant deaths in a geographic area varies by
chance each year. Therefore, as the number of births increases, chance plays a smaller role
and the observed rate is a better estimate of the "true" rate.) If a service unit has very
few births, the observed infant mortality rate in any one year may be very different from
the true rate.
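The effect of chance can be illustrated with a small simulation. The sketch below is not part of the original note: it assumes a hypothetical "true" rate of 33.3 infant deaths per 1,000 live births, treats each birth as an independent trial, and uses a made-up function name (simulate_rates) and a fixed random seed. It compares the year-to-year spread of observed rates for 300 and for 3,000 births per year.

```python
import random
import statistics


def simulate_rates(births, true_rate_per_1000, years, seed=0):
    """Simulate observed yearly infant mortality rates per 1,000 live births.

    Assumption: each birth is an independent trial with the same
    probability of infant death (the hypothetical "true" rate).
    """
    rng = random.Random(seed)
    prob = true_rate_per_1000 / 1000
    rates = []
    for _ in range(years):
        # Count simulated infant deaths among this year's births.
        deaths = sum(1 for _ in range(births) if rng.random() < prob)
        rates.append(deaths / births * 1000)
    return rates


# Hypothetical service units with the same true rate but different sizes.
small = simulate_rates(births=300, true_rate_per_1000=33.3, years=50)
large = simulate_rates(births=3000, true_rate_per_1000=33.3, years=50)

for label, rates in [("300 births/year", small), ("3,000 births/year", large)]:
    print(
        f"{label}: min = {min(rates):.1f}, max = {max(rates):.1f}, "
        f"std dev = {statistics.pstdev(rates):.1f}"
    )
```

Under these assumptions, the smaller unit's observed rates should scatter much more widely around 33.3 than the larger unit's, even though nothing about the underlying risk differs between the two.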
Addressing the Problem
The confidence interval is the most common method used to assess the adequacy of an
observed rate as an estimate of its true value. Basically, a 95 percent confidence interval
is constructed so that there is a 95 percent probability that the interval includes the true rate.
If the interval is very wide, the estimate of the true rate is not very useful. The interval
generally becomes narrower as the number of occurrences (e.g., births) upon which the
rate is based increases.
The formula and an example for a 95 percent confidence interval are shown below.
Let
r = rate (e.g., infant mortality rate)
n = denominator upon which rate is based
(e.g., number of live births)
p = per population factor
(e.g., if rate is per 1,000 live births, p = 1,000;
if rate is per 100,000 population, p = 100,000)
The limits of the 95% confidence interval are:
upper limit (ul): $r + 1.96\sqrt{p \cdot r / n}$
lower limit (ll): $r - 1.96\sqrt{p \cdot r / n}$
For example, the confidence interval for an infant mortality rate based on 10 infant
deaths and 300 live births is:
r = 10/300 * 1,000 = 33.3 per 1,000 live births
ul: $33.3 + 1.96\sqrt{1{,}000 \cdot 33.3 / 300} = 33.3 + 20.6 = 53.9$
ll: $33.3 - 1.96\sqrt{1{,}000 \cdot 33.3 / 300} = 33.3 - 20.6 = 12.7$
The interval (12.7, 53.9) shows that the area's true rate is not known with much
precision.
Suppose the numbers of births and infant deaths increased tenfold. Then:
r = 100/3,000 * 1,000 = 33.3 per 1,000 live births
ul: $33.3 + 1.96\sqrt{1{,}000 \cdot 33.3 / 3{,}000} = 33.3 + 6.5 = 39.8$
ll: $33.3 - 1.96\sqrt{1{,}000 \cdot 33.3 / 3{,}000} = 33.3 - 6.5 = 26.8$
The interval (26.8, 39.8) is much narrower than the one in the first situation.
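The two worked examples can be reproduced with a short function. This is a minimal sketch of the formula given above, using r, n, and p as defined in the note; the function name and the parameter names events and z are illustrative choices, not part of the original. Because the code carries the unrounded rate through the calculation, its limits may differ from the figures above in the last decimal place.

```python
import math


def rate_confidence_interval(events, n, p=1000, z=1.96):
    """Return (rate, lower limit, upper limit) for a rate per p.

    Uses the note's formula: r +/- z * sqrt(p * r / n), where
    r = events / n * p and z = 1.96 for a 95 percent interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    r = events / n * p
    half_width = z * math.sqrt(p * r / n)
    return r, r - half_width, r + half_width


# The two examples from the note: 10 deaths in 300 births, then tenfold.
for events, births in [(10, 300), (100, 3000)]:
    r, ll, ul = rate_confidence_interval(events, births)
    print(
        f"{events} deaths / {births} births: "
        f"r = {r:.1f}, interval = ({ll:.1f}, {ul:.1f})"
    )
```

Note that the formula can return a negative lower limit when the number of events is very small; this sketch does not adjust for that.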
Two common methods of increasing the number of occurrences are to combine years and
to combine smaller geographic areas into larger ones. The IHS Division of Program
Statistics normally uses three years of data to calculate rates, and places more emphasis
on Area-level rates than on service unit-level rates. Caution should be taken not to
combine too many years of data in attempting to calculate a stable rate. The resulting rate
will be misleading if conditions changed during the time period covered by the rate.
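Combining years amounts to summing the events and the denominators before computing the rate. The sketch below illustrates this with invented counts for a single service unit; the years, counts, and function name are hypothetical and are not IHS data. It applies the same confidence interval formula given earlier in this note.

```python
import math

# Hypothetical yearly counts for one service unit:
# year -> (infant deaths, live births). Not real IHS data.
yearly_counts = {1990: (3, 310), 1991: (5, 295), 1992: (2, 305)}


def pooled_rate(counts, p=1000, z=1.96):
    """Pool events and denominators across years, then compute the rate.

    Returns (total events, total denominator, rate, lower, upper).
    """
    deaths = sum(d for d, _ in counts.values())
    births = sum(b for _, b in counts.values())
    r = deaths / births * p
    half_width = z * math.sqrt(p * r / births)
    return deaths, births, r, r - half_width, r + half_width


# Single-year rates fluctuate widely on such small counts.
for year, (d, b) in sorted(yearly_counts.items()):
    print(f"{year}: {d / b * 1000:.1f} per 1,000 live births")

deaths, births, r, ll, ul = pooled_rate(yearly_counts)
print(
    f"Pooled: {deaths} deaths / {births} births = {r:.1f} "
    f"per 1,000, interval = ({ll:.1f}, {ul:.1f})"
)
```

Pooling yields one rate for the whole period, so it is only meaningful if conditions were reasonably stable across the years combined.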
A useful rule is that any rate based on fewer than 20 events in the numerator will have a
95 percent confidence interval about as wide as the rate itself (i.e., from 0.5 times the rate
to 1.5 times the rate). Therefore, it is preferable not to calculate rates based on fewer than
20 events, a situation that often arises for service units. The Indian population size needed to
yield 20 events varies with the likelihood of the event occurring. Since
maternal deaths occur on a relatively infrequent basis in the Indian population, no service
unit would have 20 such deaths even over a three-year period. Infant mortality occurs at
a greater rate, and therefore there would be some service units with at least 20 infant
deaths during a three-year period. If it is necessary to publish a rate based on fewer than
20 events, then there should be a note cautioning the reader that the rate is imprecise.
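The 20-event rule follows from the confidence interval formula given earlier. As a sketch, let k denote the number of events in the numerator (a symbol introduced here for illustration, not used in the original note), so that r = p k / n. The half-width of the interval, expressed as a fraction of the rate, then depends only on k:

```latex
\frac{1.96\sqrt{p \cdot r / n}}{r}
  = 1.96\sqrt{\frac{p}{r \cdot n}}
  = 1.96\sqrt{\frac{1}{k}}
  = \frac{1.96}{\sqrt{k}},
\qquad \text{since } r \cdot n = p \cdot k .
```

For k = 20 this fraction is about 0.44, for k = 16 about 0.49, and for k = 10 about 0.62. The interval therefore reaches roughly 0.5 to 1.5 times the rate once the number of events falls somewhat below 20, regardless of the size of the denominator.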