Statistical Considerations in Rate Comparisons

Rates are necessary for any assessment of public health measures, because the number of
occurrences in a population being studied will obviously depend on the size of the population at
risk. Estimated rates that are based on data for small populations, or where the event under
consideration is relatively rare, will exhibit considerable variability. Comparisons or rankings
based on these rates should be evaluated carefully before conclusions are reached.
An example is given here to illustrate some of the issues involved in rate comparisons and to
clarify some statistical terms.
Assume that the underlying rate of teen pregnancy is 30 per 1,000 females aged 13 to 19. The
underlying rate is the value that is presumed to apply for the circumstances of the population
being studied. Because of random variation, the number of pregnancies actually observed may be
considered one of a large series of possible results that could have arisen. The rate is
expressed as "30 per 1,000" only so that the numerator is scaled to a convenient size. The rate is
actually 0.03 and can be regarded as the probability that a woman in the specified age group will
become pregnant in that year, under the prevailing conditions. Because of this random variability,
in a given year, a county with 500 females in that age group and an underlying pregnancy rate
of 0.03 has a more than 11% chance of seeing 10 or fewer teen pregnancies and a more than 12%
chance of seeing 20 or more, even though 15 are expected. These probabilities are calculated
using the binomial distribution. (Although the assumptions for applying the binomial distribution
may not be strictly met, it is generally regarded as appropriate to apply it in this situation.)
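
The two tail probabilities quoted above can be reproduced directly from the binomial formula. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is illustrative and not part of the original discussion.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials, each with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 500, 0.03  # 500 females aged 13 to 19; underlying rate of 30 per 1,000

# P(X <= 10): chance of observing 10 or fewer pregnancies
p_low = sum(binomial_pmf(k, n, p) for k in range(0, 11))

# P(X >= 20): complement of P(X <= 19)
p_high = 1.0 - sum(binomial_pmf(k, n, p) for k in range(0, 20))

print(f"Expected count: {n * p:.0f}")
print(f"P(10 or fewer) = {p_low:.3f}")
print(f"P(20 or more)  = {p_high:.3f}")
```

Together the two tails account for roughly a quarter of all possible outcomes, so an estimate that is off by a third in either direction is not unusual at this population size.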
Since we do not know the true underlying rate, we estimate it using the data we have available.
Using the example above, if 10 pregnancies are observed in 500 women, the teen pregnancy rate
would be estimated as 20 per 1,000, while if 20 pregnancies occurred, the estimated rate would be
40 per 1,000. Neither estimate is a good match for the "true" rate of 30 per 1,000, and either could
lead to misleading conclusions, yet each is quite likely to result from one year's observations
on these 500 women. If the population "at risk" were substantially greater than 500, the
variability in estimated rates would be smaller. The problem is that in over one third of Texas
counties there are fewer than 500 women in the relevant age group. Practical solutions to this
problem include aggregating several years' data or combining several counties' populations.
Aggregating years is appropriate only if no changes that may affect the rate occur over time, and
combining counties only if the populations are sufficiently similar.
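
One way to see why aggregation helps is to look at the standard deviation of the estimated rate under the binomial model, sqrt(p(1 - p)/n), which shrinks as the population at risk n grows. The sketch below assumes the same underlying rate of 0.03; the population sizes other than 500 are hypothetical and do not describe any actual county.

```python
from math import sqrt

def rate_sd_per_1000(p, n):
    """Standard deviation of the estimated rate, per 1,000, for n people at risk (binomial model)."""
    return 1000 * sqrt(p * (1 - p) / n)

p = 0.03  # underlying rate of 30 per 1,000
for n in (500, 2500, 5000, 50000):
    sd = rate_sd_per_1000(p, n)
    print(f"n = {n:>6}: 30 per 1,000, give or take about {sd:.1f} (one standard deviation)")
```

Pooling five years of data for a county of 500 (n = 2,500) cuts the spread by a factor of sqrt(5), a little more than half, provided the underlying rate really is stable over those years.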
A statistical solution is the use of confidence intervals, presented with the point estimate for the
rate and giving an estimate of upper and lower bounds for the true rate at the desired level of
confidence (commonly 95%). The larger the population "at risk", the narrower the confidence
interval will be and the more precise the estimated rate. Comparisons between populations can
be based on confidence intervals for their rates. If the intervals overlap, we generally cannot say
with the required certainty that the true rates for the two populations are different, even if the
difference in estimated rates appears to be quite large, because the observed difference could
have occurred by chance.
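
As an illustration, the sketch below computes approximate 95% intervals for the two estimates from the earlier example (10 and 20 pregnancies among 500 women) and checks whether they overlap. It uses the Wilson score interval, which is one common choice for a binomial proportion and is an assumption here; the text does not name a particular interval method. The function names are illustrative. Checking for overlap is a simple, conservative comparison and not a formal test of the difference.

```python
from math import sqrt

def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion (z = 1.96)."""
    p_hat = events / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def intervals_overlap(a, b):
    """True if intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

county_a = wilson_interval(10, 500)  # estimated rate 20 per 1,000
county_b = wilson_interval(20, 500)  # estimated rate 40 per 1,000

for name, (low, high) in (("A", county_a), ("B", county_b)):
    print(f"County {name}: {1000 * low:.1f} to {1000 * high:.1f} per 1,000")
print("Intervals overlap:", intervals_overlap(county_a, county_b))
```

Although one estimated rate is double the other, the intervals overlap, so with only 500 women in each group the data do not clearly separate the two rates.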
There is another side to this problem. Confidence intervals for estimated rates for very large
counties will be narrow and unlikely to overlap, even if the numerical difference between the rates
is very small. In this situation, a statistically significant difference can be demonstrated between
the rates, although the difference may have little practical meaning in a public health context.
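
The opposite situation can be sketched the same way. The counts below are hypothetical (two populations of one million with rates of 30 and 31 per 1,000), and the interval is the simple normal approximation, which is an assumption chosen for brevity and is reasonable only because the counts are large.

```python
from math import sqrt

def normal_interval(events, n, z=1.96):
    """Approximate 95% interval for a proportion using the normal approximation."""
    p_hat = events / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

# Hypothetical large populations: 30 vs. 31 events per 1,000
large_a = normal_interval(30_000, 1_000_000)
large_b = normal_interval(31_000, 1_000_000)

for name, (low, high) in (("A", large_a), ("B", large_b)):
    print(f"Population {name}: {1000 * low:.2f} to {1000 * high:.2f} per 1,000")
print("Intervals overlap:", large_a[1] >= large_b[0])
```

Here the intervals do not overlap even though the rates differ by only 1 per 1,000; whether such a difference matters is a public health judgment, not a statistical one.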
This discussion is intended to illustrate the principles involved in making rate comparisons, not
specific statistical procedures. There are many more sophisticated statistical methods for making
rate comparisons, especially among multiple populations. The example given here involved the
binomial distribution; in this case, the sizes of both the numerator and the denominator affect the
variability of the rate. For some measures, specifically mortality rates, where the probability of
occurrence is very small, e.g. 20 per 100,000, the Poisson distribution is commonly used. Poisson
confidence intervals depend on the number of events only, but the population at risk has to be
sufficiently large to result in a reasonable number of events.
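
The dependence on the event count alone can be seen in a rough sketch. Under the Poisson model the relative standard error of a rate is 1/sqrt(d), where d is the number of events. The code below uses a normal approximation built on that quantity, which is an assumption made for simplicity; it is crude for small counts, where exact Poisson limits are usually preferred. The function name and the populations are hypothetical.

```python
from math import sqrt

def poisson_rate_interval(events, population, per=100_000, z=1.96):
    """Rate per `per` people with an approximate 95% interval, treating the count as Poisson."""
    rate = events / population * per
    rse = 1 / sqrt(events)  # relative standard error depends on the event count only
    return rate, rate * (1 - z * rse), rate * (1 + z * rse)

# Same number of deaths, different populations: the relative width is identical
for events, population in ((20, 100_000), (20, 1_000_000), (200, 1_000_000)):
    rate, low, high = poisson_rate_interval(events, population)
    print(f"{events:>3} deaths / {population:>9,}: "
          f"{rate:.1f} per 100,000 ({low:.1f} to {high:.1f}), "
          f"relative standard error {100 / sqrt(events):.0f}%")
```

With 20 events the relative standard error is still about 22%, which suggests why thresholds near 20 events are treated as a minimum and not as a guarantee of stability.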
Selection of a minimum "sample size" to control variability in rate estimates, based on either the
population at risk or the number of events, is not a straightforward matter. Commonly used
standards include requiring more than 20 deaths or, when rates are larger, as in some natality
measures, denominators greater than 20. Even with these restrictions, considerable
variation in estimated rates will be seen, and user judgment must always be a factor in reaching
conclusions based on estimated rates.
Last updated April 10, 2015