Epidemiologic Measures of Event Frequency

A.
Preliminary Considerations
1.
Mathematical Aspects of Epidemiologic Measures:
a)
COUNTING (accurately, precisely, and reliably)
b)
NUMERATOR (those with existing or new condition, i.e. “cases”)
c)
DENOMINATOR (population in which existing or new condition is counted)
d)
ESTIMATION (all measures done in samples are used to estimate some “true”
characteristic of the population)
2.
Mathematical Types of Measures
a)
Ratio:
a)
Obtained by dividing one quantity by another without implying any
relationship between the numerator and denominator, i.e. x/y…e.g. (#
apples)/(# oranges).
b)
Range of values: (-) to (+)
c)
Epidemiology Examples: Risk Ratio, Rate Ratio, Odds Ratio; in fact, all
epidemiology measures that involve a numerator and a denominator are, by
definition, ratios (including prevalence and incidence)
b)
Proportion
a)
A specific kind of ratio in which elements included in the numerator must also
be included in the denominator, i.e. a/(a+b)…e.g. (# apples)/[(# apples) + (#
oranges)].
b)
All probabilities (including “chance”, “likelihood”) are of this form.
c)
Range of values: (0.00) to (1.00).
d)
Epidemiology Examples: Prevalence, Incidence Proportion, all expressions of
Risk.
c)
Odds
a)
Another specific ratio used to express the frequency (“likelihood” in a generic
sense) of an event, related to probability but with distinct mathematical
properties.
b)
Defined as the probability of an event occurring divided by the probability of
the event not occurring, i.e. [a/(a+b)] / [b/(a+b)] = a/b
c)
Range of values: (0.00) to (+∞)
d)
Epidemiology Examples: Prevalence Odds, Incidence Odds, and used in the
construction of the ever-present Odds Ratio.
d)
Rate
a)
Strictly defined as a velocity or speed:

A ratio in which there is a distinct numerator-denominator relationship in
which those in the numerator also contribute to the denominator, but they do
so in terms of time (or person-time)

Thus, all true epidemiologic “rates” have time as intrinsic to the
denominator and are expressed with units of person-time in the
denominator, i.e. a/(total “person-time” for a+b).

Range of values: (0.00) to (+∞)

Epidemiology Example: Incidence Rate, Mortality Rates
b)
“Rate” is used in a variety of ways by epidemiologists and others, not all of
which are strictly correct.

For example, to speak of “prevalence rate” is incorrect since prevalence is a
proportion, not a rate.

For our purposes, use the term “rate” only for measures in which person-time
is the unit of measurement in the denominator (either implied or stated).

This means, for example, that 10/10,000/year is not a technically
correct rate expression (i.e. not equivalent to 10/10,000 person-years),
although this form is commonly referred to as a “rate” [a]
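The four mathematical types above can be sketched in a few lines of Python. This is only an illustration: the function names and the counts (3 apples, 9 oranges, 10 events in 10,000 person-years) are assumptions chosen for the example, not figures from a study.

```python
import math

def ratio(x, y):
    """Generic ratio x/y: numerator and denominator need not be related."""
    return x / y

def proportion(a, b):
    """a/(a+b): the numerator is included in the denominator, so 0 <= result <= 1."""
    return a / (a + b)

def odds(a, b):
    """[a/(a+b)] / [b/(a+b)], which simplifies to a/b."""
    return a / b

def rate(events, person_time):
    """Events per unit of person-time (time is intrinsic to the denominator)."""
    return events / person_time

apples, oranges = 3, 9  # hypothetical counts
p = proportion(apples, oranges)

print(ratio(apples, oranges))   # apples per orange
print(p)                        # apples as a share of all fruit
print(odds(apples, oranges))    # same value as p / (1 - p)
print(rate(10, 10_000))         # events per person-year
```

Note that the ratio and the odds are numerically equal here (both are a/b); the difference lies in the interpretation, since odds require a and b to be the "event" and "no event" parts of one group.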
3.
Mathematical Form of Epidemiology Measures
a)
Epidemiology measures (except simple counts) should be expressed with numerators
and denominators that promote ease of understanding and comparison:
a)
Numerators should generally have at least one whole-number digit and no more
than one decimal place
b)
Denominators should be appropriately converted to a power of 10 (e.g. 100 or
%; 1,000; 10,000; 100,000)
c)
Example: Six events in 6,000 people simplifies, mathematically, to 0.001, but
would be more appropriately expressed in epidemiology as 1/1,000 (or, for
example, 100/100,000 if being compared to another measure with 100,000 as
the denominator).
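As a rough sketch of this convention, the helper below picks the smallest power-of-10 denominator that gives a numerator of at least 1, and a second helper re-expresses a measure against a chosen denominator for comparison. Both function names are hypothetical, and exact fractions are used only to avoid floating-point surprises when testing for "at least 1".

```python
from fractions import Fraction

def per_power_of_ten(events, population):
    """Return (numerator, denominator): denominator is a power of 10,
    numerator is at least 1 and rounded to one decimal place."""
    if events <= 0 or population <= 0:
        raise ValueError("events and population must be positive integers")
    value = Fraction(events, population)
    denominator = 1
    while value * denominator < 1:
        denominator *= 10
    return round(float(value * denominator), 1), denominator

def rescale(numerator, denominator, new_denominator):
    """Re-express numerator/denominator against a different denominator."""
    return numerator * new_denominator / denominator

print(per_power_of_ten(6, 6000))   # six events in 6,000 people
print(rescale(1, 1000, 100_000))   # the same measure per 100,000
```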
4.
Time
a)
Time is a crucial component of all epidemiology measures of event frequency and their
interpretation; time should always be specified explicitly
a)
Point in time: point prevalence
b)
Time period: chronologic time over which subjects are observed – incidence
proportion or period prevalence.
c)
Person-time observed: total sum of the amount of time each subject is observed
at risk (expressed in person-years, person-months, person-days etc – where
time is intrinsic to the measure) - incidence rate
B.
Some Introductory Definitions:
1.
Burden:
a)
The amount/frequency of an event or disease in a population; generally referring to
existing events/disease in a defined population at a point in time [b]
b)
Estimated by Prevalence measures.
2.
Risk:
a)
Most simply defined as the probability (likelihood/chance) of developing an outcome of
interest (event or disease) over a specified amount of time
b)
Risk is always expressed as the proportion of new events occurring over a specified
time period in a defined population at risk for acquiring the event.
c)
Risk is estimated by Incidence, calculated as either a Rate or a Proportion (see below)
[a] Note that the form 10/10,000/year (events per population per time) is a “rate” as defined in the textbook and in Last’s “Dictionary of
Epidemiology”. We will, however, consider a measure a rate only if it has person-time at risk in the denominator (representing the
“population at risk”).
[b] One could also consider, although less commonly, “burden” in terms of new events/disease. In this sense burden would refer to
incidence, not prevalence. For example, the occurrence of five new diabetics each year (as number of cases or cases per
population) represents the “burden” of new diabetics who will need educational and support services for their new diagnosis.
3.
Population
a)
Definition: A collection of individuals (usually people) sharing a specified characteristic
or set of characteristics (usually includes specification of time and geography)
a)
Example 1: All people living in the Portland metropolitan area on Jan. 1, 2006
b)
Example 2: All men ages 30-39 in the Portland metropolitan area on Jan. 1,
2006 who were followed for a year
c)
Example 3: All entering HIP students during the years 2005-2010 followed
through the five years after graduation.
b)
Open Population: a defined population that may gain membership through birth or
immigration or may lose membership through emigration or death from causes other
than the one under study.
a)
Example 1: Those living in Multnomah County during a study of cancer
incidence over a five-year period.
b)
Example 2: The Nurses Health Study, a cohort study across a long enough time
period that some are lost to follow-up or die.
c)
Closed Population: a population that neither adds nor loses membership over the course
of time
a)
Example: Outbreak investigation for a small gathering where complete follow-up
of attendees is feasible.
b)
Example: Clinical studies, using small cohorts followed over a short time that
allows for complete follow-up (although these are uncommon, and too much
loss to follow-up will convert these into “open” populations)
d)
Population At-Risk: a specified population of individuals capable of acquiring the
condition or event of interest. This term does not refer to those with one or more risk
factors, i.e. “at-risk” does not mean “high risk”.
a)
Relevant for all cohort and follow-up studies (including randomized trials).
b)
Conceptualized in terms of either number of people or amount of time
c)
Example: For uterine cancer the population at risk would include all women
(no men) with a uterus (excludes those with a hysterectomy) who do not
already have uterine cancer.
d)
The actual population studied may be further restricted to a specific group of
interest within the broader “at risk” population, e.g. for uterine cancer the study
population might be restricted to only adult women in a specified age range.
4.
Sample (estimation)
a)
Because we can never adequately observe all (or all possible) members of a population,
we select a sample (smaller group) of individuals from the population to “represent” the
population and the characteristics and experiences of that population.
b)
Representative Sample: Investigators have a number of ways to obtain samples to
adequately “represent” the population (e.g. random samples); the important points are:
a)
Random samples are ideal for the statistics but often impracticable.
b)
It is critical to have a sampling method that maximizes the ability to obtain a
representative sample.
c)
How samples are selected, recruited to participate, and retained provide clues
to whether the sample represents the stated population of interest (or,
conversely, what population it represents).
C.
Measures of Frequency
1.
Counts
a)
Definition: the number of affected individuals who either have (existing – prevalent) or
acquire (new – incident) the condition/event (i.e. includes numerator events only).
b)
Uses:
a)
Resource allocation: defining the number of existing or new cases that will
need services (e.g. 100 people who need services will need those services
whether they come from a small population or a large one).
b)
Identifying trends (i.e. comparisons across time) when we can legitimately
assume that the underlying population changes little.
c)
Issues:
a)
Critical pieces of information:

Adequate definition of “case” (condition or event of interest)

Denominator: making explicit the nature and size of the source population
within which the existing or new events are to be found

Time: the time at or during which the conditions/events are counted
b)
Difficult to use count data when making certain comparisons across different
groups/populations or when evaluating frequency in a sample that we want to
generalize to the population.
2.
PREVALENCE:
a)
Point Prevalence: the proportion in a population with a particular existing condition
(prevalent cases) at a specific point in time.
a)
“Point in time”

Usually refers to a general or specific temporal point (e.g. a short survey
period – December, 2004; or a specified date – December 31, 2004)

May also refer to a “point” in the life cycle (e.g. birth, entry into graduate
school, retirement)
b)
Calculation:
(# Existing cases at a point in time) / (total specified population at that point)
c)
Interpretation

The amount (“status” or “burden”) of existing condition in the population at
a given point in time.
d)
Example: A study in metropolitan Atlanta in 1996 identified 577 children (ages
3-10) with autism in a population of 169,710 white children, yielding a
prevalence of 3.4/1,000.
b)
Period Prevalence: the proportion in a population with a particular existing condition
at any time during a specified time-period.
a)
This mixes prevalent (existing) and incident (new) cases

New cases that develop during the period become “existing cases” and are
added to the cases present at the beginning of the period

Any cases “existing” at any time during the observation period will be
included even if the condition resolves during that period.
b)
Uses:

For ambiguously defined conditions or those with ambiguous onset, this may
allow the capture of cases that exist but haven’t quite met the threshold of
definition (e.g. mental health conditions)

For acute short-duration conditions where point prevalence would be low
and would not capture the extent of occurrence; most of the cases are likely
to be “new” and thus come close to estimating incidence.
c)
Calculation:
(# Existing cases at any time during a time period) / (total specified population)
d)
Interpretation:

The amount (“status” or “burden”) of existing condition in the population at
any point within a specified time period.

Lifetime prevalence is a special application of period prevalence
e)
Example: In a sample of U.S. adults (ages 18-44 years), 7.7% reported having
had a serious mental health disorder at some point during the prior 12 months
c)
Prevalence Odds: the odds of occurrence of an existing condition in a population at a
specified point in time.
a)
An alternate way to assess/measure frequency of existing events in a
population (prevalence)
b)
Calculation:
(# with Existing Condition) / (# without Existing Condition)
c)
Interpretation: The odds of having a condition at a point in time in a specific
population.
d)
Uses: For constructing Odds Ratios in cross-sectional and case-control studies
e)
Example: In a group of 250 subjects, 125 were exposed (had the event –
exposure – of interest); the odds of exposure is 1:1 or 1.0 (125/125)
d)
Issues for Prevalence measures:
a)
Need clear definition (and means of operationalizing that definition) of who to
include in the numerator (“case” definition) and the relevant denominator (the
population within which “cases” exist).
b)
Define and state the specific time (point or period) involved.
c)
Prevalence is determined by factors that affect:

How fast new disease is added to the population (i.e. incidence)

How long the disease “exists” in the population (i.e. average duration until
resolution, cure, or death)
d)
Prevalence is not a measure of “risk” (probability) of getting the disease
(although it is related to risk as noted above).
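The difference between point and period prevalence can be sketched with a few case records. Everything in this example is hypothetical (the four onset/resolution dates, the population of 1,000, and the function names); it only illustrates which cases each numerator counts.

```python
from datetime import date

# Hypothetical case records: (onset, resolution); None means still ongoing.
cases = [
    (date(2004, 1, 10), date(2004, 3, 1)),
    (date(2004, 6, 5), None),
    (date(2004, 11, 20), date(2004, 12, 15)),
    (date(2005, 2, 1), None),
]
POPULATION = 1000  # hypothetical total specified population

def point_prevalence(cases, population, day):
    """Cases existing on one day, divided by the population on that day."""
    existing = sum(
        onset <= day and (resolved is None or resolved >= day)
        for onset, resolved in cases
    )
    return existing / population

def period_prevalence(cases, population, start, end):
    """Cases existing at any time in [start, end], even if they resolve."""
    existing = sum(
        onset <= end and (resolved is None or resolved >= start)
        for onset, resolved in cases
    )
    return existing / population

def prevalence_odds(with_condition, without_condition):
    """(# with existing condition) / (# without existing condition)."""
    return with_condition / without_condition

print(point_prevalence(cases, POPULATION, date(2004, 12, 31)))
print(period_prevalence(cases, POPULATION, date(2004, 1, 1), date(2004, 12, 31)))
print(prevalence_odds(125, 250 - 125))  # the 250-subject exposure example
```

On December 31, 2004 only one of these hypothetical cases is still active, while three existed at some time during 2004, so the period prevalence is three times the point prevalence.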
3.
INCIDENCE:
a)
General Definition: the occurrence of new events or cases that develop in a population
at risk during a specified time interval.
Incidence = (# New Events observed over time) / (Population at Risk observed over time)
b)
Population at Risk refers to the population comprised of individuals who are capable
of becoming new cases (i.e. free of the condition under consideration and can get it).
This definition can be thought of in one of two ways:

Person-based definition (at risk persons observed): the population of all
individuals capable of acquiring the event or condition at the beginning of an
observation period. This only works when all individuals can be observed
throughout the specified time interval and for the same amount of time (i.e.
complete follow-up in a closed population)

Time-based definition (at risk person-time observed): The sum total of
every individual’s observed time-at-risk, i.e. the event-free time during
which each individual is observed. Individuals start contributing person-time
when they enter the population or observed sample and they stop
contributing person-time when they leave the at-risk population (by loss to
follow up, death, or developing the event of interest) [c]

Example: 100 initially at-risk individuals, all followed for 5 years with one
new event at the end of each year (5 total):

Person-based population-at-risk: 100 people followed for 5 years

Time-based population at risk: (95 followed for 5 yrs)+(1 for 1
yr)+(1 for 2 yrs)+(1 for 3 yrs)+(1 for 4 yrs)+(1 for 5 yrs) =
(95*5)+(1)+(2)+(3)+(4)+(5)=490 person-years – which translates
approximately to 490 people followed, on average, for 1 year

NOTE: time is essential for both formulations of population at risk, but it is
‘outside’ the formulation as people and ‘inside’ the formulation as time (or,
more correctly, as person-time)
c)
Two forms of Incidence – Because of these two different formulations of the
population at risk (the denominator for incidence) we have two different ways of
constructing incidence measures:

Incidence Proportion (IP) – population at risk is the total number of at-risk
people observed over the observation period.

Incidence Rate (IR) – population at risk is the total amount of at-risk
time observed over the observation period.

Example (from above):
IP = (5 new events)/(100 initially at risk) over 5 years = 5/100 (or
5%) over 5 years (or, on average, 1/100 (or 1%) over 1 year).
IR = (5 new events)/(490 person-years) = 1.02/100 person-years
(which may be interpreted as 1.02/100 persons followed on average
for 1 year)
[c] Three methods for estimating Person-Time (see supplementary handout): (1) Add together the specific contribution of at-risk
person-time for each individual observed (requires that the specific times of follow-up onset and event occurrence are available); (2) Credit
each subject getting the event, lost to follow-up, or entering the observed group midstream as contributing, on average, half of the
observation period (requires an assumption that these events occur equally throughout the observation period); (3) Multiply the
mid-point population by the duration of observation (again, assumes the net change in the population size occurs equally throughout the
period of observation)
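The 100-person example can be checked with a short sketch. The record layout (years at risk, whether the event occurred) and the function names are assumptions made for illustration; person-time is summed directly, subject by subject.

```python
# One (years_at_risk, had_event) pair per subject: five subjects develop the
# event at the end of years 1..5, and 95 stay event-free for all 5 years.
cohort = [(year, True) for year in range(1, 6)] + [(5, False)] * 95

def count_events(records):
    return sum(1 for _, had_event in records if had_event)

def person_time(records):
    """Sum of every subject's observed event-free time."""
    return sum(years for years, _ in records)

def incidence_proportion(records):
    """New events / persons at risk at the start (closed population only)."""
    return count_events(records) / len(records)

def incidence_rate(records):
    """New events / total at-risk person-time."""
    return count_events(records) / person_time(records)

print(person_time(cohort), "person-years")
print(f"IP = {incidence_proportion(cohort):.0%} over 5 years")
print(f"IR = {incidence_rate(cohort) * 100:.2f} per 100 person-years")
```

The IP denominator counts people (100), while the IR denominator counts their at-risk time (490 person-years), which is why the two measures carry different units.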
d)
Incidence Proportion (or Cumulative Incidence) – the proportion of individuals in
an at-risk population who develop a condition or event over a specific period of time.
d)
This measure of incidence can only be calculated when all individuals have
complete follow-up throughout the observation period or are followed for the
same amount of time, i.e. only with closed populations.
e)
Interpretation:

Represents an estimate of the “risk” of developing a disease or condition
within a specified time in a specific population-at-risk.

Expressed as the proportion of an at-risk population experiencing an event or
acquiring a condition over a specified time period.

You can also think of this as the accumulated (thus “cumulative”) effect of
the incidence rate operating on the at-risk population over the specified time
period (see below).
f)
Mathematics:

Numerator: number of new events during a specified time period

Denominator: number of subjects followed throughout the specified period.

Calculation:
(# new events in a specified period) / (population at risk at onset of that period)
(expressed as # events/10^x over a year)
g)
Issues for Incidence Proportion

Relevant time period must always be specified (e.g. “over x years”) just as
you would in any Risk statement.

This measure is intuitively appealing but does not account for the fact that
those who become cases during the time period continue to be followed even
though they are no longer “at risk”

Assumptions:

Entire population is “at risk” at the beginning i.e. includes no one
who has the condition (prevalent cases) or who can’t get it

Ascertainment and follow-up are complete with no additions to
or subtractions from the population (i.e. it can only be calculated
in a “closed” population)

All subjects are followed for an equal amount of time (even if not
concurrently)
h)
Examples:
Six hundred people were recruited in January 2002; all were followed
until December 2003, and 12 acquired the event. IP = 12/600 over 2
years = 2/100 (2%) over 2 years or, on average, 1/100 (1%) over 1 year.
Eight hundred newborns were recruited from 2000 through 2005.
Twenty-four developed a condition within the first year of life. IP = 24/800
over 1 year = 3/100 over 1 year. [Note: Although the study period is 5 years,
that was the recruitment period for the study subjects; each study subject was
followed for only one year after birth (at least for purposes of this study).]
b)
Incidence Rate – the instantaneous rate or speed at which new cases are developing
across a specified amount of observed at-risk time (i.e. in an at-risk population). It is
also known as “Incidence Density”, “Force of Mortality”, or “Force of Morbidity”
b)
The denominator of this measure of incidence includes only the amount of
time during the observation period when observed individuals are at risk. Thus
it can be calculated even when one does not have complete or equal follow-up
times, i.e. it can be calculated for both closed and open populations
c)
Mathematics:

Numerator: number of new events in a specified period of time

Denominator: total person-time in the at-risk subjects observed at any time
during the specified period

Calculation:
(# new cases in a specified period)/(total event-free person-time observed)
(expressed as # cases/10^x person-years)
d)
Person-Time [d] – the sum of every individual’s observed time-at-risk, i.e. the
disease- or outcome-free time during which each subject is observed over the
course of the study period.

This makes time intrinsic to the denominator in a way very
different from the expression of time in an incidence proportion

Although time is most often measured in years (thus person-years),
it may be measured in other units (e.g. person-days,
person-weeks, person-months) as appropriate for the event or
condition under consideration
e)
Incidence Rate (person-time) allows us to account for people/subjects who
move in and out of an at-risk observation group (i.e. an open population) or in
and out of exposure categories (e.g. smokers who become non-smokers and
vice-versa).
f)
Interpretation of Incidence Rate:

Research and practice: IR is almost always used as an estimate of “Risk”
in open populations, which is important because epidemiologists usually
work with open populations (or samples).

Works best when incidence is relatively low and the observation
time is short (which is the case for most events/conditions)

Offers the only way to estimate risk in an open population

Theoretical: IR also estimates the concept of average instantaneous speed
or rate which, acting at all times on an at-risk population over a specified
time period, produces an accumulation of new cases in a population at risk.

Can be thought of as providing a constant pressure (thus “force
of…”) on the population to produce events

This particular use of IR is almost entirely theoretical – to help
understand the dynamic nature of new event development – and
is only relevant when calculated in a closed population
g)
Issues:

Assumes a reasonable estimate of person-time can be developed

Assumes the distribution of individual person-time is not important (e.g. that
100 persons with 1 year of risk exposure is equivalent to 5 individuals with
20 years.)
h)
Special case of recurring events:

Person-time allows us to formulate an appropriate at-risk denominator when
events can recur, i.e. when individuals do not leave the at-risk pool when they
acquire the event/condition (or only leave temporarily).

Example: Frequency of abuse events in a sample of women
followed over 2 years. Risk can be measured as an IP for first
events, or IR where all events, including recurrent, are included
in the numerator and subjects remain at risk throughout the
observation period
[d] Person-time is the sum of every individual’s observed time-at-risk (the disease-free time during which each was observed). This
can be estimated directly by adding every individual’s time of disease-free observation or it can be estimated indirectly in two
ways: 1) by using the mid-period population multiplied by the length of the period (this assumes that on average people come
and go equally through the period); and 2) by assuming that those entering or leaving the population (by getting the condition,
dying or moving away) do so on average half-way through the period of observation. Person-time is usually expressed in
person-years but it could be person-hours, person-days, or person-weeks depending on the characteristics of the condition under study.
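The two indirect person-time estimates described in the person-time footnote can be sketched as follows. The function names are hypothetical, and the inputs reuse the earlier 100-person example (5 events over 5 years); the mid-period population of 98 is an assumption derived from that example, since two events have occurred by the 2.5-year mark.

```python
def person_time_midpoint(mid_period_population, period_length):
    """Mid-period population multiplied by the length of the period."""
    return mid_period_population * period_length

def person_time_half_period(n_full_follow_up, n_entering_or_leaving, period_length):
    """Subjects who enter or leave are credited with half the period on average."""
    full = n_full_follow_up * period_length
    partial = n_entering_or_leaving * period_length / 2
    return full + partial

# Direct summation for the same example gives 490 person-years.
print(person_time_midpoint(98, 5))
print(person_time_half_period(95, 5, 5))
```

Both estimates land close to the directly summed figure here because the events are spread evenly across the period, which is exactly the assumption the indirect methods rely on.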
c)
Incidence Odds: the odds of occurrence of a new condition in a population at risk in
a specified period of time
b)
An alternate way to assess/measure the frequency of occurrence of new events
in a population at risk over a specified time (incidence)
c)
Calculation:
(# Developing the New Condition over time) / (# Not Developing the New Condition over time)
d)
Interpretation: The odds of developing a condition in a population at risk over
a specified period of time
e)
Uses: For constructing Odds Ratios in cohort studies (although this is a less
important measure than prevalence odds, since in cohort studies we can
calculate incidence rates or proportions directly).
f)
Example: In a sample of 140 subjects followed for 2 years, 20 developed the
outcome: Odds of the new event over two years is 20/120 = 1:6 (or 1/6 = 0.17)
d)
Critical Issues related to all Incidence measures
b)
Always need a clear indication of the relevant time period of observation and
how it will be incorporated into the measure (extrinsically in the IP,
intrinsically in the IR).
c)
Need a clear definition and means of identifying the numerator – as new
“cases” of a specified event, disease or condition.
d)
Need a clear definition and means of identifying the denominator – as
specifically relating to the population at risk for the event, disease or condition.
4.
Relationship among Prevalence and Incidence measures
a)
Prevalence depends upon both Incidence (the rate at which disease or events occur in
the population) and Average Duration of disease/events:
Prevalence ≈ (Incidence Rate) * (Average Duration) [e]

This approximation works well only when the disease prevalence is low
(<10%) and it assumes that the population dynamics are in a “steady state”,
i.e. that the incidence rate and disease duration are constant.
b)
Incidence Rate acts on a population-at-risk over a period of Time to produce an
accumulation of cases in that population at risk. That accumulation of cases over time is
expressed in the Incidence Proportion:

Incidence Proportion ≈ (Incidence Rate) * (Time)

This approximation works well only when the underlying incidence is low
(<10% Incidence Proportion) and when the observation time period is short
relative to the duration of the condition. [Note: Both of these assumptions
hold in most epidemiologic and clinical studies – except when dealing with
high incidence situations (e.g. epidemics) or prolonged periods of time (e.g.
incidence over 10-20 years or longer.)]
[e] This most clearly illustrates conceptually the importance of incidence and average duration on Prevalence. It is a simplification of
the actual approximation: IR = (Prev) / [(1-Prev)*(Dur)]; when prevalence is low (1-Prev) approaches 1, thus simplifying the
expression.
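A small numerical sketch shows why these approximations are limited to low-frequency situations. The steady-state function simply solves the footnote's expression IR = Prev / [(1-Prev)*(Dur)] for Prev. The exponential formula for the incidence proportion is not from this handout; it is the standard result for a constant rate acting on a closed population with no competing losses, and is included here as an assumption for comparison. The rates and durations are made-up inputs.

```python
import math

def prevalence_simple(ir, duration):
    """Prevalence ~ IR * average duration."""
    return ir * duration

def prevalence_steady_state(ir, duration):
    """Solve IR = Prev / [(1 - Prev) * Dur] for Prev."""
    x = ir * duration
    return x / (1 + x)

def ip_simple(ir, time):
    """Incidence proportion ~ IR * time."""
    return ir * time

def ip_constant_rate(ir, time):
    """Assumed comparison formula: 1 - exp(-IR * time)."""
    return 1 - math.exp(-ir * time)

for ir in (0.01, 0.1):  # hypothetical rates per person-year
    years = 5
    print(
        f"IR={ir}: prevalence {prevalence_simple(ir, years):.3f} vs "
        f"{prevalence_steady_state(ir, years):.3f}; "
        f"IP {ip_simple(ir, years):.3f} vs {ip_constant_rate(ir, years):.3f}"
    )
```

With the lower rate the simple products stay close to the fuller expressions; with the higher rate they overshoot noticeably, in line with the <10% guidance above.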
D.
Miscellaneous Epidemiologic Measures:
1.
Mortality

Measures of mortality are expressed as rates. The average person-years of
observation in the denominator is estimated by the mid-year population
multiplied by the duration of observation (most often 1 year)

Mortality Rates are often expressed as proportions, restating the Rate as a
Risk. If you do this, call them Risks not Rates and explicitly state that you
are using the Rate to estimate the Risk
b)
Crude Mortality
b)
Definition – the rate of dying from all causes in a total population.
c)
Calculation:
[(Total # Deaths All Causes) / (Total Mid-Year Population x 1yr)] * 10^x P-Y [f]
(expressed as Deaths/10^x Person-Years)
c)
Cause-Specific Mortality
b)
Definition – the rate of dying from a specified cause in the total population.
c)
Calculation
[(Total # Deaths specific cause) / (Total Mid-Year Population x 1yr)] * 10^x P-Y [g]
d)
Category Specific Mortality
b)
Definition – the rate of dying from all causes within a specified group (e.g.
females, those in a specific age group, etc.).
c)
Calculation
[(Total # Deaths in specified group) / (Group Mid-Year Population x 1yr)] * 10^x P-Y [g]
2.
Case-Fatality (a form of incidence proportion, not a rate)
a)
Definition – the proportion with a particular disease/condition that die from that
disease/condition in a specified time period
b)
Calculation
[(# Deaths from a specified disease) / (Total # with the specified disease)] * 10^x
3.
Attack Proportion (a form of incidence proportion, not a rate)
a)
Definition – The proportion of those in a given exposure category that develop the
disease/condition of interest.
b)
Usually used in the context of outbreaks/epidemics, so the time of observation is often
left out of attack proportion expressions.
c)
Calculation
[(# Developing a disease/condition) / (Total # in an exposure category)] * 10^x
[f] Usually in these measures, Deaths are counted for one year and the Total Person-Years is estimated as the (Mid-year Population *
1 year), since mortalities are usually calculated for a one-year observation period. The multiplier (x) may be any power of 10 to
produce easily interpretable results (i.e. an integer numerator).
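The mortality, case-fatality, and attack-proportion calculations in this section can be sketched as below. All counts are invented for illustration, and the function names and the `per` multiplier argument are assumptions; person-years are estimated as mid-year population times the years observed, as the footnote describes.

```python
def mortality_rate(deaths, mid_year_population, years=1, per=100_000):
    """Deaths per `per` person-years; person-years = mid-year population * years."""
    person_years = mid_year_population * years
    return deaths * per / person_years

def case_fatality(deaths_from_disease, total_with_disease, per=100):
    """Proportion of those with the disease who die from it (not a rate)."""
    return deaths_from_disease * per / total_with_disease

def attack_proportion(new_cases, total_in_exposure_category, per=100):
    """Proportion of an exposure category that develops the condition."""
    return new_cases * per / total_in_exposure_category

# Hypothetical one-year figures for a population of 100,000 at mid-year.
print(mortality_rate(850, 100_000))   # crude: all deaths, total population
print(mortality_rate(200, 100_000))   # cause-specific: deaths from one cause
print(mortality_rate(300, 40_000))    # category-specific: one group's deaths and population
print(case_fatality(5, 50))           # percent of cases that die
print(attack_proportion(30, 75))      # percent of the exposed who become ill
```

The three mortality measures share one function because only the choice of numerator and denominator population changes between them.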