Lecture Slides - UCLA School of Public Health

Epi 242 Cancer Epidemiology
Fall, 2009
Hypothesis generating and testing
Descriptive Epidemiology
It is concerned with the distribution of
disease, including consideration of:
• what populations or subgroups do or do
not develop a disease (person),
• in what geographic locations it is most or
least common (place),
• how the frequency of occurrence varies
over time (time).

Descriptive Epidemiology

It deals with the frequency and distribution
of disease. It describes the general
characteristics of the distribution of a disease
with regard to person (age, sex, race, marital
status, occupation, etc.), place (variation among
countries and within countries, e.g., urban/rural areas),
and time (seasonal patterns or changes in
disease frequency over time). Information
on each of these characteristics can provide
clues leading to the formulation of an
epidemiologic hypothesis that is consistent with
existing knowledge of disease occurrence.
Descriptive Epidemiology

There are three types of study design for
descriptive epidemiology: (1) case reports
and case-series studies, at the
individual level; (2) correlation
(ecologic) studies, at the
population level; and (3) cross-sectional
studies, at the individual level.
Descriptive Epidemiology:
Population Distribution
Distribution of cancer in relation to person.
• “Who is getting the disease?”
• Demographic factors: age, sex, race,
marital status, occupation.
• Age, sex, and race are the three most
important factors in cancer descriptive
epidemiology.
• Age-specific cancer rate (Figure).
Graph 1 suggests that an exogenous agent, acting
continuously throughout life, is the major
etiologic factor, as in lung and esophageal cancers.
Graph 2 suggests that the etiologic factors are strongest in
early life. The decreased rate in the very old age group could be
explained by:
• Diminished exposure to an exogenous agent, or a birth cohort
effect;
• Elimination of a susceptible population subgroup (competing
risk);
• Changes in the host occurring in middle age, such as at menopause;
• Serious under-reporting in old age.
Graph 3 (bimodal curve), as seen in breast
cancer, suggests that different etiologic factors act
in early and late life.
Graph 4 suggests a strong etiologic factor at
an early age, as in liver cancer.
Graph 5: the curve peaks in childhood and increases
slowly in later life, as seen in leukemia or
sarcomas; this also indicates two different
carcinogens.
Graph 6 indicates a small number of
cases; the curve may not be reliable.
Figure 3. Age has no effect on susceptibility to some carcinogens. Left panel: cumulative mesothelioma risk in
US insulation workers. Right panel: cumulative skin tumour risk in mice treated weekly with benzo(a)pyrene.
Mesothelioma rates in humans [65] and skin tumour rates in mice [64] depend on time since first carcinogenic
exposure but not on age, suggesting an initiating effect of these carcinogens. Lung cancer incidence in
smokers depends on duration of smoking but not on age, and stops increasing when smoking stops [67],
indicating both early- and late-stage effects. Radiation-induced cancer incidence increases with age at
exposure above age 20, suggesting predominantly late-stage effects [3], although the large effect of childhood
irradiation also indicates an early-stage effect.
Geographic Distributions
Distribution of cancer in relation to place.
“Where are the rates of disease highest
and lowest?”
• Variations among countries
• Variations within countries, such as
between urban and rural areas
Distribution of cancer according to
time
“Is the cancer rate at present different from
the cancer rate in the past?”
• Seasonal patterns of the disease
• Time trends of the disease

[Figure: Tobacco Use in the US, 1900-1999. Per capita cigarette consumption and age-adjusted lung cancer death rates* (male and female), plotted by year.]
*Age-adjusted to 2000 US standard population.
Source: Death rates: US Mortality Public Use Tapes, 1960-1999, US Mortality Volumes,
1930-1959, National Center for Health Statistics, Centers for Disease Control and
Prevention, 2001. Cigarette consumption: US Department of Agriculture, 1900-1999.
[Figure: Trends in Ethanol Consumption in the US, 1960-97. Per capita consumption (gallons) by year, shown for total, beer, spirits, and wine. Source: NIAAA, NIH.]
[Figure: Trends in oral cancer incidence rates* in 9 SEER areas in the US by gender and race, from 1973-1975 through 1996-2000. Rate per 100,000 person-years by year of diagnosis, shown for white and black males and females.]
*Age standardized to 2000 US population
Trends in Overweight* Prevalence (%), Adults 18 and
Older, US, 1992-2001
[Figure: Trends in esophageal cancer incidence rates* in 9 SEER areas in the US by gender, race, and cell type, from 1973-1975 through 1996-2000. Rate per 100,000 person-years by year of diagnosis, shown for males and females, whites and blacks, and the SCCE and ACE cell types.]
*Age standardized to 2000 US population
Changes in cancer rates may be
caused by many factors:
• Changes in diagnostic techniques
• Changes in the accuracy of tumor registries
• Changes in age distribution, which may cause an
increase in crude rates
• Changes in survival
  - Improved treatment
  - Early diagnosis or screening
• Changes in the actual incidence of disease due to
alterations in environmental or life-style factors
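To illustrate why a change in age distribution alone can raise a crude rate, here is a minimal sketch of direct age standardization. All case counts, person-years, and standard-population weights are hypothetical, and the function names are illustrative only.

```python
# Direct age standardization: a minimal sketch with hypothetical numbers.
# Both populations share the same age-specific rates; only the age
# structure differs.

def crude_rate(cases, person_years):
    """Crude rate per 100,000 person-years (all ages pooled)."""
    return 100_000 * sum(cases) / sum(person_years)


def standardized_rate(cases, person_years, standard_weights):
    """Directly standardized rate per 100,000 person-years.

    Each age-specific rate is weighted by the share of that age group
    in a chosen standard population.
    """
    total_weight = sum(standard_weights)
    weighted = sum(
        (c / py) * (w / total_weight)
        for c, py, w in zip(cases, person_years, standard_weights)
    )
    return 100_000 * weighted


# Age groups: <40, 40-64, 65+ (hypothetical standard population shares)
standard_weights = [60, 30, 10]

younger_person_years = [80_000, 15_000, 5_000]
younger_cases = [8, 15, 25]      # age-specific rates: 10, 100, 500 per 100,000

older_person_years = [50_000, 30_000, 20_000]
older_cases = [5, 30, 100]       # same age-specific rates: 10, 100, 500

print(crude_rate(younger_cases, younger_person_years))
print(crude_rate(older_cases, older_person_years))
print(standardized_rate(younger_cases, younger_person_years, standard_weights))
print(standardized_rate(older_cases, older_person_years, standard_weights))
```

In this constructed example the crude rate of the older population is nearly three times that of the younger one (135 versus 48 per 100,000), yet the standardized rates are the same (86 per 100,000), because the age-specific rates are identical.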
The Sequence of Investigation
for Etiology of Disease

Formulating hypotheses

Testing hypotheses

Intervention
Formulate Hypotheses

The clinician makes an observation regarding
cause, based on his/her experience (case
report/case-series study). The epidemiologist
describes the distribution of the frequency of
the disease with regard to person, place, and
time (ecological studies, cross-sectional studies).
In addition, laboratory data also supply
information regarding potential
causes of the disease. These data from different
sources can be used to formulate
hypotheses.
Testing Hypotheses

These hypotheses may be tested in
sequence by retrospective (case-control)
studies and, if the results are positive, by
prospective (cohort) studies.
Sometimes only case-control
studies are available, since prospective studies take a
long time to complete.
Intervention

If risk factors are identified by both
retrospective and prospective studies, an
intervention trial may be designed to
ascertain whether modification of
such factors is followed by a reduction in
the amount of disease.
Hypothesis Generating
• A new hypothesis can affect the direction
of future research, and the success or
failure of the research depends on the
soundness of the hypothesis.
• By observing patterns and distribution of
cancer incidence, three methods can be used
to formulate hypotheses about disease
etiology.

Method of Difference
• If the frequency of a disease is markedly different in
two sets of circumstances, the disease
may be caused by some particular factor
that differs between them.
• If a cancer is very rare in one
country but very common in another,
this may suggest potential life-style
or environmental exposures.

Method of Agreement
• A single factor is observed to be common
to a number of circumstances in which a
disease occurs with high frequency.
• Cervical cancer occurs more frequently in women with
multiple sexual partners, in women whose
husbands had multiple sexual partners, and in
women whose husbands had penile cancer. All
of these circumstances indicate that a sexually
transmitted agent (or agents) may play an important
role in the etiology of cervical cancer.

Method of Concomitant
Variation

The frequency of a factor varies in
proportion to the frequency of disease.
Correlation studies are particularly useful
sources of data for this type of hypothesis
formulation.
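As a rough illustration of concomitant variation, the sketch below computes a Pearson correlation coefficient between a population-level exposure and a disease rate across several regions. The data are hypothetical, and the helper name `pearson_r` is an illustrative choice; a real correlation study would use measured population data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical ecologic data: one value per region.
per_capita_exposure = [500, 1000, 2000, 3000, 4000]
disease_rate_per_100k = [5, 12, 22, 35, 41]

r = pearson_r(per_capita_exposure, disease_rate_per_100k)
print(f"r = {r:.2f}")
```

A coefficient close to 1 says only that the two quantities vary together across populations. Because the data are aggregated, it cannot show that the exposed individuals are the ones who develop the disease, which is why such a result generates a hypothesis rather than tests it.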
Considerations in the Formation of
Hypotheses
• Biological basis and support of the hypothesis
• New hypotheses are commonly formed by
relating observations from several different
fields (e.g., clinical, pathological, and laboratory
observations)
• The stronger a statistical association, the more
likely it is to suggest a causal hypothesis (when
you generate hypotheses from existing data).

Considerations in the Formation of
Hypotheses
• Observation of changes in the frequency of a
disease over time, especially changes that
have occurred over a relatively short
period of time (lung cancer,
adenocarcinoma of the esophagus,
etc.)
• Clusters of unusual cancer cases may
indicate potential environmental
exposures

Stating A Hypothesis






• Study subjects: the characteristics of the
persons to whom the hypothesis applies.
• The risk factor or potential cause:
environmental or genetic factors
• The disease: the expected effect
• The exposure-response relationship
• The time-response relationship
e.g., “By reducing dietary fat from 40% to 20%
in white males with elevated PSA, the incidence
of prostate cancer will be reduced by 30% within five
years in this population”.
Hypothesis Testing

Study Design for Hypothesis Testing. Epidemiologic
studies can be classified in several ways:
prospective or retrospective, according to
the time frame of the study;
observational or experimental,
depending on whether or not the
investigator has control of some factors
(intervention factors, treatment) that may be
associated with a different outcome; and
descriptive or analytic, according to the
purpose of the study design (formulating or
testing hypotheses).
Hypothesis Testing

Analytic Epidemiology deals primarily
with the determinants of disease. In an
analytic study design, the investigator
assembles groups of individuals to
determine whether or not the risk of
disease is different for individuals
exposed than it is for individuals not
exposed to a factor of interest.
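The comparison of risk between exposed and unexposed individuals can be sketched as a relative risk and a risk difference. The counts below are hypothetical and the function names are illustrative.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the risk of disease in the exposed to the risk in the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def risk_difference(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Absolute excess risk in the exposed compared with the unexposed."""
    return exposed_cases / exposed_total - unexposed_cases / unexposed_total


# Hypothetical follow-up: 30 cases among 1,000 exposed,
# 10 cases among 1,000 unexposed.
rr = relative_risk(30, 1000, 10, 1000)
rd = risk_difference(30, 1000, 10, 1000)
print(f"relative risk = {rr:.1f}, risk difference = {rd:.3f}")
```

A relative risk of 1 would mean the risk is the same in both groups; values above 1 indicate higher risk among the exposed.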
Hypothesis Testing

There are three types of study design: (1)
case-control (case-reference) studies
(observational); (2) cohort
(retrospective/prospective) studies
(observational); (3) intervention
studies. We will focus our discussion on
two major study designs: case-control
studies and prospective studies.
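In a case-control study the usual measure of association is the odds ratio. The sketch below computes it from a 2x2 table with an approximate 95% confidence interval based on the standard error of the log odds ratio (Woolf's method). The counts are hypothetical, and the function name and argument order are illustrative choices.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       unexposed_cases, unexposed_controls, z=1.96):
    """Odds ratio and approximate confidence interval (Woolf's logit method).

    z = 1.96 corresponds to an approximate 95% interval.
    All four cells must be non-zero.
    """
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical study: 100 cases (40 exposed), 100 controls (20 exposed).
estimate, lower, upper = odds_ratio_with_ci(40, 20, 60, 80)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests the association is unlikely to be due to chance alone; it says nothing about bias or confounding.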
The Framework for the Interpretation
of An Epidemiological Study
Is there a valid statistical association?
• Is the association likely to be due to
chance?
• Is the association likely to be due to bias?
• Is the association likely to be due to
confounding?
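One common way to examine confounding is to stratify by the suspected confounder and compare the crude odds ratio with a stratum-adjusted one. The sketch below uses the Mantel-Haenszel summary odds ratio; the strata and counts are hypothetical and were constructed so that the crude association disappears after stratification.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical strata of a suspected confounder (e.g., present / absent).
strata = [
    (80, 40, 20, 10),   # confounder present: stratum-specific OR = 1
    (10, 20, 40, 80),   # confounder absent:  stratum-specific OR = 1
]

# Collapse the strata into a single crude table.
crude_table = tuple(sum(cells) for cells in zip(*strata))

crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {adjusted_or:.2f}")
```

Here the crude odds ratio is 2.25 while the adjusted odds ratio is 1.00, so the apparent association in this invented data set is explained entirely by the stratifying factor.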

The Framework for the Interpretation
of An Epidemiological Study
Can this valid statistical association be judged as
cause and effect?
• Is there a strong association?
• Is there biologic credibility to the hypothesis?
• Is there consistency with other studies?
• Is there evidence of a dose-response
relationship?
• Is the time sequence compatible?