INTRODUCTION
a whole host of study designs are available
the choice is dependent upon the amount of information that is already known about a particular health issue
if little is known, the approach should rely on existing data and a quick and dirty design – why?
as knowledge of the phenomena increases, then so will the complexity and cost of the study
major study designs differ from one another in several respects:
o number of observations made: as little as one or up to several
o directionality of exposure: it varies relative to disease (particularly chronic disease); start with subjects that have the disease and perform a retrospective study; start with disease free subjects and follow them (prospective)
o data collection methods: some methods require use of previously collected data; while some require data collection
o timing of data collection: questions may arise as to the quality and applicability of the data – particularly if long periods of time have elapsed
o unit of observation: an entire group or one individual
o availability of subjects: some subjects might not be available for whatever reason
emphasize: how subjects are selected; how designs fit in the spectrum of design options and how each design has inherent strengths and weaknesses
OBSERVATIONAL VERSUS EXPERIMENTAL APPROACHES IN EPIDEMIOLOGY
2 basic facets of research design
1) manipulation of study factor (M) – exposure is controlled
2) randomization of study subjects (R) – chance determines likelihood of assignment to exposure conditions
the various permutations produce three study types: experimental (M and R); quasi-experimental (M); and observational (neither M nor R)
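The mapping from the two facets (M, R) to a study type can be sketched as a small function. This is an illustrative sketch only; the function name `classify_design` is hypothetical, and treating "R without M" as an error is an assumption, since the notes list only three study types.

```python
def classify_design(manipulation: bool, randomization: bool) -> str:
    """Map the two design facets (M, R) to one of the three study types."""
    if manipulation and randomization:
        return "experimental"
    if manipulation:
        return "quasi-experimental"
    if randomization:
        # Assumption: the notes name only three types, so randomization
        # without manipulation is treated here as not a listed design.
        raise ValueError("randomization without manipulation is not a listed design")
    return "observational"


print(classify_design(True, True))    # experimental
print(classify_design(True, False))   # quasi-experimental
print(classify_design(False, False))  # observational
```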
OVERVIEW OF STUDY DESIGNS USED IN EPIDEMIOLOGY
Experimental
greatest control over the research setting; the study factor (exposure) is manipulated and subjects are randomly assigned to exposed and non-exposed groups
clinical trials – used primarily in research and teaching hospitals
community interventions – widespread impact on a population’s health – oriented toward education and behavioral change (smoking cessation classes)
Quasi-Experimental
manipulation of study factor, but not randomization of study subjects – natural experiments
can be used to evaluate the extent to which the programs meet public health goals
Observational Studies
neither manipulation of study factor nor randomization of subjects
an experiment might be impractical or unethical
make use of careful measurement of patterns of exposure and disease in populations
2 main types:
1) descriptive studies – case reports, case series, cross-sectional surveys; individual health characteristics with respect to person, place and time
2) analytic studies – ecologic studies, case-control studies, and cohort studies; designed to test specific etiologic hypotheses, generate new ones, and suggest mechanisms of causation
The 2 X 2 Table
this model tends to underestimate the complexity of the potential linkage between exposure and disease – however, it does provide a conceptual model for understanding more complex issues
                        Disease Status
Exposure status     Yes       No        Total
Yes                 A         B         A + B
No                  C         D         C + D
Total               A + C     B + D     N
total of individuals with disease = A + C; free from disease = B + D; and so on for exposed and non-exposed – known as marginal totals
cross-sectional study – select the sample and determine which group each individual falls into
cohort study – fill in marginal totals of exposed and non-exposed and track for a period of time; once the period of observation is complete, disease status can be classified
case-control – fill in marginal totals of disease status and then determine exposure levels
this approach requires that information regarding cross-classification be known
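The cell labels and marginal totals of the 2 X 2 table can be written down as a small data structure. This is a minimal sketch; the class name `TwoByTwo` and its property names are hypothetical, and the counts are borrowed from the cohort example later in these notes (A = 14, B = 9, C = 49, D = 149).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    a: int  # exposed, disease
    b: int  # exposed, no disease
    c: int  # not exposed, disease
    d: int  # not exposed, no disease

    @property
    def exposed(self) -> int:       # row total A + B
        return self.a + self.b

    @property
    def not_exposed(self) -> int:   # row total C + D
        return self.c + self.d

    @property
    def diseased(self) -> int:      # column total A + C
        return self.a + self.c

    @property
    def not_diseased(self) -> int:  # column total B + D
        return self.b + self.d

    @property
    def n(self) -> int:             # grand total N
        return self.a + self.b + self.c + self.d


table = TwoByTwo(a=14, b=9, c=49, d=149)
print(table.exposed, table.not_exposed)    # 23 198
print(table.diseased, table.not_diseased)  # 63 158
print(table.n)                             # 221
```

A cohort study starts from the row totals (exposed, not exposed); a case-control study starts from the column totals (disease, no disease).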
ECOLOGICAL STUDIES
unit of analysis is the group
number of exposed persons (preferably the rate of exposure) and the number of cases (preferably the rate of disease) are known, but the number of exposed cases is not known
the marginal cells are known but the interior cells are not
ecologic comparison studies involve an assessment of the correlation between exposure rates and disease rates among different groups or populations – may include incidence rates, prevalence, or mortality rates
exposure data may be available and may include: SES, environmental parameters, lifestyle characteristics
important characteristic is that the level of exposure for each individual is unknown
generally make use of secondary data collected by other sources; a clear advantage in terms of cost
ecologic trend studies involve correlation of changes in exposure and changes in disease
e.g., association between breast cancer and dietary fat (figure 6-3 in back of notes)
since data is used at the group level, individual exposure-disease relationships might be difficult to identify/define
measurement errors in disease and exposure
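An ecologic comparison boils down to correlating one exposure rate and one disease rate per group. The sketch below assumes the Pearson correlation coefficient; the notes do not name a specific coefficient, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(exposure_rates, disease_rates):
    """Pearson correlation between group-level exposure and disease rates.

    Each position in the two sequences is one group (e.g., one country),
    not one individual.
    """
    if len(exposure_rates) != len(disease_rates) or len(exposure_rates) < 2:
        raise ValueError("need two equal-length sequences with at least 2 groups")
    n = len(exposure_rates)
    mean_x = sum(exposure_rates) / n
    mean_y = sum(disease_rates) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(exposure_rates, disease_rates))
    var_x = sum((x - mean_x) ** 2 for x in exposure_rates)
    var_y = sum((y - mean_y) ** 2 for y in disease_rates)
    if var_x == 0 or var_y == 0:
        raise ValueError("rates must vary across groups")
    return cov / math.sqrt(var_x * var_y)
```

A strong correlation at the group level still says nothing about which individuals within each group were exposed, which is the limitation noted above.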
CROSS-SECTIONAL STUDIES
also called prevalence studies; exposure and disease measures are obtained at the individual level
select sample of subjects and then determine distribution of exposure and disease status – not necessary for both, but can be
conducted in a single period of observation – unit of observation is the individual
typically descriptive in nature – provide quantitative estimates of the magnitude of a problem as opposed to testing specific hypothesized exposure-disease associations
2 approaches: collect data on each member of the population ; or, take a sample of the population and draw inferences to the population
when sampling a population, there are two different types of sample: probability sample and non-probability sample
probability sample: every element in the population has a nonzero probability of being included in the sample; non-probability: does not have the nonzero probability feature
probability samples – simple random samples, systematic samples and/or stratified samples
non-probability samples – quota and judgmental samples – quota: collect data from a fixed number of subjects with particular characteristics (identified); judgmental sample: perception that the sample is representative of the population
non-random samples are not appropriate for cross-sectional studies
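The three probability sampling schemes named above can be sketched with Python's standard `random` module. This is an illustration under simplifying assumptions (a full list of the population is available, and the stratified version draws an equal number from every stratum); all function and parameter names are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


def systematic_sample(population, n, seed=None):
    """Random start, then every k-th element of the sampling frame."""
    items = list(population)
    step = len(items) // n
    if step == 0:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(step)
    return items[start::step][:n]


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Simple random sample of fixed size within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    return {key: rng.sample(members, n_per_stratum)
            for key, members in strata.items()}
```

In each scheme every element has a nonzero probability of selection, which is what separates these from quota and judgmental samples.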
cross-sectional studies can be used at the local, state or national level and can be used to evaluate point in time prevalence or repeated for trend analyses
limitations stem mainly from inability to identify causation
CASE-CONTROL STUDIES
disease does not occur randomly – a basic premise of epidemiology
a rationale that applies to case-control studies: cases (A + C) are compared to controls (B + D)
one point of observation, unit of observation and analysis is the individual
data comes from both primary and secondary sources
exposure – primary; disease status – secondary
Selection of Cases
two tasks are involved: defining a case conceptually and identifying a case operationally
definition of a case is influenced by several factors – the biggest issue is misclassification
if the criteria are too broad or too restrictive – cases will be misclassified or left out
a balance must be achieved
err on the side of more restrictive rather than more inclusive criteria
Sources of Cases
the goal is to ensure that all true cases have an equal probability of entering the study and that no false cases enter
the ideal situation is to identify and enroll all incident cases in a defined population in a specified time period – highly reliable data
cross-sectional (prevalent) cases make it difficult to identify causal factors
Selection of Controls
ideal controls would have the same characteristics as the cases except for exposure
the cases are presumed to have a given disease because of an excess (or deficiency) of an exposure
Sources of Controls
general concept guiding this is that they should come from the same population – they have the potential to become a case, they just aren’t one, yet
Examples of sources of cases and controls:
1) Cases: All cases diagnosed in the community; Controls: Sample of the general population in a community
2) Cases: All cases diagnosed in a sampled population; Controls: Noncases, in a sample of the general population, or a specified subgroup
3) Cases: All cases diagnosed in all hospitals in the community; Controls: Sample of persons who reside in the same neighborhood as cases
4) Cases: All cases from one or more hospitals; Controls: Sample of patients in one or more hospitals in the community who do not have the same or related disease
5) Cases: All cases from a single hospital; Controls: Sample of noncases from the same hospital
6) Cases: Any of the above; Controls: Spouses, relatives, or associates of cases
Population Based Controls
may be the best way to ensure that exposure among the controls is representative – this can be done randomly or through matching to cases, e.g., on sex or age
Patients from the Same Hospital as the Cases
justified only when little information has been reported about the disease-exposure relationship
several, important advantages – but too many inherent limitations unless criterion from above is met
Relatives or Associates of Cases
has to meet the criterion of free from disease – but should be similar to exposed cases on most other, if not all, factors
Measure of Association
the objective of case-control studies is to identify differences in exposure frequency that might be associated with one group having the disease
the guiding principle is to determine how much more or less likely the cases are to be exposed than the controls
from our 2 X 2 table: proportion of cases exposed = A/(A + C); proportion of cases not exposed = C/(A + C) – the odds of exposure are the ratio of these two proportions: [A/(A + C)] / [C/(A + C)]
odds of exposure for case group = A/C
odds of exposure for control group = B/D
odds ratio (OR) = (A/C)/(B/D) = (AD)/(BC)
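OR compares the odds of exposure among the cases with the odds of exposure among the controls
OR = 1.0 – no association between exposure and disease; OR = 2.0 – the odds of exposure were twice as high among the cases as among the controls – interpreted as about twice the risk of disease when the disease is rare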
should be interpreted with caution, case-control study is retrospective with only one point of observation – no appropriate denominators for the population at risk
Example: A = 204; B = 552; C = 9; D = 145:
OR = (AD)/(BC) = (204 × 145)/(552 × 9) = 5.95
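The odds ratio arithmetic in the example can be checked with a few lines of Python. This is a minimal sketch; the function name `odds_ratio` is hypothetical, and the cell labels follow the 2 X 2 table in these notes (A and C are the exposed and non-exposed cases, B and D the exposed and non-exposed controls).

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (A/C)/(B/D) = (A*D)/(B*C) from the cells of a 2 X 2 table."""
    if b == 0 or c == 0:
        raise ValueError("OR is undefined when B or C is zero")
    return (a * d) / (b * c)


# Values from the example: A = 204, B = 552, C = 9, D = 145
print(round(odds_ratio(204, 552, 9, 145), 2))  # 5.95
```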
COHORT STUDIES
a prospective or longitudinal study – starts with a group of subjects who lack a history of the outcome of interest, yet are still at risk – going from cause to effect – the group is then followed for development of the disease
contain at least two observation points – at the beginning to ascertain disease free status and at the end to ascertain disease development
Types of Cohorts
population based (1) – heterogeneous sample in terms of their exposure
(A+B) and (C+D) – exposed and non-exposed
homogenous with respect to exposure (2) – frequency of exposure in the population cannot be determined
Sources of Cohorts
Special Exposure Groups
determined by lifestyle, occupational, environmental, etc… factors
Special Resources Groups
unique populations – college students, Medicare/Medicaid, veterans, etc…
Geographically Defined Groups
Research Strategies
prospective – determination of exposure levels at baseline and follow-up for occurrence of disease
retrospective – historical data to determine exposure level at some baseline in the past
– ascertainment of disease status along the way
Selection of Comparison Groups
Internal Comparison
Separate Control Cohort
Comparison with Available Population Rates
Sources of Exposure Information
same as for cross-sectional studies
Measures of Association
relative risk (RR) – ratio of risk of disease among the exposed to the risk among the non-exposed
RR = [A/(A + B)] / [C/(C + D)]
Example: A = 14; B = 9; C = 49; D = 149:
RR = (14/23)/(49/198) = 0.609/0.247 = 2.46
interpreted numerically in the same way as the OR
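The relative risk arithmetic from the cohort example can be reproduced as follows. This is a minimal sketch; the function name `relative_risk` is hypothetical, and the cell labels follow the 2 X 2 table in these notes (A + B exposed, C + D non-exposed).

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = [A/(A + B)] / [C/(C + D)]."""
    risk_exposed = a / (a + b)      # risk of disease among the exposed
    risk_not_exposed = c / (c + d)  # risk of disease among the non-exposed
    if risk_not_exposed == 0:
        raise ValueError("RR is undefined when no non-exposed subject has the disease")
    return risk_exposed / risk_not_exposed


# Values from the example: A = 14, B = 9, C = 49, D = 149
print(round(14 / 23, 3))                        # 0.609
print(round(49 / 198, 3))                       # 0.247
print(round(relative_risk(14, 9, 49, 149), 2))  # 2.46
```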
main limitation – length of time it takes to conduct them
also, loss to follow-up can pose a problem due to the length of the study
exposures may change due to length of the study