Error in Epidemiological Studies

ERRORS IN
EPIDEMIOLOGICAL
STUDIES
Assoc. Prof. Pratap Singhasivanon
Department of Tropical Hygiene
Page 2
ERROR
Is defined as a false or mistaken result
obtained in a study or experiment
Consists of 2 components
Systematic error
Random error
Page 3
RANDOM ERROR
Refers to fluctuations around a true
value because of Sampling variability
SYSTEMATIC ERROR
Any difference between the true value and
that actually obtained that is the result
of all causes other than Sampling variability.
Page 4
ERROR
=
A false or mistaken
result obtained in
a study or
experiment
SYSTEMATIC ERROR
BIAS
Error due to factors that are
inherent in the
design, conduct
and analysis
+
RANDOM ERROR
Fluctuation of an
estimate around the
population value
(RANDOM VARIABILITY)
Result obtained in sample
differs from result that would
be obtained if the entire
population were studied
Page 5
SOURCES AND
TYPES OF MEASUREMENT ERROR
Sources of Error
Observers
Bias
Random
Researchers
Administering
The measure
Bias
Random
Subjects
Bias
Random
Page 6
SYSTEMATIC ERROR :
SELECTION BIAS
INFORMATION BIAS
CONFOUNDING
Page 7
RANDOM ERROR
Is the divergence, due to chance alone, of
an observation on a sample from the true
population value
Page 8
Different combinations of high and low
reliability and validity
[Figure: one panel for each combination of high or low RELIABILITY with high or low VALIDITY]
Page 9
Internal and External Validity
[Figure: External Population, Target Population, Study Sample; labels INT. VALIDITY and EXT. VALIDITY]
Page 10
VALIDITY AND RELIABILITY
[Figure: VALIDITY (HIGH to LOW) against RELIABILITY (HIGH to LOW), with panels A, B, C and D]
Page 11
VALIDITY :
A study is valid if its results correspond
to the truth: there should be no systematic error,
and random error should be as small as possible
Page 12
VALIDITY
Is the expression of the degree to which a
test is capable of measuring what it is
intended to measure
A study is valid if its results correspond to
the truth: there should be no systematic error, and random
error should be as small as possible
Page 13
RELATIONSHIP BETWEEN BIAS AND CHANCE
TRUE BLOOD
PRESSURE
(INTRA-ARTERIAL CANNULA)
BLOOD PRESSURE
MEASUREMENT
(SPHYGMOMANOMETER)
CHANCE
BIAS
80
90
DIASTOLIC BLOOD PRESSURE (mmHg)
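The distinction between chance and bias in this figure can be sketched with a small simulation. All numbers are assumptions for illustration: a true diastolic pressure of 80 mmHg and a cuff that reads 10 mmHg high on average (taken from the 80 and 90 marks on the axis), plus a hypothetical reading-to-reading standard deviation of 4 mmHg that does not come from the slide.

```python
import random
import statistics

TRUE_DBP = 80.0    # assumed true value (intra-arterial cannula), mmHg
BIAS = 10.0        # assumed systematic error of the cuff, mmHg
RANDOM_SD = 4.0    # hypothetical reading-to-reading scatter, mmHg


def simulate_readings(n, seed=0):
    """Return n simulated cuff readings: truth + bias + random error."""
    rng = random.Random(seed)
    return [TRUE_DBP + BIAS + rng.gauss(0.0, RANDOM_SD) for _ in range(n)]


readings = simulate_readings(10_000)
mean_reading = statistics.mean(readings)

# Systematic error: the offset of the average reading from the truth.
# Taking more readings does not make it smaller.
estimated_bias = mean_reading - TRUE_DBP

# Random error: the spread of individual readings around their own mean.
estimated_scatter = statistics.stdev(readings)

print(f"mean reading: {mean_reading:.1f} mmHg")
print(f"estimated bias: {estimated_bias:.1f} mmHg")
print(f"estimated scatter (SD): {estimated_scatter:.1f} mmHg")
```

Averaging many readings shrinks the uncertainty due to chance, but the mean still sits about 10 mmHg above the true value, which is the sense in which bias and chance are separate components of error.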
Page 14
SOURCES OF VARIATION
CONDITIONS OF
MEASUREMENT
DISTRIBUTION OF
MEASUREMENT
SOURCE OF
VARIATION
One Patient,
One Observer
Repeated observations
MEASUREMENT
One Patient,
Many Observers,
At one time
One Patient,
One observer,
Many Times of Day
BIOLOGIC
+
MEASUREMENT
Many Patients
Page 15
FRAMEWORK FOR THE INTERPRETATION
OF AN EPIDEMIOLOGIC STUDY
IS THERE A VALID STATISTICAL ASSOCIATION?
Is the association likely to be due to chance?
Is the association likely to be due to bias?
Is the association likely to be due to confounding?
CAN THIS VALID STATISTICAL ASSOCIATION BE JUDGED
AS CAUSE AND EFFECT?
Is there a strong association?
Is there biologic credibility to the hypothesis?
Is there consistency with other studies?
Is the time sequence compatible?
Is there evidence of a dose-response relationship?
Page 16
Precision :
Is the quality of being sharply defined through
exact detail.
The repeated assay of a single test specimen
typically gives rise to a set of results that differ
to a greater or lesser extent from each other.
The smaller the differences, the greater the
precision of the assay method.
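As a rough illustration of this definition, the spread of repeated results on one specimen can be summarised by the standard deviation or the coefficient of variation. The assay values below are hypothetical, and the choice of these two summary statistics is an assumption, not something the slide specifies.

```python
import statistics


def precision_summary(results):
    """Summarise repeated assay results on a single specimen."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)  # sample standard deviation
    return {"mean": mean, "sd": sd, "cv_percent": 100.0 * sd / mean}


# Hypothetical repeated results from two assay methods on the same specimen.
assay_a = [5.0, 5.1, 4.9, 5.0, 5.0]
assay_b = [5.0, 5.8, 4.2, 5.5, 4.5]

summary_a = precision_summary(assay_a)
summary_b = precision_summary(assay_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(f"assay {name}: mean={summary['mean']:.2f} "
          f"sd={summary['sd']:.2f} cv={summary['cv_percent']:.1f}%")
```

Both hypothetical assays have the same mean, but assay A has the smaller spread and is therefore the more precise of the two. Precision alone says nothing about whether that mean is close to the true value.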
Page 17
Measurement
The procedure of applying a standard scale to a
variable or a set of values. (Last, 1988)
Terms used to describe properties of
measurement:
Accuracy
Validity
Precision
Reliability
Repeatability
Reproducibility
Page 18
SELECTION BIAS
is a distortion in the estimate of effect resulting from
the manner in which subjects are selected for the
study population
MAJOR SOURCES OF SELECTION BIAS
1) flaws in the choice of groups to be compared
2) choice of sampling frame
3) loss to follow up or nonresponse during data
collection
4) selective survival
Page 19
INFORMATION BIAS is a distortion in the estimate of effect
due to measurement error or misclassification of subjects on
one or more variables
MAJOR SOURCES OF INFORMATION BIAS
1) invalid measurement
2) incorrect diagnostic criteria
3) omissions, imprecisions
4) other inadequacies in previously recorded data
[Figure: Prevalence of Down syndrome by maternal age; x-axis: Maternal Age (<20, 20-24, 25-29, 30-34, 35-39, 40+)]
[Figure: Prevalence of Down syndrome at birth by birth order; x-axis: Birth Order (1, 2, 3, 4, 5+)]
Hypothetical Examples of Unadjusted and Adjusted Relative Risks
According to Type of Confounding (Positive or Negative)
Example   Type of        Unadjusted      Adjusted
No.       Confounding    Relative Risk   Relative Risk
1         Positive       3.5             1.0
2         Positive       3.5             2.1
3         Positive       0.3             0.7
4         Negative       1.0             3.2
5         Negative       1.5             3.2
6         Negative       0.8             0.2
7         Qualitative    2.0             0.7
8         Qualitative    0.6             1.8
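The pattern in this table can be expressed as a small rule. The sketch below encodes one reading of it, which is an assumption rather than something stated on the slide: confounding is "positive" when the unadjusted relative risk lies farther from the null value of 1.0 than the adjusted one, "negative" when it lies closer, and "qualitative" when the two fall on opposite sides of 1.0. Distances are compared on the log scale so that 0.5 and 2.0 count as equally far from the null.

```python
import math


def confounding_type(unadjusted_rr, adjusted_rr):
    """Classify confounding from unadjusted and adjusted relative risks."""
    crude = math.log(unadjusted_rr)
    adjusted = math.log(adjusted_rr)
    if crude * adjusted < 0:
        return "Qualitative"   # estimates on opposite sides of the null
    if abs(crude) > abs(adjusted):
        return "Positive"      # crude estimate exaggerates the effect
    if abs(crude) < abs(adjusted):
        return "Negative"      # crude estimate understates the effect
    return "None"


# (example no., unadjusted RR, adjusted RR) as listed in the table
EXAMPLES = [
    (1, 3.5, 1.0), (2, 3.5, 2.1), (3, 0.3, 0.7), (4, 1.0, 3.2),
    (5, 1.5, 3.2), (6, 0.8, 0.2), (7, 2.0, 0.7), (8, 0.6, 1.8),
]

classified = {no: confounding_type(u, a) for no, u, a in EXAMPLES}
for no, label in classified.items():
    print(no, label)
```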
Page 25
CONFOUNDING
MIXING OF EFFECTS
The estimate of the effect of the exposure
of interest is distorted because it is mixed
with the effect of an extraneous factor
Page 26
CONFOUNDING
COFFEE DRINKING, CIGARETTE SMOKING
AND CORONARY HEART DISEASE
EXPOSURE
(coffee drinking)
CONFOUNDING
VARIABLE
(cigarette smoking)
DISEASE
(heart disease)
Page 27
The distortion introduced by a confounding factor
can lead to overestimation or underestimation of an
effect depending on the direction of the association
that the confounding factor has with exposure
and disease.
Confounding can even change the apparent
direction of an effect.
Example :
Alcohol
Smoking
Oral cancer
Page 32
Situation in which F is a confounder for a D - E association.
[Figure: diagrams of the relations among E, D and F]
Situation in which F is not a confounder for a D - E association.
[Figure: diagrams of the relations among E, D and F]
Page 33
To be confounding, the extraneous variable must
have the following characteristics
A confounding variable must be a risk factor for the
disease.
A confounding variable must be associated with the
exposure under study (in the population from which the
cases derive).
A confounding variable must not be an intermediate step
in the causal path between the exposure and the
disease.
Page 34
The data-based criterion for establishing the
presence or absence of confounding involves
the comparison of a crude effect measure
with an adjusted effect measure that corrects
for distortions due to extraneous variables.
Confounding is acknowledged to be present
when the crude and adjusted effect measures
differ in value.
Page 35
CONTROL OF CONFOUNDING
DESIGN
- RESTRICTION
- MATCHING
ANALYSIS
- STRATIFICATION
- MATHEMATICAL MODEL
(Multivariate analysis)
Page 36
Relation of Confounder to
Disease and Exposure
            DISEASE                    EXPOSURE
AGE         *MI (%)    CONTROLS (%)    **OC USE (%)
25-29       3          16              29
30-34       9          14              10
35-39       16         20              8
40-44       30         21              4
45-49       42         18              3
*MI  : MYOCARDIAL INFARCTION
**OC : ORAL CONTRACEPTIVE
Page 37
CRUDE RR (CRR)
[2 x 2 table of exposure (E) by disease (D); marginal totals 1000 and 1000, overall total 2000]
CRUDE RR = 4
• Collapsed in 1 table without separation into subgroups.
Page 40
ALL SUBJECTS
              EXPOSURE
DISEASE       ALC +     ALC -     Total
+             200       50        250
-             800       950       1750
Total         1000      1000      2000
CRUDE CIR = 4.0

SMOKERS
              EXPOSURE
DISEASE       ALC +     ALC -
+             194       21
-             706       79
CIR (smokers) = 1.02

NON-SMOKERS
              EXPOSURE
DISEASE       ALC +     ALC -
+             6         29
-             94        871
CIR (non-smokers) = 1.86

ADJUSTED CIR = 1.13
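The comparison on this slide can be sketched from the cell counts alone: collapse the two smoking strata to get the crude cumulative incidence ratio (CIR), then compute the ratio within each stratum. The slide does not say which adjustment method produced its adjusted value, so the Mantel-Haenszel summary used below is an assumption, one common choice. It lands close to the slide's adjusted figure rather than reproducing it exactly.

```python
def risk_ratio(a, n1, c, n0):
    """Cumulative incidence ratio: risk in exposed / risk in unexposed."""
    return (a / n1) / (c / n0)


# Each stratum: (exposed cases, exposed total, unexposed cases, unexposed total)
STRATA = {
    "smokers": (194, 900, 21, 100),
    "non-smokers": (6, 100, 29, 900),
}


def crude_table(strata):
    """Collapse all strata into one table by summing cell by cell."""
    return tuple(sum(cells) for cells in zip(*strata.values()))


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio across strata (assumed method)."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return num / den


crude_cir = risk_ratio(*crude_table(STRATA))
stratum_cir = {name: risk_ratio(*cells) for name, cells in STRATA.items()}
adjusted_cir = mantel_haenszel_rr(STRATA)

print(f"crude CIR: {crude_cir:.2f}")
for name, value in stratum_cir.items():
    print(f"CIR among {name}: {value:.2f}")
print(f"Mantel-Haenszel adjusted CIR: {adjusted_cir:.2f}")
```

The crude ratio of 4 shrinks to values near 1 to 2 once smokers and non-smokers are looked at separately, which is the signature of confounding by smoking in this example.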
Page 41
Degree of Confounding
measures the amount of confounding
rather than mere presence or absence
degree of confounding (d.c.) = crude measure / adjusted measure
d.c. = 4.00 / 1.13 = 3.53 (overestimation)
d.c. = 1.68 / 3.97 = 0.42 (underestimation)
Page 42
4-fold risk of MI among recent OC users
as compared to non-users.
AGE      Recent Use of OC    MI     Controls    OR
25-29    Yes                 4      62          7.2
         No                  2      224
30-34    Yes                 9      33          8.9
         No                  12     390
35-39    Yes                 4      26          1.5
         No                  33     330
40-44    Yes                 6      9           3.7
         No                  65     362
45-49    Yes                 6      5           3.9
         No                  93     301
TOTAL    Yes                 29     135         1.7
         No                  205    1607
aOR(MH) = 3.97
Page 43
TYPES OF ASSOCIATION
A. Not statistically associated (Independent)
B. Statistically associated
1. Noncausally associated (Secondarily)
2. Causally associated
a. Indirectly associated
b. Directly causal
Page 44
Association refers to the statistical dependence
between two variables, that is,
the degree to which the rate of disease in persons
with a specific exposure is either higher or lower
than the rate of disease among those without that
exposure.
The presence of an association does not imply
that the observed association is one of cause and
effect.
Page 45
[Flowchart: STATISTICAL SIGNIFICANCE (YES / NO); Clinical / Public Health significance (YES / NO); Sample size big enough (YES / NO); end points: OK, RESEARCH]
Page 46
Advantages and disadvantages of the major
observational designs. (cont.)
2. CROSS-SECTIONAL
Advantages
- May study several outcomes
- Control over selection of
subjects
- Control over measurements
- Relatively short duration
- A good first step for a cohort
study
- Yields prevalence, relative
prevalence
Disadvantages
- Does not establish sequence
of events
- Potential bias in measuring
predictors
- Potential survival bias
- Not feasible for rare conditions
- Does not yield incidence or
true relative risk
Page 47
Advantages and disadvantages of the major
observational designs. (cont.)
4. NESTED CASE-CONTROL
(Prospective or retrospective)
Advantages
Scientific advantages of cohort design
Relatively inexpensive
Disadvantage
Requires bank of samples stored until outcomes occur
* All of these observational designs have the disadvantage (compared to
experiments) of being susceptible to the influence of confounding
variables
Page 48
Advantages and disadvantages of the major
observational designs
1. COHORT
Advantages
Establishes sequence of events
Avoids bias in measuring predictors
Avoids survival bias
Can study several outcomes
Number of outcome events grows over time
Disadvantages
Often requires large sample sizes
Not feasible for rare outcomes
Page 49
Advantages and disadvantages of the major
observational designs. (cont.)
3. CASE-CONTROL
Advantages
Useful for studying rare conditions
Short duration
Relatively inexpensive
Yields odds ratio (usually a good approximation of relative risk)
Disadvantages
Potential bias from sampling two populations
Does not establish sequence of events
Potential bias in measuring predictors
Potential survival bias
Limited to one outcome variable
Does not yield prevalence, incidence, or excess risk
Page 50
CHARACTERISTICS OF
INCIDENCE AND PREVALENCE
NUMERATOR
INCIDENCE: New cases occurring during a period of time among a group initially free of disease
PREVALENCE: All cases counted on a single survey or examination of a group
DENOMINATOR
INCIDENCE: All susceptible people present at the beginning of the period
PREVALENCE: All people examined, including cases and non-cases
TIME
INCIDENCE: Duration of the period
PREVALENCE: Single point
HOW MEASURED
INCIDENCE: Cohort study
PREVALENCE: Prevalence (cross-sectional) study
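The two definitions differ mainly in what goes into the numerator and denominator, which a short calculation makes concrete. The counts below are hypothetical and serve only to show which numbers enter each measure.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """New cases during the period / people initially free of disease."""
    return new_cases / at_risk_at_start


def point_prevalence(all_cases, people_examined):
    """All existing cases at the survey / everyone examined."""
    return all_cases / people_examined


# Hypothetical cohort: 1000 disease-free people followed for a fixed period,
# 30 of whom develop the disease.
incidence = cumulative_incidence(new_cases=30, at_risk_at_start=1000)

# Hypothetical single survey: 1200 people examined, 90 found to have the disease.
prevalence = point_prevalence(all_cases=90, people_examined=1200)

print(f"cumulative incidence over the period: {incidence:.3f}")
print(f"point prevalence at the survey: {prevalence:.3f}")
```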
Page 51
Nondifferential Misclassification
SENSITIVITY AND SPECIFICITY REMAIN
CONSTANT IRRESPECTIVE OF THE VALUES
OF THE OTHER VARIABLE :
“BIAS TOWARD THE NULL”
Page 52
Differential Misclassification
WHEN THE MAGNITUDE OF ERROR FOR ONE VARIABLE
DIFFERS ACCORDING TO THE ACTUAL VALUE OF ANOTHER
VARIABLE
(DIFFERENT SENSITIVITY & SPECIFICITY)
E.G.
EXPOSURE TO RADIATION AND CONGENITAL MALFORMATION
SMOKING AND EMPHYSEMA
“BIAS TOWARD OR AWAY FROM NULL VALUE”
Page 53
FRAMEWORK FOR THE INTERPRETATION
OF AN EPIDEMIOLOGIC STUDY
IS THERE A VALID STATISTICAL ASSOCIATION?
Is the association likely to be due to chance?
Is the association likely to be due to bias?
Is the association likely to be due to confounding?
CAN THIS VALID STATISTICAL ASSOCIATION BE JUDGED
AS CAUSE AND EFFECT?
Is there a strong association?
Is there biologic credibility to the hypothesis?
Is there consistency with other studies?
Is the time sequence compatible?
Is there evidence of a dose-response relationship?
Page 54
Prevalence of Dyslipidemia
Source pop.: subjects 1 to 20, cases marked + (4+, 6+, 12+, 15+, 20+)
Prevalence = 25%
Sample 1: 4+, 6+, 8, 9, 14
Prevalence = 40%
Sample 2: 12+, 7, 17, 16, 19
Prevalence = 20%
Sample 3: 18, 14, 11, 10, 5
Prevalence = 0%
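The three samples illustrate random error: each is drawn from the same source population, yet each gives a different prevalence. A minimal sketch of the same idea follows, using the subject numbers from the slide. The repeated random draws at the end are an added illustration, with an arbitrary seed and number of draws.

```python
import random

POPULATION = list(range(1, 21))   # subjects 1 to 20
CASES = {4, 6, 12, 15, 20}        # subjects marked "+" in the source population


def prevalence(subjects):
    """Proportion of the given subjects who are cases."""
    subjects = list(subjects)
    return sum(1 for s in subjects if s in CASES) / len(subjects)


SAMPLES = {
    "Sample 1": [4, 6, 8, 9, 14],
    "Sample 2": [12, 7, 17, 16, 19],
    "Sample 3": [18, 14, 11, 10, 5],
}

source_prevalence = prevalence(POPULATION)
sample_prevalence = {name: prevalence(s) for name, s in SAMPLES.items()}

# Added illustration: many random samples of 5 scatter around the true value.
rng = random.Random(1)
draws = [prevalence(rng.sample(POPULATION, 5)) for _ in range(1000)]
mean_of_draws = sum(draws) / len(draws)

print(f"source population: {source_prevalence:.0%}")
for name, value in sample_prevalence.items():
    print(f"{name}: {value:.0%}")
print(f"mean over 1000 random samples of 5: {mean_of_draws:.2f}")
```

Individual samples of five can be well off the mark, but their estimates centre on the source value. That is what separates random error from systematic error.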
Page 55
Advantages and disadvantages of the major
observational designs. (cont.)
Advantages
Yields incidence, relative risk, excess risk
- Prospective: More control over selection of subjects
- Retrospective: Less expensive; Shorter duration
- Double cohort: Useful when distinct cohorts have different or rare exposures
Disadvantages
- Prospective: More expensive; Longer duration
- Retrospective: Less control over selection of subjects; Less control over measurements
- Double cohort: Potential bias from sampling two populations
Page 56
MISCLASSIFICATION WITH REGARD TO DISEASE
(NONDIFFERENTIAL MISCLASSIFICATION)
                                   Exposed    Unexposed    Relative Risk
Number of Individuals              2,000      8,000
Number of Cases                    20         40           2.0
Under Diagnosis
(Sens. = 0.5; Spec. = 1.0):
Number Identified as cases         10         20           2.0
Over Diagnosis
(Sens. = 1.0; Spec. = 0.99):
Number Identified as cases         40         120          1.3
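The figures in this table follow from applying the same sensitivity and specificity to both exposure groups. The sketch below reproduces them under that assumption. It takes the under-diagnosis sensitivity as 0.5, the value consistent with 10 of 20 and 20 of 40 true cases being identified, and the function names are illustrative.

```python
EXPOSED = {"n": 2000, "true_cases": 20}
UNEXPOSED = {"n": 8000, "true_cases": 40}


def identified_cases(group, sensitivity, specificity):
    """Expected number labelled as cases: true positives + false positives."""
    true_pos = sensitivity * group["true_cases"]
    false_pos = (1.0 - specificity) * (group["n"] - group["true_cases"])
    return true_pos + false_pos


def observed_relative_risk(sensitivity, specificity):
    """Relative risk based on the numbers identified as cases in each group."""
    risk_exposed = identified_cases(EXPOSED, sensitivity, specificity) / EXPOSED["n"]
    risk_unexposed = identified_cases(UNEXPOSED, sensitivity, specificity) / UNEXPOSED["n"]
    return risk_exposed / risk_unexposed


true_rr = observed_relative_risk(sensitivity=1.0, specificity=1.0)
under_rr = observed_relative_risk(sensitivity=0.5, specificity=1.0)
over_rr = observed_relative_risk(sensitivity=1.0, specificity=0.99)

print(f"true RR: {true_rr:.1f}")
print(f"under-diagnosis RR: {under_rr:.1f}")
print(f"over-diagnosis RR: {over_rr:.1f}")
```

With perfect specificity, missing half the cases in both groups leaves the ratio unchanged. A small loss of specificity adds false positives in proportion to group size and pulls the relative risk toward the null.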
Page 57
Explanations for the observed difference in survival
between the propranolol and control groups:
1. Chance (Random error)
2. Bias (Systematic error)
Selection
Information
Confounding
3. Effect of propranolol
Page 58
A high reliability means that in
repeated measurements the results
fall very close to each other;
conversely,
A low reliability means that they are
scattered.