MDM Review 2009

12.14.09
Jason Sanders
Outline
• Measures of frequency
• Measures of association
• Study designs
• INTERMISSION
• Threats to study validity
• Defining test and study utility
• Descriptive statistics
• Q and A
Measures of disease frequency
• Incidence (risk, cumulative incidence, incidence proportion)
I = (# new cases of disease during time period) / (# subjects followed for time period)
Important points: only new cases counted in numerator; time period must be specified
Benefits: easy to calculate and interpret
Drawback: competing risks make I inaccurate over long time periods
Measures of disease frequency
• Incidence rate (rate)
R = (# new cases of disease during time period) / (total time experienced by followed subjects)
Important points: only new cases counted in numerator; person-time summed for each individual
Benefits: accounts for competing risks
Drawback: not as easy to interpret
Measures of disease frequency
• Prevalence (prevalence proportion)
P = (# subjects with disease in the population) / (# of people in the population)
Important points: all people with active disease in numerator; can calculate “point” or “period” prevalence
Benefits: illustrates disease burden
Drawback: cross-sectional
Disease frequency example
• You have a group of 100 people. At the start of
the study, 10 have active disease. Over the course
of 3 years, 18 new cases develop. You accrue 200
person-years of follow-up.
• Prevalence at start: 10/100 = 0.1 = 10%
• Risk over 3 years: 18/(100-10) = 0.2 = 20%
• Incidence rate: 18/200 = 0.09 cases per py
= 9 cases per 100 py
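The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative only; the numbers are the ones given on the slide.

```python
# Numbers taken from the example above; variable names are illustrative.
population = 100
prevalent_at_start = 10
new_cases = 18
person_years = 200

# Prevalence: everyone with active disease over everyone in the group.
prevalence = prevalent_at_start / population

# Risk: only people free of disease at the start can become new cases.
at_risk = population - prevalent_at_start
risk_3yr = new_cases / at_risk

# Incidence rate: new cases over summed person-time.
rate_per_py = new_cases / person_years

print(f"Prevalence at start: {prevalence:.0%}")
print(f"Risk over 3 years:   {risk_3yr:.0%}")
print(f"Incidence rate:      {rate_per_py * 100:.0f} cases per 100 py")
```

Note that the risk denominator drops the 10 prevalent cases, while the prevalence denominator keeps all 100 people.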
Measures of disease frequency
Property       | Incidence (risk) | Incidence rate (rate)         | Prevalence
Smallest value | 0                | 0                             | 0
Largest value  | 1                | Infinity                      | 1
Dimensionality | None             | 1/time                        | None
Interpretation | Probability      | Rate; inverse of waiting time | Proportion
Attributable risk = Risk (E+) – Risk (E-)
“Excess risk due to exposure”
Attributable risk % = [Risk (E+) – Risk (E-)] / Risk (E+)
“% excess risk due to exposure”
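A minimal sketch of the two attributable-risk formulas. The risks used in the example call (0.30 and 0.10) are hypothetical and are not from the slides.

```python
def attributable_risk(risk_exposed, risk_unexposed):
    """Excess risk due to exposure: Risk(E+) - Risk(E-)."""
    return risk_exposed - risk_unexposed


def attributable_risk_percent(risk_exposed, risk_unexposed):
    """Share of the risk in the exposed that is due to exposure."""
    return (risk_exposed - risk_unexposed) / risk_exposed


# Hypothetical risks, chosen only for illustration.
ar = attributable_risk(0.30, 0.10)
ar_pct = attributable_risk_percent(0.30, 0.10)
print(f"AR  = {ar:.2f}")
print(f"AR% = {ar_pct:.1%}")
```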
Questions on measures of disease frequency?
Measures of association
RR = (risk in E+) / (risk in E-)
RR = (rate in E+) / (rate in E-)

                       | Exposed    | Unexposed
# Cases                | NE         | NU
Total # or person-time | NTE or PTE | NTU or PTU

OR = (odds of E+ in cases) / (odds of E+ in controls)

         | E+ | E-
Case     | A  | B
Controls | C  | D
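The ratio measures above can be sketched in Python using the cell labels from the two tables. All counts in the example calls are hypothetical. The identity OR = AD/BC follows from dividing the odds A/B by the odds C/D.

```python
def risk_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """RR = (NE / NTE) / (NU / NTU). Pass person-time totals for a rate ratio."""
    return (cases_exposed / total_exposed) / (cases_unexposed / total_unexposed)


def odds_ratio(a, b, c, d):
    """OR = (A / B) / (C / D), the odds of exposure in cases vs. controls."""
    return (a / b) / (c / d)


# Hypothetical cohort: 30 cases among 100 exposed, 10 among 100 unexposed.
print(f"RR = {risk_ratio(30, 100, 10, 100):.2f}")

# Hypothetical case-control table: A=40, B=60, C=20, D=80.
print(f"OR = {odds_ratio(40, 60, 20, 80):.2f}")
```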
Absolute vs. Relative measures of disease frequency
• Risk, rate, prevalence, AR are absolute measures
– Used for describing disease burden, policy, etc.
• Relative risk, relative rate, prevalence ratio,
odds ratio, AR% are relative measures
– Used to describe etiology, association of disease with
exposure, etc.
RR can mean risk ratio or rate ratio
Illustration of cohort study
[Figure: E+ and E- groups followed forward in time; risk measured in each group]
RR = (Risk E+) / (Risk E-)
“Exposed people are at X-fold greater risk to develop disease.”
Illustration of case-control study
[Figure: cases and controls traced back in time; odds of E+ measured in each group]
OR = (Odds E+ cases) / (Odds E+ controls)
“Cases have X-fold greater odds of being exposed.”
• What if we could simultaneously achieve:
– Prospective measurement of disease (i.e.
exposure came before disease)
– Measurement of lots of confounders (for
adjustment)
– Controls coming from same population as cases
– Less recall bias
– Less selection bias
– Efficient, low cost study
Nested case-control or case-cohort study
[Figure: cases and controls drawn from within a cohort followed over time]
You easily measure case/control status
But you know:
1) E preceded D
2) Other confounders
3) Controls came from same group as cases
Study design: observational studies (that count)
 | Prospective cohort | Retrospective cohort | Case-control
Study group | E+ and E- groups | E+ and E- groups | Cases and controls
Measures | Rate ratio, risk ratio, odds ratio | Rate ratio, risk ratio, odds ratio | Odds ratio
Temporal relationship | Possible to establish | Possible to establish | Difficult to establish (except nested)
Time required | Long follow-up | Can be efficient | Less time than others
Cost | Expensive | Depends | Relatively inexpensive
When to use? | E is rare and/or D is frequent among E+; investigate result of exposures | E is rare and/or D is frequent among E+; save time vs. prospective cohort | D is rare and E is frequent among D+; investigate causes of disease
Issues | Selection of E-; loss to follow-up; change in E over time | Selection of E-; loss to follow-up; change in E over time | Selection of control group (selection bias); accurate E assessment (recall bias)
Study design: experimental study (RCT)
• Requirement: equipoise
• Design:
– Randomize groups to new treatment or standard
• Benefit: Balance frequency of KNOWN and UNKNOWN confounders
in groups (matching)
• Drawback: Expensive; inefficient; doesn’t always work; can’t analyze
variables that are matched on
– Follow groups through time and assess endpoints (risk,
survival, etc.)
• Analysis:
– Intent-to-treat (analyze as randomized)
• Benefit: Preserve randomization
• Drawback: Subjects might not have followed treatment
– Efficacy (on-treatment, per-protocol)
• Benefit: Analyzes subjects who followed treatment for more
accurate assessment of treatment effects
• Drawback: Breaks randomization; introduces more confounding
• Issues: loss to follow-up; time; cost; changing standard of
care during study
Measures in RCTs
Absolute risk reduction…attributable risk backwards:
ARR = Risk (Placebo) – Risk (Treatment)
“Risk reduction attributable to treatment”
NNT = 1 / ARR
“Number of patients you need to treat to prevent 1 case”
Relative risk reduction…attributable risk % backwards:
RRR = [Risk (Placebo) – Risk (Treatment)]/Risk (Placebo)
“% Risk reduction attributable to treatment”
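A short sketch of the three RCT measures. The event risks (20% on placebo, 15% on treatment) are hypothetical numbers chosen for illustration.

```python
def rct_measures(risk_placebo, risk_treatment):
    """Return (ARR, NNT, RRR) from the risk in each arm."""
    arr = risk_placebo - risk_treatment   # absolute risk reduction
    nnt = 1 / arr                         # number needed to treat
    rrr = arr / risk_placebo              # relative risk reduction
    return arr, nnt, rrr


# Hypothetical trial: 20% event risk on placebo, 15% on treatment.
arr, nnt, rrr = rct_measures(0.20, 0.15)
print(f"ARR = {arr:.2f}")
print(f"NNT = {nnt:.1f}")
print(f"RRR = {rrr:.0%}")
```

The same RRR can correspond to very different NNTs: halving a 2% risk and halving a 40% risk both give RRR = 50%, but the NNTs are 100 and 5.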
What if we’re interested in the time to
the event, and not just the event?
Survival analysis, log rank, Cox proportional hazards
HR=0.70, 95% CI 0.52-0.95
Proportional?
Bernier et al., NEJM. 2004.
Meta-analysis: steps
1) Formulate purpose
2) Identify relevant studies
3) Establish inclusion and exclusion criteria
4) Abstract data
5) Describe effect measure (OR, RR)
6) Assess heterogeneity (Forest plot, Q, I²)
7) Perform sensitivity and secondary analyses
8) Assess publication bias (Funnel plot)
9) Disseminate results
Can you group data: Forest plot, Cochran’s Q, I²
• Forest plot – Illustrates size and precision of
effect estimates for multiple studies.
• Cochran’s Q – A hypothesis test of whether
variation in effect estimates across studies is due
to chance (H0) or not due to chance (H1).
• I² – Percent of variation in effect estimates across
studies that is due to heterogeneity rather than
chance.
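One common way to compute Q and I² is sketched below. It assumes inverse-variance (fixed-effect) weights and effect estimates on the log scale, and it uses the usual formula I² = max(0, (Q - df) / Q) × 100. The slides do not specify a weighting scheme, and the study values are hypothetical.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (pooled effect, Q, I2 in percent) using inverse-variance weights."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variation beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log odds ratios and standard errors from four studies.
log_ors = [0.10, 0.35, 0.60, -0.05]
ses = [0.15, 0.20, 0.25, 0.18]
pooled, q, i2 = cochran_q_and_i2(log_ors, ses)
print(f"Pooled log OR = {pooled:.3f}, Q = {q:.2f}, I2 = {i2:.0f}%")
```

Under H0, Q is compared against a chi-square distribution with (number of studies - 1) degrees of freedom.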
Meta-analysis: heterogeneity and dealing with it
Funnel plot: assessing publication bias
• Plot Sample size (y-axis) vs. Effect (x-axis)
Unskewed distribution: bias minimal
Skewed distribution: bias present
Questions on measures of association or study design?
Break time?
• Scope and Scalpel 2004: Episode 1
• Scope and Scalpel 2004: Episode 2
• Mr. Pitt Med "Blue Steel" Ad
• MTV Cribs: Pitt Med
• Pitt Med Office: "New PBL Group Day"
Bias, confounding, modification…a wine digression
Bias – systematic error (due to study) resulting in non-comparability; error
that will remain in an infinitely large study; difficult to remove once there
Will a person who enjoys apricot like Bonny Doon if it comes from a
bad barrel?
Confounding – mixing of effects; results in inaccurate estimate of exposure-outcome association; is never “controlled,” rather “adjusted for”
[Diagram: Apricot wine → Likability, with Peach and Grape as confounders]
Effect modification – difference of effect depending on the presence or
absence of a second factor; interesting phenomenon to investigate; detected
with stratification or interaction term in model
[Diagram: Apricot wine → Likability, stratified by Room vs. Iced. If the stratum effects differ by >10%, modification is present.]
Examining your new test: Sn, Sp, PPV, NPV
                  | Gold standard Positive | Gold standard Negative
New test Positive | A                      | B
New test Negative | C                      | D
Prevalence alters PPV most
Sn = A / (A + C)
“Of those with disease, how many did you identify?”
Sp = D / (B + D)
“Of those without disease, how many did you identify?”
PPV = A / (A + B)
“Of those you said had disease, how many truly did?”
NPV = D / (C + D)
“Of those you said did not have disease, how many truly did not?”
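The four formulas can be sketched directly from the 2x2 cells. The second function re-derives PPV from Sn, Sp and prevalence using Bayes' rule, which is an addition for illustration and is not on the slide. All counts are hypothetical.

```python
def diagnostic_metrics(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "Sn": a / (a + c),
        "Sp": d / (b + d),
        "PPV": a / (a + b),
        "NPV": d / (c + d),
    }


def ppv_at_prevalence(sn, sp, prevalence):
    """PPV for a given prevalence (Bayes' rule; not taken from the slides)."""
    true_pos = sn * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical table: A=90, B=50, C=10, D=850.
for name, value in diagnostic_metrics(90, 50, 10, 850).items():
    print(f"{name} = {value:.3f}")

# Same test (Sn = Sp = 0.90) applied at two different prevalences.
for prev in (0.50, 0.01):
    print(f"prevalence {prev:.0%}: PPV = {ppv_at_prevalence(0.90, 0.90, prev):.3f}")
```

The second loop shows why prevalence matters for PPV: the same test gives a much lower PPV when the disease is rare.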
Examining your new test: Likelihood ratios
                  | Gold standard Positive | Gold standard Negative
New test Positive | A                      | B
New test Negative | C                      | D
LR is a ratio of two proportions: proportion of those with a
particular result among the diseased compared to the
proportion with that result among the non-diseased
LR(+) = [A / (A + C)] / [B / (B + D)] = Sn / (1 - Sp)
LR(-) = [C / (A + C)] / [D / (B + D)] = (1 - Sn) / Sp
“The likelihood of a test outcome (+ or -) if you have the disease
is X-fold higher than if you don’t have the disease.”
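A minimal sketch of both likelihood ratios computed from the 2x2 cells, using the same cell labels as the table. The counts are hypothetical.

```python
def likelihood_ratios(a, b, c, d):
    """Return (LR+, LR-) from a 2x2 table laid out as on the slide."""
    sn = a / (a + c)
    sp = d / (b + d)
    lr_pos = sn / (1 - sp)     # same as [A/(A+C)] / [B/(B+D)]
    lr_neg = (1 - sn) / sp     # same as [C/(A+C)] / [D/(B+D)]
    return lr_pos, lr_neg


# Hypothetical table: A=90, B=20, C=10, D=80 (Sn = 0.90, Sp = 0.80).
lr_pos, lr_neg = likelihood_ratios(90, 20, 10, 80)
print(f"LR(+) = {lr_pos:.2f}")
print(f"LR(-) = {lr_neg:.3f}")
```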
Examining various tests: ROC curves
[Figure: ROC curves; y-axis Sn, x-axis 1-Sp]
Picking the best test depends on:
1) Optimizing Sn and Sp (highest AUC)
2) Real world conditions
HIV: We want highest Sp and sacrifice Sn
Parametric biostats: T-test, ANOVA, χ², Pearson
• T-test: if you want to test the difference in means of 2 groups (continuous)
– Assumptions and how to verify them:
• Independence (are subjects related?)
• Random sampling (assumed)
• Normal distribution of variable (histograms, formal test)
• Equal variance of variable in each group (F-test)
• ANOVA: if you want to test the difference in means between ≥2 groups
(continuous)
– Assumptions and how to verify them:
• Same as T-test
• χ²: if you want to test the difference in frequencies among ≥2 groups (categorical)
– Assumptions and how to verify them:
• Cell sizes in table (>5, formal test → Use Fisher’s exact test if unfulfilled)
• Pearson r: if you want to test the degree of linear relationship between two
continuous variables; does not imply causal association or a mathematical
association other than linear
– Assumptions and how to verify them:
• Linear relationship (look at it)
• Independence, random sampling (as above)
• At least 1 variable must be normally distributed
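As one worked instance of these tests, here is a sketch of the χ² statistic for a 2x2 table together with the cell-size check. Applying the ">5" rule to expected counts is an assumption (a common convention); the slide does not say whether it means observed or expected cells. The tables are hypothetical.

```python
def chi_square_2x2(table):
    """Return (chi-square statistic, expected counts, any_small_expected)."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count under no association: row total * column total / grand total.
    expected = [[row_totals[i] * col_totals[j] / total for j in range(2)]
                for i in range(2)]
    stat = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    # Assumption: the cell-size rule is applied to expected counts.
    any_small = any(e < 5 for row in expected for e in row)
    return stat, expected, any_small


# Hypothetical tables: one with comfortable cell sizes, one without.
for table in ([[20, 30], [30, 20]], [[1, 9], [8, 2]]):
    stat, expected, any_small = chi_square_2x2(table)
    advice = "consider Fisher's exact test" if any_small else "chi-square is reasonable"
    print(f"{table}: chi2 = {stat:.2f}; {advice}")
```

The statistic is compared against a chi-square distribution with 1 degree of freedom for a 2x2 table.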
Nonparametrics: Rank sum, Kruskal-Wallis, Spearman
• Mann-Whitney rank sum: if you want to test whether values in 2 groups differ
in location (compares ranks, often read as medians, not means) (continuous)
– Assumptions and how to verify them:
• Independence (are subjects related?)
• Random sampling (assumed)
• Variable follows same distribution in both groups, whatever the
distribution may be
• Kruskal-Wallis: if you want to test whether values in ≥2 groups differ in
location (compares ranks, not means) (continuous)
– Assumptions and how to verify them:
• Same as rank sum
• Spearman r: if you want to test the degree of monotonic relationship between two
continuous variables; does not imply causal association
– Assumptions and how to verify them:
• Monotonic relationship (look at it)
• Independence, random sampling (as above)
• Nonparametrics do have assumptions!
• Great alternative if their assumptions are met, but can lack power and don’t give a good
idea of how the data are different (rely on significance)
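Spearman's coefficient is Pearson's r computed on ranks, which the sketch below makes explicit. The helper names and the data are hypothetical; ties receive average ranks.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y rises with x, but along a curve rather than a line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r  = {pearson_r(x, y):.3f}")
print(f"Spearman r = {spearman_r(x, y):.3f}")
```

On this curved but strictly increasing data, Spearman reaches 1 while Pearson stays below 1.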
P-values and confidence intervals
• P-value
– “Are the data consistent with the null hypothesis? If
not, then there is a ‘statistically significant’
difference.”
– Depends upon sample size and magnitude of effect;
doesn’t illustrate real values → A POOR MEASURE
• Confidence interval
– “What is the range of possible values for the
difference observed?”
– Provides information on precision of data and
possible range of values → A BETTER MEASURE
Extra slides
Odds and probability
Odds = (chance of something) / (chance of not something) = p / (1 - p)
If p=50%, odds are 0.5/(1-0.5) = 0.5 / 0.5 = 1. Hence, 50%
chance means that it is equally likely that “something”
and “not something” will happen.
If p=33%, odds are 0.33/(1-0.33) = 0.33/0.67 ≈ ½. Hence,
33% chance means that it is ½ as likely that
“something” will happen compared to “not
something” happening. Alternatively, it is twice as
likely that “not something” will happen compared to
“something” happening.
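The conversion between probability and odds can be sketched in a few lines. The function names are illustrative.

```python
def probability_to_odds(p):
    """Odds = p / (1 - p)."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Inverse conversion: p = odds / (1 + odds)."""
    return odds / (1 + odds)


for p in (0.5, 1 / 3):
    print(f"p = {p:.2f} -> odds = {probability_to_odds(p):.2f}")

print(f"odds = 0.50 -> p = {odds_to_probability(0.5):.2f}")
```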
Standard Deviation vs. Standard Error
• The SD is a measure of the variability in the measurements you
took. Variability can come from biologic variability, measurement
variability, or both. If you believe the tool you use to measure has
zero error, then the variability is solely due to biologic variability. If
you want to emphasize the biologic variability (i.e. scatter) in your
sample, then the SD is the appropriate statistic.
• The SEM is a measure of how well you approximated the true
population mean with your sample. Again, error can come from
biologic variability, measurement variability, or both. If you assume
there is no biologic variability, then the only error comes from the
tool you use to measure. With larger sample sizes, the SEM shrinks
because the sample mean is more likely to land close to the true
population mean. If you want to emphasize how precisely you determined
the true population mean, then the SEM is the appropriate statistic.
• The SEM is used to calculate Confidence Intervals.
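A small sketch relating SD, SEM and a confidence interval, using SEM = SD / √n. The measurements are hypothetical, and the interval uses the normal-approximation multiplier 1.96, which is an assumption; with a sample this small a t multiplier would be wider.

```python
import math
import statistics

# Hypothetical measurements, for illustration only.
sample = [4, 8, 6, 5, 7]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)     # sample SD: scatter of the measurements
sem = sd / math.sqrt(n)           # SEM: precision of the estimated mean

# Assumption: normal approximation (1.96) for the 95% interval.
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
print(f"approx. 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

Quadrupling n with the same SD halves the SEM, while the SD itself does not shrink systematically with sample size.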
Extra study design: observational studies
 | Case report/series | Cross-sectional | Ecological
Study group | Patient(s) | Defined pop; measure E and O simultaneously in each person | Select groups (country, county); measure E and O in population
Measures | None | Prevalence | Correlation coefficient
Temporal relationship | | Unable to establish | Unable to establish
Time required | | Little | Very little
Cost | | Intermediate | Inexpensive
When to use? | Interesting/new case(s) | E and O are common | Aggregate data available; establish hypotheses
Issues | Not population-based | No temporality; prevalence bias | Ecological fallacy; difficult to adjust for confounding