Study Design

Vascular Surgery Biostatistics Seminar
• We have a website:
http://www.phs.wfubmc.edu/public/edu_vascSurg.cfm
• Course is “experimental”
– Ask questions during lectures
– Let me know of specific statistical issues that you
want covered
• Assignment: for last 2 sessions (review of
student-selected publications)
– Pick 2 articles for class review
– Email PDFs of them to me by October 20th
Texts
1. Gehlbach: Interpreting the Medical Literature
(ISBN 0-07-143789-4)
2. Dawson & Trapp: Basic and Clinical Biostatistics
(ISBN 0-07-141017-1)
3. Good & Hardin: Common Errors in Statistics
(ISBN 0-471-79431-7)
4. Huck: Reading Statistics and Research (ISBN 0-205-51067-1)
5. van Belle: Statistical Rules of Thumb (ISBN 0-471-40227-3)
Schedule
Seminar #  Topic                                                    Date   Time
1          Study design and data collection                         9/10   1:30 – 3:00
2          Probability and statistical inference                    9/17   2:00 – 4:00
3          Data summary measures and graphical display of results   10/1   2:00 – 4:00
4          Survey of statistical analysis techniques (part I)       10/8   2:00 – 4:00
5          Survey of statistical analysis techniques (part II)      10/15  2:00 – 4:00
6          Evidence-based medicine and decision analysis            11/5   2:00 – 4:00
7          Reading and reviewing analyses in medical literature*    11/19  2:00 – 4:00
8          Review of student-selected medical publications*         12/3   2:00 – 4:00
Study Design
Gehlbach: Chapters 1-6
Hypothetical example: factors affecting
(causing) renovascular disease (RVD)
• Outcomes
– Renal function (GFR,
serum creatinine)
– RVD by diagnostic test
(ultrasound, angiogram)
– End-stage renal disease
(dialysis dependence)
– Renal-related mortality
• Exposures
– Hypertension
– RVD repair
• Open revascularization
• Percutaneous repair
– Risk factors: age, race,
smoking, diabetes,…
Q: How can we examine a specific hypothesis as it relates to RVD?
A: Formulate a hypothesis and design a study!
Design Dilemma
Ideal question one would pose vs. data one can collect or access
From Good & Hardin, Common Errors in Statistics and How to
Avoid Them:
Before conducting the experiment, trial, survey, data analysis:
1. Write down the objectives
2. Translate those objectives into testable hypotheses
3. List potential findings and resulting conclusions
Research Question vs. Hypothesis
• Research Question: “How does diabetes affect renal function
after renal revascularization?”
• Hypothesis: “In patients treated for RVD with endovascular
repair, those with diabetes have poorer early renal function
response than those without diabetes.”
Good & Hardin: Formulate hypotheses to be
quantifiable, testable, and statistical in nature.
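One way the example hypothesis could be written in statistical form is sketched below. It assumes, purely for illustration, that "early renal function response" is summarized as the mean early change in GFR after endovascular repair; the symbols are not from the source.

```latex
% Assumption: \mu denotes mean early change in GFR after endovascular repair
H_0:\ \mu_{\text{diabetes}} = \mu_{\text{no diabetes}}
\qquad \text{vs.} \qquad
H_1:\ \mu_{\text{diabetes}} < \mu_{\text{no diabetes}}
```

Stating the outcome measure and the direction of the alternative up front is what makes the hypothesis quantifiable and testable.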
Classification of Study Designs
Observational studies
1. Descriptive or case-series
2. Retrospective (case-control)
3. Cross-sectional
(prevalence), surveys
4. Prospective (cohort)
5. Retrospective cohort
Experimental studies
1. Controlled trials
a) Parallel designs
b) Sequential designs
c) External controls
2. Studies with no controls
Meta-analyses
Adapted from Dawson & Trapp, Basic & Clinical Biostatistics (4th ed)
Observational Studies
Retrospective Designs
• Begin with disease/condition/outcome and
look back for features (“exposure”) of those
with and without outcome
• Useful for:
– Hypothesizing causes of disease
– Identifying risk factors
• Weaknesses:
– Biased case and/or control selection
– Biased exposure ascertainment
– Temporal sequence of exposure/outcome
Retrospective Designs (cont.)
• Advantages:
– Data availability (design of choice for chart
reviews)
– Usually inexpensive
– Can be performed quickly
• Matching cases and controls:
– Prevents imbalance of known risk factor and
potential confounding
– Can reduce variability (increase efficiency)
– Require special analysis techniques
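The usual effect measure from a case-control study is the odds ratio. The sketch below shows a crude odds ratio with a Woolf (log-based) confidence interval, plus the discordant-pair odds ratio that applies to 1:1 matched pairs. All counts are hypothetical, and the function names are illustrative only.

```python
import math

def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval."""
    or_ = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    se_log = math.sqrt(
        1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls
    )
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

def matched_pair_odds_ratio(case_only_exposed, control_only_exposed):
    """Odds ratio for 1:1 matched pairs: ratio of the two kinds of discordant pairs."""
    return case_only_exposed / control_only_exposed

# Hypothetical counts, for illustration only
crude = odds_ratio(exp_cases=40, unexp_cases=60, exp_controls=20, unexp_controls=80)
matched = matched_pair_odds_ratio(case_only_exposed=30, control_only_exposed=10)
print(crude, matched)
```

In the matched version, pairs in which case and control share the same exposure status contribute nothing to the estimate, which is one reason matched data need their own analysis.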
Retrospective Design (example)
Lei et al., “Familial aggregation of renal
disease…” J Am Soc Nephrol (1998) 9:1270-1276
– Recruited 689 patients with new onset ESRD
– Used random-digit dialing to recruit 361 controls
from geographic community
– Matched cases to controls (2:1) using 5-year age
groups
– Obtained information on familial history of ESRD
and other risk factors (age, race, sex,
socioeconomic,…)
– Found patients with ≥ 2 relatives with ESRD at
increased risk for ESRD
Retrospective Cohort Design
• Uses previously collected data on a well-defined cohort
• Common approach for disease or treatment
registries since meticulous record-keeping is
required
• All follow-up took place in the past
• Subject to many of the same biases of other
retrospective designs
• Allows estimation of “prospective-like”
measures
Retrospective Cohort (example)
Holland and Lam, “Predictors of hospitalization
and death among pre-dialysis patients…”
Nephrol Dial Transplant (2000) 15:650-658
– Identified predictors of first hospitalization in a
cohort of 362 seen in “pre-dialysis” clinic
– Dialysis initiation and loss to follow-up were
censored events
– Hospitalization (for any cause) was outcome
– Risk factors examined using survival analysis
– Took advantage of records kept in “pre-dialysis”
clinic
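Survival analysis with censoring, as used in this example, can be illustrated with a minimal Kaplan-Meier estimator. This is a sketch on made-up follow-up times, not the analysis from the cited paper; `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is 1 if the outcome was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # outcomes observed at time t
        n_leaving = 0  # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical months to first hospitalization; 0 = censored
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored participants still count in the risk set up to the time they are censored, so their partial follow-up is used rather than discarded.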
Cross-sectional Designs
• Classifies a population or group with respect
to both outcome and exposure at a single
point in time
• Useful for:
– Disease description
– Diagnosis and staging
– Describing disease processes, mechanisms
• Weaknesses:
– Subject to sampling and recall biases
– Temporal order problem
– Can’t estimate disease incidence, only prevalence
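Prevalence from a cross-sectional sample is a simple proportion. The sketch below adds a Wilson score interval, which is one common choice of interval and not necessarily the one used in any study cited here. The counts are hypothetical.

```python
import math

def prevalence_wilson(cases, n, z=1.96):
    """Point prevalence with a Wilson score confidence interval."""
    p = cases / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, centre - half_width, centre + half_width

# Hypothetical: 50 participants with disease among 800 examined
p, lower, upper = prevalence_wilson(50, 800)
print(p, lower, upper)
```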
Cross-sectional Design (example)
Hansen et al., “Prevalence of renovascular
disease in the elderly…” J Vasc Surg (2002)
36:443-451.
– 834 participants in the CHS Study were examined
with RDS at a single point in time
– RVD status determined and prevalence in CHS
cohort estimated
– Increased age, lower HDL-c, and increased SBP
associated with RVD
Surveys
• Single point-in-time studies; many utilize
sampling techniques to assure generalizability
• Complex survey designs (e.g., NHANES, NIS/HCUP) use probability sampling
– Target population is divided into clusters; subsets
of clusters are sampled randomly
– Certain clusters may be “oversampled” to assure
representation
– Statistical analyses require special methods that
correct variance for study design
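The effect of oversampling on a naive estimate can be sketched with inverse-probability weights. This covers only the point estimate; the design-based variance methods mentioned above are not implemented here. The selection probabilities and data are hypothetical.

```python
def weighted_prevalence(records):
    """records: iterable of (outcome 0/1, selection probability).

    Each sampled person is weighted by 1 / selection probability.
    """
    numerator = sum(y / prob for y, prob in records)
    denominator = sum(1 / prob for _, prob in records)
    return numerator / denominator

# Hypothetical sample: group A sampled at 10%, group B oversampled at 50%
sample = (
    [(1, 0.1)] * 2 + [(0, 0.1)] * 8      # group A: 2 of 10 with outcome
    + [(1, 0.5)] * 20 + [(0, 0.5)] * 30  # group B: 20 of 50 with outcome
)
unweighted = sum(y for y, _ in sample) / len(sample)
weighted = weighted_prevalence(sample)
print(unweighted, weighted)
```

The unweighted proportion is pulled toward the oversampled group; weighting restores each group to its share of the target population.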
Complex Survey (example)
Modrall et al., “Operative mortality for renal
artery bypass in the United States” J Vasc Surg
(2008) 48:317-322
– Examined RABG from NIS/HCUP survey, 2000-2004
– Observed 10% in-hospital post-op mortality
– Risk factors for increased mortality included: age,
female gender, Hx renal failure, CHF, lung disease
– In-hospital mortality higher than previously
reported
– Used methods that accounted for survey design
Ecologic Studies
• Use data from large groups to compare rates
of exposure and disease
• Data are on group-level (e.g., data on air
pollution levels in specific cities could be
compared to rates of lung cancer)
• Can lead to “ecologic fallacy”, because one
doesn’t know whether the actual individuals
with disease were subject to the exposure of interest
• Subject to “crackpot” biases
Ecologic Study (example)
Reynolds et al., “Childhood cancer and
agricultural pesticide use…” Environ Health
Perspect (2002) 110:319-324
– Examined incidence of childhood cancers in
California in relation to pesticide use, 1988-1994
– Data sources: California Cancer Registry; U.S.
Census; California Dept. of Pesticide Regulations
– Looked at cancer of all types, and by specific types
– Found a significant association between childhood
leukemia rates and the highest level of
propargite use
– No other associations were observed
Prospective Designs
• Start with well-defined cohort and follow-up
for occurrence of disease/outcome
• Considered the optimal design for
observational studies
• Useful for:
– Finding causes and estimating incidence of
disease
– Identification of risk factors
– Following natural history, determining prognosis
Prospective Designs (cont.)
• Weaknesses:
– Subject to selection bias (all studies are) and
surveillance bias
– Losses to follow-up or dropouts
– Temporal changes in health habits (e.g., MRFIT)
• Can be expensive and always take time
• Advantages:
– Correct temporal relationship between exposures
and disease/outcome
– Allows estimation of disease incidence and
relative risks
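Because a cohort is followed forward, cumulative incidence can be computed in each exposure group and compared as a relative risk. A minimal sketch with a log-based confidence interval follows; the counts are hypothetical.

```python
import math

def relative_risk(events_exp, n_exp, events_unexp, n_unexp, z=1.96):
    """Relative risk (ratio of cumulative incidences) with a log-based CI."""
    risk_exp = events_exp / n_exp
    risk_unexp = events_unexp / n_unexp
    rr = risk_exp / risk_unexp
    se_log = math.sqrt(1 / events_exp - 1 / n_exp + 1 / events_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical cohort: 30 events among 200 exposed, 45 among 600 unexposed
rr, lower, upper = relative_risk(30, 200, 45, 600)
print(rr, lower, upper)
```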
Prospective Design (example)
Edwards et al., “Renovascular disease and the
risk of adverse coronary events…” Arch Intern
Med (2005) 165:207-213
– 840 CHS participants with RDS exams from
Hansen et al.
– Followed for CVD events for an average of 14
months post-RDS
– Participants with RVD found to have nearly twice
the rate of adverse CVD events during the observation
period as those without RVD
Observational Designs
[Figure: timeline relative to “Today” comparing four observational designs: cross-sectional, retrospective (case-control), retrospective cohort, and prospective (cohort). Each panel shows participants split by exposure, E(+) and E(-), and by outcome (cases vs. controls).]
Experimental Studies
Clinical Trials
Participants are assigned to an experimental
treatment and followed for event of interest
– Clinical trials may…
a) …be randomized or non-randomized
b) …include a control group or have no control group
c) …compare current treatment to an historical control
d) …employ parallel or cross-over design
e) …employ blinding of investigator and/or participant
– The randomized, double-blind, placebo-controlled, parallel
design is considered to be the best to determine efficacy
Clinical Trials (cont.)
Randomization
– Purpose: to balance groups on both observed and
unobserved factors
– No guarantees: balance occurs in expectation (i.e.,
there is chance that some factors will not be
balanced)
– In cross-over design, it’s best to randomize
treatment order (if possible)
– Blocking used to assure treatment arm balance at
fixed points
– Stratification used to assure balance on a factor of
interest
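Blocking can be illustrated with a permuted-block allocation schedule. The sketch below uses hypothetical arm labels, block size, and seed; a real trial would generate and conceal its schedule through its own procedures.

```python
import random

def permuted_block_schedule(n_participants, block_size=4, arms=("A", "B"), seed=0):
    """Allocation list in which every complete block is balanced across arms."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    schedule = []
    while len(schedule) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule[:n_participants]

schedule = permuted_block_schedule(12)
print(schedule)
```

For stratified randomization, one option is to generate a separate schedule like this within each stratum (for example, diabetic and non-diabetic participants), so that arms stay balanced on that factor.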
Clinical Trial: Parallel Group Design
[Figure: timeline of screening, baseline, and treatment. Participants screened for entry criteria are allocated to experimental treatment or control treatment; each arm is followed to an end state of with outcome or without outcome.]
Clinical Trial (example 1)
Kay et al., “Acetylcysteine for prevention of
acute deterioration of renal function…” JAMA
(2003) 289:553-558.
– Experiment to test efficacy of antioxidant
acetylcysteine to prevent acute nephrotoxicity
– 200 patients with moderate renal insufficiency
undergoing elective coronary angiography
– Randomized, double-blind, placebo-controlled
– 12% with increase in SCr in placebo group vs. 4%
in acetylcysteine group (P=0.03)
Clinical Trial: Crossover Design
[Figure: timeline of screening, baseline (B/L), treatment phase 1, washout, and treatment phase 2. Participants screened for entry criteria receive experimental or control treatment in phase 1, then the other treatment in phase 2 after washout; each phase ends with outcome or without outcome.]
Clinical Trial (example 2)
Whelton et al., “Effects of celecoxib and
naproxen on renal function…” Arch Intern Med
(2000) 160:1465-1470
– Experiment to compare effect of celecoxib vs.
naproxen on renal function in elderly cohort
– 29 healthy elderly subjects took either celecoxib
or naproxen for 10 days, had 7-day washout, then
took other med for 10 days
– Randomized treatment order, single-blind design
– At day 6, GFR change on naproxen -7.5
mL/min/1.73 m² vs. -1.1 on celecoxib (P=0.004)
Clinical Trials (other types)
• Non-randomized trials: patients not assigned
to treatment (or treatment order) via
randomization; interpret with caution
• External or historical controls: compare
current experiment to an external control
group (e.g., from prior study or literature);
interpret with caution
• Uncontrolled trial: experimental group only
(no comparison); interpret with caution
Clinical Trial (example 3)
Gomes et al., “Acute renal dysfunction in high-risk
patients after angiography…” Radiology (1989) 170:65-68
– 145 patients at “high-risk” for renal failure
undergoing angiography after administration of
iohexol (non-ionic contrast)
– Compared to 202 historical controls previously
studied with ionic contrast
– Acute renal dysfunction observed in 5.5% of
iohexol group vs. 10% of historical control group
(P=NS)
– Authors use result to argue for new, randomized
trial of two contrast agents
Clinical Trials (issues)
• Blinding: double-blind is optimal but not
always feasible
– Surgical trials usually impossible to blind both
investigator and participant
– Some trials are “open-label” and treat participants
to a goal; others test a behavioral intervention
– Group interventions are typically not blinded;
must also account for “clustering” in intervention
• If possible, always blind staff performing
measurements
• Avoid surveillance and/or ascertainment bias
Clinical Trials (issues)
• Look out for loss to follow-up, differential
attrition, and poor adherence to treatments
• Intention-to-treat: when analyzing outcomes,
participants are included in analyses based on
treatment group assignment regardless of
treatments received or adherence
– Necessary to avoid potential bias due to self-selection
– Preserves randomization
– Drug and device companies love to do analyses
based on treatments received
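The difference between intention-to-treat and as-treated analysis can be seen in a small made-up data set with two crossovers. The records and arm labels below are hypothetical.

```python
from collections import defaultdict

# (assigned arm, arm actually received, outcome: 1 = event) -- hypothetical data
records = [
    ("drug", "drug", 0), ("drug", "drug", 0),
    ("drug", "drug", 1), ("drug", "placebo", 1),
    ("placebo", "placebo", 1), ("placebo", "placebo", 0),
    ("placebo", "placebo", 1), ("placebo", "drug", 0),
]

def event_rates(rows, group_index):
    """Event rate per group; group_index 0 = assigned arm, 1 = received arm."""
    totals = defaultdict(lambda: [0, 0])  # group -> [events, participants]
    for row in rows:
        totals[row[group_index]][0] += row[2]
        totals[row[group_index]][1] += 1
    return {group: events / n for group, (events, n) in totals.items()}

itt = event_rates(records, 0)         # analyzed as randomized
as_treated = event_rates(records, 1)  # analyzed by treatment received
print(itt, as_treated)
```

In this toy example the arms look identical under intention-to-treat, while the as-treated grouping makes the drug look better simply because of who switched.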
Meta-analysis
• Pools results across multiple studies
• A review article with quantitative summary
• Typically combines results of several
experimental studies
– Useful for combining small studies
– Studies should have same or similar treatments
– Pools results to get single measure of effect
• Beware: meta-analyses combining
experimental and observational designs
• Dependent upon articles reporting sufficient
data (N, effect measure, variance)
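Pooling to a single measure of effect is often done by inverse-variance weighting, which is why each article must report an effect measure and its variance. The sketch below is a fixed-effect version on hypothetical study results; it is not the method of any particular meta-analysis cited here.

```python
import math

def fixed_effect_pool(effects, variances, z=1.96):
    """Inverse-variance (fixed-effect) pooled estimate with confidence interval."""
    weights = [1 / v for v in variances]  # more precise studies get more weight
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return pooled, pooled - z * se, pooled + z * se

# Hypothetical effect estimates (e.g., risk differences) and their variances
effects = [0.10, 0.30, 0.20]
variances = [0.04, 0.04, 0.02]
pooled, lower, upper = fixed_effect_pool(effects, variances)
print(pooled, lower, upper)
```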
Meta-analysis (example)
Leertouwer et al., “Stent placement for renal
arterial stenosis…” Radiology (2000) 216:78-85
– Compared studies of RVD repair with stent
placement vs. PTA alone
– Combined data on technical success rate, BP
response, renal function response, anatomic F/U
from 14 studies of stent placement and 10 studies
of PTA
– Conclusion: “Renal artery stent placement is
technically superior and clinically comparable to
renal PTA alone.”
Data Collection for Statistical Analyses
1. Enter all or most of the data as numbers. Avoid entering
letters, words, string variables (e.g., NA, 22%, <3.6), or
anything that resembles a cartoon curse word (@#&*%). In Excel,
all columns, with the exception of names and text comments,
should be formatted as numbers or dates (not as general or text).
2. Give each column a unique, simple, 1-word name, 8 characters
or fewer with no spaces, beginning with a letter, and place this
name in the first row.
3. Put only one variable in a column. Do not combine variables in
the same column.
4. Enter each patient (or unit of analysis) on a separate line,
beginning on the second line.
5. Give each research participant or patient a unique case number
(1, 2, 3, etc.) in the first column. Delete patient name, SS#,
MR#, and any identifying information before sending it to a
statistician. Always save the spreadsheet with a password.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataTransmissionProcedures?CGISESSID=9fe1d0d63a71d176ca460de518acf2cf
Data Collection for Statistical Analyses
6. Enter cases and controls in the same spreadsheet. Use
one variable to define the control group (TREATED
0=no, 1=yes or GROUP 1=Drug A, 2=Drug B).
7. Quantify. Enter continuous measurements when
possible.
8. Create a simple guide (or key) using a word processor
to explain variable abbreviations, value coding, and
how missing values were entered. Be consistent.
9. Think through the analysis before collecting any data.
10. Have a biostatistician review the coding before data
entry and again after the first 10 patients have been
entered.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataTransmissionProcedures?CGISESSID=9fe1d0d63a71d176ca460de518acf2cf
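Some of these rules can be checked automatically before a file goes to the statistician. The sketch below tests column names and numeric entries on a small hypothetical sheet. Allowing underscores in names, treating blank cells as missing, and the `text_columns` exemption (for names, comments, or date columns) are assumptions of this example, not part of the guidelines.

```python
import re

# Assumption: letters, digits, underscore; starts with a letter; at most 8 characters
NAME_RULE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")

def check_sheet(header, rows, text_columns=()):
    """Return a list of problems found in the header and data rows."""
    problems = []
    if len(set(header)) != len(header):
        problems.append("duplicate column names")
    for name in header:
        if not NAME_RULE.fullmatch(name):
            problems.append(f"bad column name: {name!r}")
    for line_no, row in enumerate(rows, start=2):  # data begin on line 2
        for name, value in zip(header, row):
            if name in text_columns or value == "":
                continue  # exempt column, or blank treated as missing
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}, {name}: non-numeric value {value!r}")
    return problems

# Hypothetical sheet with one bad column name and two bad entries
header = ["id", "age", "gfr", "diabetes status"]
rows = [["1", "67", "54.2", "1"], ["2", "NA", "<3.6", "0"]]
problems = check_sheet(header, rows)
print("\n".join(problems))
```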
Spreadsheet from Hell
Spreadsheet from Heaven