Epistemology - University of Pittsburgh

Overview of Research Methods
in Dentistry
Robert Weyant, DMD DrPH
Department of Dental Public Health
and Information Management
University of Pittsburgh
What is “Causation”?
• Koch-Henle postulates
• Bradford-Hill 'criteria'
• Inductionist, refutationist, or hypothetico-deductivist view
• Provides the basis for “intervention”
"Causality. There is no escape from it, we are forever slaves to it. Our only hope, our only peace is to understand it, to understand the why."
Larry and Andy Wachowski, The Matrix Reloaded.
Hill's Criteria of Causation
• Proposed by Austin Bradford Hill (1897-1991), a British medical statistician, as a way of determining the causal link between a specific factor (e.g., cigarette smoking) and a disease (such as emphysema or lung cancer).
• Hill's criteria form the basis of modern epidemiological research, which attempts to establish scientifically valid causal connections between a disease and its cause.
The criteria:
• Temporal Relationship
• Strength
• Dose-Response Relationship
• Consistency
• Plausibility
• Consideration of Alternate Explanations
• Experiment
• Specificity
• Coherence
Systems
• Deterministic Systems
  • Events are part of an unbroken chain of prior occurrences.
  • Outcomes occur predictably.
  • Newtonian physics
• Stochastic Systems
  • Outcomes are computationally and practically unpredictable.
  • Present state does not fully determine the next state.
  • Biology and medicine are stochastic.
Statistical Causality
• Observational studies (like counting cancer cases among smokers and among non-smokers and then comparing the two) can give hints, but can never establish cause and effect.
  • Hypothesis generation.
• The gold standard for causation here is the randomized experiment:
  • One limitation of experiments is that, while they do a good job of testing for the presence of some causal effect, they do less well at estimating the size of that effect in a population of interest.
  • Subject selection may lack generalizability.
[Diagram labels: Med, Exp, Outcome]
Research Designs in Clinical Research
Essentials of Research Design
• Basic research
• Clinical research (often experimental)
• Epidemiological research (often observational, know denominator)
• Health services research
Limited to human research (in vivo)
What are our research (and clinical) concerns?
• Exposure
  • Good or bad: chemical, biological, psychological, educational, etc.
• Outcome
  • Good or bad: disease, cure, improved attitude, longer life, etc.
• We generally know one and want to measure the other.
• Concerns are that we measure both accurately and understand what population is represented.
Classification Schemes
• Descriptive vs. Analytical
• Experimental vs. Observational
• Time Referenced
  • Prospective vs. Cross-sectional vs. Retrospective
Describe or Analyze?
• Descriptive
  • Simply describe what was seen (common in surveys), e.g., the prevalence of various conditions.
  • PREVALENCE: the proportion of the population who exhibit the condition of interest.
• Analytic
  • Attempt to determine the associations between disease and possible risk factors/determinants and to quantify risk (common for experimental designs and the search for causality).
Experiment or Observe?
• Experimentation is defined by the degree of control or manipulation the investigator has over the study conditions.
  • In a non-experimental (observational) design the investigator has less control over the study conditions.
• The consequences of study design are in the limitations put upon the interpretation of the results of the study.
Time
[Diagram: study designs placed along a time axis]
• Retrospective: case control
• Cross-sectional
• Prospective: all experimental, and cohort (observational)
Classification according to CONTROL / INTERVENTION
• Experimental Designs (Classic Design = RCT)
  • Prospective
  • Investigator alters the conditions under study
  • There is a true control group
  • Randomization MUST occur
• Observational
  • May be prospective/retrospective/cross-sectional
  • No control
  • No intervention
Issues of concern
1. Population
2. Control group
3. Sample size
4. Placebo
5. Control of Operational Procedures
6. Validity and Reliability of Measures
7. Duration
8. Statistical Analysis
1. Population (Relevance)
• When you read a study you must ask:
  • Is the population representative of something I care about?
  • Is it appropriate to answer the question?
How do people get into a “study”?
• They volunteer.
• Often they are in the right place at the right time.
• They have the right disease (severity) or exposure.
• Often “clinic” based studies generalize very poorly to larger populations.
Why people don’t get into a study
• Too sick or not sick enough
• Wrong gender, race, etc.
• Don’t live in the right place
• Don’t know about the study
Where do research “subjects” come from?
Generalizability of Results
Population of interest (in community)
  Barriers: lack of knowledge, referral issues, fear, transportation
→ Present for study
  Barriers: wrong disease severity, demographic issues
→ Eligible
  Barriers: fear, transportation, not willing to be “randomized”
→ Consent/Enroll
  Barriers: do not adhere to protocol, lost to follow-up (move, die)
→ Complete study and can be found for follow-up
Is the study relevant and valid?
• External validity
  • Do the study subjects represent a definable population of interest, i.e., “your patients”?
  • Hence, is it relevant?
• Internal validity
  • Is the study well designed and analyzed?
  • Hence, is it valid?
2. Sample Size
(did you look at enough people…)
• There must be enough people in the study to ensure that the conclusions are valid. The likelihood that a finding will be spurious or incorrect decreases as you increase the number of individuals in the study.
• POWER: the ability of a test to detect a significant difference when one exists. Be particularly attentive to negative studies.
  • Function of effect size, variance, sample size
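As a rough illustration of how power depends on effect size, variance, and sample size, the sketch below uses the standard normal-approximation formula for comparing two group means. The function name, the default alpha of 0.05 and power of 0.80, and the example numbers are assumptions chosen for illustration; they do not come from the lecture.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a difference in means.

    effect: smallest difference between group means worth detecting
    sd:     assumed standard deviation of the outcome in each group
    Uses n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2 (two-sided test).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the type 1 error
    z_beta = z.inv_cdf(power)           # value tied to the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Hypothetical example: same variability, two different effect sizes.
n_small_effect = sample_size_per_group(effect=0.5, sd=2.0)
n_large_effect = sample_size_per_group(effect=1.0, sd=2.0)
print(n_small_effect, n_large_effect)
```

Halving the effect size roughly quadruples the required sample, which is one reason a small "negative" study may simply have been underpowered.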
3. Control Group?
(it worked!……compared to what?)
• If we are to conclude that an intervention has an effect, then we must be sure that the groups with and without the intervention were similar before the study began and remained so except for the intervention.
• If not, bias can result in spurious conclusions.
4. Placebo?
(I feel much better...what was that?)
• A placebo is a material, formulation, or intervention that is similar to the test product, but without the active ingredient.
• There is a well documented placebo effect in many situations.
  • Up to 70% in some studies.
5. Control of operational procedures
(What exactly did you do, doctor?)
• When reading a study for your own use, it is important that the authors explain precisely what they did. This allows the reader to generalize to his/her own situation and helps to assess relevance.
6. Reliability of measures
(That was great…now do it again?)
• One of the most important areas in any study: did the effect occur, and how do we know? Someone measured it. We must be able to determine that the investigator(s) measured it accurately and repeatably.
  • INTRA-RATER reliability (same cases over time)
  • INTER-RATER reliability (comparison of same cases among raters)
  • Instrumentation
7. Duration of study
(over so fast?)
• Did the trial run long enough to measure the desired effect?
  • Caries trials: 2-3 years
  • Calculus-preventing agents: 90 days
  • Orthodontic outcomes (20 years?)
  • Implants (5 years)
8. Statistical Analysis
(So, did I find anything “significant?”)
• Were the analyses appropriate to the design, the quality of the data, and the intent of the investigators?
• Statistical analysis is based on the type of data (nominal, ordinal, ratio).
• Type of question being asked
  • Summarize
  • Difference between groups
  • Effect size or risk
Threats to Validity of a Study
(Nice result, but what about…)
• Bias: any systematic error in a study which results in an incorrect estimate of the association between disease and exposure.
• Confounding: results when the effect of the exposure on the disease is mixed with the effect of a “third factor”.
• Chance: the exposure-disease relationship is spurious as the result of random variation in sampling.
Types of Bias
• Selection
  • Non-representative sample
  • Non-comparable case/control groups
  • Loss to follow-up
  • Differential survival
• Observation (Misclassification error)
  • Disease Classification
  • Exposure Classification
  • Instrumentation
Confounding
• Definition: the bias in the (crude) disease-exposure estimate that can result when the exposure-disease relationship is mixed up with the effect of “extraneous variables”.
• Confounding affects our understanding of the “true” disease-exposure relationship.
• The determination is “data-based”.
• Two methods
  • Stratification
  • Multivariate analysis
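To make stratification concrete, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled across strata of a suspected confounder. The Mantel-Haenszel estimator is a standard technique chosen here for illustration (the lecture names only "stratification"), and all counts are invented.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each given as (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical data, stratified by a third factor (e.g., smoker / non-smoker).
strata = [
    (40, 20, 20, 10),  # stratum 1: odds ratio is 1.0
    (5, 20, 20, 80),   # stratum 2: odds ratio is 1.0
]

# Collapsing the strata into one table gives the crude estimate.
crude_table = [sum(cells) for cells in zip(*strata)]
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)
print(round(crude_or, 2), round(adjusted_or, 2))
```

In this made-up example the crude table suggests a strong association, yet within each stratum there is none: the apparent effect is produced entirely by the third factor.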
Chance
• That’s what we have statistics for: to quantify the chance.
• Type 1 (alpha) error (p-value).
Research Designs
Does the investigator alter the conditions under study?
• Yes: experimental studies. Is there to be a control group?
  • Yes: true experiment
  • No: quasi-experiment
• No: observational studies. Do we know the disease status of patients before the study?
  • Yes: case-control study
  • No: will observations be made at more than one time?
    • Yes: cohort study
    • No: cross-sectional study
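The decision flowchart for research designs can be written as a small function. This is a sketch of one reading of the flowchart; the function and parameter names are hypothetical.

```python
def classify_design(alters_conditions, has_control_group=False,
                    disease_status_known=False, repeated_observations=False):
    """Walk the research-design flowchart and return the design name."""
    if alters_conditions:
        # Experimental branch: the deciding question is the control group.
        return "true experiment" if has_control_group else "quasi-experiment"
    # Observational branch.
    if disease_status_known:
        return "case-control study"
    if repeated_observations:
        return "cohort study"
    return "cross-sectional study"


rct = classify_design(alters_conditions=True, has_control_group=True)
survey = classify_design(alters_conditions=False)
followup = classify_design(alters_conditions=False, repeated_observations=True)
print(rct, survey, followup, sep=" | ")
```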
Observational Designs
Cross Sectional
Case Control (retrospective)
Cohort (prospective)
Cross Sectional Study
• Measure, Classify, Compare
• Used for questionnaires, surveys, prevalence estimates, to generate hypotheses.
• Everything occurs “at once”.
Cross Sectional Design
1. Select population of interest
2. Select sample
3. Assess population for both disease (outcome) status and risk factor (exposure) status
[Diagram: Population of Interest → Study Sample → Disease Positive (RF +, RF -) and Disease Negative (RF +, RF -)]
Analyze using correlational statistics, but causation is not “provable” due to the lack of a temporal association.
Cross-Sectional Design
Advantages:
1. Quick and low cost
2. Evaluate a large number of variables
3. Enroll a large number of subjects
Disadvantages:
1. Subject selection may reflect selection bias (volunteers, hospital patients)
2. It is difficult to identify cause and effect relationships
Common Uses:
• Questionnaires and Surveys
• Prevalence studies
• Hypothesis Development
Case Control
• Select cases and controls
• Retrospective assessment of risk factors
• Quantify exposure. Since there is no denominator, only relative rates can be estimated.
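Because a case-control study has no population denominator, the usual strength-of-association measure is the odds ratio. The sketch below computes it with an approximate 95% confidence interval using the log (Woolf) method. The function name and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(cases_exposed, cases_unexposed,
                       controls_exposed, controls_unexposed, z=1.96):
    """Odds ratio and approximate confidence interval (Woolf's log method)."""
    estimate = (cases_exposed * controls_unexposed) / (
        cases_unexposed * controls_exposed)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log = sqrt(1 / cases_exposed + 1 / cases_unexposed
                  + 1 / controls_exposed + 1 / controls_unexposed)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical study: 100 cases and 100 controls.
or_estimate, or_lower, or_upper = odds_ratio_with_ci(30, 70, 15, 85)
print(round(or_estimate, 2), round(or_lower, 2), round(or_upper, 2))
```

An interval that excludes 1 suggests the association is unlikely to be due to chance alone; it says nothing about bias or confounding.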
Case-Control Design
1. Select group of subjects WITH disease/outcome of interest = CASES
2. Select group of subjects WITHOUT disease/outcome = CONTROLS
3. Measure (retrospectively) risk factors of interest.
4. Analyze using strength of association measures.
[Diagram: Cases (RF +, RF -) and Controls (RF +, RF -)]
Selection of controls is crucial. Case selection also must be carefully considered.
Common Use:
• Rare Disease (e.g., birth defects)
• Long Latency (e.g., cancer)
Case-Control Design
Advantages
1. not dependent on natural frequency of disease (thus used to study rare diseases)
2. well suited to study diseases with long latency
3. requires comparatively few cases (2:1 or 3:1 matching)
4. not dependent on previously established cohort
5. allows study of multiple potential causes of disease
6. relatively low cost and quick
7. ethical: disease has already occurred
Disadvantages
1. case selection may be problematic
2. controls may not be representative of same population as cases in terms of disease risk or confounders
3. investigators may be biased when they know the disease status of subjects
4. subjects may bias answers (recall) due to disease status
5. factors which are used to match are removed from analysis
6. incidence, prevalence, RR and AR can't be calculated since no "population at risk" denominator is available
Cohort Design
• Select two or more groups (cohorts) that are free of disease but differ on their exposure status.
  • May start with one heterogeneous cohort.
• Cohorts have a “denominator” which allows the calculation of true rates.
• Useful when “exposure” varies over time.
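Since a cohort has a denominator, incidence can be computed directly in each exposure group and compared. The sketch below shows cumulative incidence, relative risk, and attributable risk; the function name and the counts are invented for illustration.

```python
def cohort_measures(exposed_cases, exposed_total,
                    unexposed_cases, unexposed_total):
    """Cumulative incidence per group, relative risk, and attributable risk."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return {
        "incidence_exposed": incidence_exposed,
        "incidence_unexposed": incidence_unexposed,
        "relative_risk": incidence_exposed / incidence_unexposed,
        "attributable_risk": incidence_exposed - incidence_unexposed,
    }


# Hypothetical cohort: 200 exposed and 200 unexposed subjects, disease-free at baseline.
measures = cohort_measures(exposed_cases=30, exposed_total=200,
                           unexposed_cases=10, unexposed_total=200)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```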
Cohort Study Design
1. Select population of interest
2. Recruit sample WITHOUT disease(s) of interest and measure risk factors
3. Recall cohort periodically and remeasure risk factors and disease status
[Diagram: Population of Interest → Disease-Free Study Sample (baseline exam) → Visit 2 → Visit 3 → Visit n, along a time axis]
Prospective, Observational Design.
Uses:
• Determining/quantifying risk factors
• Developing new etiological theory
• Establishing causality
Cohort Design
Advantages
1. allows risk to be expressed as incidence
2. certain biases are reduced: exposure status, disease status
3. subject characteristics can be related to more than one outcome
Disadvantages
1. inefficient for study of rare disease
2. assessment of relationships limited to those defined at beginning of study
3. selection bias not controlled
4. loss to follow-up common
5. subjects may change in regards to characteristics (i.e. exposure status)
6. bias may be present if the characteristic studied influences surveillance and if surveillance influences detection of outcome (Berkson's fallacy)
7. expensive and time consuming
Experimental Designs
Clinical Trials (RCTs)
Field Trials
Clinical Trials
• Prospective controlled experiment on human subjects to assess an intervention for a specific disease.
• Asks an important research question
• Clinical event or outcome
• Done in clinical or medical setting
• Evaluates one or more interventions compared with “standard treatment”
• Informed consent and a DSMB (data and safety monitoring board) required
Phases of Clinical Trials
• Phase I: dose finding
• Phase II: efficacy at fixed dose
• Phase III: comparing treatment (RCT)
• Phase IV: late/uncommon effects
Uses of Clinical Trials
(experimental studies)
• Test new drug therapy
• Test new surgical interventions
• Test educational/programmatic interventions
Randomized Clinical Trial Design
1. Recruit individuals WITH disease.
2. Randomize into treatment arms.
3. Follow up to assess outcomes.
[Diagram: Study Sample with disease → Randomization → Standard Treatment → Outcomes; New Treatment → Outcomes]
Ethical only to the degree that differences in treatment are unknown at time of study initiation (equipoise). Requires DSMB.
Randomization is essential and, along with strict control of experimental conditions, allows for minimal bias.
Excellent internal validity (but possibly low external validity)
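One common way to carry out the randomization step is permuted-block randomization, which keeps the arms balanced as subjects enroll. The lecture does not specify a randomization method, so the technique, function name, block size, and seed below are all assumptions for illustration.

```python
import random


def block_randomize(n_subjects, arms=("standard", "new"), block_size=4, seed=0):
    """Assign subjects to arms in shuffled blocks with equal arms per block.

    block_size should be a multiple of the number of arms.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    allocation = []
    while len(allocation) < n_subjects:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_subjects]


allocation = block_randomize(20)
print(allocation.count("standard"), allocation.count("new"))
```

In a real trial the allocation list would be generated and held by someone independent of recruitment, so that upcoming assignments stay concealed.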
Experimental Design
Advantages:
1. investigator directly controls assignment to study groups
2. investigator directly controls exposure to agent
3. random assignment measures can control extraneous factors
4. blinding of evaluators may be possible
Disadvantages:
1. not immune to problems encountered with other designs (non-compliance, incomplete follow-up, biased observation)
2. may have low external validity
3. may not be feasible for studies of disease etiology (ethical considerations, rare disease)
4. may not be feasible when effective disease prevention exists (can't withhold treatment)
5. can be very expensive
Efficacy vs. Effectiveness
• Efficacy is the potential to provide a clinical benefit.
  • Measured in clinical trials (CTs)
• Effectiveness is the benefit provided in routine “real world” use.
  • Measured in surveillance systems (registries), post-market incident reports, etc.
Hierarchy of Research Designs
• Experimental designs
• Cohort studies
• Case-control designs
• Human trial without controls
• Cross-sectional designs
• Descriptive studies
• Case reports
• Personal opinion
Based on control of bias and confounding and ability to make causal arguments
RCT’s Strengths
• Minimally biased design
  • Randomization
  • Control of extraneous variables
  • Prospective (causality established)
  • Design issues determined prior to initiation of study.
Problems with (Dental) RCTs
• Difficult to randomize
• Ethical Concerns
  • Principle of Equipoise involves the ethical treatment of human subjects in experimental conditions. A subject should only be submitted to a randomized, controlled design if there is substantial uncertainty about which of the treatments would benefit the subject most.
  • RCTs should not be done when patient preference can be elicited (ortho vs. surgical tx)
• Blinding issues (Hawthorne effect)
• Expensive (and often lack sponsor)
What are the current “Issues” in Dental clinical research?
• Diagnosis
• Treatment approach
• Materials
• Long term issues
• Health Services Research
  • Cost Effectiveness
• Harm
Negative Study
• No association
• Sloppy design (poor methods or analysis)
• Bias
• Chance
  • Statistics measures “chance” (expressed as p-value)
Systematic Reviews
Putting it all together
Scientific Truth relies on
• the weight of evidence over many studies that creates confidence in results.
  • If it’s not published….it didn’t happen.
• Journalistic Reviews…the “old way”
  • Remember the essays you used to write as a student? You would browse through the indexes of books and journals until you came across a paragraph that looked relevant, and copied it out. If anything you found did not fit in with the theory you were proposing, you left it out.
  • Or the way it’s done by senior academics. Take a simmering topic, extract the juice of an argument, add the essence of one filing cabinet, sprinkle liberally with your own publications and sift out the work of noted detractors or adversaries…
Systematic Reviews…the new way
• In contrast to the old way, systematic reviews use explicit and rigorous methods to identify, critically appraise, and synthesize relevant studies.
• Qualitative: when the results of studies are not statistically combined.
• Quantitative or Meta-analysis: a systematic review that uses statistical methods to combine the results of two or more studies.
Maturation of Dentistry
Age of Empiricism: dental practice based on observation and experience in ignorance of scientific findings
Age of Evidence: dental practice based on high quality evidence of effectiveness
[Diagram: progression from the Age of Empiricism to the Age of Evidence]
• All knowledge maintained personally → Textbooks and Journals → Internet
• Apprentice Model of Education → Scientific Literature and Knowledge Synthesis-based Education
• Absence of Research → RCTs → Systematic Reviews and Meta-Analysis
Evolution of the Dental Knowledge Base
• store of specialized information
- diseases
- treatment methods
- treatment outcomes
• basis of professional decision-making
• has evolved over time with respect to:
- creation
- synthesis
- dissemination
Bader JEBDP 2004
What is a Systematic Review
• A “systematic review” comprehensively locates, evaluates and synthesizes all the available literature on a given topic using a strict scientific design which must itself be reported in the review.
• The aim of an SR is to be:
  • Systematic (e.g. in its identification of literature)
  • Explicit (e.g. in its statement of objectives, materials and methods)
  • Reproducible (e.g. in its methodology and conclusions)
• Goal: To efficiently integrate valid information and provide a basis for rational decision making.
Features of a Systematic Review
• Explicit criteria (reproducible)
• Efficient
  • It is impractical for even an expert to read all the literature published in his or her field. SRs are a succinct but robust format for practitioners who need to keep up to date.
• Well focused (PICO)
• Thorough (unpublished information may be included)
• Provides a context for studies and creates a sense of the “weight of evidence”
  • Secondary data analysis
Why Systematic Reviews
• Annually 3 million articles are published in biomedical journals, and the doubling time of the biomedical literature is less than 20 months.
  • You would need to read a dozen or more articles per day (365 days/yr.) to stay up to date.
  • Not all articles are valid or useful for patient care.
  • SRs provide a summary and context of the current state of knowledge (that is lacking if you only read a few articles in an area).
Quality of Evidence Pyramid
Meta-Analysis
Systematic Review
Randomized Controlled Trial
Cohort studies
Case Control studies
Case Series/Case Reports
Basic Research and Animal research
[Bracket in original figure: Guidelines]
Questions come in two varieties:
• BACKGROUND QUESTIONS
  • Textbooks/Basic Sci Faculty
• FOREGROUND QUESTIONS
  • Clinical Faculty
  • Journal articles
  • Guidelines
[Diagram: Foreground vs. Background questions, from Dental School to Professional Practice]
Background Questions
• Are general questions about conditions, illnesses, syndromes and patterns of disease, and pathophysiology.
  • “What is the typical clinical presentation of primary oral herpes?” or
  • “Which teeth are most commonly affected during ECC?”
• Novices ask this type of question in a particular knowledge area, in order to gain a general understanding of clinical issues.
• Best resources include textbooks and faculty.
Foreground Questions
• Foreground questions are about issues of patient care and clinical decision-making.
• Best resources:
  • guidelines
  • systematic reviews
Remember: Generally, it’s not what you don’t know that causes problems - it’s what you “know” that just ain't so….
Steps in Developing Systematic Reviews
Step 1: Identify an area of Uncertainty
• Diagnosis
  • How well does DIAGNODENT diagnose interproximal caries?
• Therapy
  • Should asymptomatic impacted third molars be extracted?
• Prognosis
  • How long will an implant last when used to replace a single anterior tooth lost due to trauma? Is it different if the tooth loss is due to perio?
• Harm or Causality
  • Do posterior inlays result in greater risk of tooth sensitivity compared with other posterior restorations?
Step 2: Frame it as an Answerable Question
(PICO Format)
• P: patients or populations
• I: interventions
• C: comparison group(s) or "gold standard"
• O: outcome(s) of interest
P.I.C.O.
Patient or Problem
• Tips for building questions: Starting with your patient, ask “How would I describe a group of patients similar to mine?” Balance precision with brevity.
• Example: In young adults, will asymptomatic impacted third molars cause ortho relapse or lead to problems better dealt with prophylactically?
Intervention (a cause, prognostic factor, treatment, etc.)
• Tips for building questions: Ask “Which main intervention am I considering?” Be specific.
• Example: Surgical extraction
Comparison Intervention (if necessary)
• Tips for building questions: Ask “What is the main alternative to compare with the intervention?” Be specific.
• Example: Watchful waiting
Outcomes
• Tips for building questions: Ask “What can I hope to accomplish?”, or “What could this exposure really affect?” Be specific.
• Example: reduction of ortho relapse, prevention of oral infections, reduction in surgical complications at an older age.
Step 3: Search for the Evidence
• Philosophy: Find all literature that is relevant and valid
• Eliminate studies with poor design
• Reduce potential for bias
  • Effect size (design effects)
  • Publication (no negative studies)
  • Author (COI)
  • Poor search strategies
Step 3: Search for the Evidence
• Establish inclusion and exclusion criteria
  • Type of study (RCTs, Cohort, Case-Control, Cross sectional)
  • Type of exposure and outcomes
    • Case Definition
    • Exposure Definition
    • Are Outcomes Important (to whom?)
Step 3: Search for the Evidence
• Develop Search Strategy
  • Electronic Databases
    • MEDLINE, EMBASE, Cochrane Library, etc.
    • Search Filter (are they tested and sensitive/specific)
  • Hand searching
  • Unpublished studies
    • Gray literature (conference proceedings, dissertations)
  • Reference lists
  • Personal communication
Step 4: Extract Data
• Apply inclusion and exclusion criteria
• Two stage review (title/abstract; full article)
• Two reviewers
• Rules for resolving disagreements
• Use predetermined forms
• Log reason for exclusion
Step 5: Analyze and Present Results
• Evidence Table
  • Research design
  • Subjects
  • Methods
  • Results
• Qualitative Summary
• Quantitative Summary
  • Heterogeneity
  • Meta-analysis
  • Sensitivity analysis
• Methodological Quality
  • allocation concealment
  • blinding
  • statistical analysis
  • funding/sponsorship
  • population (specificity)
  • intervention (specificity)
  • outcomes (specificity)
Step 6: Interpret and Review Results
• Have all the main outcomes been considered?
• Have data been presented about absolute change as a result of the intervention?
• Have any factors that may limit application been considered?
• Are the results consistent?
• Don’t confuse “no evidence of an effect” with “evidence of no effect”.
Forest Plots
A quick look at meta-analysis
• There’s a label to tell you what the comparison is and what the outcome of interest is.
• At the bottom there’s a horizontal line. This is the scale measuring the treatment effect. Here the outcome is death and towards the left the scale is less than one, meaning the treatment has made death less likely.
• Take care to read what the labels say: things to the left do not always mean the treatment is better than the control.
• The vertical line in the middle is where the treatment and control have the same effect, i.e., there is no difference between the two.
• For each study there is an id.
• The data for each trial are given, divided into the experimental and control groups.
• The % weight given to each study in the pooled analysis is shown.
• The data shown in the graph are also given numerically.
• The label above the graph tells you what statistic has been used.
• Each study is given a blob, placed where the data measure the effect.
• The size of the blob is proportional to the % weight.
• The horizontal line is called a confidence interval and is a measure of how we think the result of this study might vary with the play of chance.
• The wider the horizontal line is, the less confident we are of the observed effect.
The pooled analysis is given a diamond shape, where the widest bit in the middle is located at the calculated best guess (point estimate), and the horizontal width is the confidence interval.
Definition of a 95% confidence interval: if a trial were repeated 100 times and an interval calculated each time, about 95 of those 100 intervals would be expected to contain the true effect.
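The diamond in a forest plot comes from a pooled calculation. The sketch below shows one common approach, a fixed-effect inverse-variance pool of risk ratios on the log scale, and also reports the % weight of each study. The lecture does not say which pooling model its example plot used, so the method, the function name, and the three study results are assumptions for illustration.

```python
from math import exp, log, sqrt


def pool_fixed_effect(studies, z=1.96):
    """Pool risk ratios given as (estimate, ci_lower, ci_upper) tuples.

    Each study's standard error is recovered from the width of its 95% CI
    on the log scale; its weight is the inverse of the variance.
    Returns (pooled_estimate, ci_lower, ci_upper, percent_weights).
    """
    log_estimates, weights = [], []
    for estimate, lower, upper in studies:
        se = (log(upper) - log(lower)) / (2 * z)
        log_estimates.append(log(estimate))
        weights.append(1 / se ** 2)
    total_weight = sum(weights)
    pooled_log = sum(w * e for w, e in zip(weights, log_estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    percent_weights = [100 * w / total_weight for w in weights]
    return (exp(pooled_log),
            exp(pooled_log - z * pooled_se),
            exp(pooled_log + z * pooled_se),
            percent_weights)


# Hypothetical trials: (risk ratio, lower 95% limit, upper 95% limit).
studies = [(0.80, 0.60, 1.07), (0.70, 0.45, 1.09), (0.90, 0.75, 1.08)]
pooled, pooled_lower, pooled_upper, percent_weights = pool_fixed_effect(studies)
print(round(pooled, 2), round(pooled_lower, 2), round(pooled_upper, 2))
print([round(w, 1) for w in percent_weights])
```

The study with the narrowest interval receives the largest weight, and the pooled interval is narrower than that of any single study, which is why the diamond is usually the tightest shape on the plot.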
At the end of the day….
What do we really want
to know?
Can we believe it?
• bias free search & inclusion criteria?
• appraisal of methodology of primary studies?
• consistent results from all primary studies?
  • if not, are the differences sensibly explained?
• are the conclusions supported by the data?
If we believe it, does it apply to our patient?
• Is our patient (or population) so different from those in the primary studies that the results may not apply?
• Consider differences in:
  • time: many things change.
  • culture: both treatments and values of outcomes can be different.
  • stage of illness or prevalence, which can affect results.
We believe it! But….does it matter?
• Is the benefit worthwhile to our patient?
• Ask the patient about cultural values.
• Think about Relative Risk Reduction vs. Absolute Risk to our patient.
• Potential benefit is the absolute risk avoided in our patient = Absolute Risk Reduction (ARR)!
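The difference between relative and absolute risk reduction is easiest to see with numbers. In the sketch below, the same relative risk reduction yields very different absolute benefits depending on the patient's baseline risk. The function name and the risks are hypothetical, and the number needed to treat (NNT = 1 / ARR) is a standard companion measure added here for illustration.

```python
def risk_reduction(control_risk, treated_risk):
    """Return (ARR, RRR, NNT) for risks expressed as proportions."""
    arr = control_risk - treated_risk  # absolute risk reduction
    rrr = arr / control_risk           # relative risk reduction
    nnt = 1 / arr                      # number needed to treat (round up in practice)
    return arr, rrr, nnt


# Hypothetical: the treatment halves risk in both patients.
high_arr, high_rrr, high_nnt = risk_reduction(control_risk=0.40, treated_risk=0.20)
low_arr, low_rrr, low_nnt = risk_reduction(control_risk=0.04, treated_risk=0.02)
print(f"high-risk patient: ARR={high_arr:.2f}, RRR={high_rrr:.0%}, NNT={high_nnt:.0f}")
print(f"low-risk patient:  ARR={low_arr:.2f}, RRR={low_rrr:.0%}, NNT={low_nnt:.0f}")
```

Both patients see a 50% relative reduction, but ten times as many low-risk patients must be treated to prevent one event.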
Is it a systematic review? Does it:
• define a four part (answerable) clinical question?
• combine Randomized Controlled Trials (RCTs)?
• describe PRE-DEFINED search methods?
• describe PRE-DEFINED inclusion criteria?
• describe PRE-DEFINED methodological exclusion criteria?
PICO Practice
Step 1: Key Clinical Question
“What is the effectiveness of semiannual fluoride varnish compared to semiannual fluoride gel in preventing dental caries in permanent teeth among caries-active adults?”
• Population: caries-active adults
• Intervention: semiannual fluoride varnish
• Comparison: semiannual fluoride gel
• Outcome: prevention of dental caries in permanent teeth
Egger et al., 2001
Source of Secondary Information
• Systematic Reviews
  • E.g., Cochrane Collaboration
• Guidelines
  • E.g., National Guidelines Clearinghouse
THE COCHRANE COLLABORATION
• An international organisation that aims to help people make well-informed decisions about healthcare by preparing, maintaining and promoting the accessibility of systematic reviews of the effects of health care interventions.
Cochrane Centres
[Map labels: Canadian, San Francisco, Nordic, German, UK, Dutch, French, Iberoamerican, San Antonio, Italian, Chinese, New England, Brazilian, South African, Australasian]
Cochrane Library
Evaluation of Diagnostic
Tests
Topics
• How do we “know” something.
• What are the elements and structure of scientific thinking.
• Scientific Reasoning
• Facts, Hypotheses, Theories, Paradigms
• Research Designs and Control of Bias
• Clinical Epidemiology
• Sensitivity, Specificity, Predictive Value
• Measurement in Dentistry
• The Research Enterprise
Diagnostic Tests
• Purpose: to increase our certainty about the cause of a patient’s illness
• Common Types:
  • Physical and history findings
  • Laboratory test
  • Radiography
  • “Other” technological findings (pulp tester, etc.)
Examples of Diagnostic Tests in Dentistry
• Caries
  • Visual, radiography, DIFOTI
• Pulpal necrosis
  • Electrical, thermal
• Soft tissue lesions
  • Biopsy, dye
• Periodontitis
  • Future attachment loss, PSR
• Malocclusion
  • Index, study models, ceph
Reduction of Diagnostic Information
• Scales
• Indexes
• Cut Points
• Basic Decision: Treat or No Treatment
Outcomes in Orthodontics
• Malocclusion is not a disease
• Outcomes based on clinician assumptions of patients’ needs/desires
• Many dimensions need to be measured
  • Overjet, overbite, cross bite, etc.
Measurement Issues in Orthodontics
• Index - assign numerical rating
  • Diagnostic (Angle)
  • Epidemiological Index (Summer’s)
  • Treatment need (HLD, Salzman, IOTN)
  • Treatment Outcome (PAR)
Valid and Reliable
Reliable but NOT
valid
NOT reliable or
valid
Reliable and valid
Can’t be valid unless reliable
What is validity in Ortho Index
• Measures dimensions of occlusion that are considered clinically important. These could be based upon:
  • Expert opinion
  • Clinical consequences (disease) or change
  • Patient values and desires
How to assess reliability
• Intra-rater
  • Have the same person rate the “case” more than once.
• Inter-rater
  • Have different people rate the “case”.
• Expressed as measures of rater agreement
  • Nominal (Kappa)
  • Categorical (Percent agreement, weighted Kappa)
  • Continuous (Correlation, ICC)
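For nominal ratings, agreement between two raters is usually summarized with Cohen's kappa, which corrects the observed percent agreement for the agreement expected by chance. The sketch below is a minimal implementation; the function name and the ten ratings are invented.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same cases (nominal scale)."""
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Observed agreement: share of cases where both raters gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed.
    expected = sum((ratings_a.count(c) / n) * (ratings_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)


# Hypothetical: two examiners decide "treat" or "none" for the same 10 cases.
rater_1 = ["treat"] * 5 + ["none"] * 5
rater_2 = ["treat"] * 4 + ["none"] * 5 + ["treat"]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 cases (80%), but half of that agreement would be expected by chance, so kappa is noticeably lower than the raw percent agreement.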
Test Quality
•
•
•
Diagnosis is an imperfect process - all
tests have some inherent inaccuracy
The “correct” diagnosis thus becomes a
probability
Understanding the mathematical
performance of a test improves the
clinician's decision-making process.
118
Measures of the Quality of a
Diagnostic Test
•
•
•
•
•
Sensitivity
Specificity
Accuracy
Predictive Value (positive and negative)
The higher these numbers - the better the
test.
119
the “Gold Standard”
•
•
•
The definitive diagnostic technique
Often expensive, elaborate, or difficult to
perform.
We are always looking for faster, cheaper,
better ways to diagnose disease (and to
determine treatment).
120
Sensitivity
•
•
•
•
The proportion of people with the disease (per the
Gold Standard) who have a positive test result.
Relates Gold Standard to New Test.
A sensitive test rarely misses people with
disease.
Sensitive tests should be selected when there
is an important penalty for missing disease
(e.g., cancer diagnosis)
121
Specificity
•
•
•
The proportion of people without the disease
who test negative.
A specific test will rarely misclassify
people without disease as diseased.
Specific tests are used to “rule in” a
diagnosis that has been suggested by
other tests.
122
Accuracy of a Test
•
•
The overall ability of a test to correctly
classify a patient.
(Sensitivity + Specificity) / 2 (equals overall accuracy, (TP + TN) / N, only when prevalence is 50%)
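A short worked relation may help connect the averaging formula to the usual definition of overall accuracy. The symbols p (prevalence), Se (sensitivity), and Sp (specificity) are notation introduced here for illustration, not taken from the slides.

```latex
\[
\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}
= p \cdot \text{Se} + (1 - p) \cdot \text{Sp}
\]
\[
p = 0.5 \;\Rightarrow\; \text{Accuracy} = \frac{\text{Se} + \text{Sp}}{2}
\]
```

With the PERIOCHECK counts reported later in these slides (TP = 109, TN = 91, N = 238), overall accuracy is 200/238, about 84.0%, while averaging sensitivity and specificity gives about 83.9%. The two agree closely only because prevalence in that sample is near 50%.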
123
Predictive Value
•
•
Positive predictive value is the probability of
disease in a patient with an abnormal test.
Negative predictive value is the probability
of no disease in a patient when the test
result is normal.
124
A new diagnostic test for
periodontal disease
125
“PERIOCHECK®”
•
•
•
•
A new diagnostic assay that the
company claims “predicts” future
periodontal attachment loss (LOA).
Requires a “blood test” of 1 ml of blood
placed into the “Periocheck” machine.
Values of the test range from -5 to +5
“Gold Standard” is actual attachment
loss (measured prospectively).
126
A Validation Study for
PERIOCHECK
• 300 subjects recruited into study
  • 2 edentulous exclusions
  • 8 medical complication exclusions
  • 4 refused upon consent
• 45% African Am
• Mean age 49 ± 15y
• Upon 2 year follow up
  • 48 lost to follow up
  • Final Study Sample
    • 238 (79%)
      • 125 had LOA (52.5%)
      • 113 no LOA (47.5%)
127
Distribution of Baseline
PERIOCHECK values by future
LOA
TN - True Negatives
TP - True Positives
FN - False Negatives
FP - False Positives
Diagnostic Cutpoint ≥ 0
[Figure: overlapping frequency histograms (0 to 35) of Periocheck values (-4 to 4) for people who do NOT develop LOA and people who DO develop LOA, with the TN, FN, FP, and TP regions marked on either side of the cutpoint]
128
Distribution of Baseline
PERIOCHECK values by future
LOA
TN - 91
TP - 109
FN - 16
FP - 22
Diagnostic Cutpoint ≥ 0
[Figure: overlapping frequency histograms (0 to 35) of Periocheck values (-4 to 4) for people who do NOT develop LOA and people who DO develop LOA, with the TN, FN, FP, and TP regions marked on either side of the cutpoint]
129
Gold Standard (eventual LOA)
Periocheck            Disease Present   Disease Absent   Total
Positive (Test ≥ 0)   109 (TP)          22 (FP)
Negative (Test < 0)   16 (FN)           91 (TN)
Total                 125               113              238
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Prevalence = 125/238 = 52%
Quality of Diagnostic Test
•
•
Sensitivity - the proportion of people with
disease who have a positive test.
Specificity - the proportion of people without
a disease who have a negative test
131
Gold Standard (eventual LOA)
Periocheck            Disease Present   Disease Absent   Total
Positive (Test ≥ 0)   109 (TP)          22 (FP)
Negative (Test < 0)   16 (FN)           91 (TN)
Total                 125               113              238
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Sensitivity = 109/125 = 87.2%
Specificity = 91/113 = 80.5%
Prevalence = 125/238 = 52%
Performance related to “Cut Point”
•
•
•
“cut point” is arbitrary and may be changed.
It is a decision point that a clinician may wish to
set for him/herself.
Sensitivity and Specificity are inversely
related to one another and vary with the cut
point
133
Gold Standard (eventual LOA)
Periocheck            Disease Present   Disease Absent   Total
Positive (Test ≥ 0)   109 (TP)          22 (FP)
Negative (Test < 0)   16 (FN)           91 (TN)
Total                 125               113              238
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Sensitivity = 109/125 = 87.2%
Specificity = 91/113 = 80.5%
Prevalence = 125/238 = 52%
Gold Standard (eventual LOA)
Periocheck             Disease Present   Disease Absent   Total
Positive (Test ≥ -2)   124 (TP)          35 (FP)
Negative (Test < -2)   1 (FN)            78 (TN)
Total                  125               113              238
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Sensitivity = 124/125 = 99.2%
Specificity = 78/113 = 69.0%
Prevalence = 125/238 = 52%
What we have so far
• That at the cut point studied (i.e., 0)
  • For every 100 patients without disease we will correctly classify 80 of them. (Specificity)
  • For every 100 patients with disease we will correctly classify 87 of them. (Sensitivity)
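The figures above can be reproduced directly from the 2x2 counts. The following is a minimal sketch; the function names and the dictionary layout are illustrative choices, while the counts are the ones reported on the slides for cut points 0 and -2.

```python
def sensitivity(tp, fn):
    """Proportion of people with disease who test positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of people without disease who test negative."""
    return tn / (tn + fp)

# 2x2 counts from the PERIOCHECK slides, keyed by diagnostic cut point
counts_by_cut_point = {
    0: {"tp": 109, "fn": 16, "fp": 22, "tn": 91},
    -2: {"tp": 124, "fn": 1, "fp": 35, "tn": 78},
}

results = {}
for cut, c in counts_by_cut_point.items():
    results[cut] = (sensitivity(c["tp"], c["fn"]), specificity(c["tn"], c["fp"]))
    sens, spec = results[cut]
    print(f"Cut point {cut:>2}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Lowering the cut point from 0 to -2 raises sensitivity and lowers specificity, which is the trade-off the cut point slides describe.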
136
Relationship of Sensitivity/Specificity
to Cut Point
Cut Point   Sensitivity (%)   Specificity (%)
-3          100               34
0           95                71
.5          90                82
1           83                91
3           55                99
137
ROC Curves
•
•
•
Relates changes in sensitivity and
specificity to changes in cut point.
Provides overall utility of test
Suggests “optimal” cut point
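One way to see how an ROC curve summarizes a test is to build it from a handful of cut points and estimate the area under it. The sketch below uses the five rows of the cut point table from these slides and the trapezoidal rule; adding the end points (0, 0) and (1, 1) is a standard convention assumed here. With so few points the result is only a coarse estimate and need not match any area quoted on the slides.

```python
# (cut point, sensitivity %, specificity %) rows from the cut point table
table = [(-3, 100, 34), (0, 95, 71), (0.5, 90, 82), (1, 83, 91), (3, 55, 99)]

def roc_points(rows):
    """Convert table rows to (1 - specificity, sensitivity) pairs on a 0-1 scale."""
    pts = [((100 - spec) / 100, sens / 100) for _, sens, spec in rows]
    pts += [(0.0, 0.0), (1.0, 1.0)]  # conventional end points of an ROC curve
    return sorted(pts)

def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the sorted points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

points = roc_points(table)
auc = trapezoid_auc(points)
print(f"Approximate area under the ROC curve: {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a perfect test.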
138
[Figure: ROC curve plotting sensitivity (0 to 100) against 1 - specificity (0 to 100), with points labeled by cut point: -2, 0, .5, 1, 1.5, 3]
139
[Figure: ROC curve plotting sensitivity (0 to 100) against 1 - specificity (0 to 100), with points labeled by cut point: -1, 0, .5, 1, 1.5, 2]
140
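[Figure: ROC curve plotting sensitivity (0 to 100) against 1 - specificity (0 to 100); the test's curve has Area = .91, compared with the chance diagonal, Area = .5]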
142
[Figure: ROC curve plotting sensitivity (0 to 100) against 1 - specificity (0 to 100), with the optimal cut point marked]
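The slides do not say which criterion defines the "optimal" cut point. One common choice, assumed here purely for illustration, is Youden's J (sensitivity + specificity - 1), which picks the point on the ROC curve farthest above the chance diagonal. The sketch applies it to the cut point table from these slides.

```python
# (cut point, sensitivity %, specificity %) rows from the cut point table
table = [(-3, 100, 34), (0, 95, 71), (0.5, 90, 82), (1, 83, 91), (3, 55, 99)]

def youden_j(sens_pct, spec_pct):
    """Youden's J on a 0-1 scale: sensitivity + specificity - 1."""
    return (sens_pct + spec_pct - 100) / 100

# Score every cut point and keep the one with the largest J
scores = [(cut, youden_j(sens, spec)) for cut, sens, spec in table]
best_cut, best_j = max(scores, key=lambda pair: pair[1])

for cut, j in scores:
    print(f"Cut point {cut:>4}: J = {j:.2f}")
print(f"Cut point with the largest J: {best_cut}")
```

Youden's J weights false negatives and false positives equally. When missing disease is much costlier than a false alarm (or the reverse), a clinician might reasonably prefer a different cut point.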
143
What we actually get clinically
• People with a “positive” test
  • And we want to know how many really DO have disease
  • Positive Predictive Value - the proportion of people with a positive test who have disease.
• People with a “negative” test
  • And we want to know how many really DO NOT have disease.
  • Negative Predictive Value - the proportion of people with a negative test who do not have disease.
144
Gold Standard (eventual LOA)
Periocheck            Disease Present   Disease Absent   Total
Positive (Test ≥ 0)   109 (TP)          22 (FP)          131
Negative (Test < 0)   16 (FN)           91 (TN)          107
Total                                                    238
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Positive Pred = 109/131 = 83.2%
Negative Pred = 91/107 = 85.0%
Prevalence = 125/238 = 52%
Test performance and prevalence
•
•
Sensitivity and Specificity are stable
properties
PPV and NPV are frequency (Prevalence)
dependent properties
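The dependence of PPV and NPV on prevalence follows from Bayes' rule. The sketch below holds sensitivity and specificity fixed at the values observed at cut point 0 in the PERIOCHECK study and varies only the prevalence; the function name `predictive_values` and the list of prevalences tried are illustrative assumptions.

```python
def predictive_values(sens, spec, prev):
    """Return (PPV, NPV) for a test with the given sensitivity and specificity
    applied to a population with the given disease prevalence (all on a 0-1 scale)."""
    tp = sens * prev              # true positives per person tested
    fp = (1 - spec) * (1 - prev)  # false positives
    fn = (1 - sens) * prev        # false negatives
    tn = spec * (1 - prev)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)

sens, spec = 109 / 125, 91 / 113  # cut point 0 in the PERIOCHECK study

# Prevalence in the study sample reproduces the slide's 83.2% and 85.0%
ppv_study, npv_study = predictive_values(sens, spec, 125 / 238)

# Same test, different prevalences (values chosen only for illustration)
sweep = {p: predictive_values(sens, spec, p) for p in (0.01, 0.11, 0.50, 0.90)}
for p, (ppv, npv) in sweep.items():
    print(f"Prevalence {p:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

As prevalence falls, PPV drops sharply while NPV rises, even though the test itself has not changed.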
146
Gold Standard (eventual LOA)
Periocheck            Disease Present   Disease Absent   Total
Positive (Test ≥ 0)   109 (TP)          22 (FP)          131
Negative (Test < 0)   16 (FN)           91 (TN)          107
Total                 125               113              238
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Positive Pred = 109/131 = 83.2%
Negative Pred = 91/107 = 85.0%
Prevalence = 125/238 = 52%
Gold Standard (eventual LOA)
Periocheck            Disease Present   Disease Absent   Total
Positive (Test ≥ 0)   109 (TP)          194 (FP)         303
Negative (Test < 0)   16 (FN)           806 (TN)         822
Total                 125               1000             1125
[Figure: frequency histogram (0 to 35) of Periocheck values (-4 to 4) with TN, FN, FP, and TP regions marked]
Positive Pred = 109/303 = 36.0%
Negative Pred = 806/822 = 98.1%
Prevalence = 125/1125 = 11%
Remember
•
•
•
Sensitivity and Specificity are stable with
changing prevalence, but will vary inversely
with “cut point”.
PPV/NPV vary by the prevalence of the
population in which the test is administered.
Best to use when uncertainty is high
•
Prevalence close to 50%
149
HIV Example (ELISA)
When used in premarital screenings
• Sensitivity - 98%
• Specificity - 99%
• Prevalence - 250/100,000
• PPV = 20%
• 2 million marriages / year in US
• HIV cases = 5,000
• For every 1000 correctly diagnosed, there will be 4000 false positives.
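The numbers in this example can be checked by expanding the rates into expected counts. The sketch assumes 2 million people are screened, which is the population size implied by the slide's 5,000 HIV cases at a prevalence of 250 per 100,000; the variable names are illustrative.

```python
screened = 2_000_000           # assumption: people screened per year
prevalence = 250 / 100_000
sens, spec = 0.98, 0.99

infected = round(screened * prevalence)          # people with HIV
not_infected = screened - infected

true_pos = round(infected * sens)                # correctly diagnosed
false_pos = round(not_infected * (1 - spec))     # healthy people who test positive

ppv = true_pos / (true_pos + false_pos)
false_pos_per_1000_true = 1000 * false_pos / true_pos

print(f"HIV cases: {infected}")
print(f"True positives: {true_pos}, false positives: {false_pos}")
print(f"PPV: {ppv:.1%}")
print(f"False positives per 1000 correct diagnoses: {false_pos_per_1000_true:.0f}")
```

Under these assumptions the PPV comes out just under 20%, and there are roughly 4,000 false positives for every 1,000 true positives, in line with the rounded figures on the slide.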
150
Research Ethics
• Risk/Benefit Ratio
• Subject safety
• Informed Consent
  • Written Consent.
  • IRB Approved.
  • Full disclosure of Risks
• Privacy and Confidentiality: how the information is protected from unauthorized observation, and how participants are to be notified of any unforeseen findings from the research that they may or may not want to know.
• Adverse events: The investigator must consider how adverse events will be handled; who will provide care for a participant injured in a study and who will pay for that care are important considerations.
• Equipoise: The investigator should be in a state of "equipoise," that is, if a new intervention is being tested against the currently accepted treatment, the investigator should be genuinely uncertain which approach is superior.
151
Ethical Issues in Human
Research
• Autonomy: the obligation on the part of the investigator to respect each participant as a person capable of making an informed decision regarding participation in the research study. The investigator must ensure that the participant has received a full disclosure of the nature of the study, the risks, benefits and alternatives, with an extended opportunity to ask questions.
• Beneficence: refers to the obligation on the part of the investigator to attempt to maximize benefits for the individual participant and/or society, while minimizing risk of harm to the individual. An honest and thorough risk/benefit calculation must be performed.
• Justice: demands equitable selection of participants, i.e., avoiding participant populations that may be unfairly coerced into participating, such as prisoners and institutionalized children.
Tuskegee: Study of syphilis in Blacks, without telling them of their participation. Deception and lack of informed consent used. Individuals followed for 40 years without treatment.
152
Components of Ethical, Valid
Consent
• Disclosure: The potential participant must be informed as fully as possible of the nature and purpose of the research, the procedures to be used, the expected benefits to the participant and/or society, the potential of reasonably foreseeable risks, stresses, and discomforts, and alternatives to participating in the research.
• Understanding: The participant must understand what has been explained and must be given the opportunity to ask questions and have them answered by one of the investigators.
• Voluntariness: The participant's consent to participate in the research must be voluntary, free of any coercion or promises of benefits unlikely to result from participation.
• Competence: The participant must be competent to give consent. If the participant is not competent due to mental status, disease, or emergency, a designated surrogate may provide consent if it is in the participant's best interest to participate.
• Consent: The potential human subject must authorize his/her participation in the research study, preferably in writing, although at times an oral consent or assent may be more appropriate.
153
The End
Questions?
154