Slovarp Critical Appraisal of Research

10/13/2015
CRITICAL APPRAISAL OF RESEARCH
LAURIE SLOVARP, PH.D., CCC-SLP, BCS-S
WHY IS IT IMPORTANT?
 EBP demands it
 Not all conclusions are correct
 Not all published research is good research
 There is a lot of bias out there
 Just because it’s in print (or publicized) doesn’t mean the conclusions are accurate, relevant, or cost effective
 We do important work
 We are accountable to our patients as well as to the funding sources that pay for our treatment
LEVELS OF EVIDENCE
 Level 1a: Meta-analysis of 2 or more well-designed RCTs
 Level 1b: At least 1 well-designed RCT
 Level 2: Controlled trial without randomization; at least 10 participants in each group
 Level 3: Expert consensus opinion in absence of good empirical evidence
 Level 4: Conflicting evidence
Adapted from ASHA
MORE LEVELS OF EVIDENCE
 Level 1: RCTs
 Level 2: Non-randomized (quasi-experimental)
 Level 3: Single subject; case study
 Level 4: Expert opinion
 Based on levels of evidence from ASHA (2004), adapted from Scottish Intercollegiate Guidelines Network
INTERNAL AND EXTERNAL VALIDITY
INTERNAL VALIDITY
 How well the results seen in a study represent a causal relationship between the treatment and the outcome measure
EXTERNAL VALIDITY
 How well the inferences of the study generalize/apply outside of the research paradigm
 Generally, as internal validity increases, external validity decreases
THREATS TO INTERNAL VALIDITY
 Extraneous variables – variables that may compete with the
independent variable (treatment) in explaining the outcome
 Confounding variable – extraneous variable that systematically varies
with or influences the independent variable and also influences the
dependent variable (outcome measure)
 E.g., spontaneous recovery, co-occurring treatment, dosage
 Extraneous variables MUST be dealt with to be confident in conclusions
INTERNAL VALIDITY THREATS BY COCHRANE
 Selection Bias – differences between groups at baseline
 Performance Bias – differences between groups in tx other than the tx of interest (confounding variable)
 e.g., placebo, dosage
 Detection Bias – differences between groups in how outcomes are
determined
 Most vulnerable with subjective outcomes
 Attrition Bias – differences between groups due to data exclusion or drop out
 Reporting Bias – differences between reported and unreported findings
 Other
DEALING WITH INTERNAL THREATS/BIAS
BIAS                CONTROL
Selection bias      Random assignment
Performance bias    Blinding of participants and clinicians
Detection bias      Blinding of outcome assessors
Attrition bias      (This is a difficult one)
Reporting bias      Appropriate outcome measures are used and reported on
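Random assignment, the control listed above for selection bias, can be sketched in a few lines. This is a minimal illustration only: the participant IDs, the seed, and the function name `randomly_assign` are hypothetical, and real trials often use stratified or blocked schemes rather than a simple shuffle.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Shuffle participants and split them into two groups of near-equal size."""
    rng = random.Random(seed)         # seeded generator so the allocation can be reproduced
    shuffled = list(participant_ids)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {"treatment": shuffled[:midpoint], "control": shuffled[midpoint:]}


# Hypothetical participant IDs, for illustration only.
participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]
groups = randomly_assign(participants, seed=2015)
print(groups)
```

Because chance alone decides group membership, baseline differences between the groups are not systematic, which is what selection bias would require.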
THREATS TO EXTERNAL VALIDITY
 Sampling bias
 Does the sample represent the population of interest?
 Variation in how the treatment is implemented
 Can the treatment be delivered similarly to how it was delivered in the study?
 Variation in outcome measures
 Can the results of the study generalize to other types of measures?
 Setting influence
 Does the setting in which the treatment is delivered impact the outcome?
OTHER THREATS TO INTERNAL VALIDITY
 Testing
 Did the pre-test affect the post-test?
 Poor reliability of outcome measure(s)
 Poor validity of outcome measure
 Comparing apples to oranges
 Compensation for participation
 Demoralization
THE PHYSIOTHERAPY EVIDENCE DATABASE (PEDRO)
 www.pedro.org.au
 31,000 RCTs, systematic reviews, EBP guidelines relevant to physiotherapy (dysphagia is in there)
 RCTs are rated for quality on 11 criteria (score 0-10)
THE PEDRO SCALE
1. Eligibility criteria specified (doesn’t count in score)
2. Subjects randomly allocated
3. Group allocation concealed from researcher
4. Groups were similar at baseline
5. Blinding of subjects
6. Blinding of therapists
7. Blinding of assessors
8. At least 1 key measure obtained from at least 85% of subjects
9. All subjects receiving the measure received the specific allocated tx
10. At least 1 statistical comparison provided
11. Some measure of effect size is reported
REVIEWING SYSTEMATIC REVIEWS
 Questions to ask:
1. What question did the systematic review address? – main question should be clearly stated.
2. Is it unlikely that important, relevant studies were missed? – searches should include major bibliographic databases as well as a search of references from relevant studies.
3. Were the criteria used to select articles for inclusion appropriate and clear?
4. Were the included studies sufficiently valid for the type of question asked?
5. Were the results similar from study to study?
From Center for Evidence-Based Medicine
http://www.cebm.net/critical-appraisal/
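As a worked illustration of the PEDro scale listed earlier in this handout, the tally can be sketched as follows. The sketch assumes each of the 11 criteria is judged simply as met or not met and that the first criterion is recorded but left out of the total, as the scale states. The function name `pedro_score` and the example trial are hypothetical.

```python
def pedro_score(criteria_met):
    """Tally a PEDro score from 11 met/not-met judgements, given in scale order."""
    if len(criteria_met) != 11:
        raise ValueError("The PEDro scale has 11 criteria")
    # Criterion 1 (eligibility criteria specified) is recorded but not counted.
    return sum(1 for met in criteria_met[1:] if met)


# Hypothetical appraisal: every criterion met except blinding of
# subjects (item 5) and blinding of therapists (item 6).
example_trial = [True, True, True, True, False, False, True, True, True, True, True]
print(pedro_score(example_trial))  # 8
```

Blinding of subjects and therapists is often impractical in behavioral treatment studies, so a trial like this hypothetical one can be well run and still fall short of the maximum.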
RELIABILITY AND VALIDITY OF MEASURES
RELIABILITY – Is the measure consistent?
 Test-Retest Reliability
 Inter-Rater Reliability
 Intra-Rater Reliability
VALIDITY – Is the measure representative and accurate?
 Content Validity
 Criterion Validity
 Construct Validity
VALIDITY OF MEASURES
 Content validity – How well the measure represents all facets of the construct in question
 Construct validity – How accurately the measure captures the construct in question. Does it really measure what it is supposed to measure?
 Criterion validity – How well the measure estimates a criterion or predicts a criterion
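Inter-rater reliability can be quantified in several ways. One common choice for categorical ratings, not named in these slides and used here as an assumption, is Cohen's kappa, which corrects the observed agreement between two raters for the agreement expected by chance. The ratings below are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who rated the same items categorically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same, non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both raters use one identical category")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 10 recordings (1 = impaired, 0 = not impaired).
rater_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(round(cohen_kappa(rater_1, rater_2), 2))  # 0.6
```

Here the raters agree on 8 of 10 items (0.80), but half of that agreement would be expected by chance, so kappa is 0.60. The same function applied to one rater's ratings on two occasions gives an intra-rater estimate.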
STATISTICAL CONCLUSION VALIDITY
 How accurately does the statistical conclusion reflect the population the sample is intended to represent?
 Statistical errors:
 Type I – concluding there is a cause, when there actually is not
 Type II – concluding there is no cause, when there actually is
SIGNIFICANCE TESTING, EFFECT SIZE, POWER
 P-value = probability of finding between-group differences at least as large as those in the study if chance alone were at work
 Compared against the accepted probability of a Type I error
 Effect size = magnitude of treatment effect (difference between the groups)
 Confidence Interval (CI) = range of values expected to contain the true value at a stated level of confidence
 Usually expressed as 95% CI
 Statistical power = probability that a statistical test will correctly reject the null hypothesis when it is false
 Based on sample size, variance, Type I and Type II error rates, and effect sizes
 .80 power recommended
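The quantities above can be sketched numerically. The example below is an illustration under stated assumptions, not a recommended analysis: it uses Cohen's d with a pooled standard deviation as the effect size, a normal approximation for the confidence interval (a t-based interval is more appropriate for small samples), and a normal approximation for the power of a two-group comparison. The function names and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent groups."""
    n1, n2 = len(a), len(b)
    return math.sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))


def cohens_d(a, b):
    """Effect size: difference between group means in pooled-SD units."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def mean_difference_ci(a, b, confidence=0.95):
    """Approximate CI for the difference in means (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = mean(a) - mean(b)
    se = pooled_sd(a, b) * math.sqrt(1 / len(a) + 1 / len(b))
    return diff - z * se, diff + z * se


def approximate_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test for effect size d."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(d) * math.sqrt(n_per_group / 2) - z_crit)


# Hypothetical outcome scores for a treatment group and a control group.
treatment = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(treatment, control))               # 0.5
print(mean_difference_ci(treatment, control))     # interval spans zero with n = 3 per group
print(round(approximate_power(0.5, 64), 2))       # 0.81
```

With three participants per group the interval is wide and includes zero even though d is 0.5. Under this approximation, about 64 participants per group are needed to reach the recommended .80 power for the same effect size.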
THREATS TO STATISTICAL CONCLUSION VALIDITY
 Low power (small sample size most common reason)
 Unreliable measure
 Restricted range of measure (insensitive)
 Unreliable treatment (non-standardized)
 Heterogeneity of participants
A WORD ABOUT CORRELATION
 Remember, correlation does not indicate causation!!
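The reminder that correlation does not indicate causation can be illustrated with a small simulation. Everything here is simulated and hypothetical: a confounding variable (labelled `severity`) drives both the number of sessions attended and the outcome, while sessions have no effect on the outcome at all. The variable names, seed, and sample size are assumptions chosen for the sketch.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 2000

# The confounder influences both variables; neither variable influences the other.
severity = [rng.gauss(0, 1) for _ in range(n)]
sessions = [z + rng.gauss(0, 1) for z in severity]
outcome = [z + rng.gauss(0, 1) for z in severity]

r_raw = pearson_r(sessions, outcome)

# Remove the confounder's known contribution (possible only because we simulated it).
r_adjusted = pearson_r(
    [s - z for s, z in zip(sessions, severity)],
    [o - z for o, z in zip(outcome, severity)],
)

print(round(r_raw, 2), round(r_adjusted, 2))
```

The raw correlation is moderate even though no causal link was built in, and it falls to roughly zero once the confounder is accounted for.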
KEY QUESTIONS
 Is the study question appropriate or relevant?
 Was the study design appropriate for the question asked?
 Did the design account for most important potential sources of bias?
 Are the sampling methods appropriate and adequate?
 Are the measures appropriate, valid, and reliable?
 Did the study account for confounding variables?
 Aside from the allocated treatment, were the groups treated equally?
 Were appropriate statistics provided and interpreted correctly?
 Does the data justify the conclusions?
 Are there any conflicts of interest?
Young & Solomon, 2009
RESOURCES
 ASHA Practice Portal: http://www.asha.org/Practice-Portal/Speech-LanguagePathologists/
 ASHA web tutorials in assessing evidence: http://www.asha.org/Members/ebp/Assessing-Evidence-Tutorials/
 Cochrane: http://www.cochrane.org/
 Physiotherapy Evidence Database (PEDro): www.pedro.org.au
 Evidence-based review of stroke rehabilitation
 http://www.ebrsr.com./sites/default/files/Chapter15_Dysphagia_FINAL_16ed.pdf
 Basic statistics for clinicians: http://epe.lac-bac.gc.ca/100/201/300/cdn_medical_association/cmaj/series/stats.htm
HELPFUL REFERENCES
 Heller, R., Verma, A., Gemmell, I., Harrison, R., Hart, J., & Edwards, R. (2006). Critical appraisal for public health: A new checklist. Public Health, 92-98.
 Fowkes, F., & Fulton, P. (1991). Critical appraisal of published research: Introductory guidelines. BMJ, 1136-1140.
 Rubin, D. (2011). How to Critically Analyze Psychological Research. In (pp. 1-13). Callaghan, Australia: School of Psychology, The University of Newcastle.
 Salkind, N. (2009). Exploring Research. 7th Ed. Pearson, New Jersey.
 Shadish, Cook, Campbell (2002). Experimental and Quasi-experimental designs. Houghton Mifflin Co., New York.
 Young, J., & Solomon, M. (2009). How to critically appraise an article. Nature Clinical Practice Gastroenterology & Hepatology, 82-91.