PPTX - Halldale

Comparative Research on Training Simulators in Emergency Medicine:
A Methodological Review
Matt Lineberry, Ph.D.
Research Psychologist, NAWCTSD
matthew.lineberry@navy.mil
Medical Technology, Training, & Treatment (MT3), May 2012
1
Credits and Disclaimers
• Co-authors
– Melissa Walwanis, Senior Research Psychologist, NAWCTSD
– Joseph Reni, Research Psychologist, NAWCTSD
• These are my professional views, not necessarily those of NAWCTSD, NAVMED, etc.
2
Objectives
• Motivate conduct of comparative research in simulation-based training (SBT) for healthcare
• Identify challenges evident from past comparative research
• Promote more optimal research methodologies in future research
3
Cook et al. (2011) meta-analysis in
JAMA:
“…we question the need for further
studies comparing simulation with no
intervention (ie, single-group pretest-posttest studies and comparisons with
no-intervention controls).
…theory-based comparisons between
different technology-enhanced
simulation designs
(simulation vs. simulation studies)
that minimize bias, achieve appropriate
power, and avoid confounding…
are necessary”
Issenberg et al. (2011) research agenda
in SIH:
“…studies that compare simulation
training to traditional training or no
training (as is often the case in control
groups), in which the goal is to justify its
use or prove it can work, do little to
advance the field of human learning
and training.”
Moving forward: comparative research
• How do varying degrees and types of fidelity affect learning?
• Are some simulation approaches or modalities superior to others?
– For what learning objectives? Which learners? Tasks? Etc.
• How do cost and throughput considerations affect the utility of different approaches?
6
Where are we now?
• Searched for peer-reviewed studies comparing training effectiveness of simulation approaches and/or measured practice on human patients for emergency medical skills
• Searched PubMed and CINAHL
– mannequin, manikin, animal, cadaver, simulat*, virtual reality, VR, compar*, versus, and VS
• Exhaustively searched Simulation in Healthcare
• Among identified studies, searched references forward and backward
7
Reviewed studies
• 17 studies met criteria
• Procedure trained:
– Predominantly needle access (7 studies). 4 airway adjunct, 3 TEAM, 2 FAST, etc.
• Simulators compared:
– Predominantly manikins, VR systems, and part-task trainers
8
Reviewed studies
• Design: Almost entirely between-subjects (16 of 17)
• Trainee performance measurement:
– 7 were post-test only; all others included pre-tests
– Most (9 studies) used expert ratings; also: knowledge tests (7), success/failure (6), and objective criteria (5)
– 6 studies tested trainees on actual patients
– 6 tested trainees on one of the simulators used in training
9
Apparent methodological challenges
1. Inherently smaller differences between conditions – and consequently, underpowered designs
2. An understandable desire to “prove the null” – but inappropriate approaches to testing equivalence
3. Difficulty measuring or approximating the ultimate criterion: performance on the job
10
Challenge #1: Detecting “small” differences
• Cook et al. (2011) meta-analysis:
– Differences in outcomes of roughly 0.5-1.2 standard deviations, favoring simulation-based training over no simulation.
• Comparative research should expect smaller differences than these.
• HOWEVER, small differences can have great practical
significance if they…
– correspond to important outcomes
(e.g., morbidity or mortality),
– can be exploited widely, and/or
– can be exploited inexpensively.
11
The power of small differences…
• Physicians Health Study:
– Aspirin trial halted prematurely due to obvious benefit for heart attack reduction
– Effect size: r = .034
– Of 22k participants, 85 fewer heart attacks in the aspirin group
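Recomputing the effect size makes it easier to see how an r of .034 can sit alongside an obvious benefit. The sketch below assumes the commonly cited preliminary counts for the trial (104 heart attacks among 11,037 participants on aspirin, 189 among 11,034 on placebo). Those counts are not on the slide, although they are consistent with its "85 fewer" figure. The function name is illustrative only.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient (Pearson r for a 2x2 table).

    Rows are groups, columns are event / no event:
        a = group 1 events, b = group 1 non-events
        c = group 2 events, d = group 2 non-events
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# ASSUMED counts (not stated on the slide)
aspirin_mi, aspirin_n = 104, 11037
placebo_mi, placebo_n = 189, 11034

r = phi_coefficient(aspirin_mi, aspirin_n - aspirin_mi,
                    placebo_mi, placebo_n - placebo_mi)
fewer_attacks = placebo_mi - aspirin_mi

# The sign of r only reflects how the table is laid out, so report its magnitude.
print(f"|r| = {abs(r):.3f}; fewer heart attacks with aspirin = {fewer_attacks}")
```

Under these assumed counts the magnitude of r rounds to .034, even though the aspirin group had roughly 45% fewer heart attacks than the placebo group.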
12
…and the tyranny of small differences
• Probability to detect differences (power) decreases exponentially as effect size decreases
• We generally can’t control effect sizes.
• Among other things, we can control:
– Sample size
– Reliability of measurement
– Chosen error rates
13
Sample size
• Among reviewed studies, n ranges from 8 to 62; median n = 15.
• If n = 15, α = .05, true difference = 0.2 SDs, and measurement is perfectly reliable, probability of detecting the difference is only 13%
• RECOMMENDATION: Pool resources in multi-site collaborations to achieve needed power to detect effects (and estimate power requirements a priori)
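An a priori power estimate of this kind can be sketched with a normal approximation. The slide does not say whether n = 15 is per group or whether the test is one-sided, so the code assumes 15 trainees per group and a one-sided two-group comparison. Both are assumptions, as are the function names. An exact calculation would use the noncentral t distribution and give a slightly lower figure.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def power_two_group(d, n_per_group, alpha=0.05):
    """Approximate power of a one-sided two-group comparison.

    d is the true standardized mean difference (in SDs).
    ASSUMPTIONS: equal group sizes, normal approximation, one-sided test.
    """
    noncentrality = d * math.sqrt(n_per_group / 2)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - _STD_NORMAL.cdf(z_crit - noncentrality)

def n_per_group_needed(d, power=0.80, alpha=0.05):
    """Smallest per-group n reaching the target power under the same assumptions."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

slide_power = power_two_group(d=0.2, n_per_group=15)
needed = n_per_group_needed(d=0.2)
print(f"power at n = 15 per group: {slide_power:.2f}")
print(f"n per group for 80% power: {needed}")
```

Under these assumptions the approximation gives a power of about 0.14, close to the 13% on the slide. Reaching 80% power for a 0.2 SD difference takes a few hundred trainees per group, which is the practical case for multi-site pooling.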
14
Reliability of measurement
• Potential rater errors are numerous
• Typical statistical estimates can be uninformative (i.e. coefficient alpha, inter-rater correlations)
• If measures are unreliable – and especially if samples are also small – you’ll almost always fail to find differences, whether they exist or not
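One way to see why unreliable measures hide real differences is the attenuation relationship from classical test theory: the observed standardized difference is roughly the true difference multiplied by the square root of the outcome measure's reliability. The formula is not on the slide. The sketch below applies it as an assumption, with hypothetical reliability values.

```python
import math

def attenuated_effect(true_d, reliability):
    """Observed standardized difference under classical test theory.

    ASSUMPTION: measurement error is random and unrelated to training condition,
    so it inflates the observed SD without shifting the group means.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return true_d * math.sqrt(reliability)

# Hypothetical reliabilities applied to the slide's 0.2 SD true difference
for reliability in (1.0, 0.8, 0.6):
    observed = attenuated_effect(0.2, reliability)
    print(f"reliability {reliability:.1f} -> observed d = {observed:.3f}")
```

A true 0.2 SD difference measured at a reliability of 0.6 shows up as roughly 0.15 SD, which shrinks already low power further.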
15
Reliability of measurement
• Among nine studies using expert ratings:
– Only two used multiple raters for all participants
– Six studies did not estimate reliability at all
– One study reported an inter-rater reliability coefficient
– Two studies reported correlations between raters’ scores
– Both approaches make unfounded assumptions
– Ratings were never collected on multiple occasions
16
Reliability of measurement
RECOMMENDATIONS:
1. Use robust measurement protocols – e.g., frame-of-reference rater training, multiple raters
2. For expert ratings, use generalizability theory to estimate and improve reliability
G-theory respects a basic truth:
• “Reliability” is not a single value associated with a measurement tool
• Rather, it depends on how you conduct measurement, who is being measured, the type of comparison for which you use the scores, etc.
17
G-theory process, in a nutshell
1. Collect ratings, using an experimental design to expose sources of error (e.g., have multiple raters give ratings, on multiple occasions)
2. Use ANOVA to estimate magnitude of errors
3. Given results from step 2, forecast what reliability will result from different combinations of raters, occasions, etc.
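The three steps can be sketched for the simplest case: every trainee is scored by every rater (a fully crossed persons-by-raters design), with occasions left out for brevity. The ratings are hypothetical and the function names are illustrative. A real G-study would normally use dedicated software and include more facets.

```python
def variance_components(ratings):
    """Step 2: estimate variance components from a persons x raters table.

    ratings[p][r] is the score rater r gave person p (one score per cell).
    Uses the expected mean squares of a two-way ANOVA without replication.
    """
    n_p, n_r = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n_p for j in range(n_r)]

    ss_person = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_rater = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_resid = ss_total - ss_person - ss_rater

    ms_person = ss_person / (n_p - 1)
    ms_rater = ss_rater / (n_r - 1)
    ms_resid = ss_resid / ((n_p - 1) * (n_r - 1))

    # Negative estimates are truncated at zero, a common convention.
    return {
        "person": max((ms_person - ms_resid) / n_r, 0.0),
        "rater": max((ms_rater - ms_resid) / n_p, 0.0),
        "residual": ms_resid,
    }

def g_coefficient(components, n_raters):
    """Step 3: forecast reliability for relative comparisons with n_raters raters."""
    error = components["residual"] / n_raters
    return components["person"] / (components["person"] + error)

# Step 1: HYPOTHETICAL ratings, three trainees each scored by two raters
ratings = [[1, 2], [3, 3], [5, 7]]
components = variance_components(ratings)
for n in (1, 2, 4):
    print(f"{n} rater(s): forecast G = {g_coefficient(components, n):.3f}")
```

The forecast rises as raters are added because the residual error is averaged over more ratings. Extending the design with an occasions facet follows the same logic with additional variance components.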
18
Weighted scoring
• Two studies used weighting schemes – more points associated with more critical procedural steps
– Can improve both reliability and validity
• RECOMMENDATION: Use task analytic procedures to identify criticality of subtasks; weight scores accordingly
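A criticality-weighted checklist score might look like the sketch below. The step names and weights are hypothetical placeholders, not taken from any reviewed study. In practice they would come from the task analysis.

```python
def weighted_score(performed, weights):
    """Proportion of available criticality points earned.

    performed maps step name -> True/False; weights maps step name -> points.
    Steps missing from `performed` count as not performed.
    """
    total = sum(weights.values())
    earned = sum(points for step, points in weights.items()
                 if performed.get(step, False))
    return earned / total

# HYPOTHETICAL steps and weights (higher = more critical)
weights = {
    "confirm indication": 3,
    "select device size": 1,
    "insert device": 2,
    "verify ventilation": 3,
}
performed = {
    "confirm indication": True,
    "select device size": False,
    "insert device": True,
    "verify ventilation": True,
}

score = weighted_score(performed, weights)
unweighted = sum(performed.values()) / len(performed)
print(f"weighted = {score:.2f}, unweighted = {unweighted:.2f}")
```

Here the trainee missed only a low-criticality step, so the weighted score (8 of 9 points) is higher than the unweighted proportion of steps completed (3 of 4).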
19
Selecting error rates
• Why do we choose p = .05 as the threshold for statistical significance?
20
Relative severity of errors
• Type I error: “Simulator X is more effective than Simulator Y” (but really, they’re equally effective)
– Potential outcome: Largely trivial; both are equally effective, so erroneously favoring one does not affect learning or patient outcomes
• Type II error: “Simulators X and Y are equally effective” (but really, Simulator X is superior)
– Potential outcome: Adverse effects on learning and patient outcomes if Simulator X is consequently underutilized
21
Relative severity of errors
• Type I error (α = .05): “Simulator X is more effective than Simulator Y” (but really, they’re equally effective)
– Potential outcome: Largely trivial; both are equally effective, so erroneously favoring one does not affect learning or patient outcomes
• Type II error (β = 1 − power; e.g., 1 − .80 = .20): “Simulators X and Y are equally effective” (but really, Simulator X is superior)
– Potential outcome: Adverse effects on learning and patient outcomes if Simulator X is consequently underutilized
22
Relative severity of errors
• RECOMMENDATION: Particularly in a new line of research, adopt an alpha level that rationally balances inferential errors according to their severity
Cascio, W. F., & Zedeck, S. (1983). Open a new window in rational research planning: Adjust alpha to maximize statistical power. Personnel Psychology, 36, 517-526.
Murphy, K. (2004). Using power analysis to evaluate and improve research. In S.G. Rogelberg (Ed.), Handbook of research methods in industrial and organizational psychology (Chapter 6, pp. 119-137). Malden, MA: Blackwell.
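Tabulating the Type II error rate that goes with each candidate alpha makes the trade-off concrete. This is only an illustration, not the procedure from the cited sources. It assumes a one-sided two-group comparison under a normal approximation, with a hypothetical effect size and sample size.

```python
import math
from statistics import NormalDist

def type_ii_rate(alpha, d, n_per_group):
    """Approximate beta for a one-sided two-group comparison.

    ASSUMPTIONS: equal group sizes, normal approximation, true difference of d SDs.
    """
    std_normal = NormalDist()
    noncentrality = d * math.sqrt(n_per_group / 2)
    z_crit = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(z_crit - noncentrality)

# HYPOTHETICAL planning values: 0.5 SD difference, 30 trainees per group
candidate_alphas = (0.01, 0.05, 0.10, 0.20)
tradeoff = [(a, type_ii_rate(a, d=0.5, n_per_group=30)) for a in candidate_alphas]

for alpha, beta in tradeoff:
    print(f"alpha = {alpha:.2f} -> beta = {beta:.2f} (beta/alpha = {beta / alpha:.1f})")
```

Reading down the table shows how much Type II risk each relaxation of alpha buys back. The researcher still has to judge which ratio of the two error rates matches their relative severity.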
23
Challenge #2: Proving the null
• Language in studies often reflects desire to assert equivalence
– e.g., different simulators are “reaching parity”
• Standard null hypothesis statistical testing (NHST) does not support this assertion
– Failure to detect effects should prompt
reservation of judgment, not acceptance of
the null hypothesis
24
Which assertion is more bold?
• “Sim X is more effective than Sim Y”
[Figure: number line for the difference between simulators, running from “Y favored” through 0 to “X favored”]
• “Sims X and Y are equally effective”
[Figure: number line for the difference between simulators, running from “Y favored” through 0 to “X favored”]
25
Proving the null
• Possible to prove the null:
– Set a region of practical equivalence around zero
– Evaluate whether all plausible differences (e.g., 95% confidence interval) fall within the region
• RECOMMENDATION:
– Avoid unjustified acceptance of the null
– Use strong tests of equivalence when hoping to assert
equivalence
– Be explicit about what effect size you would consider
practically significant, and why
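The interval-based check described above can be sketched from summary statistics. The code assumes a normal-approximation confidence interval, which is optimistic for samples as small as those reviewed (a t-based interval would be wider). The margin and input values are hypothetical.

```python
from statistics import NormalDist

def equivalence_check(diff, std_error, margin, confidence=0.95):
    """Return (lower, upper, equivalent) for an observed difference.

    `equivalent` is True only if the whole confidence interval lies inside
    the region of practical equivalence (-margin, +margin).
    ASSUMPTION: normal-approximation interval from a difference and its SE.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = diff - z * std_error
    upper = diff + z * std_error
    return lower, upper, (-margin < lower and upper < margin)

# HYPOTHETICAL studies with the same observed difference (0.1 SD) and margin (0.5 SD)
precise = equivalence_check(diff=0.1, std_error=0.1, margin=0.5)
imprecise = equivalence_check(diff=0.1, std_error=0.3, margin=0.5)

print("precise study:   CI = (%.2f, %.2f), equivalent = %s" % precise)
print("imprecise study: CI = (%.2f, %.2f), equivalent = %s" % imprecise)
```

Both hypothetical intervals include zero, so neither study detects a difference, but only the precise one supports a claim of equivalence. The imprecise study calls for reserved judgment.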
26
Challenge #3: Getting to the ultimate criterion
• The goal is not test performance but job performance; “the map is not the terrain”
• Typical to test demonstration of procedures, often on a simulator
– Will trainees perform similarly on actual patients,
under authentic work conditions?
– Do trainees know when to execute the
procedure?
– Are trainees willing to act promptly?
27
e.g.: Roberts et al. (1997)
• No differences detected in rate of successful laryngeal mask airway placement for manikin vs. manikin-plus-live-patient training
– However: Confidence very low, and only increased with live-patient practice
• “…if a nurse does not feel confident enough… the patient will initially receive pocket-mask or bag-mask ventilation, and this is clearly less desirable”
• Issue of willingness to act decisively
28
Criterion relevance
• RECOMMENDATION: Where possible, use criterion testbeds that correspond highly to actual job performance
– Assess performance on human patients/volunteers
– Replicate performance-shaping factors (not just environment)
– Test knowledge of indications and willingness to act
29
What if patients can’t be used?
• Using simulators as the criterion testbed introduces potential biases
– e.g., train on cadaver or manikin; test on a different manikin
30
A partial solution: Crossed-criterion design
• Advantages
– Mitigates bias
– Allows comparison of generalization of learning from each training condition
• Disadvantages
– Precludes pre-testing, if pre-test exposure to each simulator is sufficiently lengthy to derive learning benefits
32
Conclusions
• “The greatest enemy of a good plan is the dream of a perfect plan”
• All previous comparative research is to be lauded for pushing the field forward
• Concrete steps can be taken to maximize the theoretical and practical value of future comparative research
33