Brief_stats_primer_for_dummies

Understanding Cutoffs, Norms,
Trajectories, and everything else that
freaks you out: A Primer for Dummies
G.S. (Jeb) Brown, Ph.D.
Center for Clinical Informatics
But I’m a clinician……
• Statistics aren’t necessary to be a good therapist.
• I was never good at math.
• I’ll let the researchers worry about statistics, I’m
interested in helping clients.
• I know when my clients are getting better; I don’t
need an outcome measure to tell me this.
• I dissociate when confronted with numbers.
Stats phobia desensitization
• Close your eyes and breathe slowly
• Visualize a lovely hill… green grass, gentle
breeze, warm sun
• See how the ground rises slowly at first, then
steeper; notice the sensuous lines and beautifully
rounded summit.
• As you exhale slowly, say to yourself …
“I love the Bell Curve”.
• Repeat as needed.
Mantra: I love the Bell Curve
[Figure: bell curve, with scores from 20 to 80 on the x-axis and percentage of cases from 0% to 10% on the y-axis]
What’s normal?
• All outcome questionnaires appear to measure a
common “factor”… global distress/happiness
• Like almost every other human trait, misery is
normally distributed.
[Figure: distribution of scores (x-axis 20 to 80, y-axis 0% to 100%), with regions labeled “Seriously unhappy”, “Us normal folk”, and “Pathologically happy”]
Calculating clinical cutoff
• Clinical cutoff scores are used to estimate the
boundary between “normal” and “clinical”.
• Take another deep breath… here comes a
formula……
$$C = \frac{SD_1 \cdot \text{mean}_2 + SD_2 \cdot \text{mean}_1}{SD_1 + SD_2}$$
• Thanks to Jacobson & Truax, 1991
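The cutoff formula above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own code; the function and argument names are assumptions, and the example values are the ABHA adult figures reported in the "Clinical cutoff scores" table later in this deck.

```python
def clinical_cutoff(mean_normal, sd_normal, mean_clinical, sd_clinical):
    """Jacobson & Truax (1991) cutoff between a normal and a clinical sample.

    Each mean is weighted by the OTHER sample's standard deviation, so the
    cutoff sits closer to the mean of the sample with the smaller spread.
    """
    numerator = sd_normal * mean_clinical + sd_clinical * mean_normal
    return numerator / (sd_normal + sd_clinical)


# ABHA adult values from the deck's cutoff table.
cutoff = clinical_cutoff(mean_normal=29.7, sd_normal=6.8,
                         mean_clinical=18.3, sd_clinical=9.3)
print(round(cutoff, 1))  # 24.9
```

Because the normal sample has the smaller standard deviation here, the cutoff lands nearer the normal mean than the clinical mean.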
Idealized clinical cutoff
[Figure: distributions for the normal sample and the clinical sample, with the clinical cutoff marked; x-axis scores from 10 to 88, y-axis percentage of cases from 0% to 10%]
Nothing is perfect….
• The Bell Curve is beautiful in the abstract, but
reality is messier.
• Many tests have floor and ceiling effects.
• Distributions may depart from normality….
Floor and ceiling effects
• Example – the mean in a nonclinical sample
completing the ORS was 30 and the standard
deviation 6.2.
• But wait! The ORS has a maximum score of
40, only 1.6 standard deviations above the mean.
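The ceiling arithmetic in this example can be checked with a short sketch. The helper names are assumptions made for illustration, and the tail calculation assumes the scores would follow a perfect bell curve if the scale had no maximum.

```python
import math


def sds_to_ceiling(mean, sd, max_score):
    """How many standard deviations the scale maximum sits above the mean."""
    return (max_score - mean) / sd


def normal_tail_above(z):
    """Fraction of a normal distribution lying above z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = sds_to_ceiling(mean=30.0, sd=6.2, max_score=40.0)
print(round(z, 1))  # 1.6

# Share of a truly normal distribution that would fall above the ceiling.
print(round(normal_tail_above(z), 3))
```

Under that normality assumption, roughly one respondent in twenty would "want" to score above the maximum, which is why the observed distribution piles up at the top.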
Example samples (sample samples?)
• Accountable Behavioral Healthcare Alliance (ABHA) – Five-county
community mental health system in Oregon serving adults and
children. ORS administered at every session. Data collected at an
estimated 80% of all sessions.
• RFL (Resources for Living) – EAP company providing telephonic
counseling services to adults. ORS administered telephonically at
every session. Approximately 80% of clients have a single session
(one call).
• SAIC – Clinics in multiple countries serving children of active duty
US military personnel and military contractors.
• Community sample – Small sample of non-clients.
ORS score distributions
4 clinical & 1 community samples
[Figure: ORS score distributions for ABHA Adults, ABHA Child & Adolescent, RFL, SAIC, and the Community Sample; x-axis ORS score from 0 to 36, y-axis percentage of cases from 0% to 35%]
Non normal curves……
Clinical cutoff scores
                              ABHA     ABHA Child &   RFL      SAIC
                              Adults   Adolescent     Adults   Adolescents
Standard deviation, normal    6.8      6.8            6.8      6.8
Standard deviation, clinical  9.3      7.7            9        8.2
Mean, normal                  29.7     29.7           29.7     29.7
Mean, clinical                18.3     21.4           19.5     25.8
Cutoff                        24.9     25.8           25.3     27.9
When is change “real”
• All measurement has “error”
• How do we know if change on a test is simply the
result of random error?
• The Reliable Change Index is a common metric
to determine if the difference between two scores
is greater than expected from random error.
• Here we go again….Jacobson & Truax, 1991
$$RCI = 1.96 \times S_{diff}$$
$$S_{diff} = \sqrt{2\,SE^2}$$
$$SE = SD\,\sqrt{1 - r_{xx}}$$
When is change “real”
• Standard error of measurement (SE) is defined as
the standard deviation multiplied by the square
root of 1 minus the reliability of the test, as
estimated by coefficient alpha, a measure of
internal consistency.
• Estimates of coefficient alpha….
ABHA = .89
RFL = .82
SAIC = .97
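The three steps (SE, then Sdiff, then RCI) can be chained as in the sketch below. The function names are assumptions for illustration. The multiplier is left as a parameter `z`: the Jacobson & Truax formula uses 1.96, while the RCI row of this deck's "RCI estimates" table appears to match a multiplier closer to 1.28, so both are shown.

```python
import math


def standard_error(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def s_diff(se):
    """Standard error of the difference between two scores: sqrt(2 * SE^2)."""
    return math.sqrt(2.0 * se ** 2)


def reliable_change_index(sd, reliability, z=1.96):
    """Smallest change unlikely to be due to measurement error alone."""
    return z * s_diff(standard_error(sd, reliability))


# ABHA adult values from the deck: SD = 9.3, coefficient alpha = .89
se = standard_error(9.3, 0.89)
print(round(se, 1))          # 3.1
print(round(s_diff(se), 1))  # 4.4
print(round(reliable_change_index(9.3, 0.89), 2))          # z = 1.96
print(round(reliable_change_index(9.3, 0.89, z=1.28), 1))  # z = 1.28 (assumed)
```

A more reliable test (higher alpha) shrinks SE and therefore the RCI, which is why the SAIC estimate is so much smaller than the others.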
RCI estimates
                              ABHA     ABHA Child &   RFL      SAIC
                              Adults   Adolescent     Adults   Adolescents
Standard deviation            9.3      7.7            9        8.2
Reliability                   0.89     0.87           0.81     0.97
Standard error of measure     3.1      2.8            3.9      1.4
Sdiff                         4.4      3.9            5.5      2.0
RCI                           5.6      5.0            7.1      2.6
Did I mention that all measurement is approximation?
Criteria for recovery
• Jacobson & Truax (1991) proposed twofold
criteria for recovery
1. Change score exceeds the RCI
2. Score moves from clinical range to normal
range.
• Two problems with these criteria…
1. A substantial percentage of patients start
treatment in the normal range
2. The probability of change exceeding the RCI
is a function of severity
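The two criteria can be written as a small classifier. This is a sketch under stated assumptions: higher scores mean less distress (as on the ORS), "clinical range" means an intake score at or below the cutoff, and the four labels are illustrative choices rather than terms defined by Jacobson & Truax.

```python
def classify_change(intake, final, cutoff, rci):
    """Label one case using the RCI and the clinical cutoff.

    Assumes higher scores are better and that a score at or below
    the cutoff is in the clinical range.
    """
    change = final - intake
    if change > rci:
        # Recovery needs BOTH reliable change and a move across the cutoff.
        if intake <= cutoff < final:
            return "recovered"
        return "improved"
    if change < -rci:
        return "worse"
    return "no change"


# Hypothetical cases with cutoff = 25 and RCI = 5
print(classify_change(18, 30, cutoff=25, rci=5))  # recovered
print(classify_change(28, 35, cutoff=25, rci=5))  # improved
print(classify_change(18, 22, cutoff=25, rci=5))  # no change
```

The second case illustrates the first problem noted above: a client who starts in the normal range can change reliably but can never meet the recovery criteria.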
% of cases with “real” change
(Change exceeds 5 points)
[Figure: two stacked bar charts showing the percentage of cases that were Worse, No change, or Improved (y-axis 0% to 100%) for ABHA, RFL, and SAIC. Panels: “Clinical range cases (ORS intake =< 25)” and “All cases”.]
Case mix adjustment
• We can’t evaluate outcomes without answering
the question “Compared to what?”
• Differences in severity (intake score) and other
client characteristics make it difficult to compare
outcomes
• Case mix adjustment uses statistical methods to
adjust for differences in clients when comparing
outcomes from one site to another
• Case mix adjustment is always imperfect.
Regression to the rescue
Regression artifacts are always
present in repeated measures
• If a measure's test-retest reliability (the
correlation between scores at two points in time)
is less than perfect, it will exhibit regression
to the mean.
• Correlations between measures will tend to
decrease over time.
• The intake score is almost always the strongest
predictor of change.
• Recommended reading – A Primer on Regression
Artifacts (Campbell & Kenny, 1999)
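Regression to the mean can be illustrated with one line of arithmetic. The sketch below assumes the population mean and standard deviation are the same at both time points and that no treatment effect is present; the function name and all numbers are hypothetical.

```python
def expected_retest(intake, population_mean, retest_r):
    """Expected second score given the first, with no real change.

    With equal means and SDs at both time points, the expected retest
    score is pulled toward the mean by a factor of (1 - retest_r).
    """
    return population_mean + retest_r * (intake - population_mean)


# Hypothetical sample: mean 20, test-retest correlation 0.5
for intake in (10, 20, 30):
    retest = expected_retest(intake, population_mean=20, retest_r=0.5)
    print(intake, retest, retest - intake)
```

Low intake scores are expected to rise and high ones to fall even without treatment, which is one reason the intake score is such a strong predictor of change.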
Multivariable GLM
• The General Linear Model can incorporate
multiple variables to predict a continuous
dependent variable, such as a change score.
• Both continuous and categorical variables can be
incorporated into the model.
• Multivariate analysis of variance using GLM
suggests that most of the variance in change scores
is explained by the intake score. Variables such as
diagnosis, age and sex account for a relatively
small percentage of the variance.
Multivariable Regression
Excel and regression
• The slope and intercept values for a simple linear
regression can be calculated in Excel.
• Amazingly, the slope function returns the
coefficient for the slope and the intercept function
returns the value for the intercept.
• Different slopes and intercepts can be calculated
for subgroups (age group, diagnosis, etc).
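For readers without Excel, the same least-squares slope and intercept can be computed by hand. The sketch below mirrors what Excel's SLOPE and INTERCEPT functions return for a simple linear regression; the subgroup names and the data are hypothetical.

```python
from statistics import mean


def slope_intercept(xs, ys):
    """Least-squares slope and intercept for y predicted from x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: (intake scores, change scores) for two subgroups
groups = {
    "adults": ([10, 15, 20, 25, 30], [10, 8, 5, 3, 0]),
    "children": ([12, 18, 24, 30], [9, 6, 3, 0]),
}

# One regression line per subgroup
for name, (intakes, changes) in groups.items():
    slope, intercept = slope_intercept(intakes, changes)
    print(name, round(slope, 2), round(intercept, 2))
```

The negative slopes in this made-up data reflect the usual pattern: the lower the intake score, the larger the expected change.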
Benchmarking outcomes
• GLM permits “benchmarking” of outcomes by
comparing an actual score (or change score) with
the predicted score derived from the comparison
(or normative) sample.
• The difference between actual and predicted score
is known as the residual score (change score)
• “Benchmark Score” or “Change Index Score”
may have more intuitive meaning than
“Residual Score”. Or not.
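A residual (benchmark) score is simply actual minus predicted. The sketch below assumes a simple linear regression of change score on intake score has already been fitted on a normative sample; the slope, intercept, and caseload values are hypothetical.

```python
def predicted_change(intake, slope, intercept):
    """Change expected for this intake score, per the normative regression."""
    return intercept + slope * intake


def residual_score(actual_change, intake, slope, intercept):
    """Actual minus predicted change; positive means better than expected."""
    return actual_change - predicted_change(intake, slope, intercept)


# Hypothetical normative regression: change = 15 - 0.5 * intake
SLOPE, INTERCEPT = -0.5, 15.0

# Hypothetical caseload: (intake score, actual change score)
caseload = [(18, 9), (24, 2), (30, 1)]
residuals = [residual_score(change, intake, SLOPE, INTERCEPT)
             for intake, change in caseload]
print(residuals)                        # [3.0, -1.0, 1.0]
print(sum(residuals) / len(residuals))  # 1.0
```

Averaging residuals across a caseload gives a severity-adjusted comparison with the normative sample, which is the point of benchmarking.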
Trajectory of Change Graph
• The Trajectory of Change graph compares a
client’s actual scores to the predicted scores.
• GLM is used to predict scores at subsequent
measurement intervals (sessions, weeks) using
intake score and any other variables available.
• The distribution of residuals is used to plot
percentiles at different intervals.
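One way to build such a trajectory is sketched below: for each session, regress the normative sample's session score on intake score, predict the client's score, and add the 25th and 75th percentiles of the residuals to form a band. This uses simple regression on intake score only, which is a simplifying assumption; the data and names are hypothetical.

```python
from statistics import mean, quantiles


def fit_line(xs, ys):
    """Least-squares slope and intercept for y predicted from x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def expected_band(client_intake, norm_intakes, norm_scores):
    """Return (25th percentile, predicted, 75th percentile) for one session."""
    slope, intercept = fit_line(norm_intakes, norm_scores)
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(norm_intakes, norm_scores)]
    q1, _median, q3 = quantiles(residuals, n=4)
    predicted = intercept + slope * client_intake
    return predicted + q1, predicted, predicted + q3


# Hypothetical normative sample: intake scores and scores at sessions 2 and 3
NORM_INTAKES = [10, 15, 20, 25, 30]
NORM_SCORES_BY_SESSION = {2: [14, 16, 24, 26, 34], 3: [16, 18, 26, 28, 36]}

# Expected trajectory for a client with an intake score of 18
trajectory = {session: expected_band(18, NORM_INTAKES, scores)
              for session, scores in NORM_SCORES_BY_SESSION.items()}
for session, (low, predicted, high) in trajectory.items():
    print(session, round(low, 1), round(predicted, 1), round(high, 1))
```

Plotting the client's actual scores against these three lines shows at a glance whether the client is tracking above, within, or below the expected band.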
Graphing scores over time
[Figure: Trajectory of change. Outcome score (y-axis, 0 to 40) by session (x-axis, intake through session 9), with lines for actual score, average score, 25th percentile, 75th percentile, and clinical cutoff.]
Graphing 3 way interactions
• Trajectory of Change graphs can be used to
display 3 way interactions involving severity,
time and a third variable of interest (treatment,
age group, diagnosis, etc.)
• Slope and intercept at each measurement point
are calculated separately for each grouping of
interest.
• Following are two examples from the ABHA
data.
Trajectories of change
Adults and Children/adolescents (ABHA)
[Figure: outcome score (y-axis, 0 to 40) by session (x-axis, intake through session 9), with lines for children/adolescents, adults, 25th percentile, 75th percentile, and clinical cutoff.]
Trajectories of change
Before and after feedback (ABHA)
[Figure: outcome score (y-axis, 0 to 40) by session (x-axis, intake through session 9), with lines for before feedback, after feedback, 25th percentile, 75th percentile, and clinical cutoff.]
Norms and Benchmarks
• Regression formulas can be used to create norms
for change.
• Outcomes can be “benchmarked” by determining
the difference between the expected change
(using the regression formula) and the actual
change.
• By contributing data to a common data repository,
ORS users can ensure that norms are continuously
refined and updated.
References
1. Science Cartoons by Georg Meixner. http://www.vias.org/science_cartoons/
2. Jacobson NS & Truax P. Clinical significance: a statistical approach
to defining meaningful change in psychotherapy research. J Consult
Clin Psychol. 1991;59:12-19.
3. Campbell DT & Kenny DA. A Primer on Regression Artifacts. The
Guilford Press, 1999.
About the presenter
G.S. (Jeb) Brown is a licensed psychologist with a Ph.D. from Duke
University. He served as the Executive Director of the Center for Family
Development from 1982 to 1987. He then joined United Behavioral
Systems (a United Health Care subsidiary) as the Executive Director for
of Utah, a position he held for almost six years. In 1993 he accepted a
position as the Corporate Clinical Director for Human Affairs
International (HAI), at that time one of the largest managed behavioral
healthcare companies in the country.
In 1998 he left HAI to found the Center for Clinical Informatics, a
consulting firm specializing in helping large organizations implement
outcomes management systems. Client organizations include PacifiCare
Behavioral Health/ United Behavioral Health, Department of Mental
Health for the District of Columbia, Accountable Behavioral Health Care
Alliance, Resources for Living and assorted treatment programs and
centers throughout the world.
Dr. Brown continues to work as a part-time psychotherapist at a
behavioral health clinic in Salt Lake City, Utah. He does measure his
outcomes.