Understanding Cutoffs, Norms, Trajectories, and everything else that freaks you out: A Primer for Dummies
G.S. (Jeb) Brown, Ph.D.
Center for Clinical Informatics

But I’m a clinician……
• Statistics aren’t necessary to be a good therapist.
• I was never good at math.
• I’ll let the researchers worry about statistics; I’m interested in helping clients.
• I know when my clients are getting better; I don’t need an outcome measure to tell me this.
• I dissociate when confronted with numbers.

Stats phobia desensitization
• Close your eyes and breathe slowly.
• Visualize a lovely hill… green grass, gentle breeze, warm sun.
• See how the ground rises slowly at first, then steeper; notice the sensuous lines and beautifully rounded summit.
• As you exhale slowly, say to yourself… “I love the Bell Curve”.
• Repeat as needed.

Mantra: I love the Bell Curve
[Figure: bell curve; horizontal axis 20 to 80, vertical axis 0% to 10%]

What’s normal?
• All outcome questionnaires appear to measure a common “factor”… global distress/happiness.
• Like almost every other human trait, misery is normally distributed.
[Figure: distribution of scores from 20 to 80, vertical axis 0% to 100%, with regions labeled “Seriously unhappy”, “Us normal folk”, and “Pathologically happy”]

Calculating clinical cutoff
• Clinical cutoff scores are used to estimate the boundary between “normal” and “clinical”.
• Take another deep breath… here comes a formula……

$$C = \frac{(SD_1)(\mathrm{mean}_2) + (SD_2)(\mathrm{mean}_1)}{SD_1 + SD_2}$$

(Subscripts 1 and 2 denote the two samples, normal and clinical.)
• Thanks to Jacobson & Truax, 1991

Idealized clinical cutoff
[Figure: overlapping bell curves for a Normal Sample and a Clinical Sample, with the clinical cutoff marked between them; horizontal axis 10 to 88, vertical axis 0% to 10%]

Nothing is perfect….
• The Bell Curve is beautiful in the abstract, but reality is messier.
• Many tests have floor and ceiling effects.
• Distributions may depart from normality….

Floor and ceiling effects
• Example: in a non-clinical sample completing the ORS, the mean was 30 and the standard deviation 6.2.
• But wait!
The ORS has a maximum score of 40, only 1.6 standard deviations above the mean.

Example samples (sample samples?)
• Accountable Behavioral Healthcare Alliance (ABHA): Five-county community mental health system in Oregon serving adults and children. ORS administered at every session. Data collected at an estimated 80% of all sessions.
• RFL (Resources for Living): EAP company providing telephonic counseling services to adults. ORS administered telephonically at every session. Approximately 80% of clients have a single session (one call).
• SAIC: Clinics in multiple countries serving children of active duty US military personnel and military contractors.
• Community sample: Small sample of non-clients.

ORS score distributions: 4 clinical & 1 community samples
[Figure: percentage of cases (vertical axis, -5% to 35%) by ORS score (horizontal axis, 0 to 36) for ABHA Adults, ABHA Child & Adolescent, RFL, SAIC, and the Community Sample]
Non normal curves……

Clinical cutoff scores
                              ABHA Adults   ABHA Child & Adolescent   RFL Adults   SAIC Adolescents
Standard deviation, normal    6.8           6.8                       6.8          6.8
Standard deviation, clinical  9.3           7.7                       9            8.2
Mean, normal                  29.7          29.7                      29.7         29.7
Mean, clinical                18.3          21.4                      19.5         25.8
Cutoff                        24.9          25.8                      25.3         27.9

When is change “real”
• All measurement has “error”.
• How do we know if change on a test is simply the result of random error?
• The Reliable Change Index (RCI) is a common metric to determine if the difference between two scores is greater than expected from random error.
• Here we go again…. Jacobson & Truax, 1991

$$RCI = 1.96 \times S_{diff}$$
$$S_{diff} = \sqrt{2\,SE^2}$$
$$SE = SD\,\sqrt{1 - r_{xx}}$$

When is change “real”
• Standard error of measurement (SE) is defined as the standard deviation multiplied by the square root of 1 minus the reliability of the test, as calculated by coefficient alpha, a measure of internal consistency.
• Estimates of coefficient alpha….
ABHA = .89 RFL = .82 SAIC = .97

RCI estimates
                            ABHA Adults   ABHA Child & Adolescent   RFL Adults   SAIC Adolescents
Standard deviation          9.3           7.7                       9            8.2
Reliability                 0.89          0.87                      0.81         0.97
Standard error of measure   3.1           2.8                       3.9          1.4
Sdiff                       4.4           3.9                       5.5          2.0
RCI                         5.6           5.0                       7.1          2.6

Did I mention that all measurement is approximation?

Criteria for recovery
• Jacobson & Truax (1991) proposed two-fold criteria for recovery:
1. Change score exceeds the RCI.
2. Score moves from clinical range to normal range.
• Two problems with these criteria…
1. A substantial percentage of patients start treatment in the normal range.
2. The probability of change exceeding the RCI is a function of severity.

% of cases with “real” change (Change exceeds 5 points)
[Figure: two panels, “Clinical range cases (ORS intake =< 25)” and “All cases”; bars from 0% to 100% for ABHA, RFL, and SAIC showing the share of cases that were Worse, No change, and Improved]

Case mix adjustment
• We can’t evaluate outcomes without answering the question “Compared to what?”
• Differences in severity (intake score) and other client characteristics make it difficult to compare outcomes.
• Case mix adjustment uses statistical methods to adjust for differences in clients when comparing outcomes from one site to another.
• Case mix adjustment is always imperfect.

Regression to the rescue

Regression artifacts are always present in repeated measures
• If a measure has less than perfect test-retest reliability (the correlation between scores at two points in time), it will exhibit regression to the mean.
• Correlations between measures will tend to decrease over time.
• The intake score is almost always the strongest predictor of change.
• Recommended reading: A Primer on Regression Artifacts (Campbell & Kenny, 1999)

Multivariable GLM
• The General Linear Model can incorporate multiple variables to predict a continuous dependent variable, such as a change score.
• Both continuous and categorical variables can be incorporated into the model.
• Multivariate analysis of variance using GLM suggests that most of the variance in change scores is explained by the intake score. Variables such as diagnosis, age and sex account for a relatively small percentage of the variance.

Multivariable Regression

Excel and regression
• The slope and intercept values for a simple linear regression can be calculated in Excel.
• Amazingly, the SLOPE function returns the coefficient for the slope and the INTERCEPT function returns the value for the intercept.
• Different slopes and intercepts can be calculated for subgroups (age group, diagnosis, etc.).

Benchmarking outcomes
• GLM permits “benchmarking” of outcomes by comparing an actual score (or change score) with the predicted score derived from the comparison (or normative) sample.
• The difference between the actual and predicted score is known as the residual score (change score).
• “Benchmark Score” or “Change Index Score” may have more intuitive meaning than “Residual Score”. Or not.

Trajectory of Change Graph
• The Trajectory of Change graph compares a client’s actual scores with the predicted scores.
• GLM is used to predict scores at subsequent measurement intervals (sessions, weeks) using the intake score and any other variables available.
• The distribution of residuals is used to plot percentiles at different intervals.

Graphing scores over time
[Figure: Trajectory of change. Outcome score (0 to 40) by session (intake through session 9); lines for actual score, average score, 25th percentile, 75th percentile, and clinical cutoff]

Graphing 3 way interactions
• Trajectory of Change graphs can be used to display 3 way interactions involving severity, time and a third variable of interest (treatment, age group, diagnosis, etc.).
• Slope and intercept at each measurement point are calculated separately for each grouping of interest.
• Following are two examples from the ABHA data.
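The slope, intercept, and residual calculations described in the preceding slides can be sketched in a few lines of Python. This is a minimal illustration only, not the method used to produce the ABHA graphs: the helper names (fit_line, fit_by_group, residual) are hypothetical, the data are made up, and the model is a simple one-predictor least squares fit (the same quantities Excel's SLOPE and INTERCEPT return) fitted separately for each group at a single measurement point.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_by_group(records):
    """Fit a separate line per group.

    records: iterable of (group, intake_score, later_score) tuples.
    Returns {group: (slope, intercept)}.
    """
    grouped = defaultdict(lambda: ([], []))
    for group, intake, later in records:
        grouped[group][0].append(intake)
        grouped[group][1].append(later)
    return {g: fit_line(xs, ys) for g, (xs, ys) in grouped.items()}


def residual(actual, intake, slope, intercept):
    """Actual score minus the score predicted from the intake score."""
    return actual - (intercept + slope * intake)


# Made-up illustrative scores, NOT taken from the ABHA sample.
records = [
    ("adult", 10, 18), ("adult", 20, 24), ("adult", 30, 30),
    ("child", 10, 20), ("child", 20, 27), ("child", 30, 34),
]
fits = fit_by_group(records)

# Benchmark one hypothetical adult client: intake 20, later score 28.
adult_slope, adult_intercept = fits["adult"]
client_residual = residual(28, 20, adult_slope, adult_intercept)
print(fits, client_residual)
```

A positive residual means the client scored higher than predicted for someone with the same intake score in the same group. Repeating the fit at each session, and taking percentiles of the residuals, would give the kind of curves a Trajectory of Change graph plots.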
Trajectories of change: Adults and Children/adolescents (ABHA)
[Figure: Outcome score (0 to 40) by session (intake through session 9); lines for adults, children/adolescents, 25th percentile, 75th percentile, and clinical cutoff]

Trajectories of change: Before and after feedback (ABHA)
[Figure: Outcome score (0 to 40) by session (intake through session 9); lines for before feedback, after feedback, 25th percentile, 75th percentile, and clinical cutoff]

Norms and Benchmarks
• Regression formulas can be used to create norms for change.
• Outcomes can be “benchmarked” by determining the difference between the expected change (using the regression formula) and the actual change.
• By contributing data to a common data repository, ORS users can ensure that norms are continuously refined and updated.

References
1. Science Cartoons by Georg Meixner. http://www.vias.org/science_cartoons/
2. Jacobson NS, Truax P. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol. 1991;59:12-19.
3. Campbell DT, Kenny DA. A Primer on Regression Artifacts. Guilford Press; 1999.

About the presenter
G.S. (Jeb) Brown is a licensed psychologist with a Ph.D. from Duke University. He served as the Executive Director of the Center for Family Development from 1982 to 1987. He then joined United Behavioral Systems (a United Health Care subsidiary) as the Executive Director for of Utah, a position he held for almost six years. In 1993 he accepted a position as the Corporate Clinical Director for Human Affairs International (HAI), at that time one of the largest managed behavioral healthcare companies in the country. In 1998 he left HAI to found the Center for Clinical Informatics, a consulting firm specializing in helping large organizations implement outcomes management systems.
Client organizations include PacifiCare Behavioral Health/United Behavioral Health, the Department of Mental Health for the District of Columbia, Accountable Behavioral Health Care Alliance, Resources for Living, and assorted treatment programs and centers throughout the world. Dr. Brown continues to work as a part-time psychotherapist at a behavioral health clinic in Salt Lake City, Utah. He does measure his outcomes.
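For readers who want to check the arithmetic, the clinical cutoff and Reliable Change Index formulas from the earlier slides can be sketched in Python. This is a rough illustration under stated assumptions: the function names are hypothetical, the inputs are the ABHA Adults values from the tables, and the multiplier z is left as a parameter. The formula slide uses 1.96, but the tabulated RCI values appear to correspond to a multiplier of roughly 1.28 (about 1.28 times Sdiff in every column); the slides do not say which was intended.

```python
import math


def clinical_cutoff(sd_normal, mean_normal, sd_clinical, mean_clinical):
    """Jacobson & Truax (1991) cutoff: each mean weighted by the other sample's SD."""
    return (sd_normal * mean_clinical + sd_clinical * mean_normal) / (
        sd_normal + sd_clinical
    )


def standard_error(sd, reliability):
    """SE = SD * sqrt(1 - r_xx)."""
    return sd * math.sqrt(1 - reliability)


def s_diff(se):
    """Sdiff = sqrt(2 * SE^2)."""
    return math.sqrt(2 * se ** 2)


def rci(sd, reliability, z=1.96):
    """RCI = z * Sdiff; z is an assumption the caller can change."""
    return z * s_diff(standard_error(sd, reliability))


# ABHA Adults values as given in the tables.
cutoff = clinical_cutoff(sd_normal=6.8, mean_normal=29.7,
                         sd_clinical=9.3, mean_clinical=18.3)
se = standard_error(sd=9.3, reliability=0.89)
sdiff = s_diff(se)
rci_196 = rci(9.3, 0.89)           # multiplier from the formula slide
rci_128 = rci(9.3, 0.89, z=1.28)   # multiplier that matches the table

print(round(cutoff, 1), round(se, 1), round(sdiff, 1), round(rci_128, 1))
```

The cutoff, SE, and Sdiff reproduce the ABHA Adults column of the tables when rounded to one decimal place, while the 1.96 multiplier gives an RCI of about 8.5 points, noticeably stricter than the tabulated 5.6.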