PROMIS DEVELOPMENT METHODS, ANALYSES AND APPLICATIONS

Dennis A. Revicki, Ph.D.
Center for Health Outcomes Research,
United BioSource Corporation, Bethesda, Maryland, USA
Presented at the Patient-Reported Outcomes Measurement Information System
(PROMIS): A Resource for Clinical & Health Services Research, Academy Health
Annual Research Meeting, Orlando, Florida, June 3, 2007
OVERVIEW

Development of PROMIS item banks

Psychometric analysis of item bank data

Clinical and health services research applications
GOAL FOR PROMIS

Improve assessment of self-reported symptoms and domains of
health-related quality of life for application across a wide range of
chronic diseases

Develop and test a large bank of items for measuring PROs

Develop computer-adaptive testing (CAT) for efficient assessment of
PROs

Create a publicly available, flexible, and sustainable system giving
researchers access to item banks and CAT tools
PROMIS DOMAIN HIERARCHY
Self-reported Health
  Physical Health
    Function/Disability
      Upper Extremities: grip, buttons, etc. (dexterity)
      Lower Extremities: walking, arising, etc. (mobility)
      Central: neck and back (twisting, bending, etc.)
      Activities: IADL (e.g., errands)
    Symptoms
      Pain
      Fatigue
      Sleep/Wake Function
      Sexual Function
      Other
    Satisfaction
  Mental Health
    Emotional Distress
      Anxiety
      Depression
      Anger/Aggression
      Substance Abuse
    Cognitive Function
    Negative Impacts of Illness / Positive Impacts of Illness
      • Self Concept
      • Stress Response
      • Spirituality/Meaning
      • Social Impact
    Positive Psychological Functioning
      Meaning and Coherence (spirituality)
      Mastery and Control (self-efficacy)
      Subjective Well-Being (positive affect)
    Satisfaction
  Social Health
    Role Participation
      Performance
      Satisfaction
    Social Support
    Satisfaction
[Figure: PROMIS item bank development flow. Items from existing instruments (A, B, C) and new items form the item pool. The pool undergoes content expert review, focus groups, cognitive testing, and secondary data analysis. The resulting questionnaire is administered to a large representative sample and analyzed with item response theory (IRT); the plots show probability of response versus theta and information versus theta. The outcome is the item bank (IRT-calibrated items reviewed for reliability, validity, and sensitivity), which supports short form instruments and CAT.]
ITEM BANKS
An item bank comprises a large collection of items
measuring a single domain, e.g., pain…
[Figure: pain continuum running from no pain through mild, moderate, and severe pain to extreme pain, covered by the Pain Item Bank (Item 1, Item 2, ..., Item 9, ..., Item n).]
These items are reviewed by experts, patients, and methodologists to make sure:
• Item phrasing is clear and understandable for those with low literacy
• Item content is related to pain assessment and appropriate for target population
• Item adds precision for measuring different levels of pain
STEPS FOR PROMIS ITEM BANKS
                                Evaluation                Criteria
Item Development                1 Qualitative Review      Focus groups and cognitive interviews
Skewness                        2 Frequency Analysis      < 95% response in one category
Unidimensionality               3 CFA                     > .60 factor loading
Local Independence              4 Residual Correlations   < .10 residual correlation
IRT Analysis                    5 Item Response Curves    monotonic
Differential Item Functioning   6 Regression              R² < .03 DIF
Item Parameter Stability        7 Exclusion of Items      ?
Item Fit                        8 Fit Tests               p > .05 chi-square test
                                9 Simulation Studies      -
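As a rough illustration of how the first three numeric criteria could be applied to a single item, here is a minimal Python sketch. The function names, the input format, and the example values are hypothetical and are not taken from the PROMIS analysis plan. The sketch also assumes that the factor loading and the largest residual correlation have already been estimated elsewhere.

```python
from collections import Counter


def max_category_share(responses):
    """Share of responses that fall in the single most-used category."""
    counts = Counter(responses)
    return max(counts.values()) / len(responses)


def screen_item(responses, factor_loading, max_residual_corr):
    """Apply the frequency, CFA, and residual-correlation thresholds.

    Thresholds follow the criteria listed above; everything else
    (names, input shapes) is an assumption made for illustration.
    """
    return {
        # Skewness: no single category should hold 95% or more of responses.
        "skewness_ok": max_category_share(responses) < 0.95,
        # Unidimensionality: factor loading should exceed .60.
        "unidimensional_ok": factor_loading > 0.60,
        # Local independence: residual correlations should stay below .10.
        "local_independence_ok": abs(max_residual_corr) < 0.10,
    }


# Hypothetical item: responses spread evenly over five categories.
balanced = screen_item([1, 2, 3, 4, 5] * 20, factor_loading=0.72,
                       max_residual_corr=0.04)

# Hypothetical item: 96% of respondents chose the same category.
skewed = screen_item([1] * 96 + [2] * 4, factor_loading=0.72,
                     max_residual_corr=0.04)
```

An item that fails any check would be flagged for review rather than dropped automatically, since the later steps (DIF, item fit, simulation) also inform the decision.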
ITEM RESPONSE THEORY MODELS

IRT models enable reliable and precise measurement of
PROs
– Fewer items needed for equal precision
– Makes assessment briefer

More precision gained by adding items
– Reducing error and sample size requirements

Error is understood at the individual level
– Allowing practical individual assessment
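The link between items, precision, and individual-level error can be sketched in a few lines of Python. This is a simplified illustration that assumes a two-parameter logistic (2PL) model for yes/no items. The deck does not say which IRT model PROMIS uses, and the item parameters below are hypothetical. Under this model each item contributes information at a given theta, and the standard error at that theta is one over the square root of the summed information.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (a: discrimination, b: location)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information contributed by one 2PL item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def standard_error(theta, items):
    """Standard error of the theta estimate for a set of (a, b) items."""
    total_information = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total_information)


# Hypothetical (discrimination, location) parameters for a five-item bank.
bank = [(1.5, -1.0), (2.0, 0.0), (1.8, 0.5), (2.2, 1.0), (1.2, 2.0)]

se_two_items = standard_error(0.0, bank[:2])   # only the first two items
se_five_items = standard_error(0.0, bank)      # all five items
```

Because information adds across items, `se_five_items` is smaller than `se_two_items`, and the error is specific to the person's theta rather than a single scale-wide reliability value.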
WHICH RANGE OF MEASUREMENT?
[Figure: item information (0 to 10) plotted against theta (-4.00 to 2.00, running from disability to physical function) for items with the stems "Are you able to ..." and "Does your health now limit you in ...". Items shown: sit on the edge of the bed; climb up several stairs; usual physical activities; heavy work around the house; strenuous activities.]
Response options for "Does your health now limit you in ...":
5 = Not at all
4 = Very little
3 = Somewhat
2 = Quite a lot
1 = Cannot do
Response options for "Are you able to ...":
5 = Without any difficulty
4 = With a little difficulty
3 = With some difficulty
2 = With much difficulty
1 = Unable to do
PEOPLE AND ITEMS DISTRIBUTED ON
THE SAME METRIC: FATIGUE
[Figure: person-item map on a shared fatigue metric centered at 0.0. People are ordered from more fatigue to less fatigue, and items from more likely to be endorsed to less likely to be endorsed. A ceiling effect is marked.]
BANK PRECISION LEVEL ALONG THE
PAIN CONTINUUM
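[Figure: standard error (SE, 0 to 4) plotted along the pain score continuum (0 to 100, anchored by minimal/no pain and severe pain), with a second axis running 0 to 40. Response categories shown: Not at all, A little bit, Somewhat, Quite a bit, Very much. SE = 0.5 at scores of 10.7 (scaled score = -4.50) and 71.8 (scaled score = 4.05). Average self-reported pain = 60.43 (scaled score = 2.46). Annotated percentages: 0.3% and 28.3%.]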
THE ADVANTAGES OF SHORT-FORMS
DEVELOPED FROM PROMIS ITEM BANKS

Select a set of items that are matched to the severity level of
the target population.

All scales built from the same item bank are linked on a common
metric.
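One way to picture severity-matched item selection is to rank bank items by how much information they provide at the severity level typical of the target population. The Python sketch below assumes a 2PL model and a hypothetical five-item bank. It is not the procedure PROMIS used to build its short forms, and the function and item names are invented for illustration.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item (a: discrimination, b: location)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_short_form(bank, target_theta, n_items):
    """Pick the n_items most informative items at target_theta."""
    ranked = sorted(
        bank,
        key=lambda name: item_information(target_theta, *bank[name]),
        reverse=True,
    )
    return ranked[:n_items]


# Hypothetical bank: item name -> (discrimination, location).
bank = {
    "item_1": (1.6, -2.0),
    "item_2": (1.6, -1.0),
    "item_3": (1.6, 0.0),
    "item_4": (1.6, 1.0),
    "item_5": (1.6, 2.0),
}

# Two-item forms for a high-severity and a low-severity population.
severe_form = select_short_form(bank, target_theta=1.5, n_items=2)
mild_form = select_short_form(bank, target_theta=-1.5, n_items=2)
```

Both forms draw on the same calibrated parameters, so scores from `severe_form` and `mild_form` are expressed on the same theta scale even though the forms share no items.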
FATIGUE MEASURE AND STANDARD
ERROR COMPARISON BY TEST LENGTH
[Figure: standard error (0.0 to 1.0) plotted against the fatigue measure (-4 to 4) for five instruments: 5 Item CAT, 10 Item CAT, 72 Item Bank, 6 Item SF, and 13 Item Scale.]
THE ADVANTAGES OF CAT-BASED
ASSESSMENT
1. Provide an accurate estimate of a person’s score with the
   minimal number of questions.
   • Questions are selected to match the health status of the
     respondent.
2. CAT minimizes floor and ceiling effects.
   • People near the top or bottom of a scale will receive items
     that are designed to assess their health status.
[Figure sequence: CAT illustration using the Emotional Distress item bank (validated and IRT-calibrated emotional distress items). Each frame plots response probability (0.0 to 1.0) against emotional distress (theta from -3.00 to 3.00, labeled severe, high, moderate, low, very low). Response options: All of the time, Most of the time, Some of the time, Little of the time, None of the time.
1. "How often did you feel nervous?" Selected response: Some of the time.
2. "How often did you feel hopeless?" Selected response: Some of the time.
3. "How often did you feel worthless?" Selected response: Little of the time.
With each response the CAT targets in on the emotional distress score.]
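The adaptive sequence illustrated above can be sketched as a small loop: pick the most informative remaining item at the current score estimate, record the answer, and update the estimate. The Python sketch below is a simplified illustration, not the PROMIS CAT engine. It assumes yes/no items under a 2PL model (the emotional distress items shown have five response options), a standard normal prior, a grid-based posterior mean estimate, and a hypothetical item bank and stopping rule.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # theta grid from -4.0 to 4.0


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def posterior_mean_sd(answers):
    """Posterior mean and SD of theta under a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized normal prior
        for (a, b), y in answers:
            p = p_endorse(t, a, b)
            w *= p if y == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, respond, max_items=5, se_target=0.3):
    """Administer items until the SE target or the item limit is reached."""
    remaining = dict(bank)
    answers, asked = [], []
    theta, se = 0.0, 1.0  # start at the prior mean and SD
    while remaining and len(asked) < max_items and se > se_target:
        # Choose the item that is most informative at the current estimate.
        name = max(remaining,
                   key=lambda n: item_information(theta, *remaining[n]))
        params = remaining.pop(name)
        answers.append((params, respond(name, params)))
        asked.append(name)
        theta, se = posterior_mean_sd(answers)
    return theta, se, asked


# Hypothetical bank: item name -> (discrimination, location).
bank = {"q1": (1.8, -2.0), "q2": (1.8, -1.0), "q3": (1.8, 0.0),
        "q4": (1.8, 1.0), "q5": (1.8, 2.0), "q6": (1.8, 0.5),
        "q7": (1.8, 1.5)}

# Simulated respondent who endorses every item located below theta = 1.2.
true_theta = 1.2
theta_hat, se, asked = run_cat(
    bank, lambda name, params: 1 if true_theta > params[1] else 0)
```

The first item asked is the one located at the prior mean; later items move toward the respondent's level, which is why a CAT can avoid items that are far too easy or too hard for a given person.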
CLINICAL AND HEALTH SERVICES
RESEARCH APPLICATIONS

Brief, psychometrically sound short-form or CAT instruments
– Pain, fatigue, physical function, emotional distress, social
activities/function

Efficient collection of health outcomes data in clinical trials
– Comparing health interventions and strategies
– Comparing pharmaceutical treatments

Monitoring the health outcomes of populations
– Health plan members
– Medicare beneficiaries
– US general population (e.g., MEPS)
TREATMENT COMPARISONS AND EFFECT SIZE
ESTIMATES FOR BASELINE TO ENDPOINT CHANGES
FOR DEPRESSION SEVERITY SCALES FOR
PAROXETINE AND PLACEBO GROUPS
                Least Square Mean Change
Score           Paroxetine    Placebo     F-Value   P-Value   Effect Size
HDRS Total      -11.4407      -8.375      7.45      0.007     0.43
MADRS Total     -13.617       -8.793      11.93     0.001     0.54
DS-1 T-Score    -17.333       -12.171     8.73      0.004     0.46
DS-2 T-Score    -21.919       -13.690     14.57     0.0002    0.59
DS-3 T-Score    -23.135       -14.234     16.09     0.0001    0.63
a. Sample size: Paroxetine N = 98; Placebo N = 99
b. Sample size: Paroxetine N = 82; Placebo N = 85
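To see what the effect size differences could mean for trial planning, the Python sketch below converts each reported effect size into an approximate sample size per group. It rests on assumptions that are not stated in the table: that the effect sizes are standardized mean differences, that a two-sided test at alpha = 0.05 with 80% power is wanted, and that the normal approximation n = 2(z_alpha + z_beta)^2 / d^2 is adequate. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-group comparison.

    Uses the normal approximation n = 2 * (z_alpha + z_beta)^2 / d^2,
    treating effect_size as a standardized mean difference (an assumption).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Effect sizes as reported in the table above.
effect_sizes = {
    "HDRS Total": 0.43,
    "MADRS Total": 0.54,
    "DS-1 T-Score": 0.46,
    "DS-2 T-Score": 0.59,
    "DS-3 T-Score": 0.63,
}

required = {name: n_per_group(d) for name, d in effect_sizes.items()}
```

Under these assumptions, a scale with a larger effect size needs fewer participants per group to detect the same treatment difference.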
SUMMARY AND CONCLUSION


PROMIS item banks, short-form measures and CAT will
enable the efficient and psychometrically sound assessment
of health outcomes
PROMIS item banks, instruments and software will be in the
public domain
– PROMIS Health Organization
– Not-for-profit organization for management and dissemination of
PROMIS products

Development of PROMIS item banks and instruments is
ongoing
– Preliminary measurement systems available late 2007

Health outcome measures may assist patients, their families,
clinicians, and other health care decision-makers in
understanding the outcomes of health care interventions and
treatment