Rating Scale item Characteristics - Neuro-QoL

6075_Rating scale item characteristic curve
Richard Gershon
Richard:
Welcome to this presentation of the Rating Scale Item
Characteristic Curve. My name is Richard Gershon from the
Department of Medical Social Sciences at Northwestern
University. In previous presentations in this series, we’ve looked
at the one parameter and two and three parameter dichotomous
item characteristic curves. And today we’re going to be looking at
what happens when you look at polytomous models.
Let’s first look at the [inaudible] of reality of a 10 point rating
scale item. This is the most popular patient-reported outcome measure
used worldwide. How would you rate your pain on average? This
type of item is available in almost every emergency room,
certainly in the United States. You can see you can give a rating
between 0 and 10, and very often it’s given along with an
analog scale using different faces.
If you look at the bottom of the screen, you can see that the reality
is that close to a third of the trait range contains people who say
zero. Actually, this data was derived from giving people both the
single 0 to 10 point item, as well as a lengthy assessment of their
pain. And we see that people who said no pain really experience a
wide range of pain. And this is similar to the case for people who
give an answer of 9 or 10. There’s great differentiation in the
amount of pain being exhibited by somebody who says they have a
pain number of 9 or 10.
Conversely, if we look at 3, 4, 5, 6, 7, 8, there’s very little
difference between the amount of pain experienced by those
people and having said that, very often even treatment decisions
and even medication decisions to add or decrease medication are
based on a person saying they went from 4 to 5 or reverse.
So let’s go back to our: I have lack of energy item. We first looked
at this item in an earlier presentation, and it’s a very simple 5-point
item from not at all, little bit, somewhat, quite a bit, very much.
And using traditional test theory, we, in our heads, say: oh, if I
give this result here, we have nicely divided a person into five
equal areas. But again, if I give a person a lengthy fatigue survey,
I see that the people who answered not at all really occupy almost
3.5 standard deviations of the trait range. And conversely, at the
top, the people who answered very much go from 75 to 100 on a T-scale.
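
The standard-deviation and T-scale figures quoted here can be related with a little arithmetic. The sketch below assumes the usual T-score convention (mean 50, standard deviation 10); the 0 to 35 zone for not at all is taken from a later remark in this talk, and the function names are only illustrative.

```python
# Assumed T-score convention: mean 50, standard deviation 10.
T_MEAN = 50.0
T_SD = 10.0

def t_to_z(t_score):
    """Convert a T-score to standard-deviation units from the mean."""
    return (t_score - T_MEAN) / T_SD

def span_in_sd(low_t, high_t):
    """Width of a T-score interval expressed in standard deviations."""
    return (high_t - low_t) / T_SD

# Zone quoted for "not at all": roughly 0 to 35 on the T-scale.
print(span_in_sd(0, 35))    # 3.5 standard deviations
# Zone quoted for "very much": 75 to 100 on the T-scale.
print(span_in_sd(75, 100))  # 2.5 standard deviations
print(t_to_z(75))           # 2.5 SD above the mean
```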
For those of you who have seen a dichotomous item characteristic
curve, it’s very simple. But in a polytomous environment, there is
a separate curve for every option given in the item. So this again
is: I have lack of energy. These are the curves associated with that.
Now, if you look at the top of the screen, we see the simplified five
zones which we showed on an earlier slide, which made it look
relatively simple. But the reality [inaudible] underlying that is
that there are some people who say not at all, and indeed the most
likely area on the trait range is between zero and about 35. But
there are people whose scores go all the way up to 76.
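
The idea of a separate curve for every response option can be sketched in code. The talk does not say which polytomous model produced the curves on the slide, so the following assumes a graded response model. The discrimination `A` and the values in `THRESHOLDS` are hypothetical numbers on the T-score metric, chosen only to loosely echo the zones mentioned in the talk; they are not the fitted parameters for this item.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under an assumed graded response model.

    theta: trait level, a: discrimination, thresholds: ordered boundaries.
    """
    # Cumulative probability of answering in category k or higher,
    # padded with 1.0 (lowest category or higher) and 0.0 (beyond the top).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category's probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

LABELS = ["not at all", "a little bit", "somewhat", "quite a bit", "very much"]
A = 0.15                               # hypothetical, per T-score point
THRESHOLDS = [33.0, 43.0, 60.0, 75.0]  # hypothetical, T-score metric

for t in range(0, 101, 10):
    probs = grm_category_probs(t, A, THRESHOLDS)
    most_likely = LABELS[probs.index(max(probs))]
    print(t, most_likely, [round(p, 3) for p in probs])
```

With these assumed values, every category keeps a non-zero probability at every point from 0 to 100, which is the sense in which a single response cannot pin down where a person sits.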
If you look at the somewhat area, that area in yellow, there are
people who say somewhat to the single item. But if you give them
a full-length fatigue scale, some of them go all the way down to
zero or all the way up to 100. This demonstrates in particular the
hesitation that must be used in interpreting the results given from a
single item. Indeed, very short scales of one, two, or three items
may be appropriate for assessing group differences but would not
be appropriate for clinical decision making.
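
The caution about single items can be illustrated by asking how much one response narrows down a person's score. This is a minimal sketch, not the method used for the slides: it assumes a graded response model, a normal prior with mean 50 and standard deviation 10 on the T-score metric, and a hypothetical item `ITEM`. It compares one "somewhat" answer with eight identical hypothetical items all answered "somewhat".

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under an assumed graded response model."""
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

def posterior(items, responses, grid):
    """Normalized grid posterior over the trait, assumed N(50, 10) prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * ((theta - 50.0) / 10.0) ** 2)  # prior weight
        for (a, thresholds), r in zip(items, responses):
            w *= grm_category_probs(theta, a, thresholds)[r]  # likelihood
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def central_interval(grid, post, mass=0.95):
    """Grid points bounding the central `mass` of the posterior."""
    tail = (1.0 - mass) / 2.0
    cum, lower, upper = 0.0, None, None
    for theta, p in zip(grid, post):
        cum += p
        if lower is None and cum >= tail:
            lower = theta
        if cum >= 1.0 - tail:
            upper = theta
            break
    return lower, upper

GRID = [i * 0.5 for i in range(201)]     # T-scores 0 to 100
ITEM = (0.15, [33.0, 43.0, 60.0, 75.0])  # hypothetical item parameters
SOMEWHAT = 2                             # index of the "somewhat" category

one_lo, one_hi = central_interval(GRID, posterior([ITEM], [SOMEWHAT], GRID))
eight_lo, eight_hi = central_interval(
    GRID, posterior([ITEM] * 8, [SOMEWHAT] * 8, GRID))
print("one item:   ", one_lo, "to", one_hi)
print("eight items:", eight_lo, "to", eight_hi)
```

Under these assumptions the interval after one item is wider than after eight, which matches the point that very short scales leave too much uncertainty for decisions about an individual.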
So again, here is our very much curve. You can see when a person
selects an extreme category from a rating scale item, we don’t
know that their score is in the range that looks most likely,
between 75 and 100. It could be as low as 43. And the truth of the
matter is, it just means they’re off to the right somewhere. It’s like
saying I’m in Chicago, where I’m sitting right now, and I’m
going west: you don’t know if I mean I’m going to Iowa, California,
or Japan. When a person answers in an extreme category,
all you know is that their ability or their severity on a trait is off
there in the distance somewhere. We haven’t narrowed it down
very much. We are fairly certain that this person is not in the 0-42
range; the far left side.
What’s interesting to note about the center categories is that they
do represent people across a broad range. So while the most likely
position on the trait range is around 69, a person who answers I
have lack of energy quite a bit may be as low as 28 or as high as
100. And we can see that similarly for the other categories. At the
other extreme, we have the same problem we found in the very
much category. For not at all, their score may be 0; it may be lower
than that. It’s far off in the distance. We cannot accurately assess
where this person is. We are fairly certain that they are not in the
typical range of a person who answered very much.
By the way, this is a good rating scale item. One would actually
hope that the ranges were distinct; that would be the theoretical
ideal. If you said not at all, you would absolutely, positively be in the range
0 to 33. And if you said little bit, you would absolutely, positively
be in the range 33-43. That’s just not how items work and that’s
not how humans interact with items.
Here are some other items. I’ve been too tired to feel happy. You
can see that this item does a better job of assessing people towards
the severe fatigue range.
I have felt energetic. This item does a better job of discriminating
people who tend to have a lot of energy. Giving this item to a
person who is bed bound doesn’t tell us very much. They’re going
to be in the none of the time category, which just means they’re off to the
right somewhere.
I’ve been too tired to read. This is still a good item, but we’re
going to notice here that the discrimination of the individual
selections is lower and therefore the curves are lower. And what we
have here is a typical issue that we see with the assessment of an
underlying trait. And that is that we’re trying to circle in on
fatigue and we can only ask a person once: do you have fatigue?
After that, we have to talk about various areas related to fatigue,
and one of those is reading.
And the reason the reading item doesn’t discriminate as well is
because some people don’t like to read. So if you don’t like to
read, you’re always too tired to read. And if you love to read, you
may never be too tired to read. Having said that, it’s still a good
item and we certainly use it in our banks.
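
The remark that lower discrimination gives lower curves can be checked numerically. The sketch below again assumes a graded response model with hypothetical thresholds, and compares the peak height of each category curve for a more discriminating and a less discriminating item; the two discrimination values are made up for illustration.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under an assumed graded response model."""
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

def peak_heights(a, thresholds, grid):
    """Highest probability each category curve reaches over the grid."""
    n_categories = len(thresholds) + 1
    return [
        max(grm_category_probs(theta, a, thresholds)[k] for theta in grid)
        for k in range(n_categories)
    ]

GRID = [i * 0.5 for i in range(201)]   # T-scores 0 to 100
THRESHOLDS = [33.0, 43.0, 60.0, 75.0]  # hypothetical, T-score metric
HIGH_A = 0.15                          # hypothetical, more discriminating
LOW_A = 0.08                           # hypothetical, less discriminating

high = peak_heights(HIGH_A, THRESHOLDS, GRID)
low = peak_heights(LOW_A, THRESHOLDS, GRID)
for k in range(len(high)):
    print(k, round(high[k], 3), round(low[k], 3))
```

In this sketch the three middle categories peak lower for the less discriminating item, so a given answer says less about where the person sits on the trait.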
The next presentation in this series is entitled Item Banking for Rating
Scale Items.
[End of Audio]
Duration: 8 minutes
www.gmrtranscription.com