6075_Rating scale item characteristic curve
Richard Gershon

Richard: Welcome to this presentation of the Rating Scale Item Characteristic Curve. My name is Richard Gershon from the Department of Medical Social Sciences at Northwestern University. In previous presentations in this series, we’ve looked at the one-, two-, and three-parameter dichotomous item characteristic curves. And today we’re going to be looking at what happens when you look at polytomous models.

Let’s first look at the [inaudible] of reality of a 10-point rating scale item. This is the most popular patient-reported outcome measure used worldwide: How would you rate your pain on average? This type of item is available in almost every emergency room, certainly in the United States. You can see that you can give a rating from 0 to 10, and very often it’s given along with an analog scale using different faces.

If you look at the bottom of the screen, you can see that the reality is that close to a third of the trait range contains people who say zero. Actually, this data was derived from giving people both the single 0 to 10 point item and a lengthy assessment of their pain. And we see that people who said no pain really experience a wide range of pain. And this is similar to the case for people who give an answer of 9 or 10. There’s great differentiation in the amount of pain being exhibited by somebody who says they have a pain number of 9 or 10. Conversely, if we look at 3, 4, 5, 6, 7, 8, there’s very little difference between the amount of pain experienced by those people. And having said that, very often treatment decisions, and even medication decisions to add or decrease medication, are based on a person saying they went from 4 to 5 or the reverse.

So let’s go back to our “I have lack of energy” item. We first looked at this item in an earlier presentation, and it’s a very simple 5-point item: not at all, little bit, somewhat, quite a bit, very much.
And using traditional test theory, we, in our heads, say: oh, if I give this item, we have nicely divided people into five equal areas. But again, if I give people a lengthy fatigue survey, we see that the people who answered not at all really occupy almost 3.5 standard deviations of the trait range. And conversely, at the top, the people who answered very much go from 75 to 100 on a T-scale.

For those of you who have seen a dichotomous item characteristic curve, it’s very simple. But in a polytomous environment, there is a separate curve for every option given in the item. So this again is: I have lack of energy. These are the curves associated with that. Now, if you look at the top of the screen, we see the simplified five zones which we showed on an earlier slide, which made it look relatively simple. But the reality [inaudible] underlying that is that there are some people who say not at all, and indeed the most likely area on the trait range is between zero and about 35. But there are people whose scores go all the way up to 76. If you look at the somewhat area, that area in yellow, there are people who say somewhat to the single item. But if you give them a full-length fatigue scale, some of them go all the way down to zero or all the way up to 100.

This demonstrates in particular the caution that must be used in interpreting the results from a single item. Indeed, very short scales of one, two, or three items may be appropriate for assessing group differences but would not be appropriate for clinical decision making.

So again, here is our very much curve. You can see that when a person selects an extreme category from a rating scale item, we don’t know that their score is in the range that looks most likely, between 75 and 100. It could be as low as 43. And the truth of the matter is, it just means they’re off to the right somewhere.
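As a rough illustration of the "separate curve for every option" idea above, here is a minimal Python sketch. It assumes a Rasch-family polytomous form (partial credit / rating scale); the presentation does not say which model produced its curves. The function name `category_probabilities`, the threshold values, and the logit-scale grid are all hypothetical choices made for illustration, not values from the slides.

```python
import numpy as np

def category_probabilities(theta, thresholds):
    """Category response curves under an assumed Rasch-family polytomous model.

    theta: trait values on a logit scale.
    thresholds: step parameters between adjacent categories (hypothetical).
    Returns an array of shape (len(theta), len(thresholds) + 1).
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    thresholds = np.asarray(thresholds, dtype=float)
    # Each category's exponent is the running sum of (theta - threshold);
    # the lowest category has an empty sum, i.e. 0.
    steps = theta[:, None] - thresholds[None, :]
    exponents = np.concatenate(
        [np.zeros((theta.size, 1)), np.cumsum(steps, axis=1)], axis=1
    )
    exponents -= exponents.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(exponents)
    return weights / weights.sum(axis=1, keepdims=True)

labels = ["not at all", "little bit", "somewhat", "quite a bit", "very much"]
thresholds = [-1.5, -0.5, 0.5, 1.5]   # hypothetical, evenly spaced
theta = np.linspace(-4, 4, 81)
probs = category_probabilities(theta, thresholds)

# Report where on the grid each option is the single most probable answer.
for k, label in enumerate(labels):
    peak = theta[np.argmax(probs[:, k])]
    print(f"{label:>12}: curve peaks near theta = {peak:+.1f}")
```

In this sketch the two extreme curves never turn over: the lowest option keeps getting more likely as the trait decreases and the highest keeps getting more likely as it increases, so their "peaks" land on the edges of the grid. Only the middle options have a peak in the interior, which is one way to see why an extreme answer only says the person is "off to the right somewhere."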
It’s like saying that I’m in Chicago, where I’m sitting right now, and I’m going west: you don’t know if I mean I’m going to Iowa, California, or Japan. Because when a person answers in an extreme category, all you know is that their ability or their severity on a trait is off there in the distance somewhere. We haven’t narrowed it down very much. We are fairly certain that this person is not in the 0-42 range, the far left side.

For the center categories, what’s interesting to note is that they do represent people across a broad range. So while their most likely position on the trait range is around 69, a person who answers I have lack of energy quite a bit may be as low as 28 or as high as 100. And we can see that similarly for the other categories. At the other extreme, we have the same problem we found in the very much category. With not at all, their score may be 0; it may be lower than that. It’s far off in the distance. We cannot accurately assess where this person is. We are fairly certain that they are not in the typical range of a person who answered very much.

By the way, this is a good rating scale item. One would actually hope that the ranges were distinct; that would be the theoretical ideal. If you said not at all, you would absolutely, positively be in the range 0 to 33. And if you said little bit, you would absolutely, positively be in the range 33-43. That’s just not how items work and that’s not how humans interact with items.

Here are some other items. I’ve been too tired to feel happy. You can see that this item does a better job of assessing people towards the severe fatigue range. I have felt energetic. This item does a better job of discriminating people who tend to have a lot of energy. Giving this item to a person who is bed bound doesn’t tell us very much. They’re going to be in the none of the time category, which just means they’re off to the right somewhere.
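The earlier point that a single answer such as quite a bit leaves a broad range of plausible scores can be sketched numerically. The following is a minimal illustration only: it assumes the same Rasch-family polytomous form, hypothetical thresholds, and a standard normal distribution of the trait in the population, none of which come from the presentation. The conversion T = 50 + 10z is the usual T-score convention.

```python
import numpy as np

def category_probabilities(theta, thresholds):
    """Category response curves under an assumed Rasch-family polytomous model."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    thresholds = np.asarray(thresholds, dtype=float)
    steps = theta[:, None] - thresholds[None, :]
    exponents = np.concatenate(
        [np.zeros((theta.size, 1)), np.cumsum(steps, axis=1)], axis=1
    )
    exponents -= exponents.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(exponents)
    return weights / weights.sum(axis=1, keepdims=True)

def to_t(z):
    """Convert a standardized trait value to the T-score metric."""
    return 50 + 10 * z

thresholds = [-1.5, -0.5, 0.5, 1.5]      # hypothetical step parameters
grid = np.linspace(-5, 5, 1001)          # covers T-scores 0 to 100

prior = np.exp(-0.5 * grid**2)           # assumed standard normal trait (unnormalized)
likelihood = category_probabilities(grid, thresholds)[:, 3]  # answered "quite a bit"

# Bayes' rule on the grid: plausible trait values given this one answer.
posterior = prior * likelihood
posterior /= posterior.sum()

cdf = np.cumsum(posterior)
low = grid[np.searchsorted(cdf, 0.025)]
high = grid[np.searchsorted(cdf, 0.975)]
mode = grid[np.argmax(posterior)]

print(f"most likely T-score: {to_t(mode):.0f}")
print(f"central 95% of plausible T-scores: {to_t(low):.0f} to {to_t(high):.0f}")
```

Under these assumptions the interval spans well over a standard deviation on the T-scale, which is the same qualitative picture as the slide: one response gives a most likely position, not a narrow band.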
I’ve been too tired to read. This is still a good item, but we’re going to notice here that the discrimination of the individual selections is lower and therefore the curves are lower. And what we have here is a typical issue that we see with the assessment of an underlying trait. And that is that we’re trying to circle in on fatigue and we can only ask a person once: do you have fatigue? After that, we have to talk about various areas related to fatigue, and one of those is reading. And the reason the reading item doesn’t discriminate as well is that some people don’t like to read. So if you don’t like to read, you’re always too tired to read. And if you love to read, you may never be too tired to read. Having said that, it’s still a good item and we certainly use it in our banks.

The next presentation in this series is entitled Item Banking for Rating Scale Items.

[End of Audio]
Duration: 8 minutes
www.gmrtranscription.com