item response theory models - Neuro-QoL

Item response theory models
Male Speaker
Welcome to item response theory models. My name is Richard Gershon with the
Department of Medical Social Sciences at Northwestern University. There are three
main item response theory models: the Rasch model or one parameter logistic model, the
two parameter logistic model, and the three parameter logistic model. The following
section I entitled “How to choose an appropriate IRT model, or my religion is better than
your religion.” And for those of you who were hoping not to see some math, you may
want to avert your eyes. The one parameter logistic model is known by some people as the
Rasch model, and I should point out that while the math underlying what’s known as the
Rasch model and the one parameter model is the same, there are philosophical
differences, which are outside our discussion here. So we’ll deal with the math at this
point, and if you’d like to understand the difference between the Rasch model and the one
parameter model, we’ll leave that for another presentation.
So the one parameter logistic model really looks at one item parameter, the difficulty of
the item, and what matters is the difference between person ability and that difficulty.
If you work this equation through, you will see that a person whose ability matches the
item difficulty has a 50 percent chance of answering that item correctly. We use the one
parameter logistic model because it’s really the only option for small sample sizes. It is
often the real model underlying a test that is labeled as three parameter. I’ve been a
consultant for many organizations that use two and three parameter modeling, and they
will philosophically talk about how those models are better, but the one parameter model
often does the trick. It is less costly to develop, and it is certainly the most
parsimonious solution, and the simplest solution is always the best.
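
The equation referred to above was shown on a slide and is not reproduced in this transcript. As a minimal sketch, assuming the standard logistic form of the one parameter model, the probability of a correct answer depends only on the difference between person ability and item difficulty. The names `rasch_probability`, `ability`, and `difficulty` are illustrative and do not come from the presentation.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the one parameter logistic model.

    Assumes the standard form P = 1 / (1 + exp(-(ability - difficulty))),
    with ability and difficulty expressed on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# When ability equals difficulty, the chance of a correct answer is 50 percent.
print(rasch_probability(0.0, 0.0))  # 0.5

# A person one logit above the item difficulty is more likely than not to succeed.
print(rasch_probability(1.0, 0.0))
```

Working the formula through with ability equal to difficulty gives an exponent of zero, which is where the 50 percent figure comes from.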
The two parameter logistic model adds to the one parameter model by adding a
discrimination component. That is, we know the difficulty of the item and also how well
that item discriminates. We can add that to our probability formula and get a better
estimate of person ability. The Rasch model effectively averages out the item
discrimination, and when we do computer adaptive testing the value of knowing the
discrimination becomes less important. Let’s look at some two parameter examples. All
of these graphics represent an item of the same difficulty, but we vary the item
discrimination. As you can see on the far right, if a person gets this item correct we know
with fair certainty that their ability is above that of the item because the slope is very
steep.
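
The graphics themselves are not part of this transcript. A rough sketch of the idea, assuming the standard two parameter logistic form in which discrimination multiplies the difference between ability and difficulty, is shown here. The function name and the discrimination values are chosen for illustration only.

```python
import math


def two_pl_probability(ability: float, difficulty: float, discrimination: float) -> float:
    """Probability of a correct response under the two parameter logistic model.

    Assumes P = 1 / (1 + exp(-discrimination * (ability - difficulty))).
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# Same item difficulty throughout; only the discrimination (slope) changes.
difficulty = 0.0
for discrimination in (0.5, 1.0, 2.5):  # illustrative values, low to high slope
    below = two_pl_probability(-1.0, difficulty, discrimination)
    above = two_pl_probability(1.0, difficulty, discrimination)
    print(
        f"discrimination={discrimination}: "
        f"P(correct | ability=-1)={below:.2f}, "
        f"P(correct | ability=+1)={above:.2f}"
    )
```

With a steep slope the two probabilities sit far apart, so a single response says a lot about which side of the item difficulty the person is on. With a shallow slope they stay closer to 50 percent, so the same response is less informative.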
If they get it wrong, their ability is likely to be less than that of the item. Compare
that with the far left, where the slope is much lower. There, if a person gets this right
or wrong, there is a high likelihood they are near the difficulty of the item, but we
don’t know this with much certainty. We also have what’s called the three parameter
logistic model. This adds guessing as a parameter to our equation. Now guessing is only
incorporated into an IRT model in a situation where there are multiple choice items, and
of course for most rating scale applications, for which this particular presentation is
targeted, we don’t usually have guessing. But I’d just like to show you what this is about.
So the three parameter logistic model requires a very large sample size in order to
ascertain the value
of the guessing parameter. Theoretically, the three parameter model is certainly better
than the two or one parameter model, and it’s the most accepted theoretical model.
In many applications of IRT such as computer adaptive testing, which we use with
Neuro-QoL, guessing matters little: first of all we don’t have guessing in Neuro-QoL, and
secondly the effect of guessing is very minimal in a CAT environment. So here are two
three parameter examples, and the difference here with guessing, you can see, is that on
the left graph there’s a ten percent chance of guessing an item correctly if your ability
is very low, and on the right graph there’s a 25 percent chance of guessing the item
correctly. Now it turns out that the likelihood of a well written item being answered
correctly by guessing is actually less than chance. So if I have a four choice item you
would probably tell me that a person who is guessing would have a 25 percent chance of
answering that correctly. But on a high stakes test that’s well written the chance is
actually less than that. The item is written such that it pulls [inaudible] people away
from the correct answer versus what one would expect.
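
The two graphs described above are not included in the transcript. As a sketch, assuming the standard three parameter logistic form in which the guessing parameter sets a floor under the curve, the ten percent and 25 percent cases can be compared as follows. The function name, the difficulty of 0, and the discrimination of 1 are assumptions made for illustration.

```python
import math


def three_pl_probability(
    ability: float, difficulty: float, discrimination: float, guessing: float
) -> float:
    """Probability of a correct response under the three parameter logistic model.

    Assumes P = guessing + (1 - guessing) * logistic(discrimination * (ability - difficulty)),
    so the curve never drops below the guessing parameter.
    """
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))
    return guessing + (1.0 - guessing) * logistic


# Compare the two guessing levels mentioned in the talk at a very low ability.
for guessing in (0.10, 0.25):
    very_low = three_pl_probability(-6.0, 0.0, 1.0, guessing)
    matched = three_pl_probability(0.0, 0.0, 1.0, guessing)
    print(
        f"guessing={guessing:.2f}: "
        f"P(correct | very low ability)={very_low:.3f}, "
        f"P(correct | ability = difficulty)={matched:.3f}"
    )
```

Note that once guessing is in the model, a person whose ability matches the item difficulty has more than a 50 percent chance of a correct answer, because the floor lifts the whole curve.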
So we actually can use the guessing parameter to improve our level of measurement.
We also have polytomous models. Polytomous means the item has more than two scored
response categories rather than simply right or wrong. Now polytomous models can be both
one and two parameter. The one parameter models are the Rasch rating scale model and the
partial credit model, and the two parameter models are the graded response model and the
generalized partial credit model. And for Neuro-QoL we use a graded response
model in our calculations, and there are different reasons to use these different models.
You should just know at this point that they exist. There are also new multidimensional
models, and these models are really on the bleeding edge of where item response theory
is today. But multidimensional models allow us to assess more than one trait using the
same item. That’s just good to have in the back of your head.
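
For the graded response model mentioned above, the talk gives no formula. The following is only a sketch, assuming the usual textbook form in which each threshold has its own two parameter logistic curve and each category probability is the difference between adjacent cumulative curves. The function name and the example thresholds are hypothetical and are not Neuro-QoL item parameters.

```python
import math


def graded_response_probabilities(
    ability: float, discrimination: float, thresholds: list[float]
) -> list[float]:
    """Category probabilities for one item under a graded response model (sketch).

    thresholds must be in increasing order; k thresholds define k + 1 categories.
    """
    # Cumulative probability of responding in each category or higher.
    # The lowest category is always reached (1.0); nothing lies above the top (0.0).
    cumulative = [1.0]
    for threshold in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (ability - threshold))))
    cumulative.append(0.0)

    # Each category probability is the gap between adjacent cumulative curves.
    return [cumulative[j] - cumulative[j + 1] for j in range(len(thresholds) + 1)]


# A five category rating scale item with illustrative (made up) parameters.
probabilities = graded_response_probabilities(
    ability=0.5, discrimination=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0]
)
print([round(p, 3) for p in probabilities])
```

The category probabilities sum to one by construction, since the differences telescope from 1.0 down to 0.0.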
This concludes our presentation on item response theory models. Next in the series: how
does IRT differ from conventional testing theory?
[End of Audio]
Duration: 6 minutes
www.gmrtranscription.com