Item response theory models

Male Speaker 1: Welcome to item response theory models. My name is Richard Gershon, with the Department of Medical Social Sciences at Northwestern University. There are three main item response theory models: the Rasch model or one-parameter logistic model, the two-parameter logistic model, and the three-parameter logistic model. The following section I have entitled "How to choose an appropriate IRT model," or "My religion is better than your religion." And for those of you who were hoping not to see some math, you may want to avert your eyes.

The one-parameter logistic model is known by some people as the Rasch model, and I should point out that while the math underlying what's known as the Rasch model and the one-parameter model is the same, there are philosophical differences, which are outside our discussion here. So we'll deal with the math at this point, and if you'd like to understand the difference between the Rasch model and the one-parameter model, we'll leave that for another presentation. The one-parameter logistic model really looks at one parameter, and that is the difference between person ability and the difficulty of the item. If you work this equation through, you will see that a person whose ability matches the item difficulty has a 50 percent chance of answering that item correctly. We use the one-parameter logistic model because it's really the only option for small sample sizes, and it is often the real model underlying a test labeled as three-parameter. I've been a consultant for many organizations that use two- and three-parameter modeling and that will philosophically talk about how those models are better. But the one-parameter model often does the trick. It is less costly to develop, it is certainly the most parsimonious solution, and the simplest solution is always the best.

The two-parameter logistic model adds a discrimination component to the one-parameter model: we know not only the difficulty of the item but also how well that item discriminates. We can add that to our probability formula and get a better estimate of person ability. The Rasch model effectively averages out the item discrimination, and when we do computer adaptive testing the value of knowing the discrimination becomes less important. Let's look at some two-parameter examples. All of these graphics represent an item of the same difficulty, but we vary the item discrimination. As you can see on the far right, if a person gets this item correct we know with fair certainty that their ability is above that of the item, because the slope is very steep; if they get it wrong, their ability is likely to be less than that of the item. On the far left, by contrast, the slope is much lower, and therefore if a person gets this item right or wrong there is a high likelihood that they are near the ability, or difficulty, of the item, but we don't know this with as much certainty.

We also have what's called the three-parameter logistic model. This adds guessing as a parameter to our equation. Now, guessing is only incorporated into an IRT model in situations where there are multiple-choice items, and of course for most rating-scale applications, at which this particular presentation is targeted, we don't usually have guessing. But I would just like to show you what this is about. The three-parameter logistic model requires a very large sample size in order to ascertain the value of the guessing parameter.
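For readers who do want to see the math, the three dichotomous models just described are conventionally written as follows. This is the standard textbook parameterization, not a reproduction of the presentation's own slides: theta is person ability, b_i is item difficulty, a_i is item discrimination, and c_i is the guessing (lower-asymptote) parameter.

```latex
% Standard dichotomous logistic IRT models
% (theta = person ability, b_i = difficulty, a_i = discrimination, c_i = guessing)
\begin{align}
  \text{1PL / Rasch:} \quad
    P(X_i = 1 \mid \theta) &= \frac{e^{\theta - b_i}}{1 + e^{\theta - b_i}} \\
  \text{2PL:} \quad
    P(X_i = 1 \mid \theta) &= \frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}} \\
  \text{3PL:} \quad
    P(X_i = 1 \mid \theta) &= c_i + (1 - c_i)\,\frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}}
\end{align}
```

Setting theta equal to b_i in the one-parameter model gives P = 1/2, the 50 percent chance mentioned above; a_i in the two-parameter model controls how steeply that probability rises around the item difficulty, and c_i in the three-parameter model sets the floor that a very low-ability respondent can reach by guessing.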
Theoretically, the three-parameter model is certainly better than the two- or one-parameter model, and it's the most accepted theoretical model. But in many applications of IRT, such as the computer adaptive testing we use with Neuro-QOL, first of all we don't have guessing in Neuro-QOL, and secondly the effect of guessing is very minimal in a CAT environment. So here are two three-parameter estimates, and the difference here, with guessing, is that in the left graph there's a ten percent chance of guessing an item correctly if your ability is very low, and in the right graph there's a 25 percent chance of guessing the item correctly. Now, it turns out that the likelihood of a well-written item being answered correctly by guessing is actually less than chance. So if I have a four-choice item, you would probably tell me that a person who is guessing would have a 25 percent chance of answering it correctly. But on a high-stakes test that's well written, the chance is actually less than that: the item is written such that it pulls [inaudible] people away from the correct answer versus what one would expect. So we actually can use the guessing parameter to improve our level of measurement.

We also have polytomous models. Polytomous models can be either one-parameter, as in the Rasch rating scale model and the partial credit model, or two-parameter, as in the graded response model and the generalized partial credit model. Polytomous means an item has more than two response categories, such as a rating scale, rather than simply a right or wrong answer. For Neuro-QOL we use the graded response model in our calculations, and there are different reasons to use these different models; you should just know at this point that they exist. There are also newer multidimensional models, and these models are really on the bleeding edge of where item response theory is today. Multidimensional models allow us to assess more than one trait using the same item; that's just good to have in the back of your head.

This concludes our presentation on item response theory models. Next in the series: how does IRT differ from conventional testing theory?

[End of Audio]

Duration: 6 minutes
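As a supplementary illustration of the guessing parameter discussed above, here is a minimal sketch of the three-parameter probability function. The difficulty and discrimination values are hypothetical and chosen only for illustration; only the two guessing floors (10 percent and 25 percent) come from the presentation.

```python
import math

def p_correct(theta, b, a=1.0, c=0.0):
    """Three-parameter logistic (3PL) probability of a correct response.

    theta -- person ability
    b     -- item difficulty
    a     -- item discrimination (a = 1, c = 0 reduces this to the 1PL form)
    c     -- guessing parameter (lower asymptote)
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic

# Two items with the same (hypothetical) difficulty and discrimination that
# differ only in guessing, mirroring the two graphs described above:
# a 10 percent versus a 25 percent floor for very low-ability respondents.
for c in (0.10, 0.25):
    floor = p_correct(theta=-4.0, b=0.0, a=1.5, c=c)
    at_match = p_correct(theta=0.0, b=0.0, a=1.5, c=c)
    print(f"c = {c:.2f}: P(correct) is about {floor:.2f} at very low ability "
          f"and {at_match:.2f} when ability equals difficulty")
```

Note that with a nonzero guessing parameter, the probability of a correct response when ability equals difficulty rises above 50 percent, so the 50 percent interpretation given earlier applies only to the one- and two-parameter models.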