Comparison of IRT and conventional test theory - Neuro-QoL

Comparison of IRT and conventional test theory
Richard Gershon
Welcome to "How Does Item Response Theory Differ from Conventional Test Theory?" My
name is Richard Gershon, from the Department of Medical Social Sciences at Northwestern
University. In classical test theory, an individual takes an assessment and uses their total
score on that assessment for comparison purposes. A person with a high score is usually
higher on the trait, whereas a person with a low score is lower on that trait. In item
response theory, each individual item can be used for comparison purposes. It's really as if
each item is its own test. A person who endorses a better rating on hard items is higher on
the trait, and conversely, a person who endorses worse ratings on easy items is lower on the
trait. We can enter items that lie along the same construct into longer assessments and
improve our reliability.
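To make the idea that each item works as its own test concrete, here is a minimal sketch. It assumes a dichotomous Rasch (one-parameter logistic) model, which the talk does not name, in which the chance of endorsing an item depends only on the gap between the person's trait level and the item's difficulty. The function name `endorse_probability` and the difficulty values are hypothetical, chosen only for illustration.

```python
import math


def endorse_probability(theta: float, difficulty: float) -> float:
    """Rasch model (an assumption here): P = 1 / (1 + exp(-(theta - difficulty)))."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical item difficulties on a logit scale (assumed values).
easy_item = -2.0
hard_item = 2.0

# Three hypothetical people: low, middle, and high on the trait.
for theta in (-2.0, 0.0, 2.0):
    p_easy = endorse_probability(theta, easy_item)
    p_hard = endorse_probability(theta, hard_item)
    print(f"theta={theta:+.1f}  P(easy)={p_easy:.2f}  P(hard)={p_hard:.2f}")
```

Under this assumed model, a person who endorses the hard item is more likely to be high on the trait, and a person who fails to endorse the easy item is more likely to be low on it, so a single response already carries information about where the person sits.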
So in classical test theory, reliability is based upon the total test. Regardless of patient
ability or level of severity on a trait continuum, reliability is the same. So even if I give
an eighth grade test to third graders, the level of reliability is the same as if I had given
it to eighth graders directly. In item response theory, reliability is calculated for each
patient, for each subject's ability, and varies across the continuum. Typically there is
better reliability in the middle of the distribution. People whose ability, or severity on a
personality trait, is near the middle of the item distribution will have higher reliabilities
than people who are distant. In an ability situation, giving a fourth grader eighth grade
items will result in much less reliability than giving a fourth grader fourth grade items.
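The fourth grader example can be sketched numerically. The code below again assumes a dichotomous Rasch model, which the talk does not specify, where an item's information at trait level theta is P(1 - P) and the standard error of the trait estimate is one over the square root of the summed information. The item difficulties and the names `item_information` and `standard_error` are hypothetical.

```python
import math


def item_information(theta: float, difficulty: float) -> float:
    """Rasch item information at theta: P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)


def standard_error(theta: float, difficulties: list[float]) -> float:
    """Standard error of the trait estimate: 1 / sqrt(test information)."""
    test_information = sum(item_information(theta, b) for b in difficulties)
    return 1.0 / math.sqrt(test_information)


# Hypothetical difficulties (logits): one set centered on the person,
# one set far above the person's level.
matched_items = [-1.0, -0.5, 0.0, 0.5, 1.0]
distant_items = [3.0, 3.5, 4.0, 4.5, 5.0]
person_theta = 0.0

se_matched = standard_error(person_theta, matched_items)
se_distant = standard_error(person_theta, distant_items)
print(f"SE with matched items: {se_matched:.2f}")
print(f"SE with distant items: {se_distant:.2f}")
```

With the same number of items, the distant set gives a larger standard error for this person, which corresponds to lower reliability at that point on the continuum. A classical total-test reliability coefficient would not show this difference, because it is a single value for the whole test.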
This is similarly the case for validity. Validity in classical test theory is based upon the
total test. Typically validity would need to be reassessed if the instrument is modified in
any way. In item response theory, validity is assessed for the entire item bank, and actually
validity is effectively assessed for each individual item in the bank. So any subset of
items, whether full-length tests, short forms, or computer adaptive tests, inherits the
validity assessed for the original item bank, and indeed we can typically add items to that
bank without assessing the validity of those individual items, and they too will inherit the
validity of the entire bank.
The next presentation in this series is entitled The Rating Scale Item Characteristic
Curve.
[End of Audio]
Duration: 3 minutes
www.gmrtranscription.com