Comparison of IRT and conventional test theory
Richard Gershon

Welcome to "How does item response theory differ from conventional test theory?" My name is Richard Gershon, from the Department of Medical Sciences at Northwestern University.

In classical test theory, an individual takes an assessment, and their total score on that assessment is used for comparison purposes. A person with a high score is usually higher on the trait, and a person with a low score is lower on that trait. In item response theory, each individual item can be used for comparison purposes. It's really like each item is its own test. A person who endorses a better rating on hard items is higher on the trait, and conversely, a person who endorses worse ratings on easy items is lower on the trait. We can enter these items along the same construct into longer assessments and improve our reliability.

In classical test theory, reliability is based upon the total test. Regardless of patient ability or level of severity on a trait continuum, reliability is the same. So even if I give an eighth grade test to third graders, the level of reliability is the same as if I had given it to eighth graders directly. In item response theory, reliability is calculated for each patient, for each subject's ability, and varies across the continuum. Typically there is better reliability in the middle of the distribution. People whose ability, or severity on a personality trait, is near the middle of the item distribution will have higher reliabilities than people who are distant from it. In an ability situation, giving a fourth grader eighth grade items will result in much less reliability than giving a fourth grader fourth grade items.

This is similarly the case for validity. Validity in classical test theory is based upon the total test. Typically validity would need to be reassessed if the instrument is modified in any way.
In item response theory, validity is assessed for the entire item bank, and actually validity is effectively assessed for each individual item in the bank. So any subset of items, whether full-length tests, short forms, or computer adaptive tests, all inherit the validity assessed for the original item bank. Indeed, we can typically add items to that bank without assessing the validity of those individual items, and they too will inherit the validity of the entire bank.

The next presentation in this series is entitled "The Rating Scale Item Characteristic Curve."

[End of Audio]
Duration: 3 minutes