Item Analysis
Ursula Waln, Director of Student Learning Assessment
Central New Mexico Community College

Item Analysis Used with Objective Assessment
• Looks at the frequency of correct responses (or behaviors) in connection with overall performance
• Used to examine item reliability
   • How consistently a question or performance criterion discriminates between high and low performers
• Can be useful in improving the validity of measures
• Can help instructors decide whether to eliminate certain items from grade calculations
• Can reveal specific strengths and gaps in student learning

How Item Analysis Works
• Groups students by the highest, mid-range, and lowest overall scores and examines item responses by group
• Assumes that higher-scoring students have a higher probability of getting any given item correct than do lower-scoring students
   • They may have studied and/or practiced more and understood the material better
   • They may have greater test-taking savvy, less anxiety, etc.
• Produces a calculation for each item
   • Do it yourself to easily calculate a group difference or discrimination index (a scripted version of this calculation appears after the EAC Outcomes slide)
   • Use EAC Outcomes (a Blackboard plug-in made available to all CNM faculty by the Nursing program) to generate a point-biserial correlation coefficient
• Gives the instructor a way to analyze performance on each item

One Way to Do Item Analysis by Hand
Shared by Linda Suskie at the NMHEAR Conference, 2015

Item | # in Top 27% who missed it* | # in Middle 46% who missed it | # in Lower 27% who missed it* | Total % who missed it | Group difference (# in Lower minus # in Top)
1    | 0                           | 22                            | 17                            | 34%                   | 17
2    | 7                           | 20                            | 19                            | 40%                   | 12
3    | 3                           | 1                             | 2                             | 5%                    | -1
4    | 0                           | 9                             | 11                            | 17%                   | 11

* You can use whatever portion you want for the top and lower groups, but they need to be equal. Using 27% is the accepted convention (Truman Kelley, 1939).

Another Way to Do Item Analysis by Hand
Rasch Item Discrimination Index (D)
N = 31 because the upper and lower groups each contain 31 students (115 students tested)
D = pUG − pLG, or equivalently D = (#UG − #LG) / N

Item | # in Upper Group who answered correctly (pUG) | # in Lower Group who answered correctly (pLG) | Discrimination Index (D)
1    | 31 (1.00, or 100%)                            | 14 (0.45, or 45%)                             | 0.55
2    | 24 (0.77, or 77%)                             | 12 (0.39, or 39%)                             | 0.38
3    | 28 (0.90, or 90%)                             | 29 (0.93, or 93%)                             | -0.03
4    | 31 (1.00, or 100%)                            | 20 (0.65, or 65%)                             | 0.35

A discrimination index of 0.4 or greater is generally regarded as high and anything less than 0.2 as low (R.L. Ebel, 1954).

The Same Thing but Less Complicated
Rasch Item Discrimination Index (D)
N = .27 × 115 ≈ 31, so the upper and lower groups each contain 31 students

Item | # in Upper Group who answered correctly | # in Lower Group who answered correctly | D = (#UG − #LG) / N
1    | 31                                      | 14                                      | (31 − 14) / 31 = 0.55
2    | 24                                      | 12                                      | (24 − 12) / 31 = 0.38
3    | 28                                      | 29                                      | (28 − 29) / 31 = -0.03
4    | 31                                      | 20                                      | (31 − 20) / 31 = 0.35

It isn't necessary to calculate the proportions of correct responses in each group if you use the formula shown here.

Example of an EAC Outcomes Report
• A point-biserial correlation is the Pearson correlation between responses to a particular item and scores on the total test (with or without that item).
• Correlation coefficients range from -1 to 1.
• EAC Outcomes is available to CNM faculty through Blackboard course tools.
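The group-difference and discrimination-index arithmetic above is easy to script. Below is a minimal Python sketch, assuming item scores are stored as 0/1 values per student; the function name and data layout are illustrative assumptions, not part of EAC Outcomes or any CNM tool.

```python
# Minimal sketch of the discrimination index D = (#UG - #LG) / N.
# Assumes `responses` is a list of per-student lists of 0/1 item scores
# (1 = correct); names are illustrative, not from any CNM/EAC tool.

def discrimination_indices(responses, portion=0.27):
    """Return D for each item, using Kelley's 27% convention for group size."""
    ranked = sorted(responses, key=sum, reverse=True)  # highest total score first
    n = round(portion * len(ranked))                   # size of each extreme group
    upper, lower = ranked[:n], ranked[-n:]
    indices = []
    for item in range(len(responses[0])):
        num_ug = sum(student[item] for student in upper)  # #UG answering correctly
        num_lg = sum(student[item] for student in lower)  # #LG answering correctly
        indices.append((num_ug - num_lg) / n)
    return indices
```

With 115 students, portion=0.27 yields groups of 31, matching the D values in the tables above (up to rounding). For the point-biserial coefficient reported by EAC Outcomes, scipy.stats.pointbiserialr computes the equivalent Pearson correlation between one item's 0/1 scores and students' total scores.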
Identifying Key Questions
• A key (a.k.a. signature) question is one that provides information about student learning in relation to a specific instructional objective (or student learning outcome statement).
• The item analysis methods shown in the preceding slides can help you identify key questions and improve their reliability.
• A low level of discrimination may indicate a need to tweak the wording.
• Improving an item's discrimination value also improves its validity.
• The more valid an assessment measure, the more useful it is in gauging student learning.

Detailed Multiple-Choice Item Analysis
• The detailed item analysis method shown on the next slide is for use with key multiple-choice items.
• This type of analysis can provide clues to the nature of students' misunderstanding, provided:
   • The item is a valid measure of the instructional objective
   • Incorrect options (distractors) are written to be diagnostic (i.e., to reveal misconceptions or breakdowns in understanding)

Example of a Detailed Item Analysis
Item 2 of 4. The correct option is E. (115 students tested)

Item Response Pattern (counts, with percentages of each group)

Group       | A         | B        | C       | D        | E (correct) | Row Total
Upper 27%   | 2 (6.5%)  | 5 (16%)  | 0       | 0        | 24 (77.5%)  | 31
Middle 46%  | 3 (6%)    | 14 (26%) | 2 (4%)  | 1 (2%)   | 33 (62%)    | 53
Lower 27%   | 5 (16%)   | 7 (23%)  | 5 (16%) | 2 (6%)   | 12 (39%)    | 31
Grand Total | 10 (8.5%) | 26 (23%) | 7 (6%)  | 3 (2.5%) | 69 (60%)    | 115

These results suggest that distractor B might provide the greatest clue about the breakdown in students' understanding, followed by distractor A, then C.
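The detailed response-pattern tally above can be scripted in the same way as the discrimination index. The sketch below is a hedged illustration: `choices` and `totals` are assumed data structures (one chosen option letter and one total test score per student), and none of the names come from EAC Outcomes.

```python
# Minimal sketch of a distractor-level analysis for one multiple-choice item.
# `choices` holds each student's selected option (e.g., "B"); `totals` holds
# overall test scores. Names and layout are illustrative assumptions.

from collections import Counter

def response_pattern(choices, totals, options="ABCDE", portion=0.27):
    """Tally option choices within the upper, middle, and lower score groups."""
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    n = round(portion * len(order))  # 27% convention for the extreme groups
    groups = {
        "Upper": order[:n],
        "Middle": order[n:len(order) - n],
        "Lower": order[len(order) - n:],
    }
    pattern = {}
    for name, members in groups.items():
        counts = Counter(choices[i] for i in members)
        pattern[name] = {opt: counts[opt] for opt in options}
    return pattern
```

Comparing the Lower row against the Upper row of the resulting table highlights which distractors attract struggling students, as distractor B does in the example above.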