Developing Selected-Response Items

Two General Types of Paper and Pencil Tests
Selected-Response Items:
– Binary-response items (e.g., true-false).
– Multiple-choice items.
– Matching items.
Constructed-Response Items:
– Fill-in-the-blank items.
– Completion and short-answer items.
– Essay items.

General Item Writing Rules
(These rules are available on the website for this course)
• Provide clear and understandable directions to students about how to respond.
• Be sure the items themselves are clear (unambiguous) to students.
• Do not provide unintentional cues regarding the correct response.
• Use grammar and vocabulary consistent with the source of instruction.
• Keep reading level below students’ ability.
• Format the item for efficient scoring.
• Be sure content experts would agree on the correct answer.
• WRITE THE ITEM SO THAT IT MEASURES THE SPECIFIED LEARNING TARGET.

Additional Rules for Binary-Choice Items
Binary-choice (or alternate-choice) items present a proposition for which one of two opposing options represents the correct answer. Several variants exist:
• True-false.
• Fact-opinion.
• Right-wrong.
• Yes-no.

Binary-Choice Items (Variations: Embedded true-false items)
Indicate whether each underlined word is used as a verb (V) or as something other (O) than a verb.
Sailing has many advantages as a recreational sport. You can sail by yourself or with others. While basic techniques can be learned quickly, you can spend a lifetime developing your sailing skills.
Answers:
V  O  has
V  O  with
V  O  spend
V  O  as
V  O  While
V  O  sailing
V  O  can
V  O  techniques
V  O  sail
V  O  learned

Binary-Choice Items (Variations: Multiple true-false items)
Read each option and indicate which are correct.
In comparison with multiple-choice items, an advantage of true-false items is …
1. more items can be administered within a given time.
2. higher reliability is obtained from a given number of items.
3. each test item can be developed in less time.
4.
students will select the correct answer only when they have achieved the skill being assessed.

Binary-Choice Items: Advantages and Limitations
Advantages:
• Allows adequate sampling of (usually, knowledge-level) content.
• Relatively easy to construct.
• Objectively and efficiently scored.
Limitations:
• Highly susceptible to guessing.
• Can be used only when dichotomous answers represent sufficient response options.
• Usually, only indirectly assess intellectual skills.

Binary-Choice Items: Qualities of Good Binary-Choice Items
• Good binary-choice items should …
– Measure the specified skill (learning target).
   • This requires some serious thinking.
– Require an appropriate level of reading skill.
– Emphasize adjectives or adverbs when they alter or reverse the meaning of the item.
– Have one (of two) response options that is unequivocally correct.
– Exclude adjectives and adverbs that imply an indefinite degree.
– Avoid adjectives and adverbs that imply absolute meaning.
– Be stated as simply as possible (e.g., should exclude “window dressing”).
– Be written so that the incorrect response is plausible.
– Present a single proposition (not a double-barreled proposition).

Binary-Choice Items: Tip for Improving the Quality: Use Contrasts
Without contrast: The reliability of short-answer tests is unaffected by guessing.
With contrast: The reliability of short-answer tests is less affected by guessing than is the reliability of multiple-choice tests.

Binary-Choice Items: Examples of “Double-Barreled” Propositions
• Although essay tests require less time to construct than do multiple-choice tests, they require more time to score.
• Classroom tests should be reliable and yield consistent scores across time.

Binary-Choice Items: Evaluation
Learning target: Information. Identify qualities desired in multiple-choice items.
Poor Item: TRUE or FALSE: It is not important for a multiple-choice item to contain five options.
Improved Item: TRUE or FALSE: A multiple-choice item can contain as few as two options.

Binary-Choice Items: Evaluation
Learning target: Information. Identify qualities desired in multiple-choice items.
Poor Item: T or F: Sometimes multiple-choice items are superior to true-false items.
Improved Item: T or F: A 10-item multiple-choice test typically will be more reliable than a 10-item true-false test.

Binary-Choice Items: Evaluation
Learning target: Information. Identify qualities desired in multiple-choice items.
Poor Item: T F Good multiple-choice items measure important skills.
Improved Item: T F If plausible distracters are easy to develop, a table of specifications is of little value when constructing multiple-choice items.

Multiple-Choice Items: Anatomy of Multiple-Choice Items
• MC items consist of …
– A stem,
   • Either a direct question, or
   • An incomplete statement to be completed.
– A correct answer, and
– Two or more distracters or foils.

Multiple-Choice Items: Advantages and Limitations
Advantages:
• Provide for a wide sampling of content.
• Effectively structure the problem to be addressed.
• Can be quickly and objectively scored.
Limitations:
• Somewhat susceptible to guessing.
• Indirectly measure targeted behaviors.
• Time-consuming to construct.

Multiple-Choice Items
Example: Because the options lack parallel content, this item may have more than one correct answer:
Which of the following represents the warmest temperature?
A. 100 degrees Celsius
B. 200 degrees Fahrenheit
C. 300 degrees Kelvin
D. an oven set at medium

Multiple-Choice Items: Qualities (Continued)
Options avoid repetitive words. Example of an item that repeats words in every option:
Criterion-referenced …
A. refers to how a test is constructed.
B. refers to how a test is interpreted.
C. refers to how a test is scored.
D. refers to how a passing score is established.
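The guessing limitation listed above ("highly susceptible" for binary-choice items, "somewhat susceptible" for multiple-choice items) can be made concrete with a short calculation. The sketch below is an illustration only: it assumes blind, independent guessing with equal probability on every option, which is an idealization not stated in the slides, and the function names and the 7-of-10 cutoff are hypothetical.

```python
from math import comb


def expected_chance_score(n_items: int, n_options: int) -> float:
    """Expected number correct when every item is answered by blind guessing."""
    return n_items / n_options


def prob_at_least(n_items: int, n_options: int, n_correct: int) -> float:
    """Binomial probability of guessing at least n_correct items correctly.

    Assumes each item is guessed independently with probability 1/n_options.
    """
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(n_correct, n_items + 1)
    )


if __name__ == "__main__":
    n_items = 10
    cutoff = 7  # hypothetical passing score, chosen only for illustration
    for label, n_options in [("true-false", 2), ("four-option multiple-choice", 4)]:
        chance = expected_chance_score(n_items, n_options)
        p_pass = prob_at_least(n_items, n_options, cutoff)
        print(f"{label}: expected chance score {chance:.1f}, "
              f"P(at least {cutoff} correct by guessing) = {p_pass:.4f}")
```

Under these assumptions, a student guessing on a 10-item true-false test reaches 7 or more correct about 17% of the time, against roughly 0.4% on a 10-item test with four options per item. This is one way to see why the improved item above describes the multiple-choice test as typically more reliable at the same length.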
Multiple-Choice Items: Qualities (Continued)
• Extraneous content (“window dressing”) is excluded.
• Adjectives or adverbs are highlighted when they reverse or alter the meaning of a stem.
• Words like not and except should be emphasized.
• These can be used, but only when it is important to do so.

Multiple-Choice Items
Which item stem contains window dressing?
A. What is the highest numerical value of a reliability coefficient?
B. Although usually not obtainable, the maximum value of a reliability coefficient is 1.0.

Multiple-Choice Items: Examples of Uses of Not and Except
1. Which of the following qualities least affects the reliability of a test?
2. All of the following represent types of validity EXCEPT …
3. The quality that is not an advantage of multiple-choice items is …
4. ALL BUT WHICH ONE of the following is ...

Multiple-Choice Items (Continued)
Sample item with equally plausible distracters:
Which item format requires students to spend the greatest portion of examination time actually solving problems presented by the items?
A. Essay
B. Short-answer
C. True-false
D. Multiple-choice

Multiple-Choice Items (Continued)
• Qualities desired in M-C items, continued
– Options contain grammar consistent with the item stem.
– “All of the above” or “none of the above” is used only when necessary.
– Options are arranged in “natural” or logical order.

Evaluating M-C Items
Poor item:
Internal consistency is high …
A. when students who scored high on the first half of the test score high on the second half of the test.
B. when students who scored high on the first half of the test score low on the second half of the test.
C. when students who scored high on the first half of the test score in an unpredictable manner on the second half of the test.
D. when all of the above are true.
Evaluating M-C Items
Improved item:
If the internal consistency of a test is good, how will a group of students score on the second half of the test if they got the highest scores on the first half of the test?
A. Highest scores.
B. Lowest scores.
C. Unpredictable scores.

Evaluating M-C Items
Learning Target: Identify characteristics of formal and informal assessments.
Which of the following is an example of informal assessment?
A. Allowing students a choice of which questions they will answer.
B. Not allowing students a choice of which questions they will answer.
C. Observing which students are paying attention.

Evaluating M-C Items
Poor item:
Various item formats have specific advantages and limitations. An advantage the essay format has over the multiple-choice format is:
A. the essay item can assess more skills in a given amount of time.
B. the essay item can assess students’ ability to evaluate ideas.
C. the essay item can be reliably scored.
D. the essay item requires students to communicate ideas in writing.

Evaluating M-C Items
Improved item:
Which is an advantage of essay over multiple-choice items?
A. Assess more skills in a given amount of time.
B. Assess students’ ability to evaluate ideas.
C. Evaluate students’ ability to communicate ideas.
D. Facilitate reliable scoring of answers.

Multiple-Choice Items: Item-writing Guidelines
1. Does the stem present a clearly stated problem or question?
2. Is extraneous content (“window dressing”) excluded from the stem?
3. Are adjectives or adverbs emphasized when they reverse or significantly alter the meaning of a stem or option?
4. Are negatives avoided wherever possible or highlighted where necessary?
5. Are the “correct” answers equally distributed across all choice categories?
6. Are options parallel in form and content?
7. Do the options avoid repetitive words?
8. Is each distracter plausible?
9. Is the grammar in each option consistent with the stem?
10.
Does the item exclude options equivalent to “all of the above” and “none of the above”?
11. Unless another order is more logical, are options arranged alphabetically?

Matching Items
• Anatomy of a matching item:
– Consists of
   • Premises (or stimuli) and
   • Responses.
– Advantages
   • Provides for wide sampling of knowledge targets.
   • Relatively easy to construct.
   • Can be scored objectively and efficiently.

Specific Item-Writing Guidelines for Matching Items
1. Include homogeneous premises and responses.
2. Use more responses than premises.
3. Make sure directions are clear to students.
4. Keep responses short and logically ordered.
5. Use four to ten premises (and restrict to one page).
6. Avoid grammatical clues to correct answers.

Developing Constructed-Response Items
Paper & pencil constructed-response items, that is.
• Major advantage of constructed-response items:
– They elicit responses that more closely resemble real-life behavior.
• In general, however, if a selected-response item can provide the same evaluative information as a constructed-response item, use the selected-response item.

Short-Answer Items: Advantages and Disadvantages
Advantages:
1. Easy to construct.
2. Require the student to supply an answer.
3. Many such items can be included in a test.
Disadvantages:
1. Generally limited to knowledge-level skills.
2. More likely to be scored erroneously than are selected-response items.

Short-Answer Items: Item-writing Rules
1. Use direct questions rather than incomplete statements.
2. Write items so that the correct response is concise (a few words or a short phrase).
3. Write items so that they can be scored efficiently.
4. Be sure there is a highly limited set of correct responses.
5. Think of the correct response, then write the item.

Completion Items: Item-writing Rules
Same advantages/disadvantages as short-answer items.
The same rules apply to completion items, plus these additional four:
1.
Be sure the blank represents a key word or phrase.
2. Position the blank at or near the end of the item.
3. Keep blanks the same length.
4. Use no more than two or three blanks.

Essay Items: Advantages
Unique advantage: Can assess ability to communicate in writing (synthesize, evaluate, compose).
Other advantages:
1. Provide more direct measures of behaviors specified in performance objectives.
2. Require the student to produce a response.

Essay Items: Limitations
• Scoring is less reliable (more subjective).
1. Inconsistent within teachers across multiple scorings of the same responses.
2. Inconsistent within teachers across students.
3. Inconsistent among teachers on the same responses.
• Provides less adequate sampling of content domain.
• More time-consuming to score.

Essay Item-Writing Rules
1. Convey a clear idea of how extensive a response is expected:
– Ten minutes or less (typical for restricted-response essays).
– Specify a range for the number of words or the amount of time to be spent on the response.
– Make the distribution of points obvious.
2. Develop a suitable scoring plan (rubric):
– Would different readers assign the same score?
– Describe what constitutes a correct and complete response.
– The rubric should be obvious to knowledgeable students.
– You do not have an essay item unless you have a rubric!

Essay Item-Writing Rules (Continued)
3. Do not allow a choice of which items to answer.
4. Evaluate all responses one item at a time.
5. Vary the student order when reading responses.
6. Decide beforehand on the weight grammar and vocabulary will carry.
7. Conceal the identity of students, if possible.
8. Use multiple scorers, when possible.

Multiple-Choice Item Flaws: Examples

M-C Item Flaws
Which best describes what happens when work is done?
A. A force operates through a distance.
B. A force is exerted.
C. Energy is destroyed.
D. Potential energy is changed to kinetic energy.
[A] Flaw: Using stereotyped phrases.
Item can be answered correctly based on recall of verbal information as well as through understanding of the principle involved.

M-C Item Flaws
Which of the following has helped most to increase the length of human life?
A. Fast driving.
B. Avoidance of overeating.
C. Wider use of vitamins.
D. Wider use of inoculation.
[D] Flaw: Highly implausible distracter. Choice “A” is unreasonable, reducing the item to a three-choice item.

M-C Item Flaws
Horace Greeley is known for his
A. advice to young men not to go west.
B. discovery of anesthetics.
C. editorship of the New York Tribune.
D. humorous anecdotes.
[C] Flaw: Verbal trick in distracter. Choice “A” inserts the word not into a phrase otherwise attributable to Horace Greeley.

M-C Item Flaws
Slavery was first started
A. at Jamestown settlement.
B. at Plymouth settlement.
C. at a settlement in Massachusetts.
D. a decade before the Civil War.
[A] Flaw: Non-parallel distracters. Choices “A” and “B” give specific places, “C” designates a more general area, and “D” specifies a time. This ambiguity makes more than one choice correct.

M-C Item Flaws
In purifying water for a city water supply, one process is to have the impure water seep through layers of sand and fine and coarse gravel. Here many impurities are left behind. Below are four terms, one of which will describe this process better than the others. Select the correct one.
A. Sedimentation
B. Filtration
C. Chlorination
D. Aeration
[B] Flaw: Stem includes an “instructional aside.”

M-C Item Flaws
While ironing her formal, Jane burned her hand accidentally on the hot iron. This was due to a transfer of heat by
A. conduction.
B. radiation.
C. conversion.
D. absorption.
[A] Flaw: Stem includes “window dressing.” The introduction implies a practical problem when the item only involves knowledge of technical terms.

M-C Item Flaws
In the definition of a mineral, which of the following is incorrect?
A. It was produced by geologic processes.
B. It has distinctive physical properties.
C. It contains one or more elements.
D. It has a variable chemical composition.
[D] Flaw: Uses a negative in the stem; tends to be confusing. These types of items are rarely found outside the classroom.

M-C Item Flaws
Which event is more important in American history?
A. Braddock’s defeat.
B. Burr’s conspiracy.
C. Hayes-Tilden contest.
D. Webster-Hayne debate.
Flaw: No best answer. Who’s to say which is more important? Even experts would not agree.

M-C Item Flaws
The population of Denmark is about
A. 2 million.
B. 15 million.
C. 4 million.
D. 7 million.
Flaw: Unnatural sequence of responses. It would be better to order from 2 million to 15 million.

M-C Item Flaws
The balance sheet report for the Ajax Canning Company would reveal (A) the company’s profit for the previous fiscal year, (B) the amount of money owed to its creditors, (C) the amount of income tax paid, or (D) the amount of sales for the previous fiscal period.
[A] Flaw: Placing distracters in tandem with the item stem.

M-C Item Flaws
Which is the best definition of a vein?
A. A blood vessel carrying blood going to the heart.
B. A blood vessel carrying blue blood.
C. A blood vessel carrying impure blood.
D. A blood vessel carrying blood away from the heart.
[A] Flaw: Needless repetition in the distracters.
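A few of the flaws illustrated above are mechanical enough to screen for automatically: an unnatural sequence of numeric options (the Denmark item), needless repetition at the start of every option (the vein item), and an uneven spread of keyed answers across option positions (guideline 5). The sketch below is one possible way to do that; the function names and the sample answer key are hypothetical, and it assumes every numeric option contains at least one number. It does not judge plausibility, parallel content, or whether a best answer exists.

```python
import re
from collections import Counter


def numeric_options_in_order(options: list[str]) -> bool:
    """True when the first number found in each option is in ascending order.

    Assumes every option contains a number; otherwise re.search returns None.
    """
    values = [float(re.search(r"\d+(?:\.\d+)?", opt).group()) for opt in options]
    return values == sorted(values)


def shared_leading_words(options: list[str]) -> str:
    """Return the run of words that every option starts with ('' if none)."""
    shared = []
    for words in zip(*(opt.split() for opt in options)):
        if len(set(words)) != 1:
            break
        shared.append(words[0])
    return " ".join(shared)


def key_distribution(answer_key: str) -> Counter:
    """Count how often each option letter is the keyed answer."""
    return Counter(answer_key)


if __name__ == "__main__":
    denmark = ["2 million.", "15 million.", "4 million.", "7 million."]
    vein = [
        "A blood vessel carrying blood going to the heart.",
        "A blood vessel carrying blue blood.",
        "A blood vessel carrying impure blood.",
        "A blood vessel carrying blood away from the heart.",
    ]
    print(numeric_options_in_order(denmark))   # options as presented in the item
    print(shared_leading_words(vein))          # words that could move to the stem
    print(key_distribution("ABCDABCDAB"))      # hypothetical 10-item answer key
```

Any words returned by `shared_leading_words` are candidates for moving into the stem, for example "A vein is a blood vessel carrying" followed by four shorter options.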