Lecture 6: Writing Objective Test Items

I. Writing Objective Test Items

1. VIP: Objective vs. Subjective Scoring Methodologies

Objective scoring does not require the scorer to make a decision about the correctness of the answer. A response is either right or wrong: it either matches the key or it does not. A machine could easily accomplish this task.

Subjective scoring requires the scorer to make judgments about the correctness and/or quality of the response. It may be done holistically (all or nothing) or analytically (partial credit). Analytic scoring requires the use of a checklist, a rating scale, or some combination of the two. Subjective scoring is necessarily less reliable than objective scoring.

Objective test items require students to select a correct or best answer. The items are "objective" because that is how they are scored. Objectively scored items are called selected-response items. These include alternate-response, matching, multiple-choice, and sometimes keyed-response items.

Subjective test items (and alternative assessments) require students to create or construct a correct or best answer. The items (or assessments) are "subjective" because that is how they are scored. Subjectively scored items are called constructed-response items. These include keyed-response, fill-in-the-blank, short-answer, and essay items. Alternative assessments, such as performance and product-oriented tasks, are also scored subjectively. They are called "alternative assessments" because they are an alternative to paper-and-pencil assessments.
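Because objective scoring involves no judgment, it is easy to automate. Below is a minimal sketch of key-match scoring; the item numbers, answer key, and student responses are hypothetical examples, not from any real test:

```python
# Minimal sketch of objective (key-match) scoring.
# The answer key and student responses are hypothetical examples.

answer_key = {1: "B", 2: "T", 3: "C", 4: "F"}
student    = {1: "B", 2: "F", 3: "C", 4: "F"}

# Each response is simply right or wrong: it matches the key or it does not.
score = sum(1 for item, keyed in answer_key.items()
            if student.get(item) == keyed)

print(f"Score: {score}/{len(answer_key)}")  # Score: 3/4
```

No judgment about quality is involved anywhere in this process, which is exactly why objectively scored items yield more reliable scores.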
2. Objective items are popular with test writers because:
   1. They can be used to measure many types and levels of learning.
   2. They can test a lot of content because items can be answered quickly.
   3. They are easy to administer, score, and analyze.
   4. They yield very reliable test scores.

3. Congruence of Items with Objectives

Help ensure that your score-based decisions are valid (you measured what you intended to measure) by paying attention to the "match" between the behavior described in the objective and the task the student will perform for the assessment. Every item on your test (or task in your assessment) must have a corresponding objective that it is measuring. A Table of Specifications provides evidence of this process. Analyze the performance described in the instructional objective and create a test item or assessment task that calls forth that performance.

Examples:

Objective: Define reliability.
Item: Write a one-sentence definition of reliability.

Objective: Distinguish between norm-referenced and criterion-referenced interpretations of test scores.
Item: Read each of the following statements. Indicate whether each statement represents a criterion-referenced interpretation or a norm-referenced interpretation of student performance. Circle the letter "C" if you think the interpretation is criterion-referenced or the letter "N" if you think it is norm-referenced. (1 point each)

   C  N   1. Julie earned the highest score in her class.
   C  N   2. Joe can recite the entire alphabet.

Objective: Identify key characteristics of validity.
Item: 1. Which of the following is NOT a key characteristic of validity?
   A. Validity is a matter of degree, not all-or-nothing
   B. Validity is specific to some particular use or interpretation
   C. Validity involves an overall evaluative judgment
   D. Validity is necessary to ensure reliability

4. Developing Selected-Response Test Items

A. Alternative-Response Items (a.k.a. True/False, binary choice)

There are two choices to select from: T/F, Y/N, Noun/Verb, etc.

PROS:
- Easy to score
- Can cover the most content in the least amount of time and space
- Yields the most reliable test scores (objective scoring + many items)

CONS:
- 50% chance of a correct guess on a single item! (See the sketch at the end of this subsection.)
- Can only measure one concept at a time
- Can be difficult to write well

B. Alternate-Choice Guidelines

1. Make sure that the choices (T/F, Y/N, Norm/Criterion-Referenced) logically match the judgments.

2. Include only one idea per statement.
   POOR Example: A worm cannot see because it has simple eyes.
   BETTER Example: Worms have simple eyes.

3. Rarely use negative statements (never use them with very young children or children with special needs!), and never use double negatives. Always bold or underline negative words in your items.
   POOR Example: None of the steps in the experiment was unnecessary.
   BETTER Example: All of the steps in the experiment were necessary.
   OR: Some of the steps in the experiment were unnecessary.

4. Avoid providing unintended clues.
   a. Make all statements approximately equal in length.
   b. Use approximately equal numbers of each alternative (60-40 works better than 50-50).
   c. Avoid using absolutes (e.g., always or never).

5. Never quote statements directly from the text. Quotes taken directly from the text often lose their meaning without the context of the paragraph they are taken from.

HINT: If you can write both a true and a false version of an alternate-choice item, it is probably a good alternate-choice item.
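The 50% guessing chance flagged in the CONS list above matters much less as the number of items grows. Here is a short worked sketch of the odds of a blind guesser reaching a given score; the 10-item test length and 7-correct cutoff are hypothetical choices for illustration:

```python
# Probability that pure guessing earns at least `cutoff` correct answers
# out of `n` true/false items. Test length and cutoff are hypothetical.
from math import comb

n, p = 10, 0.5    # 10 items; 50% chance of guessing each one correctly
cutoff = 7        # hypothetical passing score: 7 of 10 correct

prob = sum(comb(n, k) * p**k * (1 - p)**(n - k)
           for k in range(cutoff, n + 1))
print(f"P(passing by blind guessing): {prob:.3f}")  # ~0.172
```

A single item is a coin flip, but the odds of guessing a good score across many items drop quickly, which is why alternate-choice tests depend on having many items.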
C. Matching Items

Consist of premises (the questions) and responses. They are used to measure knowledge of simple associations, such as between terms and their definitions.

PROS:
- Can measure many facts/associations quickly
- Easy to construct
- Easy to score

CONS:
- Requires homogeneous material
- Usually limited to measuring factual information
- Can be difficult to format (practice!!)

D. Matching Guidelines

1. Provide clear directions. Describe how students are to respond and what type of judgment they are making. Refer to the column labels to describe what they need to do. Indicate whether items may have more than one correct answer and whether responses may be used once, more than once, or not at all.

2. Use only homogeneous material for each set of items to avoid giving clues to the correct response.

   POOR Example:
   Column A                                         Column B
   1. ___ Third level in Bloom's Taxonomy           a. Test
   2. ___ Criteria for writing objectives           b. Application
   3. ___ An instrument used to collect             c. Measurable
          samples of student behavior               d. Observable

3. Label your columns so you can refer to those labels in your directions.

4. Keep the responses brief and arrange them in logical order (e.g., alphabetically, numerically, chronologically). Premises are read once; the list of responses is read many times. Students should be able to read over that list quickly.

5. Present more responses than premises to avoid providing unintended clues to the correct answer. This eliminates determining a correct answer through the process of elimination.

   POOR Example:
   Column A                                         Column B
   1. ___ First United States Astronaut             a. Alexander Graham Bell
   2. ___ First president of the U.S.               b. Christopher Columbus
   3. ___ Invented the Telephone                    c. John Glenn
   4. ___ Discovered America                        d. George Washington

6. Place all premises and responses for a matching item on a single page.

E. Keyed-Response Items

These are a cross between matching and fill-in-the-blank item types. Questions are presented with a key word missing (the blank). Responses are presented as a "word bank" or "answer bank." Students write the missing word from the bank on the blank you provide.

PROS:
- Can measure many facts quickly
- Helps develop early writing skills

CONS:
- Usually limited to lower-level outcomes and factual information
- Usually scored subjectively, so less reliable than other selected-response items

We recommend using these items to help young students get used to writing on tests and to reinforce writing and spelling skills in general. However, using this type of item can introduce new sources of error to your scoring scheme. Consider these things:
1. Does spelling count? (See the sketch below.)
2. What about student-invented correct answers?
3. Will scores be lower due to the time needed for writing?
4. Remember, it's not a fill-in-the-blank; you're testing recognition, not recall (so keep the point values low).
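If you decide spelling should not count, you can make keyed-response scoring more consistent (and closer to objective) by accepting near-miss spellings mechanically rather than case by case. A minimal sketch, assuming a hypothetical keyed word and a similarity cutoff of 0.8 chosen purely for illustration:

```python
# Lenient scoring for keyed-response items when spelling does not count.
# The example words and the 0.8 similarity cutoff are hypothetical.
from difflib import SequenceMatcher

def matches(written: str, keyed: str, cutoff: float = 0.8) -> bool:
    """Accept the response if it is close enough to the keyed word."""
    ratio = SequenceMatcher(None, written.strip().lower(), keyed.lower()).ratio()
    return ratio >= cutoff

print(matches("Tallahasee", "Tallahassee"))  # True  -- near-miss spelling accepted
print(matches("Tampa", "Tallahassee"))       # False -- a different answer entirely
```

Whatever rule you choose, decide on it before scoring and apply it uniformly; ad hoc judgments about spelling are exactly what makes this item type less reliable.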
F. Multiple-Choice Items

Consist of a stem (the question) and the alternatives, which are the distractors and one correct answer.

PROS:
- Allow for diagnosis of anticipated errors and misconceptions when plausible alternatives are used
- Can be used to measure high levels of learning!
- Can measure multiple concepts in one item
- Do not require homogeneous material
- Easy to administer and score

CONS:
- Can be difficult to write plausible alternatives, so time-consuming
- Possible to guess correct answers when distractors are not plausible or clues are given in the question

G. Multiple-Choice Guidelines

1. The stem may ask a direct question or be presented as an incomplete statement. Some people use the incomplete-statement version, but these are more ambiguous and are confusing to young children and students with learning difficulties. If you choose to use the incomplete-statement format, put the blank at the END of the statement.

   Examples:

   Direct Question: What is the capital of Florida?
   a. Miami
   b. Tallahassee
   c. Tampa
   d. Ft. Lauderdale

   Incomplete Statement: The capital of Florida is ___________.
   a. Miami
   b. Tallahassee
   c. Tampa
   d. Ft. Lauderdale

2. The stem should be clearly stated and should provide as much of the information as possible to keep the responses short. So, stuff the stem with information and make the alternatives as short as possible.

   POOR Stem Example: South America ___________.
   A. is a flat, arid country
   B. imports coffee from the United States
   C. has a larger population than the United States
   D. was settled mainly by colonists from Spain

3. Avoid negatively stated items. Always emphasize negatives when you do use them.

4. Avoid unintended clues by making all responses grammatically consistent with the stem (e.g., verb tenses, singular/plural, a/an).

5. All responses should be plausible, but one should be clearly best.

6. Avoid unintended clues by not having verbal associations between the stem and a response.

   POOR Example: Which of the following agencies should you contact to find out about a tornado warning in your locality?
   A. State farm bureau
   B. Local radio station
   C. United States Post Office
   D. U.S. Weather Bureau

7. Responses should be approximately the same length. (Teachers tend to make correct responses longer.)

8. Correct responses should appear approximately the same number of times in each position (A, B, C, or D), in random order, i.e., no pattern in the answers. (C is the most common correct answer on most teacher-made tests; a short script for auditing your key appears at the end of this lecture.)

9. Avoid "none of the above" and "all of the above." Teachers tend to use these when they can't think of another distractor, so they usually are not the correct response and don't distract anyone; they are just a waste of space on the test. However, "none of the above" can be useful on some math tests because it forces students to work problems all the way out.

   POOR Example: Which of the following birds does not fly?
   A. Turkey
   B. Heron
   C. Lark
   D. None of the above

5. Writing Directions for Objective Test Items

1. State the skill measured.
2. Describe any resource materials required to answer the item.
3. Describe how students are to respond and where.
4. Describe any special conditions.
5. Give point values.

Example: Identifying States. Locate the map of the United States at the bottom of the page. Ten of the states on the map have been numbered. Write the name of each of the numbered states on the corresponding lines below. Spelling must be correct for full credit to be awarded. (2 points each)

6. Assemble the Test

A. Provide a place for the student's name, class period, etc.
B. Cluster items by type, but start with the easiest items and finish with the most difficult.
C. Provide directions for each type of item, including point values.
D. Ensure that item order does not contribute to the complexity of the test or give unintended clues. (Using the order of presentation has been shown to improve scores among younger students.)
E. Consider ease of scoring. Place blanks flush left on the page or at the end of fill-in-the-blank items.
F. Use clearly visible graphics with a professional appearance. Sloppy graphics are hard to read and interpret, which increases the error of measurement.

7. Check for Quality

1. Write the test at least several weeks before you plan to administer it. Read over the test several days after finishing it; you will find things to correct! Make out the answer key before you copy the test!!!!!
2. Take the test yourself to check for errors and time restrictions. Remember, it will take your students 4 or 5 times longer than you to take the test.
3. Have a colleague take the test to check for errors and time restrictions.
4. Get feedback from your high-achieving students about the clarity of directions and items. Use their suggestions to improve your test.
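Once the answer key is made out, the position-balance check from guideline G.8 is easy to fold into your quality checks. A minimal sketch; the answer key below is a hypothetical example:

```python
# Audit a multiple-choice answer key for position balance (guideline G.8).
# The answer key below is a hypothetical example.
from collections import Counter

answer_key = ["C", "C", "A", "C", "B", "C", "D", "C", "A", "C"]

counts = Counter(answer_key)
for position in "ABCD":
    print(f"{position}: {counts[position]}")  # A: 2, B: 1, C: 6, D: 1

# A heavily skewed distribution (like the six C's here) suggests the key
# needs reshuffling before the test is copied.
```

Rearranging the alternatives within the offending items (and updating the key to match) fixes the skew without changing the items themselves.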