Test Construction
Sue Brookhart
December 1, 2014

Introductions
• Sue Brookhart, Ph.D.
• Juliette Lyons-Thomas, Ph.D. (Fellow, Regents Research Fund)

Webinar Norms
• All phones will be placed on mute
• If you have a question, you can type into the chat box, and your question will be addressed during a break
  - The chat box icon is located at the top right-hand corner of your screen (remember to direct your chat to “Everyone”)
• At the end of the webinar, you will be asked to fill out a survey based on your experience today

Learning Outcomes
• The TITC grant has emphasized that in some cases, districts may need to create or alter existing assessments, based on the results of the assessment review
• The purpose of this webinar is to help attendees better understand the components of test construction, including test blueprints, linking item performance to learning objectives, and item writing.

A Sad Tale
Every Friday Story Test
15 points – vocabulary
5 points – comprehension
20 points total

Example – A Test Blueprint for Friday Story Test
Learning Objective | Remember | Understand | Analyze/Create | Total
Know new vocabulary words | 5 | - | - | 5 (17%)
Use new vocabulary words in sentences | - | 5 | - | 5 (17%)
Understand the main points in the story | - | 10 | - | 10 (33%)
Connect elements from the story (character, plot, or setting) with own life or other texts. | - | - | 10 | 10 (33%)
Total | 5 (17%) | 15 (50%) | 10 (33%) | 30 (100%)

Example – A Simpler Blueprint for Friday Story Test
Content | Remember | Understand | Analyze/Create | Total
Vocabulary words | 5 | 5 | - | 10 (33%)
Elements from the story (character, plot, or setting) | - | 10 | 10 | 20 (67%)
Total | 5 (17%) | 15 (50%) | 10 (33%) | 30 (100%)

What is a Test Blueprint?
• A table with rows and columns that is a plan for the way in which the questions in a test will be distributed
  - Rows show number of questions and marks for each topic/standard
  - Columns show number of questions and marks for depth of thinking levels
• Other names for this: test specification, specification matrix, test plan

Use a Blueprint
• To plan individual tests
• Blueprint (or table of specifications) includes:
  - Content
  - Thinking skills
  - Specific learning targets
  - Emphasis (weight)
• This information helps you write a test that is interpretable as you intend

Example of a Test Blueprint for a Middle School Science Unit
Content Outline | Remember | Understand | Apply | Total Points | %
Basic Parts of Cell | Name and tell function of nucleus, cytoplasm, cell membrane; Label parts of a cell on a line drawing (12 points) | - | Given photos of actual plant and animal cells, label the parts (4 points) | 16 | 40
Plant vs. Animal Cells | - | Explain differences between plant & animal cells; Describe cell walls & cell membrane (4 points) | - | 4 | 10
Cell Membrane | Define diffusion; List substances diffused and not diffused by cell membrane (6 points) | - | Distinguish between diffusion and osmosis (2 points) | 8 | 20
Division of Cells | Define division, chromosomes, and DNA (4 points) | Explain differences between plant and animal cell division (4 points) | Given the numbers of chromosomes in a cell before division, state the number in each cell after division (4 points) | 12 | 30
Total Points | 22 | 8 | 10 | 40 | 100
% | 55 | 20 | 25 | 100 |
Name at least three things you notice.

Example of a Different Format for a Test Blueprint for a Middle School Science Unit
Learning Objective | Remember | Understand | Apply | Total
Identify basic parts of cell | 12 | - | 4 | 16 (40%)
Distinguish between plant & animal cells | - | 4 | - | 4 (10%)
Describe diffusion and the function of cell membrane | 6 | - | 2 | 8 (20%)
Understand the process of cell division | 4 | 4 | 4 | 12 (30%)
Total | 22 (55%) | 8 (20%) | 10 (25%) | 40 (100%)
How is this example different from the previous example?
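
The totals and percentages in the science-unit blueprint can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the function names (`row_totals`, `column_totals`, `percent`) are assumptions made for this example, not part of the webinar materials. The point values are copied from the blueprint slide.

```python
# Hypothetical representation of the science-unit blueprint:
# learning objective -> {thinking level: points}. Empty cells are omitted.
LEVELS = ("Remember", "Understand", "Apply")

blueprint = {
    "Identify basic parts of cell": {"Remember": 12, "Apply": 4},
    "Distinguish between plant & animal cells": {"Understand": 4},
    "Describe diffusion and the function of cell membrane": {"Remember": 6, "Apply": 2},
    "Understand the process of cell division": {"Remember": 4, "Understand": 4, "Apply": 4},
}


def row_totals(bp):
    """Points per learning objective (the blueprint's right-hand Total column)."""
    return {objective: sum(cells.values()) for objective, cells in bp.items()}


def column_totals(bp):
    """Points per thinking level (the blueprint's bottom Total row)."""
    return {level: sum(cells.get(level, 0) for cells in bp.values()) for level in LEVELS}


def percent(points, total):
    """Share of the test, rounded to a whole percent."""
    return round(100 * points / total)


rows = row_totals(blueprint)
cols = column_totals(blueprint)
grand_total = sum(rows.values())

for objective, points in rows.items():
    print(f"{objective}: {points} ({percent(points, grand_total)}%)")
for level, points in cols.items():
    print(f"{level}: {points} ({percent(points, grand_total)}%)")
print(f"Total: {grand_total} (100%)")
```

Laying the plan out this way makes the emphasis visible before any items are written: here 55% of the points sit at the Remember level, which is the kind of thing the "Name at least three things you notice" prompt invites you to spot.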

Example of a Different Format for a Test Blueprint for a Middle School Science Unit
Learning Objective | Remember | Understand | Apply | Total
Identify basic parts of cell | 12 | - | 4 | 16 (40%)
Distinguish between plant & animal cells | - | 4 | - | 4 (10%)
Describe diffusion and the function of cell membrane | 6 | - | 2 | 8 (20%)
Understand the process of cell division | 4 | 4 | 4 | 12 (30%)
Total | 22 (55%) | 8 (20%) | 10 (25%) | 40 (100%)
Unit learning outcomes go here.

What is a Multiple-Choice Item?
• A multiple-choice item consists of one or more introductory sentences followed by a list of two or more suggested responses. The student must choose the correct answer.

Align requirements for multiple choice item performance to learning objectives
• Content
• Performance
• Thinking skills
  - Objective items, especially multiple choice, can tap higher-order thinking if carefully written
• Clear presentation to students of what is required for each task or item

Multiple Choice Items
Which president of the United States was elected to four terms? (stem)
a. Abraham Lincoln (distractor)
b. Theodore Roosevelt (distractor)
*c. Franklin D. Roosevelt (key)
The distractors and the key together are the alternatives.

Guidelines for writing Multiple Choice Items
• Assess an important aspect of the unit’s instructional targets.
• Match your assessment plan in terms of performance, emphasis, and number of points.
• Ask a direct question or set a specific problem.
• Put the alternatives at the end.
• Put repeated words in the stem.

Guidelines for writing Multiple Choice Items
• Place the word in the stem and definitions in the alternatives, if testing definitions.
• Avoid “cluing” and “linking” (where the correct answer of one item depends on another item).
• Avoid textbook wording.
• Use simple vocabulary and sentence structure.

Guidelines for writing Multiple Choice Items
• Use consistent, correct punctuation and grammar relative to the stem.
• Avoid phrasing the item so the student’s personal opinion is an option.
• Arrange alternatives in a logical order.
• Have distractors that would be plausible to nonknowledgeable students.

Guidelines for writing Multiple Choice Items
• Have homogeneous alternatives.
• Have distractors based on common errors or misconceptions if possible.
• Have 3 to 5 functional alternatives.
• Have one correct or best answer.
• Avoid “all of the above” and use “none of the above” sparingly.

Evaluate the Stems
A. Why did housing prices drop so rapidly in 2008?
B. Which one of the following statements is true about housing prices?
Which one is better? Why?

Evaluate the Stems
A. An orangutan is a
B. Orangutans are classified as
Which one is better? Why?

Evaluate the Stems
A. Brass, which is used in decoration, musical instruments, and plumbing supplies, to name only a few, is made from
B. Brass is made from
Which one is better? Why?

Evaluate the Stems
A. The man who first explored Florida was
B. The Spaniard who first explored Florida was
Which one is better? Why?

Evaluate the Stems
A. Which of the following illustrates what is meant by condensation?
B. Which of the following does not illustrate what is meant by condensation?
Which one is better? Why?

A question for you
Which of the following provides the best stem for a multiple-choice item?
a. Penicillin is
b. Penicillin was discovered by
c. Penicillin, which has many uses in medicine, was discovered by
d. Who discovered penicillin?

A question for you
Which of the following provides the best stem for a multiple-choice item?
a. Which of the following did not contribute to the Great Depression?
b. One major factor that contributed to the Great Depression is
c. The Great Depression was
d. The Great Depression was caused by

A question for you
What is wrong with the stem of the following multiple-choice question? “Which of the following states is the largest state in the United States?”
a. Largest can be measured either geographically or by population.
b. It measures opinion rather than fact.
c. It measures only a lower order skill.
d. It should be posed as a statement instead of a question.

A question for you
Which of the following sets of alternatives would be best for a multiple-choice item about a battle in the Civil War?
a. Davis, Grant, Lincoln, none of the above
b. Lincoln, Mason-Dixon Line, Sherman, Vicksburg
c. Grant, Jackson, Lee, Sherman
d. Jefferson, Lincoln, Roosevelt, Washington

A question for you
Which of the following sets of alternatives is best for the following multiple-choice item: "The perimeter of a rectangle 4 inches long and 2 inches wide is ______"?
a. 6 inches, 8 inches, 12 inches
b. 2 inches, 12 inches, 24 inches
c. 11 inches, 12 inches, 13 inches

Circle the ball.

Context-dependent item sets
• Use introductory material
  - Readings
  - Tables, graphs, or charts
  - Pictures
  - Formulas, lists of terms or symbols
• Write a set of items requiring students to interpret the material

Why use Introductory Materials?
• Introductory materials (readings, graphs, tables, and maps) used in context-dependent item sets can help assess higher-order thinking.
• They give students something to think about.

Context-dependent item sets
• A good way to assess higher-order thinking
• The introductory material allows you to present novel material to students
• The questions can then be about interpreting, not recalling, material

Jennifer drew what the Moon looked like just after sunset every third or fourth night. Her drawings for the nights she observed the Moon are shown below. On Night 11 the clouds were so thick that Jennifer could not see the Moon. Based on the drawings for the other nights, what would Jennifer have seen on Night 11 if the sky were clear?
[Jennifer's drawings and answer choices A to D are images, not reproduced here]

According to the map, which of the following does the United States both export to Canada and import from Canada?
[Map not reproduced here]
A. Cars
B. Iron
C. Aluminum
D. Coal

Frontier Women
Like the early colonial women settlers of the backwoods, frontier women made everything their families needed. Most began work at daybreak and did not rest until late evening. They cooked, spun cloth, made clothing, raised children, and tried to keep their dirt homes clean. They cleared and plowed fields, tended and harvested crops, milked the cows, raised hogs, rode and trained horses, and did just about every chore on the farm. The women not only worked, they also made most of their own tools. To make pitchforks, they attached handles to deer antlers. Many of the women learned to use a knife well enough to carve spoons, forks, and bowls out of animal bones. They fashioned cups and containers out of vegetable gourds and animal horns.
Which statement best describes the frontier women?
A. They lived dangerous lives and tamed the West.
B. They hunted to provide food for their families.
C. They frequently worried about the safety of their homes.
D. They worked hard and possessed many skills.

What is a constructed-response item?
• Constructed response test items ask students to compose their responses, and are scored with a judgment of the quality of those responses.

Types of Constructed Response Items
• Restricted response essay items limit both the content of students’ answers and the form of their written responses.
• Extended response essay items require students to express their own ideas and to organize their own answers.
• Show-and-explain-the-work problems on math and science tests are also constructed response items.

Restricted Response Essays
• Limit the responses
• Still should require higher-order thinking, not recall
• Can include interpretive material
• Several restricted response essays usually yield better information about student understanding than one extended essay

Essay Items
• Write items that require students to explain a process, defend a position, etc. – something worth writing about, NOT just “coming up with” facts and concepts
• Write scoring scales or rubrics that match the learning objective(s)
• Usually best to score all answers to one question before scoring the next

Example
A bird-watcher wants to see many birds in a one-hour period. She decides to investigate which type of food will attract more birds in her backyard. She has a choice of two types of bird food.
1. Sunflower seeds
2. Thistle seeds
Describe a fair test the bird-watcher could conduct to help her decide which food will attract more birds. What information should the bird-watcher collect from her test to help decide which type of food attracts more birds?

Example
The two statements below represent contrasting views regarding the rapid development of the Brazilian rain forest. For each view, explain one probable reason for the speaker's attitude, and give one possible argument the speaker might make to defend his or her point of view.
I. Brazilian developer: "Our nation’s prosperity depends on developing the rich resources of the rain forest."
II. European diplomat at an international conference on Earth’s environment: "There is certainly a need for an international agreement on the responsible development of the rain forest."

Show-and-explain-the-work Problems
An amusement park has games, rides, and shows. The total number of games, rides, and shows is 70. There are 34 rides. There are two times as many games as shows.
How many games are there? ______________________
How many shows are there? ______________________
Use numbers, words, or drawings to show how you got your answer.

Guidelines for Writing Essay Items
• Assess an important aspect of the unit’s instructional objective(s).
• Match your assessment plan in terms of performance, emphasis, and number of points.
• Require students to apply their knowledge to a new or novel situation.

Guidelines for Writing Essay Items
• Define a task with specific directions (rather than leave the task so broad that virtually any response can satisfy it).
• Use a level of complexity appropriate for students’ level of maturity.
• Require the student to demonstrate more than recall of facts, definitions, generalizations or other ideas.

Guidelines for Writing Essay Items
• Word questions in a way that leads all students to interpret the item in the way you intended.
• Make clear to the students all of the following: (a) length of the required writing, (b) purpose for which they are writing, (c) amount of time to be devoted to answering this item, and (d) the basis on which their answers will be evaluated.

Guidelines for Writing Essay Items
• For essays requiring students to state and support their opinions on controversial matters, make clear to the students that their assessment will be based on the logic and evidence supporting their arguments, rather than the actual position taken or opinion stated.

A question for you
Identify the flaw(s) in the following essay question: “List the major exports of Chile.”
a. Requires only the recall of facts.
b. Does not specify length or criteria for evaluation.
c. Does not require students to support an opinion.
d. Only (a) and (b): “requires only recall” and “does not specify criteria”

A question for you
In a current events unit in a Social Studies class (where the learning goals are for students to understand current events in the news), students are asked to write an essay on the question, “Who do you think should be the next President of the United States?” Which of the following is the best critique of this essay question?
a. It’s a bad question, because it calls for an opinion.
b. It’s a bad question, because it asks about events that haven’t happened yet.
c. It’s a good question, and should be given as an in-class essay after some directions about length and evaluation criteria have been added.
d. It’s a good question for an out-of-class essay, and students should be asked to support their opinion with material from the news media and other sources.

A question for you
What is the MOST important flaw in this essay question: “We have studied the organization of the federal government. Explain each step in the process for passing a bill into law.”
a. Does not specify the content the essay is to be about.
b. Does not require application or higher-order thinking.
c. Does not provide criteria for evaluation.
d. Does not give time limits.

A question for you
A teacher gave her students this essay question: “Evaluate the effect of air pollution on the quality of life in the western part of this state.” One student wrote, “It’s horrible!” Which of the flaws in this question allowed for such a response?
a. The question calls for an opinion.
b. The question is too broad.
c. No directions were given.

Scoring
• Constructed response questions (essays or show-the-work problems) need scoring rubrics or point schemes.
• The rubrics or points need to match the learning target.
• More about that later! For now, just a couple examples.
© S. M. Brookhart, 2014

Example
General scoring rubric for an essay question
Criterion | 2 | 1 | 0
Main Idea and Supporting Details | An important main idea is clearly stated. Supporting details are relevant and convincing. | A main idea is stated. Supporting details are mostly relevant. | A main idea is not stated, or is not correct. Supporting details are not relevant or are missing.
Explanation | How the evidence supports the main idea is clear, reasonable, and well explained. | How the evidence supports the main idea is mostly clear and reasonable. Some explanation is given. | How the evidence supports the main idea is not clear, not reasonable, and/or not explained.
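
One way to picture how a rubric like this turns judgments into a score is sketched below. This is an illustration under stated assumptions, not the presenter's scoring procedure: the slide does not say how the two criteria are combined, so adding the levels together is an assumption, and the names `RUBRIC` and `score_essay` and the shortened descriptors are hypothetical.

```python
# Hypothetical encoding of the rubric: criterion -> {level: short descriptor}.
# Descriptors are abbreviated paraphrases of the slide's wording.
RUBRIC = {
    "Main Idea and Supporting Details": {
        2: "Important main idea clearly stated; details relevant and convincing.",
        1: "Main idea stated; details mostly relevant.",
        0: "Main idea missing or incorrect; details irrelevant or missing.",
    },
    "Explanation": {
        2: "How evidence supports the main idea is clear, reasonable, well explained.",
        1: "Mostly clear and reasonable; some explanation given.",
        0: "Not clear, not reasonable, and/or not explained.",
    },
}


def score_essay(ratings):
    """Return (points earned, points possible) for one essay.

    `ratings` maps each rubric criterion to the level (0, 1, or 2) the
    scorer judged. Summing the levels is an assumption made for this sketch.
    """
    if set(ratings) != set(RUBRIC):
        raise ValueError("rate every rubric criterion exactly once")
    for criterion, level in ratings.items():
        if level not in RUBRIC[criterion]:
            raise ValueError(f"{criterion}: level must be 0, 1, or 2")
    earned = sum(ratings.values())
    possible = sum(max(levels) for levels in RUBRIC.values())
    return earned, possible


earned, possible = score_essay(
    {"Main Idea and Supporting Details": 2, "Explanation": 1}
)
print(f"{earned} of {possible} points")  # prints "3 of 4 points"
```

Keeping the criterion-level ratings, rather than only the total, preserves the information a teacher needs to see which part of the learning target a student has or has not met.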

Thank you
• The slides and a video of this webinar will be posted at https://www.engageny.org/resource/teachingcore-assessment-literacy-series-materials
• Next webinar: Action Plan and Professional Development, 3:30pm-4:30pm on December 15th, 2014
• Feedback: https://www.surveymonkey.com/s/testconstruction