Test Construction

Sue Brookhart
December 1, 2014
Introductions
• Sue Brookhart, Ph.D.
• Juliette Lyons-Thomas, Ph.D. (Fellow, Regents
Research Fund)
Webinar Norms
• All phones will be placed on mute
• If you have a question, you can type into the
chat box, and your question will be addressed
during a break
  • The chat box icon is located at the top right-hand
    corner of your screen (remember to direct your
    chat to “Everyone”)
• At the end of the webinar, you will be asked to
fill out a survey based on your experience
today
Learning Outcomes
• The TITC grant has emphasized that, in some
cases, districts may need to create new
assessments or alter existing ones, based on
the results of the assessment review
• The purpose of this webinar is to help
attendees to better understand the
components of test construction, including
test blueprints, linking item performance to
learning objectives, and item writing.
A Sad Tale
Every Friday Story Test
15 points: vocabulary
5 points: comprehension
20 points total
Example – A Test Blueprint for Friday Story Test
Learning Objective | Remember | Understand | Analyze/Create | Total
Know new vocabulary words | 5 | - | - | 5 (17%)
Use new vocabulary words in sentences | - | 5 | - | 5 (17%)
Understand the main points in the story | - | 10 | - | 10 (33%)
Connect elements from the story (character, plot, or setting) with own life or other texts | - | - | 10 | 10 (33%)
Total | 5 (17%) | 15 (50%) | 10 (33%) | 30 (100%)
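The totals and percentages in a blueprint like this one follow mechanically from the cell values. The sketch below shows one way to compute them. The dictionary layout and the helper names `summarize` and `percent` are assumptions made for illustration, not part of the webinar materials.

```python
# Hypothetical representation of the Friday Story Test blueprint:
# each row maps a learning objective to its points per thinking level.
LEVELS = ["Remember", "Understand", "Analyze/Create"]

blueprint = {
    "Know new vocabulary words": {"Remember": 5},
    "Use new vocabulary words in sentences": {"Understand": 5},
    "Understand the main points in the story": {"Understand": 10},
    "Connect elements from the story with own life or other texts": {
        "Analyze/Create": 10
    },
}


def summarize(blueprint, levels):
    """Return row totals, column totals, and the grand total of points."""
    row_totals = {obj: sum(cells.values()) for obj, cells in blueprint.items()}
    col_totals = {
        level: sum(cells.get(level, 0) for cells in blueprint.values())
        for level in levels
    }
    grand_total = sum(row_totals.values())
    return row_totals, col_totals, grand_total


def percent(points, total):
    """Share of the total, rounded to a whole percent as on the slide."""
    return round(100 * points / total)


row_totals, col_totals, grand_total = summarize(blueprint, LEVELS)
for level in LEVELS:
    share = percent(col_totals[level], grand_total)
    print(f"{level}: {col_totals[level]} ({share}%)")
print(f"Total: {grand_total}")
```

Reading the column shares this way makes the emphasis of the test explicit: half of the 30 points sit at the Understand level.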
Example – A Simpler Blueprint for Friday Story Test
Content | Remember | Understand | Analyze/Create | Total
Vocabulary words | 5 | 5 | - | 10 (33%)
Elements from the story (character, plot, or setting) | - | 10 | 10 | 20 (67%)
Total | 5 (17%) | 15 (50%) | 10 (33%) | 30 (100%)
What is a Test Blueprint?
• A table with rows and columns that is a plan
for the way in which the questions in a test will
be distributed
  • Rows show number of questions and marks for
    each topic/standard
  • Columns show number of questions and marks
    for depth of thinking levels
• Other names for this: test specification,
specification matrix, test plan
Use a Blueprint
• To plan individual tests
• Blueprint (or table of specifications) includes:
  • Content
  • Thinking skills
  • Specific learning targets
  • Emphasis (weight)
• This information helps you write
a test that is interpretable as you
intend
Example of a Test Blueprint for a Middle School Science Unit
Content Outline | Remember | Understand | Apply | Total Points | %
Basic Parts of Cell | Name and tell function of nucleus, cytoplasm, cell membrane; Label parts of a cell on a line drawing (12 points) | - | Given photos of actual plant and animal cells, label the parts (4 points) | 16 | 40
Plant vs. Animal Cells | - | Explain differences between plant & animal cells; Describe cell walls & cell membrane (4 points) | - | 4 | 10
Cell Membrane | Define diffusion; List substances diffused and not diffused by cell membrane (6 points) | - | Distinguish between diffusion and osmosis (2 points) | 8 | 20
Division of Cells | Define division, chromosomes, and DNA (4 points) | Explain differences between plant and animal cell division (4 points) | Given the numbers of chromosomes in a cell before division, state the number in each cell after division (4 points) | 12 | 30
Total Points | 22 | 8 | 10 | 40 | 100
% | 55 | 20 | 25 | 100 |
Name at least three things you notice.
Example of a Different Format for a Test
Blueprint for a Middle School Science Unit
Learning Objective | Remember | Understand | Apply | Total
Identify basic parts of cell | 12 | - | 4 | 16 (40%)
Distinguish between plant & animal cells | - | 4 | - | 4 (10%)
Describe diffusion and the function of cell membrane | 6 | - | 2 | 8 (20%)
Understand the process of cell division | 4 | 4 | 4 | 12 (30%)
Total | 22 (55%) | 8 (20%) | 10 (25%) | 40 (100%)
How is this example different from the previous example?
Example of a Different Format for a Test
Blueprint for a Middle School Science Unit
Learning Objective | Remember | Understand | Apply | Total
Identify basic parts of cell | 12 | - | 4 | 16 (40%)
Distinguish between plant & animal cells | - | 4 | - | 4 (10%)
Describe diffusion and the function of cell membrane | 6 | - | 2 | 8 (20%)
Understand the process of cell division | 4 | 4 | 4 | 12 (30%)
Total | 22 (55%) | 8 (20%) | 10 (25%) | 40 (100%)
Unit learning outcomes go here (the Learning Objective column).
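Once a test has been drafted, the blueprint can serve as a checklist: add up the points of the drafted items in each cell and compare them with the plan. The sketch below illustrates that comparison with the science unit numbers. The `draft` item list is invented for the example, the function name `blueprint_gaps` is hypothetical, and the column placement of each cell follows the row and column totals in the blueprint.

```python
from collections import Counter

# Planned points per (learning objective, thinking level) from the
# science unit blueprint.
planned = {
    ("Identify basic parts of cell", "Remember"): 12,
    ("Identify basic parts of cell", "Apply"): 4,
    ("Distinguish between plant & animal cells", "Understand"): 4,
    ("Describe diffusion and the function of cell membrane", "Remember"): 6,
    ("Describe diffusion and the function of cell membrane", "Apply"): 2,
    ("Understand the process of cell division", "Remember"): 4,
    ("Understand the process of cell division", "Understand"): 4,
    ("Understand the process of cell division", "Apply"): 4,
}


def blueprint_gaps(planned, items):
    """Return {cell: drafted - planned} for every cell that is off plan."""
    drafted = Counter()
    for objective, level, points in items:
        drafted[(objective, level)] += points
    cells = set(planned) | set(drafted)
    return {
        cell: drafted[cell] - planned.get(cell, 0)
        for cell in cells
        if drafted[cell] != planned.get(cell, 0)
    }


# Hypothetical draft: the diffusion items ended up all at the Remember level.
draft = [
    ("Identify basic parts of cell", "Remember", 12),
    ("Identify basic parts of cell", "Apply", 4),
    ("Distinguish between plant & animal cells", "Understand", 4),
    ("Describe diffusion and the function of cell membrane", "Remember", 8),
    ("Understand the process of cell division", "Remember", 4),
    ("Understand the process of cell division", "Understand", 4),
    ("Understand the process of cell division", "Apply", 4),
]

for cell, diff in sorted(blueprint_gaps(planned, draft).items()):
    print(cell, f"{diff:+d} points")
```

A positive difference means the draft over-weights that cell and a negative one means the cell is under-represented. An empty result means the draft matches the plan.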
What is a Multiple-Choice Item?
• A multiple-choice item consists of one or more
introductory sentences followed by a list of
two or more suggested responses. The
student must choose the correct answer.
Align requirements for multiple-choice
item performance to learning objectives
• Content
• Performance
• Thinking skills
  • Objective items, especially multiple choice, can
    tap higher-order thinking if carefully written
• Clear presentation to students of what is
required for each task or item
Multiple Choice Items
Stem: Which president of the United States was elected
to four terms?
Alternatives:
a. Abraham Lincoln (distractor)
b. Theodore Roosevelt (distractor)
*c. Franklin D. Roosevelt (key)
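The vocabulary on this slide (stem, alternatives, key, distractors) maps naturally onto a small data structure. The sketch below is one possible representation, assuming a hypothetical `MultipleChoiceItem` class: the key is stored as an index into the alternatives, and the distractors are whatever alternatives remain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultipleChoiceItem:
    """Hypothetical container for one multiple-choice item."""

    stem: str                      # the question or incomplete statement
    alternatives: tuple[str, ...]  # every suggested response, in order
    key_index: int                 # position of the correct (or best) answer

    def __post_init__(self):
        if not 0 <= self.key_index < len(self.alternatives):
            raise ValueError("key_index must point at one of the alternatives")

    @property
    def key(self) -> str:
        return self.alternatives[self.key_index]

    @property
    def distractors(self) -> tuple[str, ...]:
        # Every alternative that is not the key is a distractor.
        return tuple(
            alt for i, alt in enumerate(self.alternatives) if i != self.key_index
        )


item = MultipleChoiceItem(
    stem="Which president of the United States was elected to four terms?",
    alternatives=("Abraham Lincoln", "Theodore Roosevelt", "Franklin D. Roosevelt"),
    key_index=2,
)
print("Key:", item.key)
print("Distractors:", ", ".join(item.distractors))
```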
Guidelines for writing Multiple Choice
Items
• Assess an important aspect of the unit’s
instructional targets.
• Match your assessment plan in terms of
performance, emphasis, and number of points.
• Ask a direct question or set a specific problem.
• Put the alternatives at the end.
• Put repeated words in the stem.
• Place the word in the stem and definitions in the
alternatives, if testing definitions.
• Avoid “cluing” and “linking” (where the correct
answer of one item depends on another item).
• Avoid textbook wording.
• Use simple vocabulary and sentence structure.
• Use consistent, correct punctuation and
grammar relative to the stem.
• Avoid phrasing the item so the student’s
personal opinion is an option.
• Arrange alternatives in a logical order.
• Have distractors that would be plausible to
non-knowledgeable students.
• Have homogeneous alternatives.
• Have distractors based on common errors or
misconceptions if possible.
• Have 3 to 5 functional alternatives.
• Have one correct or best answer.
• Avoid “all of the above” and use “none of the
above” sparingly.
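A few of these guidelines are mechanical enough to check automatically before a human review. The sketch below covers only those: the number of alternatives, a single valid key, duplicate alternatives, and the "all of the above" and "none of the above" phrases. The function name `check_alternatives` and the warning texts are assumptions. Judgments such as plausibility, homogeneity, or whether an alternative is truly functional still need a person.

```python
def check_alternatives(alternatives, key_index):
    """Return warnings for a few mechanically checkable guidelines."""
    warnings = []
    # Compare alternatives case-insensitively and ignore a trailing period.
    normalized = [alt.strip().lower().rstrip(".") for alt in alternatives]

    if not 3 <= len(alternatives) <= 5:
        warnings.append("use 3 to 5 alternatives")
    if not 0 <= key_index < len(alternatives):
        warnings.append("key must be one of the alternatives")
    if len(set(normalized)) != len(normalized):
        warnings.append("alternatives must be distinct")
    if "all of the above" in normalized:
        warnings.append('avoid "all of the above"')
    if "none of the above" in normalized:
        warnings.append('use "none of the above" sparingly')
    return warnings


# The item from the earlier slide passes these checks.
print(check_alternatives(
    ["Abraham Lincoln", "Theodore Roosevelt", "Franklin D. Roosevelt"], 2
))

# A made-up item with two alternatives, one of them "All of the above".
print(check_alternatives(["Copper and zinc", "All of the above"], 0))
```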
Evaluate the Stems
A. Why did housing prices drop so rapidly in
2008?
B. Which one of the following statements is true
about housing prices?
Which one is better?
Why?
Evaluate the Stems
A. An orangutan is a
B. Orangutans are classified as
Which one is better?
Why?
Evaluate the Stems
A. Brass, which is used in decoration, musical
instruments, and plumbing supplies, to name
only a few, is made from
B. Brass is made from
Which one is better?
Why?
Evaluate the Stems
A. The man who first explored Florida was
B. The Spaniard who first explored Florida was
Which one is better?
Why?
Evaluate the Stems
A. Which of the following illustrates what is
meant by condensation?
B. Which of the following does not illustrate
what is meant by condensation?
Which one is better?
Why?
A question for you
Which of the following provides the best stem for
a multiple-choice item?
a. Penicillin is
b. Penicillin was discovered by
c. Penicillin, which has many uses in medicine,
was discovered by
d. Who discovered penicillin?
A question for you
Which of the following provides the best stem for
a multiple-choice item?
a. Which of the following did not contribute to
the Great Depression?
b. One major factor that contributed to the Great
Depression is
c. The Great Depression was
d. The Great Depression was caused by
A question for you
What is wrong with the stem of the following
multiple-choice question? “Which of the following
states is the largest state in the United States?”
a. Largest can be measured either geographically
or by population.
b. It measures opinion rather than fact.
c. It measures only a lower order skill.
d. It should be posed as a statement instead of a
question.
A question for you
Which of the following sets of alternatives would
be best for a multiple-choice item about a battle
in the Civil War?
a. Davis, Grant, Lincoln, none of the above
b. Lincoln, Mason-Dixon Line, Sherman,
Vicksburg
c. Grant, Jackson, Lee, Sherman
d. Jefferson, Lincoln, Roosevelt, Washington
A question for you
Which of the following sets of alternatives is best
for the following multiple-choice item: "The
perimeter of a rectangle 4 inches long and 2
inches wide is ______"?
a. 6 inches, 8 inches, 12 inches
b. 2 inches, 12 inches, 24 inches
c. 11 inches, 12 inches, 13 inches
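One way to weigh these sets is to ask which wrong values a student could reach through a common error, as the guideline on distractors suggests. The sketch below works through the arithmetic. The two error patterns (adding each side only once, and computing area instead of perimeter) are assumptions chosen for illustration and are not taken from the slides.

```python
length, width = 4, 2  # inches, from the item stem

perimeter = 2 * (length + width)  # correct answer (the key)
half_perimeter = length + width   # assumed error: each side added only once
area = length * width             # assumed error: area instead of perimeter

print(f"key: {perimeter} inches")
print(f"error-based distractors: {half_perimeter} inches, {area} inches")
```

Under these assumptions the key is 12 inches and the error-based distractors are 6 inches and 8 inches, which are the three values in set (a).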
Circle the ball.
Context-dependent item sets
• Use introductory material
  • Readings
  • Tables, graphs, or charts
  • Pictures
  • Formulas, lists of terms or symbols
• Write a set of items requiring students to
interpret the material
Why use Introductory Materials?
• Introductory materials (readings, graphs, tables, and
maps) used in context-dependent item sets can help
assess higher-order thinking because they give
students something to think about.
Context-dependent item sets
• A good way to assess higher-order thinking
• The introductory material allows you to
present novel material to students
• The questions can then be about interpreting,
not recalling, material
Jennifer drew what the Moon looked like just after
sunset every third or fourth night. Her drawings for the
nights she observed the Moon are shown below.
On Night 11 the clouds were so thick that Jennifer
could not see the Moon. Based on the drawings for the
other nights, what would Jennifer have seen on Night
11 if the sky were clear?
A. B. C. D. [Each answer choice is a drawing of the Moon; images not reproduced.]
According to the map,
which of the following
does the United States
both export to Canada
and import from Canada?
A. Cars
B. Iron
C. Aluminum
D. Coal
Frontier Women
Like the early colonial women settlers of the backwoods, frontier women made
everything their families needed. Most began work at daybreak and did not rest until
late evening. They cooked, spun cloth, made clothing, raised children, and tried to keep
their dirt homes clean. They cleared and plowed fields, tended and harvested crops,
milked the cows, raised hogs, rode and trained horses, and did just about every chore
on the farm.
The women not only worked, they also made most of their own tools. To make
pitchforks, they attached handles to deer antlers. Many of the women learned to use a
knife well enough to carve spoons, forks, and bowls out of animal bones. They
fashioned cups and containers out of vegetable gourds and animal horns.
Which statement best describes the frontier women?
A. They lived dangerous lives and tamed the West.
B. They hunted to provide food for their families.
C. They frequently worried about the safety of their homes.
D. They worked hard and possessed many skills.
What is a constructed-response item?
• Constructed response test items ask students
to compose their responses, and are scored
with a judgment of the quality of those
responses.
Types of Constructed Response Items
• Restricted response essay items limit both the
content of students’ answers and the form of
their written responses.
• Extended response essay items require
students to express their own ideas and to
organize their own answers.
• Show-and-explain-the-work problems on math
and science tests are also constructed
response items.
Restricted Response Essays
• Limit the responses
• Still should require higher-order thinking, not
recall
• Can include interpretive material
• Several restricted response essays usually
yield better information about student
understanding than one extended essay
Essay Items
• Write items that require students to explain a
process, defend a position, etc. – something
worth writing about, NOT just “coming up
with” facts and concepts
• Write scoring scales or rubrics that match the
learning objective(s)
• Usually best to score all answers to one
question before scoring the next
Example
A bird-watcher wants to see many birds in a one-hour
period. She decides to investigate which
type of food will attract more birds in her
backyard.
She has a choice of two types of bird food.
1. Sunflower seeds
2. Thistle seeds
Describe a fair test the bird-watcher could
conduct to help her decide which food will
attract more birds. What information should the
bird-watcher collect from her test to help decide
which type of food attracts more birds?
Example
The two statements below represent contrasting views
regarding the rapid development of the Brazilian rain
forest. For each view, explain one probable reason for the
speaker's attitude, and give one possible argument the
speaker might make to defend his or her point of view.
I. Brazilian developer: "Our nation’s prosperity depends
on developing the rich resources of the rain forest."
II. European diplomat at an international conference on
Earth’s environment: "There is certainly a need for an
international agreement on the responsible development
of the rain forest."
Show-and-explain-the-work Problems
An amusement park has games, rides, and shows. The
total number of games, rides, and shows is 70. There
are 34 rides. There are two times as many games as
shows.
How many games are there? ______________________
How many shows are there? ______________________
Use numbers, words, or drawings to show how you got
your answer.
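A scorer needs a worked answer to judge both the final numbers and the reasoning students show. The sketch below is one possible answer-key calculation for this problem. It is an illustration only, and the variable names are not part of the original item.

```python
total, rides = 70, 34  # values given in the problem

games_and_shows = total - rides  # attractions that are games or shows

# "Two times as many games as shows" means games = 2 * shows,
# so games + shows = 3 * shows.
shows, remainder = divmod(games_and_shows, 3)
assert remainder == 0, "the numbers in the problem should divide evenly"
games = 2 * shows

print(f"games = {games}, shows = {shows}")
```

A student's work can then be compared against the intermediate steps (subtracting the rides, then splitting the remainder two to one) rather than against the final numbers alone.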
Guidelines for Writing Essay
Items
• Assess an important aspect of the unit’s
instructional objective(s).
• Match your assessment plan in terms of
performance, emphasis, and number of points.
• Require students to apply their knowledge to a
new or novel situation.
• Define a task with specific directions (rather than
leave the task so broad that virtually any
response can satisfy it).
• Use a level of complexity appropriate for
students’ level of maturity.
• Require the student to demonstrate more than
recall of facts, definitions, generalizations or
other ideas.
• Word questions in a way that leads all students
to interpret the item in the way you intended.
• Make clear to the students all of the following:
(a) length of the required writing, (b) purpose
for which they are writing, (c) amount of time
to be devoted to answering this item, and (d)
the basis on which their answers will be
evaluated.
• For essays requiring students to state and
support their opinions on controversial
matters, make clear to the students that their
assessment will be based on the logic and
evidence supporting their arguments, rather
than the actual position taken or opinion
stated.
A question for you
Identify the flaw(s) in the following essay
question: “List the major exports of Chile.”
a. Requires only the recall of facts.
b. Does not specify length or criteria for
evaluation.
c. Does not require students to support an
opinion.
d. Only (a) and (b): “requires only recall” and
“does not specify criteria”
A question for you
In a current events unit in a Social Studies class (where the
learning goals are for students to understand current events in the
news), students are asked to write an essay on the question,
“Who do you think should be the next President of the United
States?” Which of the following is the best critique of this essay
question?
a. It’s a bad question, because it calls for an opinion.
b. It’s a bad question, because it asks about events that haven’t
happened yet.
c. It’s a good question, and should be given as an in-class essay
after some directions about length and evaluation criteria have
been added.
d. It’s a good question for an out-of-class essay, and students
should be asked to support their opinion with material from the
news media and other sources.
A question for you
What is the MOST important flaw in this essay
question: “We have studied the organization of
the federal government. Explain each step in the
process for passing a bill into law.”
a. Does not specify the content the essay is to be
about.
b. Does not require application or higher-order
thinking.
c. Does not provide criteria for evaluation.
d. Does not give time limits.
A question for you
A teacher gave her students this essay question:
“Evaluate the effect of air pollution on the quality
of life in the western part of this state.” One
student wrote, “It’s horrible!” Which of the flaws
in this question allowed for such a response?
a. The question calls for an opinion.
b. The question is too broad.
c. No directions were given.
Scoring
• Constructed response questions (essays or
show-the-work problems) need scoring rubrics
or point schemes.
• The rubrics or points need to match the
learning target.
• More about that later! For now, just a couple of
examples.
© S. M. Brookhart, 2014
Example
General scoring rubric for an essay question
Main Idea and Supporting Details
  2: An important main idea is clearly stated. Supporting details are
     relevant and convincing.
  1: A main idea is stated. Supporting details are mostly relevant.
  0: A main idea is not stated, or is not correct. Supporting details
     are not relevant or are missing.
Explanation
  2: How the evidence supports the main idea is clear, reasonable, and
     well explained.
  1: How the evidence supports the main idea is mostly clear and
     reasonable. Some explanation is given.
  0: How the evidence supports the main idea is not clear, not
     reasonable, and/or not explained.
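A rubric like this can be applied consistently by recording one rating per criterion and rejecting anything outside the defined levels. The sketch below shows that bookkeeping. Summing the two ratings into a single score is an assumption, since the slides do not say how (or whether) the criteria are combined, and the function name `score_essay` is hypothetical.

```python
# Criteria and score levels from the general rubric.
CRITERIA = ("Main Idea and Supporting Details", "Explanation")
LEVELS = (0, 1, 2)


def score_essay(ratings):
    """Validate one rating per criterion and return the summed score.

    Summing is an assumption made for this sketch; a teacher might
    instead report the two ratings separately.
    """
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing rating for: {', '.join(missing)}")
    for criterion in CRITERIA:
        if ratings[criterion] not in LEVELS:
            raise ValueError(f"{criterion}: rating must be 0, 1, or 2")
    return sum(ratings[c] for c in CRITERIA)


total = score_essay({"Main Idea and Supporting Details": 2, "Explanation": 1})
print(total, "out of", len(CRITERIA) * max(LEVELS))
```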
Thank you
• The slides and a video of this webinar will be
posted at
https://www.engageny.org/resource/teachingcore-assessment-literacy-series-materials
• Next webinar:
  • Action Plan and Professional Development
  • 3:30pm-4:30pm on December 15th, 2014
• Feedback:
  • https://www.surveymonkey.com/s/testconstruction