Test Writing Basics

Molly Baker, Ph.D.
Planning Good Tests
1. Determine the amount of time available, the location of the test, the desired level of
difficulty, and the weight the test will carry relative to the other assessments in the course.
2. Create test specification charts to help plan the test. See the examples below, which assume
that the test creator has determined that an objective test with some constructed items is best.
3. Write sample items and ask a colleague to critique them. Revise.
4. Administer the test.
5. Evaluate the test by conducting an item analysis and collecting feedback from the students
on items that were frequently missed. Was the question confusing, or is the content it was
attempting to measure confusing? (See the sketch below.)
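As an illustration of the item-analysis step, here is a minimal sketch (in Python) that computes
each item's difficulty (proportion correct) and a simple discrimination index (upper-group minus
lower-group correct rate). The response matrix, group split, and names used here are illustrative
assumptions, not part of this handout.

# Minimal item-analysis sketch; assumes 0/1-scored responses and an
# upper/lower group split (27% is a common convention).
from typing import Dict, List

def item_analysis(responses: List[List[int]], group_fraction: float = 0.27) -> List[Dict]:
    """responses[s][i] = 1 if student s answered item i correctly, else 0."""
    n_students, n_items = len(responses), len(responses[0])
    totals = [sum(row) for row in responses]
    # Rank students by total score to form the upper and lower groups.
    ranked = sorted(range(n_students), key=lambda s: totals[s], reverse=True)
    k = max(1, round(group_fraction * n_students))
    upper, lower = ranked[:k], ranked[-k:]
    results = []
    for i in range(n_items):
        difficulty = sum(responses[s][i] for s in range(n_students)) / n_students
        discrimination = (sum(responses[s][i] for s in upper)
                          - sum(responses[s][i] for s in lower)) / k
        results.append({"item": i + 1, "difficulty": round(difficulty, 2),
                        "discrimination": round(discrimination, 2)})
    return results

# Items with very low difficulty or low discrimination are the ones worth revisiting
# with students: was the question confusing, or the content?
scores = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0], [1, 1, 1], [0, 0, 0]]
for row in item_analysis(scores):
    print(row)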
Example 1: A chart that crosses the content areas of the course (rows: Content Area 1 through
Content Area 4) with the levels of Bloom's taxonomy (columns: Knowledge, Comprehension,
Application, Analysis, Synthesis, Evaluation). Each cell records the weight and number of items
planned for that combination (e.g., 15%: 9 items; 5%: 3 items), and the row and column totals
each sum to 100% of the test (60 items in all, at 3 items per 5%).
Example 2: A chart that crosses the instructional objectives (rows: Obj 1 (know), Obj 2 (apply),
Obj 3 (comp), Obj 4 (synth), Obj 5 (apply)) with the item formats to be used (columns: multiple
choice, true-false, matching, short answer, essay). Each cell records the points and number of
items planned for that combination (e.g., 5 pts (1 item); 10 pts (5 items)), and the marginal
totals (e.g., 25%: 25 items; 20%: 20 items) show how the whole test is distributed across the
objectives and formats.
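To show the arithmetic behind such a chart, the short Python sketch below converts cell weights
into item counts for a fixed-length objective test; the 60-item length mirrors Example 1's
granularity of 3 items per 5%, but the specific cells and weights here are illustrative
assumptions, not values taken from the charts above.

# Sketch: turn blueprint weights into item counts (assumes a 60-item test,
# i.e., 3 items per 5%); the cells and weights below are made up for illustration.
def items_per_cell(weights: dict, total_items: int = 60) -> dict:
    """weights maps a blueprint cell (content area, Bloom level) to its percent weight."""
    assert sum(weights.values()) == 100, "cell weights should sum to 100%"
    return {cell: round(pct / 100 * total_items) for cell, pct in weights.items()}

blueprint = {
    ("Area 1", "Knowledge"): 15,      # percent of the whole test
    ("Area 1", "Application"): 20,
    ("Area 2", "Comprehension"): 25,
    ("Area 3", "Analysis"): 15,
    ("Area 4", "Synthesis"): 10,
    ("Area 4", "Evaluation"): 15,
}
print(items_per_cell(blueprint))      # e.g., 15% of 60 items -> 9 items, 20% -> 12 items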
Objective Test Items
Objective items assess content tied to the instructional objectives.
Plan on about 1 minute per multiple-choice question, 2 minutes per true-false item, and 3-5
minutes per short-answer item (a worked time estimate follows this overview).
Common types:
Multiple Choice
True-False
Fill-in and short answer
Matching
Good objective items are more challenging to write, but they are easier to grade, and they make
it easier to select a representative sample of questions from all of the subject areas to be
tested. They are also easier for the test taker to guess.
Multiple-choice questions tend to be less ambiguous and less subject to misinterpretation than
true-false ones.
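As a quick planning check, the sketch below multiplies a planned item mix by the per-item
timings suggested above to estimate how much testing time a draft exam will need; the item
counts and names are illustrative assumptions.

# Sketch: estimate testing time from an item mix, using the per-item minutes
# suggested above (1 min MC, 2 min TF, 3-5 min short answer); counts are illustrative.
MINUTES_PER_ITEM = {"multiple_choice": (1, 1), "true_false": (2, 2), "short_answer": (3, 5)}

def estimate_minutes(counts: dict) -> tuple:
    """Return (low, high) time estimates in minutes for the given item counts."""
    low = sum(n * MINUTES_PER_ITEM[kind][0] for kind, n in counts.items())
    high = sum(n * MINUTES_PER_ITEM[kind][1] for kind, n in counts.items())
    return low, high

print(estimate_minutes({"multiple_choice": 30, "true_false": 10, "short_answer": 5}))
# -> (65, 75) minutes, which should fit within the class period available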
Multiple Choice Items:
How to develop them:
1. Write propositions that represent important content the students have been learning: facts;
concepts (abstract or concrete); principles (cause and effect, a relationship between two
concepts, laws of probability, an axiom); or procedures (a sequence of mental or physical acts
leading to a result). These are often based on the instructional objectives you have already
written. Students should feel they are being tested on material that is important to know, not
trivial content.
2. Convert these propositions into questions (called the "stem").
a. Each question should address one idea at an appropriate reading level.
b. The central idea should be in this stem rather than in the choices.
Sample:
What best defines photosynthesis? NOT=Photosynthesis is:
c. Each question should be a complete sentence. Do not use an incomplete
statement.
Sample:
Which tool is recommended for taking blood pressure readings? NOT=Taking
blood pressure readings uses _______ tool:
d. Try not to use negatives (not, except). If unavoidable, put the negative word in
CAPS. If it is important to measure a student's knowledge of all of the
possibilities, convert the question into multiple questions rather than one with a
negative term. Never use double negatives.
e. Put the important part of the question near the beginning.
3. Develop a correct answer (vary the position of the right answer across items).
4. Develop plausible distractors:
a. Select distractors from common misconceptions and misunderstandings about the
central idea of the question, perhaps selected from errors students have made. A
higher-level thinking list of distractors includes other correct answers too. The test
taker is then asked to select the "best answer," requiring him/her to evaluate all of
the correct answers.
Sample:
Which approach is most effective for reducing fevers in young children?
b. Try to make answer choices equal in length (preferably short), similar in
complexity, and equal grammatically to the correct answer.
c. If the answer choices have a natural order, arrange them in that fashion (dates,
ages). Otherwise, arrange them randomly.
d. Use "all of the above" or "none of the above" infrequently and not as a filler. It is
OK to have 3-6 distracters in various items throughout an exam. An alternative is
to use the "best answer" approach described above, or construct multiple truefalse items (see True-False section).
e. Move any repetitive words or phrases that appear in all distractors to the stem.
5. Check for correct grammar, punctuation, spelling, and capitalization.
6. Format the item vertically instead of horizontally.
HOTS (higher-order thinking skills) recommendations:
1. To test "understanding" (ability to explain a term, concept or principle beyond rote
memory) the question can ask students to identify a correct nonverbatim definition,
identify characteristics or noncharacteristics, or identify examples or nonexamples.
Samples:
Which best defines ___________?
Which is (un)characteristic of ___________?
Which of the following is an example of ____________?
2. To test critical thinking "prediction," the question requires the student to predict what will
happen given information OR what caused something to happen (usually based on
understanding of particular concepts and principles).
Samples:
What would happen if _______?
If this happens, what should you do?
On the basis of _______, what would you do?
Given __________, what is the primary cause of __________?
3. To test critical thinking "evaluation," the question asks the student to select a criterion or
criteria, use a criterion or criteria, or both (usually based on understanding of particular
principles and the application of a procedure).
Samples:
What is most effective (appropriate) for ________________?
Which is better (worse) ______________?
What is the most effective method for _______________?
What is the most critical step in this procedure?
What is (un)necessary in a procedure?
4. To test critical thinking "problem solving," multiple steps must be required of the student.
Therefore, objective questions of this sort are usually presented in sets (usually requiring
understanding of concepts, principles or procedures and mental skills to select which
ones; most difficult to teach and test).
Sample set:
What is the nature of the problem?
What do you need to solve this problem?
What is a possible solution?
Which is a solution?
Which is the most effective (efficient) solution?
Why is _______the most effective (efficient) solution?
True-False Questions:
Less reliable than multiple choice.
Best used when a question has only two plausible answers.
Sample:
Increasing parental involvement with math homework reduces student performance on math
tests.
Also good to use when testing the ability to apply a principle (HOTS).
Sample:
It is easier for a poor student to get a good score (80 percent correct) on a true-false test if the
test includes only 50 items than if it includes 100 items.
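The claim in the sample above can be checked with a short binomial calculation: on a shorter
test, chance variation gives a weaker student more room to reach 80 percent. The sketch below
assumes, purely for illustration, a student whose chance of answering any single item correctly
is 0.7.

# Sketch: probability of scoring at least 80% on a true-false test for a student
# whose per-item chance of being correct is p (p = 0.7 is an illustrative assumption).
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

p = 0.7
print(round(prob_at_least(40, 50, p), 3))    # at least 80% correct on a 50-item test
print(round(prob_at_least(80, 100, p), 3))   # at least 80% correct on a 100-item test
# The 50-item test gives this student a noticeably better chance of reaching 80%.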
How to Develop Them
1. Select a single idea from the proposition, such as a concept or principle, that is important
and worth saying.
2. Write a true statement, based on the idea, that would be easy to defend to an expert but
would not be obvious to just anyone.
3. Write a false statement based on the same idea, usually a parallel but opposite statement.
Sample:
More salt can be dissolved in a pint of warm water than in a pint of cold water.
More salt can be dissolved in a pint of cold water than in a pint of warm water.
4. Use determiners in a variety of ways throughout the exam (all, never, some, few); avoid a
pattern, such as statements with "never" always being false.
5. Choose more false statements than true.
6. To reduce ambiguity, use internal comparisons.
Sample:
Open-book tests tend to be less efficient than closed-book tests. NOT=Open-book tests
tend to be inefficient.
7. Avoid exact wording of the text book.
8. Avoid tricky items; make it clearly true or false so it can be defended.
9. If multiple answers are correct, a multiple true-false question may be a better format to
use.
Sample:
An ecologist losing weight by jogging and exercising is
1. increasing maintenance metabolism (T)
2. decreasing net productivity (T)
3. increasing biomass (F)
4. decreasing energy lost to decomposition (F)
5. increasing gross productivity (F)
Short Answer Questions:
Often written to focus on recalling facts or applying principles.
Easier to write than a multiple-choice or true-false question, but potentially more difficult or
more time-consuming to grade.
How to Develop Them
1. Based on a proposition, write a question rather than an incomplete statement.
2. For fill-in-the-blank statements, avoid more than one blank; put the one blank at the end
of the sentence.
Matching:
Best used for lower-order objectives.
How to Develop Them
1. Identify a category of items.
2. Arrange the premise and response options in 2 columns, numbered items on the left and
lettered items on the right.
3. Clearly specify in the directions whether the matching is one-to-one or one-to-many and
whether response items can be used more than once.
4. Use between 6 and 15 question stems and 2-3 more response options than question stems.
5. Put the entire matching set on one test paper.
Essay Tests
Essay tests assess content, process and/or writing skills depending on the objectives being
measured.
Plan on about 5 pages of writing per hour, depending on the amount of thinking required to
address the complexity of the problem.
Essay tests are easier to write, but grading takes more time. It is important to identify
criteria (e.g., a checklist) for correct answers to reduce subjectivity, increase consistency of
scoring across student papers, and minimize the potential impact of "bluffing." It is
recommended that scoring of one item be completed for all students before moving on to the next
item. Provide feedback to students on their answers and share your key and model answers
with the students; these practices will improve performance on future exams. Do not let
writing skills affect your evaluation of the answer's content unless the students know their
writing is being evaluated as well.
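One way to make criterion-based scoring concrete is to put the checklist in a small script and
score a single item for every student before moving to the next item, as recommended above. The
checklist entries, point values, and names below are illustrative assumptions.

# Sketch: checklist-based scoring of one essay item for all students before the
# next item is scored; the criteria and point values are made up for illustration.
CHECKLIST = {  # criterion -> points awarded if the answer meets it
    "states a clear thesis": 2,
    "supports the thesis with evidence from the course": 3,
    "addresses at least one counterargument": 2,
    "draws a defensible conclusion": 3,
}

def score_item(met_criteria_by_student: dict) -> dict:
    """met_criteria_by_student maps each student to the set of criteria their answer met."""
    return {student: sum(CHECKLIST[c] for c in met if c in CHECKLIST)
            for student, met in met_criteria_by_student.items()}

# Score item 1 for every student, then repeat the whole process for item 2, and so on.
print(score_item({
    "student_a": {"states a clear thesis", "draws a defensible conclusion"},
    "student_b": {"states a clear thesis", "addresses at least one counterargument",
                  "supports the thesis with evidence from the course"},
}))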
Essay tests can draw on fewer representative subject areas than objective tests.
A good essay question requires an original thoughtful response composed by the examinee in the
form of several sentences, not the recall of memorized items or an undefended opinion.
Essay tests do not do a better job of determining how well students can analyze, organize,
synthesize, or develop original ideas if the efforts of instruction were not directed toward
such goals. The tendency is to focus instruction on establishing a knowledge base and focus
the evaluation on the application of that knowledge. Planning can prevent these
inconsistencies.
Essay Questions:
How to Develop Them
1. Select an objective(s) that the essay will measure.
2. Delimit the scope of the content to be covered. One way to do this is to develop the
criteria for evaluation first and then write a question aimed at those criteria. Avoid making
the question so general that any number of right answers is possible; that makes the answers
very difficult to evaluate, and it becomes tempting to compare the thoroughness of the answers
across students instead of judging the correctness of each answer against the checklist.
3. Define the student's task as clearly and specifically as possible. The verb in the objective
may suggest the task (analyze, interpret, explain, predict). Avoid using verbs such as
discuss, comment on, elaborate on unless you make it clear what you expect.
4. Avoid using essays to measure objectives that can be better measured by objectively
scored items. Use essays to measure ability to synthesize, integrate, speculate and
perform other higher-level tasks.
5. Use several short essay questions rather than one long one. It makes preparing the scoring
guide much easier and your grading more reliable from one test to another.
6. Develop a model answer or scoring guide. Show the objective, the question, and your
scoring guide to a colleague for feedback. Revise as needed.
7. After the test, look at the range of answers received to determine if the question was
sufficiently delimited and made clear.
8. Share sample questions and exemplary answers with the students before you give the
exam. Practice essay quizzes are a good idea too.
Performance Tests:
These tests are often created to assess complex thinking skills that can easily be observed (e.g.,
fix a cellphone), complex mental or physical behavior that is not easy to observe without the
help of a checklist or rating scale (e.g., efficient planning of a lesson OR correct loading of a
dishwasher), or complex physical behavior that can only be evaluated through observation
(e.g., correct serving of a tennis ball).
Examples:
1. Actual Performance (assess content AND process; observe, rating quality or describe,
often with a rubric)
Interpretive reading aloud
Teach a child a concept
Do CPR
Demonstrate how to serve a tennis ball
Conduct an interview for human relations office
Perform an original dance
Play the clarinet
Conduct an ITV lesson on the system
Do a water quality analysis of a nearby stream
Write an essay (when assessing writing skills)
2. Simulation of performance (assess content AND process used or proposed)
Flight simulation
Design a stock/bond transaction plan based on data provided
Diagnose an engine problem based on data provided
Propose clinical decisions to address presented case
Propose management decision to respond to presented scenario
Outline strategy for taking "sample" objective psychology test
Conduct a mock interview
3. Product development (assess the product, the outcome of performance; appropriate process
(actual performance) is inferred)
Compose a nursing care plan based on data
Make a woodworking project
Develop a weekly menu
Write a book review or speech or lab report
Outline a chapter in biology text book
Design a lesson plan for teaching ______ to ________
Write an annual report based on data
Portfolio (organized collection of student work often used to demonstrate progress)
Oral report
Research paper
4. Identification (of real objects; "go get a ___" or "go get me a _____ that does _______")
Carpenter's tools
Rocks
Muscles, bones, nerves
Works of art by particular artists or in a particular style
Musical instruments
Leaves
Lab equipment
Map locations or routes
Birds
5. Performance task with prescribed components and criteria for evaluation (assess each
component using performance criteria shared with learners ahead of time)
Online course design
Research article critique
Journal of internship experience
Cost-benefit analysis of corporate initiative
Experiment (set up and complete experiment with recorded observations and conclusion)
How to Develop Them
1. Select an objective or objectives for which a performance task is appropriate.
2. Determine the length of time the learners will have to complete the test/task (usually a
full period to several days or weeks).
3. Develop one or more potential tests/tasks tied to the objectives.
a. How structured is the definition of the problem? In other words, will you tell them
what the problem is (structured) or are they to figure out what the problem is and
go from there (unstructured)? Is the type of problem meaningful and realistic
within the subject taught, i.e., is it contextualized (authentic)?
b. How much will you tell the students about what materials to use, what proportions
to use, where to get the information they need, etc.? (scaffolding) Whatever is
decided, are the directions clear?
c. Are the students free to choose what strategies they will use to solve the problem,
or are they to choose between ones you describe, or do you expect them to invent
one that is consistent with limits you set? (alternate strategies) Do they encourage
the learners to draw on a variety of skills, knowledge, and processes? Is your
description of the scope of the project clear?
d. Are alternate solutions possible or is only one acceptable? (alternate solutions)
4. Decide if the students are to work alone or in groups and if the latter, whether their ability
to work in the group will be part of the evaluation. If so, develop guidelines and
conditions for collaborative work.
5. Decide if the students can solicit feedback or assistance from others while preparing their
"project." How much and when?
6. Develop the scoring criteria. Will they be based on the product produced, the process
used to create the product, or both? Are they specific enough to give guidance to the
students without being so rigid that a multitude of product modes are ruled out? Avoid
unclear language in the criteria. If the scoring criteria are too complex for the students to
understand, translate them into a checklist so that they know what is expected.
Sample
Parallel park without hitting the curb. NOT=Parallel park precisely.
7. Assemble necessary materials or equipment for the learners to use.
8. Evaluate what types of assistance will be needed during the project and make appropriate
arrangements.
Plan to attend the ICC workshop on designing rubrics, checklists, etc. for performance testing.
References for Item Writing
Baird, H. (1997). "Evaluating higher-order thinking." Performance assessment for science
teachers. Accessed: http://www.usoe.k12.ut.us/curr/science/Perform/Past4.htm
Baird, H. (1997). "Performance assessment for science teachers." Performance assessment for
science teachers. Accessed: http://www.usoe.k12.ut.us/curr/science/Perform/Past5.htm#Performance
Ebel, R.L., & Frisbie, D.A. (1991). Essentials of educational measurement (5th ed.). Englewood
Cliffs, NJ: Prentice-Hall. (Lots of examples.)
Gronlund, N.E. (1991). How to write and use instructional objectives. New York: Macmillan
Publishing. (Lots of verbs tied to types of learning in three domains, including Bloom's
taxonomy in the cognitive domain.)
Haladyna, T.M. (1997). Writing test items to evaluate higher order thinking. Boston: Allyn and
Bacon. (Lots of examples.)