Day 3 Slide Deck_Mt. Vernon

STUDENT GROWTH
PROFESSIONAL DEVELOPMENT FOR THE PRINCIPAL AND TEACHER EVALUATION PROCESS
May 2014
Survey Results

10 respondents from Advisory Committee

Top topics to cover:
• Designing high-quality assessments
• SLOs
• Team activities related to assessment creation

Top subtopics to cover:
• SLOs: discussion, process/creation, fairness
• Assessments: what is high quality, key elements, selection
• Growth models: identification, implementation
Agenda for the Day

• Student Growth Measures
• Student Learning Objectives
• Team Sharing Time
• Assessment Development
• Office Hours
Website with Resources
http://edanalytics.org/projects/illinois-studentgrowth
PERA Updates
PERA May update

• Local Assessment Support (ISBE contract awarded)
  • Develop assessment literacy
  • Create local assessments, mostly in traditionally non-tested grades and subjects
  • Run workshops, some on performance-based assessment, e.g., music, career/technical
  • Pilot these workshops, then run with larger groups, and/or make into webinars
• Guidebook on student growth (posted)
  • Includes examples of timelines and decision points and examples from other states (build/borrow/buy assessments, set up data systems, set up SLOs, run PD)
  • Includes a strong recommendation to have at least one pilot year
• SLO template (posted)
  • Associated guidance shows that goals can be set by group (each starting point), by student, or both
  • Draft SLOs are posted in elementary math, music, HS English, independent living, nutrition/culinary arts, and consumer math
  • No guidance on how to make results comparable across teachers/schools/subjects
PERA May Update

• Guidance document on teachers of special education students, ELLs, and early childhood (discussed, not finalized or posted)
  • Districts have been advised by Charlotte Danielson not to create new frameworks, but to create addenda instead. CPS has created these for special education, ELLs, arts, and physical education, and one for early childhood education (ECE) will be released soon.
  • These are needed because administrators don't always have experience in all areas; the addenda illuminate specific considerations for these groups and acknowledge a potentially different context.
• Lowest 20% of districts not yet determined
  • These must implement in 2015-16; likely to be determined in summer 2014
• State default model
  • Use one Type II and one Type III assessment, if a Type II exists in that subject area, the process for developing or selecting the assessment includes teachers, and the assessment is aligned with the curriculum and is reliable and valid.
  • Guidance on how to meet these requirements will be forthcoming.
  • Otherwise, use two Type III assessments, with SLOs, one determined by the evaluator and one by the teacher.
  • Note that the state model is only required where a joint committee cannot reach agreement on one or more aspects of student growth, and only for the parts where no agreement was reached.
  • So if a joint committee agrees on everything but the percentage assigned to student growth, it must use 50%.
PERA May Update

• PEAC meetings are public
• Upcoming meetings (all scheduled for 10 a.m. - 3 p.m. at ISU Alumni Center, 1101 N Main, Normal):
  • May 16 (may be moved to Springfield)
  • June 20
  • July 18
  • August 15
• www.isbe.net/peac
Student Growth Measures
Student Learning Objectives
Value Tables
Simple Growth
Adjusted Growth
Value-Added
Student Learning Objectives (SLOs)

• SLOs are goals that teachers and administrators set using baseline data such as:
  • Students' performance on a prior-year test
  • Students' performance on a diagnostic test (e.g., for literacy or math skills)
  • Prior-year attendance data or discipline rates
• Goals are monitored throughout the year, and success or failure is determined at the end of the year.
• SLOs can be created by individual teachers, teams of teachers, or the district.
Student Learning Objectives

Benefits
• Can be used with any assessment type
• Can be uniquely tailored

Drawbacks
• Not necessarily standardized or comparable across educators
• Potentially very easy to game
• Potentially very hard to draw meaningful conclusions

Implementation Requirements
• Parameters for educators to follow
• Type I, Type II, or Type III
• Consideration of unintended consequences and biases
Value Tables

• Assigns point values to improvements across categories (e.g., proficient to advanced).
• Value tables can be weighted in many different ways, and cut points can be placed at the creator's discretion.
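At bottom, a value table is a lookup keyed on the (prior category, current category) pair. A minimal sketch in Python, using the point values from the grade 4-to-5 math example on the next slides; real tables are set by their creators, so treat these numbers as illustrative:

```python
# Minimal value-table sketch. Point values come from the example slides
# that follow; districts choose their own weights and cut points.
VALUE_TABLE = {
    ("1b", "1a"): 20,    # dropped a category
    ("1b", "1b"): 85,    # stayed in 1b
    ("2a", "2a"): 90,    # stayed in 2a
    ("2a", "2b"): 125,   # up one category
    ("1b", "2b"): 150,   # up two categories
}

def value_points(prior_cat: str, current_cat: str) -> int:
    """Points awarded for a student's category-to-category transition."""
    return VALUE_TABLE[(prior_cat, current_cat)]

# Students A, B, and C below all stay in 1b, so all three earn 85 points
# even though their scale-score gains are 26, 0, and -7.
print(value_points("1b", "1b"))  # 85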
Value Table Example
Cut Scores (from ISBE)

Example: Grade 4 to 5 Math

• Focus on categories 1b and 2a (but possible for any adjacent categories in other subjects and grades)
• Grade 4: 1b covers 174 to 190, 2a covers 191 to 206
• Grade 5: 1b covers 183 to 200, 2a covers 201 to 217

• Student A: Moves from 174 to 200 (gain: 26 points; 85 value table points for moving from 1b to 1b)
• Student B: Moves from 190 to 190 (gain: 0 points; 85 value table points)
• Student C: Moves from 190 to 183 (gain: -7 points; 85 value table points)
• Should students A, B, and C be weighted the same?
• Student D: Moves from 174 to 182 (gain: 8 points; 20 value table points for moving from 1b to 1a)
• Should student C count more than student D?
Example: Grade 4 to 5 Math

• Focus on categories 1b and 2a (but possible for any adjacent categories in other subjects and grades)
• Grade 4: 1b covers 174 to 190, 2a covers 191 to 206
• Grade 5: 1b covers 183 to 200, 2a covers 201 to 217

• Student E: Moves from 191 to 217 (gain: 26 points; 90 value table points for moving from 2a to 2a)
• Student F: Moves from 191 to 218 (gain: 27 points; 125 value table points for moving from 2a to 2b)
• Student G: Moves from 190 to 218 (gain: 28 points; 150 value table points for moving from 1b to 2b)
• Is the difference between students E, F, and G substantial enough to justify three widely different weights?
Value Tables

Benefits
• Uses two points in time (better than attainment!)

Drawbacks
• Does not separate the teacher's effect on student growth from student-level factors
• Can lead to a focus on "bubble kids"
• Biased if the point values do not reflect the difficulty of making a particular improvement
• Ignores improvements within categories, and measurement error
• Easy to draw incorrect conclusions based on category instability

Implementation Requirements
• Maintain data from two consecutive years
• Type I or Type II (considerably more challenging for Type III)
• Creation of categories
• Consideration of unintended consequences and biases
Simple Growth

• Here we are not referring to the general definition of growth as outlined in state legislation, but to a specific type of growth model.
• Simple growth is the average gain in test scores across students.
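Because simple growth is just an average of score differences, it fits in a few lines. A minimal sketch (the scores are taken from the classroom table below):

```python
# Minimal sketch: simple growth is the average pre-to-post score gain.
def simple_growth(pre_scores, post_scores):
    """Average test-score gain across the students in a classroom."""
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return sum(gains) / len(gains)

# Two students gaining 11 and 13 points average a gain of 12.
print(simple_growth([312, 304], [323, 317]))  # 12.0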
Simple Growth

Classroom   Gain in Score from 3rd to 4th
1           11
2           13
3           12
4           7
5           20
6           19
7           15
8           22
9           18
10          29
11          35
12          33

Is it fair to compare these classrooms using simple growth?
Classroom   3rd Grade Score   4th Grade Score   Gain in Score from 3rd to 4th
1           312               323               11
2           304               317               13
3           301               313               12
4           294               301               7
5           288               308               20
6           278               297               19
7           275               290               15
8           264               286               22
9           259               277               18
10          256               285               29
11          244               279               35
12          238               271               33

(Classrooms are ordered from the highest to the lowest 3rd grade test score; the lowest-scoring classrooms show the largest gains.)
Simple Growth

Benefits
• Uses two points in time (better than attainment!)

Drawbacks
• Does not separate the teacher's effect on student growth from student-level factors
• Ignores test measurement error
• May be harder for some students to make gains, particularly high achievers
• Easy to draw incorrect conclusions based on test scale instability

Implementation Requirements
• Maintain data from two consecutive years
• Type I or Type II (to be safe)
• Subtraction!
• A test where the typical gain is the same regardless of starting point
• Consideration of unintended consequences and biases
Adjusted Growth

• Adjusted growth measures growth in student scores from one year to the next by looking at groups of students (divided according to their performance on a prior-year test) and comparing their performance on a post-test.
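A sketch of the comparison step, assuming the "typical growth" for each prior-score band has already been estimated from historical data; the bands and typical gains below are illustrative only, loosely following the "Using Real Data" tables that follow:

```python
# Adjusted-growth sketch: compare each student's gain to the typical gain
# for students who started in the same prior-score band. The bands and
# typical gains here are illustrative assumptions, not official values.
TYPICAL_GROWTH = [
    (300, 340, 10),   # prior score 300-339: typical gain of 10
    (275, 300, 17),   # prior score 275-299: typical gain of 17
    (230, 275, 26),   # prior score 230-274: typical gain of 26
]

def met_typical_growth(pre: float, post: float) -> bool:
    """Did the student gain at least the typical amount for their band?"""
    for low, high, typical in TYPICAL_GROWTH:
        if low <= pre < high:
            return (post - pre) >= typical
    raise ValueError(f"Pre-test score {pre} falls outside all bands")

# Example: a student moving 312 -> 323 gained 11 against a typical 10.
print(met_typical_growth(312, 323))  # True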
Using Real Data to Account for the Relative Difficulty of Making Gain

Student             3rd Grade Score   4th Grade Score   Gain in Score from 3rd to 4th
Allen, Susan        312               323               11
Anderson, Laura     304               317               13
Alvarez, Jose       301               313               12
Adams, Daniel       294               301               7
Anderson, Steven    288               308               20
Acosta, Lilly       278               297               19
Adams, James        275               290               15
Atkinson, Carol     264               286               22
Anderson, Chris     259               277               18
Alvarez, Michelle   256               285               29
Abbot, Tina         244               279               35
Andrews, William    238               271               33

(Students are ordered from the highest to the lowest 3rd grade test score.)
Using Real Data to Account for the Relative Difficulty of Making Gain

Student             3rd Grade   4th Grade   Gain from    Typical   Made at least
                    Score       Score       3rd to 4th   Growth    typical growth?
Allen, Susan        312         323         11           10        Y
Anderson, Laura     304         317         13           12        Y
Alvarez, Jose       301         313         12           13
Adams, Daniel       294         301         7            14
Anderson, Steven    288         308         20           16        Y
Acosta, Lilly       278         297         19           17        Y
Adams, James        275         290         15           18
Atkinson, Carol     264         286         22           20        Y
Anderson, Chris     259         277         18           21
Alvarez, Michelle   256         285         29           26        Y
Abbot, Tina         244         279         35           28        Y
Andrews, William    238         271         33           30        Y
Adjusted Growth

Benefits
• Takes students' starting points into account

Drawbacks
• Does not separate teachers' effect on student growth from demographic factors
• Ignores test measurement error

Implementation Requirements
• Maintain data from two consecutive years, plus additional historical data to determine typical growth
• Type I or Type II (to be safe)
• More complex methods than subtraction
• Consideration of unintended consequences and biases
Value-Added

A value-added model:
• takes a classroom of students' pre-test scores and demographic characteristics,
• compares those students to others like them, and
• predicts what their post-test scores would be assuming the students had an average teacher.

If the students' actual teacher was able to produce more growth than predicted in her students, she will have a high value-added score. This teacher "beat the odds."

Value-added is the difference between actual student achievement and the average achievement of a comparable group of students (where comparability is determined by prior scores, at a minimum).
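A bare-bones sketch of that logic: predict each student's post-test from the pre-test, then average the "actual minus predicted" gap by teacher. Real value-added models also adjust for demographics, use multiple prior years, and handle measurement error; this stripped-down version only shows the residual idea:

```python
# Stripped-down value-added sketch: fit an "average teacher" prediction
# with OLS, then average each teacher's residuals (actual - predicted).
import numpy as np

def value_added(pre, post, teacher_ids):
    pre, post = np.asarray(pre, float), np.asarray(post, float)
    teachers = np.asarray(teacher_ids)
    X = np.column_stack([np.ones_like(pre), pre])     # intercept + pre-test
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)   # fit the prediction
    residuals = post - X @ beta                       # beat-the-odds margin
    return {t: residuals[teachers == t].mean() for t in np.unique(teachers)}

# A positive average residual means the teacher's students outgrew the
# prediction; a negative one means they fell short of it.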
Value-Added
Visual Representation

[Diagram: a student's starting achievement scale score in Year 1 (prior-test) rises to an actual achievement scale score in Year 2 (post-test); the gap between actual achievement and predicted achievement (based on observationally similar students) is the value-added.]
Value-Added

Benefits
• Comprehensive measure that accurately separates the effects of the educator on student growth from other confounding factors

Drawbacks
• Requires the most data and more complex modeling

Implementation Requirements
• Maintain data from two consecutive years
• Large enough sample size (alone or in a consortium)
• Type I or Type II
• Research team or statistical capacity for calculation
• Statistical reference group score and demographic data
• Consideration of unintended consequences and biases
Capabilities Checklist
Please see the Capabilities Checklist handout
Goal Helper Tool

This is a goal helper, not a goal creator.
The goal helper provides a reality check based on historical data.
Goal Helper

• A tool to help you set realistic goals
• Four tabs:
  • Previous Year Raw Data (Blue)
  • Pre-Test Groups (Orange)
  • This Year Raw Data (Red)
  • Goal Output (Purple)
• Note: the tool also includes seven gray tabs; the information in these tabs relates to the workings of the tool and should not be altered.
Tab 1: Previous Year Raw Data

Column A
• Student ID
• Should be unique by student

Column B
• Last Year Pre-Test Score
• Must be a numeric value for this student's pre-test

Column C
• Last Year Post-Test Score
• Must be a numeric value for this student's post-test

The data here are for all students in the district taking the assessment at this grade level.
Tab 2: Pre-Test Groups

Desired Number of Pre-Test Groups
• Yellow box (ONLY)
• Choose between 1 and 8 groupings

Information is provided in the orange box for the Average Gain by Pre-Test Score Range.
Tab 3: This Year Raw Data

Column A
• Student ID
• Should be unique by student

Column B
• Pre-Test Score
• Must be a numeric value for this student's pre-test

The data here are for ONLY the students in a particular class or school taking the assessment at this grade level.
Tab 4: Goal Output

Column A: Student ID
Column B: Pre-Test
Column C: Pre-Test Group
Column D: Average Gain by Pre-Test Group
Column E: Post-Test

The purple box provides the projected group (class or school at this grade level) average Post-Test score.
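For readers who want the computation rather than the workbook, here is a sketch of what the four tabs do, written in pandas. The column names ("pre", "post") and the default of four groups are assumptions for illustration; the actual tool is an Excel workbook:

```python
# Sketch of the goal-helper logic: group last year's district students by
# pre-test score, compute each group's average gain, then project this
# year's class by adding that gain to each student's pre-test.
import pandas as pd

def project_post_test(district: pd.DataFrame, my_class: pd.DataFrame,
                      n_groups: int = 4) -> float:
    """Project a class's average post-test score from district history."""
    district = district.copy()
    # Tab 2: split district pre-test scores into groups, average the gains.
    district["group"], bins = pd.qcut(district["pre"], n_groups,
                                      labels=False, retbins=True)
    avg_gain = (district["post"] - district["pre"]).groupby(
        district["group"]).mean()
    # Tabs 3-4: place this year's students into the same pre-test groups
    # and add each group's historical average gain.
    group = pd.cut(my_class["pre"], bins, labels=False, include_lowest=True)
    projected = my_class["pre"] + group.map(avg_gain)
    return projected.mean()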
DuPage Goal Helper

To access the goal helper, please have one member of your team contact Linda Kaminski at the DuPage ROE: lkaminski@dupage.k12.il.us
Team Reflection Time
Please see Team Reflection Time Guiding Questions
Student Learning Objectives
What are SLOs? (A recap)

Student Learning Objectives (SLOs) are detailed, measurable goals for student academic growth to be achieved in a specific period of time (typically an academic year).

"A detailed process used to organize evidence of student growth over a specified period of time." – ISBE SLO webinar
Key Characteristics of SLOs*

Baseline Data and Rationale: What are student needs? What baseline data did you review and what did it demonstrate? What are student starting points?

Learning Goal: What should students know and be able to do at the end of your course? How does this goal align to content standards?

Assessments and Scoring: How will you measure the learning goal? How will you collect data, monitor, and score final student outcomes?

Target Population: Which students will you target in the SLO?

Expected Growth Targets: What is your specific target, or goal, for student growth? Are goals different for different student populations, based on starting points?

Time Span: What is the timeframe for the SLO? (typically a semester or a year)

Instructional Strategies: What instructional strategies will you employ to support this SLO? How will you modify strategies based on student progress?

* Derived from ISBE SLO Guidebook and Template
SMART Goals

Specific: Does the SLO statement identify a specific student population and growth target?

Measurable: How is growth being measured? Does the SLO identify a quantifiable growth target and a specific assessment/evidence source?

Appropriate: Is the SLO appropriate for the grade level/subject area? Are growth targets appropriate for the student population?

Realistic: Are goals and growth targets realistic, given the student population and assessment/evidence source?

Time Limited: Does the SLO indicate the time period within which the goal must be met?
Flexibility of Approaches to the SLO Process

From more structured to more flexible: New York, Georgia, Ohio, Wisconsin
Assessment Selection

• Flexible – Allows educators to select and/or develop their own assessments to measure student growth.
• Structured – Requires the use of pre-approved, standardized assessments.
Target Setting

• Statistical or model informed
  • SLO targets are determined using a statistical model to predict levels of student growth based on prior data.
• Objective, or standardized
  • A standardized, or common, way to set growth targets for teachers across classrooms, schools, and districts.
• Subjective
  • Growth targets are set based on historical data, student needs, and context.
  • Relies more on professional judgment.
Scoring

• Statistical or model informed
  • SLO scored using a statistical model that predicts levels of student growth and sets thresholds for final SLO ratings.
  • No scoring rubric needed.
• Objective, or standardized
  • Scoring rubric includes prescriptive criteria for assigning a rating category based on student outcomes.
• Subjective
  • Scoring rubric includes broad and/or subjective language for assigning a rating category based on student outcomes.
Structured: New York

• Assessment selection
  • Requires use of the state test where available
  • Provides a list of state-approved assessments for district use
• Target setting
  • Expectations based on a state-provided scale
• Scoring
  • Number of students reaching the target goal directly linked to final SLO score
New York State Scoring
Structured: Georgia

• Assessment selection
  • Teacher-selected, but assessments must meet minimal criteria
• Target setting
  • Teacher-selected, but the state approves the overall SLO goal
• Scoring rubric
  • Each possible score associated with the % of students that met the goal
Georgia (cont.)
Flexible: Ohio

• Assessment selection
  • Teacher-selected
• Target setting
  • Teacher-selected
  • Attainment discouraged
• Scoring rubric
  • Principals must use the SLO scoring matrix to score SLOs
Ohio (cont.)
Flexible: Wisconsin

• Assessment selection
  • Teacher-selected & principal approved
• Target setting
  • Teacher-selected & principal approved
• Scoring rubric
  • Outcomes & process
  • "Holistic" & self-scored
  • Ultimately principal determined
Trade-offs: Assessments

Flexibility
• Allows teachers to identify the best assessment, BUT
• Lack of comparability of student outcomes
• Is "passing" the same thing across different SLOs?

Standardization
• Provides clarity and potentially comparability, BUT
• Assessments may not directly align to classroom teaching
Trade-offs: Target-setting

Flexibility
• Allows educators to consider individual student needs and contextual factors, BUT
• Could potentially penalize educators who set ambitious goals & reward those who set less rigorous goals

Standardization
• Provides consistency in SLO goals across educators, BUT
• Formulas may not be appropriate for all courses, students, and assessments
Austin Example

Predicted growth = (Total Points - Pre-test Score) / 2

Example: (100 - 70) / 2 = 15 points, i.e., a predicted post-test score of 85

Issue of growth difficulty: the formula demands larger gains from students with lower pre-test scores.
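Written as a function (a sketch; the 100-point total is taken from the example above), the difficulty issue is easy to see:

```python
# The Austin rule as a function: predicted growth closes half the gap to
# the maximum score, so low pre-test scores face much larger targets.
def predicted_growth(pre_test: float, total_points: float = 100) -> float:
    return (total_points - pre_test) / 2

print(predicted_growth(70))  # 15.0 -> predicted post-test of 85
print(predicted_growth(40))  # 30.0 -> twice the required gain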
New York State Example

• Standardization can work well
  • NWEA MAP
  • Value-added model
Trade-offs: Scoring

Flexibility
• Supports professional judgment & takes extenuating circumstances into account, BUT
• Subjectivity in defining what "all" or "most" students means when assigning a rating

Standardization
• Allows for greater comparability across and within schools, BUT
• Limits the degree to which professional judgment enters into the SLO scoring process
Illinois?

Activity:
Consider the three major dimensions:
• Assessment selection
• Target setting
• Scoring rubric

Think about how these dimensions work together in your district to inform:
• Instruction
• Resource allocation
• High-stakes decisions

SLO Dimensions
[Diagram: three overlapping circles for assessment selection, target setting, and scoring rubric.]
Team Reflection Time
Lunch
Assessment Development
Agenda

• Why use assessments?
• What makes a high quality assessment?
• Selecting high quality assessments
• Designing high quality assessments
Why use assessments?

• Inform instruction
  • Allows the teacher to plan, support, monitor, and verify learning
• Support resource allocations
• High-stakes decisions
Assessment Types as Defined by PERA

Type I: An assessment that measures a certain group of students in the same manner with the same potential assessment items, is scored by a non-district entity, and is widely administered beyond Illinois. Examples: Northwest Evaluation Association (NWEA) MAP tests, Scantron Performance Series.

Type II: An assessment developed or adopted and approved by the school district and used on a district-wide basis that is given by all teachers in a given grade or subject area. Examples: collaboratively developed common assessments, curriculum tests, assessments designed by textbook publishers.

Type III: An assessment that is rigorous, aligned with the course's curriculum, and that the evaluator and teacher determine measures student learning. Examples: teacher-created assessments, assessments of student performance.
Need for Assessments in Illinois

ISBE requires a combination of
• Type I and II (and possibly III) assessments for principal evaluation.
• Type I, II, and III assessments for teacher evaluation.
What makes a high quality assessment?

• Selecting or constructing the right assessment for the right purpose
  • Does the assessment align with the curriculum standards?
  • Does the assessment align with what is actually being taught in the classroom?
• Need appropriate assessments for SLOs and other growth measures
Key Features of a High Quality Assessment

• Alignment: should cover the actual standards/constructs to be taught in the classroom
• Validity: measures what it claims to measure
• Reliability: consistency (the same test taker receives the same score)
How do I know if an assessment is valid and reliable?

Ask yourself:
• Would the same student, taking the test two days in a row, be likely to get the same score? (Reliability)
• Does the test measure the curriculum I taught that year? (Validity)
• Does the test cover all areas of the curriculum taught? (Validity)
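One quick, informal check of the first question, if you can administer the same test twice: correlate the two sets of scores. A value near 1 suggests consistent scores. The numbers below are made up for illustration:

```python
# Informal test-retest reliability check: Pearson correlation between two
# administrations of the same test (illustrative scores, not real data).
from statistics import correlation  # Python 3.10+

day1 = [74, 81, 66, 90, 58, 77]
day2 = [76, 79, 68, 88, 61, 75]
print(round(correlation(day1, day2), 3))  # near 1.0 -> consistent scores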
Assessments & Growth

• Imprecise, inaccurate assessments lead to imprecise, inaccurate growth measures
  • Floor & ceiling effects
  • Imprecision outside of proficiency thresholds
  • Coarse scaling
Selecting Assessments for SLOs

Quality considerations:
• Content alignment
• Stretch of growth
• Validity & reliability

Logistical considerations:
• Timing (time of year and assessment duration)
• Test form
• Administration
• Scoring
SLO Assessment Checklist Activity

With your team:
Step 1: Identify one or two assessments that you think teachers use/will use to write their SLO goals.
Step 2: Next, describe the key features of your assessment in the Assessment Description form.
Step 3: Finally, use the SLO Assessment Checklist to determine if your assessment meets the basic requirements for an SLO.
Designing Assessments

• At what point should teachers develop their own assessments?
• How should teachers develop their own assessments?
Hillsborough County Assessment Development

• Subject / Grade Coverage
  • Models from Art to Welding
• Multiple Measures
  • Charlotte Danielson observational ratings
  • Combined use of student outcomes and observational data in the state system
• Data Quality
  • Student-Teacher Link
LAUSD Gates Project

• Partnership between the Los Angeles Unified School District, Education Analytics, and the Gates Foundation to support teacher-developed assessments
• Contributors include assessment design and evaluation experts from around the country
• Overall goal: to develop a replicable process for supporting high-quality teacher-created assessments and SLO development
LAUSD Teacher-Developed Assessment Pilot

• Piloting Fall 2014 & Spring 2015
• 30 to 40 teacher participants
  • 5th grade Theatre
  • High School Biology
• Teacher Assessment Writing Workshops
  • Assessment blueprint creation
  • Item writing
  • Item review and revision
LAUSD Assessment Development Timeline

1. Design: Prioritize Standards & Create Assessment Blueprint – Coordinators
2. Write Items – Teachers
3. Editorial & Bias Review – Education Analytics Team & Assessment Experts
4. Content Review – Teachers & Coordinators
5. Field Testing – Representative biology and theatre classrooms
6. Analyze Data – Education Analytics Team
7. Revise Items & Create Operational Tests – Teachers and Education Analytics Team
Item Writing Process

Test Blueprint

Purpose:
• Communicate to teachers what is important to teach
• Communicate with students and parents what is important to learn
• Provide direction to the item writers and test developers
• Ensure appropriate coverage of content on the test
• Outline the cognitive complexity of the test items
Blueprint Development Process

• Identify the benchmarks or standards that will be measured on the test
  • Determine the benchmarks that will not be tested!
• Identify pre-requisite skills that should be measured
• Identify the weight of each blueprint entry
• Determine the cognitive complexity of the items that will be included on the test
Depth of Knowledge
Source: Webb, N.L., et al. (2005). Web Alignment Tool. Wisconsin Center for Education Research, University of Wisconsin-Madison. Retrieved from http://www.wcer.wisc.edu/WAT/index.aspx
Theatre Example

• Benchmark: 1.2 Identify the structural elements of plot (exposition, complication, crisis, climax, and resolution) in a script or theatrical experience.
• What pre-requisite skills should be measured?
  • Students need to know the terms & definitions
• What is the level of cognitive complexity?
  • "Identify"
  • Level 2 on DOK: Skill/Concept
• What weight should we assign the standard?
  • 20% of unit curriculum focuses on this content
Blueprint Example

Benchmark/Content   Recall   Skill/Concept   Strategic Thinking   Extended Thinking   Weight
Theatre: 1.2                 XX                                                       20%
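For teams that prefer checking blueprints programmatically, here is one way a blueprint could be held in code so weight coverage can be verified automatically. The first entry mirrors the theatre example; the second is a hypothetical filler so the weights total 100%:

```python
# Blueprint-as-data sketch. "Theatre 1.2" is from the example above;
# "Theatre 1.x" is a hypothetical placeholder for the remaining content.
BLUEPRINT = [
    {"benchmark": "Theatre 1.2", "dok": "Skill/Concept", "weight": 0.20},
    {"benchmark": "Theatre 1.x", "dok": "Recall",        "weight": 0.80},
]

def check_weights(blueprint) -> None:
    """Raise if blueprint entry weights do not sum to 100%."""
    total = sum(entry["weight"] for entry in blueprint)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"Blueprint weights sum to {total:.2f}, not 1.00")

check_weights(BLUEPRINT)  # passes silently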
Assessment item types

Constructed response
• Short answer
• Essays
• Labeling
• Performance-based tasks

Objective
• Multiple choice
• True/False
• Matching
• Fill-in
Why Performance Tasks?

• Used to measure learning outcomes and learning objectives that cannot be measured well by objective tests
• Suited for less structured problems and the creation of a product or a performance
Performance Tasks

• Emphasis is on doing, not merely knowing; on process as well as product
• Goal is to be as authentic as possible
• Can be narrow in definition or more broad and open
• Examples: oral presentations, online reading logs, podcasts, scientific experiments, Excel graphing

Source: Brenda Lyseng, Minnesota State Colleges and Universities Center for Teaching and Learning
Building a performance task

Learning Objectives: Which one do you want to assess?
Task: What is the core task you want students to perform?

Source: Brenda Lyseng, Minnesota State Colleges and Universities Center for Teaching and Learning
Building a performance task

• What content standards do you want students to demonstrate? → What knowledge will students demonstrate?
• What performance standards do you want students to demonstrate? → What skills will students demonstrate?
• What will be their sources of information? → Interviews, primary sources, secondary, textbook
• What type of product do you want? → Written report, oral report, recommendation, graph
• How will students work? → Individual, partner, team

Source: Brenda Lyseng, Minnesota State Colleges and Universities Center for Teaching and Learning
Performance Task Rubric

Considerations:
• What elements must be present to ensure high quality?
• How many levels do I want?
• What is a clear description of each achievement level?
• Rubrics are for you and the students – ask for feedback

Source: Brenda Lyseng, Minnesota State Colleges and Universities Center for Teaching and Learning
Why Multiple Choice Items?

• Efficient
• Objectively scored
• Tend to help with creating reliable tests
• Can be very effective in producing information on students' conceptual knowledge, procedural knowledge, and reasoning (Webb, 2006)
Parts of a Multiple Choice Item

Stem: What is the leading cause of death in the United States?
A. Cancer
B. Heart disease   ← Key
C. Accidents
D. Diabetes

The incorrect options (A, C, and D) are the distractors.
*The key and distractors collectively are called the response options.
MC Item Formats

• Correct answer
  • One option is indisputably correct and the other options are indisputably incorrect

What year was George Washington elected president?
A. 1783
B. 1789 *
C. 1790
D. 1795
MC Item Formats

• Best answer
  • One or more options are correct to varying degrees; however, one option is unquestionably better than the other options

According to the Mayo Clinic, what is the first thing you should do if you burn your skin?
A. Take a pain reliever.
B. Put a burn ointment on your skin.
C. Call your doctor.
D. Run your skin under cool water. *
MC Item Formats

• Negative format
  • Examinee is asked to identify which option is not correct or which is the worst answer
  • Use with caution!!
  • Most negative format items can be converted into correct answer or best answer formats

Which of the following is not one of the author's main points?
MC Item Formats

• Multiple answer
  • More than one option is keyed as correct
  • Examinee must identify all correct answers

Which of the following are recommended if you are trying to reduce your blood pressure? Select all that apply.
A. Reduce your sodium intake *
B. Get 8 hours of sleep each night
C. Exercise regularly *
D. Increase your fluid intake
MC Item Formats

• Complex format
  • Takes the options and regroups them into combined answers

Which of the following are benefits of multiple choice items?
i. They are efficient.
ii. They are objectively scored.
iii. Students like them.

A. i and ii
B. ii and iii
C. i and iii
D. i, ii, and iii
General Guidelines

• Each item should reflect one content area
  • What is the important content the students should know?
  • Do not test trivial content
• Use your test blueprint!
  • Identify the relevant test objective that the item is measuring
General Guidelines

• Use correct grammar, punctuation, capitalization, and spelling

For incomplete stems:
The process by which plants convert light energy into chemical energy is called
A. cellular respiration.
B. photosynthesis. *
C. osmosis.
D. evaporation.
Guidelines for Writing the Item Stem

• Keep the stem as brief as possible
  • Be clear and concise
  • Avoid "window-dressing"
  • Minimize reading time
  • Stems that are unnecessarily wordy or contain unneeded information can confuse test takers
Guidelines for Writing the Item Stem

• Be sure the stem contains the main idea of your item
• When possible, avoid using negative words
  • Not, never, except, least, false
  • EXCEPTION: Negative format items
  • First try to write the item in another format
Guidelines for Writing the Item Stem

• Use terms like often, frequently, rarely, and occasionally with caution
  • These words have different interpretations for different examinees
• Avoid being overly specific or overly general
• Avoid opinion-based items
  • Qualify any opinions
Guidelines for Writing the Item Stem

• Use a visual stimulus in the item stem only if it is necessary
  • A graph, chart, picture, diagram, etc.
  • Including a stimulus that is not necessary can detract from the purpose of your item

[Example item with a neuron diagram:]
Identify the component of a neuron that the arrow is pointing to.
A. Axon
B. Nucleus
C. Cell body
D. Dendrite *
Item Stems Needing Revision

• Original: "The 19th amendment to the U.S. Constitution, which was adopted on August 18, 1920, granted women the right to"
  Problem: Lengthy stem; contains unnecessary information
  Revised: "The 19th amendment to the U.S. Constitution granted women the right to"

• Original: "Romeo and Juliet"
  Problem: Too general; stem does not include the main idea
  Revised: "The main theme in Shakespeare's Romeo and Juliet is"

• Original: "The best movie of all time is"
  Problem: Stem is opinion based
  Revised: "According to the American Film Institute, the best movie of all time is"

• Original: "Which of the following is not found on Maslow's hierarchy of needs?"
  Problem: Negative stem
  Revised: "Which of the following is found at the top of Maslow's hierarchy of needs?"
Guidelines for Writing the Options

• Vary the position of the key
  • Assign the location of the key first
• Make the distractors plausible
  • What are the common mistakes students make or the common misconceptions they have?
  • Do not use funny distractors
Guidelines for Writing the Options

• Place the options in a logical order
  • Numerical, chronological, shortest to longest, etc.
• Avoid using specific determiners
  • Always, never, only, etc.

In order for a bill to become a law it
A. should only be written by a U.S. Senator.
B. must always be signed by the president.
C. needs to pass both houses of Congress. *
D. can never be vetoed.
Guidelines for Writing the Options

• Move any phrases that are repeated in all options to the stem

If a number is squared,
A. the result is the number times two.
B. the result is the number plus itself.
C. the result is the number times itself. *
D. the result is the original number.

BETTER:
If a number is squared, the result is the
A. number times two.
B. number plus itself.
C. number times itself. *
D. original number.
Guidelines for Writing the Options

• Keep the options independent from one another

Which of the following has been shown to lower cholesterol?
A. apples
B. yogurt
C. fruit
D. oatmeal *
Guidelines for Writing the Options

• In general, avoid using "none of the above" and "all of the above"
  • None of the above
    • Often not plausible to students
    • Measures the ability to recognize incorrect answers
  • All of the above
    • If a student can identify that two alternatives are correct, then they know that "all of the above" is the correct answer
Guidelines for Writing the Options: Giving Away the Correct Answer

• Key is too long or too specific

The term pessimism most nearly means
A. thoughtful.
B. happy.
C. the tendency to only see bad or undesirable outcomes. *
D. scared.
Guidelines for Writing the Options: Giving Away the Correct Answer

• Avoid one item giving away the answer to another item

2. In the passage, Iago is a
A. parrot. *
B. iguana.
C. flamingo.
D. snake.

4. The parrot in the story, Iago,..
Guidelines for Writing the Options: Giving Away the Correct Answer

• Avoid repeating words in the stem and the key

What is true of the Western Lowland Gorilla?
A. They are found on the continent of Asia.
B. The gorilla mostly consumes a vegetarian diet. *
C. Their lifespan is approximately 25 years.
D. Newborns weigh an average of 20 lbs. at birth.
Item Revision Activity

Activity: Take out your Item Revision worksheet.
(1) Identify the problem with each question's item(s)
(2) Rewrite the questionable item to make it higher quality
(3) Share out
Items for Revision
Who painted the Mona Lisa?
A. Giovanni Bellini
B. Sandro Botticelli
C. Leonardo da Vinci *
D. Mickey Mouse
Problem: Funny, implausible distractor
Items for Revision
The humpback whale’s diet consists mainly of
A. krill and small fish. *
B. seals.
C. sharks and large fish.
D. mammals.
Problem: Dependent options
Items for Revision

Minnesota
A. is the 41st state.
B. has the robin as its state bird.
C. is known as the "Show-Me-State".
D. is the 21st most populous state in the U.S., with most of its residents living in the Saint Paul/Minneapolis metro area. *

Problem: stem is too general; main point is not conveyed in the stem; key is too long and specific
Items for Revision
The first amendment of the U.S. Constitution grants
freedom of
A. speech.
B. religion.
C. press.
D. all of the above *
Problem: uses all of the above
Items for Revision
In the 1st book in the Harry Potter Series,
A. Harry, Ron, and Hermione kill a basilisk.
B. Harry, Ron, and Hermione only attend class.
C. Harry, Ron, and Hermione fight werewolf.
D. Harry, Ron, and Hermione find the Sorcerer's
Stone. *
Problem: repeated phrases in stem and options;
incorrect grammar; specific determiner
Office Hours