Item and Test Development Process Collaborative Conference for Student Achievement

Item and Test Development
Process
Collaborative Conference for Student Achievement
March 30–April 1, 2015
NCDPI/Division of Accountability Services
Test Development
1
Test Development Presenters
Dan Auman – English Language Arts
Test Measurement Specialist, NCDPI
Tom Englehart – Education Research & Evaluation
Consultant, NCSU-TOPS
Iris Irving – Social Studies
Test Measurement Specialist, NCDPI
2
Assessment Cycle
• NCDPI Curriculum & Instruction develops
standards
• State Board of Education adopts standards
• Blueprint decided by Test Specification Panel
• Items are developed for adopted standards
• Field testing of items for adopted standards
• Data analyzed from field test items
• Operational testing of items
3
• Phase 1: 4 months
• Phase 2: 12 months
• Phase 3: 20 months
• Phase 4:
– 4 months/EOC
– 9 months/EOG
• Phase 5: 4 months
• Total: 44-49 months
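The 44-49 month range follows from summing the phases, with Phase 4 taking either the EOC or the EOG duration. A minimal arithmetic sketch (the variable names are illustrative and not from the presentation):

```python
# Phase durations in months, as listed on the slide.
fixed_phases = {"Phase 1": 4, "Phase 2": 12, "Phase 3": 20, "Phase 5": 4}
phase_4 = {"EOC": 4, "EOG": 9}  # Phase 4 length depends on the test type

# Total cycle length for each test type.
totals = {test: sum(fixed_phases.values()) + months for test, months in phase_4.items()}
print(totals)  # {'EOC': 44, 'EOG': 49}
```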
4
Frequently Asked Question
Question:
Did the test determine the content standards?
Answer:
No. Content standards drive item and test
development.
5
TOPS
• North Carolina State University/Technical Outreach
for Public Schools
• NCDPI Partner
• Content Specialists
• Item Development using NC teachers
• Form Development (paper and online)
• Production and Editing
• Printing, Shipping, and Distribution
• NCTEST Online Platform
• NCFE Constructed Response Scoring
6
NC Teacher Involvement in Item
Writing or Reviewing
Recruitment information
•1) Complete two courses on Test Development
(Content Standards Overview and Test Development Basics):
https://center.ncsu.edu/nc/x_courseNav/index.php?id=21
•2) Once the online training courses are completed,
teachers who are interested in item writing or reviewing
can go to http://goo.gl/forms/wXv4Imh0ko to submit an
interest form.
7
Test Development Basics
• Items designed to be accessible to NC’s
diverse student population
• Vocabulary
"TD101B: Test Development Basics." NC Education. North Carolina Department of Public Education.
8
Test Development Basics
• Alignment to a standard
• Clear and distinct answer
• Foils formatted in logical order
• Minimize wording
• Use third person
• Avoid contractions, stereotypes, idioms
• Use grade-level appropriate language
• Plausible distractors
• Review for bias
"TD101B: Test Development Basics." NC Education. North Carolina Department of Public Education.
9
Weak
Which describes a cloud that
will most likely produce
heavy rain, lightning, and
thunder?
A. The cloud will be tall
and dark.
B. The cloud will be low and
gray.
C. The cloud will be high and
broken.
D. The cloud will be high and
wispy.
Improved
Which describes a cloud that
will most likely produce
heavy rain, lightning, and
thunder?
A. tall and dark
B. low and gray
C. high and broken
D. high and wispy
"TD101B: Test Development Basics." NC Education. North Carolina Department of Public Education.
10
Weak
How much does a Channel
Bass usually weigh?
A. up to 75 pounds
B. less than 100 pounds
C. between 30 and 40
pounds
D. more than 30 pounds
Improved
How much does a Channel
Bass usually weigh?
A. Over 100 pounds
B. Between 60 and 80
pounds
C. between 30 and 40
pounds
D. Less than 20 pounds
"TD101B: Test Development Basics." NC Education. North Carolina Department of Public Education.
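The slide does not say why the first set of foils is weak. One plausible reading is that the weight ranges overlap, so more than one option could be correct. The sketch below checks for that, assuming each foil is encoded by hand as a (low, high) range in pounds; the encoding and the function name are illustrative, not part of the TD101B course.

```python
import math
from itertools import combinations

def overlapping_pairs(foils):
    """Return pairs of foil labels whose ranges share more than a boundary point."""
    pairs = []
    for (a, (lo_a, hi_a)), (b, (lo_b, hi_b)) in combinations(foils.items(), 2):
        # Two ranges overlap when the larger lower bound is below the smaller upper bound.
        if max(lo_a, lo_b) < min(hi_a, hi_b):
            pairs.append((a, b))
    return pairs

# Hand-encoded ranges for the two versions of the item (an assumption of this sketch).
weak = {"A": (0, 75), "B": (0, 100), "C": (30, 40), "D": (30, math.inf)}
improved = {"A": (100, math.inf), "B": (60, 80), "C": (30, 40), "D": (0, 20)}

print(len(overlapping_pairs(weak)))   # 6
print(overlapping_pairs(improved))    # []
```

Under this encoding every pair of weak foils overlaps, while the improved foils are mutually exclusive.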
11
Weak
Which are symbols of North
Carolina?
I. cardinal
II. honeybee
III. Boykin spaniel
IV. dogwood
A. I and III only
B. II and III only
C. III and IV only
D. I, II, and IV only
Improved
Which are symbols of North
Carolina?
A. cardinal and fire ant
B. Boykin spaniel and shad boat
C. Dogwood and soda
D. Honeybee and Plott Hound
"TD101B: Test Development Basics." NC Education. North Carolina Department of Public Education.
12
Selection Review
13
Selection Review
• English Language Arts items are all tied to
selections; therefore, before writing items,
selections must be reviewed and approved
• Approved selections must go through the
copyright process (with the exception of
works for hire or Public Domain)
14
Selection Review
• Some things to keep in mind when
reviewing selections:
– Is the grade level appropriate?
– Are words spelled correctly and appropriately? (with
the understanding that texts must be allowed a level of
authenticity)
– Is the length appropriate for the assigned grade level?
– Is an excessive number of footnotes needed?
(If more than 3, the selection probably belongs at a
higher grade level.)
15
Selection Review
• Avoid using selections that contain:
– Any focus on negative behavior or activities
– References to holidays such as Christmas, Halloween,
birthdays, etc. that are celebrated by some groups, but
not others
– Controversial topics such as magic, ghosts, witches,
and death
– Any focus on tragedies that have occurred in recent
years in NC (floods, hurricanes, tornadoes)
– Any articles that mention topics such as tobacco or
alcohol
16
Item Development Process
17
Item Characteristics
• Level of difficulty
– Easy, medium, hard
• Level of cognitive complexity
– Example: Depth of Knowledge (DOK) or
Revised Bloom’s Taxonomy (RBT)
• Different types of items
18
Types of Items
• Multiple-Choice (MC)
• Technology Enhanced (TE)
– Drag and Drop
– Text Identify
– String Replacement
• Constructed Response (CR)
– Gridded Response/Numeric Entry
– Short Answer
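The item characteristics and item types above could be captured in a small record. This is only a sketch: the class names, field names, and sample values are hypothetical and do not describe how NCDPI or TOPS actually stores items.

```python
from dataclasses import dataclass
from enum import Enum

class ItemType(Enum):
    MULTIPLE_CHOICE = "MC"
    TECHNOLOGY_ENHANCED = "TE"
    CONSTRUCTED_RESPONSE = "CR"

class Difficulty(Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"

@dataclass(frozen=True)
class Item:
    item_id: str            # hypothetical identifier
    standard: str           # content standard the item is aligned to
    item_type: ItemType
    difficulty: Difficulty
    complexity: str         # e.g., a DOK level or an RBT category

# Placeholder values for illustration only.
item = Item("item-0001", "example-standard", ItemType.MULTIPLE_CHOICE,
            Difficulty.MEDIUM, "DOK 2")
print(item.item_type.value)  # MC
```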
19
Item Development Process
17 Steps
• Teachers
• Content Lead
• Content Specialist
• Test Measurement Specialist (TMS)
• Production
• Editing
• Subject-specific DPI-Curriculum & Instruction
• EC/ESL/VI
20
Steps 1–3
• Step 1: Item created
– Assigned standard, DOK/RBT rating,
knowledge type and cognitive category
• Step 2: TOPS Content
• Step 3: Production
– Graphics, copyright check
21
Step 3 – Sample of Production Edits
4th Grade Science Released Item Fall 2015
(Item 5)
Step 2: alluded to an image
Step 3: Production created image
22
Sample Item: ELA at Step 2
(Grade 8 EOG Released Form 2012-13, Item 4)
What does saffron mean in the selection?
A) green
B) yellow
C) dark
D) light
23
Step 4: Teacher Review
• Select Answer
• Match
– Content/Course
– Content/Standard
– Content/DOK (NCSCS) or RBT (ES)
– Content/Difficulty
• Stem Quality
• Plausible Distractors
• Grade Level Vocabulary
• Appropriate Graphics
• Bias, insensitivity,
accessibility issues
• Overall item quality
• Additional comments
24
Step 4: Teacher Review
25
Steps 5–6:
Reconciliation, Production
• TOPS Content Specialists address
feedback from teacher reviews
– Teacher review suggested revision to Foil D
26
Step 7: DPI-Curriculum & Instruction,
Exceptional Children (EC), English as a Second
Language (ESL), Visually Impaired (VI)
DPI-Curriculum:
• Keyed correctly?
• Match
– Content
– DOK for NCSCS,
RBT for ES
• Select aligned standard
• Any bias, insensitivity,
or accessibility issues?
• Overall item quality
EC/ESL/VI:
• Any accessibility
issues?
• Overall item quality
from ESL perspective
• Overall item quality
from VI perspective
27
Steps 8–9: Reconcile, Production
• TOPS Content Specialists incorporate
feedback from DPI-Curriculum & Instruction
and EC/ESL/VI reviews as needed
28
Step 10: Test Measurement
Specialist (TMS)
• NCDPI/Test Development Section
• TMS Review Screen
29
Steps 11–12:
Reconcile, Production
• Incorporate Test Measurement Specialist
(TMS) suggestions
• Sample item: Clarification in stem
Steps 13–14:
Grammar, Security
• Copyright permissions, etc.
30
Step 15: Final Approval
• TOPS Content Lead
Step 16: Production Edits
Step 17: Item Approved
• Ready for placement on a form
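The 17 steps walked through above can be written down as an ordered lookup. This is a sketch only: where a slide covers two steps at once (for example "Steps 5–6: Reconciliation, Production"), assigning one label to each step number is an assumption, and the function name is hypothetical.

```python
# Step labels taken from the slide headings; paired steps are split one label per number.
ITEM_STEPS = {
    1: "Item created",
    2: "TOPS Content",
    3: "Production",
    4: "Teacher Review",
    5: "Reconciliation",
    6: "Production",
    7: "DPI-Curriculum & Instruction, EC/ESL/VI review",
    8: "Reconcile",
    9: "Production",
    10: "Test Measurement Specialist (TMS) review",
    11: "Reconcile",
    12: "Production",
    13: "Grammar",
    14: "Security",
    15: "Final Approval",
    16: "Production Edits",
    17: "Item Approved",
}

def next_step(current):
    """Return the step number after `current`, or None once the item is approved."""
    if current not in ITEM_STEPS:
        raise ValueError(f"unknown step: {current}")
    return current + 1 if current < 17 else None

print(ITEM_STEPS[next_step(3)])  # Teacher Review
```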
31
Released Sample Item – Steps 16–17
Based on the sentences below, what does saffron mean?
“But high up, their tops were green and caught the saffron
light of the west. He remembered that when a boy, he had
thought there was nothing more beautiful than the evening
sunshine falling athwart the dark green fir boughs on the
hills.”
A) green
B) yellow
C) dark
D) west
32
Form Building Process
33
Blueprint
• Priority of Topics and Sub-topics
determined by the Test Specification
Panel
–NC Teachers
–DPI-Curriculum & Instruction
–DPI-Exceptional Children
–DPI-Test Development
–Outside Content (e.g., university professors)
34
Blueprint, cont.
• All forms of each test are built to the same test
specification
• Topic Level:
– Forms within a subject have the same
distribution of items by topic
• Sub-topic Level:
– Forms within a subject may or may not
have the same distribution of items by sub-topic
35
Example: Biology EOC
- “Ecosystems” domain has 6 sub-standards
- Items from “Ecosystems” will fall within
18-22% of total score points
- May have variation in sub-standards
tested across forms
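A blueprint range such as the 18-22% in the Biology example can be checked against a draft form. The sketch below assumes each item is represented as a (domain, score points) pair; the sample form and the function names are hypothetical.

```python
def domain_share(item_points, domain):
    """Percent of total score points contributed by items in `domain`."""
    total = sum(points for _, points in item_points)
    in_domain = sum(points for d, points in item_points if d == domain)
    return 100 * in_domain / total

def within_blueprint(item_points, domain, low, high):
    """True if the domain's share of score points lies in [low, high] percent."""
    return low <= domain_share(item_points, domain) <= high

# Hypothetical 50-point form: 10 one-point Ecosystems items and 40 other points.
form = [("Ecosystems", 1)] * 10 + [("Other", 1)] * 40
print(domain_share(form, "Ecosystems"))              # 20.0
print(within_blueprint(form, "Ecosystems", 18, 22))  # True
```

The check works at the domain level only, which matches the point that sub-standards may vary across forms.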
36
Form Building Process and
Blueprint
• Test items on forms used from year to
year are different.
– Tests are equivalent at the total score level, not
at the sub-topic level.
– Thus, forms from year to year may have
more or fewer items on a particular topic or
sub-topic.
37
Form Review
27 Steps
• Content Lead
• Content Manager
• Test Measurement Specialist (TMS)
• Production, Editing
• TOPS/IT Staff
• Outside Content Specialist
• Subject-specific NCDPI Content Specialist
• Psychometrician
38
Building Base Forms
(e.g., Forms A, B, C)
• Step 1: Operational items selected by
psychometricians
• Step 2: Production edits (as needed)
• Step 3: TOPS Content review
• Step 4: DPI-TMS Review/Key Balance
39
Building Base Forms, cont.
• Step 5: TOPS Content Reconcile
• Step 6: Outside Content Key Check
• Step 7: TOPS Content Reconcile
• Step 8: Psychometric Review/Key Balance
• Step 9: Production
• Step 10: Grammar
• Step 11: Content Lead Review/Finalize Form
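The slides do not define "Key Balance". The sketch below assumes it refers to how evenly the correct answers are spread across options A to D on a form; the tolerance value and the function names are hypothetical.

```python
from collections import Counter

def key_balance(answer_key, options="ABCD"):
    """Count how often each option is the correct answer."""
    counts = Counter(answer_key)
    return {opt: counts.get(opt, 0) for opt in options}

def is_balanced(answer_key, options="ABCD", tolerance=2):
    """True if the most and least used keys differ by at most `tolerance` items."""
    counts = key_balance(answer_key, options).values()
    return max(counts) - min(counts) <= tolerance

key = list("ABCDABCDABCDDCBA")  # hypothetical 16-item answer key
print(key_balance(key))  # {'A': 4, 'B': 4, 'C': 4, 'D': 4}
print(is_balanced(key))  # True
```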
40
Frequently Asked Question
Question:
Does anyone review the item statistics for each
item each year?
Answer:
Yes. Item statistics are reviewed for every item
on every form after semester and yearlong test
cycles as soon as a representative data sample
is received by Accountability Services.
41
Embedded Sub-Form Review
(e.g., A1, A2, A3)
• NC field tests items by creating sub-forms
• Each sub-form has the same operational items but
different field test items
• Field test items are developed, aligned to the content
standards, and then piloted/field tested (item tryout)
– 2 to 1 ratio for item needs
• Field test items are not included in students’ scores
• Items deemed to meet technical criteria based on the
field test statistics are then placed on a test form the
following year
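The embedded design above can be sketched as sub-forms that share one operational item set and differ only in their field test items, with scoring limited to the operational items. All item IDs and function names here are hypothetical.

```python
def build_subforms(base_name, operational, field_test_sets):
    """Map sub-form names (A1, A2, ...) to their operational and field test items."""
    return {
        f"{base_name}{i}": {"operational": list(operational), "field_test": list(ft)}
        for i, ft in enumerate(field_test_sets, start=1)
    }

def score(subform, correct_item_ids):
    """Count correct responses on operational items only; field test items are ignored."""
    return sum(1 for item_id in subform["operational"] if item_id in correct_item_ids)

subforms = build_subforms("A", ["op1", "op2", "op3"], [["ft1"], ["ft2"], ["ft3"]])
print(sorted(subforms))                               # ['A1', 'A2', 'A3']
print(score(subforms["A1"], {"op1", "op3", "ft1"}))   # 2
```

In the second call the student answered field test item ft1 correctly, but it does not change the score.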
42
Common Questions
Why are items field tested?
• Before an item is placed on a test form, its statistics
are needed to control the overall difficulty and
reliability of the form.
Is one test form harder than another form?
(EOC/EOG)
• No, all of the forms (online and paper) of a given test
for a grade/subject are equivalent with respect to
content and difficulty (one form is not harder or easier
than another)
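The slides do not say which item statistics are computed from field test data. One common classical statistic is the proportion of students who answer an item correctly, shown here on made-up responses as an illustration only.

```python
def proportion_correct(responses):
    """Classical item difficulty: share of correct responses (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

# Hypothetical scored responses for two field test items.
field_test_responses = {"ft1": [1, 1, 0, 1], "ft2": [0, 0, 1, 0]}
difficulty = {item: proportion_correct(r) for item, r in field_test_responses.items()}
print(difficulty)  # {'ft1': 0.75, 'ft2': 0.25}
```

A higher value means an easier item, so values like these could inform which items are combined to keep forms comparable in overall difficulty.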
43
EOG/EOC Resources
• Released Forms
http://www.ncpublicschools.org/accountability/testing/releasedforms
• EOG Test Specifications
http://www.ncpublicschools.org/accountability/testing/eog/
• EOC Test Specifications
http://www.ncpublicschools.org/accountability/testing/eoc/
• Technical Reports
http://www.ncpublicschools.org/accountability/testing/technicalnotes
• Guidelines, Practice and Examples Math Gridded
Response Items
http://www.ncpublicschools.org/accountability/testing/eoc/
44
NC Final Exams Resources
• NC Final Exams (test specs, released forms/items,
reference sheets)
http://www.ncpublicschools.org/accountability/common-exams/
• Fall 2014 Released Item Sets
http://www.ncpublicschools.org/accountability/common-exams/releaseditems/
• Assessment Specifications
http://www.ncpublicschools.org/accountability/commonexams/specifications/
45
Resources
• NC Testing Program Overview
http://www.ncpublicschools.org/docs/accountability/1415testoverview.pdf
• NC Testing Calendar
http://www.ncpublicschools.org/docs/accountability/testing/calendars/1415optestcal.pdf
46
Questions?
Thank you!
47