Northborough-Southborough DDM Development

Northborough-Southborough DDM Development
October 9, 2014
Dr. Deborah Brady
Northborough-Southborough

DDM 1 - MCAS (SGP)


For teachers who receive an SGP from MCAS (grades 4-8 for ELA and Math only)

The District is only required to use median Student Growth Percentiles (SGP) from one MCAS area per teacher.

In the first year, the K-5 DDM will focus only on MCAS ELA.

In grades 6-12, the MCAS focus may be either math or ELA.

The DDM rating is based on the SGP (student growth) and not the scaled scores (student achievement).
DDM 1 - Common Assessment

For teachers who do not receive an SGP from MCAS:

Teachers will develop grade level/course common assessments utilizing a pre- and post-assessment model.

DDM 2 - Common Assessment

For all teachers:

Teachers will develop grade level/course common assessments utilizing a pre- and post-assessment model.
Goal: 2014-2015
(DDMs must be negotiated with our Associations)

Content Student Learning DDMs

Core Content Areas (math, English, science, and social studies)
Year 1: Identify first two (of four) unique DDM data elements

Alignment of DDMs with Massachusetts Curriculum Frameworks

Identify/develop DDMs by common grades (K-12) and content

Create rubric

Collect first year of data
Year 2: Identify the second two (of four) unique DDMs, or reuse the 2014-2015 DDMs
(same assessment, different students)

Note: Consumer science, applied arts, health & physical education, business education, world language, and SISPs received a one-year waiver.

Planning: Identify/develop DDMs for 2015-2016 implementation
• Collect first year of data 2015-2016
Core DDMs (DDM 1 / DDM 2; CA = common assessment)

Grade | ELA | Math | Science | Social Studies
12 | CA/CA | CA/CA | CA/CA | CA/CA
11 | CA/CA | CA/CA | CA/CA | CA/CA
10 | CA/CA | CA/CA | CA/CA | CA/CA
9 | CA/CA | CA/CA | CA/CA | CA/CA
8 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
7 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
6 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
5 | MCAS SGP/CA | MCAS SGP/CA | – | –
4 | MCAS SGP/CA | MCAS SGP/CA | – | –
3 | CA/CA | CA/CA | – | –
2 | CA/CA | CA/CA | – | –
1 | CA/CA | CA/CA | – | –
Quality Assessments
• Substantive
• Aligned with standards of the Frameworks, vocational standards, and/or local standards
• Rigorous
• Consistent in substance, alignment, and rigor
• Consistent with the District's values, initiatives, and expectations
• Measures growth (to be contrasted with achievement) and shifts the focus of teaching
Scoring Student Work

Districts will need to determine fair, efficient and accurate methods for
scoring students’ work.

DDMs can be scored by the educators themselves, groups of teachers within
the district, external raters, or commercial vendors.

For districts concerned about the quality of scoring when educators score their own students' work, processes such as randomly re-scoring a selection of student work to ensure proper calibration, or using teams of educators to score together, can improve the quality of the results.

When an educator plays a large role in scoring his/her own students' work, a supervisor may also choose to consider the scoring process when making a Student Impact Rating determination.
Some Possible Common Exam Examples
• A Valued Process (PORTFOLIO): a 9-12 ELA portfolio measured by a locally developed rubric that assesses progress throughout the four years of high school
• K-12 Writing or Writing to Text: a district required that at least one DDM be "writing to text" based on CCSS-appropriate text complexity
• Focus on Data That Is Important: a HS science department assessment of lab report growth for each course (focus on conclusions)
• "New CCSS" Concern: a HS science department assessment of data, diagram, or video analysis
More
• CCSS Math Practices: a HS math department's use of PARCC examples that require writing, asking students to "justify your answer"
• SS Focus on DBQs and/or PARCC-like Writing to Text: a social studies department created a PARCC-style exam using primary sources; another social studies department used "mini-DBQs" in freshman and sophomore courses
• Music: writing about a concert
• Common Criteria Rubrics for Grade Spans: Art (color, design, mastery of medium), Speech (developmental levels)
More
• Measure the True Goal of the Course: autistic, behavioral, or alternative programs and classrooms; social-emotional development of independence (whole collaborative: each educator is measuring)
• SPED "Directed Study" Model: now has study skills explicitly recorded by the week for each student and by quarter on a manila folder (note-taking skills, text comprehension, reading, writing, preparing for an exam, time management), differentiated by student
• A Vocational School's use of Jobs USA assessments for one DDM and the local safety protocols for each shop
Assessing Math Practices
Communicating Mathematical Ideas
Clearly constructs and communicates a complete response based on:
• a response to a given equation or system of equations
• a chain of reasoning to justify or refute algebraic, function, or number system propositions or conjectures
• a response based on data
How can you assess these standards?
Demonstrating Growth
Billy Bob's work is shown below. He has made a mistake. In the space to the right, solve the problem on your own. Then find Billy Bob's mistake, circle it, and explain how to fix it.
Billy Bob's work:
½x - 10 = -2.5
    +10     +10
½x + 0 = +12.5
(2/1)(½)x = 12.5(2)
x = 25

Your work: _____________________________________________

Explain the changes that should be made in Billy Bob's work.

Finding the mistake provides students with a model.
Requires understanding.
Requires writing in math.
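For reference, a worked solution of the intended equation (assuming the problem is ½x - 10 = -2.5; Billy Bob's error appears to be in the addition step, where -2.5 + 10 should give 7.5, not 12.5):

```latex
\begin{align*}
\tfrac{1}{2}x - 10 &= -2.5 \\
\tfrac{1}{2}x &= -2.5 + 10 = 7.5 \\
x &= 2 \cdot 7.5 = 15
\end{align*}
```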
A resource for DDMs. A small step? A giant step? The district decides.
Which of the three conjectures are true? Justify your answer.
Determine if each of Michelle's three conjectures is true. Justify each answer.
Rubrics and grading: numbers good or a problem?
Objectivity versus Subjectivity
Calibration
• Human judgment and assessment
• What is objective about a multiple choice test?
• Calibrating standards in using rubrics
• Common understanding of descriptors
• What does "insightful," "in-depth," "general" look like?
• Use exemplars to keep people calibrated
• Assess collaboratively with uniform protocol
Consistency in Directions for Administering Assessments
• Directions to teachers need to define rules for giving support, dictionary use, etc.
• What can be done? What cannot?
• "Are you sure you are finished?"
• How much time?
• Accommodations and modifications?
Qualitative Methods of Determining an Assessment's VALIDITY
• Looking at the "body of the work": validating an assessment based upon the students' work
• Floor and ceiling effect
• If you piled the gain scores (not achievement) into High, Moderate, and Low gain piles, is there a mix of at-risk, average, and high achievers throughout each pile, or is one group mainly represented?
Low, Moderate, High Growth Validation
Did your assessment accurately pinpoint differences in growth?
1. Look at the LOW pile. If you think about their work during this unit, were they struggling?
2. Look at the MODERATE pile. Are these the average learners who learn about what you'd expect of your school's students in your class?
3. Look at the HIGH pile. Did you see them learning more than most of the others did in your class?
Based on your answers to 1, 2, and 3:
• Do you need to add questions (for the very high or the very low)?
• Do you need to modify any questions (because everyone missed them or because everyone got them correct)?
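A minimal sketch of this pile check, using hypothetical students and gain scores: bin the class by gain into Low, Moderate, and High piles and look at the mix of prior achievement levels in each pile. One group dominating a pile can signal a floor or ceiling effect.

```python
from collections import Counter

# Hypothetical records: (student, prior achievement level, gain = post - pre)
students = [
    ("A", "at-risk", 5), ("B", "average", 15), ("C", "high", 20),
    ("D", "at-risk", 25), ("E", "average", 25), ("F", "high", 30),
    ("G", "at-risk", 10), ("H", "average", 35), ("I", "high", 12),
]

# Sort by gain and split into three roughly equal piles: Low, Moderate, High.
ranked = sorted(students, key=lambda s: s[2])
third = len(ranked) // 3
piles = {
    "Low": ranked[:third],
    "Moderate": ranked[third:2 * third],
    "High": ranked[2 * third:],
}

# If one achievement group dominates a pile, the assessment may be measuring
# achievement rather than growth (possible floor or ceiling effect).
for name, pile in piles.items():
    mix = Counter(level for _, level, _ in pile)
    print(name, dict(mix))
```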
Body of the Work Validation (a psychometric process)
Look at specific students' work:
• Tracey is a student who was rated as having high growth.
• James had moderate growth.
• Linda had low growth.
Investigate each student's work:
• Effort
• Teachers' perception of growth
• Other evidence of growth
Do the scores assure you that the assessment is assessing what it says it is?
Objectivity versus Subjectivity
Multiple Choice Questions
• Human judgment and assessment
• What is objective about a multiple choice test?
• What is subjective about a multiple choice test?
• Make sure the question complexity did not cause a student to make a mistake.
• Make sure the choices in M/C are all about the same length, in similar phrases, and clearly different.
Rubrics and Inter-Rater Reliability
Getting words to mean the same to all raters
Category | 4 | 3 | 2 | 1
Resources | Effective use | Adequate use | Limited use | Inadequate use
Development | Highly focused | Focused response | Inconsistent response | Lacks focus
Organization | Related ideas support the writer's purpose | Has an organizational structure | Ideas may be repetitive or rambling | No evidence of purposeful organization
Language conventions | Well-developed command | Command; errors don't interfere | Limited or inconsistent command | Weak command
Protocol for Developing Inter-Rater Reliability
• Before scoring a whole set of papers, develop inter-rater reliability
• Bring High, Average, Low samples (1 or 2 each) (the HML Protocol)
• Use your rubric or scoring guide to assess these samples
• Discuss differences until a clear definition is established
• Use these first papers as your exemplars
• When there's a question, select one person as the second reader
Annotated Exemplar
Prompt: How does the author create the mood in the poem?
Student response: "The speaker's mood is greatly influenced by the weather. The author uses dismal words such as 'ghostly,' 'dark,' 'gloom,' and 'tortured.'"
Annotations: the answer and explanation are in the student's own words, with specific substantiation from the text.
“Growth Rubrics” May Need to Be Developed
Pre-conventional Writing (Ages 3-5)
• Relies primarily on pictures to convey meaning.
• Begins to label and add "words" to pictures.
• Writes first name.
• Demonstrates awareness that print conveys meaning.
• Makes marks other than drawing on paper (scribbles).
• Writes random recognizable letters to represent words.
• Tells about own pictures and writing.

Emerging (Ages 4-6)
• Uses pictures and print to convey meaning.
• Writes words to describe or support pictures.
• Copies signs, labels, names, and words (environmental print).
• Demonstrates understanding of letter/sound relationship.
• Prints with upper case letters.
• Matches letters to sounds.
• Uses beginning consonants to make words.
• Uses beginning and ending consonants to make words.
• Pretends to read own writing.
• Sees self as writer.
• Takes risks with writing.

Developing (Ages 5-7)
• Writes 1-2 sentences about a topic.
• Writes names and familiar words.
• Generates own ideas for writing.
• Writes from top to bottom, left to right, and front to back.
• Intermixes upper and lower case letters.
• Experiments with capitals.
• Experiments with punctuation.
• Begins to use spacing between words.
• Uses growing awareness of sound segments (e.g., phonemes, syllables, rhymes) to write words.
• Spells words on the basis of sounds without regard for conventional spelling patterns.
• Uses beginning, middle, and ending sounds to make words.
• Begins to read own writing.
Protocols to Use with Implemented Assessments
• Floor and Ceiling Effects
• Validating the Quality of Multiple Choice Questions
• Inter-Rater Reliability with Rubrics and Scoring Guides
• Low-Medium-High Looking at Student Work Protocol (calibration, developing exemplars, developing an action plan)
FAQ from DESE

Do the same numbers of students have to be identified as having high, moderate, and low growth?
There is no set percentage of students who need to be included in each category. Districts should set
parameters for high, moderate, and low growth using a variety of approaches.

How do I know what low growth looks like? Districts should be guided by the professional judgment of
educators. The guiding definition of low growth is that it is less than a year’s worth of growth relative to
academic peers, while high growth is more than a year’s worth of growth. If the course meets for less than
a year, districts should make inferences about a year’s worth of growth based on the growth expected during
the time of the course.

Can I change scoring decisions when we use a DDM in the second year? It is expected that districts are
building their knowledge and experience with DDMs. DDMs will undergo both small and large modifications
from year to year. Changing or modifying scoring procedures is part of the continuous improvement of DDMs
over time.

Will parameters of growth be comparable from one district to another? Different assessments serve different purposes. While statewide SGPs will provide a consistent metric across the Commonwealth and allow for district-to-district comparisons, DDMs are selected and scored locally and are not intended to support comparisons across districts.
Calculating Scores
What you need to understand as you are creating assessments
Example MCAS score reports (scaled score change / SGP): 288 to 244 with an SGP of 25; 230 to 230 with an SGP of 35; 214 to 225 with an SGP of 92. Note that the SGP reflects growth relative to academic peers, not the raw change in scaled score.
Median Student Growth Percentile

Last name | SGP
Lennon | 6
McCartney | 12
Starr | 21
Harrison | 32
Jagger | 34
Richards | 47
Crosby | 55
Stills | 61
Nash | 63
Young | 74
Joplin | 81
Hendrix | 88
Jones | 95

Imagine that the students listed above are all the students in your 6th grade class, sorted from lowest to highest SGP. The point where 50% of students have a higher SGP and 50% have a lower SGP is the median. With 13 students, the median is the 7th value, so the median SGP for this 6th grade class is 55 (Crosby).
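A minimal sketch of the median SGP calculation for this class, using Python's statistics module and the SGPs from the table above:

```python
import statistics

# SGPs for the 13 students in the example 6th grade class
sgps = [6, 12, 21, 32, 34, 47, 55, 61, 63, 74, 81, 88, 95]

# With an odd number of students, the median is the middle value after sorting.
median_sgp = statistics.median(sgps)
print(median_sgp)  # 55
```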
Sample Cut Score Determination (for local assessments)

[Example table: each student's pre-test score, post-test score, and difference (gain), with the differences sorted low to high.]

• The teacher's score for each DDM is based on the MEDIAN difference score of her class.
• Cut score for LOW growth: the lowest ___% of difference scores.
• Cut score for HIGH growth: the highest ___% (e.g., the top 20%).
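A minimal sketch of this local scoring model, using hypothetical pre/post scores and hypothetical cut percentages (the district sets the actual parameters): the teacher's DDM score is the median gain for the class, and each student's growth category comes from where that student's gain falls in the class distribution.

```python
import statistics

# Hypothetical pre- and post-test scores for one class on a local DDM
pre_post = [(20, 35), (25, 30), (30, 50), (35, 60), (35, 60), (40, 70)]
gains = sorted(post - pre for pre, post in pre_post)  # differences, low to high

# The teacher's score for this DDM is the MEDIAN gain of the class.
teacher_score = statistics.median(gains)
print("gains:", gains, "teacher score (median gain):", teacher_score)

# District-set cut scores: e.g. the lowest 20% of gains count as low growth
# and the top 20% as high growth (hypothetical percentages).
def growth_category(gain, all_gains, low_pct=20, high_pct=80):
    rank = 100 * sum(g < gain for g in all_gains) / len(all_gains)
    if rank < low_pct:
        return "low"
    if rank >= high_pct:
        return "high"
    return "moderate"

for pre, post in pre_post:
    print(pre, "->", post, growth_category(post - pre, gains))
```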
Important Perspective
It is expected that districts are building their
knowledge and experience with DDMs. DDMs will
undergo both small and large modifications from
year to year. Changing or modifying scoring
procedures is part of the continuous improvement
of DDMs over time.
We are all learners in this initiative.
Next Steps Today
• Begin to Develop Common Assessments
• Consider Rigor and Validity (Handout Rubrics)
• Develop Rubric (consider scoring concerns)
• Develop Common Expectations for Directions (to Teachers)
Other Important Considerations:
• Consider when assessments will be given
• The amount of time they will take
• The impact on the school
Handout Rubrics
• Bibliography: sample exams; sample texts
• Rubrics
• Types of questions (multiple choice, essay, performance)
• Reliability: Will you design two exams, pre- and post-?
• Ultimate validity: Does it assess what it says it does? How does it relate to other data?
• Step-by-step, precise considerations (DESE)
• Quality Rubric (all areas)
• Protocol for determining growth scores