
MSSAA

District Determined Measures

October 1, 2015

Dr. Deborah Brady dbrady3702@msn.com

Do Now

"Tent" ID

With a marker, create a "tent" ID with your name, position, and district.

In a word or phrase: in your school/department/district, how do you feel about DDMs?

Deb Brady

Consultant

District:

DDMs are complex

Defining DDMs

Growth Versus Achievement

DDM Quality Criteria

Agenda

Lunch

Some Good, Bad, and Mediocre Examples

Determining Validity and Reliability

Next Class—Using the Results, CASW, Calibration, Protocols, Excel

On-line access to materials now at: https://wikispaces.com/join/HQ4FDRM

Code HQ4FDRM

Where is your district? What are your NEXT STEPS?

1. What are DDMs?
2. Assessment selection & development
3. Piloted a few DDMs
4. June report data; waivers are done
5. Ready to interpret scores into Low, Moderate, High

Concerns heard along the way:

• "The state has given us too much to do. We can't possibly do DDMs plus teacher evaluation plus PARCC plus the Common Core!"
• Which tests that we give now can we use or adapt?
• Defining alignment with the curriculum
• Defining rigor
• Defining good assessments
• Defining the next steps for your district
• Looking at the results: Do they reflect what happened in the classroom? Can we change these pilots? Where do we keep this data?
• The June report: name of DDM, kind of assessment, grade level, course, indirect DDMs for nurses, SISP, administrators; requesting waivers
• What's high growth? What's a year's growth? What's low growth?
• Who scores this data? Where do we store this data?
• Can we change these assessments?

What are your NEXT STEPS?

1. What are DDMs?
2. Deciding which assessments to use
3. Piloted a few DDMs
4. June report; waiver requests all set
5. Ready to interpret results into Low, Moderate, High growth

Agenda

Defining DDMs

Growth Versus Achievement

DDM Quality Criteria

Lunch

Some Good, Bad, and Mediocre Examples

Determining Validity and Reliability

Next Class—Using the Results, Collaboratively Looking At Student Work, Calibration, Protocols, Excel

On-line access to materials now at: https://wikispaces.com/join/HQ4FDRM

Code HQ4FDRM

District Determined Measures

DEFINITION

DDMs are defined as: "Measures of student learning, growth, and achievement related to the Curriculum Frameworks, that are comparable across grade or subject level district-wide."

TYPES OF MEASURES

• Portfolio assessments
• Approved commercial assessments
  – MCAS ELA 4-8; MCAS Math 4-8
• District developed pre- and post- unit and course common assessments
• Capstone projects

Quality Assessments

Almost any kind of assessment can work. It must be:

• A "substantive" assessment (DESE)
• Aligned with at least 2 standards of the Frameworks or 1 power standard, and/or local standards
• Rigorous (appropriately challenging; locally defined)
• Consistent with K-12 DDMs in substance, alignment, and rigor
• Consistent with the district's values, initiatives, and expectations
• A measure of growth (to be contrasted with achievement), shifting the focus of teaching: from achievement to growth for all students, from teaching to learning, and from the teacher to the learner

Timeline for DDMs and Impact Ratings

Last year (implementation): district-wide training, development of assessments, and piloting of the 5 required DDMs (primary ELA or math; elementary ELA or math; MS math; HS writing to text; plus an untested area). June 2014 report: list of DDMs from the district, plus any waivers granted by DESE.

2014-2015: Year 1 for the 1st Cohort (non-waivered areas); scores are divided into High, Moderate, and Low and stored locally. The 2nd Cohort comprises the areas waivered by DESE based on the June report.

2015-2016: A second year of data is collected for the 1st Cohort. The first year of DDM data is collected for the 2nd Cohort.

2016-2017: In October 2016, the first DDM rating of High, Moderate, or Low is given to the 1st Cohort. The Impact Rating, linked to the educator's EPIMS number, is sent to DESE with the October 1 report, based on 2 years of data for two DDMs. A second year of data is collected for the 2nd Cohort; their Impact Ratings will be calculated and sent to DESE by October 2017.

June Report Data and Waivers

• NAME of the DDM (Math 6 Proportions; ELA 7 Argument)
• SOURCE: district developed, commercial, open source
• GRADE (PK-2, 3-5, 6-8, HS, multiple)
• CONTENT AREA or ROLE
• TYPE OF ASSESSMENT
  – DIRECT: pencil and paper (on demand), performance/project, writing prompt/essay, portfolio, hybrid (a combination of any 2)
  – INDIRECT: mainly for administrators and SISP (nurses, guidance, OT/PT, principals, directors)
• Number of educators using this DDM: 1, 2, 3-5, 6-10, 11-20, 20-40, 40+

All waivers that were requested were granted; what was waivered varies by district.

NOTE: All of the grades, teacher names, and assessments are internal to your district.

DDM Quick Review

• Must measure growth, not achievement; growth equals one year's progress.
• Each educator will have at least 2 DDMs.
• Teachers' DDMs will be assessments of student work, called DIRECT measures.
• Most growth will be based on a pre-test before teaching and a post-test after teaching.
• MCAS SGP can be used ONLY for grades 4-8 ELA and math (not grade 10).
• Scores (out of 100%) or rubrics can be used to measure progress.
• One measure must not be MCAS; it must be a district-determined measure, which can include local assessments and normed assessments (Galileo, DRA, MAP, SAT). However, self-contained classroom teachers may use both ELA and math SGPs if the district makes that decision.
• Some SISPs, administrators, and nurses may have 1 INDIRECT measure (a few who do not work with students may have 2 indirect measures).
• INDIRECT measures are like SMART goals: attendance, graduation rates, MCAS.

Performance & Impact Ratings

Performance Rating

Ratings are obtained through data collected from observations, walkthroughs and artifacts

 Exemplary

Proficient

Needs Improvement

Unsatisfactory

Impact Rating

( October 1, 2016 for 1 st Cohort)

Ratings are based on trends and patterns in student learning, growth and achievement over a period of at least 2 years Data gathered from DDM’s and State-wide testing

High

Moderate

Low

Year 1 Year 2

PROFESSIONAL

JUDGMENT

There are no weights or percentages that dictate how an evaluator must interpret pattern and trend data to determine a Student Impact Rating for an individual educator.

Analyzing the 4 DDM Data Points

Impact Ratings rest on 4 data points (two DDMs over two years):

• If more than half of the ratings point to the same rating, that is the rating.
• If there is no clear result, the default is Moderate.

Examples:

• Low, Moderate, Moderate, Moderate → Overall Student Impact Rating: Moderate.
• L, L, M, M → no majority, so Moderate. Moderate is the default without compelling evidence otherwise.
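A minimal sketch of this aggregation rule in Python (the two-DDM, two-year layout and the tie-breaking default come from the slide; the function name is ours):

```python
from collections import Counter

def impact_rating(ratings):
    """Aggregate 4 DDM data points (2 DDMs x 2 years) into one
    Student Impact Rating: majority rules; otherwise default to
    Moderate, per the slide's guidance."""
    rating, n = Counter(ratings).most_common(1)[0]
    # "More than half" of 4 data points means 3 or more.
    if n > len(ratings) / 2:
        return rating
    return "Moderate"  # no clear result -> default is Moderate

print(impact_rating(["Low", "Moderate", "Moderate", "Moderate"]))  # Moderate
print(impact_rating(["Low", "Low", "Moderate", "Moderate"]))       # Moderate (tie)
```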

Performance and Impact Ratings

Educator Plans by Performance Rating and Rating of Impact on Student Learning (Low, Moderate, High):

• Exemplary or Proficient performance + Moderate or High impact → 2-yr Self-Directed Growth Plan
• Exemplary or Proficient performance + Low impact → 1-yr Self-Directed Growth Plan
• Needs Improvement → Directed Growth Plan
• Unsatisfactory → Improvement Plan

Results of Impact Ratings on the Evaluation Cycle

• Exemplary or Proficient performance, High or Moderate impact:
  – District-negotiated result; may include recognition or eligibility for additional roles (e.g., mentoring)
  – No meeting required
• Exemplary or Proficient performance, Low impact:
  – 1-year Self-Directed Plan
  – Required meeting when the evaluator confirms the rating
  – Educator and evaluator analyze the data; Educator Plan goals may be impacted
  – If the evaluator judges that the rating should be higher, this change must be approved by the superintendent
• Needs Improvement (High, Moderate, or Low impact): Directed Growth Plan
• Unsatisfactory (High, Moderate, or Low impact): Improvement Plan

Five Requirements of DDMs (DESE)

1. Must measure growth.
2. Employ a common administration procedure.
3. Use a common scoring process.
4. Translate these assessments to an Impact Rating (High, Moderate, Low).
5. Assure comparability of assessments within the school (rigor, validity).

Comparable Assessments

1. Comparable across schools and within grades

• Example: teachers with the same job (e.g., all 5th grade teachers or all teachers of English 8) give the same assessment.
• Where possible, measures are identical; it is easier to compare identical measures.
• But the district can determine whether or not these identical measures provide meaningful information about all students.
• Exceptions: when might assessments not be identical?
  – Different content (different sections of Algebra I)
  – Differences in untested skills (reading and writing on a math test for ELL students)
  – Other accommodations (fewer questions for students who need more time)

Co-Taught Classes and Teacher of Record

• In some co-taught classes, or when teachers share in the results of the assessment because of additional time spent in a pull-out class, districts may need to determine the teacher or teachers responsible for content covered by statewide testing.

• Co-teachers may share equally if both co-teach all students, or the district may determine one teacher as the teacher of record.

Table Talk

Communication Plan

DDMs

Quality Assessments

Roster Verification

When DDM Results Have Been Calculated

• Educators confirm the accuracy of their rosters.
• A student must be on the roster by October 1 and must remain on the roster through the last day of DDM testing.
• A student must be present for 90% of instructional time.

Direct measures of student learning, growth, or achievement provide information about student growth from student work. Most classroom teachers will have 2 direct measures.

Indirect measures of student learning, growth, or achievement provide information about students from means other than student work. These measures may include student record information (e.g., grades, attendance or tardiness records, or other data related to student growth or achievement, such as high school graduation or college enrollment rates).

ESE recommends that at least one of the measures used to determine each educator's Student Impact Rating be a direct measure, and MCAS SGP if available and appropriate.

Indirect Measure Examples

Consider the teams that you are a part of, for example, what many schools call the “child study team” in which many specialists participate, or all of your job-alike colleagues.

Discuss the goals that you have as a district group or as a school-based group. For example, you may be working on increasing the attendance of a group of students who are frequently out or tardy. Or, you may be working to return students to class quickly. Or, you may be focusing on working with a few families or students to support students with, for example, school anxiety.

Note that the measures can focus on a small group and do not have to include all students. For example, students with emotional problems that prevent them from participating fully can be your subgroup. For those educators with a small caseload, supporting your caseload to return to class and to participate fully in the class may be your goal.

Select a goal on something that is already an initiative or is recognized as something worthwhile for students.

This goal needs to connect indirectly to student growth. Making sure students are in class and are able to participate fully is an appropriate goal.

Social-emotional growth is an appropriate direct measure.

Develop a method for measuring this goal. For example, nurses in one district realized that the software they used could also track their goal of returning students to classes quickly. In another district, a school-based team realized that their "Student Support Team" had been reviewing numbers of classroom incidents and absences but had not put aside time to study the most at-risk students fully. Their goal was to change their weekly process so that time was set aside to discuss a student more thoroughly and to develop an action plan for that student. Their record was their weekly agenda, and they counted the increased number of action plans and tracked the progress of the most at-risk students.

Indirect Measures More Specifics:

School-based student support teams: Increase attendance, decrease tardiness, increase the number of families involved with their students’ success plans, increase the in-depth studies of students and track their progress, decrease the number of behavior incidents.

Librarians: Increase the number of classes that work on research and projects in the library. Increase the number of teachers with whom you work to support specific units with materials.

Behavior specialists, speech pathologists, social workers, school psychologists: Increase the number of students who participate in class more fully. You can look at a sub-set of your students or caseload. For example, a behavior specialist was going to count the number of responses to a “non-classroom adult” that students made as they walked to gym or to lunch. This is a DIRECT MEASURE.

Example: A group of school psychologists was concerned that many students were referred for special education testing before interventions had taken place. They wanted to make sure the "Student Study Team" processes were uniform, so they provided a referral instrument that was consistent within the district. A goal was to increase the number of times the instrument was used to provide appropriate interventions before a student was considered for special education assessment.

A group of speech pathologists used a developmental scale (like a holistic rubric) to measure the improvement of students' speaking skills. They assessed these skills in their classroom and in general education classrooms. This is a DIRECT MEASURE.

If IEP goals measure student academic or social growth, attainment of success in working toward these goals can be used as a DIRECT MEASURE. Thus, achieving IEP growth goals for a caseload can be a goal. Please note that DESE still considers subgroup sizes under 7 too small for growth measurement.

Guidance counselors set increasing the number of college applications as a goal. Their comprehensive initiative included visiting classrooms to encourage more students to take the PSAT, working with the administration to provide SAT workshops, and offering workshops through the school in writing college application letters.

Table Talk

Direct and Indirect Measures

Teacher of Record

Roster Verification

Growth versus Achievement

MCAS examples (scaled score, pre to post / SGP):

• 288 to 244 / 25 SGP
• 230 to 230 / 35 SGP
• 214 to 225 / 92 SGP

"Cut Scores" for MCAS SGP

[Figure: SGP scale from 0 (lower growth) to 100 (higher growth), with 50 marking typical growth (one year's growth); low cut points at 35 (classroom) and 40 (whole grade), high cut points at 60 (whole grade) and 65 (classroom); vertical axis shows percent of students.]

MCAS SGP Cut Scores

• Teachers: Low = 35 or lower; Moderate = greater than 35 but less than 65; High = 65 or higher
• Administrators: Low = 40 or lower; Moderate = greater than 40 but less than 60; High = 60 or higher

• More latitude is given to teachers because of the statistically small numbers in a classroom (at least 20 students).
• Administrative "cut scores" for Low and High are based upon the entire grade's scores.
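Expressed as a lookup, the cut scores above might be applied like this (a sketch: the thresholds are the slide's; the function is ours):

```python
def sgp_rating(median_sgp, role="teacher"):
    """Translate a median SGP into Low/Moderate/High using the
    slide's cut scores: 35/65 for teachers (classroom),
    40/60 for administrators (whole grade)."""
    low, high = (35, 65) if role == "teacher" else (40, 60)
    if median_sgp <= low:
        return "Low"
    if median_sgp >= high:
        return "High"
    return "Moderate"

print(sgp_rating(55))                        # Moderate
print(sgp_rating(62, role="administrator"))  # High
```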

Median student growth percentile

Imagine that the students listed below are all the students in your 6th grade class, sorted from lowest to highest SGP:

Lennon 6, McCartney 12, Starr 21, Harrison 32, Jagger 34, Richards 47, Crosby 55, Stills 61, Nash 63, Young 74, Joplin 81, Hendrix 88, Jones 95

The point where 50% of students have a higher SGP and 50% have a lower SGP is the median. Median SGP for this 6th grade class: 55.
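Computing a class median is a one-liner; a sketch with the class above:

```python
import statistics

sgps = [6, 12, 21, 32, 34, 47, 55, 61, 63, 74, 81, 88, 95]
# The middle value of the 13 sorted SGPs: half the class above, half below.
print(statistics.median(sgps))  # 55
```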

Using median student growth percentiles: growth by achievement for schools

[Figure: scatter plot of schools, with ELA median SGP (2009) on the x-axis (0-100) and achievement on the y-axis; quadrants labeled Lower achieving/Lower growing, Lower achieving/Higher growing, Higher achieving/Lower growing, and Higher achieving/Higher growing.]

Student growth percentiles

• Each student's rate of change is compared to other students with a similar test score history ("academic peers").
• The rate of change is expressed as a percentile.
• How much did John improve in mathematics from 5th grade to 6th grade, relative to his academic peers?
• If John improved more than 65 percent of his academic peers, then his student growth percentile would be 65.
• NOTE: differences of fewer than 10 SGP points are likely not educationally meaningful.
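The state computes SGPs with a quantile-regression model on score histories, but the idea in the "John" bullet can be illustrated naively: rank a student's gain against a set of academic peers (the peer gains below are invented for illustration):

```python
def toy_growth_percentile(student_gain, peer_gains):
    """Percent of academic peers the student out-grew.
    A toy illustration only: the actual SGP model is a quantile
    regression on test score histories, not raw gains."""
    beaten = sum(1 for g in peer_gains if g < student_gain)
    return round(100 * beaten / len(peer_gains))

peers = [2, 5, 5, 8, 10, 12, 15, 18, 20, 25]  # hypothetical peer gains
print(toy_growth_percentile(16, peers))  # 70: improved more than 70% of peers
```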

Growth and MCAS SGP

• The median SGP must be used when a teacher has 20 or more students (altogether) in a content area.
• Median SGPs for 8-19 students have validity and may be used if the district determines this is appropriate.
• More than one educator (a classroom teacher, a support teacher, a curriculum supervisor, and a supervising administrator) may be considered responsible for a content-area SGP; different students may be in their respective rosters.

Galileo (software)

Galileo computes growth in the same way as the MCAS (subgroup comparisons).

[Figure 1: theoretical illustration of Low, Moderate, and High growth bands.]

Beginning to Consider a Plan: Core Areas, MCAS SGP, Direct and Indirect Measures

"Sample Plan" for Core Areas Only (CA = common assessment; two entries = two DDMs)

• ELA: K: CA; grades 1-3: CA/CA; grades 4-8: MCAS SGP/CA; grades 9-12: CA/CA
• Math: K: CA; grades 1-3: CA/CA; grades 4-8: MCAS SGP/CA; grades 9-12: CA/CA
• Science: CA/CA at each of seven grade levels or courses
• Social Studies: CA/CA at each of seven grade levels or courses

"Sample Plan" for Other Areas

• Singletons, Art, Music, Technology, PE, by grade or grade spans (grades 2-12): CA/CA

Special Education (specialists, co-teachers, substantially separate):

• Inclusion co-teachers can "share" scores with general ed teachers, or
• If their students' goals are substantially different, the assessments can be modified or can focus on the goal of inclusion.

Specialists:

• Measure goals; can develop a K-12 rubric.

Indirect Measures (IM): Central Office, psychologists

• Adaptation of the SMART goal process
• Measure a goal that has an impact on student growth: attendance, college applications, technology growth

Administrators (principals, assistant principals, superintendent):

• MCAS growth (either ELA or math) PLUS an IM

Table Talk

Initial Planning

Core Subjects

Specialists, PE, Art, Music, Technology

SPED

Singletons (HS mainly)

Indirect Specialists, Central Office

Principals, Assistant Principals,

Superintendent, Curriculum Directors,

Department Heads

Sample Cut Score Determination (for 100 students)

Pre-test:   20, 25, 30, 35, 35, 40, 40, 50, 50, 50
Post-test:  35, 30, 50, 60, 60, 70, 65, 75, 80, 85
Difference: 15,  5, 20, 25, 25, 30, 25, 25, 30, 35

Sorted low to high: 5, 15, 20, 25, 25, 25, 25, 30, 30, 35 (median = 25)

• LOW growth: the lowest ___% of gain scores (cut score is district determined)
• HIGH growth: the highest ___% of gain scores (e.g., the top 20%)
• A teacher's score for each DDM is the median gain of her class, compared against these cut scores.
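A sketch of the same bookkeeping (the top-20% figure is the slide's example; the 15% low cut is an assumption borrowed from the grade-level slide that follows):

```python
def gain_scores(pre, post):
    """Per-student gains from matched pre/post scores, sorted low to high."""
    return sorted(p2 - p1 for p1, p2 in zip(pre, post))

def cut_scores(gains, low_pct=0.15, high_pct=0.20):
    """LOW cut = gain sitting low_pct of the way up the sorted list;
    HIGH cut = gain at the start of the top high_pct."""
    g = sorted(gains)
    low_cut = g[int(len(g) * low_pct)]
    high_cut = g[int(len(g) * (1 - high_pct))]
    return low_cut, high_cut

pre = [20, 25, 30, 35, 35, 40, 40, 50, 50, 50]
post = [35, 30, 50, 60, 60, 70, 65, 75, 80, 85]
gains = gain_scores(pre, post)
print(gains)              # [5, 15, 20, 25, 25, 25, 25, 30, 30, 35]
print(cut_scores(gains))  # (15, 30)
```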

Measuring Growth Example: Fountas and Pinnell (P to Q; N to P; D to K)

Growth for each student is based on 10 months of growth (10 months = one year). Three second-grade students end the year at levels Q, P, and K:

• Above benchmark, P to Q: 7 months of growth = LOW GROWTH
• At benchmark, N-O-P: 10 months of growth = MODERATE GROWTH
• Below benchmark, D-E-F-G-H-I-J-K: 17 months of growth = HIGH GROWTH
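A sketch of the translation from months of growth to a rating (the 10-months-equals-a-year equivalence is the slide's; the exact Low/High boundaries are district determined, so the ones below are placeholders):

```python
def fp_growth_rating(months_of_growth, low_below=10, high_above=12):
    """Classify Fountas & Pinnell growth, where 10 months equals one
    year's (Moderate) growth. Boundary values are hypothetical
    placeholders; each district sets its own."""
    if months_of_growth < low_below:
        return "LOW GROWTH"
    if months_of_growth > high_above:
        return "HIGH GROWTH"
    return "MODERATE GROWTH"

for m in (7, 10, 17):  # the slide's three students
    print(m, fp_growth_rating(m))  # 7 LOW, 10 MODERATE, 17 HIGH
```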

Teachers A-F: Class Gain Scores and Medians

[Table: each teacher's class gain scores, listed low to high, with the class median marked.]

• Teacher A: median 6.5 → LOW (at/below the 6.5 cut score)
• Teacher B: median 9 → Moderate (between 6.5 and 16)
• Teacher C: median 12 → Moderate
• Teacher D: median 10 → Moderate
• Teacher E: median 16 → Moderate
• Teacher F: median 12 → Moderate
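The same comparison in code (medians and cut scores are the slides'; the short gain lists are hypothetical stand-ins chosen to reproduce those medians):

```python
import statistics

def classify_class(gains, low_cut=6.5, high_cut=16):
    """Rate a class by the median of its gain scores against the
    grade-level cut scores (6.5 and 16 on these slides). The slide
    rates a median of 6.5 as LOW and a median of 16 as Moderate,
    so the low cut is inclusive and the high cut exclusive here."""
    m = statistics.median(gains)
    if m <= low_cut:
        return m, "LOW"
    if m > high_cut:
        return m, "HIGH"
    return m, "Moderate"

classes = {"A": [3, 6.5, 9], "B": [7, 9, 11], "C": [10, 12, 14],
           "D": [8, 10, 13], "E": [14, 16, 17], "F": [10, 12, 15]}
for name, gains in classes.items():
    print(name, classify_class(gains))
```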

103 Third Graders, All Classes

[Table: all 103 gain scores for the whole grade, sorted low to high.]

• Cut score, lowest 15%: 6.5 (at or below = LOW growth)
• Median for the whole Grade 3 DDM: 12.0
• Cut score, highest 15%: 16 (above = HIGH growth)
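Deriving those three numbers from the full list of gains (a sketch; `all_gains` stands in for the 103 scores, which are not fully recoverable from the slide):

```python
import statistics

def grade_level_cuts(all_gains, pct=0.15):
    """Return (low cut, median, high cut): the gains sitting 15% of
    the way up and 15% from the top of the sorted whole-grade list,
    bracketing the median. On the slide these come out to
    6.5, 12.0, and 16 for the 103 third graders."""
    g = sorted(all_gains)
    low_cut = g[int(len(g) * pct)]
    high_cut = g[int(len(g) * (1 - pct))]
    return low_cut, statistics.median(g), high_cut

# Usage: low, median, high = grade_level_cuts(all_gains)
```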

Holistic Rubric: Details

Show progress across a scale, continuum, descriptors.

1: No improvement in the level of detail. One is true:
• No new details across versions
• New details are added, but not included in future versions
• A few new details are added that are not relevant, accurate, or meaningful

2: Modest improvement in the level of detail. One is true:
• There are a few details included across all versions
• Many added details are included, but not consistently, or none are improved or elaborated upon
• There are many added details, but several are not relevant, accurate, or meaningful

3: Considerable improvement in the level of detail. All are true:
• There are many examples of added details across all versions
• At least one example of a detail that is improved or elaborated in future versions
• Details are consistently included in future versions
• The added details reflect relevant and meaningful additions

4: Outstanding improvement in the level of detail. All are true:
• On average, there are multiple details added across every version
• There are multiple examples of details that build and elaborate on previous versions
• The added details reflect the most relevant and meaningful additions

Example taken from Austin, a first grader at Anser Charter School. Learn more about this and other examples at http://elschools.org/student-work/butterfly-drafts

Developmental Writing Continuum (stages and descriptors)

Preconventional (Ages 3-5)
• Relies primarily on pictures to convey meaning.
• Begins to label and add "words" to pictures.
• Writes first name.

Emerging (Ages 4-6)
• Uses pictures and print to convey meaning.
• Writes words to describe or support pictures.
• Copies signs, labels, names, and words (environmental print).

Developing (Ages 5-7)
• Writes 1-2 sentences about a topic.
• Writes names and familiar words.

Beginning (Ages 6-8)
• Writes several sentences about a topic.
• Writes about observations and experiences.
• Writes short nonfiction pieces (simple facts about a topic) with guidance.

Expanding (Ages 7-9)
• Writes short fiction and poetry with guidance.
• Writes a variety of short nonfiction pieces (e.g., facts about a topic, letters, lists) with guidance.
• Writes with a central idea.
• Writes using complete sentences.

Bridging (Ages 8-10)
• Writes about feelings and opinions.
• Writes fiction with clear beginning, middle, and end.
• Writes poetry using carefully chosen language with guidance.
• Writes organized nonfiction pieces (e.g., reports, letters, and lists) with guidance.
• Begins to use paragraphs to organize ideas.
• Uses strong verbs, interesting language, and dialogue with guidance.

Fluent (Ages 9-11)
• Begins to write organized fiction and nonfiction (e.g., reports, letters, biographies, and autobiographies).
• Develops stories with plots that include problems and solutions with guidance.
• Creates characters in stories with guidance.
• Writes poetry using carefully chosen language.
• Begins to experiment with sentence length and complex sentence structure.
• Varies leads and endings with guidance.
• Uses description, details, and similes with guidance.
• Uses dialogue with guidance.

Proficient (Ages 10-13)
• Writes persuasively about ideas, feelings, and opinions.
• Creates plots with problems and solutions.
• Begins to develop the main characters and describe detailed settings.
• Begins to write organized and fluent nonfiction, including simple bibliographies.
• Writes cohesive paragraphs including reasons and examples with guidance.
• Uses transitional sentences to connect paragraphs.
• Varies sentence structure, leads, and endings.
• Begins to use descriptive language, details, and similes.
• Uses voice to evoke emotional response from readers.
• Begins to integrate information on a topic from a variety of sources.

Connecting (Ages 11-14)
• Writes organized, fluent, accurate, and in-depth nonfiction, including references with correct bibliographic format.
• Writes cohesive, fluent, and effective poetry and fiction.
• Creates plots with a climax.
• Creates detailed, believable settings and characters in stories.
• Uses a clear sequence of paragraphs with effective transitions.
• Begins to incorporate literary devices (e.g., imagery, metaphors, personification, and foreshadowing).
• Writes cohesive paragraphs including supportive reasons and examples.
• Uses descriptive language, details, similes, and imagery to enhance ideas.
• Begins to use dialogue to enhance character development.
• Begins to develop personal voice and style of writing.

Independent
• Writes in a variety of genres and forms for different audiences and purposes independently.
• Writes organized, fluent, and detailed nonfiction independently, including bibliographies with correct format.
• Develops plots, characters, setting, and mood (literary elements) effectively.
• Weaves dialogue effectively into stories.
• Incorporates personal voice in writing with increasing frequency.

Criterion Referenced Rubric and Raw Scores or % of 100

A 4-point rubric can be translated into points: 4(25) = 100; 4(22) = 88; 4(18) = 72; 4(15) = 60.

Example: 25 + 18 + 22 + 15 = 80%

AP Rubric of Rubrics: Prose Analysis

(9 levels give students room to improve; holistic)

9-8: Answers all parts of the question completely, using specific evidence from the work and showing how that evidence is relevant to the point being made. Fashions a convincing thesis and guides the reader through the intricacies of the argument with sophisticated transitions. Demonstrates clear understanding of the work and recognizes complexities of attitude/tone. Demonstrates stylistic maturity by an effective command of sentence structure, diction, and organization. Need not be without flaws, but must reveal an ability to choose from and control a wide range of the elements of effective writing.

7-6: Also accurately answers all parts of the question, but does so less fully or effectively than essays in the top range. Fashions a sound thesis. Discussion will be less thorough and less specific, not so responsive to the rich suggestiveness of the passage or precise in discussing its impact. Well written in an appropriate style, but with less maturity than the top papers. Some lapses in diction or syntax may appear, but demonstrates sufficient control over the elements of composition to present the writer's ideas clearly. Confirms the writer's ability to read literary texts with comprehension and to write with organization and control.

5: Discusses the question, but may be simplistic or imprecise. Constructs a reasonable if reductive thesis. May attempt to discuss techniques or evidence in the passage, but may be overly general or vague. Adequately written, but may demonstrate inconsistent control over the elements of composition. Organization is attempted, but may not be fully realized or particularly effective.

4-3: Attempts to answer the question, but does so either inaccurately or without the support of specific evidence. May confuse the attitude/tone of the passage, may overlook tone shift(s), or may otherwise misrepresent the passage. Discussion of illustrations/techniques/necessary parts of the prompt may be omitted or inaccurate. Writing may convey the writer's ideas, but reveals weak control over diction, syntax, or organization. May contain many spelling or grammatical errors. Essays scored three are even less able and may not refer to illustrations/techniques at all.

2-1: Fails to respond adequately to the question. May misunderstand the question or the passage. May fail to discuss techniques/evidence used, or otherwise fail to respond adequately to the question. Unacceptably brief or poorly written on several counts. Writing reveals consistent weakness in grammar or other basic elements of composition. Although the writer may have made some attempt to answer the prompt, the views presented have little clarity or coherence, with only slight, if any, evidence in their support; significant problems with reading comprehension seem evident. Essays that are especially inexact, vacuous, and/or mechanically unsound should be scored 1.

0: A blank paper, or one that makes no attempt to deal with the question, receives no credit.

Rubric from Sharon Kingston

Rubric from Sharon Kingston

Rubric "Cut Scores"

• Create a "growth" rubric and describe a typical year's growth
• Translate into a percent of 100
• www.roobrix.com
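The translation into a percent of 100 is proportion arithmetic of the kind tools like roobrix.com automate; a minimal sketch:

```python
def rubric_to_percent(earned, possible):
    """Convert rubric points earned into a percent of 100."""
    return round(100 * earned / possible, 1)

# A 4-point holistic rubric: a 3 out of 4 becomes 75%.
print(rubric_to_percent(3, 4))           # 75.0
# Multi-criteria rubrics: sum the criterion scores first.
print(rubric_to_percent(3 + 4 + 2, 12))  # 75.0
```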

Who Scores Student Work?

• Districts will need to determine fair, efficient, and accurate methods for scoring students' work.
• DDMs can be scored by the educators themselves, groups of teachers within the district, external raters, or commercial vendors.
• For districts concerned about the quality of scoring when educators score their own students' work, processes such as randomly re-scoring a selection of student work to ensure proper calibration, or using teams of educators to score together, can improve the quality of the results.
• When an educator plays a large role in scoring his/her own students' work, a supervisor may also choose to factor the scoring process into the determination of a Student Impact Rating.

Who Assesses the DDMs?

• Teams of teachers (e.g., all 5th grade teachers)
  – Team members rate each other's students' responses
  – Multiple raters score each response
• Individual teachers
  – Random auditing (rechecking) by principal, coordinator, or department head; this is a district decision

Table Talk

Scoring

Grade Levels

Departments

Who Scores?

Who Stores?

DDM Quality Criteria

Collaborative Process

Validity

Reliability

Rigor

Aligned to standards

Assessment Quality Requirements and Definitions from DESE

Alignment to Frameworks and District Curriculum content and/or district standards

Rigor

Comparability across all classes and in all disciplines

“Substantial” assessment of the course; core content and skills

Modifications are allowed as with MCAS

Rigor Measures (District Determined)

• Bloom (next slide)
• Hess (next slide)
• DESE's "Curriculum Embedded Performance Assessment" rubric in the Model Curriculum Units (slide 36)

Hess Cognitive Rigor Matrix (Bloom's taxonomy × Webb's Depth of Knowledge): rigor = depth + thinking. On the original slide, "MCAS" marks the Level 1-2 cells and "PARCC" the Level 3-4 cells.

Remember
• Level 1 (Recall & Reproduction): recall, locate basic facts, details, events
• Levels 2-4: not appropriate at this level

Understand
• Level 1: select appropriate words to use when intended meaning is clearly evident
• Level 2 (Skills & Concepts): specify or explain relationships; summarize; identify central idea
• Level 3 (Strategic Thinking/Reasoning): explain, generalize, or connect ideas using supporting evidence (quote, example…)
• Level 4 (Extended Thinking): explain how concepts or ideas specifically relate to other content domains or concepts

Apply
• Level 1: use language structure (pre/suffix) or word relationships (synonym/antonym) to determine meaning
• Level 2: use context to identify meaning of word; obtain and interpret information using text features
• Level 3: use concepts to solve non-routine problems
• Level 4: devise an approach among many alternatives to research a novel problem

Analyze
• Level 1: identify whether information is contained in a graph, table, etc.
• Level 2: compare literary elements, terms, facts, events; analyze format, organization, and text structures
• Level 3: analyze or interpret author's craft (literary devices, viewpoint, or potential bias) to critique a text
• Level 4: analyze multiple sources; analyze complex/abstract themes

Evaluate
• Level 3: cite evidence and develop a logical argument for conjectures
• Level 4: evaluate relevancy, accuracy, and completeness of information

Create
• Level 1: brainstorm ideas about a topic
• Level 2: generate conjectures based on observations or prior knowledge
• Level 3: synthesize information within one source or text
• Level 4: synthesize information across multiple sources or texts

Post Test Only

Use the district's history to predict scores on the AP Calculus exam, for example. Expected AP score by previous grade's math score:

• A: Low Growth 3, Moderate Growth 4, High Growth 5
• B: Low Growth 2, Moderate Growth 3, High Growth 4
• C: Low Growth 1, Moderate Growth 2, High Growth 3
• D: Moderate Growth 1, High Growth 2
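As a lookup, this might be applied like so (the table values are the slide's; the function wrapper and its name are ours):

```python
# Expected AP Calculus score by previous grade's math score,
# from the district-history table above.
EXPECTED = {
    "A": {"Low": 3, "Moderate": 4, "High": 5},
    "B": {"Low": 2, "Moderate": 3, "High": 4},
    "C": {"Low": 1, "Moderate": 2, "High": 3},
    "D": {"Moderate": 1, "High": 2},  # no Low entry on the slide
}

def growth_from_ap(prev_math_grade, ap_score):
    """Map an actual AP score back to Low/Moderate/High growth,
    given the student's previous math grade."""
    row = EXPECTED[prev_math_grade]
    for rating in ("High", "Moderate", "Low"):
        if rating in row and ap_score >= row[rating]:
            return rating
    return "Low"

print(growth_from_ap("B", 4))  # High
print(growth_from_ap("A", 3))  # Low
```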

Identifying and Selecting DDMs

The DDM process must be collaborative. The district establishes a DDM Working Group that:

• Is co-chaired by the superintendent and the president of the local bargaining unit, or their designees
• Surveys the district for available assessments
• Recruits educators to identify assessments and make recommendations
• Identifies at least two measures for each educator
• Collects feedback on the quality of the DDMs (continuous improvement)
• Makes recommendations to the superintendent


GENERIC Rubric for CEPAs in Mass

Topic Development (1-6):
1. Little topic/idea development, organization, and/or details; little or no awareness of audience and/or task
2. Limited or weak topic/idea development, organization, and/or details; limited awareness of audience and/or task
3. Rudimentary topic/idea development and/or organization; basic supporting details; simplistic language
4. Moderate topic/idea development and organization; adequate, relevant details; some variety in language
5. Full topic/idea development; logical organization; strong details; appropriate use of language
6. Rich topic/idea development; careful and/or subtle organization; effective/rich use of language

Evidence and Content Accuracy (1-6):
1. Little or no evidence is included
2. Use of evidence and content knowledge is limited or weak, and/or content is inaccurate
3. Use of evidence and content is included but is basic and simplistic
4. Use of evidence and accurate content is relevant and adequate
5. Use of evidence and accurate content is logical and appropriate
6. A sophisticated selection and inclusion of evidence and accurate content

Use of Visuals/Media (1-6):
1. Visuals and/or media are missing or do not contribute to the quality of the submission
2. Visuals and/or media demonstrate a limited connection to the submission
3. Visuals and/or media are basically connected to the submission and contribute to its quality
4. Visuals and/or media are connected to the submission and contribute to its quality
5. Visuals and/or media contribute to the quality of the submission in a logical and appropriate way
6. Visuals and/or media are carefully and strategically selected to enhance the content of the submission and contribute to an outstanding submission

Standards for English Conventions (1-4):
1. Errors seriously interfere with communication; little control of sentence structure, grammar and usage, and mechanics
2. Errors interfere somewhat with communication, and/or there are too many errors relative to the length of the submission or complexity of sentence structure, grammar and usage, and mechanics
3. Errors do not interfere with communication, and/or there are few errors relative to length of submission or complexity of sentence structure, grammar and usage, and mechanics
4. Control of sentence structure, grammar and usage, and mechanics (length and complexity of submission provide opportunity for student to show control of standard English conventions)

Table Talk

Quality of Assessments

Quality Rubrics

Validity

Rigor

Inter-Rater Reliability

Specific Examples

Examples of DDMs

The GOOD

The BAD and the Not-so-good

Quality Assessments, Developed Locally, Adapted, or Adopted

Dr. Deborah Brady dbrady3702@msn.com

Quality Control Considerations

• Substantive
• Aligned with standards of the Frameworks or Vocational standards, and/or local standards
• Rigorous
• Consistent in substance, alignment, and rigor
• Consistent with the district's values, initiatives, expectations
• Measures growth (to be contrasted with achievement) and shifts the focus of teaching

As a Result of the GOOD

In districts, schools, and departments:

• Educators have collaborated thoughtfully
• Initiatives are one step more unified
• The district, school, department, or specific course moves forward (a baby step or a giant step) and gains a collaborative understanding of the purpose of a course, discipline, or year's work

Some GOOD Common Exam Examples

• A valued process (PORTFOLIO): a 9-12 ELA portfolio measured by a locally developed rubric that assesses progress throughout the four years of high school
• K-12 writing or writing to text: a district required that at least one DDM be "writing to text" based on CCSS-appropriate text complexity
• Focus on data that is important: a HS science department assessment of lab report growth for each course (focus on conclusions)
• "New CCSS" concern: a HS science department assessment of data, diagram, or video analysis

More

• CCSS math practices: a HS math department's use of PARCC examples that require writing, asking students to "justify your answer"
• SS focus on DBQs and/or PARCC-like writing to text: a social studies department created a PARCC-style exam using primary sources; another social studies department used "mini-DBQs" in freshman and sophomore courses
• Music: writing about a concert
• Common criteria rubrics for grade spans: Art (color, design, mastery of medium), Speech (developmental levels)

More

• Measure the true goal of the course: autistic, behavioral, or alternative programs and classrooms measure social-emotional development of independence (a whole collaborative: each educator is measuring)
• SPED "Directed Study" model: study skills are now explicitly recorded by the week for each student, and by quarter on a manila folder (note-taking skills, text comprehension, reading, writing, preparing for an exam, time management), differentiated by student
• A vocational school's use of Jobs USA assessments for one DDM, plus the local safety protocols for each shop
• High school SST team example (frequent absentees)
• Child study team example (universal process)
• School psychologists (procedure for referral was not being followed)
• School psychologists (subgroup of students studied)
• High school guidance example (PSAT, SAT, college applications)
• IEP goals can be used as long as they measure growth (academic or social-emotional)

District Capacity and Time to Collaborate

• Data teams
• PLCs
• Leaders/coaches to provide context and meaning to student work
• Looking-at-student-work protocols
• Diagnosing student needs and developing action plans
• Without time and capacity, it's all just Low, Moderate, and High.

Low, Moderate, and High in Human Terms

A story of two teachers

Effective Teaching

• All levels of learners
• Curriculum
• Goals/agenda
• Notebook
• Group work
• Routines

We ended our first meeting here.

The second meeting will include (drafting components based on feedback):

• Specific examples, plus time to look at assessment sites: World Language, HSS, ELA, Math, Science, PE, Art, Music, Technology
• Mass Model Curriculum Units (MCUs)
• Mock assessment of assessments for quality, rigor, standards, validity
  – Substantial and standards
  – Rigor and Bloom and/or Hess
  – Validity: does this assess what it says it will?
• Mock calibration of assessments
• Mock validation using results

Math Practices: Communicating Mathematical Ideas

Clearly constructs and communicates a complete response based on:

• a response to a given equation or system of equations
• a chain of reasoning to justify or refute algebraic, function, or number system propositions or conjectures
• a response based on data

How can you assess these standards?

Demonstrating Growth (Galileo Example)

Billy Bob's work is shown below. He has made a mistake. In the space provided, solve the problem on your own; then find Billy Bob's mistake, circle it, and explain how to fix it.

Your work | Billy Bob's work:

  ½x − 10 = −2.5
     +10     +10
  ½x + 0 = +12.5   ← the mistake: −2.5 + 10 = 7.5, not 12.5
  (2/1)(½)x = 12.5(2)
  x = 25   (solving correctly: ½x = 7.5, so x = 15)

"Find the mistake" provides students with a model, requires understanding, and requires writing in math.

Explain the changes that should be made in Billy Bob's work.

A resource for DDMs. A small step? A giant step? The district decides.

"Which of the three conjectures are true? Justify your answer."

Essay Prompt from a Textbook

Read a primary source about Muhammad based on his wife's memories of her husband.

Essay: Identify and describe Muhammad's most admirable quality based on this excerpt. Select someone from your life who has this quality. Identify who they are and describe how they demonstrate this trait.

What's wrong with this prompt? Is it a text-based question? (See PARCConline.org: where are the CLAIMS and EVIDENCE?)

Science Open Response from a Textbook

Again, from a textbook: is this acceptable? Is this recall?

A scoring guide from a textbook for building a Lou Vee Air Car: is it good enough to ensure inter-rater reliability?

Scoring Guide from the Textbook

• Lou Vee Air Car built to specs: 50 points
• Propeller spins freely: 60 points
• Distance car travels: 1 m = 70, 2 m = 80, 3 m = 90, 4 m = 100 points
• Best distance: 10, 8, or 5 points
• Best car: 10, 8, or 5 points
• Best all-time distance, all classes: +5
• 235 points total

Technology/Media Rubric: a multi-criteria rubric for technology. What is good, bad, problematical? (Don't try to read it!)

PE Rubric in Progress: Grade 2, overhand throw and catching. Look good?

Music (teacher and student instructions): are numbers good or a problem?

The UGLY

• Comply with regulations
• Bring about no change or understanding

The Best?

• Build on what is in the district, school, or department
• Take a small step or a larger step in cognitive complexity
• Use the results to learn about students' needs and how to address those needs
• Use time to look at student work and to collaboratively plan to improve

Table Talk

District Values

Consistent Conditions for Assessments

Looking at the Results: A Preview (A Psychometrician's View)

How do we determine cut scores? Growth scores? Both are new areas for learning.

• Growth is not achievement; Moderate = a year's growth.
• What if a student is below benchmark?
• Again, setting these specific parameters is district determined: "common sense."
• Psychometricians are still figuring out what a good/fair assessment is.

Objectivity versus Subjectivity: Calibration

• Human judgment and assessment: what is objective about a multiple choice test?
• Calibrating standards in using rubrics requires a common understanding of descriptors: what does "insightful," "in-depth," or "general" look like?
• Use exemplars to keep people calibrated.
• Assess collaboratively with a uniform protocol.
• Guard against assessment drift: spot checking, recording, assessing blind.
• Develop EXEMPLARS (simple protocol).
• For the F&P comprehension "conversation," a grade-level team begins calibration with a below-benchmark, an at-benchmark, and an above-benchmark sample and discusses differences; then it samples recorded F&P sessions.

Protocols for Administration of Assessments

• Directions to teachers need to define rules for giving support, dictionary use, etc. What can be done? What cannot? ("Are you sure you are finished?") How much time? Accommodations and modifications?
• Feedback from teachers indicated some confusion about procedures.
• Update instructions (common format).

Qualitative Methods of Determining an Assessment's VALIDITY

• Looking at the "body of the work": validating an assessment based upon the students' work.
• Floor and ceiling effects.
• If you piled the gain scores (not achievement) into High, Moderate, and Low gain: is there a mix of at-risk, average, and high achievers throughout each pile, or is one group mainly represented?

Low, Moderate, High Growth Validation

Did your assessment accurately pinpoint differences in growth?

1. Look at the LOW pile. If you think about their work during this unit, were they struggling?
2. Look at the MODERATE pile. Are these the average learners who learn about what you'd expect of your school's students in your class?
3. Look at the HIGH pile. Did you see them learning more than most of the others in your class?

Based on your answers to 1, 2, and 3:

• Do you need to add questions (for the very high or the very low)?
• Do you need to modify any questions (because everyone missed them, or because everyone got them correct)?

Look at Specific Students' Work

This is the psychometric process called "body of the work" validation:

• Tracey is a student who was rated as having high growth; James had moderate growth; Linda had low growth.
• Investigate each student's work: effort, the teacher's perception of growth, other evidence of growth.
• Do the scores assure you that the assessment is assessing what it says it is?

Objectivity versus Subjectivity: Multiple Choice Questions

• Human judgment and assessment: what is objective about a multiple choice test? What is subjective about one?
• Make sure question complexity did not cause a student to make a mistake.
• Make sure the choices are all about the same length, in similar phrases, and clearly different.

Rubrics and Inter-Rater Reliability: getting words to mean the same to all raters

• Resources: 4 = effective use; 3 = adequate use; 2 = limited use; 1 = inadequate use
• Development: 4 = highly focused; 3 = focused response; 2 = inconsistent response; 1 = lacks focus (ideas may be repetitive or rambling)
• Organization: 4 = related ideas support the writer's purpose; 3 = has an organizational structure; 1 = no evidence of purposeful organization
• Language conventions: 4 = well-developed command; 3 = command (errors don't interfere); 2 = limited or inconsistent command; 1 = weak command

Protocol for Developing Inter-Rater Reliability

Before scoring a whole set of papers, develop inter-rater reliability:

• Bring high, average, and low samples (1 or 2 each)
• Use your rubric or scoring guide to assess these samples
• Discuss differences until a clear definition is established
• Use these first papers as your exemplars
• When there's a question, select one person as the second reader
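Once two raters have scored the same calibration papers, agreement can be quantified. A minimal sketch using simple percent agreement (Cohen's kappa, which corrects for chance agreement, is the usual next step; the scores below are hypothetical):

```python
def percent_agreement(rater1, rater2):
    """Share of papers two raters scored identically during a
    calibration round; a first, rough check of inter-rater
    reliability before scoring the full set."""
    matches = sum(a == b for a, b in zip(rater1, rater2))
    return round(100 * matches / len(rater1), 1)

# Hypothetical rubric scores for six shared calibration papers.
print(percent_agreement([4, 3, 2, 4, 1, 3],
                        [4, 3, 3, 4, 1, 3]))  # 83.3
```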

Annotated Exemplar: How does the author create the mood in the poem?

• Answer and explanation in the student's words: "The speaker's mood is greatly influenced by the weather."
• Specific substantiation from the text: "The author uses dismal words such as 'ghostly,' 'dark,' 'gloom,' and 'tortured.'"

"Growth Rubrics" May Need to Be Developed

Pre-conventional Writing (Ages 3-5)
• Relies primarily on pictures to convey meaning.
• Begins to label and add "words" to pictures.
• Writes first name.
• Demonstrates awareness that print conveys meaning.
• Makes marks other than drawing on paper (scribbles).
• Writes random recognizable letters to represent words.
• Tells about own pictures and writing.

Emerging (Ages 4-6)
• Uses pictures and print to convey meaning.
• Writes words to describe or support pictures.
• Copies signs, labels, names, and words (environmental print).
• Demonstrates understanding of letter/sound relationship.
• Prints with upper case letters.
• Matches letters to sounds.
• Uses beginning consonants to make words.
• Uses beginning and ending consonants to make words.
• Pretends to read own writing.
• Sees self as writer.
• Takes risks with writing.

Developing (Ages 5-7)
• Writes 1-2 sentences about a topic.
• Writes names and familiar words.
• Generates own ideas for writing.
• Writes from top to bottom, left to right, and front to back.
• Intermixes upper and lower case letters.
• Experiments with capitals.
• Experiments with punctuation.
• Begins to use spacing between words.
• Uses growing awareness of sound segments (e.g., phonemes, syllables, rhymes) to write words.
• Spells words on the basis of sounds without regard for conventional spelling patterns.
• Uses beginning, middle, and ending sounds to make words.
• Begins to read own writing.

Protocols for Administration of Assessments

• Directions to teachers need to define rules for giving support, dictionary use, etc. What can be done? What cannot? ("Are you sure you are finished?") How much time? Accommodations and modifications?
• Feedback from teachers indicated some confusion about procedures.
• Update instructions (common format).
• Next workshop: protocols; defining Low, Moderate, High; using Excel.

DESE Quote

It is expected that districts are building their knowledge and experience with DDMs. DDMs will undergo both small and large modifications from year to year. Changing or modifying scoring procedures is part of the continuous improvement of DDMs over time.

We are all learners in this initiative.

Next Class

• Bring sample assessment results
• Bring assessments to discuss quality
• Excel
• Setting up protocols for each assessment
• Other? Please suggest!

On-line access to materials at: https://wikispaces.com/join/HQ4FDRM

Code HQ4FDRM

Exit Slip

• What do you want more of in the next class?
• What do you want less of in the next class?
• What new information or activities would you like to see in the next class?

dbrady3702@msn.com

Please email with any questions.
