Northborough-Southborough DDM Development
October 9, 2014
Dr. Deborah Brady

DDM 1 - MCAS (SGP)
For teachers who receive an SGP from MCAS (grades 4-8, ELA and math only):
• The district is only required to use the median Student Growth Percentile (SGP) from one MCAS area per teacher.
• In the first year, the K-5 DDM will focus only on MCAS ELA. In grades 6-12, the MCAS focus may be either math or ELA.
• The DDM rating is based on the SGP (student growth), not on scaled scores (student achievement).

DDM 1 - Common Assessment
For teachers who do not receive an SGP from MCAS: teachers will develop grade-level/course common assessments using a pre- and post-assessment model.

DDM 2 - Common Assessment
For all teachers: teachers will develop grade-level/course common assessments using a pre- and post-assessment model.

Goal: 2014-2015 (DDMs must be negotiated with our Associations)
Content: Student learning DDMs in the core content areas (math, English, science, and social studies).
Year 1:
• Identify the first two (of four) unique DDM data elements
• Align DDMs with the Massachusetts Curriculum Frameworks
• Identify/develop DDMs by common grades (K-12) and content
• Create rubrics
• Collect the first year of data
Year 2:
• Identify the second two (of four) unique DDMs, or reuse the 2014-2015 DDMs (same assessment, different students)
Note: Consumer science, applied arts, health and physical education, business education, world language, and SISPs received a one-year waiver. Planning: identify/develop DDMs for 2015-2016 implementation; collect the first year of data in 2015-2016.

Core DDMs (CA = common assessment)
Grade | ELA | Math | Science | Social Studies
12 | CA/CA | CA/CA | CA/CA | CA/CA
11 | CA/CA | CA/CA | CA/CA | CA/CA
10 | CA/CA | CA/CA | CA/CA | CA/CA
9 | CA/CA | CA/CA | CA/CA | CA/CA
8 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
7 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
6 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
5 | MCAS SGP/CA | MCAS SGP/CA | |
4 | MCAS SGP/CA | MCAS SGP/CA | |
3 | CA/CA | CA/CA | |
2 | CA/CA | CA/CA | |
1 | CA/CA | CA/CA | |

Quality Assessments
• Substantive
• Aligned with the standards of the Frameworks, vocational standards, and/or local standards
• Rigorous
• Consistent in substance, alignment, and rigor
• Consistent with the district's values, initiatives, and expectations
• Measures growth (as contrasted with achievement) and shifts the focus of teaching

Scoring Student Work
Districts will need to determine fair, efficient, and accurate methods for scoring students' work. DDMs can be scored by the educators themselves, groups of teachers within the district, external raters, or commercial vendors. For districts concerned about the quality of scoring when educators score their own students' work, processes such as randomly re-scoring a selection of student work to ensure proper calibration, or using teams of educators to score together, can improve the quality of the results. When an educator plays a large role in scoring his/her own students' work, a supervisor may also choose to consider the scoring process when making a determination of a Student Impact Rating.
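As a concrete illustration of the random re-scoring idea above, here is a minimal Python sketch. It is not a DESE-prescribed procedure; the 20% re-scoring rate, the helper names (select_for_rescoring, percent_agreement), and the sample scores are assumptions chosen for the example.

```python
import random

def select_for_rescoring(student_ids, fraction=0.2, seed=None):
    """Randomly pick a share of the class set for a second scorer."""
    rng = random.Random(seed)
    k = max(1, round(len(student_ids) * fraction))
    return rng.sample(student_ids, k)

def percent_agreement(first_scores, second_scores):
    """Share of double-scored papers where both raters gave the same rubric level."""
    matches = sum(1 for a, b in zip(first_scores, second_scores) if a == b)
    return matches / len(first_scores)

# Example: flag 20% of 25 papers for a second reader.
papers = [f"student_{n}" for n in range(1, 26)]
to_rescore = select_for_rescoring(papers, fraction=0.2, seed=1)

# The two raters' rubric levels on the re-scored papers are then compared;
# a low agreement rate signals that calibration work is needed.
print(percent_agreement([4, 3, 2, 3, 4], [4, 3, 3, 3, 4]))  # -> 0.8
```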
Some Possible Common Exam Examples
• A valued process, the portfolio: a 9-12 ELA portfolio measured by a locally developed rubric that assesses progress throughout the four years of high school
• K-12 writing or writing to text: a district that required at least one DDM to be "writing to text," based on CCSS-appropriate text complexity
• Focus on data that is important: a HS science department's assessment of lab-report growth for each course (with a focus on conclusions)
• "New CCSS" concern: a HS science department's assessment of data analysis, or of diagram or video analysis
• More CCSS math practices: a HS math department's use of PARCC examples that require writing, asking students to "justify your answer"
• SS focus on DBQs and/or PARCC-like writing to text: one social studies department created a PARCC-style exam built on primary sources; another social studies department used "mini-DBQs" in freshman and sophomore courses
• Music: writing about a concert
• Common-criteria rubrics for grade spans: art (color, design, mastery of medium), speech (developmental levels)
• Measure the true goal of the course: autistic, behavioral, or alternative programs and classrooms measure social-emotional development of independence (the whole collaborative, with each educator measuring)
• SPED "directed study" model: study skills are now explicitly recorded by the week for each student, and by quarter on a manila folder: note-taking skills, text comprehension, reading, writing, preparing for an exam, and time management, differentiated by student
• A vocational school's use of Jobs USA assessments for one DDM, and the local safety protocols for each shop

Assessing Math Practices: Communicating Mathematical Ideas
The student clearly constructs and communicates a complete response based on:
• a response to a given equation or system of equations
• a chain of reasoning to justify or refute algebraic, function, or number-system propositions or conjectures
• a response based on data
How can you assess these standards?

Demonstrating Growth
Billy Bob's work is shown below. He has made a mistake. In the space to the right, solve the problem on your own. Then find Billy Bob's mistake, circle it, and explain how to fix it.

Billy Bob's work:
  ½X − 10 = −2.5
     +10     +10
  ½X + 0 = 12.5     (the mistake: −2.5 + 10 = 7.5, not 12.5)
  (2/1)(½)X = 12.5(2)
  X = 25             (the correct answer is X = 15)

Explain the changes that should be made in Billy Bob's work. Finding the mistake provides students with a model; it requires understanding, and it requires writing in math.

A resource for DDMs. A small step? A giant step? The district decides.
"Which of the three conjectures are true? Justify your answer." Determine whether each of Michelle's three conjectures is true; justify each answer.

Rubrics and Grading: Are Numbers Good, or a Problem?
Objectivity versus subjectivity: what about a multiple-choice test is truly objective, and what depends on human judgment and assessment?
Calibrating a common understanding of rubric descriptors: what does "insightful," "in-depth," or "general" look like? Use standards and exemplars to keep people calibrated, and assess collaboratively with a uniform protocol.

Consistency in Directions for Administering Assessments
Directions to teachers need to define rules for giving support, dictionary use, etc. What can be done? What cannot? ("Are you sure you are finished?") How much time is allowed? What accommodations and modifications apply?
Qualitative Methods of Determining an Assessment's VALIDITY
Looking at the "body of the work": validating an assessment based upon the students' work, and checking for floor and ceiling effects. If you piled the gain scores (not achievement scores) into high, moderate, and low gain, is there a mix of at-risk, average, and high achievers throughout each pile, or is one group mainly represented? (A computational sketch of this check follows the Annotated Exemplar below.)

Low, Moderate, High Growth Validation
Did your assessment accurately pinpoint differences in growth?
1. Look at the LOW pile. If you think about their work during this unit, were these students struggling?
2. Look at the MODERATE pile. Are these the average learners, who learn about what you'd expect of your school's students in your class?
3. Look at the HIGH pile. Did you see these students learning more than most of the others in your class?
Based on your answers to 1, 2, and 3:
• Do you need to add questions (for the very high or the very low)?
• Do you need to modify any questions (because everyone missed them, or because everyone got them correct)?

Body of the Work Validation
In this psychometric process, you look at specific students' work. Tracey is a student who was rated as having high growth; James had moderate growth; Linda had low growth. Investigate each student's work: effort, the teacher's perception of growth, and other evidence of growth. Do the scores assure you that the assessment is assessing what it says it is?

Objectivity versus Subjectivity: Multiple-Choice Questions
What human judgment and assessment is objective about a multiple-choice test? What is subjective about a multiple-choice test? Make sure that question complexity did not cause a student to make a mistake, and make sure the multiple-choice options are all about the same length, similarly phrased, and clearly different.

Rubrics and Inter-Rater Reliability
Getting words to mean the same thing to all raters:

Category | 4 | 3 | 2 | 1
Resources | Effective use | Adequate use | Limited use | Inadequate use
Development | Highly focused | Focused response | Inconsistent response | Lacks focus
Organization | Related ideas support the writer's purpose | Has an organizational structure | Ideas may be repetitive or rambling | No evidence of purposeful organization
Language conventions | Well-developed command | Command; errors don't interfere | Limited or inconsistent command | Weak command

Protocol for Developing Inter-Rater Reliability
Before scoring a whole set of papers, develop inter-rater reliability:
• Bring high, average, and low samples, 1 or 2 of each (the HML protocol)
• Use your rubric or scoring guide to assess these samples
• Discuss differences until a clear definition is established
• Use these first papers as your exemplars
• When there is a question, select one person as the second reader

Annotated Exemplar
How does the author create the mood in the poem? The answer and explanation are in the student's words, with specific substantiation from the text: "The speaker's mood is greatly influenced by the weather. The author uses dismal words such as 'ghostly,' 'dark,' 'gloom,' and 'tortured.'"
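The pile check described under "Qualitative Methods" above can be run mechanically once gain scores and prior-achievement labels are in hand. The following Python sketch is one hypothetical way to do it; the even three-way split, the pile_mix helper, and the labels are illustrative assumptions, not part of the protocol itself.

```python
from collections import Counter

def pile_mix(students):
    """students: (gain_score, prior_level) pairs, where prior_level is a
    label such as "at risk", "average", or "high achiever"."""
    ordered = sorted(students, key=lambda s: s[0])
    third = len(ordered) // 3
    piles = {
        "low gain": ordered[:third],
        "moderate gain": ordered[third:len(ordered) - third],
        "high gain": ordered[len(ordered) - third:],
    }
    # A healthy DDM shows a mix of prior-achievement levels in every pile;
    # one level dominating the low or high pile suggests a floor or
    # ceiling effect, so revisit the easiest and hardest questions.
    return {name: Counter(level for _, level in pile)
            for name, pile in piles.items()}

# Example with hypothetical data:
sample = [(5, "high achiever"), (8, "average"), (12, "at risk"),
          (15, "average"), (20, "high achiever"), (25, "at risk")]
print(pile_mix(sample))
```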
"Growth Rubrics" May Need to Be Developed
Pre-conventional writing, ages 3-5:
• Relies primarily on pictures to convey meaning
• Begins to label and add "words" to pictures
• Writes first name
• Demonstrates awareness that print conveys meaning
• Makes marks other than drawing on paper (scribbles)
• Writes random recognizable letters to represent words
• Tells about own pictures and writing

Emerging, ages 4-6:
• Uses pictures and print to convey meaning
• Writes words to describe or support pictures
• Copies signs, labels, names, and words (environmental print)
• Demonstrates understanding of letter/sound relationships
• Prints with upper-case letters
• Matches letters to sounds
• Uses beginning consonants to make words
• Uses beginning and ending consonants to make words
• Pretends to read own writing
• Sees self as a writer
• Takes risks with writing

Developing, ages 5-7:
• Writes 1-2 sentences about a topic
• Writes names and familiar words
• Generates own ideas for writing
• Writes from top to bottom, left to right, and front to back
• Intermixes upper- and lower-case letters
• Experiments with capitals
• Experiments with punctuation
• Begins to use spacing between words
• Uses growing awareness of sound segments (e.g., phonemes, syllables, rhymes) to write words
• Spells words on the basis of sounds, without regard for conventional spelling patterns
• Uses beginning, middle, and ending sounds to make words
• Begins to read own writing

Protocols to Use with Implemented Assessments
• Floor and ceiling effects
• Validating the quality of multiple-choice questions
• Inter-rater reliability with rubrics and scoring guides
• The low-medium-high looking-at-student-work protocol (calibration, developing exemplars, developing an action plan)

FAQ from DESE
Do the same numbers of students have to be identified as having high, moderate, and low growth? There is no set percentage of students who need to be included in each category. Districts should set parameters for high, moderate, and low growth using a variety of approaches.
How do I know what low growth looks like? Districts should be guided by the professional judgment of educators. The guiding definition of low growth is that it is less than a year's worth of growth relative to academic peers, while high growth is more than a year's worth of growth. If the course meets for less than a year, districts should make inferences about a year's worth of growth based on the growth expected during the time of the course.
Can I change scoring decisions when we use a DDM in the second year? It is expected that districts are building their knowledge and experience with DDMs. DDMs will undergo both small and large modifications from year to year. Changing or modifying scoring procedures is part of the continuous improvement of DDMs over time.
Will parameters of growth be comparable from one district to another? Different assessments serve different purposes. While statewide SGPs will provide a consistent metric across the Commonwealth and allow for district-to-district comparisons, DDMs are selected locally and are not designed to support such comparisons.

Calculating Scores
What you need to understand as you are creating assessments: the SGP reflects growth relative to academic peers, not the change in scaled score. For example:

Scaled score (year 1 → year 2) | SGP
288 → 244 | 25
230 → 230 | 35
214 → 225 | 92

Median Student Growth Percentile
Last name | SGP
Lennon | 6
McCartney | 12
Starr | 21
Harrison | 32
Jagger | 34
Richards | 47
Crosby | 55
Stills | 61
Nash | 63
Young | 74
Joplin | 81
Hendrix | 88
Jones | 95

Imagine that the students listed above are all the students in your 6th-grade class, sorted from lowest to highest SGP. The point where 50% of students have a higher SGP and 50% have a lower SGP is the median. With 13 students, that is the 7th value, Crosby's 55, so the median SGP for this 6th-grade class is 55.
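The median for this roster can be checked directly; here is a minimal Python sketch using the sample names and SGPs from the table above.

```python
from statistics import median

# Sample roster from the slide, sorted from lowest to highest SGP.
sgps = {"Lennon": 6, "McCartney": 12, "Starr": 21, "Harrison": 32,
        "Jagger": 34, "Richards": 47, "Crosby": 55, "Stills": 61,
        "Nash": 63, "Young": 74, "Joplin": 81, "Hendrix": 88, "Jones": 95}

# With 13 students, the median is the 7th SGP in sorted order.
print(median(sgps.values()))  # -> 55 (Crosby)
```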
Sample Cut Score Determination (for local assessments)
The teacher's score is based on the MEDIAN score of her class for each DDM. Compute each student's difference (post-test minus pre-test), then sort the differences from low to high:

Pre-test | Post-test | Difference
20 | 35 | 15
25 | 30 | 5
30 | 50 | 20
35 | 60 | 25
35 | 60 | 25
40 | 70 | 30
50 | 75 | 25
50 | 80 | 30
50 | 85 | 35

Sorted low to high, the differences are 5, 15, 20, 25, 25, 25, 30, 30, 35, so the median (the teacher's score) is 25. The district then sets cut scores: LOW growth is the lowest ___% of differences, and HIGH growth is the highest ___% (for example, the top 20%). (A computational sketch follows at the end of this handout.)

Important Perspective
It is expected that districts are building their knowledge and experience with DDMs. DDMs will undergo both small and large modifications from year to year. Changing or modifying scoring procedures is part of the continuous improvement of DDMs over time. We are all learners in this initiative.

Next Steps Today
• Begin to develop common assessments
• Consider rigor and validity (handout: rubrics)
• Develop a rubric (consider scoring concerns)
• Develop common expectations for directions (to teachers)
Other important considerations: when assessments will be given, the amount of time they will take, and the impact on the school.

Handout Rubrics; Bibliography
• Sample exams; sample texts
• Rubrics
• Types of questions (multiple choice, essay, performance)
• Reliability: will you design two exams, pre- and post-?
• Ultimate validity: does it assess what it says it does? How does it relate to other data?
• Step-by-step, precise considerations (DESE)
• Quality rubric (all areas)
• Protocol for determining growth scores
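As promised above, here is a minimal Python sketch of the cut-score arithmetic. The teacher_score_and_cuts helper and the 20% cut percentages are illustrative assumptions (the blanks in the slide are for the district to fill in); the pre/post values mirror the sample table.

```python
def teacher_score_and_cuts(pre, post, low_pct=0.20, high_pct=0.20):
    """Gain = post - pre per student; the teacher's DDM score is the
    class median gain. Cut points bound the low/high growth bands."""
    gains = sorted(b - a for a, b in zip(pre, post))
    n = len(gains)
    median_gain = (gains[n // 2] if n % 2
                   else (gains[n // 2 - 1] + gains[n // 2]) / 2)
    low_cut = gains[int(n * low_pct)]            # gains below this: low growth
    high_cut = gains[n - 1 - int(n * high_pct)]  # gains above this: high growth
    return median_gain, low_cut, high_cut

# Pre/post values from the sample cut-score table above.
pre = [20, 25, 30, 35, 35, 40, 50, 50, 50]
post = [35, 30, 50, 60, 60, 70, 75, 80, 85]
print(teacher_score_and_cuts(pre, post))  # -> (25, 15, 30): median gain 25
```

Districts would substitute their own negotiated percentages for the 20% placeholders; the returned cut points are just one plausible reading of "lowest ___%" and "highest ___%".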