Embedded formative assessment: still more rhetoric than reality
National Conference of The Schools Network, 2011
Dylan Wiliam
www.dylanwiliam.net

Origins and antecedents

Feedback (Wiener, 1948)
Developing range-finders for anti-aircraft guns
Effective action requires a closed system within which:
  actions taken within the system are evaluated
  evaluation of those actions leads to modification of future actions
Two kinds of loops:
  positive (bad: leads to collapse or explosive growth)
  negative (good: leads to stability)
"Feedback is information about the gap between the actual level and the reference level of a system parameter which is used to alter the gap in some way" (Ramaprasad, 1983, p. 4)
Feedback and instructional correctives (Bloom)

What's wrong with the feedback metaphor?

In education:
  Feedback is any information given to the student about their current performance…
  or, at best, information that compares current performance with desired performance.
  Much rarer is information that can be used by learners to improve.
In engineering:
  That's just data.
  That's just a thermostat.
  That's a feedback system.

Feedback has complex effects

264 low- and high-ability grade 6 students in 12 classes in 4 schools; analysis of 132 students at the top and bottom of each class
Same teaching, same aims, same teachers, same classwork
Three kinds of feedback: scores, comments, scores + comments

            Achievement   Attitude
Scores      no gain       high scorers: positive; low scorers: negative
Comments    30% gain      high scorers: positive; low scorers: positive

Butler (1988), Br. J. Educ. Psychol., 58, 1-14

Responses

            Achievement   Attitude
Scores      no gain       high scorers: positive; low scorers: negative
Comments    30% gain      high scorers: positive; low scorers: positive

What do you think happened for the students given both scores and comments?
A. Gain: 30%; Attitude: all positive
B. Gain: 30%; Attitude: high scorers positive, low scorers negative
C. Gain: 0%; Attitude: all positive
D. Gain: 0%; Attitude: high scorers positive, low scorers negative
E. Something else

Students and grades

Feedback is not always effective

200 grade 5 and 6 Israeli students
Divergent thinking tasks
4 matched groups:
  experimental group 1 (EG1): comments
  experimental group 2 (EG2): grades
  experimental group 3 (EG3): praise
  control group (CG): no feedback
Achievement: EG1 > (EG2 ≈ EG3 ≈ CG)
Ego-involvement: (EG2 ≈ EG3) > (EG1 ≈ CG)
Butler (1987), J. Educ. Psychol., 79, 474-482

Feedback should feed forward

80 grade 8 Canadian students learning to write major scales in music
Experimental group 1 (EG1): given written praise, a list of weaknesses, and a workplan
Experimental group 2 (EG2): given oral feedback on the nature of their errors and a chance to correct them
Control group (CG): given no feedback
Achievement: EG2 > (EG1 ≈ CG)
Boulet et al. (1990), Journal of Educational Research, 84, 119-125
…and should leave learning with the learner

'Peekability' (Simmonds & Cope, 1993)
  Pairs of students, aged 9-11, working on angle and rotation problems
  Class 1 worked on paper; class 2 worked on a computer, using Logo
  Class 1 outperformed class 2
'Scaffolding' (Day & Cordón, 1993)
  Two grade 3 classes
  Class 1 given a 'scaffolded' response when stuck; class 2 given the solution
  Class 1 outperformed class 2

Effects of feedback

Kluger & DeNisi (1996)
Review of 3,000 research reports
Excluding those:
  without adequate controls
  with poor design
  with fewer than 10 participants
  where performance was not measured
  without details of effect sizes
left 131 reports, 607 effect sizes, involving 12,652 individuals
On average, feedback does improve performance, but:
  effect sizes were very different in different studies
  40% of effect sizes were negative

Getting feedback right is hard

Response type      Feedback indicates performance exceeds the goal   Feedback indicates performance falls short of the goal
Change behavior    Exert less effort                                 Increase effort
Change goal        Increase aspiration                               Reduce aspiration
Abandon goal       Decide the goal is too easy                       Decide the goal is too hard
Reject feedback    Feedback is ignored                               Feedback is ignored

Feedback practice audit

How often do students receive 'feedback' in the form of scores, levels, sub-levels, or grades?
A. Key stages 1 to 3
B. Key stage 4
C. Key stage 5
1. Every week
2. Every two or three weeks
3. Every month or half-term
4. Termly/twice a year
5. Annually

Kinds of feedback (Nyquist, 2003)

Weaker feedback only             Knowledge of results (KoR)
Feedback only                    KoR + clear goals or knowledge of correct results (KCR)
Weaker formative assessment      KCR + explanation (KCR+e)
Moderate formative assessment    (KCR+e) + specific actions for gap reduction
Strong formative assessment      (KCR+e) + activity

Effects of formative assessment (HE)

Kind of feedback                 Count    Effect/sd
Weaker feedback only             31       0.14
Feedback only                    48       0.36
Weaker formative assessment      49       0.26
Moderate formative assessment    41       0.39
Strong formative assessment      16       0.56

Feedback practice audit 2

In your school, what proportion of feedback events involve students in responding to the feedback provided immediately, and in class?
1. Less than 10%
2. 10% to 30%
3. 30% to 70%
4. 70% to 90%
5. More than 90%
Unfortunately, humans are not machines…

Attribution (Dweck, 2000)
  Personalization (internal v external)
  Permanence (stable v unstable)
Essential that students attribute both failures and successes to internal, unstable causes (it's down to you, and you can do something about it)

Personalization
  internal: success "I got a good grade because I did a good piece of work"; failure "I got a low grade because it wasn't a very good piece of work"
  external: success "I got a good grade because the teacher likes me"; failure "I got a low grade because the teacher doesn't like me"
Stability
  stable: success "I got a good grade because I'm good at that subject"; failure "I got a bad grade because I'm no good at that subject"
  unstable: success "I got a good grade because I was lucky in the questions that came up"; failure "I got a bad grade because I hadn't reviewed the material before the test"
Specificity
  specific: success "I'm good at that, but that's the only thing I'm good at"; failure "I'm no good at that, but I'm good at everything else"
  global: success "I'm good at that, so I'll be good at everything"; failure "I'm useless at everything"

Mindset

Views of 'ability':
  fixed (IQ)
  incremental (untapped potential)
Essential that teachers inculcate in their students a view that 'ability' is incremental rather than fixed (by working, you're getting smarter)

Force-field analysis (Lewin, 1954)

What are the forces that will support or drive the adoption of formative assessment practices in your school/authority? (+)
What are the forces that will constrain or prevent the adoption of formative assessment practices in your school/authority? (—)

"Flow"

A dancer describes how it feels when a performance is going well: "Your concentration is very complete. Your mind isn't wandering, you are not thinking of something else; you are totally involved in what you are doing. … Your energy is flowing very smoothly. You feel relaxed, comfortable and energetic."
A rock climber describes how it feels when he is scaling a mountain: "You are so involved in what you are doing [that] you aren't thinking of yourself as separate from the immediate activity. … You don't see yourself as separate from what you are doing."
A mother who enjoys the time spent with her small daughter: "Her reading is the one thing she's really into, and we read together. She reads to me and I read to her, and that's a time when I sort of lose touch with the rest of the world, I'm totally absorbed in what I'm doing."
A chess player tells of playing in a tournament: "… the concentration is like breathing—you never think of it. The roof could fall in and, if it missed you, you would be unaware of it."
(Csikszentmihalyi, 1990, pp. 53–54)

Motivation: cause or effect?

[Diagram (Csikszentmihalyi, 1990): challenge (low to high) plotted against competence (low to high), with regions labelled arousal, flow, anxiety, control, worry, relaxation, apathy, and boredom.]

Providing feedback that moves learning on

Key idea: feedback should cause thinking and provide guidance on how to improve
  Comment-only marking
  Focused marking
  Explicit reference to mark schemes/scoring guides
  Suggestions on how to improve: not giving complete solutions
  Re-timing assessment: e.g., a three-quarters-of-the-way-through-a-unit test

A blossoming of research reviews…

Fuchs & Fuchs (1986); Natriello (1987); Crooks (1988); Bangert-Drowns et al. (1991); Dempster (1991, 1992); Elshout-Mohr (1994); Kluger & DeNisi (1996); Black & Wiliam (1998); Nyquist (2003); Brookhart (2004); Allal & Lopez (2005); Köller (2005); Brookhart (2007); Wiliam (2007); Hattie & Timperley (2007); Shute (2008)

Effects of formative assessment

Standardized effect size: differences in means, measured in population standard deviations

Source                        Effect size
Kluger & DeNisi (1996)        0.41
Black & Wiliam (1998)         0.4 to 0.7
Wiliam et al. (2004)          0.32
Hattie & Timperley (2007)     0.96
Shute (2008)                  0.4 to 0.8
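The definition above is verbal; as a sketch, and assuming the usual Cohen's-d-style convention of dividing the difference in group means by a pooled or population standard deviation, the quantity being reported is

\[
d = \frac{\bar{x}_{\text{intervention}} - \bar{x}_{\text{control}}}{\sigma}
\]

Read this way, Kluger and DeNisi's 0.41 means that the average student receiving feedback scored about 0.41 standard deviations higher than the average student who did not.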
Problems with effect sizes

Restriction of range
Sensitivity to instruction
Ambiguous comparisons

Definitions of formative assessment

"We use the general term assessment to refer to all those activities undertaken by teachers—and by their students in assessing themselves—that provide information to be used as feedback to modify teaching and learning activities. Such assessment becomes formative assessment when the evidence is actually used to adapt the teaching to meet student needs." (Black & Wiliam, 1998, p. 140)

"the process used by teachers and students to recognise and respond to student learning in order to enhance that learning, during the learning" (Cowie & Bell, 1999, p. 32)

"assessment carried out during the instructional process for the purpose of improving teaching or learning" (Shepard et al., 2005, p. 275)

"Formative assessment refers to frequent, interactive assessments of students' progress and understanding to identify learning needs and adjust teaching appropriately." (Looney, 2005, p. 21)

"A formative assessment is a tool that teachers use to measure student grasp of specific topics and skills they are teaching. It's a 'midstream' tool to identify specific student misconceptions and mistakes while the material is being taught." (Kahl, 2005, p. 11)

"Assessment for Learning is the process of seeking and interpreting evidence for use by learners and their teachers to decide where the learners are in their learning, where they need to go and how best to get there." (Broadfoot et al., 2002, pp. 2-3)

"Assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting students' learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence. An assessment activity can help learning if it provides information that teachers and their students can use as feedback in assessing themselves and one another and in modifying the teaching and learning activities in which they are engaged. Such assessment becomes 'formative assessment' when the evidence is actually used to adapt the teaching work to meet learning needs." (Black et al., 2004, p. 10)

Which of these is formative?

A. A science adviser uses test results to plan professional development workshops for teachers
B. Teachers doing item-by-item analysis of KS2 math tests to review their curriculum
C. A school tests students every 10 weeks to predict which students are "on course" to pass a big test
D. A "three-fourths of the way through a unit" test
E. Exit pass question: "What is the difference between mass and weight?"
F. "Sketch the graph of y equals one over one plus x squared on your mini-dry-erase boards."
What does formative assessment form?

Cycle length: long, medium, short
What is formed, depending on the cycle length: student-involved assessment, student engagement, teacher cognition about learning, curriculum alignment, monitoring progress, responsive classroom practice

Formative assessment: a new definition

"An assessment functions formatively to the extent that evidence about student achievement elicited by the assessment is interpreted and used to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions that would have been taken in the absence of that evidence." (Wiliam, 2009)
Formative assessment involves the creation of, and capitalization upon, moments of contingency in the regulation of learning processes.

Unpacking formative assessment

Key processes:
  establishing where the learners are in their learning
  establishing where they are going
  working out how to get there
Participants: teachers, peers, learners

Unpacking formative assessment

(Columns: where the learner is going | where the learner is | how to get there)
Teacher: clarifying, sharing and understanding learning intentions | engineering effective discussions, tasks, and activities that elicit evidence of learning | providing feedback that moves learners forward
Peer: clarifying, sharing and understanding learning intentions | activating students as learning resources for one another
Learner: clarifying, sharing and understanding learning intentions | activating students as owners of their own learning

Five "key strategies"…

Clarifying, sharing, and understanding learning intentions (curriculum philosophy)
Engineering effective classroom discussions, tasks and activities that elicit evidence of learning (classroom discourse, interactive whole-class teaching)
Providing feedback that moves learners forward (feedback)
Activating students as learning resources for one another (collaborative learning, reciprocal teaching, peer-assessment)
Activating students as owners of their own learning (metacognition, motivation, interest, attribution, self-assessment)
Wiliam & Thompson (2007)

Unpacking formative assessment

The same grid, collapsed into one big idea: alongside clarifying, sharing and understanding learning intentions, using evidence of achievement to adapt what happens in classrooms to meet learner needs.

Clarifying, sharing, and understanding learning intentions

Sharing learning intentions

3 teachers each teaching 4 Year 8 science classes in two US schools
14-week experiment: 7 two-week projects, each scored 2-10
All teaching the same, except that for a part of each week:
  two of each teacher's classes discuss their likes and dislikes about the teaching (control)
  the other two classes discuss how their work will be assessed
White & Frederiksen, Cognition & Instruction, 16(1), 1998

Sharing learning intentions

Comprehensive Test of Basic Skills scores, by prior-achievement group (low, middle, high), for the 'likes and dislikes' and 'reflective assessment' conditions

Outcomes

Who will benefit most from the reflective assessment?
1. Higher achievers
2. Average achievers
3. Lower achievers
4. All students will benefit equally

Sharing learning intentions

Comprehensive Test of Basic Skills
Group                   Low    Middle    High
Likes and dislikes      4.6    5.9       6.6

Sharing learning intentions

Comprehensive Test of Basic Skills
Group                   Low    Middle    High
Likes and dislikes      4.6    5.9       6.6
Reflective assessment   6.7    7.2       7.4
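One way to read the table (differences computed here for illustration; they are not reported on the slide):

\[
\begin{aligned}
\text{Low:} \quad & 6.7 - 4.6 = 2.1 \\
\text{Middle:} \quad & 7.2 - 5.9 = 1.3 \\
\text{High:} \quad & 7.4 - 6.6 = 0.8
\end{aligned}
\]

On this reading, the reflective-assessment classes do better in every prior-achievement band, and the gap is largest for the previously low-achieving students.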
Sharing learning intentions

Explain learning intentions at the start of the lesson/unit:
  consider providing learning intentions and success criteria in students' language
Use posters of key words to talk about learning:
  learning intentions
  success criteria (e.g., describe, explain, evaluate)
Use planning and writing frames judiciously
Use annotated examples of different standards to "flesh out" assessment rubrics (e.g., lab reports)
Provide opportunities for students to design their own tests

Engineering effective discussions, activities, and classroom tasks that elicit evidence of learning

Eliciting evidence

Key idea: questioning should cause thinking and provide data that informs teaching
Improving teacher questioning:
  generating questions with colleagues
  closed v open
  low-order v high-order
  appropriate wait-time

Medicine Hat Tigers

A major junior (ice) hockey team playing in the Central Division of the Eastern Conference of the Western Hockey League in Canada
Players are aged from 15 to 20
15-year-olds are allowed to play only five games until their own season has ended
Each team is allowed only three 20-year-olds
Total roster: 25 players

Medicine Hat Tigers

[Chart: number of players on the roster (0 to 8) by month of birth, January to December.]

Eliciting evidence

Getting away from I-R-E: basketball rather than serial table-tennis
'No hands up' (except to ask a question)
'Hot Seat' questioning
All-student response systems: ABCD cards, mini white-boards, exit passes

Nothing new under the sun…

Eliciting evidence practice audit

In what proportion of lessons in your school would a teacher use an 'all-student response' system at least every 30 minutes?
1. Less than 10%
2. 10% to 30%
3. 30% to 70%
4. 70% to 90%
5. More than 90%

Hinge questions

A hinge question is based on an important concept in a lesson that is critical for students to understand before you move on in the lesson.
The question should fall about midway through the lesson.
Every student must respond to the question within two minutes.
You must be able to collect and interpret the responses from all students in 30 seconds.

Questioning in maths: Diagnosis

In which of these right-angled triangles is a² + b² = c²?
[Six right-angled triangles, labelled A to F, with the sides marked a, b, and c in different positions.]

Questioning in science: Diagnosis

The ball sitting on the table is not moving. It is not moving because:
A. no forces are pushing or pulling on the ball
B. gravity is pulling down, but the table is in the way
C. the table pushes up with the same force that gravity pulls down
D. gravity is holding it onto the table
E. there is a force inside the ball keeping it from rolling off the table
Wilson & Draney, 2004

Questioning in English: Diagnosis (2)

Which of these is correct?
A. Its on its way.
B. It's on its way.
C. Its on it's way.
D. It's on it's way.

Questioning in English: Diagnosis (3)

Identify the adverbs in these sentences:
1. The boy ran across the street quickly. (A) (B) (C) (D) (E)
2. Jayne usually crossed the street in a leisurely fashion. (A) (B) (C) (D) (E)
3. Fred ran the race well but unsuccessfully. (A) (B) (C) (D) (E)

Questioning in English: Diagnosis (4)

Which of these is the best thesis statement?
A. The typical TV show has 9 violent incidents
B. The essay I am going to write is about violence on TV
C. There is a lot of violence on TV
D. The amount of violence on TV should be reduced
E. Some programs are more violent than others
F. Violence is included in programs to boost ratings
G. Violence on TV is interesting
H. I don't like the violence on TV

Questioning in history: Diagnosis

Why are historians concerned with bias when analyzing sources?
A. People can never be trusted to tell the truth
B. People deliberately leave out important details
C. People are only able to provide meaningful information if they experienced an event firsthand
D. People interpret the same event in different ways, according to their experience
E. People are unaware of the motivations for their actions
F. People get confused about sequences of events

Questioning in MFL: Diagnosis

Which of the following is the correct translation of "I give the book to him"?
A. Yo lo doy el libro.
B. Yo doy le el libro.
C. Yo le doy el libro.
D. Yo doy lo el libro.
E. Yo doy el libro le.
F. Yo doy el libro lo.

Key requirement: discriminate between incorrect and correct cognitive rules

Version 1: There are two flights per day from Newtown to Oldtown. The first flight leaves Newtown each day at 9:20 and arrives in Oldtown at 10:55. The second flight from Newtown leaves at 2:15. At what time does the second flight arrive in Oldtown? Show your work.

Version 2: There are two flights per day from Newtown to Oldtown. The first flight leaves Newtown each day at 9:05 and arrives in Oldtown at 10:55. The second flight from Newtown leaves at 2:15. At what time does the second flight arrive in Oldtown? Show your work.
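Presumably the point of the contrast, worked through here under the assumption that the targeted misconception is treating clock times as if they were decimal numbers:

\[
\begin{aligned}
\text{Version 1, correct rule:} \quad & 10{:}55 - 9{:}20 = 1\text{ h }35\text{ min}, \qquad 2{:}15 + 1{:}35 = 3{:}50 \\
\text{Version 1, decimal rule:} \quad & 10.55 - 9.20 = 1.35, \qquad 2.15 + 1.35 = 3.50 \\
\text{Version 2, correct rule:} \quad & 10{:}55 - 9{:}05 = 1\text{ h }50\text{ min}, \qquad 2{:}15 + 1{:}50 = 4{:}05 \\
\text{Version 2, decimal rule:} \quad & 10.55 - 9.05 = 1.50, \qquad 2.15 + 1.50 = 3.65 \text{ (not a valid clock time)}
\end{aligned}
\]

In Version 1 the faulty rule and the correct rule give the same answer, so a right answer reveals nothing about the rule a student used; in Version 2 only the correct rule produces a sensible arrival time.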
Activating students as owners of their own learning

Self-assessment: Portugal

45 teachers studying for a Master's degree in Education, matched in age, qualifications and experience, using the same curriculum scheme for the same amount of time
Control group (N=20): follow the regular MA program
  117 students aged 8 years, 119 students aged 9 years, 77 students aged 10-14 years
Experimental group (N=25): develop self-assessment with their students
  125 students aged 8 years, 121 students aged 9 years, 108 students aged 10-14 years
Fontana & Fernandes, Br. J. Educ. Psychol., 64, 407-417

Details of the intervention

Weeks      Intervention
1 to 2     Individual choice from a range of work provided by the teacher; student self-assessment using materials provided
3 to 6     Children construct their own problems, like those in weeks 1 and 2, and select structured math apparatus to aid solutions
7 to 10    Children presented with new learning objectives, and make up their own problems, without exemplars from the teacher
11 to 14   Children set their own learning objectives, construct appropriate problems, and use appropriate self-assessment
15 to 20   As weeks 1 to 14, but with less monitoring from the teacher and increased freedom of choice and personal responsibility

Impact on student achievement

               Pre-test    Post-test    Gain    Effect size
Control        65.1        72.9         7.8     0.34
Experimental   58.7        73.7         15.0    0.66

Students owning their own learning

Students assessing their own work:
  with mark schemes or scoring guides
  with exemplars
Self-assessment of understanding:
  traffic lights
  red/green discs
  coloured cups

Activating students as learning resources for one another

Benefits of structured interaction

15-year-olds studying World History were tested on their understanding of material delivered in lessons
At the end of the lessons, students were given time to review their understanding of the material before they were tested
Half the students had been trained to pose questions as they listened to the lectures

                Individual                     Group
Unstructured    Independent review             Group discussion
Structured      Structured self-questioning    Structured peer questioning

King, A. (1991). Applied Cognitive Psychology, 5(4), 331-346

Impact on achievement

[Chart: test scores (40 to 100) at pre-test, post-test, and a 10-day delayed test for the four conditions: structured peer questioning, structured self-questioning, group discussion, and independent review.]

King, A. (1991). Applied Cognitive Psychology, 5(4), 331-346
Students as learning resources

Students assessing their peers' work:
  "Pre-flight checklist"
  "Two stars and a wish"
Training students to pose questions/identifying group weaknesses
End-of-lesson students' review

Pulling it all together

Dual-pathway model (Boekaerts, 1993)

"It is assumed that students who are invited to participate in a learning activity use three sources of information to form a mental representation of the task-in-context and to appraise it:
1. current perceptions of the task and the physical, social, and instructional context within which it is embedded;
2. activated domain-specific knowledge and (meta)cognitive strategies related to the task; and
3. motivational beliefs, including domain-specific capacity, interest and effort beliefs." (Boekaerts, 2006, p. 349)

Growth and well-being

Share learning goals with students so that they are able to monitor their own progress toward them.
Promote the belief that ability is incremental rather than fixed; when students think they can't get smarter, they are likely to devote their energy to avoiding failure.
Make it more difficult for students to compare themselves with others in terms of achievement.
Provide feedback that contains a recipe for future action rather than a review of past failures.
Use every opportunity to transfer executive control of the learning from the teacher to the students, to support their development as autonomous learners and as learning resources for one another.
Use random questioning and all-student response systems to provide high-quality evidence to the teacher about the progress of learning.

Force-field analysis (Lewin, 1954)

What are the forces that will support or drive the adoption of formative assessment practices in your school/authority? (+)
What are the forces that will constrain or prevent the adoption of formative assessment practices in your school/authority? (—)