Accountability Measures and School League Tables
Robert Coe
Capita workshop, 15th July 2014

Outline
Evidence on impact of accountability
Typology of accountability systems
Moral leadership
What should we do?

Who wants accountability?
Direct incentives drive people’s behaviour
– Policymakers
– Economists
– Parents
Negative side-effects outweigh benefits
– Teachers
– Education researchers
– Parents

Evidence on impact of accountability

Research evidence
Meta-analysis of US studies by Lee (2008)
– Small positive effects on attainment (ES=0.08)
Impact of publishing league tables (England vs Wales) (Burgess et al 2013)
– Overall small positive effect (ES=0.09)
– Reduces rich/poor gap
– No impact on school segregation
Other reviews: mostly agree, but mixed findings
Lack of evidence about long-term, important outcomes

Evidence from PISA
DfE Accountability response: ‘OECD evidence shows that a robust accountability framework is essential to improving pupils’ achievement’ (DfE, 2013)
What the report actually said: ‘there is no measurable relationship between…various uses of assessment data for accountability purposes and the performance of school systems’ (OECD, 2010, p46)

Dysfunctional side effects
Extrinsic replaces intrinsic motivation
Narrowing focus on measures
Gaming (playing silly games)
Cheating (actual cheating)
Helplessness: giving up
Risk avoidance: playing it safe
Pressure: stress undermines performance
Competition: sub-optimal for system

Accountability cultures
Distrust              Trust
Controlled            Autonomous
Fear                  Confidence
Threat                Challenge
Competitive           Supportive
Target-focus          Improvement-focus
Image presentation    Problem-solving
Quick fix             Long-term
Tick-list quality     Genuine quality
Sanctions             Evaluation

Accountability and improvement
Professional Monitoring Systems
Official Accountability Systems
If you find a problem with your performance, what do you do?
Cover it up
Expose it to view.
(Tymms, 1999)

Overall evidence-based conclusions
Easy to cherry-pick
‘[E]ducational policy makers and practitioners should be cautioned against relying exclusively on research that is consistent with their ideological positions to support or criticize the current high-stakes testing policy movement’ (Lee, 2008, p. 639)
Direct incentives do drive people’s behaviour; current evidence suggests accountability has small positive effects on attainment
Accountability systems always seem to have some undesirable side-effects
Balance of positive & negative effects likely to depend on a range of factors; current knowledge does not allow us to predict confidently

Moral leadership

Hard questions
1. Imagine there was no accountability. What would you do differently?
2. Would students be better off as a result?
   a) No – I wouldn’t do anything at all differently
   b) Not significantly – minor presentational changes only
   c) Yes – students would be better off without accountability
3. What actually stops you doing this?

Ways forward

Making Accountability Work
1. Reclaim professionalism
2. Experiment to optimise
3. Improve the measures
4. Make teacher assessment robust
5. Uncertainty and unpredictability
6. No substitute for judgement
(Coe & Sahlgren, 2014)

1. Reclaim professionalism
Take the pledge: “We do what’s right for children and young people, not just what Ofsted might want”
Commit to supporting other schools/teachers who suffer as a result
– Need evidence of great teaching, from robust evaluation and monitoring: can’t just support any school/teacher judged inadequate
– Important that it is not just the ‘failed’ school/teacher that complains
– Social media campaigns can be very effective: @OldAndrewUK vs Ofsted

2. Experiment to optimise
Should accountability have
– Explicit (eg PRP, schools ‘academised’) or implicit (challenge, compare) incentives?
– Performance published or confidential?
– Interpreted judgements or objective data?
– Improvement through consequences or feedback?
– Focus on information for consumers (eg parents) or professionals?
We don’t know, so need to experiment

3. Improve the measures
Choose measures that are genuinely aligned with what is valued (& hard to distort)
Ensure assessments/qualifications are predictive of later success
Measure a wide range of outcomes
Look at distributions, not just thresholds
Use delayed outcomes: eg for 11-16
– % NEET @ 18
– % entering elite university courses
Build in loophole-closing mechanisms (eg re difficulty/value of ‘equivalents’)

4. Make teacher assessment robust
Training in assessment and moderation
Link teacher assessed mark distribution to within-centre exam mark distribution
Spot checks (risk targeted): can students reproduce it?
Support whistle-blowing
Signed declarations from teachers, headteachers and students
Questionnaire audit of practices: ‘too good to be true’ triggers spot check

5. Uncertainty and unpredictability
State general aims, but be vague/flexible about specific targets/measures
Change the targets and monitor who chases
Make assessments less predictable (more capricious?)

6. No substitute for judgement
Combine statistical measures with face-to-face observation & judgement
Require inspectors to demonstrate their ability to make sound judgements about complex data, from observation, etc
Actively look for (and publicise) gaming and unintended consequences; encourage whistleblowing on counter-productive gaming

Summary …
1. Evidence on accountability is not great, but suggests small positive impacts
2. Dysfunctional side-effects are also real
3. We need experiments to learn how to optimise
4. Moral leadership is required

www.cem.org
@ProfCoe
Robert.Coe@cem.dur.ac.uk