NAPLAN and MySchool – impact on Assessment and Curriculum
Margaret Wu

• “There are three kinds of lies: lies, damned lies, and statistics.”
  Mark Twain, based on part of a phrase attributed to Benjamin Disraeli

Statistical complexity – to stop debate
• When situations become calculable it is taken to indicate the fact that political contestation has ended. (Andrew Barry, 2002, p. 272)

Statistical complexity – to stop debate
• We have obviously had public debate about the ICSEA index ... I do have a standing offer to any journalist who has read Barry McGaw’s book on meta‐analysis and would like to sit through and work through the regression equations with him, anybody who wants to do that, a standing invitation to come to my office for the number of days necessary to get that done.
  (Julia Gillard, in reply to a journalist, National Press Club, 24/2/10, cited in Williss, 2010)

Claims about NAPLAN data:
• SMH, Nov 16, 2010: Giving a preview of My School version two on Wednesday, Ms Gillard said the additional facts would help parents make more informed decisions about where to send their children.
• The Australian, Nov 18, 2010: The extra data will give principals greater ability to identify poorly performing teachers.
• SMH, Nov 17, 2010: PARENTS will be able to measure how much an individual school improves the performance of their child on the federal government's My School website.

Key Issues
• Large margins of error at student and class levels (reliability).
• Student factors contribute more to student achievement than teacher/school factors do (validity).
• ICSEA does NOT explain away all student level factors.

Possible Grade 5 Mathematics Item Pool – Many questions can be asked
David can answer 60% of the items (if we have the opportunity to administer all items)
Sample items from the pool:
(a) 16 × 10 =
Each apple weighs around 160 grams. How many apples together will weigh close to half a kilogram?
40 questions are sampled from the large item pool
NAPLAN 2008 Test: David’s score 20/40
NAPLAN 2009 Test: David’s score 25/40
NAPLAN 2010 Test: David’s score 28/40
David’s test scores on similar NAPLAN tests will have a range of 10 score points (e.g., between 20/40 and 30/40)

On the NAPLAN scale…
[Figure: NAPLAN 2008 reading scores (scale 200 to 800) for grade 3, grade 5, grade 7 and grade 9, showing the 2.5th percentile, the mean and the 97.5th percentile.]

Teacher effect
• A high performing teacher can raise student standards by one more year of growth as compared to a low performing teacher.
[Figure: NAPLAN 2008 reading scores (scale 200 to 800) for grade 3, grade 5, grade 7 and grade 9, showing the 2.5th percentile, the mean and the 97.5th percentile, annotated with: excellent teacher, 1.5 years of growth; average teacher, 1 year of growth; ineffective teacher, 0.5 year of growth.]
Margin of error surrounding class average is as large as the whole range of teacher effect.

Summary about margin of error
• One single test, once a year
• Not much information that the teacher does not already know
• NAPLAN was not designed to inform about individual students.
• NAPLAN is a BLUNT instrument for measuring teaching effect.

• “Torture numbers, and they will confess to anything.”
  Gregg Easterbrook

ICSEA
• The idea of ICSEA is to control for student level factors
• The assumption is that after adjusting for ICSEA, what is left is school effect.
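The item-sampling argument in the margin-of-error slides (a student who can answer 60% of the item pool, tested on 40 sampled items) can be sketched numerically. This is only an illustration under a simplifying assumption that is not in the slides: each sampled item is answered correctly independently with probability 0.6 (a plain binomial model, not the scaling model NAPLAN actually uses). The function names `score_range` and `simulate_tests` are hypothetical.

```python
import math
import random

def score_range(p_known, n_items, z=1.96):
    """Approximate 95% range of raw scores under a simple binomial model.

    Assumption (not from the slides): items are answered independently,
    each with probability p_known of being correct.
    """
    expected = n_items * p_known
    sd = math.sqrt(n_items * p_known * (1 - p_known))
    return expected - z * sd, expected + z * sd

def simulate_tests(p_known, n_items, n_tests, seed=0):
    """Draw n_tests hypothetical test forms and return the raw score on each."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_known for _ in range(n_items))
            for _ in range(n_tests)]

low, high = score_range(0.6, 40)
scores = simulate_tests(0.6, 40, n_tests=1000)
print(f"expected score: {0.6 * 40:.0f}/40")
print(f"approximate 95% range: {low:.1f} to {high:.1f}")
print(f"simulated min/max over 1000 forms: {min(scores)} to {max(scores)}")
```

Under this simplified model the 95% range runs from roughly 18 to 30 out of 40, which is of the same order as the 10-point spread quoted for David's scores, even though his underlying ability never changes.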
Factors contributing to student achievement
[Figure: NAPLAN 2008 reading scores, grade 5 (scale 200 to 800). Starting from "Everyone equal", three sources of variation are added in turn, each labelled "?% variation": variation due to natural ability/personality differences, variation due to student background, and variation due to school/teacher effect.]
10% of the student scores variation is due to school/teacher effect

School effect / Non-School effect
[Figure labels: Variation of school mean scores; After ICSEA; Variation within like-school group.]
Red/Green bars do NOT reflect school effects

Donnelly (April 14, 2011, ABC The Drum)
• On the back of schools being forced to implement Australia’s National Assessment Program Literacy and Numeracy tests (NAPLAN) at years three, five, seven and nine beginning in 2008, Australia’s education ministers have called for a similar testing and accountability regime to monitor and evaluate Australian parents.
• The new tests, called National Assessment Program Mums and Dads (NAPMAD), will be taken annually by all parents and will improve parent productivity by benchmarking Australian homes against world best practice.

Drawing statistical inference
• What does “accurate” mean?
• NAPLAN tests students
• Used to make inferences about teachers
• A leap of faith
“I want to know if the principal at Eastbourne primary school is effective!”
“There is 36% chance that he is effective, 48% chance of being average, and 16% chance of being ineffective.”
“And I am only 90% confident of the above statement.”

Conviction by Numbers
“I want to know the truth!”
“It ain’t statistics.”
Darrell Huff, How to Lie with Statistics
• Given any group, there will always be someone at the lower end, by chance.
• After examining a whole group, the chance of finding someone at the lower end is a certainty.

Conviction by Numbers
• Statistical information alone can never be used to make high stakes decisions
• Certainly not to label schools/teachers as effective or not effective.
• Teacher pay linked to student performance?
• Punitive actions against teachers based on test results must be stopped.

Finally…
Darrell Huff
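The "Conviction by Numbers" point, that someone will always appear at the lower end by chance once a whole group is examined, can be made concrete with a short calculation. The sketch below rests on assumptions that are not in the slides: all teachers are equally effective, their results are independent, and a teacher is "flagged" when a result falls in the bottom 5% of what chance alone would produce. The function name `chance_someone_flagged` and the 5% cut-off are hypothetical.

```python
def chance_someone_flagged(n_teachers, tail=0.05):
    """Probability that at least one of n equally effective teachers
    lands in the bottom `tail` fraction purely by chance.

    Assumption (not from the slides): teachers' results are independent.
    """
    return 1 - (1 - tail) ** n_teachers

for n in (1, 20, 100):
    print(n, round(chance_someone_flagged(n), 3))
```

With a single teacher the chance is 5%, but across 20 teachers it is already about 64%, and across 100 teachers it exceeds 99%, so finding a "low performer" in a large group says little on its own about any individual.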