Whole School Approaches to Standards and Improvement: “Intelligent Accountability”
Claire Wyatt-Smith, Griffith University
Val Klenowski, Queensland University of Technology
Abstract
This paper focuses on standards-driven assessment reform and is based on research findings from a four-year, large-scale, federally funded Australian Research Council Linkage project. The authors propose that moderation, using explicitly defined standards, provides opportunities for teachers to develop their assessment capability and to carry forward their professional responsibility for intelligent accountability. It is also proposed that, in 21st century assessment, teachers and students undertake assessment as a shared enterprise involving increased student participation, so that students’ reliance on the teacher as the primary source of evaluative feedback is systematically reduced over time. A qualitative methodological approach was adopted to analyse the corpus of data collected over the four years. The findings are presented as they relate to the Australian Curriculum and Achievement Standards and through a series of questions with direct application to ensuring dependable teacher judgment, standards and moderation. The paper concludes with recommendations for achieving dependable and sustainable assessment cultures that attend to ‘system’ and ‘site’ validity.
The proposition in this paper is that moderation, involving the use of explicitly defined standards, provides teachers with opportunities to inform their own assessment practice as part of their professional responsibility. A related proposition is that teachers also have a responsibility to induct students into knowledge about expectations of quality, including knowledge of standards and their use in arriving at judgments of quality. The necessary shifts in 21st century assessment are to enable teachers and students to undertake assessment as a shared enterprise and to build students’ assessment capabilities so that students’ reliance on the teacher as the sole, or primary, source of evaluative feedback is systematically reduced over time.
The paper is written in three parts and is based on research findings from a four-year, large-scale, federally funded Australian Research Council Linkage project. The focus of the study was on standards-driven assessment reform (Wyatt-Smith & Klenowski, 2010; Wyatt-Smith, Klenowski, & Gunn, 2010), in partnership with the Queensland Studies Authority, the National Council for Curriculum and Assessment for the Republic of Ireland, and Queen’s University Belfast, Northern Ireland.
Part one of this paper presents important background information and an overview of the changing assessment context in Australia, especially as this relates to the Australian Curriculum and Achievement Standards. Through a series of questions, part two presents research findings with direct application to ensuring dependable teacher judgment, standards and moderation. The paper concludes with recommendations for achieving dependable and sustainable assessment cultures that attend to ‘system’ and ‘site’ validity.
Part one: Background and Overview
Two major levers for current educational change in Australia are large-scale, high-stakes standardised testing and the understanding of the central role of the teacher in quality assessment practice, agreed to be at the heart of learning and learning improvement. Also influential in recent reform efforts is the system push for evidence of achievement, tied to a commitment to transparency for accountability. The development of national student assessment, a national curriculum and reporting of school education outcomes marks major educational reform in Australia.
Benchmark testing began in 1999 when the first annual literacy tests (reading and writing) for Year 3 and Year 5 students were conducted. In 2008 the National Assessment Programme – Literacy and Numeracy (NAPLAN) was introduced; students in Years 3, 5, 7 and 9 sit the same national tests in reading, writing, spelling, grammar and punctuation, and numeracy. The nationally agreed literacy and numeracy benchmarks for Years 3, 5 and 7 represent minimum standards of performance. The National Assessment Programme also involves triennial sample assessments in science at Year 6, in civics and citizenship at Years 6 and 10, and in ICT literacy at Years 6 and 10 (Harrington, 2008). To date, despite these developments in national testing, there has been no direct link between these tests and a national curriculum.
In 2007 the six states and two territories of Australia developed individual approaches to the use of standards in the implementation of curriculum, assessment and reporting. In February 2008 the interim National Curriculum Board was established to set the core content and Achievement Standards in Mathematics, Science, History and English from Kindergarten to Year 12. In May 2009, the Australian Curriculum, Assessment and Reporting Authority (ACARA) assumed responsibility for the work of the National Curriculum Board (April 2008 - May 2009). In addition to the national curriculum, ACARA assumed responsibility for a national assessment program, including the use of Achievement Standards aligned to the curriculum to measure students’ progress, and a national data collection and reporting program. The latter is intended to support analysis, evaluation, research, resource allocation, accountability and reporting on schools and broader national achievement.
The performance of individual schools is published on the My School website (www.myschool.edu.au), which the federal government claims achieves transparency so that parents can evaluate schools’ performance and underperforming schools can be targeted. On each school’s profile page a summary table of the school’s NAPLAN results is colour coded to indicate substantial differences between the school’s results and both the Australian average and the results of statistically similar schools (ACARA, 2009).
While ACARA has responsibility for the management and the development of the Australian Curriculum, national student assessment and reporting of school education outcomes, each state of Australia has developed local systems to support teachers. In Queensland, the Queensland Studies Authority introduced the Queensland Curriculum, Assessment and Reporting (QCAR) framework of Essential Learnings, A-E standards and a common reporting framework to promote consistency of teacher judgment.
The findings reported here emerged from the Australian Research Council Linkage project, which focused on how standards inform and regulate teacher judgment of student work in the middle years of schooling. The place of social moderation in developing consistency of teacher judgment was explored in different curriculum domains (English, Science and Mathematics), identifying the configural properties of teacher judgments and changes to those judgments. The data sets comprised survey responses collected at the beginning of the project in 2006, pre- and post-moderation interviews of participant teachers, audio and observation data of the moderation meetings (face-to-face and ICT mediated) and artefacts from these meetings. Table 1 summarises the total corpus of data collected in 2007 and in 2008. Teachers were interviewed before and after the moderation meetings and their talk was also tracked during the recorded moderation sessions. Those teachers who had their talk tracked were identified as focus teachers. Data collection continued in 2008 when the moderation meetings involving the Queensland Common Assessment Tasks (QCATs) were held across the state.
All interview and meeting data were progressively transcribed in full for analysis by more than one researcher. A qualitative research paradigm was adopted. The interview transcripts were analysed using the qualitative data analysis techniques of organising, matching, coding, and identifying patterns and themes. The NVivo software package was used to sort the data into categories. Qualitative data analysis (Creswell, 2008) involved the researchers reading the transcripts and coding them individually. First, the transcripts were read to ascertain the general content. They were then read again, with the research questions used as a lens to highlight responses that related to the specific focus of each sub-study of the research. In this second reading the texts were scanned for participants’ articulated responses that related to the research questions. Sections of transcripts were highlighted if they were evaluated as providing evidence of teachers’ views about the relationship between moderation and consistency of teacher judgment. A pre-determined set of descriptors (Bazeley, 2007; Lincoln & Guba, 1985; Miles & Huberman, 1994; Saldaña, 2009) was identified for use in this coding exercise. This was “a formal inductive process of breaking down data into segments or data sets which can then be categorised, ordered and examined for connections, patterns and propositions that seek to explain the data” (Simons, 2009, p. 117). Once the data sets were coded they were checked for accuracy and reliability. After the coding of the transcripts was checked, compared and analysed, themes aligned to the research questions were identified.
Table 1
Summary of Total Data Collection (2007 & 2008)

Category                  Breakdown                                             Total
Focus teachers (gender)   Female 66; Male 23 (19 secondary level)               89
School numbers            Primary 26; Special Schools 3; Secondary 20           49
Sector                    State 33; Catholic 10; Independent 6                  49
KLAs                      Year 4, Year 6 and Year 9 English, Science & Maths
Interviews                Pre-moderation 90; Post-moderation 74                 164
Moderation sessions       Face-to-face 63; ICT 12                               75

Part two: The practice of teacher judgment and moderation
Part two of this paper brings together some of the principal findings of this study that have been used by the industry partner, the Queensland Studies Authority. These findings have helped to inform policy development and to support teachers in their professional responsibility to become assessment literate in a system that trusts teachers’ professional judgment, and seeks to support and sustain such efforts. This next section is presented as a series of questions, which represent those often asked by teachers and policy officers.
What is consistency of teacher judgment?
Consistency is achieved when two or more teachers assess a piece of student work and arrive at a comparable or “like” judgment expressed as a grade or mark.
Consistency is achieved when, at the end of the judgment process, there is agreement regarding the grade or mark to be awarded.
This involves teachers applying a shared understanding of those qualities that characterise the standards as they apply at different levels. If there is consistency of judgment then this will be evident in the comparability of teachers’ grading decisions.
The term comparability does not apply directly to the processes that teachers rely on to arrive at a judgment. It is accepted that these processes will vary from teacher to teacher and context to context. To emphasise, then, comparability is the outcome of the informed use of the stated standards.
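As a purely illustrative aside, the idea of a comparable or “like” judgment can be pictured as a simple comparison of the grades two teachers award to the same pieces of student work. The sketch below is not drawn from the study, which was qualitative and reported no agreement statistics; the function name, the A-E ordering as a list and the sample grades are all hypothetical.

```python
# Illustrative sketch only: the study reported in this paper was qualitative
# and did not compute agreement statistics. All names and data are hypothetical.

GRADES = ["A", "B", "C", "D", "E"]  # A-E standards, highest to lowest


def agreement_summary(grades_a, grades_b):
    """Compare two teachers' grades for the same pieces of student work.

    Returns the proportion of exact matches and the proportion of grades
    that are at most one standard apart.
    """
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("Both teachers must grade the same non-empty set of work")

    exact = 0
    within_one = 0
    for a, b in zip(grades_a, grades_b):
        gap = abs(GRADES.index(a) - GRADES.index(b))
        if gap == 0:
            exact += 1
        if gap <= 1:
            within_one += 1

    n = len(grades_a)
    return {"exact": exact / n, "within_one": within_one / n}


# Hypothetical grades awarded by two teachers to the same five work samples.
teacher_1 = ["A", "B", "C", "C", "D"]
teacher_2 = ["A", "C", "C", "B", "D"]
summary = agreement_summary(teacher_1, teacher_2)
print(summary)
```

A count of this kind only describes the outcome. It says nothing about how each teacher arrived at the grade, which is why the paper treats discussion of evidence against the stated standards as the means of achieving comparability.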
How do teachers work together to achieve consistency?
What are standards?
Standards describe the expected features or characteristics of quality at various levels of performance.
What is the role of standards?
The role of standards is to inform teachers’ decision-making in assessing the quality of student work at various levels. Standards provide a common set of stated reference points for teacher use and gain meaning through use over time. This is because standards, written as verbal descriptors, require interpretation and application in a community of practice.
Standards are important for informing teaching and learning in terms of the development of assessment tasks. In the teaching-learning cycle teachers share the standards with students to provide information about the expected qualities they are aiming for. In this way standards are linked to teacher feedback and to student self-assessment and peer assessment.
What is the role of evidence?
Evidence used to make a judgment can take a range of forms: print, multimodal, spoken, performance, digital. Teachers decide what evidence is relevant for learning and teaching, judgment and reporting. Fitness for purpose is integral to these decisions, whether they are for formative or diagnostic assessment or for summative assessment and reporting purposes.
Evidence is related to learning and teaching activities and it also reflects the qualities represented in the standards. The connection between the evidence and the standards must be articulated.
When making a judgment about a grade or mark, the evidence of the characteristics or traits of performance should be apparent in the student work sample. Evidence is explicit and available for scrutiny by others. Consistency of judgment is possible through the direct links between the identified evidence of quality and the stated standards. In this way judgments are made defensible.
How do we reach agreement?
Agreement is possible when teachers concur on the fit between the evidence and the standards. This implies that the teachers share a common interpretation of the standards, which can only be developed with use over time. Discussion among teachers regarding the evidence depicting the qualities of the standards is fundamental. Through talk and interaction, teachers make explicit how the qualities of the standards are met in student work. This recognises that there are various ways in which the requirements of the standards can be met.
An on-balance judgment is reached when teachers consider the overall qualities of performance evidenced in the work being assessed. Teachers attend to the overall judgment of the qualities to assess the best fit between the work and the stated standards to reach a final grade.
What is the role of exemplars to achieve consistency?
Exemplars are useful in illustrating how the requirements of standards may be met. A range of exemplars to illustrate each grade helps teachers understand the application of standards to achieve that grade or mark. Exemplars can be developed by teachers or supplied by the system. Exemplars are useful for both teachers and students in informing teaching and learning. They are a vital support and illustrate the range of standards.
Exemplars are of optimal use when they include a commentary addressing how the standards have been applied. This involves making explicit the qualities in the work and how these align with the standards. In addition, the commentary identifies and explains the compensations and/or trade-offs involved in arriving at the overall on-balance judgment.
What do we do if we can’t reach agreement?
Where teachers disagree about a grade, this usually reflects differing interpretations of the criteria and the standards. To address this, teachers are advised to talk about how they interpreted and applied the criteria and standards to the particular piece of student work or folio of work. In this discussion it is vital that teachers focus attention on how the qualities of the work align with the standard.
Disagreement about the overall quality of work and the grade to be awarded can be attributable to: different interpretations of the standards that teachers have put in place; particular compensations or trade-offs; placing a greater value on certain qualities; and drawing on considerations other than those evident in the work (e.g., knowledge of the child). These aspects of judgment practice are routinely not evident in the grading decision. Typically they remain implicit or unstated and are not self-evident in the final grade awarded (letter or numeric score). They nevertheless operate and are influential in judgment as private practice and may lead to unfair or biased decisions. Moderation is therefore essential for deprivatising judgment practice.
What is the role of moderation procedures in ensuring comparability of teachers’ judgments in deciding results?
Moderation provides the opportunity for teachers to discuss how they apply the standards in relation to student work. It requires meetings of teachers within their own school, between schools, or in clusters of schools. This can involve face-to-face and/or technologically mediated meetings. Central to moderation is the exercise of matching the evidence with the standards through discussion of work samples that makes the links explicit.
Moderation occurs after teachers have assessed and graded student work.
Moderation that focuses on the application of standards to student work is integral to system and local efforts to achieve consistency and comparability of judgment. In the process, direct inter-student comparison plays no part. The focus is on matching the qualities in the work with the expected standards.
It is through moderation practice over time that teachers develop judgment practice that is dependable and defensible. They also develop a community of shared understanding of how the standards apply to student work in a range of contexts using exemplars. In this practice teachers develop a more confident identity as an assessor as they shift in their own trajectory of understanding standards. This opportunity of participating in moderation fulfils both learning and accountability purposes.
Moderation opportunities provide teachers with the context for reviewing the quality of the assessment tasks, and the extent to which these tasks enable the students to demonstrate achievement across the full range of the standards.
Another role of moderation is to provide a mechanism for demonstrating accountability in terms of comparability of individual teachers’ judgments. In this accountability context the focus is on how the standards have been consistently applied within and across school contexts. The judgments are scrutinised for their comparability. This is taken to include the reliability of teachers’ judgments and the application of the standards over time.
Considerations or evidence not apparent in the student work should not influence the grade awarded.
How does teacher judgment inform learning improvement?
When teachers work with standards in their school communities in the context of moderation they share their understanding of those standards and how they apply them in developing assessment tasks or opportunities. As teachers become confident in their knowledge and application of the standards they integrate their use in teaching and learning. They attend to the assessment demands that students are expected to meet and design learning and assessment tasks that enable student success. This involves the alignment of their teaching with assessment including opportunities for students to use the standards to review and improve the quality of their work.
Teachers’ work extends to the induction of students into knowledge of the standards and their use in feedback for learning, self-monitoring over time and peer and self-assessment.
In this final section of the paper we provide four main recommendations for systems to develop, support and sustain assessment cultures that are dependable and which can contribute to intelligent accountability by providing increased opportunities for teachers to develop their professional responsibility in assessment and learning.
Part three: Recommendations for dependable, sustainable assessment cultures
There are four main concepts for maintaining dependable and sustainable assessment cultures.
First is the concept of ‘front-ending’ (Wyatt-Smith & Bridges, 2008) the assessment and task design processes as the anchor for curriculum planning and teaching. Assessment should not be viewed as an endpoint or terminal activity, something tacked on at the end of the unit, or done after teaching and learning. We propose that fundamental and productive changes in teaching practice can result from critical reflection, before teaching begins, on the assessment evidence to be collected.
The second concept relates to alignment of assessment, curriculum and pedagogy. We also recommend that assessment tasks and associated standards and criteria are developed within a unit of work, each informing the other and establishing a clear link between assessment, curriculum and pedagogy. Effective classroom practice requires that these are aligned.
Third is assessment as situated practice. Teachers need to draw on their understandings of the local context, curriculum knowledge and skills, and the literacy demands of assessment. Such demands present powerful barriers to student access to learning and success in schooling.
Finally, moderation starts at the stage of task design when teachers interrogate the quality and demands of the assessment activities and related tasks they are developing relative to the standards they plan to use for judging quality. Through a focus on assessment expectations and quality task design prior to commencing a unit of work, teachers develop a shared language for talking about quality in the classroom and gain confidence in the feedback they give students.
References
Australian Curriculum, Assessment and Reporting Authority. (2009). Curriculum design paper. Retrieved 30 June, 2010, from www.acara.edu.au/verve/_resources/Curriculum_Design_Paper_.pdf
Bazeley, P. (2007). Qualitative data analysis with NVivo. Los Angeles: SAGE.
Creswell, J. (2008). Educational research: Planning, conducting, and evaluating quantitative and qualitative research (3rd ed.). Upper Saddle River, NJ: Pearson/Merrill Prentice Hall.
Harrington, M. (2008). Australian Curriculum, Assessment and Reporting Authority Bill. Bills Digest Number 60, 2008-2009. Retrieved 30 January, 2009, from http://www.aph.gov.au/library/Pubs/bd/2008-09/09bd060.htm
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills: SAGE.
Miles, M., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook. Thousand Oaks: SAGE.
Saldaña, J. (2009). The coding manual for qualitative researchers. London: SAGE.
Simons, H. (2009). Case study research in practice. Los Angeles: SAGE.
Wyatt-Smith, C. M., & Bridges, S. (2008). Meeting in the middle: Assessment, pedagogy, learning and students at educational disadvantage. Final Evaluation Report for the Department of Education, Science and Training on Literacy and Numeracy in the Middle Years of Schooling. Available at: http://education.qld.gov.au/literacy/docs/deewr-myp-final-report.pdf
Wyatt-Smith, C. M., & Klenowski, V. (2010). The role and purpose of standards in the context of national curriculum and assessment reform for accountability, improvement and equity in student learning. Curriculum Perspectives, 30(3), 37-47.
Wyatt-Smith, C. M., Klenowski, V., & Gunn, S. (2010). The centrality of teachers’ judgment practice in assessment: A study of standards in moderation. Assessment in Education: Principles, Policy & Practice, 17(1), 59-75.