Nurse Education Today 32 (2012) e45–e48

Why should we bother with assessment moderation?

Colleen Smith ⁎
School of Nursing and Midwifery, Division of Health Sciences, University of South Australia, North Terrace, Adelaide 5001, Australia

Article history: Accepted 18 October 2011

Keywords: Assessment moderation; Community of practice; Teaching and learning; Quality assurance

Summary

Assessment moderation is a significant component of a quality education system. How this practice is conceptualised, applied to the assessment process and embedded in teaching and learning influences the quality of nurse education programmes. This paper challenges the traditional view that moderation is confined to what happens at the time of assessment, a view evident in the use of language such as pre-moderation and post-moderation practice. It critiques traditional moderation practices such as double marking, applying assessment criteria and standards, and assigning marks and grades, and argues that these practices do not do justice to the complexity of assessment. It calls for a whole of course approach to moderation based on a set of principles encompassing constructive alignment, a community of practice group, the subjective nature of assessment and a reflective quality improvement cycle.

© 2011 Elsevier Ltd. All rights reserved.

Introduction

Higher education institutions are being pressured by governments to improve the quality of teaching and learning. This quality agenda is driven by the requirement of professional and higher education regulating agencies for detailed explanations of moderation practices, that is, how transparent, valid, fair and reliable assessment outcomes are guaranteed (Bloxham, 2009).
Given that "assessment drives student learning" (Ramsden, 2003) and moderation is a process of quality assurance (Miller, 2000), we should bother with assessment moderation in order to instil professional and community confidence in the quality of the institution's programmes and the graduates it produces. There is much in the literature acknowledging and critiquing traditional moderation practices; however, little attention is given to conceptualising moderation as a whole of course approach. How we understand moderation, how we embed moderation into teaching and learning and how we apply moderation to the assessment process are important considerations for a whole of course approach. It was this claim that led to challenging the traditional view that moderation is only what happens at the time of assessment, and to a re-conceptualisation of moderation as a whole of course approach.

⁎ Tel.: +61 8 830 22038; fax: +61 8 830 22168. E-mail address: colleen.smith@unisa.edu.au. doi:10.1016/j.nedt.2011.10.010

Moderation Practices

Whilst moderation occurs at a number of levels, the focus of this article is on what happens at the course/subject/unit level, often referred to as internal moderation. At this level, moderation concerns the learning and teaching practices implemented to ensure transparent, valid, fair and reliable assessment outcomes (Bloxham, 2009). Moderation at the course/subject/unit level is referred to as course moderation throughout the remainder of the article. Traditionally, moderation of courses in higher education has focused on what happens during the assessment process. This has resulted in associating moderation with practices such as double marking, applying assessment criteria and standards, and assigning marks and grades (Sadler, 2005; Yorke et al., 2000; Miller, 2000).
This association is further reflected in the literature, where moderation is defined as "a process for assuring that an assessment outcome is valid, fair and reliable and that marking criteria have been applied consistently" (Bloxham, 2009, p.4). Why bother with moderation if it is limited to what happens at the time of assessment? Whilst it is acknowledged that there are gains in undertaking moderation at this time, when it is coupled with practices outside the time of assessment, transparent, valid, fair and reliable assessment outcomes are improved. This is evident in the work of Mahmud et al. (2010, p.9), who point out that moderation should 'encompass all stages from the planning and operationalisation of assessment design and marking through to the post hoc review of judgements made about students' results or grades'. Rust et al. (2005) also advocate a whole of course approach through the application of a social constructivist position that begins by utilising the principles of constructive alignment (Biggs, 1999) to link objectives, assessment and teaching methods to support student learning. Adopting a whole of course approach makes explicit how the achievement of learning outcomes is assessed, how teaching methods and learning activities prepare students for assessment and how the traditional assessment moderation practices are applied. Without this whole of course approach, the validity and reliability of assessment outcomes are questionable. What are the benefits and controversies of traditional moderation practices such as double marking, assigning marks and grades, and applying assessment criteria and standards? What improvements can be made to these practices to ensure transparent, fair, valid and reliable assessment outcomes, and how can these help inform a whole of course approach?

Double Marking

Double marking is a common form of assessment moderation (Bloxham, 2009).
Some assessors view this practice as important as it provides reassurance that they are on the "right track" (Yorke et al., 2000) and allows assessors to compare their marking and grade allocation with colleagues (Hand and Clewes, 2000). Others claim that it ensures quality and offers fairness and consistency for students (Hand and Clewes, 2000). In contrast, some do not value this form of moderation practice, suggesting that it does not "mean the system is reliable" (Rust, 2007, p.233), as two assessors may have different reasons for awarding similar marks (Rust, 2007). Others regard it as a process that needs to be undertaken to tick appropriate quality assurance boxes (Hand and Clewes, 2000). Earlier work by Partington (1994) suggests that double marking is problematic, particularly when less experienced assessors know the mark awarded by more experienced assessors. In these situations there is a tendency for the less experienced marker to converge towards the mark of the experienced marker (Hand and Clewes, 2000). More recent work expands on Partington's view, highlighting how power relations between experienced and novice assessors influence assessment moderation decisions (Hand and Clewes, 2000; Orr, 2007; Rust, 2007). A further concern is the assumption that the selection and double marking of a small sample of papers is representative of the whole student group and will lead to valid and reliable marking standards (Bloxham, 2009; Rust, 2007). This assumption is problematic, particularly when dealing with large student cohorts. Furthermore, moderation practices, including double marking, do not take into account the complexity of assessment marking and variables such as the intricacies of applying the assessment criteria, the amount of marking undertaken within a particular time frame, marking fatigue, or the level of experience and knowledge of the marker (Bloxham, 2009).
Despite these concerns, double marking does have a number of advantages, particularly when it brings together communities of assessors for the common purpose of reaching a shared understanding (Price, 2005). It allows assessors to discuss and debate the standards and qualities of the assessment in an open, honest way (Hand and Clewes, 2000). It is also a valuable staff development activity, particularly for inexperienced assessors (Bloxham, 2009). However, it should not be the only form of moderation and should be undertaken within a whole of course approach that adopts other forms of assessment moderation.

Assigning Marks and Grades

There are varying practices for assigning marks and grades to assessment items. For example, they may be assigned according to a 100-point scale, a weighting point scale (analytic grading) or holistic grading. When using a point scale method there is the potential for students to receive marks at the upper or lower end of the scale; in reality, however, students rarely receive marks outside the 35–70 range on a 100-point scale (Rust, 2007) unless the assessment item is one where answers are either correct or incorrect (Yorke et al., 2000). Rust questions what it means if a student gets a mark for a particular assessment item, because unless it is linked to some standard or criteria, this mark has little meaning 'about the strengths and weaknesses, knowledge and skills of the student' (Rust, 2007, p.234). Sadler (2009) refers to analytic grading, which involves using a weighting point scale to allocate marks to each assessment criterion. Assessors make separate qualitative judgements about each criterion, with the assigned mark being an aggregate that is converted into a grade. Sadler (2009) goes on to argue that this is a reductionist approach, as it views each criterion in isolation. It also assumes that the sum of the assessment components represents the whole, and fails to consider the assessment item in a holistic way.
Holistic grading involves the marker using criteria to build a mental picture of the student's achievement as they progress through marking the assessment. A qualitative judgement is made about the quality of the work, and this is matched to the relevant standard or point on the grading scale (Sadler, 2009). In this approach, the criteria and standards are not the dominant feature; instead, the assessor's qualitative judgement dominates. Sadler (2009, p.165) goes on to point out the discrepancies between analytic and holistic grading, stating that an assessment item judged "as 'brilliant' overall may not rate as outstanding on each criterion" and vice versa. Bloxham (2009) and Rust (2007) question the reliability of these marking practices and whether marks can be precisely assigned to complex assessment items. They also concur with the earlier views of Hand and Clewes (2000) and Orr (2007) that mark allocation is a victim of hierarchical power struggles. This is evident when less experienced assessors wait to hear from a more experienced marker and then err towards awarding a similar mark rather than having their professional judgement questioned (Rust, 2007; Hornby, 2003). This power relationship results in accepting the views of the more experienced assessors, rightly or wrongly, and the opportunity for meaningful discussion is lost (Rust, 2007; Hand and Clewes, 2000). What is the role of moderation in assigning marks and grades, particularly when assessing higher order thinking that requires scholarly judgement? At this level, reliability and validity issues should be addressed, recognising that if "assessments are unreliable, the validity of any inference drawn from appraisal is weakened" (Sadler, 2009, p.162).

Applying Assessment Criteria and Standards

Assessment criteria and standards are commonly used to assess student learning. They have the potential to motivate and improve learning, along with improving the standard of work students produce (Ecclestone, 2001).
What role does moderation play in the development and interpretation of these criteria and standards, given that assessors hold different views about assessment criteria, resulting in varying interpretations? Some assessors have a clear idea of what is expected from the criteria and assess accordingly; others take an intuitive approach, preferring to use their own guidelines and marking according to their own experience of being marked (Yorke et al., 2000; Hand and Clewes, 2000). These practices create tensions, particularly when students are required to work with common assessment criteria, yet assessors choose their own interpretations (Hand and Clewes, 2000). Ecclestone (2001) proposes two important elements. Firstly, assessors need to actively engage with the development and interpretation of criteria and standards of achievement. Assessment is a subjective activity; thus a shared understanding among assessors about what criteria and standards mean for practice, together with opportunities to engage in discussion, debate and ongoing interaction to share views, will help address the validity and reliability issues in grading (Holroyd, 2000). Secondly, students need to know what they have to do to achieve a certain standard (Ecclestone, 2001). Vague descriptors used to describe standards, such as 'adequate', 'inadequate' and 'good', have different meanings to different assessors and also to students. The challenging task here is for assessors to have an agreed upon understanding of these terms and to explain more precisely to students what 'adequate', 'inadequate' and 'good' work looks like (Hand and Clewes, 2000). This level of transparency will avoid students 'second guessing' the assessment requirements and constantly seeking clarification from their marker so they can write for the marker.
According to Ecclestone (2001, p.311), 'currently, moderation and assessment with students seem not to be seen as parallel process of ongoing socialisation into what 'standards' mean'. Assessors are better prepared to do this with students if they have been involved in the development and interpretation of standards of achievement. Given this critique of traditional moderation, the following practices should be embedded into the principles that drive a whole of course approach:

• ensuring the type and form of assessment is linked to the learning outcomes (Sadler, 2009),
• embedding learning and teaching activities that prepare students for assessment (Rust et al., 2005),
• bringing together assessors to discuss marking and the application of criteria and standards using actual students' work (Price, 2005),
• limiting the number of criteria to be assessed, to increase the potential for agreement and reduce the difficulties in assimilating a large number of standards (Elander, 2002),
• involving assessors in briefing students about the criteria and standards, with students having opportunities to discuss these prior to submitting their work (Price, 2005).

Incorporating these practices into a set of principles that inform a whole of course approach to moderation acknowledges the complexity of the moderation process and helps address issues concerning transparent, valid, fair and reliable assessment outcomes.

A Whole of Course Approach to Assessment Moderation

In higher education, moderation practices continue to focus on what happens at the time of assessment and include practices such as double marking, assigning marks and grades, and developing assessment criteria and standards. This is evident in the use of language such as pre-moderation and post-moderation practices, which implies that moderation is confined to the actual assessment task.
From this observation and a critique of traditional moderation practices, the following set of principles for a whole of course approach emerged.

Assessment is an Essential Component of Course Design

Assessment moderation should be part of the initial course design. Using the principles of constructive alignment, the learning outcomes are determined, assessment tasks are conceptualised and developed to demonstrate the achievement of the outcomes, and learning activities are devised to assist students with the assessment tasks (Biggs, 1999). All these systems need to be aligned, and this alignment should be transparent to students. Students need to know how these systems are linked and how the learning activities will assist them with their assessment. As part of the initial course design, and whilst assessment is being conceptualised, assessment criteria and standards are developed with involvement from the assessors who will be assessing students' work. Good alignment provides a strong incentive for students to actively engage with learning and teaching activities, and their learning is purpose driven (Rust, 2002). Assessors' involvement at this early stage allows an opportunity for discussion and debate about the course design and 'to understand the types and roles of criteria in the teaching and learning cycle' (Elwood and Klenowski, 2002, p.249). By adopting this principle, moderation becomes a whole of course approach rather than being limited to what happens at the time of assessment.

Community of Practice Group

The function of a community of practice group is to achieve a shared set of principles and understanding about moderation. The group focuses on a common goal (Knight, 2002; Holroyd, 2000) and has the 'opportunity to communicate, cooperate and collaborate' (Holroyd, 2000, p.35). However, bringing together a group does not necessarily constitute the formation of a community of practice.
It requires a sense of joint enterprise with a shared understanding of the group's purpose, a level of mutual engagement that evolves and grows through ongoing dialogue, and the sharing of practices and exchange of ideas in a non-confronting environment. It also requires a repertoire of tools and resources with which to reflect on its history and impact on practice (Wenger, 2000). This necessitates input from all staff involved in moderation in developing a shared vision and purpose for the group, including what it means to be part of the group, establishing how the group will function and share tacit knowledge, and how it will interact with systems outside the group and vice versa (Wenger, 2000). Achieving a shared understanding is a complex process, but over time, as the group matures, a system will evolve that guarantees a fairer, more consistent and reliable outcome for students (Dunn and Wallace, 2008).

Acknowledging and Embracing the Subjective Nature of Assessment

Moderation of assessment is a subjective activity which will always require a degree of professional judgement (Sadler, 2005). No two individuals bring to the assessment process the same level of knowledge, understanding and experience (Yorke et al., 2000). A whole of course approach acknowledges this and focuses on strategies for communication among the community of practice group, and between this group and students, about how students' performance will be judged. It involves opportunities for sharing interpretations and for assessors to assimilate their subjective judgements with those of other assessors. Judgements about the criteria and standards by which students are assessed acquire meaning over time as community of practice groups meet and share understanding (Wyatt-Smith et al., 2010; Harlen, 2005). Only then are we able to move towards a process that is fairer, more consistent, reliable and transparent (Sadler, 2009).
Adopting a Reflective Quality Improvement Cycle

Moderation is accepted as a quality improvement process and is one way to ensure students are receiving a high quality education. It requires a whole of course approach to 'improve the quality of the assessment and the course itself' (Miller, 2000, p.255). It must focus on improving learning and teaching processes, including assessment (Houston, 2008). Through a community of practice group, issues about quality can be subjected to critical review and reflection, and it is through this process that quality improvement opportunities arise (Houston, 2008). Through a group process of reflecting on assessment moderation practices, considering what worked well and what did not, and debating the minimum criteria for safe, quality care, we are able to identify and make recommendations for change. These recommendations can then feed forward to the next course offering, and it is through this continuing, spiral process of quality improvement that assessment moderation practices will improve.

Conclusion

Why bother with assessment moderation? As the quality agenda in higher education is at the forefront of ensuring the quality of student learning, it is vital that traditional assessment moderation practices are challenged. By adopting a whole of course approach, we refocus moderation away from what happens only at the time of assessment to a practice that encompasses the principles of constructive alignment, the development of a community of practice group and the adoption of a reflective quality improvement cycle, whilst also acknowledging the subjective nature of assessment. Given the subjective nature of assessment, moderation practices will never be completely free of validity and reliability issues.
However, it is through ongoing discussion and critique within a community of practice group that we can progress towards developing more valid, fair and reliable assessment of students and reassure higher education regulating agencies, the profession and the community of the quality of programmes and the graduates produced.

References

Biggs, J., 1999. Teaching for Quality Learning at University. Society for Research into Higher Education & Open University Press, Buckingham, Philadelphia.
Bloxham, S., 2009. Marking and moderation in the UK: false assumptions and wasted resources. Assessment & Evaluation in Higher Education 34 (2), 209–220.
Dunn, L., Wallace, M., 2008. Intercultural communities of practice. In: Dunn, L., Wallace, M. (Eds.), Teaching in Transnational Higher Education. Routledge, New York, pp. 249–259.
Ecclestone, K., 2001. 'I know a 2:1 when I see it': understanding criteria for degree classification in franchised university programmes. Journal of Further and Higher Education 25 (3), 301–313.
Elander, J., 2002. Developing aspect-specific assessment criteria for examination answers and coursework essays in psychology. Psychology Teaching Review 10 (1), 31–51.
Elwood, J., Klenowski, V., 2002. Creating communities of shared practice: the challenges of assessment use in learning and teaching. Assessment & Evaluation in Higher Education 27 (3), 243–256.
Hand, L., Clewes, D., 2000. Marking the difference: an investigation of the criteria used for assessing undergraduate dissertations in a business school. Assessment & Evaluation in Higher Education 25 (1), 5–21.
Harlen, W., 2005. Teachers' summative practices and assessment for learning – tensions and synergies. Curriculum Journal 16 (2), 207–223.
Holroyd, C., 2000. Are assessors professional? Active Learning in Higher Education 1 (1), 28–44.
Hornby, W., 2003. Assessing using grade-related criteria: a single currency for universities? Assessment & Evaluation in Higher Education 28 (4), 435–454.
Houston, D., 2008.
Rethinking quality and improvement in higher education. Quality Assurance in Education 16 (1), 61–79.
Knight, P., 2002. The Achilles' Heel of quality: the assessment of student learning. Quality in Higher Education 8 (1), 107–115.
Mahmud, S., Sanderson, G., Yeo, S., Briguglio, C., Wallace, M., Hukam-Singh, P., Thuraingsam, T., 2010. Moderation for fair assessment in transnational learning and teaching: literature review. Retrieved August 15, 2011 from http://altc-tnemoderation.wikispaces.com/file/view/ALTCLitReview6April2010.pdf.
Miller, P., 2000. Moderation as a tool for continuous improvement. Retrieved September 5, 2011 from http://www.in-site.co.nz/misc_links/papers/miller255.pdf.
Orr, S., 2007. Assessment moderation: constructing the marks and constructing the students. Assessment & Evaluation in Higher Education 32 (6), 645–656.
Partington, J., 1994. Double-marking students' work. Assessment & Evaluation in Higher Education 19 (1), 57–60.
Price, M., 2005. Assessment standards: the role of communities of practice and the scholarship of assessment. Assessment & Evaluation in Higher Education 30 (3), 215–230.
Ramsden, P., 2003. Learning to Teach in Higher Education. Routledge, London.
Rust, C., 2002. The impact of assessment on student learning. Active Learning in Higher Education 3 (2), 145–158.
Rust, C., 2007. Towards a scholarship of assessment. Assessment & Evaluation in Higher Education 32 (2), 229–237.
Rust, C., O'Donovan, B., Price, M., 2005. A social constructivist assessment process model: how the research literature shows us this could be best practice. Assessment & Evaluation in Higher Education 30 (3), 231–240.
Sadler, D.R., 2005. Interpretations of criteria-based assessment and grading in higher education. Assessment & Evaluation in Higher Education 30 (2), 175–194.
Sadler, D.R., 2009. Indeterminacy in the use of preset criteria for assessment and grading. Assessment & Evaluation in Higher Education 34 (2), 159–179.
Wenger, E., 2000.
Communities of practice and social learning systems. Organization 7 (2), 225–246.
Wyatt-Smith, C., Klenowski, V., Gunn, S., 2010. The centrality of teachers' judgement practice in assessment: a study of standards in moderation. Assessment in Education: Principles, Policy & Practice 17 (1), 59–75.
Yorke, M., Bridges, P., Woolf, H., 2000. Mark distributions and marking practices in UK higher education. Active Learning in Higher Education 1 (1), 7–27.