Committing to quality learning through adaptive online assessment

Di Challis
Deakin University
diana@deakin.edu.au

Abstract: Assessment lies at the heart of the university undergraduate experience. It defines for students their sense of what is significant in their study and how they will prioritise their time and effort. Yet there is a forceful and compelling literature on assessment which claims that much current practice in the higher education sector encourages surface learning, where the extrinsic motivation is to focus on disaggregated, selected details of content rather than intrinsically to seek deep understanding through constructing knowledge (see, eg, Ramsden, 1992). Further, there is now a more widely shared view within the sector that assessment should fulfil a plurality of purposes and that students need opportunities to self- and peer-assess as part of a quality learning experience. With advances in computer-based technologies and the emergence of e-learning, there are unprecedented opportunities to reconsider assessment of learning (and, axiomatically, of teaching) and how this can be undertaken. One approach is adaptive assessment. Although it has existed in the tertiary environment since the time of the oral examination, advanced technologies allow much fuller exploitation of the possibilities inherent in a dynamic system of testing that responds to the user. Having described the characteristics of adaptive assessment, this paper considers how it can achieve significant pedagogical aims within the sector. The paper differentiates between adaptive assessment to assist learning and adaptive assessment to assess achievement. How adaptive assessment can be put in place, and salient issues such as security and system integrity when such assessment is used for credit, are then discussed.

Keywords: adaptive assessment, online testing, security

Introduction

As Brown and Knight (1994, p.12) contend, “it is not the curriculum which shapes assessment, but assessment which shapes the curriculum and embodies the purposes of higher education”. Yet, too often for complacency, let alone comfort, assessment is poorly understood and hence inadequately conceived. A further troubling reality is that, beyond the design of the test instrument, decisions about the ways in which students have responded, and the results that will or will not lead to their successful completion of their study, are frequently relegated to junior members of faculty, with consequent concerns about reliability.

Conventional summative assessment, as epitomised by examinations whose questions invite regurgitation of information, much of which is rote learnt and readily forgotten, is unlikely to provide students with ways to demonstrate their knowledge and understanding, or to produce graduates who can readily meet the professional demands of their chosen discipline in the twenty-first century. With greater attention given to the purposes and practices of assessment in higher education over the last decade, there is now a more widely shared view within the sector that assessment needs to fulfil a plurality of purposes and that students need opportunities within the process to self- and peer-assess as part of their learning experience. Certainly, the view that assessment in the higher education context is solely a judgement of academic worth through the provision of a score at the completion of a program of study is now widely regarded as a serious oversimplification of a complex matter.
In the Introduction to Assessing Learning in Universities, Nightingale et al (1996, p.6) point to three pressures for change: the desire to develop – and consequently to assess – a much broader range of student abilities; the desire to harness the full power of assessment and feedback in support of student learning; and the belief that “education should lead to a capacity for independent judgement and an ability to evaluate one’s own performance”, contending that “these abilities can only be developed through involvement in the assessment process”.

With the mainstreaming of online courses, there are unprecedented opportunities to respond positively to such pressures for change, to reconsider assessment of learning (and, axiomatically, of teaching) and to explore new possibilities and harness new opportunities. This view is supported by others. For instance, in the Australian Universities Teaching Committee funded Assessing Learning in Australian Universities, James et al (2002, p.4) contended that online assessment offered an “unparalleled opportunity for rethinking assessment in higher education”. In 1999, in their conclusion to the text Computer-Assisted Assessment, Brown et al averred:

The capabilities of computers to deliver unique assessments should and will be exploited. The power of networks to transfer, upload and download data automatically should also be exploited. The benefits to student learning of receiving specific, timely and encouraging feedback are utilised by only a few. The advantages of detailed feedback on student and group performance, delivered rapidly enough to allow academics to deal with student misconceptions during a module and enhancing student learning, are yet to be achieved (p.197).

As Blacharski (2001) claims, adaptive testing has been around longer than computers, with Alfred Binet, inventor of the IQ test, creating the first one in the late 1800s. In a sense, adaptive testing in the tertiary environment also has a long history, as it has existed since the time of the oral examination, and responding adaptively to students’ responses is a recognised and customary teaching strategy. However, although major IT vendors and global educational testers are increasingly moving towards adaptive computer-based testing, it is foreign to the experience of the vast majority of undergraduates, not only those whose assessment is by customary written form but also those whose assessment is, at least in part, through the Internet. This paper considers but one way in which computer capability can be exploited to benefit education: adaptive testing using online technologies, with the discussion restricted to undergraduate study.

What characterises adaptive assessment?

Adaptive assessment is most frequently seen as synonymous with adaptive testing (Ray, 2001). For the purposes of this discussion, ‘adaptive assessment’ will be construed as computer-aided adaptive testing. The distinctive element of adaptive testing is that, as the student responds to test items, the items change to reflect the performance on the preceding item(s). Any adaptive test is therefore designed on the requirement that the level of the user is constantly established by statistical means. Commencing at a moderate level, items become more difficult or complex, or easier, as the user answers questions, depending on the demonstrated level of the answers. The score is derived not from the number of correct answers but, rather, from the level of difficulty of the questions answered correctly.
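To make this item-selection logic concrete, the following sketch (in Python, purely for illustration) steps a candidate through a bank of items, choosing each question near the current estimate of ability and raising or lowering that estimate according to the response. The item bank, difficulty scale, step sizes and stopping rule are assumptions made for the example rather than features of any particular testing product.

```python
import random

# Illustrative item bank: each item has a difficulty on a 1 (easy) to 10 (hard) scale.
# In a real system the bank would be large and the difficulties statistically calibrated.
ITEM_BANK = [{"id": i, "difficulty": random.randint(1, 10)} for i in range(200)]

def adaptive_test(answer_item, start_level=5, max_items=20):
    """Run a simple adaptive test.

    answer_item: callback that presents an item and returns True if the
    candidate answers it correctly (stands in for the real delivery engine).
    """
    ability = start_level          # the test begins at a moderate level
    used = set()
    highest_correct = 0

    for _ in range(max_items):
        # Select the unused item whose difficulty is closest to the current estimate.
        candidates = [it for it in ITEM_BANK if it["id"] not in used]
        if not candidates:
            break
        item = min(candidates, key=lambda it: abs(it["difficulty"] - ability))
        used.add(item["id"])

        if answer_item(item):
            highest_correct = max(highest_correct, item["difficulty"])
            ability = min(10, ability + 1)   # correct answer: move to harder items
        else:
            ability = max(1, ability - 1)    # incorrect answer: move to easier items

    # The reported result reflects the level of difficulty mastered,
    # not the raw number of correct answers.
    return highest_correct

# Example: a hypothetical candidate who can reliably answer items up to difficulty 7.
print("Estimated proficiency level:", adaptive_test(lambda item: item["difficulty"] <= 7))
```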
Such tests can be designed to terminate when a student has reached a required level of proficiency or competency, or when performance has reached the user’s highest level. A further possibility, for self- or group testing, is that the user(s) can specify an entry level and the sequencing of the test items will adapt to that.

Arguably the greatest strength of computer-adaptive testing is that it is responsive to the user. Foster (2001) offers a useful analogy by comparing the experience of taking a computerised adaptive test (CAT) with participating in a high-jump event. The high-jumper, regardless of ability, quickly reaches a challenging level where there is about an equal chance of clearing the bar or knocking it down. The ‘score’ is the last height that was successfully cleared and is earned without having to jump all possible lower heights or try the higher levels.

It is important to differentiate between online assessment tools. Those readily delivered through learning management systems such as WebCT and Blackboard, and tools such as QUIZIT, although having the significant advantages of collation and analysis tools and immediate feedback of results, are fixed rather than dynamic, for they are not adaptive. Where there is a form of adaptive testing (as with WebCT Vista, for example) it is based on the summative score rather than on the response to individual questions and their perceived level of difficulty.

It is important, also, to differentiate between adaptive testing that is used for certification, to assess individual achievement and to rank (ie summatively), and adaptive testing that is used to assist learning (ie formatively). In the latter sense the term focuses more on customising the testing to inform learning and is part of the developmental process. As Gouli et al (2002) point out, this can be used by both the learner and the teacher: learners can observe their personal learning progress and decide how further to direct their learning, and tutors can individually support learners and formulate judgements about the quality and effectiveness of the instruction provided. In this case adaptive testing can provide important diagnostic tools that contribute meaningfully to learning.

The benefits of adaptive testing

In summative assessment

The most frequently claimed benefit of computerised adaptive testing is efficiency, as the level of attainment can be determined with fewer questions. As Gouli et al (2002) claim, adaptive assessment is less tedious for those undertaking the testing because the generated questions are tailored to their level of proficiency, so they do not have to attempt numerous questions that are either too easy or too difficult for them. They contend that research shows that, as well as being more efficient, adaptive testing establishes a more accurate estimation of the learner’s level of proficiency. This is of unquestioned value and is the hallmark of summative assessment, which “produces a measure which sums up someone’s achievement and which has no other real use except as a description of what has been achieved” (Brown & Knight, 1994, p.15).

As Brown et al (1999, p.32) recognise, in the context of higher education it is important to identify and differentiate higher-order cognitive skills, and to test for these. It is reductionist to regard objective testing as solely gauging factual recall, for it is readily possible to test a wider range of learning outcomes such as analysis, synthesis, evaluation and application.
As Dempster (1999) claims, the adoption of more flexible question formats and forms of feedback is a strong incentive to rethink computer-based assessment, as well-designed tests can facilitate deeper probing of students’ knowledge as well as their understanding of concepts. Illustrations of such approaches are case study scenarios where information is provided from multiple perspectives and solutions can be developed by building up complex relationships and flow processes on screen. While James et al (2002, p.24) correctly point out that online examinations are likely to require more time and effort than conventional pen and paper examinations, they also recognise that computers offer the potential to present students with more complex scenarios through the use of interactive resources (images, sound, simulation). They do not consider adaptive testing, which undeniably requires considerably enhanced skills to design effective test instruments. With the rapid increase in systems providing adaptive testing using a variety of methods (see, eg, Gouli et al, 2002), such testing is likely to become far more viable in the next few years. While computer-based testing is inappropriate in some disciplines and for evidencing skills in some areas, where the intention is to test skills in core areas it is not fanciful to see consortia employing experts to prepare banks of such tests that can be used nationally and, with global enrolments, internationally. This would do much to improve confidence in the reliability of assessment and the comparability of standards.

In formative assessment

Conventional means of assessing student learning are typically invitations to students to conceal and camouflage areas where they lack knowledge as they strive to present the best possible case. Students are understandably loath to reveal any perceived weaknesses and, often, those they have failed to identify are forced to their attention by questions they feel incapable of answering adequately. When this occurs during a face-to-face encounter with mentor and peers, the social consequences of revealed inadequacy and failure militate against effective learning. The clear advantage of online testing as part of formative assessment is that it takes place in a private space and, in most cases, can be accessed at a time the student feels is appropriate. Where the test allows students to pursue areas of perceived weakness as well as to affirm areas of strength, the learning becomes more useful. This is not to contend, however, that all students will respond positively to online quizzes and their ilk, because that will depend on many variables, not the least of which is the test itself. What it does point to is the potential for educators to recognise the different opportunities online assessment offers and to design courses so that students receive immediate feedback on how successfully they understand and apply key concepts.

Self-assessment has a critical role in assisting students to see their work as an ongoing source of learning. Where the emphasis is on effective learning, there is a clear connection with maturation into a learner who can form a reasonable judgement about the extent to which they have met certain criteria and standards and be proactive in response, assisting their development towards independent and autonomous learning. In this process, self-assessment reinforces its significance by becoming a major source of the commitment to improve (Loacker, 2000).
As Boud (1986) claimed over a decade ago, “The development of skills in self-assessment lies at the core of higher education, and as teachers we should be finding whatever opportunities we can to promote self-assessment in the courses we teach”. While there are multiple approaches to the provision of opportunities for self-assessment, the Internet has huge potential in this regard. Probably the most widely recognised is that it allows students to undertake quizzes and online exercises to test their level of knowledge, skills and/or understanding. While, typically, this is done after a learning activity as a review, it also has an important role as a prelude to a specific area of study.

There are significant efficiencies for educators, as such testing can be done away from the limited direct teaching time and, once in place, similar tasks can be repeated with other cohorts with minimal adjustment. This is because issues of security are essentially irrelevant: the priority is for students to gain self-knowledge and, in a culture where this is valued, it is pointless and self-defeating to cheat oneself. A further efficiency is that, where gaps and weaknesses are exposed, remediation can be provided through embedded feedback and assistance, which can be prepared once and then accessed as frequently as needed without drawing on lecturers’ time and involving them in what is typically base-level repetitive instruction. A significant advantage is that students can specify the level at which they wish to work (eg entry, intermediate or advanced) and, with adaptive testing, there is a reduced need for repetitive drill before students can be confident they have the requisite level of competency. While there are commonly general time constraints related to the sequencing of the teaching and its formal assessment, within this construct students appreciate the opportunity to work at their own pace and level, with the opportunity to repeat and review as often as they wish.

It is important, however, to differentiate explicitly between online formative assessment that is solely intended for student feedback and that which has the dual role of informing educators. Even when the former is the case, major learning management systems, such as WebCT, Blackboard and Learning Space, routinely provide such revealing data as the time spent in accessing such tests and, for ethical reasons, students should be aware of this and of the use to which such information may be put. Any way in which the private space of the student is jeopardised will limit one important value of such assessment: the freedom to explore areas of perceived weakness and to make mistakes without revealing these to those responsible for the final assessment, or to peers, unless this is a deliberate decision on the student’s part. In the latter case, computer-generated statistics of usage and patterns of responses are seamlessly collected and can be a highly informative and important diagnostic tool for faculty. Standard reports provide statistics on the number of sessions started and finished, duration, low and high scores, mean scores and standard deviations, with analysis of individual test items generating statistics on difficulty, effectiveness of discriminators, standard deviation and the frequency rates for answer choices within questions.
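As an indication of the kind of item analysis on which such reports draw, the short sketch below computes a classical difficulty index and a simple upper-lower group discrimination measure from a matrix of recorded responses. The sample data and the particular statistics chosen are illustrative assumptions, not the reporting format of any specific learning management system.

```python
from statistics import mean, pstdev

# Illustrative response matrix: rows are candidates, columns are items,
# 1 = correct, 0 = incorrect. Real systems collect this automatically.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]

totals = [sum(row) for row in responses]
# Candidates ranked from highest to lowest total score.
ranked = sorted(range(len(responses)), key=lambda i: totals[i], reverse=True)
half = len(ranked) // 2

for item in range(len(responses[0])):
    scores = [row[item] for row in responses]

    # Difficulty index: proportion of candidates answering the item correctly.
    difficulty = mean(scores)

    # Simple discrimination: difference in facility between the top and bottom
    # halves of candidates ranked by total score (upper-lower group method).
    upper = mean(responses[i][item] for i in ranked[:half])
    lower = mean(responses[i][item] for i in ranked[-half:])
    discrimination = upper - lower

    print(f"Item {item + 1}: difficulty={difficulty:.2f}, "
          f"discrimination={discrimination:.2f}, sd={pstdev(scores):.2f}")
```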
The issue of security

Candidate validity

A critical aspect of delivering any assessment that counts towards a student’s final score is ensuring that the designated candidate is completing the assessment task and is not using any unstipulated resources. This is an issue facing all assessment. Making an assessment available via an online system is relatively straightforward, with products such as WebCT, Blackboard and Learning Space providing the tools and the engines for delivery. What is missing is candidate validation. In a controlled environment, candidates have their identity validated by the staff responsible for that environment, who are also responsible for ensuring that the assessment is taken under appropriate conditions and that the candidate does not cheat. By using centres which require authorisation codes, a system can be set up relatively easily whereby both the centre staff and the candidate log on and provide authorisation codes, a vital step towards ensuring validity. The management of the actual test can readily be done by the person or group responsible for the assessment; the testing centre is responsible only for validating candidate identity and assessment conditions.

Divulging of content

One of the main advantages of online delivery of courses is the flexibility it provides. If the aim is to allow candidates to take the test at a time of their choosing, albeit within a defined window of opportunity, what steps can be taken to protect the content so that it cannot be divulged by the first person to take the assessment? As any given test is made up of a subset of items drawn from a large pool of questions, passing the examination simply by knowing the test items themselves would require memorising a large number of questions to ensure adequate coverage. This can be compounded by similar questions and slight changes of detail that would defeat the usefulness of rote learning of answers. Adaptive testing makes cheating in this way even less likely. As discussed above, a clear advantage of adaptive testing for formative assessment, where no score is recorded, is that it is self-defeating to know beforehand what the test items will be.

Intellectual property (IP) protection

There has always been a risk that, in the process of getting the assessment from the test developer to the candidate and back again, it could be intercepted and copied. While the use of the Internet provides flexibility of delivery, it also exposes vulnerability in the systems, many of which are public, through which the communication travels. This risk is minimised by encryption and digital certificate technologies. While encryption can be used to ensure that information on the channel cannot be read, there is also the need to establish trust in the end-to-end communication, so that the institution running the test can be confident that the system accessing it is the testing centre, and the testing centre can be confident that it has connected to the institution and not to someone pretending to be that institution. The use of these digital technologies should ensure not only that communication is from the sender it claims to be from, but also that the communication has not been tampered with.
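As one indication of how such end-to-end trust might be established in practice, the sketch below uses mutual TLS, in which the testing centre verifies the institution’s certificate and presents its own. The host name and certificate file names are illustrative assumptions only, and real systems may establish this trust in other ways.

```python
import socket
import ssl

# Illustrative host and certificate file names - assumptions for this sketch only.
INSTITUTION_HOST = "assessment.example.edu"
CA_BUNDLE = "trusted_ca.pem"          # certificate authority the testing centre trusts
CENTRE_CERT = "testing_centre.pem"    # certificate identifying the testing centre
CENTRE_KEY = "testing_centre.key"

# The testing centre verifies that it is really talking to the institution...
context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=CA_BUNDLE)
# ...and presents its own certificate so the institution can verify the centre in turn.
context.load_cert_chain(certfile=CENTRE_CERT, keyfile=CENTRE_KEY)

with socket.create_connection((INSTITUTION_HOST, 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=INSTITUTION_HOST) as tls_sock:
        # All traffic on tls_sock is now encrypted, and both ends have authenticated
        # each other before any test content is exchanged.
        tls_sock.sendall(b"GET /session/start HTTP/1.1\r\nHost: "
                         + INSTITUTION_HOST.encode() + b"\r\n\r\n")
        print(tls_sock.recv(1024))
```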
System integrity

With added complexity comes added risk: central servers, public networks and testing centre networks all need to function properly. One of the noted areas of vulnerability is the propensity for computers to ‘crash’ at inopportune times and, in this regard, students who undertake online assessment for credit in their own locations are at far greater risk than those at a testing centre, with the more sophisticated companies providing 24x7 worldwide phone and online support for technical and operational issues. This risk can, however, albeit at a cost, be minimised by strategies such as clustered servers, Internet server farms, redundant network components and multiple testing servers.

****

Will the steps outlined above inevitably ensure the integrity of the assessment process? No more so than the methods used to ensure the integrity of all assessment processes. Problems with the security of examination papers and cheating during examinations still exist, and the increased security measures (such as increased surveillance, photographic security IDs and external security firms managing the delivery of print examination papers) are testimony to this concern. There is a touch of irony in the fact that it is Internet resources that are revealing the extent of plagiarism in traditional forms of assessment, such as essays. Cheating is a sector-wide problem and, while online assessment is not immune, using a testing centre should mean that cheating, whether by having someone else take the assessment or by using materials that are not permitted, would require a much higher level of collusion than if candidates were simply able to log on to the assessment Internet site. Further, such centres can provide a high level of certainty that the system will deliver as intended and, in the rare cases when it does not, support for the students affected.

Conclusion

Returning to the three pressures for change cited at the commencement of this paper (Nightingale et al, 1996), as this discussion has aimed to show, assessment using the Internet (most especially, adaptive assessment) can assess a broad range of student abilities and hence assist their development; it can provide significant feedback and be a valuable tool to support student learning; and it supports self-assessment. Self-assessment can be incorporated into course design in a multitude of ways, and its enhancement of skills to appraise one’s progress accurately and its promotion of active and purposeful learning should mandate its inclusion. While there is potential for adaptive summative assessment, because of issues of security its greatest promise at the moment seems to be in the formative area. Educators need to see online testing (including, where appropriate, adaptive testing) as an enabling tool that can provide students with enriched opportunities to learn in their own time and space, and to exploit more fully its potential to adapt to their needs. Notions concerning the collaborative, dynamic and evolving nature of learning are nurtured when the Internet is exploited as a tool to engage thinking and support the construction of rich, complex understanding, rather than being reduced merely to a powerful store and transmitter of information. In this way the goals that Brown et al (1999) set for computer-based learning will be closer to being achieved: online assessments can be unique; students can access specific, timely and useful feedback; and this feedback can make a significant contribution to their learning and can also be used to inform teaching. The capability is undeniably there.
It is up to educators within the higher education system to exploit it appropriately, as a viable approach to assessment and as a contributor to quality learning.

Acknowledgement: The author expresses her thanks to P. Brent Challis for his advice on the use of online adaptive testing within the corporate domain.

References

Blacharski, D. (2001) “Computerized Adaptive Testing”, Certification News, viewed 3 Oct. 2003, <http://www.itworld.com/nl/cert_news/02052001/>.

Boud, D. (1986; 1991 edn) Implementing Student Self-Assessment, HERDSA, Campbelltown.

Brown, S. & Knight, P. (1994) Assessing Learners in Higher Education, Kogan Page, London.

Brown, S. et al (1999) Computer-Assisted Assessment in Higher Education, Kogan Page, London.

Dempster, J. (1999) “Web-based assessment software: Fit for purpose or squeeze to fit?”, Interactions, Vol 2, No 3, viewed 3 Oct. 2003, <http://www.warwick.ac.uk/ETS/interactions/vol2no3/dempster.htm>.

Foster, D. (2001) “Adaptive Testing”, Exam and Testing Procedures, viewed 8 Aug. 2003, <http://www.microsoft.com/traincert/downloads/adapt.doc>.

Gouli, E., Papanikolaou, K. & Grigoriadou, M. (2002) “Personalising Assessment in Adaptive Educational Hypermedia Systems”, viewed 20 Aug. 2003, <http://hermes.di.uoa.gr/lab/CVs/papers/gouli/ah-2002.pdf>.

James, R. et al (2002) Assessing Learning in Australian Universities, Centre for the Study of Higher Education, The University of Melbourne and the Australian Universities Teaching Committee, Canberra.

Loacker, G. (ed) (2000) Self-Assessment at Alverno College, Alverno College Institute.

Nightingale, P. et al (1996) Assessing Learning in Universities, University of New South Wales Press, Sydney.

Ramsden, P. (1992) Learning to Teach in Higher Education, Routledge, London.

Ray, R. (2001) “Artificially Intelligent Adaptive Instruction: What is Adaptive Instruction?”, viewed 15 Sept. 2003, <http://www.psych-ai.com/AdapEval2.html>.