Committing to quality learning through adaptive online assessment
Di Challis
Deakin University
diana@deakin.edu.au
Abstract: Assessment lies at the heart of the university undergraduate experience. It defines for students their sense of
what is significant in their study and how they will prioritise their time and effort. Yet there is a forceful and compelling
literature on assessment that claims much current practice in the higher education sector encourages surface learning
where the extrinsic motivation is to focus on disaggregated selected details of content rather than intrinsically to seek deep
understanding through constructing knowledge (see, eg, Ramsden, 1992). Further, there is now a more widely shared view
within the sector that assessment should fulfil a plurality of purposes and that students need opportunities to self- and peer-assess as part of a quality learning experience.
With advances in computer-based technologies and the emergence of e-learning, there are unprecedented opportunities to
reconsider assessment of learning (and, axiomatically, of teaching) and how this can be undertaken. One approach is
adaptive assessment. Although it has existed in the tertiary environment since the time of the oral examination, advanced
technologies allow much fuller exploitation of the possibilities inherent in a dynamic system of testing that responds to the
user.
Having described the characteristics of adaptive assessment, this paper will consider how it can achieve significant
pedagogical aims within the sector. The paper will differentiate between adaptive assessment to assist learning and
adaptive assessment to assess achievement. How adaptive assessment can be put in place, and salient issues such as
security and system integrity when such assessment is used for credit, will then be discussed.
Keywords: adaptive assessment, online testing, security
Introduction
As Brown and Knight (1994, p12) contend, “it is not the curriculum which shapes assessment, but assessment
which shapes the curriculum and embodies the purposes of higher education”. Yet, too often for complacency,
let alone comfort, assessment is poorly understood and hence inadequately conceived. A further troubling
reality is that, beyond the design of the test instrument, judgements about how students have responded, and hence
about the results that will or will not lead to successful completion of their study, are frequently relegated to
junior members of faculty, with consequent concerns about reliability.
Conventional summative assessment, as epitomised by examinations where the questions invite regurgitation of
information, much of which is rote learnt and readily forgotten, is unlikely to provide students with ways to
demonstrate their knowledge and understanding, or to produce graduates who can readily meet the professional
demands of their chosen discipline in the twenty-first century. With greater attention given to the purposes and
practices of assessment in higher education over the last decade, there is now a more widely shared view within
the sector that assessment needs to fulfil a plurality of purposes and that students need opportunities within the
process to self- and peer-assess as part of their learning experience. Certainly, the view that assessment in the
higher education context is solely a judgement of academic worth by the provision of a score at the completion
of a program of study is now widely regarded as a serious oversimplification of a complex matter.
In the Introduction to Assessing Learning in Universities, Nightingale et al (1996, p6) point to three pressures for
change:
• the desire to develop – and consequently to assess – a much broader range of student abilities;
• the desire to harness the full power of assessment and feedback in support of student learning;
• the belief that “education should lead to a capacity for independent judgement and an ability to evaluate
one’s own performance”, contending that “these abilities can only be developed through involvement in
the assessment process”.
With the mainstreaming of online courses, there are unprecedented opportunities to respond positively to such
pressures for change: to reconsider assessment of learning (and, axiomatically, of teaching) and to explore new
possibilities and harness new opportunities. This view is supported by others. For instance, in the Australian
Universities Teaching Committee funded Assessing Learning in Australian Universities (2002, p.4), James et al
contended that on-line assessment offered an “unparalleled opportunity for rethinking assessment in higher
education”. In 1999, in their conclusion to the text Computer-Assisted Assessment, Brown et al averred:
The capabilities of computers to deliver unique assessments should and will be exploited. The power of
networks to transfer, upload and download data automatically should also be exploited. The benefits to
student learning of receiving specific, timely and encouraging feedback are utilised by only a few. The
advantages of detailed feedback on student and group performance, delivered rapidly enough to allow
academics to deal with student misconceptions during a module and enhancing student learning, are yet
to be achieved (p.197).
As Blacharski (2001) claims, adaptive testing has been around longer than computers, with Alfred Binet, inventor
of the IQ test, creating the first one in the late 1800s. In a sense, adaptive testing in the tertiary environment also
has a long history, as it has existed since the time of the oral examination, and responding adaptively to students’
responses is a recognised and customary teaching strategy. However, although major IT vendors and global
educational testers are increasingly moving towards adaptive computer-based testing, it is foreign to the
experience of the vast majority of undergraduates: not only to those whose assessment is by customary written
form but also to those whose assessment is, at least in part, through the Internet.
This paper considers but one way in which computer capability can be exploited to benefit education, adaptive
testing using online technologies, with the discussion restricted to undergraduate study.
What characterises adaptive assessment?
Adaptive assessment is most frequently seen as synonymous with adaptive testing (Ray, 2001). For the purposes
of this discussion ‘adaptive assessment’ will be construed as computer-aided adaptive testing.
The distinctive element of adaptive testing is that, as the student responds to test items, the items change to
reflect performance on the preceding item(s). This means that an adaptive test relies on a statistical method that
constantly re-estimates the level of the user. Commencing at a moderate level, items become more difficult or
complex, or easier, as the user answers questions, depending on the demonstrated level of each answer. The
score is derived not from the number of correct answers but, rather, from the level of difficulty of the questions
answered correctly. Such tests can be designed to terminate when a student has reached a required level of
proficiency or competency, or when performance indicates that the user’s highest level has been reached. A
further possibility, for self- or group testing, is that the user(s) can specify an entry level to which the
sequencing of the test items will adapt. Arguably the greatest strength of computer-adaptive testing is
that it is responsive to the user.
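To make the mechanism concrete, the following minimal sketch (in Python) implements the difficulty-ladder logic just described. The item bank, level range, stopping rule and the ask callback are illustrative assumptions, not features of any particular testing product.

```python
import random

# Hypothetical item bank: difficulty levels 1 (easiest) to 5 (hardest),
# each holding a pool of items at that level.
ITEM_BANK = {level: [f"question-{level}-{i}" for i in range(20)]
             for level in range(1, 6)}

def run_adaptive_test(ask, entry_level=3, max_items=15):
    """Administer up to max_items questions, moving up a level after a
    correct answer and down after an incorrect one. The score is the
    highest level answered correctly, not the number of correct answers."""
    level = entry_level                 # the user may specify an entry level
    highest_cleared = 0
    for _ in range(max_items):
        item = random.choice(ITEM_BANK[level])
        if ask(item):                   # callback: present item, return True/False
            highest_cleared = max(highest_cleared, level)
            if level == max(ITEM_BANK):
                break                   # performance has reached the top level
            level += 1                  # next item is harder
        else:
            level = max(min(ITEM_BANK), level - 1)   # next item is easier
    return highest_cleared
```

A summative variant would terminate once a required proficiency level had been demonstrated; a formative variant could let the user or group set entry_level, as noted above.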
Foster (2001) offers a useful analogy by comparing the experience of taking a computerized adaptive test (CAT)
with participating in a high-jump event. The high-jumper, regardless of ability, quickly reaches a challenging
level where there is about an equal chance of clearing the bar or knocking it down. The ‘score’ is the last height
that was successfully cleared, earned without having to jump every lower height or to attempt every higher
level.
It is important to differentiate between online assessment tools. Those readily delivered through learning
management systems such as WebCT and Blackboard and tools such as QUIZIT, although having the significant
advantages of collation and analysis tools and immediate feedback of results, are fixed, rather than dynamic, for
they are not adaptive. Where there is a form of adaptive testing (such as with WebCT Vista, for example) it is
based on the summative score rather than on responses to individual questions and their perceived level of difficulty.
It is important, also, to differentiate between adaptive testing that is used for certification, to assess individual
achievement and to rank (ie summatively), and adaptive testing that is used to assist learning (ie formatively). In
this latter sense the term focuses more on customising the testing to inform learning and is part of the
developmental process. As Gouli et al (2002) point out, this can be used by both the learner and the teacher:
learners can observe their personal learning progress and decide how further to direct their learning process, and
tutors can individually support learners and formulate judgements about the quality and effectiveness of the
provided instruction. In this case adaptive testing can provide important diagnostic tools that contribute
meaningfully to learning.
The benefits of adaptive testing
In summative assessment
The most frequently claimed benefit of computerised adaptive testing is efficiency, as the level of
attainment can be determined with fewer questions. As Gouli et al (2002) claim, adaptive assessment is less
tedious for those undertaking the testing because the generated questions are tailored to their level of proficiency
and so they do not have to attempt numerous questions that are either too easy or too difficult for them. They
contend that research shows that, as well as being more efficient, adaptive testing establishes a more accurate
estimation of the learner’s level of proficiency. This is of unquestioned value and is the hallmark of summative assessment
which “produces a measure which sums up someone’s achievement and which has no other real use except as a
description of what has been achieved” (Brown & Knight, 1994, p15).
As Brown et al (1999, p32) recognise, in the context of higher education it is important to identify and
differentiate higher-order cognitive skills, and to test for these. It is reductionist to regard objective testing as
solely gauging factual recall, for it is readily possible to test a wider range of learning outcomes such as analysis,
synthesis, evaluation and application. As Dempster (1999) claims, the adoption of more flexible question
formats and forms of feedback is a strong incentive to rethink computer-based assessment, as well-designed
tests can facilitate deeper probing of students’ knowledge as well as of their understanding of concepts.
Illustrations of such approaches are case study scenarios where information is provided from multiple
perspectives and solutions can be developed by building complex relationships and flow processes on screen.
While James et al (2002, p.24) correctly point out that on-line examinations are likely to require more
time and effort than conventional pen and paper examinations, they also recognise that computers offer the
potential to present students with more complex scenarios through the use of interactive resources (images,
sound, simulation). They do not consider adaptive testing, which undeniably requires considerably enhanced
skills to design effective test instruments.
With the rapid increase in systems to provide adaptive testing using a variety of methods (see, eg, Gouli et al,
2002), such approaches are likely to become far more viable in the next few years. While computer-based testing is
inappropriate in some disciplines and for evidencing skills in some areas, where the intention is to test skills in
core areas it is not fanciful to see consortia employing experts to prepare banks of such tests that can be used
nationally and, with global enrolments, internationally. This would do much to improve confidence in reliability
of assessment and comparability of standards.
In formative assessment
Conventional means of assessing student learning are typically invitations to students to conceal and camouflage
areas where they lack knowledge as they strive to present the best possible case. Students are understandably
loath to reveal any perceived weaknesses and, often, those they have failed to identify are forced to their
attention by questions they feel incapable of responding to adequately. When this occurs during a face-to-face
encounter with mentor and peers, the social consequences of revealed inadequacy and failure militate
against effective learning. The clear advantage of online testing as part of formative assessment is that it occurs in a
private space and, in most cases, can be accessed at a time the student feels is appropriate. Where the test
allows students to pursue areas of perceived weakness as well as to affirm areas of strength, the learning
becomes more useful. This is not to contend, however, that all students will respond positively to online quizzes
and their ilk because that will depend on many variables, not the least of which is the test itself. What it does
point to is the potential for educators to recognise the different opportunities online assessment offers and to
design courses to facilitate students receiving immediate feedback on how successfully they understand and
apply key concepts.
Self-assessment has a critical role in assisting students to see their work as an ongoing source of learning. Where
the emphasis is on effective learning, there is a clear connection with maturation into a learner who can
form a reasonable judgement about the extent to which certain criteria and standards have been met and be
proactive in response, assisting their development towards independent and autonomous learning. In this
process, self-assessment reinforces its significance by becoming a major source of the commitment to improve
(Loacker, 2000). As Boud (1986) claimed over a decade ago, “The development of skills in self-assessment lies
at the core of higher education, and as teachers we should be finding whatever opportunities we can to promote
self-assessment in the courses we teach”. While there are multiple approaches to the provision of opportunities
for self-assessment, the Internet has huge potential in this regard.
Probably the most widely recognised use is that it allows students to undertake quizzes and online exercises to test
their level of knowledge, skills and/or understanding. While, typically, this is done after a learning activity as a
review, it has an important role as a prelude to a specific area of study. There are significant efficiencies for
educators as such testing can be done away from the limited direct teaching time and, once in place, similar tasks
can be repeated with other cohorts with minimal adjustment. This is because issues of security are essentially
irrelevant as the priority is for students to gain self-knowledge and, in a culture where this is valued, it is
pointless and self-defeating to cheat oneself. A further efficiency is that, where gaps and weaknesses are
exposed, remediation can be provided through embedded feedback and assistance which can be prepared once
and then accessed as frequently as needed without drawing on lecturers’ time and involving them in what is
typically base level repetitive instruction. A significant advantage is that students can specify the level at which
they wish to work (eg entry, intermediate or advanced) and, with adaptive testing, there is a reduced need for
repetitive drill before students can be confident they have the requisite level of competency. While there are
commonly general time constraints related to the sequencing of the teaching and its formal assessment, within
this construct students appreciate the opportunity to work at their own pace and level, with the opportunity to
repeat and review as often as they wish.
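As a hedged illustration of such embedded feedback and remediation, a formative item might be structured along the following lines; the field names, level tags and remediation link are invented for the sketch.

```python
# Hypothetical formative item: each incorrect option carries its own
# feedback and a pointer to remedial material, prepared once and then
# reused across cohorts without drawing further on lecturers' time.
ITEM = {
    "level": "intermediate",     # eg entry, intermediate or advanced
    "question": "Which purpose of assessment is primarily formative?",
    "options": {
        "a": {"text": "Ranking students at the completion of a program",
              "correct": False,
              "feedback": "Ranking at completion is a summative purpose.",
              "remediation": "notes/purposes-of-assessment.html"},
        "b": {"text": "Informing the student's next step in learning",
              "correct": True,
              "feedback": "Correct: formative assessment feeds learning."},
    },
}

def respond(item, choice):
    """Return immediate feedback and, on an error, a remediation link."""
    option = item["options"][choice]
    link = None if option["correct"] else option.get("remediation")
    return option["feedback"], link
```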
It is important, however, to differentiate explicitly between online formative assessment that is solely intended
for student feedback and that which has the dual role of informing educators. Even when the former is the case,
major learning management systems, such as WebCT, Blackboard and Learning Space, routinely provide
revealing data such as the time spent in accessing these tests, and, for ethical reasons, students should be aware of
this and the use to which such information may be put. Any ways in which the private space of the student is
jeopardised will limit one important value of such assessment: the freedom to explore areas of perceived
weakness and to make mistakes without revealing these to those responsible for the final assessment or to peers
without this being a deliberate decision on their part. In the latter case, computer-generated statistics of usage
and patterns of responses are seamlessly collected and can be a highly informative and important diagnostic tool
for faculty. Standard reports provide statistics on the number of sessions started and finished, duration, low and
high scores, mean scores and standard deviations, with analysis of individual test items generating statistics on
difficulty, effectiveness of discriminators, as well as standard deviation and the frequency rates for answer choices within questions.
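The figures in such reports can be derived from a simple matrix of scored responses. The sketch below assumes dichotomously scored (0/1) items and is not modelled on any particular system’s reporting engine; statistics.correlation requires Python 3.10 or later.

```python
import statistics

def test_report(responses):
    """Summarise a response matrix: responses is a list of candidates,
    each a list of 0/1 item scores. Returns overall score statistics
    plus per-item difficulty and discrimination figures."""
    totals = [sum(candidate) for candidate in responses]
    report = {
        "mean_score": statistics.mean(totals),
        "std_dev": statistics.stdev(totals),
        "low_score": min(totals),
        "high_score": max(totals),
        "items": [],
    }
    for i in range(len(responses[0])):
        scores = [candidate[i] for candidate in responses]
        difficulty = sum(scores) / len(scores)   # proportion answering correctly
        if len(set(scores)) == 1:
            discrimination = 0.0   # all answers identical: correlation undefined
        else:
            # Point-biserial-style figure: how well the item separates
            # strong candidates from weak ones. Near-zero or negative
            # values flag an ineffective discriminator.
            discrimination = statistics.correlation(
                [float(s) for s in scores], [float(t) for t in totals])
        report["items"].append({"item": i,
                                "difficulty": difficulty,
                                "discrimination": discrimination})
    return report
```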
The issue of security
Candidate validity
A critical aspect of delivering any assessment which counts towards the student’s final score is ensuring that the
designated candidate is completing the assessment task and that they are not using any unstipulated resources.
This is an issue facing all assessment. Making an assessment available via an online system is relatively
straightforward, with products such as WebCT, Blackboard and Learning Space providing the tools and the
engines for delivery. What is missing is candidate validation. In a controlled environment, candidates have their
identity validated by staff responsible for this environment who also are responsible for ensuring that the
assessment is taken under appropriate conditions and that the candidate does not cheat. By using centres which
have required authorisation codes, a system can be relatively easily set up whereby both the centre staff and the
candidate log on and provide authorisation codes, a vital step towards ensuring validity. The management of
the actual test can be readily done by the person/group responsible for the assessment. The testing centre is
responsible only for validating candidate identity and assessment conditions.
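A minimal sketch of that dual logon follows; all identifiers and codes are invented for illustration, and in practice codes would be issued per session and checked against a secure store.

```python
import hmac

# Hypothetical pre-issued authorisation codes: one per testing centre
# and one per candidate, distributed through separate channels.
CENTRE_CODES = {"centre-042": "q7f2-kkpx-91mc"}
CANDIDATE_CODES = {"s3031337": "ae4n-77rd-t0qb"}

def authorise_session(centre_id, centre_code, candidate_id, candidate_code):
    """Release a test session only when both the centre staff and the
    candidate have logged on with valid codes, so that neither party
    alone can start the assessment."""
    # compare_digest performs a constant-time comparison, avoiding
    # timing side-channels when codes are checked.
    centre_ok = hmac.compare_digest(
        CENTRE_CODES.get(centre_id, ""), centre_code)
    candidate_ok = hmac.compare_digest(
        CANDIDATE_CODES.get(candidate_id, ""), candidate_code)
    return centre_ok and candidate_ok
```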
Divulging of content
One of the main advantages of online delivery of courses is to provide flexibility. If the aim is to allow candidates
to take the test at a time of their choosing, although that may be within a defined window of opportunity, what
steps can be taken to protect the content such that it cannot be divulged by the first person to take the
assessment? As any given test is made up of a subset of items from a large pool of questions, to be able to pass
the examination simply by knowing the test items themselves would require the ability to memorise a large
number of questions to ensure adequate coverage. This can be compounded by similar questions and slight
changes of detail that would defeat the usefulness of rote learning of answers. Adaptive testing makes
cheating in this way even less likely. As discussed above, where adaptive testing is used for formative
assessment and no score is recorded, it is in any case self-defeating to know beforehand
what the test items will be.
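How a test might be assembled as a random subset of a large pool, with slight parameter variation per candidate, is sketched below; the template and pool are invented for the example.

```python
import random

# Hypothetical question pool: each entry is a template with parameter
# ranges, so two candidates rarely see an identical item even when the
# same template is drawn.
POOL = [
    {"template": "A cohort of {n} students sat a test with a mean score "
                 "of {mean}%. What is the total of their scores?",
     "params": {"n": range(20, 200), "mean": range(40, 90)}},
    # ... in practice, many hundreds of further templates ...
]

def assemble_test(n_items, rng=None):
    """Draw a random subset of templates and instantiate each with
    freshly drawn parameter values, defeating rote learning of answers."""
    rng = rng or random.Random()
    chosen = rng.sample(POOL, k=min(n_items, len(POOL)))
    return [item["template"].format(
                **{name: rng.choice(values)
                   for name, values in item["params"].items()})
            for item in chosen]
```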
Intellectual Property (IP) Protection
There has always been a risk that, in the process of getting the assessment from the developer of the test to the
candidate and back again, it could be intercepted and copied. While the use of the Internet provides
flexibility of delivery, it also exposes vulnerability in terms of the systems, many of which are public, through
which the communication travels. This risk is minimised by encryption and digital certificate technologies.
While encryption can be used to ensure that information on the channel cannot be read, there is also the need to
establish trust in the end-to-end communication so the institution running the test can be confident that the
system accessing it is the testing centre and the testing centre can be confident that they have connected to the
institution, and not someone pretending to be that institution. The use of these digital technologies should ensure
that communication is not only from the sender that it claims to be from, but also that the communication has not
been tampered with.
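As an illustration only, Python’s standard ssl module can establish this kind of mutually authenticated, encrypted channel; the hostname and certificate file names below are placeholders.

```python
import socket
import ssl

INSTITUTION_HOST = "assessment.example.edu"   # placeholder hostname

# Trust only the institution's certificate authority when verifying the
# server, rather than whatever a public network happens to present.
context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH,
                                     cafile="institution-ca.pem")
# Present the testing centre's own certificate so the institution can
# confirm which centre is connecting, not merely encrypt the channel.
context.load_cert_chain(certfile="centre-cert.pem",
                        keyfile="centre-key.pem")

with socket.create_connection((INSTITUTION_HOST, 443)) as raw_sock:
    with context.wrap_socket(raw_sock,
                             server_hostname=INSTITUTION_HOST) as tls_sock:
        # The completed handshake has verified the institution's identity;
        # test content now travels encrypted and tamper-evident.
        tls_sock.sendall(b"GET /next-item HTTP/1.1\r\n"
                         b"Host: assessment.example.edu\r\n\r\n")
        reply = tls_sock.recv(4096)
```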
System integrity
With added complexity comes added risk. Central servers, public networks and testing centre networks all need
to function properly. One of the noted areas of vulnerability is the propensity for computers to ‘crash’ at
inappropriate times and, in this regard, students who undertake online assessment for credit in their own
locations are at far greater risk than those at a testing centre, where the more sophisticated companies provide
24x7 worldwide phone and online support for technical and operational issues. This risk can, however, albeit at a
cost, be minimised by strategies such as clustered servers, Internet server farms, redundant network components
and multiple testing servers.
****
Will the steps outlined above inevitably ensure the integrity of the assessment process? No more so than the
methods used to ensure the integrity of all assessment processes. Problems with security of examination papers
and cheating during examinations still exist and the increased security measures (such as increased surveillance,
photograph security IDs and external security firms managing delivery of print examination papers) are
testimony to this concern. There is a touch of irony in the fact that it is Internet resources that are revealing the
extent of plagiarism in traditional forms of assessment, such as essays. Cheating is a sector-wide problem and, while
online assessment is not immune, using a testing centre should mean that cheating, whether by having someone
else take the assessment or by using materials that are not permitted, would require a much higher level of
collusion than if candidates were simply able to log on to the assessment Internet site. Further,
such centres can provide a high level of certainty that the system will deliver as intended and, in the rare cases
when it does not, support for the students affected.
Conclusion
Returning to the three pressures for change cited at the commencement of this paper (Nightingale et al, 1996), as
this discussion has aimed to show, assessment using the Internet (most especially, adaptive assessment) can
assess a broad range of student abilities and hence assist their development; it can provide significant feedback
and be a valuable tool to support student learning; and it supports self-assessment. Self-assessment can be
incorporated into course design in a multitude of ways, and its enhancement of the skills needed to appraise
one’s progress accurately, together with its promotion of active and purposeful learning, should mandate its inclusion.
While there is potential for adaptive summative assessment, because of issues of security, its greatest promise at
the moment seems to be in the formative area. Educators need to see online testing (including, where
appropriate, adaptive testing) as an enabling tool that can provide students with enriched opportunities to learn in
their own time and space and exploit its potential to adapt to their needs more fully. Notions concerning the
collaborative, dynamic and evolving nature of learning are nurtured when the Internet is exploited as a tool to
engage thinking and to support the construction of rich, complex understanding, rather than being reduced
merely to a powerful store and transmitter of information. In this way the goals that Brown et al (1999) set for
computer-based learning will be closer to being achieved: Online assessments can be unique; students can
access specific, timely and useful feedback; this feedback can make a significant contribution to their learning
and can be used also to inform teaching. The capability is undeniably there. It is up to educators within the
higher education system to exploit it appropriately as a viable approach to assessment and as a contributor to
quality learning.
Acknowledgement: The author expresses her thanks to P. Brent Challis for his advice on the use of online
adaptive testing within the corporate domain.
References
Blacharski, D. (2001) “Computerized Adaptive Testing”, Certification News, viewed 3 Oct. 2003,
<http://www.itworld.com/nl/cert_news/02052001/>.
Boud, D. (1986; 1991 edn) Implementing Student Self-Assessment, HERDSA, Campbelltown.
Brown, S. & Knight, P. (1994) Assessing Learners in Higher Education, Kogan Page, London.
Brown, S. et al (1999) Computer-Assisted Assessment in Higher Education, Kogan Page, London.
Dempster, J. (1999) “Web-based assessment software: Fit for purpose or squeeze to fit?”, Interactions, Vol. 2,
No. 3, viewed 3 Oct. 2003, <http://www.warwick.ac.uk/ETS/interactions/vol2no3/dempster.htm>.
Foster, D. (2001) “Adaptive Testing”, Exam and Testing Procedures, viewed 8 Aug. 2003,
<http://www.microsoft.com/traincert/downloads/adapt.doc>.
Gouli, E., Papanikolaou, K. & Grigoriadou, M. (2002) “Personalising Assessment in Adaptive Educational
Hypermedia Systems”, viewed 20 Aug. 2003, <http://hermes.di.uoa.gr/lab/CVs/papers/gouli/ah-2002.pdf>.
James, R. et al (2002) Assessing Learning in Australian Universities, Centre for the Study of Higher Education,
The University of Melbourne and the Australian Universities Teaching Committee, Canberra.
Loacker, G. (ed) (2000) Self-assessment at Alverno College, Alverno College Institute.
Nightingale, P. et al (1996) Assessing Learning in Universities, University of New South Wales Press, Sydney.
Ramsden, P. (1992) Learning to Teach in Higher Education, Routledge, London.
Ray, R. (2001) “Artificially Intelligent Adaptive Instruction: What is Adaptive Instruction?”, viewed 15 Sept.
2003, <http://www.psych-ai.com/AdapEval2.html>.