Computer-Assisted Assessment in Computer Science: Issues and Software

Simon Rawles (rawles@dcs.warwick.ac.uk)
Mike Joy (m.s.joy@warwick.ac.uk)
Michael Evans

Department of Computer Science, University of Warwick, Coventry CV4 7AL, United Kingdom

February 4, 2002

Abstract

As student numbers and lecturer workloads increase, traditional methods of assessment make it progressively more difficult to conduct effective assessment and provide students with detailed, specific feedback. Therefore, academic institutions are considering ways of automating their assessment methods, at least in part. This automation is realised by computer-assisted assessment (CAA) software. Though CAA software has been proven in some limited situations, it remains a technology which is yet to come of age pedagogically. In this report we discuss the issues of incorporating computer-assisted assessment into university-level teaching, in particular in Computer Science, and give an overview of the software available and its capabilities.

Contents

1 Introduction
2 About the reviews
  2.1 The Website and its maintenance
  2.2 Review methodology
3 Technical issues
  3.1 Installation
  3.2 Technologies
  3.3 Security
  3.4 Standards
    3.4.1 The IMS Standards
  3.5 Structure of materials, reuse and integration
  3.6 Question banks
4 Pedagogical issues
  4.1 Scope of functionality
  4.2 Ways of taking tests
    4.2.1 Adaptive testing
  4.3 Summative testing and plagiarism
  4.4 Marking schemes
  4.5 Feedback to students
  4.6 Feedback to staff
  4.7 Question types
  4.8 Relevance to Computer Science
  4.9 Deeper learning and Bloom's taxonomy
  4.10 Student autonomy
5 Developer and Institution
  5.1 CAA tool producers
  5.2 Quality of interface
  5.3 Support
  5.4 Time and money costs
6 Ideas for further work
  6.1 Review exercise
  6.2 Topics in CAA research
7 Conclusions
8 Disclaimer
9 Appendix: Summaries of the reviews
References

1 Introduction

One definition of computer-assisted assessment (CAA) is 'the use of computers in assessment', including examinations and assignments. CAA software can be used to automate many of the aspects of assessment, which can broadly be divided into delivery, marking and analysis.

Though CAA is often promoted as an antidote to rising lecturer workload as a result of increasing student numbers, there are many reasons to use CAA technology beyond this. A key application of CAA is in formative assessment, and in this setting CAA software is capable of facilitating learning by delivering tests, marking automatically and providing appropriate feedback. In addition: progress may be more easily monitored through the application of frequent tests; student self-evaluation and autonomy are encouraged; and the interactive nature of the computer may be employed to conduct new types of assessment, for example adaptive testing. The administrative benefits include the reduction in human error, integration with institutional databases, and the time savings once the software has been established and the materials prepared.

This report summarises work done in collaboration with the Learning and Teaching Support Network for the Information and Computer Sciences [18], one of 24 subject centres set up to offer subject-specific expertise, best practice and information on learning and teaching in the Information and Computer Sciences. The LTSN-ICS CAA review exercise [15] is an ongoing project, the purpose of which is to investigate a range of currently available software capable of assisting the assessment process. In this report, we summarise this work by examining the CAA software currently available and going further to consider its application in institutions of higher education.

The software chosen is intended to represent the wide range of products available, from free academic software written as proof-of-concept in educational research to commercially-written virtual learning environment software intended for mass deployment in large universities. The emphasis in this exercise has been on assessment in the information and computer sciences, though many of the products reviewed have been general-purpose question-and-answer tools. We discuss the software available in general terms; individual software reviews are available from the project's Website [15].

The work presented here discusses the CAA software available at the end of 2001. The project Website [15] is kept up-to-date, and the reader is invited to consult the material there for further information.

2 About the reviews

2.1 The Website and its maintenance

The Website is seen as the main product of the research, and is intended to represent the current situation in CAA software through its reviews and associated information. The site can be accessed through the URL http://www.dcs.warwick.ac.uk/ltsn-ics/resources/caa/reviews/.
Since there are many ways in which the information on the site can become out-of-date or superseded, the Website is maintained according to a defined 'updating methodology' which ensures the information is kept up-to-date. Measures employed as part of this methodology include:

- general involvement with the CAA community;
- systematic accuracy checks for each review every three months;
- active interest in, and use of, feedback provided from the Website;
- provision of, and participation in, moderated discussion forums on the Website;
- a monthly Internet search for new CAA products;
- maintenance of associated resources, such as the glossary and tables.

2.2 Review methodology

Each of the reviews is written using a standard structure which aims to cover most aspects of the software's fitness for use. The structure is only used as a guideline, and is adapted where necessary. It should be noted that, although there are objective comparisons of the facilities available in CAA tools, many of the criteria against which the tools have been measured are subjective. On this account, the reviews should be treated with caution. In particular, comparisons in the reviews and the comparative summary tables reflect the authors' opinions. The review structure may be summarised as follows:

Basic product information. The title and publisher of a product. The URL of the product's Website. A short 2-3 line description of the product, covering its main features or applications. The availability and licensing situation for the product. Which platforms the software runs on, including client and server requirements or plugin information where applicable.

Overview. A breakdown of the components in the software. A list of weak points and strong features of the software.

Capabilities. The types of assessment supported, with reasons why they are suitable or unsuitable.

Student experiences. The ease of use of the software and how much prior IT knowledge is required to use it. Stability of the software. Whether the software has been only trialled or fully deployed. Effect on study or learning. Form of feedback and its usefulness. Degree of remote access. Results of tests for HTML correctness. General feelings about the assessment.

Staff experiences. Nature of lecturer feedback. Usefulness and appropriateness of the statistics and reporting. Degree to which an up-to-date view of the assessment situation can be presented. Extent to which the delivery of tests can be controlled. Robustness of the software. Integration with other tools the lecturer might use. Ease of use, particularly in question design. Previewing of questions. Degree of prior knowledge necessary to author, such as special markups or languages. Support for reuse of materials and question banks. Flexible specification of question forms. Effort required for upkeep of the software compared with traditional methods. General feelings on authoring, distribution, delivery, marking and administration.

Appropriateness for deployment within institutions. Upkeep requirements for staff and departments. Installation process. Effect on student recruitment and distance learning. Quality control issues. Possibilities of central collection and processing of assessment data. Whether the deployment of the software is likely to be smooth and efficient. Scalability of the technology and the kind of scale at which the software is best used.

Integration with related methods. How well the software can be used in different learning situations. Integration with computer-aided learning software and optical mark recognition. Application to group-based work.
Types of tasks the software supports on Bloom's taxonomy. Support for staged questions: can later questions be selected from earlier ones? Randomisation of questions. Support for free-text questions and the extent to which they can be automatically marked. Ease of adding new question or assessment types. Ability to define marking schemes, and their flexibility. Ability for students to submit binary files as responses.

Support from the originating organisation. Quality of documentation. Whether the software is actively maintained and frequently upgraded. Extent of the maintenance commitment. Existence of on-site experts and their effectiveness.

Security and anti-plagiarism. Authentication. Password requirements and security. Features in the software which discourage collusion (copying from others), for example randomisation. Suitability of the software for summative assessment, summative assessment under exam conditions, and diagnostic assessment. Handling of data and whether it is under an appropriate access control system. Extent to which packet sniffing is made difficult. Blocking of accesses by time or IP address. Use of encryption or digital signature techniques. Extent to which the browser can be misused. Privacy of marks, analyses, logged test responses and other results.

Summary. A 3-4 line summary of the product, concluding the review and providing a minimum-detail judgement for those comparing products.

Additionally, the reviews are summarised in a series of tables, with rows showing different products and columns showing features of products. The tables come with their own explanations, but generally, features are shown either by a single filled or empty circle, in the case of features that are simply present or absent, or by three circles representing a rating from 0 to 3, ranging from absence of or unsuitability for a particular feature to total coverage or excellence in that feature. Tables are usually filled in after a review is complete and should reflect the judgements made in the review.

3 Technical issues

This section discusses the more operational and technical aspects of the tools.

3.1 Installation

Most of the products were relatively straightforward to install, though a common problem, especially in respect of software developed at academic institutions, is that software is typically not packaged completely. In particular, software which is intended to be cross-platform is often too specific to particular machine configurations or operating systems, and requires small edits to get working. In the case of a few products, installation was a rather involved process, taking hours for users skilled with computers to accomplish. Those products may therefore require installation by system administrators rather than by the lecturers themselves.

3.2 Technologies

Many of the non-commercial products reviewed used established and often open technologies like Java [27], Perl [21] or the Apache Web server [1]. As new technologies emerge and existing technology develops, CAA software is often not updated to take advantage of them, and CAA therefore often lags technologically. The most commonly-supported operating systems on which the products run are the newer versions of Windows and the various flavours of Unix. However, a significant number of the commercial products operate as paid-for services on the Web.

Delivery technologies vary, though unsurprisingly the World Wide Web is a popular choice, allowing assessments to be easily distributed through its almost universal presence on users' desktops. However, other products use custom-implemented programs, often as part of a client-server approach [16, 17].
Other products make direct use of third-party software. For instance, the Authorware multimedia authoring tool [20] is used to drive the TRIADS [19] assessment engine, and is directly used to edit questions and otherwise set up assessments.

3.3 Security

Security is an aspect of the products which has generally not been considered in design. For some products, this is simply because they are not intended for anything other than self-testing. Many products, however, claim that they can be used for diagnostic or even summative testing. Since this requires the authentication of students and the storage of their results, as well as measures to prevent one student from intercepting another's submission and other communication, neglecting security could undermine the integrity of the assessment process. Encryption and other security measures are not commonly used. Provision of adequate authentication is somewhat easier, though in some cases passwords were not as closely guarded by the products as might be expected.

Security and reliability problems are inherent in some delivery technologies when applied to CAA [30]. Web-delivered CAA is common, though very few products use secure connections, leaving response data open to other users of the network. Standard browsers are designed to deliver Web documents rather than act as front ends to assessment software, and so can be prone to various acts of cheating, perhaps by deliberate crashing of the software, or by access to unauthorised sites or other applications through the clipboard. Question Mark [23] provides a special secure browser which it claims overcomes many of these problems.

Some software does address security well, notably Nottingham University's Coursemaster system [16], which encrypts traffic between the administrator and the server, and uses a cross-checking session-key system to protect students' communications with the server from packet sniffing. Other software appears to reach a given level of security, for example accepting passwords as part of the login process, but compromises that security by incorporating the passwords in the HTML source of each page served. A similar problem exists in some products in which the answers to the questions themselves can be read by simply viewing the source, though this is not typically an issue with the more capable, commercial solutions.

3.4 Standards

Adoption of standards, such as the IMS's Question and Test Interoperability standard (IMS-QTI) [12], does not appear to be a priority for most of the producers of the CAA products reviewed, and there is little information in most cases to suggest they are considering conformance, though a few commercial producers are recognising the standard and suggesting future software will use the standards to some extent. Meanwhile, the range of question formats is becoming more diverse. Indeed, in some cases, the IMS standards have been extended in an ad-hoc manner, suggesting the basic IMS standard is neither complete enough nor flexible enough for the needs of the developers of the more complex products. This trend seems likely to continue, as the IMS standards lag behind the products' capabilities which they aim to cover.

However, markup has been used to provide a structured and delivery-independent approach in a number of projects. In addition to the use of the IMS standards [12], the WebMCQ [7] system is partly motivated by a wider agenda for the organisation of teaching materials based on a markup language, and Bristol University's TML system [13] has been established for several years as a usable markup language for quiz-like assessments, and is supported by the Netquest delivery system.
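To give a concrete flavour of such markup, the short sketch below uses Python's standard library to build a QTI-style multiple-choice item. It is illustrative only: the element and attribute names approximate the QTI 1.2 draft conventions, a conforming file would also need the full questestinterop envelope and response-processing rules, and the question text and identifiers are invented.

```python
# A minimal sketch of a QTI 1.2-style multiple-choice item, built with
# the Python standard library. Element names approximate the QTI 1.2
# draft; a conforming file would wrap this in a <questestinterop>
# envelope and add response processing. Identifiers are invented.
import xml.etree.ElementTree as ET

item = ET.Element("item", ident="q001", title="Binary search complexity")
presentation = ET.SubElement(item, "presentation")
stem = ET.SubElement(ET.SubElement(presentation, "material"), "mattext")
stem.text = "What is the worst-case time complexity of binary search?"

# A 'logical identifier' response holding one choice from a rendered list.
response = ET.SubElement(presentation, "response_lid",
                         ident="R1", rcardinality="Single")
choices = ET.SubElement(response, "render_choice")
for ident, text in [("A", "O(n)"), ("B", "O(log n)"), ("C", "O(n log n)")]:
    label = ET.SubElement(choices, "response_label", ident=ident)
    ET.SubElement(ET.SubElement(label, "material"), "mattext").text = text

print(ET.tostring(item, encoding="unicode"))
```

The point of such markup is that the same item can, in principle, be imported into any conforming delivery engine, which is precisely the interoperability the QTI specification aims at.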
3.4.1 The IMS Standards

An industry-standard structure [12] has been developed for the representation of question and test data, facilitating the exchange of these data between different products. The standard has been defined by the IMS Global Learning Consortium, comprising educational, commercial and government organisation members. This standard, the Question and Test Interoperability (QTI) specification, is defined in XML, the extensible markup language. The standard itself, at the time of writing, is at version 1.2, a public draft specification. Its structures for delivery and response allow the construction of questions from a set of standard question types and the collection and scoring of responses. The standard is relatively large, with well over a hundred XML elements, covering mainly the following:

- specifications for question-and-answer data and the packaging of such data;
- information about processing responses for assessment (scoring, feedback, etc.) and reporting these results;
- other meta-data;
- high-level presentational instructions (though not specifications for a specific user interface);
- the definition of external APIs for products.

IMS have also developed QTILite, a cut-down version of the main specification which, amongst other things, deals only with multiple-choice questions.

The area of test delivery is defined in its own standard [11], covering the selection and ordering of questions. Small sets of questions may be defined and a random number of questions chosen from each bank. These can be presented in order of appearance in the question bank or in an arbitrary order, and obligatory questions can be defined. Questions may also be defined to require others (overlap inclusion) or to exclude others from being delivered (overlap exclusion). In its present form, the standard is limited to random selection of questions rather than selection as a result of previous responses, though a number of scenarios in which the latter takes place have been identified for future study. Included among these are assessment methods such as computer-adaptive testing, in which difficulty is adjusted based on the current estimate of the examinee's ability, and simulated cases in which, for example, a medical situation is described and questions lead examinees through different situations that may occur based on their responses. However, it is still the case that the conditional selection of questions based on outcome response is omitted from the standard. The ability to define preconditions and postconditions for sections and items is also left for further study and inclusion in future versions of the standards.

3.5 Structure of materials, reuse and integration

The structure, or otherwise, of learning and assessment material has an impact on how it is used and, perhaps more importantly, reused. The development of structures and markup languages for such materials has been the subject of several projects, each with divergent results [28]. The interest in structure and reuse prompts the question of to what extent organisation of materials exists in CAA software. A related issue is the extent to which assessment and learning materials can be integrated into presentation media, such as the Web. There appears to be little provision for reuse and integration in most of the software, the exception being the Virtual Learning Environments (VLEs), where it is often the case that learning and assessment materials have some concept of structure in that they are presented in separate pages or modules.
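As an illustration of what even a basic notion of structure buys, the following sketch stores questions once in a shared pool and lets assessments reference them by ID, so that material is reused rather than copied. The class and field names are invented for illustration and do not correspond to any reviewed product or standard.

```python
# Illustrative sketch of hierarchically organised assessment materials:
# questions live in a shared pool and assessments reference them by ID.
# All names are invented; no reviewed product uses exactly this model.
from dataclasses import dataclass, field


@dataclass
class Question:
    qid: str
    text: str
    metadata: dict = field(default_factory=dict)  # e.g. topic, difficulty


@dataclass
class Section:
    title: str
    question_ids: list  # references into the shared pool, not copies


@dataclass
class Assessment:
    title: str
    sections: list


# A shared pool of questions, authored once and reused by reference.
pool = {
    "q1": Question("q1", "Define 'binary search'.", {"topic": "algorithms"}),
    "q2": Question("q2", "State the pigeonhole principle.", {"topic": "maths"}),
}

# Two assessments sharing question q1 without duplicating it.
mock_exam = Assessment("Mock exam", [Section("Algorithms", ["q1"])])
revision = Assessment("Revision quiz", [Section("Mixed", ["q1", "q2"])])

resolved = [pool[qid] for qid in revision.sections[0].question_ids]
```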
Most products which provide a basic level of material organisation allow questions to be stored and reused in new assessments through reference to allocated ID numbers. The IMS-QTI standards and the standards associated with WebMCQ [7] detail a multilevel hierarchy, in which there are levels for courses, blocks, assessments, sections and questions. Meta-data is also well-developed in the standards, though both the concept of hierarchy and meta-data are often inadequately represented explicitly in the products themselves.

3.6 Question banks

There has been much interest in a question-bank approach to assessment for CAA software: that is, software which facilitates the creation and use of a set of questions, either shared between assessments or between lecturers. This concept does not feature much in the tools reviewed, perhaps because of the relatively low penetration of standards into the products. A significant problem with the process of testing using CAA is the building of questions, which is time-consuming and surprisingly hard to do well; the scope to organise questions, and ultimately to share them, would encourage the creation of a resource to be used year-on-year and shared with colleagues.

4 Pedagogical issues

This section considers the tools with respect to how they are deployed for use in education.

4.1 Scope of functionality

The intended capabilities and settings for the products vary significantly. Some products, for example the TACO system [25], are intended only as simple multiple-choice Web interfaces for inclusion in Web-delivered coursework material. Others are intended as fully-featured question-and-answer systems addressing many types of testing, and there is a wide spectrum of functionality between these. Most products reviewed are reasonably focused on computer-aided assessment, but some also address other areas that can be covered by the tools. Other products feature computer-assisted assessment only as part of a wider product. Commercial products seem to be gravitating towards the Virtual Learning Environment, intended as a 'virtual university in a box'. Consequently the focus shifts from CAA towards a broad general-purpose educational solution, losing many of the advanced capabilities which course organisers may require. Some CAA products are focused around particular academic goals rather than simply assessment; for example, the TML system [13] attempts to establish a markup for question-and-answer data, dating from the days before the IMS standards had become well-defined.

Typically, commercial products tend to be rather too general, favouring a "one-stop-shop" solution rather than something derived from real pedagogical requirements. In contrast, academic projects are often developed from the specific requirements of the job. A small number of products are developed by academic spin-off companies, and to some extent have the advantages of both commercial and academic practice.

4.2 Ways of taking tests

The ways in which tests can be taken are relatively basic, and generally lack any form of conditional questioning. This significantly limits their usefulness for formative testing, an area in which the use of a computer would seem particularly appropriate. Some products deliver tests as a single page with a sequence of questions, and only allow for setting simple sets of questions.

4.2.1 Adaptive testing

Though the provision of feedback guides the learner to some extent, it does not exploit the interactive nature of the computer to the same extent as does adaptive testing. Adaptive testing has been defined as a method of assessment where the selection of later questions depends on the responses to earlier questions [22].
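The idea can be sketched in a few lines. The following minimal example implements GRE-style difficulty adjustment under the assumption of a hand-made pool of questions graded by difficulty; the pool and its contents are invented, and real adaptive tests estimate ability statistically rather than stepping a counter up and down.

```python
# Minimal sketch of adaptive difficulty adjustment: a correct answer
# raises the difficulty of the next question, an incorrect one lowers
# it. The pool is invented; real adaptive tests (e.g. the GRE) use
# statistical ability estimates rather than a simple counter.
import random

# difficulty level -> list of (question, expected answer); illustrative only
pool = {
    1: [("What does CPU stand for?", "central processing unit")],
    2: [("Worst-case complexity of linear search?", "o(n)")],
    3: [("Worst-case complexity of quicksort?", "o(n^2)")],
}


def run_test(num_questions: int) -> int:
    level, score = 2, 0  # start at medium difficulty
    for _ in range(num_questions):
        question, answer = random.choice(pool[level])
        correct = input(question + " ").strip().lower() == answer
        score += correct
        # Move up after a correct answer, down after an incorrect one,
        # staying within the available difficulty levels.
        level = min(max(level + (1 if correct else -1), 1), max(pool))
    return score
```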
Adaptive testing can arguably be used as an aid to accurate assessment, as in the American Graduate Record Examinations (GRE) [9], where correct answers lead to progressively harder questions and incorrect ones to easier questions. Adaptive testing can also be applied in the assessment-as-learning setting as a powerful feedback-reinforcement mechanism. One way in which this could be employed might be to categorise questions and then track those subjects or question types in which a student is repeatedly failing, and present more of those questions during formative or practice tests. Another might be to employ GRE-style difficulty adjustment to ensure that every student is stretched, irrespective of their ability.

Some aspects of non-adaptive testing are lost, for example the ability to step back through a test and change answers to previous questions, since those answers have been used to determine subsequent questions. Another consideration is that the question-setter must try not to set questions with any delivery sequence in mind, since, though sequences are possible with adaptive testing, they reduce the dynamic nature of the question selection.

Adaptive testing is an under-represented concept in both CAA tools and the standards that specify CAA questioning formats. A small minority of tools [8, 7] allow for conditional questioning, though this often takes a rather unsubtle form, where the outcome of simple conditional tests based on ranges of scores attained so far affects the selection of the next section, which typically consists of a long sequence of questions. As is discussed later, the IMS standards do not yet cover adaptive testing and the conditional selection mechanisms required for it.

Generally speaking, the computer has not been significantly exploited as an enabler of new assessments; rather it has been used to implement traditional assessment. Features which exploit the computer's interactive nature, such as adaptive testing and other types of guided learning, peer review systems and so on, are surprisingly rare.

4.3 Summative testing and plagiarism

Summative and diagnostic testing are two of the areas in which computer-assisted assessment brings significant benefits over traditional methods, and are therefore desired application areas for lecturers using CAA tools. However, summative testing has several important requirements, stemming from the necessity to obtain an accurate set of results, and therefore to reduce the likelihood of cheating as much as possible. Plagiarism detectors may be employed after results have been collected, but there is a strong argument for choosing software which reduces the likelihood of plagiarism in the first place. The software therefore needs to address the issues of impersonation, obtaining results by computer hacking, and 'over-the-shoulder' copying: by allowing students to authenticate themselves adequately; by supporting computer and network security; and by facilitating control of the test environment.

In general, the use of CAA tools for summative testing is problematic, due in part to the difficulty in overcoming security concerns and the ways in which a basic Web browser can be misused to cheat. Most products are in some way suitable for self-testing and formative testing, with the latter role being that in which they are most useful. A general lack of adequate reporting tools means that interpreting the results of diagnostic testing is more confusing than it should be.

4.4 Marking schemes

The use of marking or keeping score is almost universal among CAA tools. A points value, or weight, is associated with each response, and the score awarded is the total of these.
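A sketch of such a scheme, extended with the hint penalties and grade bands discussed next, could be as simple as the following; all weights, penalties and grade boundaries are invented for illustration.

```python
# Sketch of a typical CAA marking scheme: each response carries a points
# weight, a deduction is applied when a hint was requested, and the
# total can be banded into a coarse grade. All numbers are invented.

# (points for the chosen response, points deducted for any hint taken)
responses = [(2.0, 0.0), (1.0, 0.5), (0.0, 0.0), (3.0, 1.0)]

score = sum(points - hint_penalty for points, hint_penalty in responses)


def grade(score: float, out_of: float) -> str:
    """Band a raw score into a coarse grade for quick interpretation."""
    fraction = score / out_of
    if fraction >= 0.7:
        return "A"
    if fraction >= 0.5:
        return "B"
    return "C" if fraction >= 0.4 else "F"


print(score, grade(score, out_of=6.0))  # 4.5 B-or-better on these numbers
```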
Most products allow hints, and those that do usually allow a points value to be subtracted when a hint is given. Generally, the ability to define how marks are awarded over set questions is fairly flexible, though marks are usually set on a question-by-question and response-by-response basis. Some products go further, allowing confidence-based assessment [25]. Others allow for the grouping of marks into grades, giving a quicker interpretation of a student's progress. Software which assesses programming ability is, of course, based on different criteria; Coursemaster allows marking based on typographic features and language features, as well as correctness in tests. Coursemaster is also one of the handful of systems which group marks into grades, and it attributes a colour to each grade to make Web-reported statistics more easily understandable.

4.5 Feedback to students

The possibility of including timely feedback on questions allows the computer-assisted assessment tool to be used for learning rather than merely assessment [10]. Indeed, it is often said that the characteristics of CAA tools make them best suited to the practice of regular and frequent formative assessment for learning [5, 26]. Repeatable and accessible tests with rich and explanatory feedback can complement and reinforce delivered material effectively, with little teacher effort beyond the creation of the test [24].

Many of the tools available provide such feedback, though often it is very much based on the delivery of stored text for each possible response. This, in conjunction with a running score or other simple statistics presented at the end of a test, is often all that CAA tools present. Given this rigid framework, the feedback given always needs to be well phrased and needs to take the tool's feedback structure into account. Clearly, the feedback provided cannot be as flexible as that from proper dialogue with a person, and the ability to define feedback at anything other than the response level is lacking. There is generally no provision for the delivery of feedback which spans several questions, for example to give a general indication of topics in which the student is consistently doing poorly, though some tools allow students to see their recent performance in a course. Feedback to students is a weak area and a clear missed opportunity for CAA, though richer feedback may not become feasible until schemes for question meta-data are devised.

4.6 Feedback to staff

Another important kind of feedback in the educational process is that which the lecturer receives in the form of summaries of the results of the assessment, for example statistics such as the average mark and the spread of marks over the class. This is an area in which much of the software is lacking, and again an area in which computers could significantly improve the educational process by visualising the progress of a class.

Feedback to staff is most commonly embodied in the form of reporting tools, giving statistics for a group of students taking the course, and suggesting bad questions (for example, those with poor discrimination) to be removed. Meaningful reporting depends partly on students being able to identify themselves to the system. Most systems do not allow this, and even some systems which do lack any reporting capabilities. Many that have reporting capabilities give only the more basic information, such as the average score or a simple table showing students' identifiers and their scores, delegating the analysis to a spreadsheet program. Visualisation of results, for example as a bar chart, is unusual; more common are simple tables of student name by assessment.
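To indicate how cheaply richer reporting could be provided, the sketch below computes, from an invented 0/1 score matrix, each question's facility (proportion correct) and a crude discrimination index comparing the top and bottom halves of the class; low or negative values flag a question worth reviewing. This is one simple form of item analysis, not the method of any reviewed product.

```python
# Sketch of simple item statistics a reporting tool could provide: the
# facility (proportion correct) of each question and a crude
# discrimination index comparing the top and bottom halves of the
# class. Rows are students, columns are questions; data is invented.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

ranked = sorted(range(len(scores)), key=lambda s: sum(scores[s]))
half = len(scores) // 2
bottom, top = ranked[:half], ranked[-half:]

for q in range(len(scores[0])):
    facility = sum(row[q] for row in scores) / len(scores)
    discrimination = (sum(scores[s][q] for s in top)
                      - sum(scores[s][q] for s in bottom)) / half
    # A low or negative discrimination flags a question for review.
    print(f"Q{q + 1}: facility={facility:.2f} "
          f"discrimination={discrimination:+.2f}")
```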
Features which give a more easily-interpretable display of the class's progress are, however, few and far between. Coursemaster's 'traffic light' approach is a step in the right direction.

There is considerable scope for the extension of CAA tools to provide deeper and more readily interpretable feedback to lecturers [24]. This could allow lecturers to weed out problem questions or identify areas of a course which are not being understood, rather than just identifying students who are underperforming. This type of feedback is especially valuable since what is being measured changes over time [22], and a timely reaction could improve the understanding of current and later topics significantly. The incorporation of breakdowns by subject or the use of scoring grids might be one way in which this could be achieved [10], though this depends on the lecturer's ability to set appropriate questions as well as on the software's capabilities. Good feedback is a combination of the product's capabilities and the user's assessment-setting ability: good assessment and question design and meaningful feedback are often related.

4.7 Question types

A key aim of most computer-assisted assessment software is to support learning activities [10]. Generally the question types offered are rather simple, though a number of tools have more flexible provisions for extending question types. Multiple-choice, multiple-response and fill-in-the-blanks questioning are typically provided, but often only a few question types beyond these exist. The dialogue between the student and the tool can often be rather shallow; the tools do not naturally lend themselves to the types of questioning which would invite more sophisticated thinking and learning in students [22]. The use of computers has, however, encouraged newer question types, such as confidence assessment, to be formulated, as well as the variation of questions by randomising numbers or choosing different distractors. However, both of these are under-represented in the tools, even though randomisation is an advantage for class-based computer assessment.

Some products are not oriented toward multiple-choice questioning, and offer different questioning as a result. Computer programming [14, 16] is one main focus. Diagramming is one interesting area of extension: one product [6] has simple chemical diagramming built in, and Coursemaster's diagramming extensions [29] are a very flexible and capable solution to diagrammatic testing.

Computer programming assessment tools are in general more capable, perhaps having been motivated by more specific requirements and learning goals. Coursemaster's [16] approach of allowing the user to dynamically link modules to mark responses provides the Java-capable lecturer with a very flexible solution, enabling the marking system to be extended significantly and limiting the possibilities of assessment only to what can generally be done with a computer program, while still staying within the framework of the Coursemaster tool.

Active learning, styles of assessment which ask for text and diagrams, and other methods of engaging the student in the response are recognised to be effective forms of assessment-based learning. These too are under-represented in the tools provided, possibly due to the more complex development of these question types.

4.8 Relevance to Computer Science

Many of the products reviewed have no special bias toward the assessment of Computer Science, being general-purpose question-and-answer tools.
Some notable exceptions, however, are the Coursemaster [16] and BOSS [14] systems, among whose capabilities is the testing of programming assignments.

4.9 Deeper learning and Bloom's taxonomy

Bloom's taxonomy [3] characterises questions by their level of abstraction, representing the competencies and activities involved in answering them. Though question types often encourage questions belonging to a particular categorisation, a given question type will typically be appropriate for a range of different levels of abstraction. Though many of the tools are able to test the lower levels of the taxonomy (knowledge, comprehension, application), and though some question types better encourage higher levels of activity than others, the lecturer often needs to be skilled in question setting and design in order to evaluate the higher levels (analysis and above).

Deeper learning is associated with participation in higher-level tasks, particularly with constructed-response question types, since these are often far more useful in terms of conveying the principles behind the tasks [22] and can promote higher-order learning through reflective execution of the tasks. However, the inability of computers to assess tasks exercising more general skills, and therefore based on a more complex scoring rubric [22], apparently limits CAA software to processing this type of answer by simple techniques such as pattern-matching of input. Though technology is unlikely to be able to deliver such assessment in the near future, it remains possible that the application of CAA will be able to replace some assessment strategies in particular subject areas.

The provision for group work is lacking, though some interesting systems have been developed based on peer assessment, for example OASYS [2].

4.10 Student autonomy

The rapid increase in student numbers over recent years has in part led to a change in the roles of student and lecturer in the higher-education environment. There has been a shift toward the student taking responsibility for their own learning and the lecturer providing the necessary learning environment and facilitating learning [5]. Indeed, self-evaluation is now regarded as an important life skill which should be encouraged as part of the educational process [24].

Self-evaluation is facilitated by a number of features in software. Provision for a student to view their progress in the context of the module, and ultimately the course, acts both as an indicator of weak areas and as a motivator. Some of the software reviewed, particularly that in the 'virtual learning environment' category, gives this kind of overview, though this is as far as the promotion of self-evaluation goes.

Related to this is the idea of student-centred learning and the change in emphasis leading to students having more control over their learning than they would traditionally [24, 5]. Student-centred learning might include the ability to undertake practice tests before undergoing an assessed one (or to repeat tests until a given grade is attained). Even this relatively simple measure is rarely supported in CAA software. Student-centred learning is also facilitated by the openness of the assessment, for example minimal secrecy over the rubric and criteria for assessment [22], as well as by ease of access. In these respects, much of the reviewed software makes good provision.
By making the assessment more open and less hidden and subjective, the processes of learning are made more visible; the student better knows the aims of the course and how best to undertake their own learning, and can determine which concepts have been learned and which are yet to be learned [24, 4]. Computer-based assessment is often said to make the process of learning more visible to the student [22], and therefore to encourage greater autonomy, though explicit efforts to make the learning process clear in the software are often neglected.

5 Developer and Institution

This section discusses the tools from the viewpoints of the software developer and the user institution, and the relationship between them.

5.1 CAA tool producers

With CAA tool producers being either companies or academic institutions, there is an association between the quality of a product and its cost. Academic products are generally free, or cheap, but are weak in support, as far as the level of input and continuing updates is concerned. Companies all charge for their products, but these tend to be more polished, are better maintained, and provide more consistent support. A number of products come from spin-off companies from academic departments, often giving some pedagogical relevance to the tool.

5.2 Quality of interface

The quality of the human-computer interface tended to be limited by some of the technologies used, with many products suffering as a result of the use of (often non-standard) HTML code for presentation on the Web. Often products seemed to have an overdesigned or 'busy' interface, particularly those developed by commercial providers. Other products use HTML which gives the interface a somewhat dated look and feel. Some of the products using non-Web interfaces are better thought out.

5.3 Support

The quality of support provided depends on the source of the product. Commercial products offer a wide range of help, from installation through to advice on question creation, but at a price. Academic institutions are not often in a position to provide more than informal support, but typically are friendly and willing to help via telephone and e-mail. In many cases support consists of supporting the product only; associated learning materials are neither provided nor seem to be available, and it is up to the course organiser to create them. A few products place an emphasis on material as well as software, both selling additional learning-material modules and providing an infrastructure for the distribution of materials from textbook authors and other content providers.

5.4 Time and money costs

Computer-assisted assessment is often introduced out of a desire to counter the increasing time demands on staff while still delivering an acceptable educational experience. It should be considered, however, that this application carries costs of its own, both in terms of time and money. The most time-intensive stage of using CAA is when it is being introduced: long-term gains in efficiency often come at a cost of short-term effort [10]. A major problem lies in the fact that materials need to be prepared for delivery by the computer. The lack of adoption of standards and the consequent lack of modularity lead to major difficulties in reusing materials, which means that materials need to be constructed from scratch rather than being generated.
There are also possible 'hidden delays' in installation, upgrade and upkeep procedures, deadline extension and mark adjustment on the computer, general system failures, and the collation of necessary statistics and their transfer to a central database in the absence of adequate reporting and database connectivity facilities.

As with most software solutions, there are also financial costs involved in applying CAA software. In addition to initial costs, some of the software requires third-party software. Many of the suppliers charge upgrade and support costs. There are also costs associated with the use of third-party assessment materials.

6 Ideas for further work

This section discusses possible future directions for the CAA project, firstly for the review exercise itself, and then topics for larger-scale CAA research which have been prompted by the LTSN work.

6.1 Review exercise

- Further investigation of the pedagogy of computer-assisted assessment, for example its role in learning.
- Survey work to gain a better idea of the requirements of Computer Science lecturers in higher education, to act as a set of new criteria against which to evaluate software.
- A study of how CAA is used in higher education, particularly computer science, to get a better idea of patterns of use, best practice and areas which software needs to address. In particular, in which parts of the deployment of CAA software is the most time spent, and what measures might be taken to make the process less time-consuming?
- A survey and summary of anti-plagiarism software and techniques.
- Further investigation into the potential value of conditional questioning and adaptive testing, and the degree to which they are supported in CAA tools.
- A study into the uses of CAA for summative and diagnostic testing, and whether CAA is intrinsically of use primarily as a formative tool.
- An analysis of the extent to which the reporting capabilities of the CAA products fulfil the needs of the lecturers using them. This again would involve some requirements-capture work.
- A requirements analysis of the security aspects of CAA software, and the establishment of an acceptable minimum level of security provision.

6.2 Topics in CAA research

- A rethink of the form of feedback given to users. At present feedback is limited in its form. How could feedback be extended to be of more use to the disparate roles of student and lecturer, from a pedagogical perspective?
- Investigation of the structure and reuse of CAA materials, and whether the IMS standard markup is appropriate for 'real world' use. This is strongly related to question banks and their organisation and use.
- A study into how computers and CAA might be better applied to group work, and the degree of support they already give. For example, do VLEs provide the pedagogically-sound groupwork support they promise?
- Extending work done in specific assessment situations, for example computer programming or other subject domains, or widening the range of question types, such as with the diagramming work.
- Making provision for deeper learning. The dialogue between the student and the computer is often shallow in nature, and the tasks which students are expected to do require only low-level thinking on Bloom's taxonomy. There is no provision for deeper learning in software (for example, there is a lack of constructed-response questioning). To what extent could this be achieved?
- How could the principles of student-centred learning and student autonomy be better encouraged in CAA software?
  We identify one possibility in this report, namely enhancing the visibility of the learning process, though it is possible that many other measures like this exist.

7 Conclusions

Developed assessment materials are made less reusable and less distributable by a general lack of provision of standardised or interchangeable formats. Furthermore, standards are slow to be developed and often lag behind the requirements and capabilities of the products. Developed materials are therefore neither portable nor do they facilitate sharing between lecturers and products (the question-bank concept), reducing the motivation for, and value of, such materials.

Security is often an issue, particularly for those wishing to undertake assessment in a summative or diagnostic context.

For such a crucial part of the education-by-assessment process, feedback is not flexible enough in the tools. Feedback to students is often concentrated too much on the responses themselves rather than on the test as a whole. The reporting capabilities of most of the tools are lacking. The great potential of using the computer as an analyser of class statistics, perhaps finding problem students, areas of misunderstanding, or simply poor questions, is generally wasted.

The majority of CAA tools cannot be applied to general situations. They are typically intended for a particular setting in which CAA commonly takes place. This might be a simple quiz incorporated into Web-delivered materials, a more capable summative assessment system, or a part of a VLE. In their limited application areas, most of the tools are capable, and in many cases they have been proven to some extent in educational institutions.

There are some interesting CAA tools available, at various stages of development, with varying functionality and of varying quality. None of the products is a 'clear winner', and most lack features specific to the needs of education in Computer Science. Some products are strong in a few areas, but none can be considered to meet every educational need; there are specific and fundamental shortcomings in the currently available tools and the approaches they take.

8 Disclaimer

Every effort has been made to ensure the accuracy of the information supplied in this report and the associated Website, and to the best of our knowledge and belief the report and the associated Website fairly describe the tools and packages which are referred to. Any error or omission can be reported to the authors and we will correct it as soon as possible.

Neither the University of Warwick nor the LTSN Subject Centre for Information and Computer Sciences assumes any liability for errors or omissions, or for damages resulting from the use of the information contained herein and on the associated Website, and neither is responsible for the content of external Internet sites or any other sources. The information contained herein and on the associated Website is provided on the basis that the reader will not rely on it as the sole basis for any action or decision. Readers are advised to contact the stated primary source or the product's originating institution before acting on any of the information provided.

This report includes some words and other designations which are, or are asserted to be, proprietary names or trade marks.
Where the authors are aware that a word or other designation is used as a proprietary name or trade mark, this is indicated in the list below, but the absence of a word or designation from this list does not imply any judgement of its legal status: Apache, Authorware, Ceilidh, Coursemaster, DES, EQL, GRE, IMS, Interactive Assessor, Java, JavaScript, Macromedia, Perception, Perl, Question Mark, TestPilot, Unix, Webssessor, Webassessor, WebCT, Windows, WebMCQ, XML.

9 Appendix: Summaries of the reviews

This appendix describes the tools reviewed, and briefly summarises the reviews as presented on the Website in January 2002.

BOSS

BOSS is an open-source solution developed by the University of Warwick to handle online submissions and course management. It allows students to submit their work securely, and staff to manage and mark assignments. It has been developed with sound pedagogy in mind. It uses a role-based interface to keep track of the process of marking, moderation and publishing. The software stores much of its data in an SQL database, promoting scalability and integration with other systems.

The software is easy to use despite its comparative complexity, and has good security. A client/server approach is taken; its lightweight clients, which download the latest versions of the student and staff client software, ease large-scale deployment, along with its platform-independent Java implementation. Perhaps its main use is for formative testing; many of its features are suited to facilitating regular programming assignments. It is also useful for diagnostic and even controlled summative testing, though it does not lend itself to ad-hoc self-testing due to the marking procedures it implements. Automatic testing works well, with a few limitations, though this area will become more flexible as development progresses. Documentation is a strength.

CASTLE

The CASTLE toolkit is a set of tools designed for tutors to create multiple-choice-based online tests. Emphasis is placed on ease of use throughout: no prior knowledge of any languages (such as HTML or JavaScript) is required. CASTLE produces HTML form-based quizzes capable of handling multiple-choice, multiple-response and text-matching questions. It is comparatively basic in nature, its main strength being its ease of use and deployment.

ClydeVU/Miranda

CVU is a comparatively early attempt at a virtual university, developed by Strathclyde University for use in a number of Higher Education institutions in western Scotland. Its general aim is to host Web-based teaching packages and discussions as well as assessments, though the assessment engine is more capable than those of many other virtual-university solutions. In this evaluation we considered only the assessment engine, which is referred to as Miranda. Though in some respects it is beginning to show its age, for most applications save summative testing in an examination environment, CVU is a useful and comparatively well-thought-out tool.

Coursemaster

Coursemaster is a client-server-based system for delivering courses based around programming or diagramming exercises. It provides a solution for delivering coursework specifications, allowing students to develop programs and submit them, and for the automated marking and monitoring of students' responses. Unlike much of the other software reviewed on this site, Coursemaster is tailored to the needs of teaching programming, and has been developed and used over several years. Coursemaster is an updated version of the tried-and-tested (10 years) server/client-based Ceilidh system.
It has a number of unique capabilities, for example the assessment of diagram-based work. It is robust and easy to use, particularly for the student. The lecturer is given a wide variety of statistics, though through a slightly dated interface. Marking is very flexible; the built-in tools do not achieve anything particularly innovative, but are still useful. Installation may be a problem area, depending on the platforms and configurations used at your institution. All aspects of implementation seem to have been thought through, and security is unusually good.

EQL Interactive Assessor

EQL Interactive Assessor is an application-based assessment system for Windows. Its emphasis is on tests taken against the clock, in the classroom or at home, with automatic analysis of results. The teacher uses a special application for designing tests and the student uses a similar one for taking them. The manufacturers emphasise security (separating lecturers' question banks and the resulting tests) and flexibility (freedom in question type and layout).

EQL Interactive Assessor is unusual in that its method of delivery is through a Windows-based application. However, the software is old and a new version is due soon. The modular, flexible tools make the creation of tests easy, though the design of the questions themselves is a little more involved and requires a little more experience than with other tools. Ease of use from the student's point of view is generally very good, with a number of very minor drawbacks. The whole suite could not be tested due to the limitations of the demonstration suite. The software is perhaps best applied to self-testing and formative testing, though under the right conditions it could also be applied to summative testing. Note that EQL Interactive Assessor has since been withdrawn from EQL's product range.

Netquest

The Netquest project was set up to investigate how 'question banks' for CAA might be set up to facilitate testing. The project is based around TML, the Tutorial Markup Language, a superset of HTML intended to describe question data while keeping formatting information separate from its semantics. Software was also developed to process TML into a form suitable for delivery to a Web browser.

Netquest is a demonstrator of TML, a proposed language for assessment; it is not intended solely as a product for delivering tests. However, for this application it is certainly useful, though it takes more setting up than it perhaps should. Delivery of tests is fast and clear, and basic analysis of individual students is possible. Though the software is not immediately applicable to summative testing, it works well as a regular formative or diagnostic tool.

Question Mark Perception

Perception is a package of five applications for designing and conducting surveys. The software is seen as something of an industry standard, and is certainly one of the more developed products available. Its capabilities are wide-ranging and well thought out, its particular strengths being its management of students, reporting of results, and the development and organisation of question data. Its question types are flexible, including many variations on the basic question types and others. The security is good
TACO is a well-strutured HTML assessment engine whose development was motivated by a strong requirements phase. It has a simple user interfae and a good degree of exibility in the the form of assessments, though its analytial apabilities are laking. It is perhaps most suited to formative and perhaps diagnosti assessment, the lak of tight seurity being the main issue for summative assessment. Test Pilot Test Pilot is a moderately apable Web-based assessment engine. Leturers author questions and build assessments from them, whih an be automatially graded, with good feedbak and analysis possibilities. Its main strengths lie in its relatively intuitive interfae, whih oers lots of help and is relatively easy to work with. Equal attention has been paid to all stages of the assessment proess, from reation to assessment to statistial analysis. Being a ommerial produt, it is relatively expensive, espeially the support fees. Test Pilot would be suitable for a formative or diagnosti level assessment appliation. TRIADS The name of the TRIADS projet stands for `Tripartite Assessment Delivery System', and is a ollaboration between the University of Liverpool, the University of Derby and the Open University. The projet's fous is on learning outomes and their inuene on urriulum design. By understanding this it aims to produe a CAA system whih allows a theoretially justied approah to formative and summative testing of knowledge, understanding and skills. The TRIADS system is a series of programs whih run under Maromedia's Authorware. The Authorware editor is therefore required to make use of the software. The software presents an unrivalled number of question types and exibility in question-setting, though the authoring for questions requires programming skills. Despite the TRIADS projet's notiable eorts, the Authorware environment is omplex to use and introdues a steep learning urve. The authors' plans for a better sta user interfae will go a long way to xing this problem. Taking a test if straightforward and good feedbak an be speied, though there are some user interfae issues whih need to be onsidered here as well. Overall, a promising and exible aademi produt in need of an easier user interfae. Webassessor Webassessor (also written Webssessor) is an assessment, authoring and delivery system aimed primarily at industry, but whih also laims to be as relevant for use in aademia. Emphasis is plaed on ease of use both in assessment reation and administration, while still allowing exibility and `multimedia' apabilities. The main appliation and area of ollaboration with industry is seen as distane learning rather than inorporation in mainstream eduation. Webassessor is a apable solution whih aters for most appliations. Its reporting apabilities mean that it has features somewhat more suitable for diagnosti testing. It has a reasonably good user interfae and is easy to use, with good feedbak apabilities. The basi question types are provided. Seurity has been onsidered and it has been proven in some limited ommerial ontexts. However, both the server software liene and the hosting servie represents a signiant onsiderable nanial outlay. Webassign WebAssign is desribed as a `Web-based homework delivery, olletion, grading and reording servie', aimed speially at university leturers. It has been, and is being, developed at the Department of Physis at North Carolina State University in the USA. 
WebCT

WebCT is sold as a course management system, claimed by its Website to be the most popular. The emphasis is on increasing communication between lecturers and students, though online assessment is one of its capabilities. Another emphasis is the delivery of online content, with material acquired easily from publishing companies rather than developed `in-house' by the lecturer. The review concentrates on the assessment component, though broader considerations take in the more general application of the software. The assessment capabilities are limited, to the point that WebCT may not be appropriate as an assessment tool alone. Furthermore, WebCT aims to replace its assessment component with Question Mark in the future, using inter-program communication to achieve the integration.

WebCT is a comprehensive course management system with relatively good CAA capabilities. Purely as a CAA tool, however, WebCT would be expensive, cumbersome to manage and over-complex for assessment creation, reflecting its aim to fulfil a much wider role.

WebMCQ

WebMCQ provides a Web-based service allowing quizzes focused on multiple-choice questions to be created, edited and delivered through a Web browser. All administration and delivery of quizzes can be carried out from the WebMCQ site itself. The emphasis is on the provision of a service, as opposed to software, and on ease of use.

WebMCQ provides a relatively easy-to-use system, with strong security and a good support service. It is slightly let down by some confusing interface features and a restrictive range of question types.

References

[1] Apache Software Foundation. Apache website. http://www.apache.org/.

[2] Abhir Bhalerao and Ashley Ward. Towards electronically assisted peer assessment: a case study. ALT-J: Association for Learning Technology Journal, 9(1):26–37, 2001.

[3] Benjamin S. Bloom and David R. Krathwohl. Taxonomy of Educational Objectives: The Classification of Educational Goals, by a committee of college and university examiners. Handbook I: Cognitive Domain. Longman, 1956.

[4] David Boud. Assessment and Learning: Contradictory or Complementary?, pages 35–48. Staff and Educational Development Series. Kogan Page, 1995.

[5] Don Charman. Issues and Impacts of using Computer-based Assessments (CAAs) for Formative Assessment, pages 85–94. Staff and Educational Development Series. Kogan Page, 1999.

[6] Clyde Virtual University. About CVU. http://cvu.strath.ac.uk/admin/cvudocs/.

[7] James R. Dalziel and Scott Gazzard. Assisting student learning using web-based assessment: An overview of the WebMCQ system. http://www.webmcq.com/public/pdfs/ovrview.pdf.

[8] Drake Kryterion. Webassessor. http://www.webassessor.com/webassessor.html.

[9] Educational Testing Service. Graduate Record Examinations website. http://www.gre.org/.
[10] Jen Harvey and Nora Mogey. Pragmatic Issues When Integrating Technology into the Assessment of Students, pages 7–20. Staff and Educational Development Series. Kogan Page, 1999.

[11] IMS Global Learning Consortium. IMS Question & Test Interoperability: ASI Selection and Ordering Specification, October 2001.

[12] IMS Global Learning Consortium. IMS Question & Test Interoperability Specification, October 2001.

[13] Institute for Learning and Research Technology. Netquest. http://www.ilrt.bris.ac.uk/netquest/.

[14] M. S. Joy and M. Luck. The BOSS system for on-line submission and assessment. Monitor: Journal of the CTI Centre for Computing, pages 27–29, 1998.

[15] Mike Joy, Simon Rawles, and Michael Evans. Computer assisted assessment tools review exercise. http://www.dcs.warwick.ac.uk/ltsn-ics/resources/caa/reviews/, 2001.

[16] Learning Technology and Research Group. CourseMaster. http://www.cs.nott.ac.uk/CourseMaster/, 2001.

[17] EQL International Ltd. EQL. http://www.eql.co.uk/.

[18] LTSN-ICS. LTSN-ICS website. http://www.ics.ltsn.ac.uk/.

[19] Don Mackenzie. TRIADS (Tripartite Interactive Assessment Delivery System) manual. http://www.derby.ac.uk/assess/manual/tmanual.html.

[20] Macromedia. Authorware. http://www.macromedia.com/software/authorware/, 2001.

[21] Misc. Perl website. http://www.perl.com/, 2002.

[22] Malcolm Perkin. Validating Formative and Summative Assessment, pages 55–62. Staff and Educational Development Series. Kogan Page, 1999.

[23] Question Mark. Question Mark. http://www.questionmark.com/.

[24] Kay Sambell, Alistair Sambell, and Graham Sexton. Student Perceptions of the Learning Benefits of Computer-assisted Assessment: A Case Study in Electronic Engineering, pages 179–192. Staff and Educational Development Series. Kogan Page, 1999.

[25] M. A. Sasse, C. Harris, I. Ismail, and P. Monthienvichienchai. Support for Authoring and Managing Web-based Coursework: The TACO Project. Springer Verlag, 1998.

[26] Leith Sly and Leonie J. Rennie. Computer Managed Learning as an Aid to Formative Assessment in Higher Education, pages 113–120. Staff and Educational Development Series. Kogan Page, 1999.

[27] Sun Microsystems. Java website. http://java.sun.com/.

[28] Christian Süß and Burkhard Freitag. Learning material markup language: LMML. http://www.ifis.uni-passau.de/publications/reports/ifis200103.pdf.

[29] Athanasios Tsintsifas. Daidalos. http://www.cs.nott.ac.uk/~azt/daidalos/.

[30] Dave Whittington. Technical and Security Issues, pages 21–28. Staff and Educational Development Series. Kogan Page, 1999.