Is Rogō a viable alternative to QuestionMark Perception in the Medical Sciences Division at Oxford?

Damion Young and Jon Mason
Medical Sciences Division Learning Technologies (MSDLT), University of Oxford
damion.young@medsci.ox.ac.uk

Abstract

The Medical Sciences Division at the University of Oxford has a strong background in online assessment with QuestionMark Perception but has recently taken its first steps with the open source e-assessment system, Rogō. A number of drivers for change are identified: authoring; performance and delivery; costs; reporting; and flexibility. The extent to which Rogō addresses these drivers is discussed. Benefits for hardware cost, performance and flexibility are clear. These, and Rogō's potential to address the other drivers, make it a serious contender, but more work on confidence building, and in confirming reliability and security in particular, is required.

Introduction

The Medical Sciences Division (MSD) at the University of Oxford has been running online assessments using QuestionMark Perception since 2003 (Sieber and Young, 2008); summative assessment began in 2004 and continues to grow. In 2010-11, we delivered 161 online assessments, of which 53 were formal University exams, to a total of c. 17,000 participants.

QuestionMark Perception1 is a commercial e-assessment system which is widely used in UK HEIs. Rogō2 is an open source e-assessment system developed at the University of Nottingham (UoN) and used across its UK and international campuses.

1 QuestionMark Perception, http://www.questionmark.co.uk (accessed 5th May 2012)
2 Rogō, http://www.nottingham.ac.uk/rogo/index.aspx (accessed 5th May 2012)

The drivers for change

Perception has proved a very reliable and secure assessment delivery platform. It is packed with features and continues to gain new ones. However, there are five areas in which it still does not satisfactorily meet our requirements.

Figure 1. Typical extended matching question

Driver 1: Authoring

Until recently, assessment in medical sciences has been dominated by the extended matching question type (Case and Swanson, 1993). The prevalence of this question type (Figure 1), and a number of other standards that we have developed, mean that the Perception question creation and editing interfaces are inadequate for us. We have therefore developed our own question creator which, for example, alphabetises the answer options and generates the table of options. We have not been able to integrate this fully with Perception, so creation involves importing QML (Perception's native XML format), and editing is often done by recreating and overwriting questions, a process prone to errors. The complexity of this process has contributed to the great majority of question and assessment creation being carried out by our half-time e-learning administrator. We need a tool which will allow non-technical users to easily create and edit all the question types that we commonly use.

Driver 2: Performance and delivery

With the move to Perception v4 in 2008, we installed a four-server, load-balanced system which did, initially at least, allow us to start 90 students (the maximum we can accommodate in one sitting) simultaneously without problems. Performance has decreased over time, and we now start in groups of 20 or so, with the expectation that one or two students will need to restart after an error.
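The behaviour at issue is easy to approximate with a simple concurrency test. The sketch below is purely illustrative (Python, with a placeholder URL; the endpoint is not part of either product): it fires one simultaneous 'start' request per student and reports response times.

    # Hypothetical concurrency smoke test: fire N simultaneous "start
    # assessment" requests and report response times. The URL is a
    # placeholder, not a real Perception or Rogo endpoint.
    import time
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    START_URL = "https://assess.example.ac.uk/start?paper=demo"  # placeholder
    N_STUDENTS = 90  # one full sitting

    def start_one(_: int) -> float:
        """Request the start page and return the elapsed time in seconds."""
        t0 = time.perf_counter()
        with urlopen(START_URL, timeout=30) as resp:
            resp.read()  # pull the whole page, as a browser would
        return time.perf_counter() - t0

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=N_STUDENTS) as pool:
            times = list(pool.map(start_one, range(N_STUDENTS)))
        times.sort()
        print(f"median {times[len(times) // 2]:.2f}s, worst {times[-1]:.2f}s")

Starting 90 workers at once approximates the worst case of a full sitting; reducing N_STUDENTS to 20 mimics our current staggered starts.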
We are in the process of upgrading to v5.4 but, although initial testing suggests that it is considerably faster than the existing system, our previous experience, and QuestionMark's own documentation3, suggest that it may still not deliver the performance we want.

We have never had a major interruption to an exam and have experienced no more than a handful of workstation failures. Perception's Save As You Go (SAYG) does autosave students' answers but does not save elapsed time. In the event of a failure or interruption, the background timer continues counting down and can submit an assessment before a student can return to it. We need a system which will reissue the correct time remaining on resumption.

3 Best Practice: Scalability in Perception Version 5, https://www.questionmark.com/perception/help/v5/best_practice_guides/scalability/content/v5_scalability.pdf (accessed 5th May 2012)

Driver 3: Licence, hardware and maintenance costs

As well as the considerable cost of the Perception hardware and licence for our existing system, our annual support package for 2000 students with QuestionMark Secure is significant. Ageing hardware and the move to v5 meant another large hardware investment last year. The size and complexity of Perception mean that upgrading from version to version, particularly on high-availability hardware, is a far from trouble-free process (upgrade discussions make up a large proportion of messages on the QUESTIONMARK@JISCMAIL.AC.UK list). After a dismal experience upgrading from v3 to v4, we decided to employ QuestionMark's consultants for the current upgrade. This has certainly made the process simpler but is another recurring cost.

Driver 4: Reporting

In order to make the most of the institution's investment in question-writing and online assessment, we want to be able to provide reports for examiners which, for each question, combine:

- the question as seen by students;
- correct answers;
- feedback given (where appropriate);
- syllabus information;
- assessments in which the question has been used;
- previous versions of the question and differences from them;
- performance of question items on individual assessments and across assessments;
- highlighting of potentially problematic question items.

Ideally, this would also be searchable and allow filtering. Perception does not provide this out of the box. We have been able to deliver some of it using JavaScript in assessment templates (Figure 2), but the process is fairly manual and vulnerable to changes in the way that Perception questions and reports are delivered.

Figure 2. Question bank report produced with a Perception template

Driver 5: Lack of flexibility

Perception comes with a wealth of features to customize look and feel, delivery, etc. However, these do not address the authoring and reporting drivers outlined above. Perception also provides numerous APIs with which third-party applications can interact with the system (we use these for logging students in, for example), and these could be leveraged to address our needs to some extent. However, we do not want to have to ask users to use multiple systems, with different interfaces, in order to access and manage questions and assessments: everything should be available in one place. Another problem is that changes in the features provided by Perception's Authoring Manager, Enterprise Manager, etc. are at the discretion of QuestionMark and/or subject to a development/consultancy fee.
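To make the 'one place' requirement concrete: what we would otherwise have to build is essentially a facade over several back ends. The sketch below is our own illustration; the client objects are invented placeholders, not Perception's real APIs or Rogō's.

    # Hypothetical facade: one entry point for staff, delegating to whichever
    # back-end systems actually hold the data. All client objects are invented
    # placeholders, not real Perception or Rogo interfaces.
    class AssessmentFacade:
        def __init__(self, delivery_client, question_bank, report_store):
            self.delivery = delivery_client  # would wrap the delivery system's API
            self.questions = question_bank   # would wrap our local question creator
            self.reports = report_store      # would wrap report generation

        def edit_question(self, question_id: str, new_stem: str) -> None:
            # One call from the user's point of view, even if it means
            # re-exporting and re-importing behind the scenes.
            q = self.questions.fetch(question_id)
            q["stem"] = new_stem
            self.questions.store(q)
            self.delivery.sync_question(q)

        def question_report(self, question_id: str) -> dict:
            # Combine authoring metadata with delivery statistics in one answer.
            meta = self.questions.fetch(question_id)
            stats = self.reports.performance(question_id)
            return {**meta, **stats}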
We have suffered a number of awkward and annoying interface issues over the years, e.g. dialogue boxes that won't resize to show long assessment names, and menus that don't work in modern browsers. These are relatively minor issues to fix, but we have very little ability to get them prioritized.

The Opportunity to Change

In recent years, change in the CAA domain has been fairly rapid, including the release of Moodle 2 (and the Open University's work on its Quiz tool4) and the decision by the University of Nottingham (UoN) to release its online assessment system, Rogō, under an open source licence. We were lucky enough to be invited to be part of UoN's JISC Assessment and Feedback Programme Strand C5 project to support the transfer of Rogō to five HEIs and promote the creation of an open source community around it. MSD has been working with UoN since September 2011. We delivered our first assessment using Rogō, formative but sat under exam conditions, on 23rd April 2012. We are documenting our experiences on our blog6.

This paper builds upon these experiences, and our much longer relationship with Perception, to consider, given the drivers identified above, whether Rogō is a viable alternative to Perception for the needs of online assessment in the Medical Sciences Division at the University of Oxford. On the assumption that most in the field will have some experience of Perception, and because our focus is on whether we could move from Perception to Rogō, this paper concentrates on where we have found that Rogō differs from Perception, rather than trying to document the two systems exhaustively.

4 Tim Hunt on Moodle 2's enhanced Quiz tool, http://www.moodlenews.com/2010/moodle-20-new-quiz-features-moodle2/ (accessed 5th May 2012)
5 JISC Assessment and Feedback Programme Strand C: Rogō OSS, http://www.jisc.ac.uk/whatwedo/programmes/elearning/assessmentandfeedback/rogo.aspx (accessed 5th May 2012)
6 MSDLT blog concerning Rogō, https://learntech.imsu.ox.ac.uk/blog/?cat=3 (accessed 5th May 2012)

Does Rogō address the drivers for change?

Driver 1: Authoring

The majority of the question types offered by Perception are also offered by Rogō (Table 1), and those which are not are, with the exception of multiple MCQ, not types that we have ever used anyway. Rogō does automatically create a more readable, and optionally alphabetised, table of options at the head of an extended matching question. It also has built-in support for EMQs in which a single stem can have more than one answer, whereas in Perception this requires some awkward use of the Matching question type. Rogō is also soon to have an Area question type, which will assess agreement between a pre-determined polygon on an image and one drawn by a participant: something we have been repeatedly asked for in anatomy (one way such agreement might be scored is sketched below). One nice, but minor, feature of Rogō is that reorganising questions within an assessment is a simple drag-and-drop operation rather than Perception's more clunky delete and re-insert. Despite these few helpful features, Rogō does little to address our original authoring driver for change.
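We do not know how Rogō will implement the Area question type, but one plausible measure of agreement is the overlap between the two polygons divided by their combined area (intersection over union). A minimal sketch, sampling points on the image grid, with all names and figures our own:

    # Illustrative scoring for an "Area" question: agreement between a tutor's
    # polygon and a student's polygon, measured as intersection-over-union by
    # sampling image pixels. Our own sketch, not Rogo's implementation.
    def inside(poly: list[tuple[float, float]], x: float, y: float) -> bool:
        """Ray-casting point-in-polygon test."""
        hit = False
        n = len(poly)
        for i in range(n):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
            if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
                if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                    hit = not hit
        return hit

    def agreement(tutor, student, width=400, height=300, step=4) -> float:
        """Intersection-over-union of two polygons, sampled every `step` px."""
        both = either = 0
        for x in range(0, width, step):
            for y in range(0, height, step):
                t, s = inside(tutor, x, y), inside(student, x, y)
                both += t and s
                either += t or s
        return both / either if either else 0.0

    # e.g. two half-overlapping rectangles score about 1/3:
    print(agreement([(0, 0), (100, 0), (100, 100), (0, 100)],
                    [(50, 0), (150, 0), (150, 100), (50, 100)]))

A threshold on this score (say, 0.7) could then decide whether the student's region counts as correct.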
Table 1. Question types in Rogō and Perception

Question type | Do we use it? | Rogō | Perception
(Knowledge) Matrix | Y | Y (Matrix and Dichotomous) | Y
Extended Matching | Y | Y | Y
Matrix | Y | Y | Y
Multiple Choice | Y | Y | Y
Multiple Response | Y | Y | Y
True/False | Y | Y | Y
Matching | N (only for EMQs with multiple answers per stem) | N (but could be approximated with Labelling) | Y
Select a Blank (select correct option to fill blank from drop-down list) | Y | Y (option in Fill-in-the-Blank) | Y
Multiple MCQ (like EMQ, but different answers available for each stem) | Y | N (can only be done using separate MCQs) | Y
Calculation/Numeric | Y | Y | Y
Fill-in-the-Blank (free-text answer to fill blank) | N | Y | Y
Flash Interface | N | Y | Y
Image Hotspot | N | Y | Y
Labelling/Drag and Drop | N | Y (Labelling) | Y (Drag and Drop)
Likert Scale | Not for exams | Y | Y
Ranking | N | Y | Y
Script Concordance Test | N | Y | N
Text Box/Essay | Not for exams | Y | Y
File Upload | N | N | Y
Adobe Captivate | N | N | Y
Spoken Response | N | N | Y
Survey Matrix (matrix of Likert scales) | N | Y (using Matrix) | Y
Text Match | N | N | Y

Driver 2: Performance and delivery

We have yet to test Rogō with a full cohort of 90 students. Our first test saw two sittings of c.40 students sit an image-rich anatomy paper in Rogō while two sittings of c.40 of their peers sat the same paper in Perception. Performance was observed rather than measured but, starting the students in groups of c.20, Rogō delivered all the papers with only a few seconds' delay. Perception (v4) exhibited very similar performance. This is remarkable, as Rogō was running on a single server while Perception was running on a four-server, load-balanced/clustered setup. UoN started 387 students simultaneously across seven locations in January 2012 using a single, albeit very well-specified, server7. This looks encouraging and could have implications for hardware costs as well.

Rogō currently has no timer, which brings its own issues, but it does mean that it does not suffer the elapsed-time problem in the case of interruptions. It does, however, have the concept of a Fire Escape button, which saves the assessment and blanks the screen during an evacuation. If this could be combined with Perception-style SAYG and timing, we would have a system which exactly meets our requirements in this respect.

Driver 3: Licence, hardware and maintenance costs

Original licences for Perception with 2000 participants and QuestionMark Secure were a significant investment. The Perception Software Support Plan, which provides a technical support service, access to various connectors and free upgrades, is then a major annual cost. In contrast, Rogō requires no licence or support fee. During the life of the current project, support from UoN (outside its own users) is targeted primarily at partner institutions, with support for other groups on a best-effort basis. In the long term, the hope is that support will eventually become mutual within the development and user communities. However, a paid-for software support plan provides a certain level of comfort and defensibility; it remains to be seen whether community support will be adequate or whether, if users are unwilling to use a system without it, UoN will consider offering paid-for support. Assuming that it is possible to load-balance Rogō to improve performance and reliability (not yet tested), it will have a great advantage over Perception in that extra servers will not require extra licence and support fees.
This is currently the major factor limiting our ability to improve performance with Perception, as these costs, with our setup, are well over six times the cost of a typical server. However, these potential savings are, as OSS Watch8 admits, far from clear-cut once development, maintenance, etc. are taken into account.

7 Anthony Brown, pers. comm., 5th May 2012
8 OSS Watch: Benefits of open source code, http://www.osswatch.ac.uk/resources/whoneedssource.xml (accessed 5th May 2012)

Driver 4: Reporting

Perception's reporting is currently more extensive than Rogō's. However, from the point of view of the reporting that we actually use and the needs that are driving change, the two systems are broadly similar, differing in two significant ways.

Unlike Perception, Rogō's reporting is entirely assessment-based; this means that it is not possible to track the performance of questions across assessments, except by hand. Rogō does maintain previous versions of questions and, although these cannot currently be reported on as we would like, the potential is certainly there.

Driver 5: Flexibility

Perception provides a very extensive suite of tools and functionality allowing users to customize look and feel, to integrate with third-party systems such as VLEs, and developers to interrogate and manipulate the Perception database. Rogō, in contrast, because it has been primarily developed for, and used by, a single institution, has so far required less of this sort of functionality and is therefore less flexible in this respect than Perception. Looked at from the point of view of a user hoping to move from Perception, there are two significant examples of this problem: the importance of paper type, and question locking.

Table 2. Effects of paper type on delivery settings in Rogō and Perception (Perc.)

Setting | Rogō Self-Assessment Test | Perc. Quiz | Rogō Progress Test | Perc. Test | Rogō Summative Exam | Perc. Exam
Show feedback | Y | Y (but can configure) | N | User decides | N | N (but can configure)
Allow restart | N | Y (if SAYG) | Y | Y (if SAYG) | Y | Y (if SAYG)
Fire Escape | N | n/a | N | n/a | Y | n/a
Multiple attempts | Y | Y (but can configure) | N | Y (but can configure) | N | Y (but can configure)
Review previous attempts | Y | Y (but can configure) | N | Y (but can configure) | N | Y (but can configure)

The importance of paper type

In both Perception and Rogō, paper type is used to set the defaults applied to an assessment, such as whether feedback is shown. However, the degree to which users can then alter these settings is currently much less flexible in Rogō than in Perception (Table 2). In our pilot assessment, we had to temporarily change the underlying code to allow students to see feedback on a 'Progress Test' (which was used, rather than a 'Self-Assessment Test', to allow restarting in the event of a disaster).

Question locking

Once the scheduled assessment start date/time has passed in Rogō, any questions in that assessment, and the assessment itself, are locked. This quite reasonably prevents any changes to 'delivered' questions. However, it means that even minor edits (e.g. correcting a spelling mistake) cannot be made without copying and re-inserting the question, thus losing any link with the original. Any such change would therefore also remove any future ability to track the question's performance across assessments, except by hand.
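The underlying requirement here is that an edit to a locked question should produce a new version that remains linked to its ancestors, so that statistics can still be aggregated across the whole family. A minimal sketch of such a model (our own illustration, not Rogō's actual schema):

    # Illustration of version-linked questions: editing a locked question
    # yields a successor that keeps a pointer to its parent, so performance
    # can be aggregated across the family. Our own sketch, not Rogo's model.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Question:
        qid: int
        stem: str
        locked: bool = False
        parent: Optional["Question"] = None  # previous version, if any

        def edit(self, new_stem: str, next_id: int) -> "Question":
            if not self.locked:
                self.stem = new_stem
                return self
            # Locked: create a successor rather than mutating delivered content.
            return Question(qid=next_id, stem=new_stem, parent=self)

        def family(self) -> list[int]:
            """All version ids, newest first, for cross-assessment reporting."""
            ids, q = [], self
            while q is not None:
                ids.append(q.qid)
                q = q.parent
            return ids

    v1 = Question(1, "Which nerve supplies the diaphragm?", locked=True)
    v2 = v1.edit("Which nerve innervates the diaphragm?", next_id=2)
    print(v2.family())  # [2, 1] -> aggregate statistics over both versions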
Despite these limits, in the long term Rogō's flexibility is limited only by the time and effort that members of the community are prepared to invest. As a small group of programmers ourselves, we find this very appealing, as we should be able to deliver any functionality required by our institution, including that addressing the reporting and authoring needs driving change. It remains to be seen how effective this community will be and how best to harness its energies, e.g. by developing an architecture which allows plug-ins for new question types, reports, etc. without affecting the core software. Rogō already appears to have a number of the 'trustworthy elements' assessed in the OpenSource Maturity Model9 and we can only hope that these will continue to grow as its open source future unfolds.

9 OpenSource Maturity Model, http://www.qualipso.org/sites/default/files/A6.D1.6.3CMMLIKEMODELFOROSS.pdf (accessed 20th June 2012)

Other notable differences between Rogō and Perception

What participants see

Although the questions themselves are presented in essentially the same way, there are a number of striking differences in participant experience between the two systems (Table 3). Currently, the majority of MSD assessments are delivered as a single scrollable page of questions, display a countdown timer, and are automatically submitted when an individual's time has elapsed. Rogō lacks the latter features as it has no concept of timing. This, in turn, means that the only way to achieve the safety provided by Perception's Save As You Go feature is to split the questions across separate pages, with answers being saved as screens change.

We asked students for feedback on their experience of the Rogō-delivered assessment. The most common response concerned the absence of a timer. Opinion was evenly split on displaying the questions on multiple pages as opposed to all on a single page. One additional feature of Rogō that was popular with students is the ability to click on multiple choice options to strike them through; students liked being able to visually eliminate answers they felt were incorrect.

Table 3. Selected delivery features in Rogō and Perception

Feature | Rogō | Perception
Timer | No | Yes
Auto submission | No | Yes
Ruling out of MCQ options | Yes | No (but coming to OnDemand in August 2012)
Fire Escape button | Yes | No
All questions on one screen | No (for anything other than a quiz, as no SAYG) | Yes

Reliability

Reliability is a key concern in high-stakes online assessment (Gilbert et al., 2009). We gained permission to run summative online assessments on the basis of putting in place a number of contingency plans against hardware or software failure, including having a paper version of the exam ready in case of complete failure. In Perception, we have created a template to produce this automatically; in Rogō, this facility is available out of the box. The requirement for a paper-based contingency obviously has implications for the question types used in formal exams, e.g. audio/video. In online assessment, it takes only one student to be adversely affected by a hardware, software or human failure to seriously undermine confidence in the whole system. Our Perception system was therefore designed to be resilient (redundant hardware, so that an assessment can continue if either of two pairs of servers fails) and to cope with events like machine failures (SAYG).
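The timing safeguard we asked for under Driver 2 belongs here too: rather than a client-side countdown that keeps running through a failure, the server can bank only active time and reissue the correct remainder on resumption. A sketch of that logic, with naming and storage entirely our own rather than either product's:

    # Sketch of interruption-safe exam timing: the server accumulates *active*
    # time only, so a crash or fire alarm does not eat into the allowance.
    # Names and storage are illustrative, not Perception's or Rogo's.
    import time

    class ExamClock:
        def __init__(self, allowance_secs: float):
            self.allowance = allowance_secs
            self.used = 0.0          # active seconds consumed so far
            self.resumed_at = None   # start of the current active period

        def resume(self) -> float:
            """Start (or restart after a failure) and return seconds remaining."""
            self.resumed_at = time.monotonic()
            return self.allowance - self.used

        def pause(self) -> None:
            """Bank elapsed active time, e.g. on save, fire escape, or crash."""
            if self.resumed_at is not None:
                self.used += time.monotonic() - self.resumed_at
                self.resumed_at = None

        def expired(self) -> bool:
            running = (time.monotonic() - self.resumed_at
                       if self.resumed_at is not None else 0.0)
            return self.used + running >= self.allowance

Because `used` is updated only while the paper is actually open, a student who is interrupted resumes with exactly the time they had left.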
We will need to put in place reliability measures of similar or greater strength before attempting summative assessment with Rogō.

Table 4. Selected reliability measures in Rogō and Perception

Reliability measure | Rogō | Perception
Save As You Go | Not automatic, but answers saved on screen change | Yes
Loss-of-connection failsafes | No (but coming in v4.3) | Yes
Load-balancing | Not attempted, but should be possible | Yes

Security

Like reliability, security is of paramount importance in summative assessment (Apampa et al., 2009). Perception comes with its own secure browser, QuestionMark Secure, which, among other things, prevents users from accessing any other programs or websites during an assessment. Rogō lacks this feature, but a number of third-party secure browsers can provide the same functionality, such as Safe Exam Browser10 (which has advantages over QuestionMark Secure in installation and in allowing access to 'permitted applications' such as Notepad or a calculator, although the latter is also available within Rogō). A similar 'locked-down' state can also be achieved with Windows Group Policies.

10 Safe Exam Browser, http://www.safeexambrowser.org/ (accessed 5th May 2012)

Table 5. Selected security measures in Rogō and Perception

Security measure | Rogō | Perception
Secure browser | No, but available as third-party software | Built in (at a cost)
Restriction to rooms (/IP addresses) | Yes | No (but we restrict to ox.ac.uk only)
Time restricted | Yes, but only one instance per assessment | Yes, multiple schedules per assessment

Conclusion

Rogō addresses the hardware cost, performance and flexibility drivers identified in the introduction. On authoring, delivery and reporting, Rogō is currently no better suited to our needs than Perception and, in some areas, less well suited. Software cost is a moot point with OSS. Our first steps with Rogō have been encouraging. The potential finally to be able to deliver and report on assessments as we would like, if we and the Rogō community can deliver the functionality, is very exciting. However, although our assessment needs can probably be generalised, the extent to which other departments and institutions can, or are prepared to, contribute to the code may vary, and this may affect how appealing they find Rogō, despite the other benefits that OSS can provide8. These are early days for us, and we will need to build our confidence with Rogō, assuring ourselves in particular that it meets our reliability and security needs, before we could consider moving our summative assessment from Perception to Rogō.

Acknowledgements

The authors would like to thank Simon Wilkinson and the Rogō team at Nottingham for their support with Rogō, and both the Rogō team and QuestionMark for checking a draft of this paper for factual correctness.

References

Apampa, K. M., Wills, G. B. & Argles, D. (2009). Towards security goals in summative e-assessment security. ICITST-2009, London, UK, 9-12 November 2009.

Case, S. M. & Swanson, D. B. (1993). Extended-matching items: a practical alternative to free response questions. Teaching and Learning in Medicine, 5, 107-115.

Gilbert, L., Gale, V., Warburton, B. & Wills, G. (2009). Report on Summative E-Assessment Quality (REAQ). JISC.

Sieber, V. & Young, D. (2008). Factors associated with the successful introduction of on-line diagnostic, formative and summative assessment in the Medical Sciences Division, University of Oxford.
In Khandia, F. (Ed.), CAA 2008 International Conference, University of Loughborough, http://caaconference.com.