Breckenridge April 2008

Certainty-Based Marking (CBM) [aka Confidence-Based Marking]
rewarding good judgment of what is or is not reliable
Tony Gardner-Medwin
Physiology, University College London
www.ucl.ac.uk/LAPT

Certainty-Based Marking
How should we reward students' knowledge?
Teacher leadership & encouragement
Praise & criticism
"I was gratified to be able to answer promptly, and I did! - I said I didn't know." - Mark Twain
Critical Adjuncts: Self-assessment for learning; Exam / Test
Assessment of:
• Writing
• Knowledge
• Performance
• Application
through interaction with: Peers, Computers

How should we reward students' knowledge?
[ Mark = Grade = Score for a single question ]
1. How is knowledge related to probability?
2. What is Certainty-Based Marking?
3. How easy is it to use CBM?
4. What are the learning benefits?
5. What are the assessment benefits?
6. Why doesn't everybody use it?

Ordinary words we use to describe Knowledge
knowledge, uncertainty, don't know, misconception, delusion (in order of increasing "ignorance")
Decreasing certainty about what is true. Increasing certainty about something false.
• Knowledge is a function of certainty (confidence, degree of belief)
• There are states a lot worse than acknowledged ignorance
"It's not ignorance does so much damage; - it's knowin' so derned much that ain't so." - attributed to Josh Billings

Will it snow next weekend?
Does a (good) weather forecaster have knowledge? - obviously yes, but expressed through a probability
How can you measure and reward this knowledge? - the origin of CBM >100 years ago.

Does insulin raise blood glucose levels?
Similar, even though the Q is not about a probability.
- the probability is your certainty that your answer is right
The key is to have a "proper" or "motivating" reward scheme, which ensures that the person does best by expressing their actual level of uncertainty.

What is CBM?
Each answer is marked according to the student’s certainty that their answer is correct.

Degree of certainty:   C=1 (low)   C=2 (mid)   C=3 (high)   No reply
Mark if correct:       1           2           3            0
Penalty if wrong:      0           -2          -6           0

Does insulin raise blood glucose?
If you're sure, obviously you're best with C=3, but you must convince yourself there is a low risk of a penalty.
If unsure, you gain by acknowledging this (with C=1) and avoiding the risk of a penalty.

Which Certainty Level is Best?
[Figure: mark expected on average (y-axis, -6 to 3) against how likely your answer is to be correct (x-axis, 0% to 100%), with one line each for C=1 (low), C=2 (mid), C=3 (high) and No Reply; 67% and 80% are marked, and a "guessing range" is labelled.]

How well do students discriminate reliability?

Thinking about justification and uncertainty stimulates understanding
[Diagram labels: Nuggets of knowledge; Networks of understanding; Evidence; Inference; Confidence (degree of belief); Choice]
Confidence-based marking places greater demands on justification, stimulating understanding.
To understand = to link correctly the facts that bear on an issue.

Using CBM
1. With software from UCL: www.ucl.ac.uk/LAPT
2. Offline with a CD - ditto
3. With Moodle - full integration: work in progress
4. With other VLEs - the IDEAL !!
5.
Secure exams, with Optical Mark Reader (OMR) Cards [Speedwell]

Student Learning: Principles they readily understand
• You need to know the reliability of your knowledge to use it
• Confident errors are serious, requiring attention to explanations
• Expressing uncertainty when you are uncertain is a good thing
• Confidence is about understanding why things cannot be otherwise, not about personality
• If over- or under-confident, you must calibrate through practice
• Reflection and justification are essential study habits
In evaluation surveys, a majority of students have always said they like CBM, finding it useful and fair. Early on they asked to include it in exams, and recently at UCL they voted 52% : 30% to retain it.

Perhaps we don't need to test knowledge, now we have Google?
It's so easy to find stuff out, why test knowledge?
But cheap knowledge puts an absolute premium on:
1) Identifying misconceptions - "unknown unknowns" ... the things you will get wrong and not Google! - and distinguishing them from "don't knows"
2) Judging reliability and uncertainty correctly ... setting a threshold for seeking help. If checking is expensive, you can only pick your best choice and "go for it"
These lessons are core things that CBM teaches.

Student Assessment
CBM quite closely follows the ideal ignorance measure
[Figure: mark assigned (y-axis, 2 to -8) against lack of knowledge in bits (x-axis, 0 to 4).]
Lack of knowledge [bits] = -log2(probability assigned to correct choice)
The student loses about 3 marks per 'bit' of ignorance - up to a maximum of 3 bits.

CBM increases the reliability of exam data
'Reliability' indicates to what extent a score measures something about the student's ability, as opposed to 'luck' or chance.

CBM increases the effective test length
With increased 'Reliability' you don't need so many exam questions to get data of equal quality.
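
The arithmetic behind the mark scheme given earlier (marks of 1, 2, 3 if correct; penalties of 0, -2, -6 if wrong) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical and are not taken from the LAPT software. It computes the mark expected on average at each certainty level for a given probability of being right, and solves for the probabilities at which two levels pay equally. These come out at 2/3 and 0.8, in line with the 67% and 80% marked on the "Which Certainty Level is Best?" chart. The last function simply applies the -log2 ignorance measure quoted above.

```python
import math

# CBM mark scheme as given on the slides:
# certainty level -> (mark if correct, penalty if wrong)
CBM_SCHEME = {1: (1, 0), 2: (2, -2), 3: (3, -6)}


def expected_mark(level, p):
    """Average mark at a certainty level when the answer is right with probability p."""
    right, wrong = CBM_SCHEME[level]
    return p * right + (1 - p) * wrong


def best_level(p):
    """Certainty level with the highest expected mark for probability p."""
    return max(CBM_SCHEME, key=lambda level: expected_mark(level, p))


def crossover(lower, higher):
    """Probability at which two certainty levels give the same expected mark.

    Solves p*r1 + (1-p)*w1 = p*r2 + (1-p)*w2 for p.
    """
    r1, w1 = CBM_SCHEME[lower]
    r2, w2 = CBM_SCHEME[higher]
    return (w2 - w1) / ((r1 - w1) - (r2 - w2))


def ignorance_bits(p):
    """Lack of knowledge in bits: -log2 of the probability given to the correct choice."""
    return -math.log2(p)


if __name__ == "__main__":
    for p in (0.5, 0.75, 0.9):
        marks = {c: round(expected_mark(c, p), 2) for c in CBM_SCHEME}
        print(f"p={p:.2f}  expected marks={marks}  best level: C={best_level(p)}")
    print("C=1/C=2 crossover:", round(crossover(1, 2), 3))
    print("C=2/C=3 crossover:", round(crossover(2, 3), 3))
```

Because each level is best only within its own probability band, a student maximises the expected mark by reporting certainty honestly, which is what the slides mean by a "proper" or "motivating" scheme.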
CBM data is a more valid measure of ability
'Validity' means it measures what you want, rather than just something easily measured.

Why doesn't everybody already use CBM? - a puzzle
• Enthusiasm was exhausted before the age of 'online'
• Some CBM methods were complex, opaque or non-motivating
• Reluctance to treat certainty as integral to knowledge
• Mistaken worries about 'personality bias'
• Under-rating of self-assessment & practice as learning tools
• Worry that CBM would need new questions
• Adaptation of procedures for standard-setting
• Inertia and vested interests
"THE IDEAL" needs CBM !

SUMMARY
Why CBM?
• Get students to think more carefully
• Reward recognition of uncertainty, either personal or in a group
• Highlight misconceptions
• Engage students more - the game element of CBM
• Encourage criticism of Qs (intolerance of ambiguity or looseness)
• In general: enhance self-assessment as a learning experience
NB All of the above arise with little or no practice with CBM. The following do require practice:
• More searching diagnostic data
• More valid and reliable assessment data
(But NB with CBM you have conventional assessment data too.)

A few of the names associated with confidence testing in education
• Andrew Ahlgren
• Jim Bruno
• Robert Ebel
• Jack Good
• Kate Hevner
• Darwin Hunt
• Dieudonné Leclercq
• Emir Shuford
London Colleagues:
• Mike Gahan
• David Bender
• Nancy Curtin
We fail if we mark a lucky guess as if it were knowledge.
We fail if we mark misconceptions as no worse than ignorance.
www.ucl.ac.uk/lapt

Lessons from experience with CBM
• Practice is needed before use in exams
• Exams should re-use questions from an open database only very sparingly
• Over-confidence and diffidence are both unhealthy traits that can be moderated by practice to achieve good calibration
• With multi-option questions, students tend (at least initially) to over-estimate reliability
• Standard setting - it is easy (but important!)
to scale CBM marks to match familiar scales based on number correct.

Some Questions about CBM !
• Are there problems using it?
• Why doesn't my VLE support CBM?
• Do students need practice?
• Isn't computer marked assessment just factual?
• Does CBM increase retention?
• Do I need new questions?
• What are the best Q types?
• What about school education?
• Is it relevant to my subject, where opinions differ?
• Isn't it bad to encourage guessing?
• What if my only assessments are exams?
• How do I convince an exam board?
• Isn't it right/wrong that really matters?

CBM increases the reliability of exam data with True/False Questions
'Reliability' indicates to what extent a score measures something about the student's ability, as opposed to 'luck' or chance.
[Figure: Cronbach alpha (reliability) using CBM (y-axis, 80% to 95%) against Cronbach alpha using % correct (x-axis, 80% to 95%).]
To achieve these increases using only % correct would have required on average 58% more questions.

Known Knowns ... things we know that we know.
Known Unknowns ... things that we know that we don't know.
Unknown Unknowns ... things we do not know we don't know.
- Donald Rumsfeld

When you know a thing, to hold that you know it. And when you do not know a thing, to allow that you do not know it. This is knowledge.
- Confucius