The Andes Intelligent Tutoring System: Five years of evaluations
Kurt VanLehn
Pittsburgh Science of Learning Center (PSLC)
University of Pittsburgh

The physics LearnLab course committee
Andes development
  – Anders Weinstein
  – Brett van de Sande
  – Kurt VanLehn (co-chair)
U.S. Naval Academy
  – Don Treacy (co-chair)
  – Bob Shelby
  – Mary Wintersgill
  – Kay Schulze
Experimenters
  – Scotty Craig
  – Sandy Katz
  – Bob Hausmann
  – Michael Ringenberg
Meet weekly
  – Thursdays, 3:30

Funding
The U.S. Office of Naval Research
  – Cognitive Science Program
The U.S. National Science Foundation
  – Pittsburgh Science of Learning Center

Research question
Given
  – Whole semester of instruction
  – No change to content of course
  – No change to lectures, labs, assignments
  – Standard exams (not designed by experimenters)
Can a homework helper increase learning?

Prior work with answer-only tutoring systems
Web-based homework grading systems
  – E.g., WebAssign, CAPA, Mastering Physics
  – Provide feedback & hints on the answer only
Compared to ordinary paper-based homework
  – Positive benefits
When paper-based homework is collected & graded
  – No benefits (Pascarella, 2002; Dufresne, Mestre & Rath, 2002)
Interpretation
  – Motivating students to do their homework provides benefits, but the answer-only tutoring system provides no additional benefits

Prior work with tutoring systems that give feedback & hints on steps
Lisp Tutor (Corbett, 2001) and many others
  – Same homework problems & text
  – Experimenter’s exams only
  – But not a whole semester (only 5 lessons)
Pump curriculum + Pat tutor (Koedinger et al.)
  – Whole year of high-school algebra
  – Both experimenter’s exams & standard exams
  – But content confounded with tutoring system
Earlier evaluations of Andes
  – First half-semester only
  – Experimenter’s exams only

Why does it matter?
Ideally, an intelligent homework helper…
  – can increase learning without changing the course, and
  – the increase is strong enough to show in final exam
    » The diligent always do well & slackers always do poorly
    » Cramming
If not…
  – still useful if it facilitates content upgrades, and
  – the upgrades cause robust increases in learning

Outline
Andes
Evaluation
Discussion
Next

What kind of physics?
US university introductory physics courses
US high school advanced physics courses
A typical problem: A 2000 kg car at the top of a 20 degree inclined driveway 20 m long slips its parking brake and rolls down. If we ignore friction and drag, what is the magnitude of the velocity of the car when it hits the garage door?

Andes user interface
Read a physics problem
Draw vectors
Type in equations
Type in answer

Andes feedback and hints
“What should I do next?”
“What’s wrong with that?”
Green means correct
Red means incorrect
Dialogue & hints

Major challenges
Dealing with equations
  – Giving red/green feedback
  – Undoing algebraic combination
    » For “what should I do next?”
  – Analyzing errors in equations
Scale-up
  – 13 chapters, 500 textbook pages
  – 350+ problems
  – 300+ principles

Outline
Andes
Evaluation
Next
  – Method
  – Main results
  – Which students benefited?
  – Which knowledge benefited?
  – Interpretation of results
Discussion

Evaluations of Andes at the US Naval Academy
Fall semesters 2000, 2001, 2002 & 2003
Only the homework modality was varied: Andes vs.
paper-based
  – Same textbook
  – Similar lectures, labs, recitations
  – Similar homework problems
  – Same exams
Students were motivated to do paper-based homework
  – Either collected and graded
  – Or 1 homework problem on each quiz

Exams
Midterm exam
  – 1 hour, 4 problems
  – Scored on derivation & answer
    » Drawings (30%)
    » Variable definitions (20%)
    » Equations (40%)
    » Answers (10%)
Final exam
  – 3 hours, 50 problems
  – Multiple choice
Next

Checking prior competence of Andes and control students
Grade-point averages equal
Distribution of majors equal
  – Engineering majors vs.
  – Science majors vs.
  – Other majors

Midterm exam results
(All differences reliable, p < .01)
[Bar chart: midterm exam scores, Control vs. Andes, for 2000, 2001, 2002 and 2003]
How to calculate effect size?

Calculating effect size over 4 different midterm exams
Normalize each score
  z_score(student) = [raw_score(student) – mean(exam)] / standard_deviation(exam)
For each condition, pool z-scores across years
Effect size = 0.61

Final exam
Exam covers 100% of course, but Andes didn’t
  – Does now
Use 2003 exam only; Andes covered 70%
  – 89 Andes students
  – 823 non-Andes students

Prior competence not equal
Majors not equally distributed
  – Andes group had more engineering majors
GPAs not equally distributed
  – Andes group had marginally higher GPAs
Factor out prior competence statistically
  – For each major, regress final exam score on GPA
  – Residual_score(student) = raw_score(student) – predicted_score(student’s major, student’s GPA)

Final exam results
Difference is reliable (p = 0.028)
[Bar chart: final exam results, Control vs. Andes]
Effect size = 0.25

Outline
Andes
Evaluation
  – Method
  – Main results
  – Which students benefited?
  – Which knowledge benefited?
  – Interpretation of results
Discussion
Next

Benefits same regardless of GPA
[Scatter plot: z-score on exam vs. GPA (1 to 4), Andes and controls, with a linear fit for each group]
  – Andes: y = 0.9473x - 2.4138, R² = 0.2882
  – Controls: y = 0.7956x - 2.5202, R² = 0.2048

Benefits varied by major on final exam but not on midterm exam
[Bar chart, two panels]
  – Midterm exam results: Engineers, Scientists, Others; Control vs. Andes
  – Final exam results: Engineers, Scientists, Others; Non-Andes vs. Andes

Outline
Andes
Evaluation
  – Method
  – Main results
  – Which students benefited?
  – What knowledge benefited?
  – Interpretation of results
Discussion
Next

Effect sizes for subscores of midterm exam
[Bar chart: effect size for each midterm subscore: Drawings, Variables, Equations, Answers]

Interpretation of results
[Diagram labels: Problem; Diagram & variables; Equations; Answer; Andes (twice); Prior physics; Prior math & physics]
Engineering & science majors learned the red path and prefer it
  – Andes does not increase their final exam scores
They use blue path on the midterm
  – Andes increases their midterm exam scores
Other majors do not have red path, so they use the blue path on both exams
  – Andes increases both exams’ scores
On midterm exams, subscores measure components of blue path separately
  – Biggest benefit for diagrams & variables
  – Smaller on equations; none on answer

Summary of results
Main result: Andes provides benefits
  – Midterm exam effect size: 0.61
  – Final exam effect size: 0.25
Andes helps students learn conceptual skills
  – Effect sizes on conceptual subscores: 1.21 & 0.69
  – Effect sizes on calculational subscores: 0.11 & -0.08
Some students appear to have a non-conceptual method for solving problems
  – Competes with the conceptual method taught by Andes
  – They use it on the (answer-only) final exam
  – This dilutes the benefit of Andes on final exam

Outline
Andes
Evaluation
Discussion
  – Andes compared to others
  – Why is Andes effective?
Next

Effect sizes on experimenter’s & standard exams of 3 tutoring systems
[Bar chart: effect sizes for Lisp, Pump+Pat and Andes on three exam types: Experimenter's 1, Experimenter's 2, Standard]

Interpretation of the comparison with other tutors
Andes is about the same as other tutoring systems that give feedback and hints on steps
Perhaps the Pump+Pat benefits are due solely to the tutoring system and not the content upgrade

Summary: Studies of homework helpers when content is controlled
[Diagram: four homework conditions in sequence, with the benefit of each step to the next]
Ordinary paper-based homework
  ↓ Large benefits
Motivated paper-based homework
  ↓ No benefits
Feedback & hints on answer only
  ↓ Large benefits
Feedback & hints on steps

Outline
Andes
Evaluation
Discussion
  – Andes compared to others
  – Why is feedback & hints on steps so effective?
Next

Hypothesis: Andes increases the number of successful knowledge events
Without feedback & hints on steps, students skip them
  – Guess
  – Copy similar example’s step & edit
  – Copy & edit a higher goal’s outcome
Doing a step correctly requires
  – Figuring out how the first time (sense-making)
  – Figuring out why the second & third times (refinement)
  – Recalling why & how the other times (fluency building)
This increases number of successful knowledge events
  – Wherein a student constructs or applies a knowledge component

Thanks for your attention!
At www.andes.pitt.edu
  – Download stand-alone version of Andes
  – Try OLI version of Andes
  – Download papers on Andes
Sorry, but Andes only runs on Windows
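
Supplementary sketch: pooling z-scores across exams
The midterm effect size reported above was obtained by normalizing each student's score within an exam and then pooling the z-scores of each condition across the four years. The slides do not say how the pooled z-scores were turned into a single number, so the sketch below assumes one common choice: the difference between the two pooled means divided by the standard deviation of the pooled control z-scores. The function names `z_scores` and `pooled_effect_size` are hypothetical, and the scores in the usage note are invented for illustration; they are not the study's data.

```python
from statistics import mean, pstdev, stdev


def z_scores(raw_scores):
    """Normalize one exam: (raw_score - mean(exam)) / standard_deviation(exam)."""
    exam_mean = mean(raw_scores)
    exam_sd = pstdev(raw_scores)  # SD over everyone who took this exam
    return [(score - exam_mean) / exam_sd for score in raw_scores]


def pooled_effect_size(exams):
    """Pool z-scores per condition across exams and compare the conditions.

    `exams` is a list of (andes_scores, control_scores) pairs, one per year.
    ASSUMPTION (not stated in the slides): effect size is the difference of
    the pooled means divided by the SD of the pooled control z-scores.
    """
    andes_z, control_z = [], []
    for andes, control in exams:
        # Normalize over all students who took this year's exam.
        z = z_scores(list(andes) + list(control))
        andes_z.extend(z[:len(andes)])
        control_z.extend(z[len(andes):])
    return (mean(andes_z) - mean(control_z)) / stdev(control_z)
```

Usage note: `pooled_effect_size([([70, 80], [60, 70])])` compares one invented exam with two students per condition. Normalizing within each exam before pooling is what lets four midterms of differing difficulty be combined into one comparison.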