Andes Tutoring: Freedom, Support, and Accelerated Learning as Students Solve Complex Physics Problems
Kurt VanLehn & Brett van de Sande
School of Computing, Informatics and Decision Engineering
Arizona State University
USN

Andes development team
Developers
  – Brett van de Sande
Instructors/designers
  – Bob Shelby
  – Don Treacy
Experimenters
  – Scotty Craig
  – Bob Hausmann
  – Sandy Katz
  – Tim Nokes
  – Michael Ringenberg
Instructors
  4-year colleges
    Jim Culbertson – Arizona State University, AZ
    John Fontinella – US Naval Academy, MD
    David Guerra – St. Anselm College, NH
    Troy Hacker – US Air Force Academy, CO
    Andrew Heckler – Ohio State University, OH
    Gerd Kortemeyer – Michigan State University
    Ted McClanahan – US Naval Academy, MD
    Mary Wintersgill – US Naval Academy, MD
  2-year colleges
    Tom Wilbur – Anne Arundel Com. College, MD
  High schools
    Sophia Gersham – Watchung Hills Reg. HS, NJ
    Paul Perkins – Bellevue Christian School, WA
    David Richardson – Packer Academy, NY

Outline
Goals of the Andes project (next)
Andes – the core
Andes – the surroundings
Andes – technology
Evaluation
Why does Andes succeed?

Why physics?
Required for the other sciences
Major source of attrition
Well-studied in cognitive science
Other courses are similar

A typical physics course
Repeat:
  – Read chapter
  – Attend lectures
  – Solve problems
  – Do lab
  – Review for exam
  – Take chapter exam
Tutoring needed
Take finals: Standardized & conceptual

Why focus on problem solving?
Students spend most of their time on problem solving
Frustrating; may cause attrition
Bad habits can develop
  – “Symbol pushing” instead of deep understanding
Done right, it can elicit deep, conceptual understanding
Professional human tutors focus on it

What do professional tutors do?
Pick a good problem (Andes could, but doesn’t)
  – Neither trivially easy nor impossibly difficult
  – Problem targets a weakness of the student
Help the student solve it (Andes does)
  – Let student try each step
  – Give immediate feedback if step is wrong
  – Give hints sparingly
Reflective debriefing (under development)
  – What were main principles?
  – What did you learn?

How to evaluate Andes?
Repeat:
  – Read chapter
  – Attend lectures
  – Solve problems
  – Do lab
  – Review for exam
  – Take chapter exam
Take finals: Standardized & Conceptual
Comparison conditions (just these 2 as of today)
  1. Paper & human graders
  2. Paper & grading service
  3. Andes
  4. Professional human tutor

Main goals: Andes should be a “workbook” where:
Instructors select & assign Andes problems
  – Need lots of problems, covering all textbooks
Students solve problems on Andes, not paper
  – Getting immediate feedback and hints
Students learn more, compared to paper

More goals
Instructors agree with all of Andes’ advice
Instructors no longer need to grade homework
Students prefer Andes to paper
Andes is as effective as human tutors

Outline
Goals of the Andes project
Andes – the core (next)
Andes – the surroundings
Andes – technology
Evaluation
Why does Andes succeed?

A typical physics problem
A 2000 kg car at the top of a 20º inclined driveway 20 m long slips its parking brake and rolls down. If we ignore friction and drag, what is the magnitude of the velocity of the car when it hits the garage door?
University introductory physics courses
High school physics courses

Andes user interface
Read a physics problem
Draw vectors
Type in equations
Type in answer

Andes feedback and hints
“What should I do next?”
“What’s wrong with that?”
Green means correct
Red means incorrect
Dialogue & hints

Frequently asked questions (slide 1 of 3)
Why is Andes sometimes called an intelligent tutoring system?
  – Tutors usually replace classes; Andes doesn’t.
  – Andes should be called a “step-based homework helper”
Is Andes the same as web-based grading services such as WebAssign, Mastering Physics & LON-CAPA?
  – They give feedback & hints on answers only
  – Andes gives feedback & hints on steps leading to answers

Most existing tutoring systems and web-based grading services accept only the answer
[Figure: angle diagram with labeled angles 30°, 40°, 45°, x°, y°, u°, z°, w°]
What is the value of angle w?
Answer: w = 25

With step-based systems (e.g., Andes), students enter steps leading up to the answer
[Figure: same angle diagram]
What is the value of angle w?
Step: 40+30+x=180
Step: x=110
Step: x=z
Step: w+45+z=180
Step: w+45+110=180
Step: w=180-155
Step (answer): w=25

Step-based homework helpers usually give immediate feedback on each step
[Figure: same angle diagram; the student enters w = 40 and a dialog box reads “OOPS!” with an OK button]
What is the value of angle w?

Hints start general…
[Figure: same angle diagram; the student has entered w = 40]
Hint: Lines that look parallel often are not.

Hints become more specific.
[Figure: same angle diagram; the student has entered w = 40]
Hint: Try summing the angles of the triangle that includes angle w. (see pg. 212)

Usually, the last hint tells the student exactly what to enter.
[Figure: same angle diagram; the student has entered w = 40]
Hint: You should apply the triangle sum rule by entering 45+z+w=180.

Frequently asked questions (slide 2 of 3)
Does Andes require students to do all the steps?
  – No. But skipping conceptual steps lowers the score.
Does it require doing steps in a particular order?
  – No, except that variables must be defined before being used in equations
  – The only way to define a vector component is to draw the vector
Are there correct solutions that Andes doesn’t accept?
  – No (we hope).

Frequently asked questions (slide 3 of 3)
Why does Andes focus on problem solving? Problem solving doesn’t improve conceptual understanding [sic]. Instruction really should focus on __________ instead of problem-solving.
  – Andes requires more conceptual steps than paper.
  – Reflective debriefing is being added
  – Perhaps instructors should let Andes handle problem-solving so they can focus on __________.

Outline
Goals of the Andes project
Andes – the core
Andes – the surroundings (next)
Andes – technology
Evaluation
Why does Andes succeed?

Two methods for accessing Andes problems
Stand-alone
  – Downloads as a standard Windows application
Open-Learning Initiative (Web-based LMS)

Standalone opening screen is a menu of physics topics (approx. chapters)

After clicking on “Translational dynamics,” get menu of problems

Two methods for accessing Andes problems
Stand-alone
  – Downloads as a standard Windows application
Open-Learning Initiative (Web-based LMS) (next)
  – Free & open usage
    » OLI keeps no record of student’s usage
  – Authenticated member of a registered class
    » OLI keeps a grade book for the instructor

OLI top level screen
[Screenshot label: Andes]

OLI top level of Andes is a menu of topics (approx. chapters)

After clicking on “Translational Dynamics” module, get menu of problems

OLI grade book
Looks/acts like a spreadsheet
  – One row per student
  – One column per problem
  – Cell has student’s score on the problem
Can export to Excel or database
Clicking a cell displays student’s solution in instructor’s Andes.
  – Can check student’s work
  – Good for office hours

Outline
Goals of the Andes project
Andes – the core
Andes – the surroundings
Andes – technology (next)
Evaluation
Why does Andes succeed?

The major components of an ITS
[Diagram: User interface; Problem to be solved → Expert → All correct steps in all orders → Helper → Response pattern for each student step → Assessor → P(mastery) for each knowledge component]

The expert’s computation
Expert can be authors, students or an expert system
Solve the problem in all acceptable ways
Record steps taken
Record knowledge components used at each step
[Diagram: ITS components, as above]
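
The expert's recording task (steps taken, plus the knowledge components used at each step) can be pictured with a small data structure. The following is only a sketch under stated assumptions: the `Step` class, the `record` function, and the knowledge-component labels are hypothetical names chosen for illustration, not Andes code, and the steps are borrowed from the angle example on the earlier slides. The "equal-angles" label for the x=z step is a guess at what that step exercises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One correct step, tagged with the knowledge components it uses."""
    entry: str                       # what the student would enter
    knowledge_components: frozenset  # illustrative labels only


def record(solutions):
    """Collect distinct steps and index them by knowledge component."""
    steps = []    # distinct correct steps, in first-seen order
    used_by = {}  # knowledge component -> entries that exercise it
    for solution in solutions:
        for step in solution:
            if step not in steps:
                steps.append(step)
            for kc in step.knowledge_components:
                used_by.setdefault(kc, set()).add(step.entry)
    return steps, used_by


# Hypothetical tagging of three steps from the angle example.
s1 = Step("40+30+x=180", frozenset({"triangle-sum"}))
s2 = Step("x=z", frozenset({"equal-angles"}))
s3 = Step("w+45+z=180", frozenset({"triangle-sum"}))

# The same steps entered in two different orders.
SOLUTIONS = [[s1, s2, s3], [s3, s2, s1]]

steps, used_by = record(SOLUTIONS)
```

Keeping the knowledge-component tags on each recorded step is what would later let a helper or assessor relate a student's entry to the principle it exercises.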
Andes uses an expert system
Knowledge base organized by “problem solving methods”
  – One per major physics principle
  – Often called “principle schemas”
  – Approximately 100 principles covered in 1-year course
Can solve more than 500 physics problems
For each, finds all acceptable solutions
  – Merges them into a solution graph
Pre-generates solution graphs & saves on disk
  – Allows regression testing (compare new to old)
  – Allows instructors to inspect

Problem solving method for Newton’s second law
To apply Newton’s second law to <body> at <time> along <axis>:
Draw a free-body diagram for <body> at <time> including <axis>
  – For each force on <body> at <time>, draw it.
  – Draw the acceleration of <body> at <time>
Write F=m*a in terms of components along <axis>
For each vector (i.e., acceleration and each force), write a projection equation for the vector along <axis>
For each minor principle of the form <force magnitude> = <expression> where the force is one of the ones on <body> at <time>, write the minor principle’s equation.

Problem & its solution graph
Problem: A 2000 kg car at the top of a 20º inclined driveway 20 m long slips its parking brake and rolls down. If we ignore friction and drag, what is the magnitude of the velocity of the car when it hits the garage door?
Solution graph nodes:
  Draw body for car
  Draw coord. axes @ 20 deg
  Define given quantities: d = 20 m; m = 2000 kg
  Apply conservation of energy: vf^2 = 2*g*d*sin(20 deg)
  Apply translational kinematics: vf^2 = 2*a*d
  Apply Newton’s second law: a = -g*cos(200 deg)
    1. Draw free-body diagram
       • Draw weight force
       • Draw normal force
       • Draw acceleration
    2. Apply F=m*a along x-axis
    3. Project vectors onto x-axis
    4. Apply weight law W=m*g
  Solve equations for final velocity: vf = 11.59 m/s
  Enter answer: 11.59 m/s

4 popular designs for the Expert
Hand-author all possible solutions per problem
  – AutoTutor, CTAT
Rule-based AI problem solver + problem → all possible solutions
  – Andes, Cognitive tutors
Hand-author one solution & use constraints to generalize to all possible solutions
  – Constraint-based tutors e.g., SQL Tutor
Given log files from students, induce shortest/best paths
  – iLisp, Barnes tutor
[Diagram: Problem to be solved → Expert → All correct steps in all orders]

The helper’s computation
When the student enters a step, match it to a correct step
Give feedback & hints as necessary
Record response pattern
[Diagram: ITS components, as above]

How to tell if student’s equation is correct?
Color by numbers algorithm
Given student’s equation “Fw_x + m*g*sin(20 deg) = 0”
Substitute values from solution point
  – Fw_x = −6704 N
  – g = 9.8 m/s^2
  – m = 2000 kg
Check arithmetic.
  – If it checks, equation is correct
See paper for dealing with variables that don’t have values in the solution point

When the student needs a hint, how to choose next step?
Which solution branch is student probably following?
First PSM along path that is not finished?
First step inside the PSM that is not yet done?
But how to tell which equation-writing steps have been done already?

Matching equations via the indy check algorithm
Student enters “Fw_x + m*g*sin(20 deg) = 0”
  – Row S below
Which equations in solution were combined?
  – Rows A, B, C, D
  – Because S’s gradient is a linear combination of their gradients

Row  Function f              ∂f/∂m      ∂f/∂g       ∂f/∂Fw     ∂f/∂Fw_x   Gradient
A    Fw_x – Fw*cos(250º)     0          0           –cos250º   1          (0, 0, 0.342, 1)
B    Fw – mc*g               –g         –mc         1          0          (–9.8, –2000, 1, 0)
C    mc – 2000               1          0           0          0          (1, 0, 0, 0)
D    g – 9.8                 0          1           0          0          (0, 1, 0, 0)
S    Fw_x + mc*g*sin(20º)    g*sin20º   mc*sin20º   0          1          (3.352, 684, 0, 1)

3 main design issues for the Helper
Matching student’s step to correct steps
  – Natural language: Use LSA, keywords, Atlas…
  – Math expressions: Substitute numbers for variables
  – Physical actions: Fuzzy, Bayesian
Handling pedagogically important student errors
Managing the student-tutor dialogue
  – Immediate feedback + hint sequences
  – Delayed feedback + student or tutor controlled debriefing
  – Adaptive, especially decision theoretic & fading
[Diagram: All correct steps in all orders → Helper → Response patterns for each student step]

The assessor’s computation
Given
  – Response patterns for each step taken by the student
  – Old P(mastery) for each knowledge component
Calculate
  – New P(mastery)
Currently, Andes doesn’t do such assessment
[Diagram: ITS components, as above]

Where’s the scale-up bottleneck?
[Chart: code size vs. number of problems, with curves for Expert, User interface, Helper, Assessor]
Other scaling up issues = same as other reforms
  – Coordination with curriculum & standards
  – Teacher buy-in and training
  – Support
  – Etc…

Implementation: Summary
Expert
  – Expert system, not humans
Helper
  – Matching student equations is challenging
Assessor
  – Dynamic Bayesian networks
[Diagram: ITS components, as above]

Outline
Goals of the Andes project
Andes – the core
Andes – the surroundings
Andes – technology
Evaluation (next)
Why does Andes succeed?
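
The two equation checks from the Helper slides (substituting solution-point values, and testing whether the student's gradient is a linear combination of the solution gradients) can be sketched in a few lines. This is an illustration under stated assumptions, not the Andes implementation: the function names are hypothetical, the student entry is read as f = 0 with f taken from row S of the gradient table, Fw_x is computed from row A rather than taken as a rounded value, and the combination weights are found by solving a square system, which only works when there are as many solution equations as variables, as in that table.

```python
import math

G, MC = 9.8, 2000.0
SIN20 = math.sin(math.radians(20))
COS250 = math.cos(math.radians(250))

# Solution point: every variable has a numeric value.
SOLUTION_POINT = {"mc": MC, "g": G, "Fw": MC * G, "Fw_x": MC * G * COS250}


def student_f(v):
    """Student entry written as f = 0: Fw_x + mc*g*sin(20 deg)."""
    return v["Fw_x"] + v["mc"] * v["g"] * SIN20


def checks_at_solution_point(f, point, rel_tol=1e-6):
    """Substitute the solution-point values and check the arithmetic."""
    scale = max(abs(x) for x in point.values())
    return abs(f(point)) <= rel_tol * scale


def solve_square(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("solution gradients are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Gradients in variable order (mc, g, Fw, Fw_x), as in the table.
SOLUTION_GRADIENTS = {
    "A": [0.0, 0.0, -COS250, 1.0],
    "B": [-G, -MC, 1.0, 0.0],
    "C": [1.0, 0.0, 0.0, 0.0],
    "D": [0.0, 1.0, 0.0, 0.0],
}
STUDENT_GRADIENT = [G * SIN20, MC * SIN20, 0.0, 1.0]


def combination_weights(solution_gradients, student_gradient):
    """Find weights w with sum(w[k] * gradient[k]) == student gradient."""
    names = list(solution_gradients)
    matrix = [[solution_gradients[name][i] for name in names]
              for i in range(len(student_gradient))]
    return dict(zip(names, solve_square(matrix, student_gradient)))


is_correct = checks_at_solution_point(student_f, SOLUTION_POINT)
weights = combination_weights(SOLUTION_GRADIENTS, STUDENT_GRADIENT)
```

A row whose weight comes out near zero did not contribute to the student's equation. A general version would need a least-squares or rank test instead of a square solve.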
Prior work with answer-only homework helpers
Compared to ordinary paper-based homework
  – Modest effect size: 0.42 (Kulik et al., 1983)
Compared to paper-based homework that is collected & graded
  – No benefits (Pascarella, 2002; Dufresne, Mestre & Rath, 2002)
Interpretation
  – Motivating students to do their homework provides benefits
  – But answer-only feedback & hints provide no additional benefits

Prior work with step-based homework helpers
Lisp Tutor (Corbett, 2001) and many others
  – Same homework problems & text
  – Experimenter’s exams only
  – Not a whole semester (only 5 lessons)
Cognitive Tutors (Koedinger et al; Carnegie Learning)
  – Whole year of high-school algebra, geometry
  – Both experimenter’s exams & standard exams
  – Curriculum confounded with tutoring system

Evaluation of Andes at the US Naval Academy
Fall semesters 2000, 2001, 2002 & 2003
Only the homework modality was varied: Andes vs. paper-based
  – Same textbook
  – Similar lectures, labs, recitations
  – Similar homework problems
  – Same exams
Students were motivated to do paper-based homework
  – Either collected and graded
  – Or 1 homework problem on each quiz

Exams
Midterm exam
  – 1 hour, 4 problems
  – Scored on derivation & answer
    » Drawings (30%)
    » Variable definitions (20%)
    » Equations (40%)
    » Answers (10%)
Final exam
  – 3 hours, 50 problems
  – Multiple choice

Subjects; non-random assignment
             2000   2001   2002   2003
N Andes       140    129     93     93
N control     135     44     53     44
Prior competence equal, all 4 years
  – Grade-point averages equal
  – Distribution of majors equal

Midterm exam results
(All differences reliable, p < .01)
[Bar chart: midterm exam scores (axis 50 to 75), Control vs. Andes, for 2000, 2001, 2002, 2003]
Effect size: 0.61

Final exam: Methodological details
Andes coverage of the course increased to 70% by 2003, so used only that year’s final exam
Non-random sample
  – 89 Andes students (3 sections)
  – 823 non-Andes students (rest of course)
  – GPAs, Majors reliably different
Regressed out incoming GPA, Major

Final exam results
Difference is reliable (p = 0.028)
[Bar chart: Control vs. Andes]
Effect size = 0.25

Effect sizes for subscores of midterm exam
[Bar chart: effect sizes for Drawings, Variables, Equations, Answers]
The more conceptual the subscore, the greater the benefit

Benefits same regardless of GPA
[Scatter plot: Z-score on exam vs. GPA (1 to 4), Andes and Controls, with linear fits]
Andes: y = 0.9473x - 2.4138, R^2 = 0.2882
Controls: y = 0.7956x - 2.5202, R^2 = 0.2048

Benefits varied by major on final exam but not on midterm exam
[Bar charts, two panels: Final exam results and Midterm exam results; groups: Engineers, Scientists, Others, Overall; series: Control/Non-Andes vs. Andes]

Interpretation of results
Engineering & science majors knew red path & preferred it for answer-only
  – Andes didn’t affect their final exam scores
Other majors did not have red path, so they used the blue path on answer-only
  – Andes increased their final exam scores
Everyone used blue path on midterms
  – Andes increased everyone’s midterm exam scores
  – Biggest benefit for diagrams & variables
  – Smaller on equations; none on answer
[Diagram: paths from Problem through Diagram & variables and Equations to Answer; labels: Andes, Andes, Prior math & physics]

Summary of results
Main result: Replacing graded paper homework with Andes provides benefits
  – Midterm exam effect size: 0.61
  – Final exam effect size: 0.25
Andes helps students learn conceptual skills
  – Effect sizes on conceptual subscores: 1.21 & 0.69
  – Effect sizes on calculational subscores: 0.11 & -0.08
Engineering & Science majors appear to have a non-conceptual method for solving problems
  – Competes with the conceptual method taught by Andes
  – They use it on the (answer-only) final exam
  – This dilutes the benefit of Andes on final exam

Andes’ effect sizes are typical of other step-based homework helpers
[Bar chart: effect sizes (axis 0 to 1.4) for Lisp tutor, Cognitive tutor, Andes; series: Experimenter's exam: Conceptual parts; Experimenter's exam: Less conceptual parts; Standard exam]

Take-home message on homework helpers
Baseline:
  – Paper-based with light or no human grading
Better:
  – Answer-based e.g., WebAssign
Best:
  – Step-based e.g., Andes

Outline
Goals of the Andes project
Andes – the core
Andes – the surroundings
Andes – technology
Evaluation
Why does Andes succeed? (next)

Why these results? Hypotheses:
Baseline:
  – Paper-based with light or no human grading
  – Students often do not do their homework
Better:
  – Answer-based e.g., WebAssign
  – Students do homework, but with non-optimal methods
Best:
  – Step-based e.g., Andes
  – Students do homework with optimal method

3 methods for doing homework
1. Get answers from friends
2. Copy & edit another problem’s solution
3. Generate each step oneself

1. Getting answers from friends
No learning, so should be discouraged
Even if the numbers in a problem are randomly generated, students circulate spreadsheets that calculate answers
Andes requires that students “show their work”
  – Andes can analyze time per step, too.
  – So Andes can make this kind of cheating very difficult

2. Copy and edit another problem’s solution
Students learn general solution schemas
  – E.g., for pulley problems, a = g(m1-m2)/(m1+m2)
  – Good for learning to solve algebra word problems
  – High math students tend to use it (cf. USNA)
  – But should learn physics principles, not problems
Andes mildly discourages
  – Must close current problem in order to open an old one
  – Example solutions are videos, not paper
Andes should implement “fading”

3. Generate each step oneself
If stuck, get hints (from Andes), or refer to textbook or videos of examples
  – Should focus on learning the principle, not just getting unstuck
  – Hint sequences start vague, become specific
Causes learning of principles, not problems

Hypothesized distributions of solution methods explain results
[Stacked bar chart: frequency of solution method (0% to 100%) by type of homework helper (Light or no grading; Answer-based; Step-based e.g., Andes); series: 3. Each step oneself; 2. Copy & edit solutions; 1. Copy answers; No solution]

Take-home message, again
Baseline:
  – Paper-based with light or no human grading
  – Students often do not do their homework
Better:
  – Answer-based e.g., WebAssign
  – Students do homework, but with non-optimal methods
Best:
  – Step-based e.g., Andes
  – Students do homework with optimal method

Outline
Goals of the Andes project
Andes – the core
Andes – the surroundings
Evaluation
Why does Andes succeed?
Future work (next)

Future work
Andes3
  – Client/server (no downloading necessary)
  – Client runs in any browser & any OS
  – User interface ≈ PowerPoint (for familiarity)
  – Compatible with WebAssign & other services
Help others to author Andes problems
Support conceptual problem solving

Quantitative problem solving is necessary but not sufficient for deep understanding
Necessary
  – Avoids inert knowledge
Not sufficient
  – Question:
    » If a massive truck and a lightweight car have a head-on collision, on which vehicle is the impact force greater?
  – Students who got A’s say: The car
  – Correct answer: Equal (Newton’s third law)
Many concept inventories (exams) exist

Does Andes improve scores on concept inventories?
Learning principles instead of problems should help
But not enough
  – USNA: FCI not significantly different

Why so little transfer from quantitative to conceptual problem solving?
[Diagram, two panels, each relating Quantitative prob. solving knowledge to Conceptual prob. solving knowledge]
Traditional assumption: lots of transfer
Cognitive modeling: little transfer

Solution: Add conceptual problem solving practice
Quantitative prob. solving knowledge: W=m*g
Conceptual prob. solving knowledge: If no mass, then no weight
Question: In the problem you just solved, what would be different if the car had no mass?

Katz & Connelly’s experimental addition to Andes
After students have finished an Andes problem, use a natural language dialogue system to discuss the problem’s solution
System asks lots of conceptual questions

Students first solve an Andes problem

Andes then asks about principles, alternative situations…
Andes: Which major principle did you need to solve the problem?
Student: Newton's 2nd law
Andes: Yes. The main difference between this problem and most of the previous Newton's 2nd law problems is the fact that the direction of acceleration is __________. (fill in the blank)
Student: centripetal
Andes: Correct. All else being the same, how would the velocity at the lowest point of the swing change if the length of the rope were doubled to 4.0 m?
Student: it would also double
Andes: Well, it would increase, but it would not double. Recall that the formula for centripetal acceleration involves the square of the velocity. …

Results of Katz/Connelly expt.
Successes
  – Treatment students significantly outperformed control students on a mostly qualitative post-test
  – Instructors would like reflective dialogue to become a permanent feature of Andes
Limitations
  – Currently covers only 8 weeks of Andes
  – No significant effect on FCI at the end of the semester
Next steps
  – Revise content to target FCI
  – Cover whole semester

Thanks for your attention!
At www.andestutor.org you can…
  – Download the stand-alone version of Andes
  – Try the OLI version of Andes
  – Download papers on Andes
  – View videos of Andes being used

Questions?
[Diagram: User interface; Problem to be solved → Expert → All correct steps in all orders → Helper → Response pattern for each student step → Assessor → P(mastery) for each knowledge component]