From: AAAI Technical Report SS-97-04. Compilation copyright © 1997, AAAI (www.aaai.org). All rights reserved. Degrees of Mixed-Initiative Interaction in an Intelligent Tutoring System Reva Freedman Department of CSAM Illinois Institute of Technology 10 W.31st Street 236-SB Chicago, IL 60616 freedman@steve.csam.iit.edu S: Correct. T: That’sright. Approx.whatis the areaof Brazil? S: 2,500,000squaremiles. T: Wrong. Please indicate if the following statementis correct or incorrect: Thearea of Paraguayis approx. 47432squaremiles. (excerptedfromCarbonell1970,fig. 1) Ourtutoring system,CIRCSIM-Tutor, is capableof participating in this type of mixed-initiative interaction, althoughour coherenceheuristics wouldnot generate such an abrupt topic change. But we cannot handle more sophisticatedtypes of mixed-initiativeprocessingsuch as fully cooperativeconversations.Wedefine a fully cooperative conversation as one whereparticipation requires a multi-agent model where each speaker has goals to achieve and a plan to achieve them. The intent is to distinguish conversations requiring such a modelfrom those which can be understood with a simpler model whereone agent has a goal and a plan to achieve it, and Introduction the other agent’s job is mainlyto go along with the plan. It has beensuggestedthat a fully cooperativeconversation The SCHOLAR system, developed by J. R. Carbonell can be comparedto two children building a tower of (1970), is often consideredin the UnitedStates to be the first intelligent tutoring system.Carbonellcalled SCHOLAR blockstogether, as opposedto one child telling the other whatto do. In the cooperativecase, neither maybe able to a mixed-initiative system because it had two modesof predict the shapeof the resulting tower. operation: the teacher could ask the student questions or This brings up the question: Doesa tutoring systemlike vice versa. However,SCHOLAR could not hold a continuous CIRCSlM-Tutor need to participate in fully cooperative conversation whichrequired cooperative dialogue behaconversationsin order to achieveits tutoring goals? After vior. In fact, SCHOLAR did not attemptto create a coherent all, most conversations between humansand computers conversation. The followingexcerpt is from a tutor-led today are led by oneparty or the other. For example,in a section of a dialogue with SCHOLAR; the student-led database front-end, the humanuser mightask questions sectionsare similar. which the programanswers. Conversely, in an adviceT: Thecapital of Chile is Santiago. Correct or giving systemor an automaticteller machine,the program incorrect? leads and the human’sjob is to respond. Manyhuman-tohumantutoring sessions are also led mainlyby one party This workwassupportedby the CognitiveScienceProgram, or the other. In this regard, onemightcontrast our personOffice of NavalResearchunder GrantNo.N00014-94-1-0338 to-person tutoring experiments, wherethe tutor has an to Illinois Institute of Technology. Thecontentdoesnotreflect agendato fulfill, with tutoring sessions such as those the positionor policyof the government andnoofficial endorsementshouldbe inferred. Abstract Weare currently implementingCIRCSlM-Tutor v. 3, a conversation-based intelligent tutoringsystem(ITS)which tutors medicalstudentsonthe baroreceptor reflex, a topic in cardiovascular physiology. In orderto providethe most naturalconversational experience possible,wewouldlike to let the studenttakethe initiative where possible.Onthe other hand, becauseof the increasedcomplexityof the requiredinfrastructure,the difficultyof understanding full free-text input, andthe tutor’s desire to accomplish the tutoringagenda,wemustrestrict the typesof initiatives whichthe systemwill attemptto respondto. Weclassify initiativesaccording to the natureof the student’sutterance andaccordingto the type of processingrequiredby the tutor to handlethem.Wedescribe howweencouragethe studentto giveresponseswecan handle.Weexplainwily webelievethat thesemethods donot restrict the student’s ability to communicate with the systemor to learn the material. Weillustrate the phenomena describedwith examples fromhuman-to-human tutoringsessions. 44 described by Fox(1993), wherethe student choosesthe agendafor the tutoring session. In this paper we characterize the types of mixedinitiative interactions whichCIRCSlM-Tutor can participate in. Weclassify potential studentutterances andthe reasoningpowerrequired by the tutor to respondto them. Wealso describe types of mixed-initiative interactions which the system cannot handle and howwe reduce the frequencyof such occurrences. The CIRCSIM-Tutor Planner Problem Domainand Tutoring Task CIRCSIM-Tutor conductsconversationswith students about the baroreceptorreflex, a topic in cardiovascularphysiology which all beginning medical students need to master. Thebaroreceptor reflex is the negative feedback loop whichattempts to maintaina steady bloodpressure in the human body. The focus of CmCSlM-Tutoris on tutoring students on material whichthey have already studied, not on teaching newmaterial. This orientation leads us to place greater emphasison the interactive aspects of the conversation,rather than on elaborate ways to present material. Thusgeneratingcohesiveturns which fit coherentlyinto the evolvingdialogueis moreimportant than generatingcomplexexplanations, for example. In the beginningphysiologycourse, students are given problemsto solve basedon a simplified qualitative model of the heart. In each problem, something happens to changethe processing of the heart. Thestudent is then asked to predict the direction of changeof seven core variables in each of three resulting physiologicalstages. Studentscan solve such problemsin manysettings: alone, in conversationwith a humantutor, or while interacting with a tutoring system.Whenthe student’s predictionsare complete, either the humantutor or the ITS conducts a dialogue with the student to help the student learn the correct answersandthe reasoningbehindthem. Planning the Tutor’s Response In order to model both pedagogical and linguistic strategies, the CIRCSlM-Tutor project has collected over 5000 turns of keyboard-to-keyboardtutoring sessions wherestudents solve problemswith the aid of live tutors. Ouranalysis of these transcripts showsthat CIRCSIM-Tutor needs both a global, plan-oriented modeland a turn-byturn modelsuch as that used by the ConversationAnalysis 1school(Sinclair &Coulthard1975). Thetutor maintainsglobal control of the conversation while respondingturn by turn to the student’s utterances. To be able to respondflexibly to the student, the planner refines plans only until the next moveis clear. Thusit can afford to replan wheneverthe student gives an unexpected response. Theglobalplanis visible in the hierarchicalstructure of the dialogue,whichcontains the followinglevels: ¯ Physiologicalstage ¯ Core variable ¯ Attemptsto teach each variable Withineach physiologicalstage, the tutorial dialogueis dividedinto segments,onefor eachcore variable. Usually, only the variables whichthe student missedare discussed. Each segment is divided into one or more attempts to teach the valueof a variable. Thebasis of eachattemptis a correction schemachosen from a plan library. The schemacontains one or moregoals whichmust be satisfied in the comingturn or turns. Goals are satisfied recursivelyuntil primitivespeechacts are obtained. Eachattemptends with the tutor requestingthe correct valueof the variablebeingtutored. If the studentgives the correct answer,the segmentends. Otherwise,the attempt fails, causingany remaininggoals associatedwith it to be removedfrom the agenda. The tutor can makeanother attempt or give the student the answer. A moredetailed description of the CmcsIM-Tutor planner can be found in Freedman(1996). Thespeechacts are assembledinto turns, then realized as surface text and displayedto the student. In a typical tutorial dialogue, every turn has the following basic structure. Notethat eachof the sectionsis optional. Responseto student’s previousstatement Acknowledgment of student’s statement (e.g. yes, no, you’reright) Content-orientedreply (e.g. rebuttal or statementof support) Newmaterial Nextpart(s) of current schema Questionfor student to answer This model,basedon the tenets of ConversationAnalysis, is regularly observedin our human-to-human transcripts. Although humantutors don’t necessarily do so, the mechanized tutor always ends a turn by explicitly requesting informationfrom the student, usually with a question. Some Sample Dialogues The followingis the most common schemaused to correct one categoryof variables, those controlled by the nervous 2system. 2 Theactualschemata are writtenin LisP.Forsimplicity,the prerequisitesare not shown. 1 Thisworkwasbroughtto our attentionbyCawsey (1992). 45 Correct-neural (V): 1. Makesure student knowsthat V is controlled by the nervous system. 2. Makesure that student knowsthat current stage is pre-neural. 3. Makesure student knowsthat correct value of V is no-change. Figure 1 shows in condensed form a subset of the dialogues which can be generated from this schema. Each item in italics represents text which could be generated by one element of the schema. Where convenient, we have showntwo possible realizations separated by a slash. The numbers correspond to the subgoals in the schema. The student’s responses are shownin romantype. In the lefimost path, the student gives the desired answer immediately. In the second path, the student gives a partially correct answer, in this case an answerwhichis true but does not use the tutor’s desired language. The tutor adds a goal to correct the student’s language before continuing with the schema. In the third branch, the student gives an answer which is on the path toward the correct answer. The tutor helps the student toward the correct answer in two different ways. In the rightmost branch, the tutor chooses to give the student the answer. Due to space limitations, we have only shown one possible realization for the second and third subgoals of the schema.As multiple realizations are possible for these subgoals as well, one can see how the CIRCSIM-Tutor planner can generate a large numberof coherent dialogues from a few basic elements. Although CIRCSIM-Tutordoes not have the syntactic ability or semantic range of a humantutor, it can emulate all of the major dialogue patterns used by our domainexperts. Handling Student Initiatives Classification of Student Input The reply generated by the system depends on the type of the student’s utterance. The following response types, which are the most commonones, were illustrated in Figure 1: ¯ Correct answer ¯ Wrong answer ¯ Physiological near-miss: a step toward the correct answer ¯ Linguistic near-miss: linguistically close but not exact answer Eachof these response types can refer to the tutor’s most recent question or to a higher question on the agenda. The student can also say something which does not relate to an open question posed by the tutor. Here are two examples: 46 ¯ Student adds newinformation, e.g. an explanation. ¯ Student changes the topic. For the purposes of this paper, we will consider any student statement which adds new content to the dialogue as a student initiative. This definition subsumesa narrower definition which would only include statements which change the topic of conversation. The broader definition is convenient because muchof the processing is similar. Problems with Unrestricted Student Initiatives Student initiatives present a difficult problem.On the one hand, we would like to let students respond as freely as possible. Open-endedquestions force students to think more deeply. Additionally, open-ended questions permit students to focus in on their specific problems.This ability is one of the justifications for building natural-language based ITSs. But there are several problems with permitting unrestricted student initiatives: ¯ First, the student’s utterance can be too difficult to understand at the purely mechanical levels of spelling, syntax and basic semantic processing. ¯ Second, just because we can understand a statement at a literal level does not mean that we can understand the student’s model of the domain(Borchardt 1994). This can be especially difficult if the student’s domainmodel is invalid. ¯ Third, even if we can understand what the student is telling us, we may not have a constructive response available. ¯ Fourth, even if we have a constructive response available, responding to the student initiative maynot help the tutor achieve its agenda. Reducing UnwantedStudent Initiatives Askingthe students to learn a set of rules about what they can and cannot say detracts from the goal of a natural conversation. Therefore we rely on the fact that students are cooperative conversation partners (Grice 1975) encouragethe type of response we prefer. Oneway to avoid difficult student initiatives is to ask short-answer questions instead of open-ended ones. Reducingthe size and complexity of the expected response reduces the chance of misunderstanding a student utterance and the attendant frustration on the part of the student. In fact, this informal restriction on our part encourages students to use the system to its limits. Students are sensitive to the capabilities of the system, and they won’t generate non-trivial utterances if we have proved ourselves incapable of responding to them before. Can you tell me how TPRis controlled? / What is the primary mechanismwhich controls TPR? (1) Nervous system Sympathetic vasoconstriction Right Right [ And what controls that? neural [ ~ TPR is Radius of arterioles \ e ous I have no idea Which is neurally sysiem ~~ / co?oiled TPRis neurally c~olled /J (2) And we’re in the pre-neural period now / Rememberthat we’re in the pre-neural period (3) So what do you think about TPRnow? I Figure 1: Dialogues which CIRCSIM-Tutor can generate A second wayto restrict unwantedstudent initiatives is to ensure that each turn ends with an explicit request. Withan explicit request on the table, it is morelikely that the student will answer the question rather than change the topic. Anadditional benefit of this restriction is that it provides students with an unambiguousindicator of when it is their turn to respond. Turn-taking rules work in person-to-person conversation because we are socialized to understand and use them (Sacks, Schegloff & Jefferson 1974). But people expect to have a more explicit interaction with a computer(Dahlb~lck & J0nsson 1991). Although it might seem that restricting the types of discourse we generate and hope to receive might diminish the teaching ability of the system, this is not necessarily so. Accordingto one of our domainexperts3, humantutors have two problems with open-ended questions: ¯ Even with humantutors, students frequently misunderstand what the tutor is asking. Hence, even whenthe tutor 3 Dr. Joel Michael,personalcommunication. 47 can understand the student’s response, it may not relate directly to whatthe tutor is looking for. ¯ It is often difficult for a humantutor to understand(and hence to use) what the student says in response to an open-endedquestion. There are two reasons for this. First, students do not always understand to what organizational level (e.g. heart, myocardialceils, or membrane receptors) the question refers to. Second, medical students do not naturally reason causally. Whenthey are having trouble with causal reasoning, they sometimes switch to using teleological statements ("what the myocardial cells want is ..."), and these can be difficult for the humantutor to interpret. Thus humantutors have the same type of problems with open-ended questions as the computerized tutor, although not nearly to the sameextent. Desirable Types of Student Initiatives On the other hand, we would like to encourage student initiatives whenever we can understand and respond to them.In general, the current CIRCSlM-Tutor infrastructure can handle an initiative whichsatisfies the following criteria: ¯ Doesnot require a deepunderstandingof the student’s (possibly buggy)domainmodel. ¯ Doesnot require reasoningaboutthe student’splan. ¯ Doesnot require the use of stackeddiscourse contexts, as wouldbe required, for example,if the student asked a hypotheticalquestion. Althoughclassifying the student’s response requires identifyingthe student’sgoal, moststudentstatementscan be handledwithout needingto reason about the student’s plan. Transcript analysis shows that for the type of problem-solving involved in the baroreceptor reflex domain,expert tutors lead the conversation,althoughthey give the student widefreedomin answeringthe questions. Asoneof the tutors’ goals is to implicitlyteach a modelof problem-solving,this is not an unreasonableapproach. Stackeddiscourse contexts mightbe required if tutor and student neededto switch amongmultiple topics or to describeeventsin a variety of possibleworlds.In general, these features are not requiredfor discussingthe valuesof variables and the causal relationships amongthem. Fromanotherpoint of view, wecan handlean initiative if the tutor can assimilatethe student’sdesire into its own agenda.Onecategoryof initiatives satisfyingthese criteria is simplerequests, e.g. askingfor help, a definition or an explanation.4 Thetutor can respondto these and other simpleinitiatives in severalways: ¯ Put the student’s request on the agenda above the current plan, i.e. respondto student’s statement, then return to plan ¯ Put the student’s request on the agendainstead of the current plan, i.e. switchto newplan ¯ Put the student’s request elsewhereon the agenda ¯ Acknowledge student’s input without responding ¯ Ignorestudent’s input In each of these cases, the student’s request has been incorporatedinto the tutor’s agenda. Initiatives in HumanConversation The following fragmentfrom a human-to-human tutoring session shows someexamplesof cooperative phenomena whichare outside the range of our model. -, T: ... Howcan CCgo up if it’s underneural control? S: (gives a confusedexplanationof a false statemenO 4 Although a requestfor help canalso be an indicatorof a complexstudentplan, goodresults can often be obtainedby takingsuchrequestsliterally. 48 T: (attemptsto deal with student’s confusion) Doyousee the difference? S: No,this conceptis hard for meto grasp. T: (explains further) OK? S: Is increasedcalciumthe only thing that can increase contractility? T: Yes. S: OK. So would it be accurate to say that (asks follow-up question)? T: Yes, and (gives further information). OK? S: OK. (1(10:43-52) In the first line of this example,the tutor uses an openended question to encourage the student to give an explanationin termsof a deeper-levelconceptmap.In the cooperative conversation whichfollows, both tutor and student contribute questions and ideas. In the secondand third markedstatements,the student takes the initiative, referring to concepts mentionedearlier in the dialogue whichare nowin the shared mental modelof tutor and student. The more complex and/or error-ridden the student’s explanationis, the less likely CIRCSIM-Tutor is to be able to understandit and reply constructively. For this reason we try to avoid such explanations on the student’s part by not askingopen-endedquestions such as the onethe tutor started with here. Onthe other hand, considerthe followingexample. T: (asks aboutthe value of TPR) S: I thought TPRwouldincrease due to higher flow rate through vasculature. (K10:32) CmcsIM-Tutor can handlea student responselike this one. Althoughwecannot makeuse of the explanation given by the student, wecan understand and respondto the core content CTPR will increase"). In fact, the humantutor does not necessarilyuse the additionalinformationeither: In the following example, the tutor responds to a student statement for which the truth value can be determinedbut with difficulty. This exampleis intermediate in difficulty to the twopreviousexamples.Toanswer the student’s questioncorrectly, the tutor needsa deeper domainmodel than that required just to computethe correct values of the core variables. Notethat the tutor mayneed to expend considerable resources on such a statement before it can determinewhetherit can indeed understandand answersuch a statement. T: What’syourfirst prediction? S: TPR,HR,CCall increase simultaneously. T: That’s pretty goodexcept for HR.Remember in this case this guy’s HRis solely determined by his brokenartificial pacemaker. 5 Note,however, that if the student’sreasonis incorrect,the humantutor can correct the student in cases wherethe computerized tutor cannot. -~ S: Wouldn’this other myocardial cells respond to sympatheticstimulation and couldn’t they override his artificial pacemaker? T: They might and then again they might not. We’re assumingin this case that they don’t. So whatdo you say about HR? (K18:36-40) This case is at the limit of our ability to understand natural languageinput. Althoughwestrive to increase our ability to handlesuchinput, wetry to reducethe frequency of such utterances by ensuring that every turn ends with an explicit request of the student. For example,the tutor might terminate the turn previous to the markedone by asking: T: ... Nowwhat do you think about the value of HR? Conclusions In a tutoring systemlike CIRCSIM-Tutor, the importanceof cooperativeconversation lies not in the conversationitself, but in interactivity, whichkeeps the student actively involved with the material. CIRCSlM-Tutor has several strategies for increasing interaction with the student in spite of the unavoidablelimitations on natural language processing.In particular, by askingshort-answerquestions instead of open-endedquestions and ending each turn with a question (or its equivalent), CIRCSlM-Tutor can maintain a general control over the conversation while still giving the student a great deal of leeway for discussingthe desiredsubject matter. CIRCSlM-Tutor can handle most things whichhappenin a tutor-led tutoring session, but it cannot handlea true cooperativeconversationwith two independentlyplanning agents. Thus we do not attempt to handle student utterances whichwouldrequire understanding the student’s plan, extendingour understandingof the student’s domainmodel,or using stackeddiscourse contexts. Byacceptingthese restrictions, wewereable to reduce the complexityof the architecture for CIRCSIM-Tutor v. 3. This has permitted us to reduce the response time and increase our content coverage.For a domainlike the baroreceptor reflex, webelieve that a tutor-led systemcan provide students with a large variety of coherent and helpful conversations.Since CIRCSlM-Tutor logs all of its conversations, wewill be able to update our modelas we acquire real-worldexperience. References Borchardt, G.C. 1994. Thinking Between the Lines: Computersand the Comprehensionof Causal Descriptions. Cambridge,MA:MITPress. 49 Carbonell, J.R. 1970. AI in CAI:Artificial Intelligence Approach to Computer Assisted Instruction. 1EEE Transactions on Man-Machine Systems11(4): 190-202. Cawsey,A. 1992. Explanationand Interaction: The Computer Generationof Explanatory Dialogues. Cambridge, MA:MITPress. Dahlb~lck,N. and JOnsson,A. 1989.Empirical Studies of Discourse Representations for Natural LanguageInterfaces. In Proceedingsof the Fourth Conferenceof the EuropeanChapter of the Association for Computational Linguistics, Manchester,291-298. Fox, B. 1993. The HumanTutorial Dialogue Project. Hillsdale, NJ: LawrenceErlbaum. Freedman,R. 1996. Interaction of Discourse Planning, Instructional Planning and Dialogue Management in an Interactive TutoringSystem.Ph.D.diss., Dept.of Electrical Engineering and ComputerScience, Northwestern Univ. Grice, H. P. 1975.Logicand Conversation.In P. Coleand J. L. Morgan,eds., SpeechActs (Syntax and Semantics, v. 3), 41-58. NewYork: Academic Press. Sacks, H., Schegloff, E. A. and Jefferson, G. 1974.A Simplest Systematicsfor the Organizationof Turn-Takingin Conversation.Language50(4): 696-735. Sinclair, J.M. and Coulthard, R.M. 1975. Towardsan Analysis of Discourse:The English Usedby Teachersand Pupils. London:OxfordUniversityPress. Acknowledgments Professor Martha W. Evens has shepherded the CIRCSIM-Tutor project since its inception. ProfessorsJoel A. Michaeland Allen A. Rovickof RushMedicalCollege were generouswith their time and knowledgeabout both pedagogicaland domainissues.