APOT: Atomic Path Optimizing for Traits

by Thomas Lin

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology

May 20, 2004

Copyright 2004 Thomas Lin. All rights reserved.

The author hereby grants to MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis and to grant others the right to do so.

Author: Department of Electrical Engineering and Computer Science, May 20, 2004
Certified by: Harold Abelson, Thesis Supervisor
Certified by: Dick K.P. Yue, Thesis Supervisor
Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Theses

Abstract

This thesis considers the design of automated tutoring systems that customize teaching material to accommodate individual student learning styles. In particular, we consider the following problem: Begin with one or more presentations of a subject, and break them into fragments ("atoms") each expressing a single idea. Given information about an individual student's learning style, how can one select the optimal choice and sequence of atoms ("path of atoms") to create the most effective presentation for that student? We have implemented several algorithms that automatically create such paths, and we investigate the tradeoff between number of constraints imposed by the algorithms and the number of paths they can find.
We have tested one of these algorithms ("partition search") in an experiment where student volunteers in computer science studied material about planning and artificial intelligence. The results of the experiment indicate that the algorithms can produce presentations that are effectively tailored to the different learning styles.

Thesis Supervisor: Harold Abelson
Title: Class of 1922 Professor of Electrical Engineering and Computer Science at MIT

Thesis Supervisor: Dick K.P. Yue
Title: Associate Dean of Engineering and Professor of Hydrodynamics and Ocean Engineering at MIT

Acknowledgments

I would like to thank Becky and my family for supporting me through this project. I would like to thank Professor Yue and Professor Abelson for their invaluable guidance and their financial support.

Table of Contents

Atom Sizes
Atom IDs
Amount of Material
Algorithms Used for Atomization
4.3.5 Atomization Results
4.4 LEARNING STYLES DESIGN CHOICES
4.5 IMPLEMENTATION OF BEAM SEARCH
4.5.1 Issues in System Design
Distance Calculations
Creating the Postatom Table
4.5.4 Web Implementation
4.5.5 Beam Search Results
Partition/Search Algorithm
Learning Styles versus Learning Preferences
Improving the System
7.3.2 Applying the Principles to Tangential Areas

List of Figures

Figure 1. A Sample Atom
Figure 2. A History of Intelligent Tutoring Systems
Figure 3. Description of Four Learning Styles
Figure 4. Atomization Can Be Done at Different Levels
Figure 5. Atomization Example
Figure 6. Student Characteristics Model
Figure 7. Customization System Example
Figure 8. Hybrid Systems Example
Figure 9. Which Path Through the Material is Most Effective?
Figure 10. Graph of Constraints versus Possible Paths for Algorithms
Figure 11. Beam Search Only Considers a Set Number of Atoms Per Level
Figure 12. Making Large Prerequisite Graphs Level by Level
Figure 13. Partition/Search Divides the Atoms into Groups
Figure 14. Path-Finding Algorithms Summary Chart
Figure 15. Overall Implementation Steps
Figure 16. How We Atomized and Labeled Atoms
Figure 17. Larger Concepts Graph
Figure 18. Actual Prerequisite Graph for Partition/Search on the Topic of "Planning"
Figure 19. Zoomed-In Prerequisites Graph
Figure 20. Atoms Without Ordering Constraints
Figure 21. Atom Arrangement Where Any Path Will Work
Figure 22. Learning Preferences Chooser
Figure 23. Learning Styles Assessment
Figure 24. Distances Table
Figure 25. Website Diagram
Figure 26. Chart of Hypotheses for Experiment
Figure 27. Experiment Main Page
Figure 28. Flow of Experiment from Test Subjects' View
Figure 29. Questions Independent of Curriculum Type
Figure 30. Single Question Statistics Part 1
Figure 31. Single Question Statistics Part 2
Figure 32. Paired Statistics Part 1
Figure 33. Paired Statistics Part 2
Figure 34. Bar Graph of Results for "Material Was Interesting"
Figure 35. Bar Graph of Results for "Material Was Easy to Understand"
Figure 36. What Our Results say About Our Hypotheses

Abbreviations and Terms

MIT Classes
6.034: Artificial Intelligence (Undergraduate Introduction)
6.825: Techniques of Artificial Intelligence
6.834: Embedded Artificial Intelligence

Terms
Atom: A fragment of course material (e.g., two paragraphs).
Atomization: The process of creating atoms.
Custom Curriculum: A course curriculum that is custom made to fit the attributes of a student.
Larger Concept: A complete idea that can take several atoms to convey.
Number of Constraints: Number of constraints imposed on possible paths of atoms by an algorithm.
Number of Paths: Number of distinct possible paths an algorithm can find.
Path of Atoms: A sequence of atoms put together to form a reading passage.
Postatoms: A list of atoms that can come after each particular atom in Beam Search.
Student Model: The information on student characteristics and/or knowledge level that an ITS stores for each student.

Learning Styles Terms
Activist: Likes going out and doing things.
Reflector: Likes to reflect on things.
Theorist: Likes concrete proofs.
Pragmatist: Likes planning what to do next.

General Abbreviations
AI: Artificial Intelligence
AIMA: Artificial Intelligence: a Modern Approach (textbook)
CGI: Common Gateway Interface, used for web programming
ITS: Intelligent Tutoring System
KR: Knowledge Representation
PERL: A computer programming language.
POP: The Partial-Order Planning algorithm

Intelligent Tutoring Systems Abbreviations
APOT: Atomic Path Optimizing for Traits - a hybrid system
AST: Adaptive Statistics Tutor - a hybrid system
CoCoA: Concept-based Courseware Analysis - a course verifier
DCG: Dynamic Course Generation - a traditional system
ELM-ART: ELM Adaptive Remote Tutor - a traditional system
ID: Interactive Documents - a hybrid system
Java Tutorial: A customization system

Variables
b: Average branching factor for each graph hierarchy level
C: Cohesion of a path, in terms of prerequisites and flow
d: Average characteristics distance of a path of atoms
l: Length of path of atoms
m: Number of atoms an expert can remember at one time
n: Total number of atoms being used
y: Levels in the atom graph hierarchy

Algorithms
Beam Search: Involves assigning postatoms for each atom and finding paths where atoms can only be followed by their postatoms.
Partition/Search: Involves partitioning the set of atoms, creating a prerequisite graph, and searching in the graph.

1 Introduction

1.1 Teaching More Effectively with Computers

When MIT students go to class, they all get the same lecture. Live teaching is optimized toward each class's average in learning style, level, and interest. Years ago, this method worked well because the MIT student body was fairly uniform. However, there is much more diversity found in MIT's student body today.
Admitted students come from diverse backgrounds and have more widely varying interests and learning styles. For this reason, lecturing to the mean will now neglect the learning styles and interests of more students. Felder and Silverman [14] found that learning styles of most engineering students are in many ways incompatible with the current lecturing styles of most engineering professors.

One way to try to address the individualization problem is by offering small recitation sections. However, there are still 10 to 30 students per recitation. Also, recitation instructors are often less knowledgeable than the lecturers.

The solution to this problem could be a web-based course sequencing computer system that gives high weight to individual student characteristics. (Course sequencing is the idea of reordering and selectively presenting course material.) A perfect such system would be analogous to having the lecturer present material to each individual student, customized not only for the knowledge level of the student, but also for the student's background, learning interests, and learning styles.

Good teachers employ several methods to be effective, and presenting the material that is most natural for individual students is one of them. A teacher can take as input many books of a curriculum and a student's learning style and naturally output a curriculum well-suited for a particular student. We seek to reproduce this form of intelligence algorithmically. As input, the program takes many books of a curriculum and a student's learning style. The curriculum is then broken down into atomic fragments, each of which expresses a single idea. The program's goal is to put together an optimal curriculum for each particular student from these atomic fragments.

It makes sense that people can learn better from some books than others. A student who learns best from examples might learn better from a book that emphasizes examples rather than theory.
Also, receiving reading material from different sources is not new to today's students. Students are routinely asked to read different sections from different books to learn a topic.

This research differs from past research because (1) it integrates curricula from different textbooks, and (2) it can thus create curricula that aim to customize for learning styles. Most existing systems customize only for knowledge levels. This research contributes not only toward the Intelligent Tutoring Systems field, but also toward our understanding of learning, teaching, and academic material knowledge representation.

1.2 Expressing One Aspect of the Teaching Problem

The larger question is to explore how we can teach more effectively with computers. We have already mentioned how part of this can be seen as mapping curriculum fragments and learning styles to customized curricula. Let us further specify the problem we are addressing as the following: Begin with one or more presentations of a subject, and break them into fragments ("atoms") each expressing a single idea. Given information about an individual student's learning style, how can one select the optimal choice and sequence of atoms ("path of atoms") to create the most effective presentation for that student?

We will call this the Atomic Path Optimization problem because it involves finding an optimal path of atoms. When looking for the right algorithm, we notice that we are looking for the right tradeoff between the number of possible paths an algorithm can find and the number of constraints imposed by the algorithm. On the one hand, we want an algorithm that can consider all possible paths and find the best path for each student, but on the other hand, we realize that this would be computationally infeasible. There are infinitely many possible paths if we do not restrict path length, an exponential number of paths if we do, and still a factorial number of paths if we insist that no curriculum presents the same atom twice.
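To make these growth rates concrete, the short Python sketch below counts paths for n atoms and a fixed path length l: n^l paths when atoms may repeat, and n!/(n-l)! paths when no atom may appear twice. The function names are hypothetical, and the values n = 150 and l = 20 are borrowed from the implementation scenario described later in this chapter purely for illustration.

```python
import math


def paths_with_repeats(n: int, length: int) -> int:
    """Number of length-`length` sequences over n atoms, repeats allowed."""
    return n ** length


def paths_without_repeats(n: int, length: int) -> int:
    """Number of length-`length` sequences over n atoms with no atom used twice.

    This is n! / (n - length)!, which equals n! when length == n.
    """
    return math.perm(n, length)


if __name__ == "__main__":
    n, length = 150, 20  # illustrative values taken from the scenario in this chapter
    with_repeats = paths_with_repeats(n, length)
    without_repeats = paths_without_repeats(n, length)
    # Print only the number of decimal digits; the counts themselves are enormous.
    print("digits in count with repeats:   ", len(str(with_repeats)))
    print("digits in count without repeats:", len(str(without_repeats)))
```

Even with the no-repeat restriction, the count is far too large to enumerate, which is why the algorithms discussed next impose additional constraints on which paths are considered.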
1.3 Algorithms for the Path-Optimization Problem

We do not know the optimal point between the number of constraints and the number of possible paths, and we do not know what the best algorithm would be. In this paper, we will study the path optimization problem and discuss various strategies for approaching it. We discuss the random path strategy, the one-path-fits-all strategy, the WalkSAT strategy, the Beam Search strategy, the Partition/Search strategy, the BeamPartition Hybrid strategy, the Collaborative Filtering strategy, and several other strategies.

1.4 Scenario Showing Our Implementation

After the discussion of the strategies, we conduct a study to try to learn more about the problem. The first part of the study is to implement the Beam Search algorithm and the Partition/Search algorithm to learn about the details involved. This section (1.4) briefly describes the specific scenario we addressed, and shows what our implemented Beam Search and Partition/Search programs are capable of.

We started with five textbook chapters (150 pages) of material on the topic of AI Planning: Chapters 11 ("Planning"), 12 ("Practical Planning") and 13 ("Planning and Acting") of Russell and Norvig's Artificial Intelligence: A Modern Approach, Chapter 15 ("Planning") of Winston's Artificial Intelligence, and Chapter 13 ("Planning") of Rich and Knight's Artificial Intelligence.

[Figure 1. A Sample Atom: a scanned textbook excerpt headed "Basic Representations for Planning"]

Aiming to get atoms that each express a single idea, we designated small sections as atoms, diagrams as atoms, and broke larger sections into several atoms, each about the length of a small section. This division resulted in 150 atoms. Figure 1 shows a sample atom that introduces the STRIPS language. After deciding how to divide the atoms, we scanned the 150 pages of material and used photo-editing software to divide and connect different pages so that we would end up with each atom as an individual image file.

Next, we chose four learning style scales ("Activist," "Reflector," "Theorist," and "Pragmatist"), and rated each of our 150 atoms as Low, Medium, or High on each of the four scales.
With that setup complete, we focused on developing algorithms that could map student learning styles to optimal curricula. The first such algorithm we implemented was a Beam Search-style algorithm. To set up Beam Search, we had an expert choose 5 atoms that could be plausibly presented after each atom. We chose a path length of 20 atoms, and also specified a few starting atoms. A valid Beam Search path is any path of 20 atoms where every atom following a given atom is one of the 5 atoms chosen by the expert for that atom. There are 5^20 such paths.

A student using the Beam Search program is first given two learning style assessments, each of which rates the student as Low, Medium or High on the Activist, Reflector, Theorist, and Pragmatist scales, to create a student model. Then, Beam Search uses the following method to return a valid Beam Search path that is a good fit for the student model:

1. Choose the starting atom that best matches the student model (by using a distance formula between the atoms' classifications and the student model).
2. Look at the 5 atoms that could come after the current atom, and pick the one that best matches the student.
3. Repeat step two until the path length reaches 20 atoms.

Once a path is found, the program displays the 20 chosen atom images on the screen for the student to read.

The second algorithm we implemented was Partition/Search. Partition/Search works by creating a directed graph over all the atoms (with each atom as a node). The graph procedure takes O(n log n) expert time (n = number of atoms) to set up, and is described in further detail later in this thesis. Partition/Search returns the path within its directed graph that has the smallest average atom distance from the student model.

Let's say a student has a student model of Activist=Low, Reflector=High, Theorist=Low, Pragmatist=High. For this student, our two programs would output paths whose atoms were as close to the student model as possible.
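As a rough illustration of the greedy selection and the average-distance measure just described, the following Python sketch may help. It is not the thesis implementation. The numeric encoding (Low=0, Medium=1, High=2) and the sum-of-absolute-differences distance are assumptions, since the text here does not specify the distance formula, and all function names and atom identifiers are hypothetical.

```python
# Assumed numeric encoding of the three rating levels (not taken from the thesis).
LEVELS = {"Low": 0, "Medium": 1, "High": 2}
SCALES = ("Activist", "Reflector", "Theorist", "Pragmatist")


def distance(atom_rating, student_model):
    """Assumed distance: sum of absolute level differences over the four scales."""
    return sum(abs(LEVELS[atom_rating[s]] - LEVELS[student_model[s]]) for s in SCALES)


def greedy_path(start_atoms, postatoms, ratings, student_model, length):
    """Follow steps 1-3: best starting atom, then repeatedly the best postatom.

    Ties are broken in favor of the atom listed first. Nothing here prevents an
    atom from appearing twice; the text does not say whether repeats are excluded.
    """
    def fit(atom_id):
        return distance(ratings[atom_id], student_model)

    current = min(start_atoms, key=fit)
    path = [current]
    while len(path) < length:
        current = min(postatoms[current], key=fit)
        path.append(current)
    return path


def average_distance(path, ratings, student_model):
    """Average atom distance of a path, the quantity Partition/Search minimizes."""
    return sum(distance(ratings[a], student_model) for a in path) / len(path)


if __name__ == "__main__":
    student = {"Activist": "Low", "Reflector": "High", "Theorist": "Low", "Pragmatist": "High"}
    # Hypothetical three-atom example; a real postatom table lists 5 atoms per atom.
    ratings = {
        "x": {"Activist": "Low", "Reflector": "Medium", "Theorist": "Low", "Pragmatist": "High"},
        "y": {"Activist": "High", "Reflector": "Low", "Theorist": "High", "Pragmatist": "Medium"},
        "z": {"Activist": "Low", "Reflector": "High", "Theorist": "Low", "Pragmatist": "High"},
    }
    postatoms = {"x": ["y", "z"], "y": ["x", "y"], "z": ["y", "x"]}
    path = greedy_path(["x", "y"], postatoms, ratings, student, length=3)
    print(path, average_distance(path, ratings, student))
```

Because each step looks only at the postatoms of the current atom, this greedy walk inspects a handful of atoms per step rather than enumerating every valid path, at the cost of possibly missing a path with a lower overall average distance.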
For example, if there were only two possible Partition/Search paths (in our implementation there are actually over 20,000 possible paths), and they were the same except path A had an Activist=Low, Reflector=Medium, Theorist=Low, Pragmatist=High atom where path B had an Activist=High, Reflector=Low, Theorist=High, Pragmatist=Medium atom, then the program would choose and display path A because it is a better match. The result is that we now have programs to generate curricula that are customized for individual students' learning styles.

1.5 Overview of Our Experiment

Having devised several atomic path optimization algorithms and implemented two of them, we decided to run an experiment to see whether, and how, customizing for learning styles actually improves the effectiveness of teaching. Our main hypothesis was that Partition/Search customizing for student learning styles could provide advantages for students looking to learn Planning. We also had several secondary hypotheses concerning exactly what the advantages were.

For the experiment, we implemented a "worst fit" version of Partition/Search. The worst fit version returns the curriculum path that lies in the prerequisite graph but has the furthest average atom distance from the student model. We recruited 18 student test subjects by emailing the MIT Electrical Engineering and Computer Science mailing list. Each subject received a 30 minute best fit Partition/Search reading and a 30 minute worst fit Partition/Search reading. The subjects filled out questionnaires about their impressions of each curriculum and also took short quizzes on the learning material.

Our data showed that the best fit curricula were more effective than the worst fit curricula in many ways. For example, we have statistically significant results showing that students thought the best fit curricula were easier to understand and more interesting, engaging, rewarding and meaningful.
The results show that customizing for learning styles does make a difference in Intelligent Tutoring Systems teaching. So, the Atomic Path Optimization problem is indeed worth considering, and the work done here sheds some light on how the problem can be approached and might eventually be solved.

2 Background Literature

The idea of course sequencing is not new. Traditional course sequencing systems like DCG/CoCoA (Dynamic Course Generation / Concept-based Courseware Analysis) [6,10,15] and ELM-ART (ELM Adaptive Remote Tutor) [9] work by breaking course curricula into material on individual concepts, then customizing for student knowledge levels by giving students only the concepts they need to go from what they know to what they want to know. There are also customization-oriented systems like Java Tutorial [24], which develop several versions of the curriculum, test for student learning styles, then give students the version of the curriculum that best fits their learning style. Hybrid systems like AST (Adaptive Statistics Tutor) [22, 23] and ID (Interactive Documents) [8] perform traditional course sequencing first, then for each concept, decide which version (of the concept) to teach based on student learning styles.

The system we explore combines sequencing with customization, but in a different way than existing hybrid systems. The system expands upon some dynamic delivery ideas explored by Niewiadomska [1].

2.1 Intelligent Tutoring Systems

Most Intelligent Tutoring Systems (ITS) produce customized curricula for students. The idea of customized curricula is fairly intuitive: a human tutor presents different material to different students, so a computer system should be able to do so as well.

Figure 2. A History of Intelligent Tutoring Systems (timeline):
1960s: The earliest adaptive response systems.
1970s-1980s: Many ITS developed.
1973: Basic outline of ITS requirements (Sleeman and Hartley).
1983: First Artificial Intelligence in Education conference.
1990s-2000s: ITS that focus more on Internet and multimedia.

ITS have been around in some form or another for almost forty years, as shown in Figure 2. The majority of ITS have focused on how to accurately diagnose the knowledge level of the students as they learn and how to present the material most suitable for that knowledge level. ITS are also of particular interest to the distance learning community, because distance learners do not have regular access to teaching faculty. With the increasing popularity of the internet, some recent ITS research has focused on how the idea of ITS can interact with the online environment. As computer processing power has increased, people have also worked on creating animated ITS teachers to make students feel more comfortable.

The ITS topic we explore in this paper addresses a recent problem: more and more course material is available digitally. For many topics, there is much more material available online than students have time to read. Some topics are taught in multiple subjects and students end up learning the same thing multiple times. There is a need for a system that can reduce all this material to a single curriculum that is best suited for each particular student.

2.2 Learning Styles Research

"Learning Styles" has been an active field of study in educational research. Honey and Mumford [25] describe learning as a repeating cycle of experiencing, reviewing, concluding, and planning. Many people develop a preference for one or two of these stages. The four learning styles, each corresponding to a preference for a particular stage, are: Activists, Reflectors, Theorists, and Pragmatists.

* Activists get excited about new concepts, but can lose this enthusiasm quickly. They learn well when faced with challenges and competition.
* Reflectors like to spend time reflecting before making decisions.
They learn better when they are able to reflect on the learning material beforehand.
* Theorists try to fit their observations into consistent models. They learn best when asked to make sense out of complicated ideas and problems.
* Pragmatists like to test out potential solutions right away. They prefer learning that has practical benefits, or learning where the potential applications are clear. [1]

Individual students will identify at some level with each of the four learning styles, and this forms the basis for their "learning style" classification. For instance, a learning style classification (on a ten-point scale) might be "Activist: 8, Reflector: 6, Theorist: 2, Pragmatist: 4." Learning style questionnaires exist for rating people on these four scales.

With learning styles come learning preferences. For example, most reflectors learn best from material that they have to think about and reflect on. Educational research has shown that in some cases, presenting customized material will help the student learn better. Refer to Figure 3 for a summary of the Activist, Reflector, Theorist, Pragmatist learning styles and preferences.

Figure 3. Description of Four Learning Styles

Activist
  Description: Likes active participation, challenges and competition.
  Kind of material preferred: Examples and sample problems that encourage participation.
Reflector
  Description: Like to spend time reflecting.
  Kind of material preferred: Detailed descriptions of deep ideas that encourage reflection.
Theorist
  Description: Like to fit their observations to models.
  Kind of material preferred: Complex, proof-style, precise material.
Pragmatist
  Description: Likes learning when it provides practical benefits.
  Kind of material preferred: Material that clearly relates to real-world applications.

One caveat is that some students can switch learning styles depending on course constraints. However, many students might have difficulty switching learning styles, and even if they are able to, they might not be as comfortable with the style that they do not naturally use.
Another caveat is that instead of focusing on the styles the student is strong in, it may be useful to try training the student to become stronger in the other learning areas. However, this may be a difficult task for students who have already finished many years of schooling and are in college.

In addition, there have been other proposed learning styles classifications. Marton and Säljö in Sweden have proposed a single learning styles scale that ranges from "deep learning" to "surface learning." Kolb [3] did studies where the student is assessed on "active vs. reflective" and "concrete vs. abstract" learning preferences. Kolb only has 2 scales compared to Honey-Mumford's 4 scales, so his final learning classifications contain less information. Similarities and differences between various scales are discussed in greater depth in a study by Cymeon [28]. We chose to use the Honey-Mumford scale because a previous study by Niewiadomska used this scale, but the other learning style scales would have been equally valid choices.

2.3 Atomization

Atomization is the idea of breaking a course curriculum into individual pieces ("atoms"). These pieces can be sections, paragraphs, sentences, or other types of fragments. The idea of atomization was briefly covered in Niewiadomska [1]. While other intelligent tutoring systems have had to use some basic unit, most papers have not discussed in depth how they came to choose the particular units that they did.

[Figure 4. Atomization Can Be Done at Different Levels. Panels: original material; chapter by chapter; idea by idea; sentence by sentence.]

Atomization can be done at many levels. For instance, you could specify that every chapter was a single atom, and break the material down that way. Or, you could let every small independent idea be an atom.
A finer grain like setting every sentence as an atom would give you many atoms but it would become harder later to meaningfully reassemble the atoms into a coherent curriculum. Figure 4 shows several levels that atomization could be done at.

No "best" atom size has been established yet. Setting each idea as an atom works best for many applications, but even within this atom-size choice there are finer classifications. Ideas come in many sizes, and it is not obvious what size of idea makes for the best atoms.

Course curriculum atomization and the problem of how to best reassemble the atoms has parallels to the "atomization" done in fields like nanotechnology. With course atomization we have to decide what size atoms we should use and how we can better decontextualize the atoms so that they can still fit together when combined with atoms from other sources.

2.4 Traditional Course Sequencing

The course sequencing idea has been around for over 10 years. "Traditional course sequencing" systems work by atomization and overlaying knowledge models.

[Figure: prerequisite graph over atoms A, B, C, D, E, F]
Figure 5. Atomization Example

In traditional course sequencing systems, the atoms are organized into a graph based on prerequisites and effects. In Figure 5, A is a prerequisite for B. B is a prerequisite for C and E. This can be represented by a directed graph with arrows coming from prerequisites.

Two knowledge models are then used for each student: the first is for what knowledge the student already has, and the second is for what knowledge the student desires. For instance, Alice might know A and E, while desiring knowledge about C. Bob might know nothing beforehand, and desire knowledge about D. The course sequencing system runs by finding a path through the material that connects what the student knows with what the student would like to know. So, Alice's path would be A -> B -> C, while Bob's path might be A -> B -> C -> D.

DCG/CoCoA and ELM-ART are examples of traditional course sequencing systems.
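The path-finding step described above can be sketched in a few lines. This is only an illustration, not the method used by DCG or ELM-ART: the edges C -> D, E -> F and F -> D are assumed here (the text only states A -> B, B -> C and B -> E), and the function name `sequence_course` and its `root` parameter are hypothetical.

```python
from collections import deque

# Hypothetical prerequisite graph: an edge X -> Y means X is a prerequisite for Y.
# Only A -> B, B -> C and B -> E come from the text; the other edges are assumed.
PREREQ_EDGES = {
    "A": ["B"],
    "B": ["C", "E"],
    "C": ["D"],
    "E": ["F"],
    "F": ["D"],
    "D": [],
}

def sequence_course(edges, known, goal, root="A"):
    """Breadth-first search for a shortest path to `goal`.

    The search starts from every atom the student already knows,
    or from `root` if the student knows nothing.
    """
    starts = sorted(known) if known else [root]
    queue = deque([s] for s in starts)
    visited = set(starts)
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # the goal cannot be reached from what the student knows

alice_path = sequence_course(PREREQ_EDGES, {"A", "E"}, "C")
bob_path = sequence_course(PREREQ_EDGES, set(), "D")
```

Under these assumed edges, Alice's search reaches C through A and B, and Bob's search returns the shorter of the two routes to D. Note that the search looks only at knowledge; nothing in it depends on the student's background or learning style.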
DCG forms a model of a student's current knowledge and desired knowledge, and constructs a path. If it finds that the student is doing poorly, it generates a new path through the material that avoids the more difficult material.

One weakness of traditional course sequencing systems is that they do not customize for student backgrounds and learning styles. Two students with vastly different backgrounds and learning styles would get the same path through the material as long as they had the same initial knowledge and same final desired knowledge.

2.5 Customization Systems

"Customization systems," on the other hand, give different learning materials to students with different backgrounds and learning styles. Instead of "previous knowledge," the customization system's student model stores information like the student's background, interests, time allocated, and learning style (see Figure 6).

Student model categories:
* math / science background
* major / pedagogical information
* abstraction and other capabilities
* interests and learning goals
* time allocated
* learning style
* motivation / affective state

Figure 6. Student Characteristics Model

A customization system also has several full sets of curriculum, as shown in Figure 7. The different sets of curriculum might be specially designed to suit different backgrounds or different learning styles. When a new student enters the system, the student takes a psychology test to build the student characteristics model (of items described in Figure 6). Then, the student is assigned to the stored version of the course that best matches his or her particular student model.

[Figure: two stored versions of the curriculum, version 1 and version 2]
Figure 7. Customization System Example

Java Tutorial [24] is an example of a customization system. Java Tutorial is a system developed in Japan for teaching Java programming.

While customization systems offer the advantage of customizing towards a student's background and learning style, they also have disadvantages.
These systems are expensive to set up, as even just two custom versions of the curriculum can take a long time to develop. Also, these systems will teach a student the full curriculum (A through F) even if the student just needed to learn one particular atom of information (e.g., D).

2.6 Hybrid Systems

There are existing hybrid systems that combine the traditional course sequencing approach and the customization approach in order to customize for both knowledge levels and user characteristics.

[Figure: prerequisite graph over atoms A through F, with alternative versions of individual atoms]
Figure 8. Hybrid Systems Example

If we have the two sets of curricula as shown in Figure 7, we do not necessarily have to present the full sets of the curricula to the student in order. We could run the traditional course sequencing first, and then the customization system part after a path is determined. As shown in Figure 8, path planning would be done first, then if the path goes through E, the system would choose the version of E that fits better.

For instance, let's say that Alice knows A and E, while desiring knowledge about C. Meanwhile Alice's learning style best matches the learning style catered to in version 2 of the curriculum. The way the default hybrid system works would be to first run a traditional course sequencing system to find the A -> B -> C path for Alice. Then, it would run the customization system and realize Alice is closest to curriculum 2. It would then give Alice the A -> B -> C material from curriculum 2.

AST and ID are hybrid systems. AST's learner model is based on the user's background, preferences, and goals. Constant testing keeps the system aware of the user's knowledge level, so that AST can dynamically re-plan the traditional sequencing based on how well the student is learning.

At first glance, this approach combines the best of both worlds. Alice is only presented with what she needs to know (A -> B -> C), and she is presented with material that suits her learning style (curriculum 2).
However, instructors often do not have several good sets of curricula which cater to different backgrounds while covering identical concepts. Also, writing such custom curricula is very difficult [26].

2.7 Importance of Customizing for Characteristics in Rich Domains

Niewiadomska [1] explored the idea of how to dynamically deliver course material for the rich domain of fluid dynamics. "Rich domain" refers to domains where there are many ways to solve particular problems. She found that students' academic performance and class satisfaction are dependent on learning styles, and that giving different lectures is necessary for evaluating students most accurately.

2.8 Open Questions

In rich domains, there are often several possible correct ways to learn something. If we collect two textbooks in a rich subject, both of which teach D (see D from Figure 5), we are more likely to find A -> B -> C -> D in one textbook and A -> B -> E -> F -> D in the other textbook, than A -> B -> C -> D taught with different styles in the two textbooks.

This scenario also arises when we only have one textbook which includes both paths, but the textbook covers more material than there is time to cover during the course. Plus, in the future as more and more courses (some of which teach overlapping concepts) are uploaded, there are sure to be multiple atomically different online paths for teaching the same concepts.

When faced with a setting like this, the hybrid system can provide no benefit over the traditional approach because it does not have multiple sets of atomically similar curricula to use. Existing systems have avoided this problem by staying with one book of material and using just traditional course sequencing, or by taking the extra effort to come up with extra sets of material.
If we want to take advantage of both knowledge sequencing and customization (in order to attain the best student performance and class satisfaction) without having to write substantial amounts of new educational material, then we need to look into developing a new system.

3 A Study of the Problem

3.1 The Problem

Before we go on, let's re-examine the question. The large AI/Computer Science question is: (Can/how can) computers help (us teach/students learn) more effectively? We can state one aspect of this larger question as the following problem: Given n atoms, from which of the n! paths through the atoms can the student learn the best (taking into consideration how much she learns, how long it takes, how easy it is for her to learn, etc.)?

Figure 9. Which Path Through the Material is Most Effective?

Figure 9 illustrates this idea. We have books where the source atoms come from. After extracting the atoms, they get put into a graph. If there are no restrictions on which path to choose, then we end up with more paths through the material than a program would have the time to examine. So, we need a good algorithm for finding paths through this graph.

Now, let's consider some issues in choosing and designing an algorithm. For example, do we want an algorithm that could possibly output a chapter atom by atom exactly from one of our sources? For some students, this kind of path might be the optimal path we could construct. Some algorithm choices (e.g., "random path" and possibly "beam search") will be able to output this kind of path, while other algorithms (e.g., "partition/search") may impose constraints that prevent this kind of path from being chosen.

Let us define a metric of path cohesion (hereafter "c") in terms of prerequisite satisfaction from 0 (low) to 1 (high). If we take a textbook and randomly generate a path, then c will be low and the student is more likely to get confused.
We can approximate c by surveying people who have looked at (or tried to learn) from a path. Similarly, we define characteristics-match-distance (hereafter "d") for how well the path of atoms matches the student's characteristics (like learning style). A random curriculum should have a higher d value (a worse match) for a student than a curriculum that is custom-generated to match the student's characteristics.

For each student, there will exist a single optimal path (or a few equally-optimal paths) through the atoms that best fits her characteristics. If we are able to consistently find the best path(s), then we could learn some interesting things. For example, will we notice that student A's best path requires fewer atoms than student B's best path? If so, would this mean that student A can learn just as well when presented with less material than student B? This would be an interesting result.

3.2 Constraints versus Possible Paths

The optimal strategy for our problem will involve adding constraints to what the path can be. Constraints (e.g., "paths must lie on a directed prerequisites graph" or "paths cannot be longer than 40 atoms") are imposed by the algorithms to reduce the total number of paths being considered from a computationally infeasible number to a more manageable number. The optimal strategy will also involve many possible choices for what the path can be, because different paths work best for different students.

Figure 10 shows a graph where the x-axis is "# of constraints" and the y-axis is "# of possible paths." All possible strategies for solving our problem lie somewhere along this graph. We want to find the optimal strategy, and where it lies on the graph.

[Figure axes: "# possible paths that can still be found by the algorithm" (y-axis) versus "# constraints imposed on paths" (x-axis). Labeled points: random path of length l, O(n^l); random path where you only get each atom up to once, O(n!); beam search with beam width of c, O(c^l); collaborative filtering; partition/search; everyone gets the same predefined path, O(1). A marked region is labeled "the optimal algorithm should fall in here."]
Figure 10. Graph of Constraints versus Possible Paths for Algorithms

3.3 Extremes

First, let us consider the strategies at the extremes of Figure 10. If our strategy is to pick a random path of random length, then this imposes no constraints and allows all possible paths, so it lies on our graph near the Y-axis on a point like ( 0%, infinite ). If we constrain the path to have length at most l (because we know that realistically, the best path is not going to contain a million atoms), then we now have n^l possible paths and a ( 1%, O(n^l) ) point.

We could further constrain the solutions so that each atom can only appear once in the curriculum. This would reduce the number of possible paths to n!, and might lie at ( 2%, O(n!) ). However, it is possible that even this kind of constraint would filter out the best path. Perhaps the best path involves presenting an atom early on and coming back to the same atom again later in a different context.

If our strategy is to give all students the same textbook chapter, then this imposes many constraints leading to only one possible path, and it lies on our graph near the X-axis on a point like ( 100%, 1 ). The best strategy is clearly somewhere between these extremes.

3.4 WalkSAT-Style

The "satisfiability problem" is the problem of finding satisfying assignments to a Boolean formula. For instance, a solution to (X and Y) could be the assignment [X=1, Y=1]. WalkSAT is an algorithm that incorporates random walks to solve the satisfiability problem.

First, WalkSAT guesses a solution to the problem. In our example problem, maybe it guesses [X=0, Y=0]. Then, it picks one of the variables, and flips its value. So, X=0 could become X=1. The algorithm keeps doing this until it randomly "walks" onto a satisfying assignment.
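The guess-and-flip loop just described can be sketched as follows. This is a minimal illustration that assumes the formula is given in conjunctive normal form; it performs only the random-walk step (flip a variable from a randomly chosen unsatisfied clause) and omits the greedy flip-selection heuristics of the full WalkSAT algorithm. The names `satisfied` and `random_walk_sat` are hypothetical.

```python
import random

def satisfied(clauses, assignment):
    """A CNF formula holds when every clause has at least one true literal."""
    return all(
        any(assignment[var] == wanted for var, wanted in clause)
        for clause in clauses
    )

def random_walk_sat(clauses, variables, max_flips=1000, seed=0):
    """Guess an assignment, then repeatedly flip a variable taken from a
    randomly chosen unsatisfied clause until the formula is satisfied."""
    rng = random.Random(seed)
    assignment = {v: rng.choice([False, True]) for v in variables}
    for _ in range(max_flips):
        if satisfied(clauses, assignment):
            return assignment
        unsatisfied = [
            clause for clause in clauses
            if not any(assignment[var] == wanted for var, wanted in clause)
        ]
        var, _wanted = rng.choice(rng.choice(unsatisfied))
        assignment[var] = not assignment[var]
    return assignment if satisfied(clauses, assignment) else None

# (X and Y) written as two one-literal clauses: (X) and (Y).
formula = [[("X", True)], [("Y", True)]]
solution = random_walk_sat(formula, ["X", "Y"])
```

For this two-variable formula every flip repairs one clause without breaking the other, so the walk ends on X and Y both true after at most two flips. Larger formulas give no such guarantee, which is why `max_flips` bounds the walk.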
There are several heuristics used by WalkSAT for finding out which variable it can flip to have the maximum chance of walking closer to a solution.

A "WalkSAT-style" path-finding strategy would take a random path and refine it a large number of times. First, this kind of strategy would need a formula for determining how good a given path was (a "path-evaluating metric"). The path-evaluating metric could be another program that simulates a learner, or it could be a large formula that uses knowledge entered by the expert in the field. After the path-evaluating metric is established, the algorithm picks a random path. Then, it picks an atom along the path to replace during each step of the walk. Eventually, the algorithm should be able to walk its way to a good path.

The key to finding a good path with the WalkSAT-style algorithm is to have a good path-evaluating metric. The number of constraints imposed by the algorithm is very low in theory because if you only run WalkSAT for one step, then it reduces to the "random path" algorithm. However, some path-evaluating metric could impose many constraints. If the metric was "A -> B -> D is the best path" and returned scores corresponding to how close the given path was to A -> B -> D, then this would actually be imposing many constraints on the final path. If WalkSAT was run for a million iterations with this metric, it would almost always converge onto the A -> B -> D path.

The number of paths that can be explored by WalkSAT corresponds to the number of iterations that the algorithm is set to run for. If the algorithm is set to run for 3 iterations, then even though any path in the search space might be hit, the algorithm is really only considering 3 different paths during the run.

Because the number of constraints and number of paths both depend on the exact parameters that the algorithm is run with, it is difficult to place the WalkSAT-style algorithm on any particular point in the constraints versus paths graph.
However, individual instances of this algorithm could be plotted on the graph. For instance, there could be a point that corresponds to "WalkSAT-style with path-evaluating metric A and 100 iterations" and another point that corresponds to "WalkSAT-style with path-evaluating metric B and 2 million iterations."

3.5 Beam Search

In the Beam Search strategy, a domain expert picks the next best constant number ("beam size") of atoms after any particular atom. The beam size can be arbitrarily set (e.g., "5" or "2"), or it can be a value related to the total number of atoms (e.g., "log(n)").

[Figure: atoms a through i, with the tree of postatom choices explored by the search]
Figure 11. Beam Search Only Considers a Set Number of Atoms Per Level

Let's consider the atoms in Figure 11, and arbitrarily choose 2 as a good beam size for this number of atoms. So for Beam Search, the expert needs to pick the next best 2 atoms from every atom. For atom a, the expert might decide that the next best atoms are b and e. By picking b and e, the expert is saying that if all he knows is that atom a was just taught, then he thinks teaching atom b or atom e next would be most appropriate.

After the expert sets up the table (hereafter the "postatoms table"), we can run Beam Search. Let's say you start the search at atom a. First, Beam Search adds atom a to your path. Then, it decides whether atom b or e is a better fit for your learning style. Let's say atom e fits you better. Now, Beam Search adds atom e to your path and looks at atoms f and g (which the expert chose as postatoms for atom e) next. This process continues until a pre-specified path length is reached or until a pre-specified end atom is reached (e.g., we could specify that all paths end after presenting atom i). Beam Search can be run as a one-time search or as a memoryless one-step-at-a-time process.

The main advantage of Beam Search is that it reduces the search space. If we wanted a path of length l but did not have any restrictions, we would have to consider n^l possible paths.
Beam Search reduces this number to (beam size)^l (as shown on the right side of Figure 11), which is considerably lower than n^l. Taking beam size to be log(n), Beam Search would lie around ( 5%, O(log(n)^l) ) on our graph.

One disadvantage of Beam Search is that the postatoms table takes O(n^2) expert time to set up, and that is too much required time. If we had a thousand atoms, our expert would need to make about a million comparisons to set up the table. Another disadvantage is that the Beam Search results are not very good (this results from the memoryless nature). So, we know that we want an algorithm with more constraints and fewer possible paths.

3.6 Partition/Search

The "Partition/Search" strategy adds edges to the atoms and creates a directed prerequisite graph, then it assumes that the best path for each student lies in the directed graph.

Partition/Search begins by creating a directed prerequisite graph in under O(n^2) time. We assume that the expert can keep m (maybe ~10) things in his head at once, and that it is reasonable to ask an expert to create a directed prerequisite graph out of m atoms (this involves O(m^2) work).

The graph-creating procedure involves hierarchically dividing atoms into categories, and can work with any number of atoms. First, decide on m categories that the n atoms can be divided into. Now, assign each atom to one of the m categories. There should be m groups of n/m atoms, as shown in Figure 12. Next, for each of the m groups, divide all the atoms in the group into m more categories. Repeat this process until the groups at the lowest level have <= m atoms. Each level takes O(n) work (the expert assigns each of the n atoms to one of the m categories in his memory) and there are log_m(n) levels, so this takes O(n x log_m(n)) work.

[Figure levels, top to bottom: 1 group of n atoms; m groups of n/m atoms; m^2 groups of n/(m^2) atoms; ...; n/m groups of m atoms. There are log_m(n) total levels.]
Figure 12. Making Large Prerequisite Graphs Level by Level

For instance if you took all the knowledge in the world, this might be a billion atoms. At the highest level, we want m categories to divide the atoms into. One category might be "Scientific Knowledge" and another category might be "Common Sense." Each of these categories would have around 100 million atoms. "Scientific Knowledge" could then be further divided into m categories like "Physics" and "Chemistry." This division would continue for 9 (log_10(1,000,000,000)) levels until the categories in the lowest level each had under 10 atoms.

Now, we create directed graphs within every category and subcategory. Start at the lowest level, which has n/m groups of m atoms each. It takes O(m^2) work to create a directed graph in each group. So, it takes O(n/m x m^2) = O(n x m) work to create all the directed graphs at that level. Next, move up one level. There should be n/m^2 groups, each containing m subgroups. Create a directed graph for each of the n/m^2 groups. This should take O(n/m^2 x m^2) = O(n) work. Continue this process for each of the log_m(n) levels in the hierarchy. The total amount of work needed is O(n x m x log_m(n)). m is a constant, so this reduces to around O(n x log(n)).

We now take all our directed graphs and combine them into a single graph. Wherever we created a directed graph of groups, replace each of the groups with the directed graph of the particular group (from the lower level).

[Figure steps: 1. begin with all the atoms; 2. organize them into concepts; 3. order the concepts; 4. order the atoms within concepts (from start to end)]
Figure 13. Partition/Search Divides the Atoms into Groups

Figure 13 shows the general idea. We start with 9 atoms, then break them into a group of 4, a group of 3, and a group of 2.
In step 3 we create a prerequisite ordering over groups. The ordering we made means that a student will always be given the 4-concept group first. After the 4-concept group, they could go to either the 3-concept group or the 2-concept group. If they went to the 2-concept group, then the curriculum would end afterward without going to the 3-concept group. If they went to the 3-concept group, then they would get the 2-concept group next. In step 4, the individual atoms within the groups get ordered.

Once Partition/Search has the directed prerequisite graph, the challenge is just to find the best path in the directed graph for each individual student. One way to do this is to find the path that has the lowest "average atom distance" from the student. To calculate this, we need a distance formula dist(student model, atom information). Let's say a path has 3 atoms, with respective distances 4, 5, and 6 from the student model. The average atom distance of this path is 5.

We can find the overall path with the lowest average atom distance by using our hierarchical system. First, find the best path (lowest average atom distance) through each of the n/m groups of m atoms. Then, set the lowest average atom distance as the "distance" for that entire group, and find the best path at the next higher level using these distances. Eventually, you will have a best path at the highest level which can be expanded back to get the best overall path of atoms.

Partition/Search assumes that the curriculum can be taught as successions of larger concepts, and takes O( n x log(n) ) work to set up. This strategy lies on the graph closer to the ( 100%, O(1) ) point than the ( 2%, O(n!) ) point because with this strategy, all the possible paths have in effect been "pre-approved" by the expert. This strategy takes a reasonable amount of expert work but filters out the many good paths that do not happen to lie in the directed graph.
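The search for the path with the lowest average atom distance inside a single small group can be sketched as below. This is a brute-force illustration, not the hierarchical procedure itself: it enumerates every start-to-end path in one acyclic group, which is only practical because a group holds about m atoms. The graph, the precomputed distances (standing in for dist(student model, atom information)), and the function names are all assumed for the example.

```python
def all_paths(edges, node, end):
    """Yield every path from `node` to `end` in an acyclic prerequisite graph."""
    if node == end:
        yield [node]
        return
    for nxt in edges.get(node, []):
        for rest in all_paths(edges, nxt, end):
            yield [node] + rest

def average_distance(path, distances):
    """Average atom distance of a path from the student model."""
    return sum(distances[atom] for atom in path) / len(path)

def best_path(edges, distances, start, end):
    """Return the path with the lowest average atom distance and that average.

    Raises ValueError if no path connects `start` to `end`.
    """
    best = min(
        all_paths(edges, start, end),
        key=lambda path: average_distance(path, distances),
    )
    return best, average_distance(best, distances)

# Hypothetical group of four atoms with two routes from a to d.
group_edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
# Assumed per-atom distances from one student's model.
group_distances = {"a": 4, "b": 5, "c": 9, "d": 6}

path, average = best_path(group_edges, group_distances, "a", "d")
```

Here the route through b has distances 4, 5, and 6, matching the worked example in the text, and beats the route through c. In the hierarchical scheme, the winning average would then serve as the "distance" of this whole group at the next level up.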
So, we want a solution with fewer constraints and more possible paths.

3.7 Beam-Partition Hybrid

Beam-Partition Hybrid is a hybrid algorithm between Beam Search and Partition/Search. It runs just like Partition/Search, except that at the lowest level of the prerequisite graph it runs a Beam Search instead of creating a directed prerequisites graph.

Setting up a Beam Search at the lowest level takes the same amount of work as setting up the directed prerequisites graph would have, so Beam-Partition Hybrid takes the same amount of work to set up as Partition/Search. However, Beam-Partition Hybrid is able to explore more possible paths than normal Partition/Search because the Beam Search at the lowest level creates fewer restrictions than a directed graph would have.

In summary, Beam-Partition Hybrid imposes more constraints than Beam Search but fewer constraints than Partition/Search. Beam-Partition Hybrid can explore more possible paths than Partition/Search, but fewer possible paths than Beam Search. Beam-Partition Hybrid takes around O( n x log(n) ) work to set up. We know that the optimal solution on our paths vs constraints graph is between Beam Search and Partition/Search, and Beam-Partition Hybrid is between Beam Search and Partition/Search, so Beam-Partition Hybrid may be worth exploring in greater detail.

3.8 Collaborative Filtering

Collaborative Filtering is a meta-approach. It requires us to use another strategy as a starting point, then it tries to find better results than the original strategy. It also requires us to choose parameters specifying how many "generated" solutions are presented and how many "random" solutions are presented. Collaborative Filtering starts with a training period before it becomes effective.

Let's consider a Collaborative Filtering strategy that uses Partition/Search, with 3 generated solutions and 2 random solutions at each level. During the training period, many students are asked to use the system.
These students are given the atoms one by one. After each atom, they are given 5 (= 3 + 2) atoms to choose from. 3 of the 5 are the next 3 atoms that Partition/Search would recommend, and 2 of the 5 atoms are randomly chosen. The student is asked to pick which of the 5 atoms she would like to see next, and gets the atom that she picks. This procedure is repeated until the student goes through the entire curriculum.

Collaborative Filtering tries to improve itself during the training period. Late in the training period, it is likely that the system will encounter students that have student models similar to past students. When this happens, the system includes the previous student's atom choice in the list of 5 atoms for the current student. If the current student chooses the same atom at the same point as the previous student, then it is likely that the chosen atom fits well after the given atom.

After the training period, the Collaborative Filtering system generates full curricula for new students by finding the paths that were hand-picked by the past students with the closest student models, and recommending those paths.

Because Collaborative Filtering includes randomly chosen atoms during the training period, it is able to find the potentially best paths and does not stay restricted to paths that fit along the prerequisite graph. Collaborative Filtering uses the help of students to try to move along the constraints vs paths graph (Figure 10) toward the optimal solutions.

3.9 Developing Theory

If we develop a lot of theory about how to teach a subject, then we can approach the path-finding problem similar to the way we would approach a planning problem. The idea behind this is that we want to annotate every atom with plenty of pre-conditions and post-conditions.
For instance, an atom on "How to drive a car" would have many preconditions checking if the user was ready to drive a car (e.g., "is-tall-enough," "knows-how-to-walk," "has-5-hours-free-time"). The better the theory we have about what knowledge is needed, the better the algorithm will turn out. The atom will also have many post-conditions describing what attributes the user has after learning the atom. The overall problem now becomes a matter of picking the necessary atoms to go from a starting state of conditions to a desired ending state of conditions.

3.10 Summary Chart

The following table (Figure 14) summarizes some of the algorithms for our path-finding problem, and describes some advantages and disadvantages of each.

Strategy: Pick a random path of (random/fixed) length. (Allow/disallow) repeated atoms.
Advantages:
* Easy to set up.
* No expert work involved.
* Can hit best solution.
Disadvantages:
* 99.99% of the time, you won't get a good answer.

Strategy: WalkSAT-style strategy.
Advantages:
* Can hit best solution.
* Results will be better than pure random.
Disadvantages:
* A good prerequisite enforcement score function is hard to come up with, and may take a lot of expert time or deep domain knowledge.
* Students with the same student model might get different curricula.
* Results are only as good as the goodness-of-fit metric.

Strategy: Beam Search, either through the entire curriculum, or as a memoryless 1-step-at-a-time process.
Advantages:
* Simple to understand.
* Can return paths an expert would give.
Disadvantages:
* O(n^2) work to set up.
* Hard to efficiently add more atoms later.
* Does not enforce prerequisites as well.

Strategy: Partition/Search strategy.
Advantages:
* Scales well if more atoms are added later.
* O(n x log(n)) expert work.
* Enforces prerequisites.
Disadvantages:
* The large reduction of overall search space leaves good paths but might cut out the best path and/or many clever paths.

Strategy: A Collaborative Filtering strategy. Pick an existing strategy and choose how many options come from that strategy and how many random options to present.
Advantages:
* Can hit best solution.
* Students can be given paths that other students have chosen and done well with.
* System gets better with time.
Disadvantages:
* Takes many students before much improvement is shown.
* If the material changes, adjusting the system requires many additional student-hours.

Strategy: A Planning Algorithm. For instance, POP or FF.
Advantages:
* There are many established planning algorithms to choose from.
Disadvantages:
* Planning is more about finding "any good path" than "the best path." Our problem is more about finding the "best path."

Strategy: Develop theory on the best way to teach a topic and on the structure of the topic. For each atom, mark what you need to know before and what you would learn. Then, search from starting knowledge to desired knowledge.
Advantages:
* Can hit best solution if detailed enough.
* Generally good solutions that account for starting knowledge.
Disadvantages:
* Very domain dependent and requires theory on the best ways to teach topics in the particular domain.
* Gets complicated when looking at many factors per atom.

Figure 14. Path-Finding Algorithms Summary Chart

"Dynamic delivery" of content is an interesting idea that normally takes the form of periodic quizzes and re-planning. On the one hand, we can create a system that outputs an entire curriculum for the user. On the other hand, we can create a system that evaluates the user as she is going along, and adapts to the user's performance. The algorithms explored in this chapter did not use dynamic delivery because our formulation of the problem (as optimal-path-finding across atoms) did not require dynamic delivery.

4 Implementation

4.1 Objectives

To further enhance the effectiveness of the course sequencing method, it is important to consider the following factors:
1. The ability of a system to use existing curricula instead of requiring development of new curricula.
2. Human teachers naturally teach examples and reading assignments from different sources.
It would be useful for a system to be able to mix together material from different sources.
3. Greater emphasis on customizing for student characteristics.

The system developed for this project improves on traditional course sequencing with factors 2 and 3. It improves on customization systems with factors 1, 2, and the ability to do course sequencing. It improves on existing hybrid systems with factors 1, 2, and 3.

[Figure labels: curricula; xeroxed pages; scanned atoms; atoms; learning styles quizzes; atom learning styles classifications; postatoms table; beam search; prerequisites graph; partition search; outputs custom curricula]
Figure 15. Overall Implementation Steps

We decided to implement Beam Search and Partition/Search in order to explore the implementation issues involved. The steps involved in this implementation can be seen in Figure 15. Afterward, we also conducted an experiment where we ran two variants of our Partition/Search on MIT students, in order to gauge the effectiveness of customizing for learning styles in ITS.

In addition to educational applications, the proposed research also has artificial intelligence applications. Data on the effectiveness of a rich hybrid system could be used to support theories about the role that learning styles play in student knowledge representation. Also, the system attempts to algorithmically reproduce some of the intelligence behind effective teaching.

4.2 Domain Choice

The ideal subject for this experiment is one that has richness of material, different possible modes of teaching, and more focus on larger principles and ideas than on particular methods. Poor subject topics for this experiment would be those that emphasize heavy memorization (such that all students must get the same material), or subjects that are already "solved" (there is less flexibility in what to teach students). Many subjects now taught at MIT would work well.
A few examples include: Computer Systems Engineering, Hydrodynamics, Linear Algebra, and Structure and Interpretation of Computer Programs.

We chose the topic of "Planning" for this project. Planning is a way for programs to achieve goals by constructing plans of actions. Planning is good for this problem because:

* It can be taught in different contexts and from different perspectives. MIT has several classes that teach this topic, such as 6.034 (Intro to AI), 6.825 (Techniques of AI), and 6.834 (Embedded AI).
* It has different aspects that would appeal to different people. For example, industry engineers might be most interested in real-world planning with feedback, new AI students might be interested in blocks-world planning, algorithms people might be interested in learning it as a new set of algorithms, and KR people might be most interested in how it takes advantage of knowledge representation to accomplish tasks.
* There are multiple "correct" ways to learn how a program could plan: partial order planning, randomized planning, hierarchical planning, etc.
* The field is not solved yet. There is still active research being done.
* It focuses more on learning and applying concepts than on significant amounts of memorization.
* Accessible - Planning is covered by most A.I. textbooks, so there is a solid amount of textbook-style material out there about it, and it would not be too difficult to find more material if needed.
* It is not so hard that very few volunteers/students would be able to learn it, and it is not so easy that almost all potential volunteers/students already know it. Students with some math/algorithms background should be fine (and there are many of those at MIT).

4.3 Atomization Design Choices

4.3.1 Atom Sizes

A straightforward way to atomize (and the one we used in this project) is to split different textbook sections into different atoms. This way, each atom contains a complete idea, and we get some level of modularity.
For example, the student could get the "How Planning Relates to Problem Solving" half-page atom from one source and then the "Basic Planning Knowledge Representation Overview" two-paragraph atom from another source, and it would still make sense. If we had atomized by turning every paragraph into an atom, it would be harder to combine different atoms into a passage that still makes sense.

Each figure (usually a picture or an algorithm) becomes its own atom. A diagram (and its caption) describing the Partial Order Planning (POP) algorithm from one source could easily be of use to someone who got their general POP description from another source.

Our atoms came from five textbook chapters (150 pages) of material on the topic of AI Planning: Chapters 11 ("Planning"), 12 ("Practical Planning") and 13 ("Planning and Acting") of Russell and Norvig's Artificial Intelligence: A Modern Approach, Chapter 15 ("Planning") of Winston's Artificial Intelligence, and Chapter 13 ("Planning") of Rich and Knight's Artificial Intelligence. We xeroxed (copied) the textbooks onto sheets of paper, cut and taped the sheets of paper to form our atoms, then scanned the paper atoms to get digital atoms.

From the 150 pages of source material, we got about 150 total atoms. This comes out to an average of 1 page per atom. We believe that these 150 atoms touch most of the general topics in planning, especially because one of the sources covered many topics in planning. If we needed more atoms, we could find more books and online notes about planning. It is likely that any new atoms from additional sources would cover either (1) more advanced topics, or (2) the same topics taught with a different teaching style.

[Figure 16: pages of original material from source R, divided into idea-sized atoms and labeled with Atom IDs R357A, R357B, R358A, R358B, R358C, and R359A.]
Figure 16. How We Atomized and Labeled Atoms

4.3.2 Atom IDs

In our implementation, Atom IDs took the form [letter][number][letter]. The first letter comes from the source of the atom. For instance, R stands for Artificial Intelligence: A Modern Approach by (R)ussell and Norvig. K stands for Artificial Intelligence by Rich and (K)night. W stands for Artificial Intelligence by (W)inston. As more sources are added, we can take unused letters from the authors' names for the first letter.

The number is the page number on which the atom begins in the original source.

If an atom is the first atom from a particular source and page number, then it gets "A" as the second letter. If it is the second atom (i.e., there is another atom from the same source that starts on the same page), then it gets "B." The third atom gets "C," and so on.

This system for assigning Atom IDs would scale until we have atoms from more than 26 different sources. If that happens, we can start using two letters at the beginning of Atom IDs. For the purposes of this project, allocating [1 letter][up to a 4 digit number][1 letter] (up to 6 characters per ID) is enough.

Figure 16 shows how we took the source material, divided it into atoms by letting each section/idea be an atom, and how we labeled the atoms.

4.3.3 Amount of Material

A learning objective like "The student gains an understanding of how an AI system can plan actions to achieve a goal" with tests like "The student is able to describe a planning algorithm to solve a relevant problem" should be achievable with about 20 pages worth of material.
In a textbook, this would be enough space for a new student to learn what planning is, one or more knowledge representations, one or more basic algorithms, some examples, and possibly a more advanced topic in planning that they are interested in. Given the average of 1 page per atom, this would come out to an average of around 20 atoms per student. So for Beam Search, we set the path length to 20.

4.3.4 Algorithms Used for Atomization

The following is the procedure used to atomize and create the Partition/Search prerequisite graph:

input: material from textbooks
output: computer graph of atoms

> Copy the material onto sheets of paper with a copy machine.
> Cut and tape the papers to form "atoms." An atom should express an idea. Diagrams and sections of chapters make good atoms. Sometimes, a section of a chapter will cover several ideas, and should be broken into several atoms.
> Scan the atoms. Edit them digitally to make them more context-free (remove chapter numbers and blatant references to what was "just taught" or "will be covered next").
> Classify each atom based on how well it matches each of the 4 learning scales (Activist, Reflector, Theorist, Pragmatist).
> Take the total number of atoms, n, and calculate the number of levels y = log(n)*
> Calculate the branching factor per level, b = the yth root of n (so that b^y = n). b should be at most the value we chose for m (10).
> Divide the material into b concepts, order these concepts in a prerequisite graph, and assign each atom to one of these concepts.
> For each of the b concepts
    > Divide the concept's atoms into b smaller concepts, order these concepts in a prerequisite graph, and assign each of the concept's atoms to one of the smaller concepts.
> Repeat the dividing and assigning until you've done it a total of y times. You should end up with a digraph of <= 10 concepts, each of which contains a digraph of <= 10 concepts, and so on. The lowest layer should contain <= 10 atoms as the concepts.
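The level and branching-factor arithmetic in the procedure above can be sketched as follows. This is only an illustration under stated assumptions, not code from the project: the function name `hierarchy_shape` is hypothetical, the level count uses the rounded-up form y = ceiling(log(n)/log(m)), and a single level is assumed when n is at most m.

```python
def hierarchy_shape(n_atoms: int, max_nodes: int = 10) -> tuple[int, float]:
    """Return (levels y, branching factor b) for a hierarchy over n_atoms.

    y is the smallest integer with max_nodes**y >= n_atoms, which equals
    ceiling(log(n)/log(m)) for n > 1. It is computed with integer powers
    to avoid floating-point rounding in the logarithm.
    b is the y-th root of n, so that b**y == n (up to rounding).
    """
    if n_atoms < 1 or max_nodes < 2:
        raise ValueError("need n_atoms >= 1 and max_nodes >= 2")
    levels = 1  # assumption: always use at least one level
    while max_nodes ** levels < n_atoms:
        levels += 1
    branching = n_atoms ** (1.0 / levels)
    return levels, branching


if __name__ == "__main__":
    # The 150 atoms and m = 10 used in this project.
    y, b = hierarchy_shape(150, 10)
    print(f"levels y = {y}, branching factor b = {b:.2f}")
```

With n = 150 and m = 10 this gives three levels and a branching factor between 5 and 6, which stays under the limit of 10 nodes per prerequisite graph.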
* Each layer of the hierarchy should be a directed prerequisite graph with a reasonable number (e.g., under 1000) of possible paths (normal digraphs with under m=10 nodes should work). y = ceiling(log(n)/log(m)) reduces to log(n) (base 10, rounded up) when m is 10.

As discussed in the algorithms section, it is important to choose the right b. If b is too low, then your paths will not be as rich. If b is too high, then it will become very difficult for the expert to create a prerequisite graph out of b atoms.

An algorithm like the one we used allows the system to be very scalable. If we want to add an atom to the system, we just find where it goes and insert it. If a future incarnation of the system uses 1,000 instead of 100 atoms, things will still work out because the person or system doing the atomizing only has to look at creating prerequisite graphs for 10 categories or atoms at a time. They would have a lot more trouble if they tried to draw a prerequisite graph between all 1,000 atoms.

4.3.5 Atomization Results

After atomization, we have 150 atoms floating around with no particular ordering. The first thing we did was to arrange the atoms into twelve larger concepts. Then, we created a prerequisite ordering over these concepts (see Figure 17).

[Figure 17: prerequisite graph over the larger concepts. Legible node labels: The STRIPS representation, Simple planning domains, Planning operators, Planning exercises, Partial-order planning, Post-POP ideas, Examples of real-world planning, Hierarchical planning, Advanced topics, More expressive operators, Planning and acting.]
Figure 17. Larger Concepts Graph

Within each larger concept, we also ordered all the atoms into a sequence that could be presented to students. The overall picture of all the atoms can be seen in Figure 18. Notice what the zoomed-in figure looks like (Figure 19).

There are some tricks involved in making the Prerequisite graphs.
Note that if we want to teach both STRIPS and Predicate Logic but the ordering does not matter, this could still be represented in the graph (see Figure 20). Additionally, if we had 3 atoms and wanted to be able to teach any combination of them, a structure like the one shown in Figure 21 would suffice.

[Figure 18: full prerequisite graph; no labels are legible at this scale.]
Figure 18. Actual Prerequisite Graph for Partition/Search on the Topic of "Planning"

[Figure 19: zoomed-in portion of the prerequisite graph. Legible nodes: concept boundary markers (C2end, C3end, C4start, C4end, C5end, C6start) and atoms R341A, K334A, K336A, R343A, K332A, R359B, K334C, W323A, W325A, K337A, R360A, K338A, R344A, R361A, W345A.]
Figure 19. Zoomed-In Prerequisites Graph

[Figure 20: graph fragment with STRIPS and Predicate Logic atoms reachable in either order.]
Figure 20. Atoms Without Ordering Constraints

[Figure 21: graph fragment with a start node, three atoms, and an end node.]
Figure 21. Atom Arrangement Where Any Path Will Work

At the lowest level (of the hierarchy mentioned in Figure 12), we established "teaching objectives" in order to help us create the directed graph across the atoms. When ordering the graphs, we made sure that every path covers at least the teaching objectives. Some examples of teaching objectives are:

1. "Overview": introduces the idea of planning.
2. "Key Basic Ideas": teaches fundamental ideas including the difference between planning and problem solving, and also why simple search is not enough.

After creating the prerequisites graph, we analyzed all the possible paths through the graph and found:

Number of distinct paths through larger concept graph: 128 paths
Shortest path through graph: 6 concepts
Longest path through graph: 13 concepts

Number of distinct paths through atom graph: 24,256 paths
Shortest path through graph: 13 atoms
Longest path through graph: 117 atoms

We are most interested in the paths that correspond to the student models. For instance, if the system can only consider 81^2 = 6,561 different student models, then the 6,561 paths that correspond to these models are the most interesting to us. Are many of these paths similar, or are most of them different?
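Path statistics like the ones reported above (number of distinct paths, shortest path, longest path) can be obtained by exhaustively enumerating paths through an acyclic prerequisite digraph. The sketch below is a minimal illustration rather than the project's code; the adjacency-list representation, the function names, and the toy graph are assumptions made for the example.

```python
from collections.abc import Iterator

# Hypothetical representation: node -> list of nodes that may follow it.
Graph = dict[str, list[str]]


def all_paths(graph: Graph, start: str, end: str) -> Iterator[list[str]]:
    """Yield every path from start to end in an acyclic directed graph."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            yield path
            continue
        for successor in graph.get(node, []):
            stack.append(path + [successor])


def path_stats(graph: Graph, start: str, end: str) -> tuple[int, int, int]:
    """Return (number of paths, shortest length, longest length).

    Lengths count nodes, including the start and end markers.
    """
    lengths = [len(p) for p in all_paths(graph, start, end)]
    if not lengths:
        raise ValueError("no path from start to end")
    return len(lengths), min(lengths), max(lengths)


if __name__ == "__main__":
    # Toy graph: atom "a" may be followed by "b" or skipped straight to the end.
    toy = {"start": ["a", "b"], "a": ["b", "end"], "b": ["end"]}
    print(path_stats(toy, "start", "end"))
```

Exhaustive enumeration is practical here only because each layer of the hierarchy is kept small; the number of paths can grow exponentially with graph size.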
To get a feel for the diversity of this path space, we enumerated the 81 paths obtained by assigning Low, Medium, or High to each of Activist, Reflector, Theorist, and Pragmatist (3^4 = 81 combinations). We initially expected there to be in the range of 40-50 different paths, with a dozen very different paths and the rest looking like modifications of those dozen (e.g., a different example presented, a different section for a particular concept). Surprisingly, we found that there were 77 distinct paths among the 81 paths we generated.

Number of distinct paths through atom graph: 77 paths (out of 81)
Shortest path through graph: 18 atoms
Longest path through graph: 44 atoms

4.4 Learning Styles Design Choices

[Figure 22: screenshot of the "Learning preferences" web form. Instruction text: "For each of the following rows, select a description of the kind of learning material that most appeals to you, by selecting its corresponding letter in the drop-down box to the side." Each row offers contrasting descriptions of learning material and a drop-down choice, followed by a "Submit your choices" button.]
Figure 22. Learning Preferences Chooser

We rated each atom on how well it fit each of the four learning styles we used (Activist, Reflector, Theorist, Pragmatist). We used a scale with levels of Low, Medium and High for this fit.
For example, material that strongly relates to real world applications would receive a Pragmatist rating of High, while material that is pure theory would receive a Pragmatist rating of Low (but probably a Theorist rating of High). We considered a scale of 1 to 5, but it was not very clear to us what the difference between a 2 and a 3 would be. The full chart used to judge the material is included in Appendix A.2.

In order to create student learning styles models, we set up two separate online quizzes. The first quiz (shown in Figure 22) lets students select their learning style directly. It displays a chart similar to that shown in Appendix A.2 (without listing the names of the learning styles), and asks them to select what kind of learning material they prefer. It takes about 2 minutes to complete.

[Figure 23: screenshot of the "Learning Styles Assessment" web form. Introductory text: "This questionnaire is designed to identify your preferred way of learning. Administration of the questionnaire is not timed and will probably take about 10 minutes. The accuracy of the results depends on an honest appraisal of yourself. There are no right or wrong answers. If you agree more than you disagree with a statement then answer 'Yes'. If you disagree more than you agree then answer 'No'. If you are unsure then answer 'Can't decide.'" Sample items: "Do you often change your interests?", "Do you often leave things to the last minute?", "Do you often do things on the spur of the moment?", "Do you like work that involves action rather than profound thought and study?"]
Figure 23. Learning Styles Assessment

The second quiz is an 80-question learning styles questionnaire that takes about 10 minutes to complete. This quiz was developed by Cymeon Research; both the quiz and a paper describing it [28] are available on their website. This quiz uses questions like "Do you often change your interests?" and "Do you often do things on the spur of the moment?" in order to gauge users' preferred ways of learning. Our implementation of this quiz can be seen in Figure 23.

Both quizzes output a ranking of Low, Medium or High on each of the four learning styles. (The Cymeon quiz actually outputs a score from 0 to 40 on each of the learning styles before we map those values onto a Low, Medium, High scale.)

In our experiments, we use the results from both quizzes in determining the student learning styles. In terms of distance calculations, we use the formula:

distance = ((distance using first learning styles quiz) + (distance using second learning styles quiz)) / 2

so we are using an average distance.

4.5 Implementation of Beam Search

"Beam-Search" was the strategy we implemented first after setting up the atoms.

4.5.1 Issues in System Design

There are several issues in system design. Is it possible that the system will return AIMA chapters 11, 12, and 13 exactly for a particular user? Is it possible that some learning styles will get fewer atoms than others?
The way we implemented Beam Search, it is possible to get the exact material back, and every student gets the same number of atoms.

4.5.2 Distance Calculations

We used a form of "squared distance" measurement (Low=1, Medium=2, High=3; the distance between Low and High is (1-3)^2 = 4, and the distance between High and Medium is (3-2)^2 = 1). This works well with the Learning-Styles Preference-Chooser, but does not work as well with the full Learning Styles quiz. Let's consider material that is low-activist, high-reflector. If an individual is a strong-activist and a strong-reflector, then she would be able to handle this material well by using her strong reflector skills. However, someone who is low-activist and low-reflector would be stuck on this material because she has not developed the skills needed to master the material. The distance calculations should be adjusted for the full Learning Styles quiz so that material more difficult than that which the student can handle is penalized more heavily (e.g., -40 and -10 instead of -4 and -1 in the distance calculations).

A different way of calculating distances could have been to create a 3x3 table. In this kind of table, we would want user=high and material=high to be the best match. For instance, this table might look like Figure 24.

               low material       medium material    high material
low user       2                  2                  5 (worst match)
medium user    2                  1 (2nd best)       2
high user      5 (worst match)    2                  0 (best match)

Figure 24. Distances Table

4.5.3 Creating the Postatom Table

When atomizing, we noticed that there were two main types of atoms: those which start a new concept and can be presented anywhere (hereafter "Category 1 atoms"), and those which refer strongly to the material around them (hereafter "Category 2 atoms"). This effect of having two main types of atoms comes from how some authors refer strongly to what was just taught or what will be taught next.
One example of the second type of atom is the use of examples in a text that are referred to across multiple sections. From any given atom, you can present almost any Category 1 atom next, without confusing the student too much. However, if you present the wrong Category 2 atom, then the student is likely to be confused.

We created a list of all the Category 1 atoms, with the corresponding Category 2 atoms that can be presented after each Category 1 atom (and Category 2 atoms are stored in ordered "chains" to preserve how they refer to each other). We also assigned the Category 1 atoms to the "larger concepts" that they fit in.

We chose to consider 5 postatoms from each atom. To identify the 5 atoms that come after any particular atom, we followed these steps:

1. Identify the one atom that sequentially followed the original atom (in the text). This is the first of the postatoms.
2. If the atom is part of a Category 2 chain, then put the atoms following it in this chain into the postatom list (up to 5).
3. If we still do not have 5 postatoms, then take turns picking atoms from (1) the current "larger concept," and (2) the "larger concepts" that come directly after the current "larger concept" on the concept map. Within each "larger concept," the Category 1 atoms are ordered (arbitrarily). When picking more atoms from the current "larger concept," only pick other atoms that come after the current atom. (First you would pick the next available Category 1 atom, then you would pick that atom's Category 2 atoms, then you would find the following Category 1 atom, and so on.) This prevents infinite loops from happening within a single "larger concept."

The way we implemented postatoms was not true to the original method of having the expert compare every pair of atoms. Our expert considered doing this, but then he decided it would take too long (150^2 = 22,500 comparisons) and came up with the system we just described.
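The three selection steps above can be sketched in code. This is a hedged illustration only: the function name, the dictionary-based data structures, and the toy atoms are all assumptions made for the example, and the project's actual postatom table was built by hand by the expert. Each larger concept's atom list is assumed to be already flattened in presentation order (each Category 1 atom followed by its Category 2 chain).

```python
from itertools import zip_longest
from typing import Optional


def postatoms(
    atom: str,
    next_in_text: dict[str, Optional[str]],  # atom -> atom that follows it in the source text
    chain_after: dict[str, list[str]],       # atom -> later atoms in its Category 2 chain
    concept_of: dict[str, str],              # atom -> its larger concept
    concept_atoms: dict[str, list[str]],     # larger concept -> ordered atoms
    next_concepts: dict[str, list[str]],     # larger concept -> concepts directly after it
    k: int = 5,
) -> list[str]:
    """Pick up to k postatoms for `atom` following steps 1-3."""
    picked: list[str] = []

    def add(candidate: Optional[str]) -> None:
        if candidate is None or candidate == atom or candidate in picked:
            return
        if len(picked) < k:
            picked.append(candidate)

    # Step 1: the atom that sequentially follows in the text.
    add(next_in_text.get(atom))
    # Step 2: the rest of the atom's Category 2 chain, if any.
    for follower in chain_after.get(atom, []):
        add(follower)
    # Step 3: alternate between later atoms of the current concept
    # and atoms of the concepts that come directly after it.
    concept = concept_of[atom]
    ordered = concept_atoms[concept]
    later = ordered[ordered.index(atom) + 1:]
    following = [a for c in next_concepts.get(concept, []) for a in concept_atoms[c]]
    for current_pick, next_pick in zip_longest(later, following):
        add(current_pick)
        add(next_pick)
    return picked


if __name__ == "__main__":
    # Toy data; the IDs follow the thesis format but the grouping is invented.
    concept_atoms = {"C1": ["R341A", "R341B", "K332A", "W323A"],
                     "C2": ["R343A", "K334A", "W325A"]}
    concept_of = {a: c for c, atoms in concept_atoms.items() for a in atoms}
    print(postatoms("R341A", {"R341A": "R341B"}, {"R341A": ["R341B"]},
                    concept_of, concept_atoms, {"C1": ["C2"], "C2": []}))
```

Unlike the hand-built table, this sketch simply returns fewer than k postatoms when the steps run out of candidates, rather than padding the list with less appropriate atoms.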
While implementing the postatoms list, we also realized it might work to assign probabilities to each of the postatoms (instead of giving them all equal weight). A full postatoms table using probabilities could be stored as a Markov matrix.

[Figure 25: diagram of the website's Perl pages. Legible page roles: About the project; The splash page; Get an account; The main page; Learning Preferences quiz; Learning Styles quiz; Custom Curriculum by search (cc-search.pl).]
Figure 25. Website Diagram

4.5.4 Web Implementation

For running Beam Search (and, later, Partition/Search), we set up a Perl-based website as shown in Figure 25.

4.5.5 Beam Search Results

We chose a beam size of 5 and a path length of 20, so the space of possible Beam Search results was 5^20, or about 95 trillion different paths. The space of actual Beam Search results is limited by the number of distinct student model combinations, which is 81^2 = 6,561 because we have two learning styles quizzes with 81 possibilities each.

From looking at the Beam Search results and running one volunteer subject through Beam Search, we concluded that it did not find very good paths. Two main reasons why Beam Search did not find very good paths are: (1) sometimes there were not 5 good postatoms and (2) Beam Search could not enforce knowledge prerequisites very well.

The first problem is that sometimes there were not 5 good postatoms. In these scenarios, the expert had to add in postatoms that were not as appropriate. The most common case of this is when atom B (e.g. "how to tie your shoe") should definitely come after atom A (e.g. "how to put your foot in your shoe"), and there are no other atoms that teach what atom B teaches. With Beam Search, we must have 5 postatoms from atom A, so in addition to atom B we would be forced to add atoms C (e.g. "how to walk"), D (e.g. "how to put on your jacket"), E and F into the postatoms list.
Now when the program gets to atom A, it would sometimes choose atom C, D, E or F next, when atom B was the only sensible choice. This results in curricula that skip over necessary knowledge.

The second problem is that Beam Search could not enforce knowledge prerequisites very well. Consider a scenario where atom W requires atoms D and E as background knowledge. We tried to express this requirement by making sure all paths to atom W had to have passed through atom E and all paths to atom E had to have passed through atom D. However, two problems here are: (1) Atom W is an advanced idea while atoms D and E are basic ideas. Unless every path is forced to go through atoms D and E, there is no way for postatoms tables to "remember" whether a path went through atoms D and E by the time Beam Search is deciding between atoms V, W, and X. (2) We cannot force any path to go through atoms D and E anyway because there must be 5 distinct choices at each junction.

Looking back, we should have expected problems like these. Beam Search has relatively few constraints, so the path quality is not as high.

4.6 Implementation of Partition/Search

The Partition/Search algorithm improves on Beam Search in several ways:

(1) Partition/Search results will satisfy prerequisite orderings better.
(2) Partition/Search avoids a failure mode of Beam Search, which could be forced into repeating itself if the postatoms of the current atom have all been visited.
(3) Adding more atoms in the future can be done more easily with a Partition/Search system.
(4) Regardless of whether the beam size is chosen arbitrarily or with a function, Beam Search forces you to consider a preset number (the beam size) of postatoms. Let's say this number was 5. For some atoms, there are 7 good atoms that come afterward. For other atoms, there are only 2 good atoms that come afterward. Arbitrarily ignoring 2 of the 7 atoms could cause us to miss a good path, and forcing 3 not-so-good atoms could lead us down a bad path.
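To make the contrast with Beam Search concrete, the following is a minimal sketch of the two-level search that Partition/Search performs; Section 4.6.1 gives the actual steps. Everything specific here is an assumption made for illustration: the function names, the "start"/"end" marker nodes, the toy data layout, and the choice to sum the squared distance of Section 4.5.2 over the four learning styles.

```python
LEVEL = {"Low": 1, "Medium": 2, "High": 3}
STYLES = ("Activist", "Reflector", "Theorist", "Pragmatist")


def squared_distance(rating: dict[str, str], user: dict[str, str]) -> int:
    """Squared distance (Low=1, Medium=2, High=3), summed over the four styles."""
    return sum((LEVEL[rating[s]] - LEVEL[user[s]]) ** 2 for s in STYLES)


def all_paths(graph: dict[str, list[str]], start: str, end: str):
    """Yield every start-to-end path in an acyclic directed graph."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            yield path
            continue
        for successor in graph.get(path[-1], []):
            stack.append(path + [successor])


def best_path(graph, cost):
    """Return (interior nodes, average cost) of the cheapest start-to-end path."""
    best = None
    for path in all_paths(graph, "start", "end"):
        inner = path[1:-1]  # drop the start/end markers
        average = sum(cost(n) for n in inner) / len(inner) if inner else 0.0
        if best is None or average < best[1]:
            best = (inner, average)
    if best is None:
        raise ValueError("graph has no start-to-end path")
    return best


def partition_search(concept_graph, atom_graphs, ratings, user):
    """Pick the best atom path inside each concept, then the best concept path."""
    concept_best = {}
    for concept, graph in atom_graphs.items():
        concept_best[concept] = best_path(
            graph, lambda atom: squared_distance(ratings[atom], user))
    # The stored average distance acts as each larger concept's weight.
    concepts, _ = best_path(concept_graph, lambda c: concept_best[c][1])
    return [atom for c in concepts for atom in concept_best[c][0]]


if __name__ == "__main__":
    theory = dict(zip(STYLES, ("Low", "High", "High", "Low")))
    applied = dict(zip(STYLES, ("High", "Low", "Low", "High")))
    ratings = {"T1": theory, "P1": applied, "T2": theory, "P2": applied}
    concept_graph = {"start": ["intro"], "intro": ["pop", "hier"],
                     "pop": ["end"], "hier": ["end"]}
    atom_graphs = {
        "intro": {"start": ["T1", "P1"], "T1": ["end"], "P1": ["end"]},
        "pop": {"start": ["T2"], "T2": ["end"]},
        "hier": {"start": ["P2"], "P2": ["end"]},
    }
    print(partition_search(concept_graph, atom_graphs, ratings, theory))
```

Because every candidate is a path through an expert-drawn prerequisite graph, the search never has to pad a fixed-size list of successors, which is the beam-size problem described in item (4).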
4.6.1 Partition/Search Algorithm

input: graph of atoms, user learning style
output: custom curriculum

> For each larger concept
    > Find every possible graph path through the concept
    > For each possible graph path
        > Find the average distance between the path's atoms and the user's style
    > Store the best path as the concept's path
    > Store the average distance of the best path as the concept's weight
> Find every possible graph path through the graph of larger concepts
> For each possible graph path
    > Find the average distance between the path's concepts' weights and the user's style
> Take the best path through the larger concepts, and expand it using the stored best paths
> Return the atoms in the expanded path

The Partition/Search algorithm takes advantage of the particular knowledge representation. It finds the best path within each larger concept first. Then, it assigns the best path weight to be the weight for the entire larger concept, and searches over all the paths of larger concepts. By doing this, it is able to find the optimal path through its prerequisite graph very quickly.

The Partition/Search algorithm assumes that we are working with rich domains where there are multiple correct paths through the material. In the Partition/Search algorithm, prerequisite fit is the most important factor. However, after the prerequisite constraints are satisfied, we are still left with many possible paths to give to the user. Customization for learning styles and other characteristics can come in and make the final path choice.

From our own inspection, we decided that the paths generated by Partition/Search are quite reasonable. This is mostly because every possible Partition/Search path has in some sense been pre-approved by the expert.

5 Experiment

5.1 Objectives

We ran an experiment on our implementation of Partition/Search in order to learn more about the effectiveness of Partition/Search and of ITS customizing for learning styles in general.
5.2 Hypotheses

A main hypothesis is that the course sequencing system we implemented (which customizes for learning styles) can provide some advantages for students looking to learn Planning. If the group that the system gave the best-fit learning-style material to shows improvement over the group that got the worst-fit learning-style material, then the claim will have been shown. This improvement would take the form of higher "ease-of-use" responses, better quiz scores, or similar test scores after less student time or effort. If no improvement is shown, then the claim will not have been proven but the results would still be valuable for people designing future such systems.

Some additional hypotheses we decided to test in our experiment are:

Hypotheses for Experiment
* Partition/Search is less coherent than straight textbook.
* Customizing for Learning Styles actually makes a difference in effectiveness compared to using a textbook (both quiz results & meaningfulness).
* Partition/Search is coherent enough that students feel that no necessary knowledge is missing.
* Partition/Search is not redundant, even though it draws from redundant sources.
* The lack of "glue" in this kind of system makes it harder to read.
* Partition/Search could still handle questions that spanned concepts, despite its modularity.
* Some "effectiveness" areas in particular are improved (over the textbook and "worst fit" groups) by customizing with Partition/Search. These areas tell us where to focus the development effort.
* People feel Partition/Search gave them a more appropriate number of examples/theory/applications. (Learning Styles Fit)
* Partition/Search leads to more fatigue. The customizing does not cancel out the disjointedness yet.
* Partition/Search presents material closer to students' internal representation of the material.

Figure 26. Chart of Hypotheses for Experiment

5.3 Procedures

We set up an online system to conduct the experiment (for each student, it handles the consent form, the learning styles questionnaire, and the initial knowledge survey; generates the custom curriculum; and gives the ease-of-use survey and the final knowledge survey). Figure 27 shows the main page that test subjects work with.

[Figure 27: screenshot of the experiment main page. Text: "Welcome, sampleuser! The system creates a model of your learning style from steps 2 and 3. Then, it generates a custom curriculum for you in step 5." Steps legible in the screenshot (step 6 is not legible): before: 1. Consent form (30 seconds); 2. Select your learning preferences directly (2 minutes); 3. Take a learning styles questionnaire (10 minutes); 4. Take the initial knowledge assessment (5 minutes). main: 5. View the custom curriculum part 1 (30 minutes); 7. View the custom curriculum part 2 (30 minutes); 8. Take ease-of-use survey for part 2 (7 minutes). after: 9. Take the final knowledge assessment (10 minutes).]
Figure 27. Experiment Main Page

We sought 20 MIT students to be test subjects. Recruitment was done by sending messages to the MIT Electrical Engineering and Computer Science (EECS) jobs mailing list. We were able to recruit 20 students, but only 18 ended up participating.

COUHES [29] states that "specific groups should be neither favored or (sic) excluded for trials, and subjects should not be selected based on easy availability, convenience, or the ability to manipulate." We feel that recruiting subjects from the EECS mailing list would ensure that we would get a good sample of students who had the necessary computer science background to participate in this study. Also, these subjects come from similar backgrounds, so this minimizes the effects of factors unrelated to our study.
[Figure 28: flow chart. All 18 students complete the Consent Form and the Initial Knowledge Assessment. The chart then splits into two branches (the 9 students randomly chosen for the "best fit" group and the 9 students randomly chosen for the "worst fit" group). Each branch passes through a "Best Fit" or "Worst Fit" Partition/Search Curriculum on concept set 1, Ease-of-use Survey 1, and the curriculum of the opposite fit on concept set 2. Both branches end with Ease-of-use Survey 2 and the Final Knowledge Assessment.]
Figure 28. Flow of Experiment from Test Subjects' View

An alternative to recruiting volunteers would have been to conduct this experiment as part of a class. However, because the hypothesis is that one system will help students more than another system, it was best to conduct this study so that students' grades were not affected. Otherwise, students in one group might feel that students in the other group had an unfair advantage toward getting better grades. Also, if the system was run in addition to live lectures, then there would be noise from how well the students learned during the live lectures.

Half of the subjects were assigned to a best fit group, and the other half were assigned to a worst fit group. Figure 28 shows the flow of the experiment from the viewpoints of the students in the different groups. Each group received the same information about the overall study. As shown in Figure 28, both groups end up getting both the best fit and worst fit curricula. We felt this would help minimize the effects of bad test subjects (e.g., those that have inherent biases toward learning from computer screens).

Each student spent about 60 minutes reading the material, and 30 minutes taking diagnostic quizzes and being interviewed. Most of the research was conducted at MIT Athena clusters or students' home computers, which could load the online system.
We decided the main data from the experiment would be from the ease-of-use surveys because there are so many external factors that could influence the knowledge-survey data. The ease-of-use surveys ask students questions to determine how "meaningful" their learning experience was (on the "meaningfulness" scale recommended by Dr. Mitchell at the MIT Teaching and Learning Laboratory). The ease-of-use survey also asks students to evaluate several attributes of the curriculum that they received, and to judge whether the material had an appropriate quantity of examples/applications/theory/length. It asks the students how well they think they learned the material, whether they are now more interested in the topic, and whether they thought the curriculum was more effective than a lecture/recitation/straight textbook would have been. Lastly, the ease-of-use survey asks questions about the interface (e.g., "was the font size distracting") in order to get feedback about how to better design such systems in the future.

Additional data from the experiment includes student quiz scores and how much time and effort the students claim to have needed before attaining a good understanding of the material.

5.4 Results

The full data can be found in Appendix B. This section reports some statistical calculations performed on the data. The main results are the responses to the ease-of-use surveys. The following data summarizes these responses. Six questions were independent of curriculum type, so they were only asked once.

question (1 = Agree, 7 = Disagree) | mean | std dev | sample size
I am naturally interested in the subject of AI Planning. | 3.611 | 1.685 | 18
I learn best from interacting with human beings (e.g., asking TAs and professors questions). | 2.722 | 1.742 | 18
The way some text referred to previous/past text (that I didn't get) was distracting to me. | 2.888 | 1.640 | 18
The variation in font and font size was distracting to me. | 4.055 | 1.954 | 18
The way images are presented was distracting to me. | 3.666 | 2 | 18
The occasional imperfect scans were distracting to me. | 3.944 | 1.797 | 18

Figure 29. Questions Independent of Curriculum Type

As we can see from Figure 29, "The way some of the text referred to previous text (that [the test subjects] didn't get)" was the most distracting of the four distraction choices. Second most distracting was the way the images were presented. The text refers to images, but a student only gets an image if the image atom is part of his or her curriculum. The images that a student does get appear after the associated text, not interleaved with it. There are probably better ways to handle images (e.g., images appear in separate pop-up windows when you click on the associated text) than what we did. The imperfect scans and the variations in font size were not really issues, because the mean responses were about 4 on the 1 (agree) to 7 (disagree) scale.

The rest of the questions were asked after both the "best fit" and "worst fit" curricula, so there are two sets of data per question. The "for" data corresponds to the "best fit" group and the "aga" (against) data corresponds to the "worst fit" group. We calculated 90% confidence intervals for what the true mean of the data sets should be, so that we could tell whether the true mean was likely to be above 4, below 4, or if we did not have enough data points to tell.

Figure 30 shows the means, standard deviations, and 90% confidence intervals. For these questions 1=agree and 7=disagree, so the 90% intervals that lie completely below 4 are marked with "agr" for agree, and the intervals that lie completely above 4 are marked with "dis" for disagree.

question
    group  mean    st.dev  90% range     mark
1. I had the background to understand the material that was presented.
    for    2.944   2.071   2.094-3.793   agr
    aga    3.333   1.94    2.537-4.128
2. The system was easy to use.
    for    3.111   1.745   2.395-3.826   agr
    aga    3.666   1.681   2.976-4.355
4. I am interested in learning more about AI Planning now.
    for    3.111   1.45    2.516-3.705   agr
    aga    3.944   1.797   3.207-4.680
6. I am satisfied with my understanding of this material for now.
    for    4.166   1.504   3.549-4.782
    aga    5.055   1.474   4.450-5.659   dis
7. I found the ordering of the material more intuitive than what most textbooks present.
    for    3.555   1.688   2.862-4.247
    aga    4.888   1.745   4.172-5.603   dis
8. I could have learned the material better by directly reading a textbook.
    for    3.944   1.392   3.373-4.514
    aga    3.5     1.339   2.950-4.049
9. I could have learned the material better by attending a 100-student lecture.
    for    3.611   1.819   2.864-4.357
    aga    3.444   1.756   2.723-4.164
10. I could have learned the material better by attending a 10-student recitation.
    for    2.944   1.513   2.323-3.564   agr
    aga    2.611   1.377   2.046-3.175   agr
11. The number of examples presented was appropriate.
    for    2.833   1.617   2.169-3.496   agr
    aga    3.444   1.822   2.696-4.191
12. The length of the readings presented were appropriate.
    for    3.529   1.699   2.812-4.245
    aga    3.611   1.914   2.826-4.395
13. There was an appropriate amount of theoretical background covered.
    for    3.222   1.628   2.554-3.889   agr
    aga    3.555   1.822   2.807-4.302
14. An appropriate amount of practical applications was covered.
    for    2.666   1.571   2.021-3.310   agr
    aga    3.888   1.745   3.172-4.603
15. There were too many examples presented.
    for    5       1.748   4.283-5.716   dis
    aga    4.777   1.352   4.222-5.331   dis
16. The reading passages felt too lengthy and detailed.
    for    3.666   1.533   3.037-4.294
    aga    3.222   1.628   2.554-3.889   agr
17. The reading was too theoretical.
    for    4.777   1.477   4.171-5.382   dis
    aga    3.444   1.916   2.658-4.229
18. The reading focused too much on practical applications.
    for    5.444   1.293   4.913-5.974   dis
    aga    5.5     1.15    5.028-5.971   dis
19. I became more tired as I progressed.
    for    3.222   1.664   2.539-3.904   agr
    aga    2.294   1.263   1.760-2.827   agr
20. I felt more fatigue when using this system than when reading a similar amount of textbook material.
    for    4.611   1.195   4.120-5.101   dis
    aga    3.277   1.637   2.605-3.948   agr
21. This was less coherent than what I usually read in textbooks.
    for    4.333   1.782   3.602-5.063
    aga    3.055   1.513   2.434-3.675   agr
22. It felt like necessary knowledge was skipped over.
    for    4.111   1.604   3.453-4.768
    aga    2.666   1.371   2.103-3.228   agr
23. The lack of good *glue* in this system made it harder to read. *glue* is defined as text references to what was just taught or what is coming next.
    for    4       1.714   3.297-4.702
    aga    3.055   1.349   2.501-3.608   agr
24. The average textbook is less effective than what I just read.
    for    3.388   1.719   2.683-4.092
    aga    4.333   1.495   3.719-4.946
25. The material was presented in a way similar to (my best understanding of) my internal representation of knowledge.
    for    3.882   1.615   3.200-4.563
    aga    4.944   1.433   4.356-5.531   dis

Figure 30. Single Question Statistics Part 1

We can see that the best fit group agreed with statements 1, 2, 4, 10, 11, 13, 14, 19 and disagreed with statements 15, 17, 18, 20. The worst fit group agreed with statements 10, 16, 19, 20, 21, 22, 23 and disagreed with statements 6, 7, 15, 18, 25.

In addition to the first 25 questions, we asked 7 questions concerning the students' "learning experience" and 9 questions concerning the students' impressions of their curricula. The results from these questions are shown in Figure 31.

How well do the following words and phrases describe the learning experience you had? (1=Poorly, 7=Extremely Well)

term                      group    mean    std.dev  90% range     mark
1. Meaningful             for      4.833   1.543    4.200-5.465   well
                          against  3.333   1.495    2.719-3.946   poorly
2. Stimulating            for      4.388   1.786    3.655-5.120
                          against  3.333   1.371    2.770-3.895   poorly
3. Sense of Discovery     for      4.277   1.564    3.635-4.918
                          against  3.388   1.539    2.756-4.019
4. Rewarding              for      4       1.414    3.420-4.579
                          against  2.833   1.15     2.361-3.304   poorly
5. Leads to New Questions for      4.277   1.637    3.605-4.948
                          against  4.111   1.745    3.395-4.826
6. Challenging            for      4.055   1.661    3.373-4.736
                          against  4.611   1.819    3.864-5.357
7. Moments of Wonder      for      2.944   1.661    2.262-3.625   poorly
                          against  2.666   1.414    2.086-3.245   poorly

How well do the following words and phrases describe the custom curriculum you just read? (1=Poorly, 7=Extremely Well)

term                      group    mean    std.dev  90% range     mark
1. Coherent               for      4.333   1.814    3.589-5.076
                          against  3.388   1.576    2.741-4.034
2. Relevant               for      5.222   1.555    4.584-5.859   well
                          against  4.333   1.495    3.719-4.946
3. Engaging               for      4.555   1.503    3.938-5.171
                          against  3.055   1.513    2.434-3.675   poorly
4. Tedious                for      4.722   1.487    4.112-5.331   well
                          against  4.388   1.819    3.641-5.134
5. Useful                 for      5       1.084    4.555-5.444   well
                          against  4       1.371    3.437-4.562
6. Effective              for      4.5     1.382    3.933-5.066
                          against  3.444   1.464    2.843-4.044
7. Interesting            for      4.888   1.529    4.260-5.515   well
                          against  3.388   1.576    2.741-4.034
8. Redundant              for      3.777   1.733    3.066-4.487
                          against  3.055   1.433    2.467-3.642   poorly
9. Easy to Understand     for      4.833   1.424    4.248-5.417   well
                          against  2.823   1.38     2.240-3.405   poorly

Figure 31. Single Question Statistics Part 2

The best-fit curriculum learning experiences were rated as meaningful, but lacking in moments of wonder. The worst-fit curriculum learning experiences were rated as not meaningful, not stimulating, not rewarding, and lacking in moments of wonder. The best-fit curricula were rated as relevant, tedious, useful, interesting, and easy to understand. The worst-fit curricula were rated as not engaging, not redundant, and not easy to understand.

To compare the best-fit data and the worst-fit data, we ran paired t-tests. Figure 32 shows the paired t-test data for questions 1 to 25. The "mean diff" is the mean difference between the best-fit and worst-fit values, and "prob" represents the confidence levels that we found for whether the difference is statistically significant.

question
    mean diff  t-stat  prob        sign
1. I had the background to understand the material that was presented.
    -0.38      0.977   < 80%
2. The system was easy to use.
    -0.55      1.13    < 80%
4. I am interested in learning more about AI Planning now.
    -0.83      2.472   95% - 98%   sig for
6. I am satisfied with my understanding of this material now.
    -0.88      3.032   > 99%       sig for
7. I found the ordering of the material more intuitive than what most textbooks present.
    -1.33      3.509   > 99%       sig for
8. I could have learned the material better by directly reading a textbook.
    0.444      1.323   < 80%
9. I could have learned the material better by attending a 100-student lecture.
    0.166      0.612   < 80%
10. I could have learned the material better by attending a 10-student recitation.
    0.333      0.97    < 80%
11. The number of examples presented was appropriate.
    -0.61      1.111   < 80%
12. The length of the readings presented were appropriate.
    -0.23      0.456   < 80%
13. There was an appropriate amount of theoretical background covered.
    -0.33      0.555   < 80%
14. An appropriate amount of practical applications was covered.
    -1.22      2.211   95% - 98%   sig for
15. There were too many examples presented.
    0.222      0.421   < 80%
16. The reading passages felt too lengthy and detailed.
    0.444      0.912   < 80%
17. The reading was too theoretical.
    1.333      3.011   > 99%       sig aga
18. The reading focused too much on practical applications.
    -0.05      0.14    < 80%
19. I became more tired as I progressed.
    0.823      1.91    90% - 95%   sig aga
20. I felt more fatigue when using this system than when reading a similar amount of textbook material.
    1.333      3.366   > 99%       sig aga
21. This was less coherent than what I usually read in textbooks.
    1.277      2.64    98% - 99%   sig aga
22. It felt like necessary knowledge was skipped over.
    1.444      4.578   > 99%       sig aga
23. The lack of good *glue* in this system made it harder to read. *glue* is defined as text references to what was just taught or what is coming next.
    0.944      3.307   > 99%       sig aga
24. The average textbook is less effective than what I just read.
    -0.94      2.401   95% - 98%   sig for
25. The material was presented in a way similar to (my best understanding of) my internal representation of knowledge.
    -1         3.687   > 99%       sig for

Figure 32. Paired Statistics Part 1

The analysis shows that the subjects agreed with the following statements more after the best fit curriculum: "I am interested in learning more about AI Planning now," "I am satisfied with my understanding of this material now," "I found the ordering of the material more intuitive than what most textbooks present," "An appropriate amount of practical applications was covered," "The average textbook is less effective than what I just read," and "The material was presented in a way similar to (my best understanding of) my internal representation of knowledge." Meanwhile, subjects agreed with the following statements more after the worst fit curriculum: "The reading was too theoretical," "I became more tired as I progressed," "I felt more fatigue when using this system than when reading a similar amount of textbook material," "This was less coherent than what I usually read in textbooks," "It felt like necessary knowledge was skipped over," and "The lack of good glue in this system made it harder to read."

How well do the following words and phrases describe the learning experience you had?

term                      mean diff  t-stat  prob        sign
1. Meaningful             1.5        3.218   > 99%       sig for
2. Stimulating            1.055      2.816   98% - 99%   sig for
3. Sense of Discovery     0.888      2.673   98% - 99%   sig for
4. Rewarding              1.166      3.376   > 99%       sig for
5. Leads to New Questions 0.166      0.509   < 80%
6. Challenging            -0.55      1.1     < 80%
7. Moments of Wonder      0.277      0.79    < 80%

How well do the following words and phrases describe the custom curriculum you just read?

term                      mean diff  t-stat  prob        sign
1. Coherent               0.944      2.188   95% - 98%   sig for
2. Relevant               0.888      2.464   95% - 98%   sig for
3. Engaging               1.5        5.303   > 99%       sig for
4. Tedious                0.333      0.594   < 80%
5. Useful                 1          3.194   > 99%       sig for
6. Effective              1.055      3.036   > 99%       sig for
7. Interesting            1.5        4.231   > 99%       sig for
8. Redundant              0.722      1.758   90% - 95%   sig for
9. Easy to Understand     1.882      5.344   > 99%       sig for

Figure 33. Paired Statistics Part 2

Figure 33 shows the paired t-test data for the questions concerning the learning experience and custom curriculum. Subjects felt that the following terms better described the best fit curriculum learning experience (than worst fit): "Meaningful," "Stimulating," "Sense of Discovery," and "Rewarding." Subjects felt the following terms better described the best fit curriculum (than the worst fit curriculum): "Coherent," "Relevant," "Engaging," "Useful," "Effective," "Interesting," "Redundant," and "Easy to Understand."

Another set of results we have are the general user comments regarding the system. A few of these comments focused on the images:

"difficult to follow the diagrams with the text because they were far away from each other and required too frequent scrolling back and forth."

"Having all the figures mentioned in a reading section placed after the section made understanding the reading difficult."

"Taking book material and trying to reorder it can cause problems with linearity being destroyed. Likewise, having image/tables near the point they're referenced, either in-line or linked, would be really useful, since I never know whether to look around for the referenced image, since I don't know if it'll be included or not."

One person commented about how his first curriculum was more stimulating than his second curriculum. It turns out that he was in the FOR group, so his first curriculum was the one that was customized for his learning style.

"The first curriculum was more stimulating and allowed me to reflect more. It was definitely more stimulating than the 2nd system because of the manner in which material was presented. However, I learned from both."

Some people commented on the interface:

"The interface is great."

Some people commented on the computer medium.
Several people said that if they were reading textbook material, they would prefer printed copies because they are used to highlighting and making notes on the side of the text.

"I find reading for a long time on the computer makes me tired in general."

"I think I really prefer actual textbooks if the material is going to be presented in a textbook fashion. (That way I can make notes and highlight.)"

"Although you have a clever idea, the technical limitations of today's Web make delivery difficult. I was tempted to (but did not) print out the passages to read them better. Perhaps in the future, a paper like interface, better scans, and the ability to view the diagrams at the same time as the text, would allow your system to be more practical and more deliverable."

"In general, it's a lot nicer to have paper copies because you can take notes and mark up the pages. Perhaps a way to annotate alongside the text would be helpful."

"i wish the information were presented in a less textbook-like and more interactive fashion. with a regular textbook, it's actually easier to skim thru the pages to pick out important information. but the problem with textbooks is, they are tedious and boring. and reading the same sections of textual information on a computer screen doesn't really make it better. i think this kind of online curriculum would have a significant advantage over textbooks if it incorporated examples that students can actually work thru, better colors, animations, etc. Basically, make the learning experience less passive."

One person commented that out of his two curricula, he preferred the one that seemed like it came from multiple sources. This helps reinforce one of the ideas behind this thesis: namely, that combining material from multiple sources could be an advantage.

"I felt the first set of passages was more helpful because it combined readings from multiple sources"

Lastly, we have the results from the knowledge assessments that users took.
Each user was given 8 questions: 4 questions on the content of the "best fit" curriculum and 4 questions on the content of the "worst fit" curriculum. The average "best fit" score was 2.82/4.00 (48 points from 17 subjects). The average "worst fit" score was 2.53/4.00 (43 points from 17 subjects). The "best fit" score coming out higher than the "worst fit" score is what we would expect. However, we feel that there is a lot of noise involved in this kind of assessment (e.g., varying student background and varying student motivations), so we choose not to draw conclusions from this data.

6 Discussion

6.1 Data Analysis and Fate of Hypotheses

[Bar chart titled: How well does "Interesting" describe your curriculum? Horizontal axis: Response (1=Poorly, 7=Extremely Well). Series: Worst Fit, Best Fit.]

Figure 34. Bar Graph of Results for "Material Was Interesting"

Our main hypothesis was that the course sequencing system we implemented could provide some advantages for students looking to learn Planning. We had decided that if the best-fit learning-style data showed improvement over the worst-fit learning-style data, then the claim would be shown. Our results (refer to the "results" section) do suggest that the best-fit learning data shows improvements in many ease-of-use areas. We have statistically significant data to show that subjects agreed with statements like "I am satisfied with my understanding of the material" and "The average textbook is less effective than what I just read" more after the best fit curriculum, and with "I became more tired as I progressed" and "This was less coherent than what I usually read in textbooks" more after the worst fit curriculum. Also, subjects rated the best fit learning experience as more "meaningful," "stimulating," and "rewarding" and the best fit curriculum as more "engaging," "interesting" (see Figure 34), and "easy to understand" (see Figure 35).

[Bar chart titled: How well does "Easy to Understand" describe your curriculum? Horizontal axis: Response (1=Poorly, 7=Extremely Well). Series: Worst Fit, Best Fit.]

Figure 35. Bar Graph of Results for "Material Was Easy to Understand"

Figure 36 lists the various other hypotheses that we were testing, and what our results are able to say about them. Each entry gives the hypothesis we were testing, followed by what we can say about the issue after looking at the results.

Hypothesis: Partition/Search is less coherent than straight textbook.
Result: The data does not show this result. "This was less coherent" has a 90% confidence range of 3.602 to 5.063 on a 1-7 scale.

Hypothesis: Customizing for Learning Styles actually makes a difference in effectiveness compared to using a textbook (both quiz results & meaningfulness).
Result: With best fit, "I found the ordering of the material more intuitive than what most textbooks present" and "The average textbook is less effective than what I read" both have mean values below 4 on a 1-7 scale (3.55 and 3.38 respectively), but the tops of the 90% intervals for both of them are just over 4. The data shows we can be around 80-90% sure the mean value for both of these statements is below 4.

Hypothesis: Partition/Search is coherent enough that students feel that no necessary knowledge is missing.
Result: Worst fit made students feel like knowledge was skipped over more than best fit did (>99% confidence). Also, the 90% confidence interval for "It felt like necessary knowledge was skipped over" for worst fit is below 4, which shows subjects agreed with the statement.

Hypothesis: Partition/Search is not redundant, even though it draws from redundant sources.
Result: Subjects rated worst fit as not redundant (<4 on the 1-7 scale). Also, they rated best fit as more redundant than worst fit (90-95% confidence).

Hypothesis: The lack of "glue" in this kind of system makes it harder to read.
Result: The lack of "glue" made worst fit harder to read than best fit (>99% confidence).

Hypothesis: Partition/Search could still handle questions that spanned concepts, despite its modularity.
Result: We decided there was too much inherent noise to draw conclusions from quiz results.

Hypothesis: Some "effectiveness" areas in particular are improved (over the textbook and "worst fit" groups) by customizing with Partition/Search. These areas tell us where to focus the development effort.
Result: Partition/Search on "best fit" leads to a learning experience that is more meaningful, stimulating, rewarding, and has a higher sense of discovery than "worst fit." The "best fit" curriculum is more coherent, relevant, engaging, useful, effective, interesting, redundant, and easy to understand than the "worst fit" one. Also, students felt the ordering of "best fit" was more intuitive than "worst fit."

Hypothesis: People feel Partition/Search gave them a more appropriate number of examples/theory/applications. (Learning Styles Fit)
Result: Students agreed (<4 on a 1-7 scale) that the number of examples, amount of theoretical background, and amount of practical applications were appropriate. Students said that "best fit" covered a more appropriate amount of practical applications than "worst fit." Students agreed with "The reading was too theoretical" more for "worst fit" than "best fit" (>99% confidence).

Hypothesis: Partition/Search leads to more fatigue. The customizing does not cancel out the disjointedness yet.
Result: Students agreed that they became more tired as they progressed for both best fit and worst fit (the 90% ranges of both are <4 on a 1-7 scale). We also have statistically significant data showing they rated the amount of fatigue from the best fit curriculum as less than that from reading a textbook, but the amount of fatigue from the worst fit curriculum as more than that from reading a textbook. The worst fit curriculum made students more tired as they progressed more often than the best fit curriculum (90-95% confidence).

Hypothesis: Partition/Search presents material closer to students' internal representation of the material.
Result: For best fit, the 90% confidence interval on "The material was presented in a way similar to my internal representation of knowledge" was 3.2 to 4.5 on our 1=Agree, 7=Disagree scale. The best fit group agreed more with the statement than the worst fit group (>99% confidence).

Figure 36. What Our Results Say About Our Hypotheses

6.1.1 Learning Styles versus Learning Preferences

We also decided to look at whether the learning styles measured by the questionnaire matched the learning preferences that students selected directly. This project implemented both a learning-preferences selection and a learning-styles quiz, so we measured the total squared distance between the two results for each test subject. The average total squared distance you would get from choosing two random learning styles is 16/3 (= 5.33...). The average total squared distance from our test subjects was 5.6 with a standard deviation of 2.521 and a 90% confidence interval of 4.619 to 6.580. Because this interval contains the random baseline of 5.33, there was not a strong direct relation between their learning styles result and their learning preferences result. It would be interesting to consider whether there is an algorithm to map learning styles into learning preferences.

6.2 Comparison to Other Researchers' Findings

Like Niewiadomska [1], we were able to show that customizing for learning style can help students feel more comfortable with the material they are learning. Our results help confirm that Niewiadomska's results hold for students taking materials from Intelligent Tutoring Systems.

6.3 Lessons about the Nature of the Question

The original question was "How can we find the best path for a student?" Our exploration of different algorithms certainly revealed a lot of information about the nature of this question. For instance, we learned that the question is really about balancing the algorithm's number of constraints and number of possible paths.
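The tradeoff between constraints and paths can be made concrete with a toy count. The sketch below is not one of the thesis algorithms: it simply enumerates every ordering of a small, hypothetical set of atoms and counts how many respect a given list of prerequisite pairs. The atom names and the function name are illustrative assumptions, and real paths also involve choosing which atoms to include, which this toy ignores.

```python
from itertools import permutations


def count_valid_orderings(atoms, prerequisites):
    """Count orderings in which every prerequisite precedes its dependent.

    `prerequisites` is a list of (before, after) pairs. Brute force, so this
    is only suitable for a handful of atoms.
    """
    count = 0
    for order in permutations(atoms):
        position = {atom: index for index, atom in enumerate(order)}
        if all(position[before] < position[after]
               for before, after in prerequisites):
            count += 1
    return count


atoms = ["a", "b", "c", "d"]  # hypothetical atom identifiers
print(count_valid_orderings(atoms, []))                                        # 24
print(count_valid_orderings(atoms, [("a", "b")]))                              # 12
print(count_valid_orderings(atoms, [("a", "b"), ("b", "c")]))                  # 4
print(count_valid_orderings(atoms, [("a", "b"), ("b", "c"), ("c", "d")]))      # 1
```

Each added prerequisite shrinks the set of admissible paths, down to a single path once the atoms form a chain. An algorithm with too many constraints may exclude the best path; one with too few leaves more paths than can be searched.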
6.4 Lessons about the Answer to the Question

Our experiment shows that customizing for student learning styles does make a statistically significant difference in the effectiveness of the material, so customizing for learning styles is indeed a goal worth pursuing. We implemented Beam Search and Partition/Search, and found that the answer to our question is likely to be between the two. From our exploration of the various algorithms, we decided that Collaborative Filtering and Beam-Partition Hybrid were promising areas to explore next.

6.5 Assumptions Made

Partition/Search makes the assumption that the best path will lie along the expert's prerequisite graph. We can be fairly certain that this assumption does not hold, because Partition/Search eliminates so many promising paths. Because the assumption does not hold, we note that the right answer to the problem is likely to have fewer constraints than Partition/Search.

6.6 Notes for Future Researchers on this Topic

The general user comments we received were useful in helping us learn how to better approach similar problems. We learned things like:

* It is important to come up with a good way for handling images. Perhaps the system should have let users click on image references in the text, and popped up the corresponding images in response.
* Some way to annotate the curriculum would be useful. Many students are used to annotating their textbooks and paper handouts.
* Customizing for learning styles can make a difference in the effectiveness of curricula, and is worth building into ITS that customize for student characteristics.
7 Conclusion

7.1 Statement of Work Done

We defined a problem, conducted background research, came up with a specific formulation for an aspect of the larger problem, explored different algorithms for solving our specific formulation, implemented two of the algorithms to test the issues involved, conducted a test with students to gauge the effectiveness of one of the algorithms, and analyzed the data.

7.2 Contributions

Some of this project's main contributions are:

1. We introduced and formalized the Atomic Path Optimization view of the Intelligent Tutoring Systems problem.
2. We showed how to take 5 chapters of real-world textbook material, chop them up, and represent them as 150 scanned atoms of material. Then, we found that these pieces could be put together again in different ways and still make some sense.
3. We found that Beam Search does not work but Partition/Search works, and with our method, the Partition/Search prerequisites graph only takes O(n x log(n)) work to set up.
4. We found that "best fit" curricula for learning styles provide advantages over "worst fit" curricula for learning styles. In particular (with >99% confidence), "best fit" curricula are more "Engaging," "Useful," "Effective," "Interesting," and "Easy to Understand." The learning experience is more "Meaningful" and "Rewarding." Test subjects felt that "best fit" presented the material in a way more similar to their "internal representation of knowledge." (See results section.)
5. We found that most people still feel that they learn best from interacting with human beings. The more personalized the attention (i.e., 10-student recitation vs. 100-student lecture), the better they feel they learn.
6. We came up with several promising algorithms for the Atomic Path Optimization problem. In particular, we discussed Partition/Search, Partition-Beam Hybrid, and the Collaborative Filtering Approach.
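The confidence statements in contribution 4 rest on the 90% intervals and paired t-tests of Section 5.4. The sketch below shows one way such statistics can be computed. It assumes a Student's t distribution with 17 degrees of freedom (18 subjects), which appears consistent with the reported intervals but is our inference rather than something the thesis states. The critical values are standard t-table entries, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided Student's t critical values for 17 degrees of freedom (n = 18),
# from a standard t table. Using df = 17 is an assumption inferred from the
# reported intervals, not a detail given in the text.
T_CRIT_DF17 = {0.80: 1.333, 0.90: 1.740, 0.95: 2.110, 0.98: 2.567, 0.99: 2.898}


def ci90_from_summary(sample_mean, sample_sd, n=18):
    """90% confidence interval for the true mean from summary statistics."""
    if n != 18:
        raise ValueError("the tabulated critical values only cover n = 18")
    half_width = T_CRIT_DF17[0.90] * sample_sd / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def paired_t(best, worst):
    """Mean paired difference and t statistic for two matched samples."""
    diffs = [b - w for b, w in zip(best, worst)]
    return mean(diffs), mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def confidence_band(t_stat):
    """Highest tabulated two-sided level that |t| exceeds, or None (df = 17)."""
    passed = [level for level, crit in T_CRIT_DF17.items() if abs(t_stat) > crit]
    return max(passed) if passed else None


# Question 1, best fit: mean 2.944, st.dev 2.071 (values from Figure 30).
low, high = ci90_from_summary(2.944, 2.071)
print(round(low, 2), round(high, 2))   # close to the reported 2.094-3.793

# t statistics reported in Figure 32 mapped to their confidence bands.
print(confidence_band(2.472))  # 0.95, i.e. the "95% - 98%" band
print(confidence_band(0.977))  # None, i.e. "< 80%"
```

Under this reading, an interval lying wholly below 4 earns an "agr" mark, and a paired t statistic above 2.898 in absolute value corresponds to the ">99%" band. The per-subject responses needed to recompute the t statistics themselves are in Appendix B and are not reproduced here.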
7.3 Future Work

Having found those results, there are two promising directions we can explore next. The first direction is to improve our system, with the final goal of finding a good solution to the Atomic Path Optimization problem and creating a practical application that can go side-by-side with lectures or be used in distance education. The second direction is to take the general principles learned here and apply them to tangential areas.

7.3.1 Improving the System

Results 1 ("The problem can be formulated as Atomic Path Optimizing"), 2 ("We are able to atomize real-world material, and the pieces could be put together in different ways and still make sense"), and 4 ("Customizing for learning styles does make a difference") show that the problem is worth pursuing. Results 3 ("Partition/Search can find good results in reasonable expert time") and 6 ("Here are several promising algorithms for Atomic Path Optimizing...") help suggest that the problem is solvable. So, one future work direction is to refine, improve, and build on our system with the final goal of finding a good solution and creating a practical application with it.

There are several main ways to improve on what we did:

1. Use original textbook sources (i.e., LaTeX) instead of scans. This would allow us to embed images naturally in the text, and avoid the image placement and font issues.
2. Go through the material with more than one expert to improve the quality of the classifications.
3. Customize for other student characteristics too (e.g., "abstraction capabilities," "amount of time dedicated"), not just learning style.
4. Explore more algorithms: Partition-Beam Hybrid, Collaborative Filtering, and possibly algorithms we have not yet thought of.
5. Try the system out in a real-life setting (side-by-side with lectures, or as distance education) instead of the testing environment we had, and see what students think.
Because we scanned the material, our custom curricula often had text that varied slightly in font, font size, and formatting. Also, the edges of atoms would sometimes "curve off" in the way that photocopies of books curve off at the edges of the pages (due to books' raised spines). We could not embed images in the text because the text atoms were computer images themselves. All of these problems could be solved if we had the original computer documents (i.e., LaTeX sources) for the textbooks. Without these small visual problems, students might be less distracted and might be able to learn better from the curricula. Note that OCR scanning of the atoms would not work as well because OCR would have trouble with the page "curve offs" and with mathematical equations.

Another issue that could be improved upon is the quality/quantity of experts. For our project, a single person classified the learning styles of all the atoms and created the postatoms table and the prerequisite graphs. The quality of the classifications could be improved if they were done by a team of domain experts working with psychologists, and the quality of the postatoms table and prerequisites graph could be improved if they were done by a small team of professors with years of experience teaching the subject. Improved classifications could lead to even better custom curricula than what we developed.

Our project only customized for learning styles. There are many other categories for which we could have customized, such as: math/science background, major/pedagogical information, abstraction capabilities, interests and learning goals, time allocated, and motivation/affective state. The same procedures used in this project could be used to test the effectiveness of customizing for those other areas. Eventually we would want a system that could simultaneously customize for as many effective areas as possible (by incorporating multiple areas into the student models and distance formulas).
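One possible shape for a distance formula that spans several areas is sketched below. It is purely illustrative: the area names, the numeric encodings, the weights, and the choice of a weighted sum of per-area total squared distances are all assumptions made for this example, not the student model or distance formula used in the project.

```python
def total_squared_distance(a, b):
    """Sum of squared component differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def combined_distance(student, atom, weights):
    """Weighted sum of per-area squared distances.

    `student` and `atom` map an area name to a numeric vector; `weights`
    maps each area to its relative importance (all hypothetical).
    """
    return sum(
        weight * total_squared_distance(student[area], atom[area])
        for area, weight in weights.items()
    )


# Hypothetical encodings: four learning-style dimensions plus one
# background-level dimension.
student = {"learning_style": [1, -1, 0, 1], "background": [2]}
atom = {"learning_style": [1, 0, 0, -1], "background": [3]}
weights = {"learning_style": 1.0, "background": 0.5}

print(combined_distance(student, atom, weights))  # 5.5
```

A weighted sum keeps each area's contribution separable, so the effect of customizing for a new area could be tested by varying only its weight. Whether the areas should be combined additively at all is an open design question.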
We described many algorithms, and tested Beam Search and Partition/Search. Our results suggest that Partition/Search has more constraints and fewer possible paths than the optimal algorithm should have. So, it would be useful to explore the Beam-Partition Hybrid and Collaborative Filtering algorithms next. Testing more algorithms is essential for finding out what works and what does not work, and for eventually finding the best solutions for the Atomic Path Optimization problem.

Lastly, our project was run on student volunteers in a testing environment. To see whether a system like this would work in the real world (i.e., to complement lectures or as distance education), it makes sense to test it out in the real world. Real-world testing might reveal issues that our test on volunteers did not. Also, our project was run on a relatively small scale with only a few topics covered. A larger experiment should be run with a full semester's worth of material to see whether the same results are reached.

Research in this direction would explore the Atomic Path Optimization problem in greater depth, and could even lead to an effective practical computer application for helping us teach and helping students learn.

7.3.2 Applying the Principles to Tangential Areas

The second future-work direction is to take the principles learned in this project and apply them to tangential research areas. This direction is a little more open-ended and requires more reflection on what was learned and how it can be applied, so this section will be more vague on the specifics than the previous section. Two tangential areas we have thought of are "knowledge distillation" and knowledge representation, but there are definitely more areas that we have not thought of yet.

One way to view what we did is that we distilled an excess of knowledge down to exactly what a particular student needs.
Students today learn the same material across different classes, and there are more textbooks out there for most subjects than anybody has the time to read. We took the huge excess of knowledge (including basic topics, advanced topics, and a lot of redundant information expressed in different ways) and distilled it to a single curriculum customized for a single student. This is similar to what today's internet search engines do: they take an excess of information (4 billion web pages) and reduce it to what a single person with limited time can use (a list of results selected from the 4 billion pages). This is also similar to what some military AI systems do: they draw large amounts of tactical information from many sources (radar, spy planes, intelligence reports, etc.) and reduce it to a smaller amount of (relevant) information that a single human can use to make decisions.

In our Partition/Search solution, we performed knowledge distillation by atomizing the knowledge, using an expert to create a prerequisites graph, then searching over the prerequisites graph to find the paths that were good matches for student characteristics. In our WalkSAT-style solution, a path is generated randomly and then refined to score well on an evaluation metric. The future work direction here is to apply the Atomic Path Optimization algorithms' principles toward areas where knowledge distillation is needed. The principles here are especially applicable to areas where the final form of the knowledge must have certain enforced orderings or constraints (our algorithms have enforced orderings to handle material prerequisites).

Knowledge representation is another "tangential area" to which we might be able to apply this research.
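
As a concrete reminder of the kind of algorithm that might carry over to other areas, the WalkSAT-style approach mentioned above (generate a random path, then refine it against an evaluation metric) can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this thesis: the evaluation metric (atom-to-student distance plus a penalty for switching sources), the atom IDs, the distance values, and the function names are all hypothetical, and the topics are assumed to be listed in an order that already respects prerequisites.

```python
import random


def path_cost(path, distance, switch_penalty=1.0):
    """Hypothetical metric: total atom-to-student distance, plus a penalty
    each time consecutive atoms come from different sources (the first
    letter of the atom ID is treated as the source)."""
    mismatch = sum(distance[atom] for atom in path)
    switches = sum(1 for a, b in zip(path, path[1:]) if a[0] != b[0])
    return mismatch + switch_penalty * switches


def walksat_style_path(candidates, distance, steps=200, noise=0.2, seed=0):
    """Pick one atom per topic by randomized local search.

    candidates: one list of interchangeable atom IDs per topic, in an order
    that is assumed to already satisfy the prerequisites.
    """
    rng = random.Random(seed)
    path = [rng.choice(options) for options in candidates]  # random start
    best = list(path)
    for _ in range(steps):
        i = rng.randrange(len(path))
        if rng.random() < noise:
            # Noise move: random replacement, to escape local minima.
            path[i] = rng.choice(candidates[i])
        else:
            # Greedy move: best atom for this topic given the rest of the path.
            path[i] = min(
                candidates[i],
                key=lambda atom: path_cost(path[:i] + [atom] + path[i + 1:], distance),
            )
        if path_cost(path, distance) < path_cost(best, distance):
            best = list(path)
    return best


# Illustrative atoms and distances only; these are not data from the experiment.
candidates = [["R1", "K1"], ["R2", "K2", "W2"], ["K3", "W3"]]
distance = {"R1": 1, "K1": 3, "R2": 4, "K2": 2, "W2": 2, "K3": 1, "W3": 3}
best_path = walksat_style_path(candidates, distance)
print(best_path, path_cost(best_path, distance))
```

The same loop would apply to any domain where knowledge can be broken into interchangeable pieces and a scoring function is available; only the metric and the constraint handling would need to change.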
We might be able to take knowledge-based systems that have more to do with "using the knowledge" than "teaching/learning," break their stored knowledge into atoms, and see what we are able to get by applying algorithms similar to our Atomic Path Optimization algorithms to these atoms. Or, we could research whether and how general AI systems could take advantage of textbook knowledge expressed in atoms/prerequisites graph/postatoms table form.

7.4 Conclusion

In conclusion, when we started this research, we wanted to explore the important idea of whether computers could help us to teach more effectively. This problem is challenging both because effective teaching is a difficult problem in itself, and because students are used to learning from human teachers, not computers. Good teachers do many things: they encourage students, teach material at the right knowledge level, make sure students understand before they move on, present ideas clearly, make the material interesting, teach material in ways to best suit individual students, and inspire students to learn. How could computers emulate some of these behaviors?

We decided to research how computers (as Intelligent Tutoring Systems) could teach material in ways to best suit individual students. We looked at how we could teach better with computers by finding customized paths of atoms. There was little information on this topic. The problem of finding good paths of atoms had not yet been well defined. We were not sure what areas of customization would make a difference in ITS. So, we were not sure how well things would work. We started thinking about the many issues involved: how could we define this problem more precisely, what kind of algorithms might work for this problem, what areas could we customize for to make a difference, what atom size would be good, what domain might this work in, and how would these systems interact with live lectures?
In this thesis, we took a shot at a particular aspect of the problem. We defined a specific atom size to use, formulated the Atomic Path Optimization problem, thought about several algorithms for our formulation, and conducted experiments on our Beam Search and Partition/Search algorithms in the domain of "planning" while customizing for learning styles. We found interesting results: Partition/Search (which only takes a reasonable amount of expert work to set up) is able to arrange the atoms into customized paths that make sense and that students can learn from, and customizing for learning styles did lead to more effective teaching.

In addition to shedding light on how computers could teach materials in ways to best suit individual students, our research also gives us some insight into whether and how computers could address other aspects of effective human teaching. We found that students were more interested in learning more about planning after being given a best-fit curriculum than after being given a worst-fit curriculum. This shows that ITS can help encourage students. Our results help show that ITS customization is feasible, so customizing for other areas (such as knowledge level) could work. Students rated the best-fit curricula as easier to understand and more interesting, and this shows one way that ITS can present ideas clearly and make the material interesting.

Now that we have done this work, we can revisit the original ideas. We now know more about the nature of the customized-path-of-atoms idea, and by extension, more about the nature of the overall idea of whether computers can help us teach more effectively. Our guess is that our current main limitations for the customized-path-of-atoms idea (and thus possibly aspects of the overall idea) are expert time and understanding of learning, and not computing power. There is a need for clever algorithms and techniques that can reach good results with the limited expert time.
A better understanding of how people learn (what areas can be customized for, what kind of prerequisites must be enforced, and what tricks are essential for particular domains) would also be useful.

We showed that the ideas can be implemented and are not just abstract. Also, it appears the ideas are not crazy after all. Based on our results, we believe it is in fact possible to teach better with customized paths of atoms. Our results were significant enough that we feel optimistic that customized paths of atoms might work with some other domains/customization areas as well. The systems we developed must be capturing some aspect of effective teaching, and this encourages the notion that computers can in fact help us teach more effectively.

8 Bibliography

[1] K. M. Niewiadomska, "Knowledge Representation, Content Indexing and Effective Teaching of Fluid Mechanics Using Web-Based Content," M.S. thesis, Cambridge: MIT Department of Ocean Engineering, 2002.
[2] S. Niemczyk, "An Adaptive Domain-Independent Agents-Based Tutor for Web-Based Supplemental Learning Environments," Ph.D. thesis, Cambridge: MIT Department of Civil and Environmental Engineering, 2003.
[3] D. A. Kolb, Learning Styles and Disciplinary Differences, from The Modern American College. San Francisco: Jossey-Bass, 1981.
[4] D. Sleeman and J.S. Brown (1982). Intelligent Tutoring Systems. Academic Press, Inc. Orlando FL.
[5] Student Modeling: The Key to Individualized Knowledge-Based Instruction.
[6] P. Brusilovsky and J. Vassileva. Course Sequencing technology for large scale web-based education (2003).
[7] P. Brusilovsky. Adaptive Educational Systems on the World Wide Web: A Review of available technologies.
[8] J. Siekmann, C. Benzmuller, et al., Adaptive Course Generation and Presentation.
[9] P. Brusilovsky, E. Schwarz, G. Weber. ELM-ART: An Intelligent Tutoring System on World Wide Web.
[10] J. Vassileva. Dynamic Course Generation on the WWW (1997).
In Proceedings of AI-ED '97.
[11] Proceedings of AI-ED '97. Notable papers: Architecture of an Intelligent Tutoring System on the WWW.
[12] P. Brusilovsky. Adaptive and Intelligent Technologies for Web-based Education.
[13] P. Brusilovsky. A Framework for Intelligent Knowledge Sequencing and Task Sequencing.
[14] R. M. Felder, L. K. Silverman, "Learning and Teaching Styles In Engineering Education" from Engineering Education, April 1988, pg. 674-681.
[15] P. Brusilovsky. Course Sequencing for Static Courses: Applying ITS Techniques in Large-Scale Web-based Education. (1999/2000)
[16] Kinshuk, A. Patel, D. Russel, Hyper-ITS: A Web-based Architecture for Evolving and Configurable Learning Environment.
[17] Advances in Web-Based Learning, First International Conference (2002).
[18] T. Liao. Advanced Education Technology: Research Issues and Future Potential.
[19] W. J. McKeachie, "Learning Styles Can Become Learning Strategies," from The National Teaching and Learning Forum, Volume 4, Number 6, 1995.
[20] H. Mandl, A. Lesgold, Learning Issues for Intelligent Tutoring Systems (1988).
[21] P. Brusilovsky, N. Henze, E. Millan. Adaptive Systems for Web-based Education. Proceedings of the AH'2002 Workshop on Adaptive Systems for Web-based Education. Fourth such workshop.
[22] M. Specht, G. Weber, S. Heitmeyer, V. Schoch. AST: Adaptive WWW-Courseware for Statistics.
[23] M. Specht, A. Kobsa. Interaction of domain expertise and interface design in adaptive educational hypermedia.
[24] G. Aguilar, K. Kaijiri. Personalization Approach of a Web-based Java Programming Tutorial.
[25] P. Honey and A. Mumford. The Manual of Learning Styles. Berkshire, UK, 3rd Edition, 1992.
[26] Personal correspondence with Professor Dick Yue.
[27] J. Rhem, "Deep/Surface Approaches to Learning: An Introduction," from The National Teaching and Learning Forum, Volume 5, Number 1, 1995.
[28] "Learning Styles and its measurement" from the Cymeon website http://www.cymeon.com.
[29] MIT Committee On the Use of Human Experimental Subjects website http://web.mit.edu/committees/couhes.

A Atoms

A.1 Table of Atoms and Descriptions

Act = Activist, Ref = Reflector, The = Theorist, Pra = Pragmatist
L = Low, M = Medium, H = High

Atom ID   One-sentence summary of atom   Act Ref The Pra

R337A Describes the parts of a simple planning agent R338A DIAGRAM: Algorithm for a simple planning agent L M L M M M H L R338B R339A The problems in using problem solving search to do planning Describes three key ideas behind planning M H M H H M M M R340A R341A DIAGRAM: Modeling planning as forward search Representing states and operators with situational calculus H M M L L H H M R343A Introduction to the STRIPS language L L M M R344A R344B Describes where STRIPS is used in planning DIAGRAM: Example of an operator in a planning graph L L L M H M M L R345A R346A Progression planning vs. regression planning vs. partial planning Ways to choose steps while planning L M H M L M M M R346B R348A Formally defining the plan data structure DIAGRAM: Plan graphs, drawing preconditions L H L M H M M M R348B R349A DIAGRAM: Linearizing partial ordering plans Defining "completeness" and "consistency" for plan solutions H L L L H H M L R349B A long POP example H L M M R350A R351A DIAGRAM: What an "initial plan" looks like DIAGRAM: Instantiating variables in preconditions H H L M M M M M R352A R352B DIAGRAM: A POP achieving preconditions DIAGRAM: A flawed POP H H L M M M M M R353A DIAGRAM: Protecting causal links H M H L R354A DIAGRAM: Protecting causal links in a POP example H L M M R355A DIAGRAM: A POP solution for a sample problem H L M M R355B R356A Describing the algorithm for POP in words DIAGRAM: The algorithm for POP M L M L M H L L R357A How to resolve threats during planning L M M L R358A DIAGRAM: Algorithm for resolving threats L L H L R359A R359B Methodology for solving problems using planning Formalizing the blocks world for planning L M H M M L M M R360A R361A Representing
SHAKEY's world with STRIPS DIAGRAM: A diagram of SHAKEY's world H H M L M L R362A Overview of the planning problem and shortfalls of situational calculus L H M M R363A Bibliographical notes describing the roots and history of A.I. planning L L M 79 H H H R364A K329A Exercises related to basic POP and STRIPS Planning combines problem-solving strategies with knowledge representation H L M M L L M H K329B K329C We must work on small pieces at a time to be practical Describing some characteristics of planning and what to look out for L L M L M L M M K330A K331A K332A K333A K334A More characteristics of the planning problem Fixing problems that arise from planning in the real world A specification of the simple blocks world Five things that must be done by a planning system DIAGRAM: A blocks world example diagram and description L L H M H M M L M M L L M M M M H M L M K334B K334C Overview of choosing rules to apply Motivation for, and description of, STRIPS M L H M H H M M K336A DIAGRAM: STRIPS operators for the blocks world H L H M K337A K338A K338B DIAGRAM: simple search-tree planning Using predicate logic to know when it has a solution Filtering out bad paths while planning (with example) H M M M M M H H M M L M K339A Using least-commitment strategies and patching at the last moment L H L H K339B K340A A detailed goal stack planning example DIAGRAM: A blocks world start and goal for an easy problem H H L M M M M M K345A K345B K347A K348A K349A DIAGRAM: A several-step blocks world problem DIAGRAM: What a goal-stack might look like A discussion of nonlinear planning using the TWEAK example DIAGRAM: Searching in states vs. 
in constraints DIAGRAM: Heuristics for constraint-posting planning (by TWEAK) H H H H M M M M H M M M H H M M M M M L K353A DIAGRAM: Overview of algorithm for non-linear planning (TWEAK) M M H L K354A K354B K355A DIAGRAM: Using modal truth to check if propositions hold The need for hierarchical planning DIAGRAM: A complex operator from a planning system M L L L H H H M M L M L K356A K357A Using "reacting" instead of traditional planning A brief overview of some more advanced planning techniques M M H H M H L M K357B R392A Exercises related to STRIPS and TWEAK Planning and acting systems must take their own advice H L H M L L M H R392B Incomplete and incorrect info can be dealt with using conditional planning and execution monitoring M H M H R393A The flat-tire example with incomplete knowledge H L H H R395A R395B DIAGRAM: Algorithm for a conditional planning agent Tracing through an example of conditional POP M H L M H H L H R396A R396B R397A DIAGRAM: An initial state for the flat-tire problem DIAGRAM: A step while solving the flat-tire problem DIAGRAM: Inflating a tire is conditional on tire being intact M H H L L L M M M M M M DIAGRAM: CPOP prepares for both possibilities of a conditional H L M M R397B condition R397C R398A DIAGRAM: Setting up the conditional part of the plan DIAGRAM: A complete CPOP plan with causal and conditional links H H L M M M M M R398B R399A R401A R402A R402B Parameterized plans and runtime variables DIAGRAM: The CPOP algorithm A general overview of action monitoring How execution monitoring works with simple replanning DIAGRAM: An algorithm for execution monitoring and replanning L L L M L H L H M L M H M M H 80 M L H H M R403A R404A R405A A situated planning agent is constantly replanning DIAGRAM: When replanning is needed in the blocks world DIAGRAM: Part 1 of a 6 part example of blocks world replanning H H H H M L M H H H M M R405B R405C DIAGRAM: Part 2 of 6: Unsupported links DIAGRAM: Part 3 of 6: Dropping redundant steps H H L L H H 
M M R406A DIAGRAM: Part 4 of 6: Re-calculating the start state H L H M R406B R407A DIAGRAM: Part 5 of 6: Assigning new precondition bindings Weaknesses of conditional planning and replanning H L L H H L M H R407B R408A DIAGRAM: Part 6 of 6: A finished plan for a situated planning agent DIAGRAM: An algorithm for a situated planning agent H M L M H H M L R409A R410A DIAGRAM: Using coercion and abstraction in planning A summary of how planning agents can handle the unexpected and M H M H L M M H R411A unknown Historical notes about conditional planning and execution monitoring L L L H R412A Exercises related to planning for acting L M L M M H H W323A Planning can be done with STRIPS or with logic W323B A plan prescribes a sequence of actions W324A Introduces the ideas of operators with prereqs and postreqs L M H L M M L M M W325A DIAGRAM: Blocks world initial and goal states H L M M W326A Breadth first search leads to exponential growth, so we need to be M H M M smarter W327A DIAGRAM: How planning world be done as BFS H M H M M H M M W329A DIAGRAM: Shows how links between operators are formed H M H L W330A DIAGRAM: Example of backwards chaining W331A Monitoring links lets you detect impossible plans W331 B You can make a plan by extending partial plans H H M L M L H M H L M M DIAGRAM: Describes how to protect threatened links DIAGRAM: Example of how planning can take place DIAGRAM: Example of same steps in a plan DIAGRAM: How to protect threatened links Planning uses logic where truth values can change H H H H M L M L M H H H H H L M M M M L W336B When there is uncertainty, commit as little as possible H M M M W338A Example of planning using situational logic W338B Green's trick for tracing situational history W339A DIAGRAM: A very simple blocks world example description H M H M M H H H H L M M W341A DIAGRAM: Visualization of example in W338A H L H L W343A Introducing the frame problem in situational planning W344A DIAGRAM: Visualization of example in W338B M H H L H 
H M L W345A Quick summary of planning with STRIPS and with logic L M M M W346A R367A R367B R367C R369A Background on STRIPS and situational variables Practical planning takes significant modification on planning algorithms We can find weaknesses in POP by trying it on the real world Describes requirements for planning in spacecraft assembly Describes requirements for planning in an assembly factory L L L M M L L L M M L L L L L H H H H H R369B Describes planning for scheduling space missions M M L H R371A R371 B SIPE is a practical planner that does replanning All practical planners use hierarchical decomposition M M M L H H H M W327B W332A W333A W334A W335A W336A Partial paths might involve impossible choices, so do backward chaining 81 R372A Introduces primitive operators, non-primitives, and decomposition M M H L R373A R374A DIAGRAM: Shows hierarchical decomposition Introduces the hierarchical decomposition planner HDPOP H M H M M H M M R374B R375A DIAGRAM: HDPOP algorithm DIAGRAM: Example of a step decomposition L H L H M M L H R375B R376A A more precise description of hierarchical decomposition DIAGRAM: The downward and upward solution properties L H M H M M L M R377A DIAGRAM: Size of search space in hierarchical decomposition H M H M R378A R379A DIAGRAM: How decomposition can solve what POP cannot Points out what two actions might want to share steps H H M M M M M H R379B R380A R381A Critics can be used to share steps and fix plans An approximation hierarchy considers different levels of prereqs Necessary extensions for broadening applicability of planning L M L H L H L M L M H H R381 B Introducing new syntax: "effect when condition" L M M H R382A R382B DIAGRAM: An algorithm that uses "effect when condition" Introducing negated and disjunctive goals L M L H H M H M R383A Planning with universal quantification M M H M R384A R385A R386A POP-DUNC incorporates many extensions on POP DIAGRAM: Algorithms for parts of POP-DUNC Introduces the problem of resources 
constraints in planning L L H L L H L M H M M M R386B Introducing "measures" and inequality tests in preconditions M L H R388A R388B Temporal constraints for planning and some implications An overview of extensions on basic STRIPS M L H M H H M H R389A R390A Historical notes about extensions for practical planning Some exercises about extensions for practical planning L H L H L L H H M

A.2 Chart Used to Rate Learning Style Fits for Atoms

Learning Style: Activist
  Low: A long description. More "passive learning" than "active learning."
  Medium: Material that's interesting at a glance, and not too long (unless an example).
  High: Material with active participation: an example or sample problem.

Learning Style: Reflector
  Low: Something not meant to be reflected upon (i.e. memorization), or which can be understood without much thought, or encourages participation more than thought.
  Medium: Material that might have the reader thinking "hey, that's a neat idea."
  High: A long, detailed description of a deep idea. Encourages thinking and reflection; encourages the user to reflect on the material.

Learning Style: Theorist
  Low: Relies on intuition more than proof. Or discusses what to do with something, rather than why that thing is true.
  High: Complex, proof-style, precise, abstract, shows why something is some way.

Learning Style: Pragmatist
  Low: Very theory-ish, no applications in sight. Might be full of abstract equations.
  Medium: Might relate to real world applications somehow, or maybe a toy example.
  High: Clearly relates to applications of the material in the real world.

B Data

B.1 Learning Styles Data

BES1 ... BES9 represent the 9 test subjects that were assigned to the best-fit group. WOR1 ... WOR9 represent the 9 test subjects that were assigned to the worst-fit group. EX1 ... EX2 represent 2 additional test subjects who took only the learning styles quizzes. Learning styles are listed in (Activist)(Reflector)(Theorist)(Pragmatist) form. So, LLMH would mean Activist=Low, Reflector=Low, Theorist=Medium, Pragmatist=High.
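
The four-letter encoding can be decoded mechanically. The following is a minimal sketch; the function name and the dictionary layout are assumptions made for illustration and are not part of the system described in this thesis.

```python
# Trait order follows the (Activist)(Reflector)(Theorist)(Pragmatist) convention.
TRAITS = ("Activist", "Reflector", "Theorist", "Pragmatist")
LEVELS = {"L": "Low", "M": "Medium", "H": "High"}


def decode_learning_style(code):
    """Turn a code such as 'LLMH' into a trait-to-level mapping."""
    if len(code) != len(TRAITS):
        raise ValueError(f"expected {len(TRAITS)} letters, got {code!r}")
    unknown = [letter for letter in code if letter not in LEVELS]
    if unknown:
        raise ValueError(f"unknown level letters {unknown} in {code!r}")
    return {trait: LEVELS[letter] for trait, letter in zip(TRAITS, code)}


print(decode_learning_style("LLMH"))
# {'Activist': 'Low', 'Reflector': 'Low', 'Theorist': 'Medium', 'Pragmatist': 'High'}
```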
id BES1 BES2 BES3 BES4 BES5 BES6 BES7 BES8 BES9 WOR1 WOR2 WOR3 WOR4 WOR5 WOR6 WOR7 WOR8 WOR9 EX1 EX2 B.2 Learning Preferences Chooser Learning Styles Assessment (80 question quiz) (4 question selection) LMLL MHMH LLMH HMMH MLMM LMHL MLML HMLH LMLH HLLH LMLM MLLH HLLL MHMH HHLM MMMH LMMH LLLL MMHL MLLH LLLM HMLL MMLH LHLM MHLH HLLL HLLM LHMM LMMH HLLH MMLH LMML MMLH HLML LLLM LLLH MHMM MMML HLMM MMMH Ease-of-Use and Knowledge Assessment Data A blank indicates that a test subject did not answer a particular question. Curriculum Set 1 1. 1had the background to understand the material that was presented. 2. The system was easy to use. 3. 1am naturally interested in the best-fit group, so they got the best-fit curriculum here BES1 BES2 BES3 BES4 BES5 BES BES7 BES8 BES9 6 1 3 3 7 2 3 6 2 1 1 1 3 3 4 3 6 6 2 2 1 6 6 3 2 3 3 4 1 3 4 6 2 4 3 3 4 subject of Al Planning. 4. I am interested in learning more 83 about Al Planning now. 5. I learn best from interacting with human beings (e.g. asking TAs and professors questions). 6. 1 am satisfied with my understanding of this material now. 7. 1found the ordering of the material more intuitive than what most textbooks present. 2 4 3 1 2 6 2 2 3 6 3 3 6 5 6 5 3 4 4 3 2 6 3 4 2 2 6 5 5 3 4 6 3 4 4 3 2 3 2 4 3 2 6 2 3 1 3 3 2 2 1 6 2 3 11. The number of examples presented was appropriate. 7 4 3 2 4 5 3 4 1 12. The length of the readings 3 4 4 6 5 4 5 4 1 5 3 3 3 5 3 6 4 1 7 3 4 3 3 3 3 5 1 7 5 3 6 4 6 6 5 7 6 5 4 3 2 4 2 4 5 17. The reading was too theoretical. 18. The reading focused too much 6 7 4 5 3 3 4 6 3 5 5 5 5 6 4 5 7 7 on practical applications. 19. 1became more tired as 1 3 3 4 2 2 4 1 2 2 5 3 3 6 5 7 4 4 5 similar amount of textbook material. 21. This was less coherent than what I usually read in textbooks. 6 5 3 6 5 3 3 6 5 22. It felt like necessary knowledge 2 5 3 3 4 3 2 5 7 6 5 3 1 5 5 2 3 6 2 3 3 1 3 6 2 5 7 7 4 3 7 4 5 3 4 8. I could have learned the material better by directly reading a textbook. 9. 
I could have learned the material better by attending a 100-student lecture. 10. 1could have learned the material better by attending a 10student recitation. presented were appropriate. 13. There was an appropriate amount of theoretical background covered. 14. An appropriate amount of practical applications was covered. 15. There were too many examples presented. 16. The reading passages felt too lengthy and detailed. progressed. 20. I felt more fatigue when using this system than when reading a was skipped over. 23. The lack of good *glue* in this system made it harder to read. *glue* is defined as text references to what was just taught or what is coming next. 24. The average textbook is less effective than what I just read. 25. The material was presented in a way similar to (my best understanding of) my internal representation of knowledge. 84 1. Meaningful 2. Stimulating 3. Sense of Discovery 4. Rewarding 5. Leads to New Questions 6. Challenging 7. Moments of Wonder 5 6 5 4 7 4 4 6 6 4 4 4 4 2 4 3 3 4 2 3 5 6 1 2 4 1 6 2 2 3 3 4 3 3 2 5 4 6 4 5 2 1 5 5 3 3 2 5 3 5 4 4 3 3 2 1 6 5 5 5 6 3 1 1. Coherent 7 6 3 2 5 1 3 5 6 2. Relevant 3. Engaging 4.Tedious 6 3 5 6 6 2 2 4 3 7 2 6 5 3 6 3 3 5 5 5 5 6 4 4 7 6 2 5. Useful 6 6 4 7 5 5 5 4 6 6. Effective 7. Interesting 5 6 6 6 3 4 4 2 3 4 3 5 3 6 4 5 7 6 8.Redundant 9. Easy to Understand 3 3 2 6 4 3 2 5 5 5 3 5 3 5 5 3 1 7 25 45 30 90 45 25 45 20 20 1 1 1 1 1 1 0 0 1 1 1 1 1 0 0 0 1 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1. About how much time did you to finish this reading? Curriculum 1 Quiz Question Curriculum 1 Quiz Question Curriculum 1 Quiz Question Curriculum 1 Quiz Question it take 1 2 3 4 Curriculum Set 2 _ best-fit group, so they got the worst-fit curriculum here BESI BES2 BES3 BES4 BESS BES BES7 BES8 BES9 6 1 7 5 5 2 5 2 5 5 4 3 7 2 6 5 5 6 7 4 6 6 5 5 5 7 2 6 5 4 4 5 3 4 5 5 5 3 4 1 4 2 4 4 2 3 4 1 1 4 4 4 3 1 2 2 1 2 5 3 2 4 5 1 3 5 7 5 2 7 5 5 6 5 3 2 5 6 6 5 1 2 4 2 2 3 2. The system was easy to use. 6 1 3 3. 
I am interested in learning more 2 3 2 about Al Planning now. 4. 1 am satisfied with my 6 3 3 understanding of this material now. 5. 1found the ordering of the 7 3 3 1. 1had the background to understand the material that was 3 5 presented. material more intuitive than what most textbooks present. 6. I could have learned the material better by directly reading a textbook. 7. I could have learned the material better by attending a 100-student lecture. 8. I could have learned the material better by attending a 10-student recitation. 9. The way some text referred to previous/past text (that I didn't get) was distracting to me. 10. The variation in font and font size was distracting to me. L11. The way images are presented 85 was distracting to me. 12. The occasional imperfect scans were distracting to me. 7 5 3 7 4 5 3 4 2 13. The number of examples 5 1 3 7 3 3 6 5 5 5 1 2 6 4 4 5 4 6 5 4 2 7 3 6 3 3 4 covered. 16. An appropriate amount of 3 4 2 7 4 3 6 4 4 practical applications was covered. 17. There were too many examples 3 6 2 6 4 4 5 5 2 18. The reading passages felt too lengthy and detailed. 1 6 2 4 3 2 3 4 1 19. The reading was too theoretical. 6 6 2 1 4 1 2 3 7 20. The reading focused too much 6 6 4 7 4 4 6 5 4 presented was appropriate. 14. The length of the readings presented were appropriate. 15. There was an appropriate amount of theoretical background presented. on practical applications. 21. 1 became more tired asl 1 _ 1 6 3 1 2 2 2 3 1 2 6 3 4 3 6 3 2 3 2 6 3 4 4 1 2 2 4 2 5 2 2 3 2 2 2 6 3 4 3 1 5 4 2 3 6 6 4 3 4 3 5 2 5 5 7 3 5 7 4 6 6 5 4 1. Meaningful 2. Stimulating 3. Sense of Discovery 4. Rewarding 5. Leads to New Questions 2 5 4 3 7 6 5 3 3 3 4 3 4 3 4 1 1 1 1 1 5 4 4 4 4 2 1 2 2 4 3 3 2 2 2 2 3 4 2 4 3 4 3 3 5 6. Challenging 7. Moments of Wonder 7 5 3 2 3 3 3 1 5 4 7 1 6 2 4 3 5 2 1. Coherent 2. Relevant 2 5 5 5 4 3 1 1 5 5 2 4 3 5 4 5 6 6 3. Engaging 2 4 3 1 4 2 3 3 3 4.Tedious 5. Useful 6. Effective 7. 
Interesting 7 5 1 3 3 5 5 4 3 5 4 2 1 3 2 1 5 5 5 4 1 3 2 1 5 2 2 3 5 4 3 5 6 5 5 2 progressed. 22. I felt more fatigue when using this system than when reading a similar amount of textbook material. 23. This was less coherent than what I usually read in textbooks. 24. It felt like necessary knowledge was skipped over. 25. The lack of good *glue* in this system made it harder to read. *glue* is defined as text references to what was just taught or what is coming next. 26. The average textbook is less effective than what I just read. 27. The material was presented in a way similar to (my best understanding of) my internal representation of knowledge. 86 8.Redundant 9. Easy to Understand 1. About how much time did you to finish this reading? Curriculum 2 Quiz Question Curriculum 2 Quiz Question Curriculum 2 Quiz Question Curriculum 2 Quiz Question 5 1 50 it take 4 4 30 3 5 25 1 1 20 3 1 30 4 4 30 2 2 40 3 2 15 I 1 2 3 4 Curriculum Set 1 1 1 1 0 1 1 1 1 0 1 0 1 3 25 I 0 0 0 1 1 0 1 1 0 1 1 1 1 1 1 1 1 0 1 1 worst-fit group, so they got the worst-fit curriculum here WOR WOR WOR WOR WOR WOR WOR7 WOR WOR9 1 1. 1had the background to 2 3 4 5 6 8 2 4 7 1 2 1 5 2 3 2. The system was easy to use. 1 5 6 3 3 2 6 4 2 3. I am naturally interested in the subject of Al Planning. 4. 1am interested in learning more about Al Planning now. 5. I learn best from interacting with human beings (e.g. asking 3 3 2 3 2 5 6 3 7 2 3 3 1 3 5 6 5 5 3 2 2 1 5 1 2 1 7 4 6 7 7 2 5 5 6 4 2 5 7 7 6 2 6 5 5 understand the material that was presented. TAs and professors questions). 6. 1am satisfied with my understanding of this material now. 7. 1found the ordering of the material more intuitive than what most textbooks present. 8. 1 could have learned the I 4 4 1 2 1 4 2 3 5 4 4 4 7 6 2 6 1 3 4 2 2 1 5 2 4 1 4 1 5 4 1 3 1 3 4 2 3 4 4 1 2 1 5 7 1 2 5 5 1 2 1 5 5 1 2 3 5 6 2 1 4 7 3 15. There were too many examples presented. 6 6 6 6 4 6 5 5 5 16. The reading passages felt too lengthy and detailed. 17. 
The reading was too 4 4 2 3 3 6 3 1 6 4 2 2 3 4 6 3 1 5 material better by directly reading a textbook. 9. 1 could have learned the material better by attending a 100-student lecture. 10. 1could have leamed the material better by attending a 10student recitation. 11. The number of examples presented was appropriate. 12. The length of the readings presented were appropriate. 13. There was an appropriate amount of theoretical background covered. 14. An appropriate amount of practical applications was covered. I 87 theoretical. 18. The reading focused too much on practical applications. 4 5 6 6 5 7 19. 1became more tired as 1 3 3 1 2 3 2 6 3 1 2 2 3 5 2 1 3 2 22. It felt like necessary 2 1 1 3 knowledge was skipped over. 23. The lack of good *glue* in this 3 4 1 2 5 progressed. 20. I felt more fatigue when using this system than when reading a similar amount of textbook material. 21. This was less coherent than I 1 1 6 7 7 1 3 2 2 6 5 2 2 5 2 5 3 2 3 2 1 3 3 4 3 7 5 6 4 6 4 2 what I usually read in textbooks. system made it harder to read. *glue* is defined as text references to what was just taught or what is coming next. 24. The average textbook is less effective than what I just read. 25. The material was presented in 4 3 7 6 5 3 6 5 3 a way similar to (my best understanding of) my internal representation of knowledge. 1. Meaningful 2. Stimulating 5 5 4 3 2 3 3 3 5 5 5 5 4 3 1 1 3 3 3. Sense of Discovery 6 6 2 3 5 4 2 1 5 4. Rewarding 5. Leads to New Questions 5 7 3 5 2 4 3 6 5 4 3 6 2 4 1 1 4 3 6. Challenging 7. Moments of Wonder 1. Coherent 6 5 5 6 5 2 4 1 1 6 3 3 5 3 3 6 3 6 5 3 4 1 1 2 1 1 3 2. Relevant 6 6 3 3 5 6 5 3 2 3.Engaging 6 3 1 3 5 6 3 1 2 4.Tedious 5. Useful 6. Effective 4 6 5 2 4 3 6 3 2 5 3 3 4 3 4 4 7 5 6 3 3 7 2 2 5 4 6 7. Interesting 6 5 2 3 5 6 4 2 3 2 4 25 3 2 35 1 2 25 4 3 20 5 4 25 2 5 20 3 2 20 6 2 15 1 4 15 1 0 1 1 0 0 1 1 0 0 1 0 1 0 0 1 1 0 1 0 1 1 1 0 1 0 1 1 1 0 1 0 1 0 0 1 8.Redundant 9. Easy to Understand 1. 
About how much time did it take you to finish this reading?
Curriculum 1 Quiz Question 1
Curriculum 1 Quiz Question 2
Curriculum 1 Quiz Question 3
Curriculum 1 Quiz Question 4
Curriculum Set 2 worst-fit group, so they got the best-fit curriculum here
WOR1 WOR2 WOR3 WOR4 WOR5 WOR6 WOR7 WOR8 WOR9
1. I had the background to understand the material that was presented. 2 1 7 1 1 1 4 3 5 2 1
2. The system was easy to use.
3. I am interested in learning more about AI Planning now.
4. I am satisfied with my understanding of this material now.
5. I found the ordering of the material more intuitive than what most textbooks present.
6. I could have learned the material better by directly reading a textbook. 2 1 1 1 5 5 5 2 2 2 1 3 5 3 4 5 2 2 7 5 3 3 4 5 3 3 2 7 6 3 1 3 3 4 6 5 3 3 2 1 4 4 6 4 5 4 7 6 1 6 1 4 4 4 3 3 5 1 4 1 5 1 4 1 1 1 5 2 5 2 3 6 1 1 3 6 3 3 2 2 6 1 1 2 5 4 6 6 3 6 1 3 4 3 4 1 6 2 1 2 1 3 1 2 2 4 1 1 5 4 1 3 6 3 1 1 4 5 2 1 4 2 5 1 1 3 1 2 1 2 2 3 6 1 7 5 3 6 5 6 2 4 6 3 2 3 6 4 1 2 I 1 3 4 1
7. I could have learned the material better by attending a 100-student lecture.
8. I could have learned the material better by attending a 10-student recitation.
9. The way some text referred to previous/past text (that I didn't get) was distracting to me.
10. The variation in font and font size was distracting to me.
11. The way images are presented was distracting to me.
12. The occasional imperfect scans were distracting to me.
13. The number of examples presented was appropriate.
14. The length of the readings presented were appropriate.
15. There was an appropriate amount of theoretical background covered.
16. An appropriate amount of practical applications was covered.
17. There were too many examples presented.
18. The reading passages felt too lengthy and detailed.
19. The reading was too theoretical. 5 7 3 2 5
20. The reading focused too much on practical applications. 4 7 5 5 6 I
21. I became more tired as I progressed.
22. I felt more fatigue when using this system than when reading a similar amount of textbook material. 7 5 5 6 7 5 7 3 7 5 2 3 3 4 6 4 6 _ 3 6 1 3 5 1 4 6 3 7 3 5 5 5 1 3 2 7
23. This was less coherent than what I usually read in textbooks. I 3 I
24. It felt like necessary knowledge was skipped over. 5 6 2 3 3 7 4 5 5
25. The lack of good *glue* in this system made it harder to read. *glue* is defined as text references to what was just taught or what is coming next. 2 6 3 3 2 7 4 5 4 2 2 4 5 5 1 4 4 2 2 2 4 5 5 1 4 3 3
26. The average textbook is less effective than what I just read.
27. The material was presented in a way similar to (my best understanding of) my internal representation of knowledge.
1. Meaningful
2. Stimulating 6 6 7 7 1 1 5 5 4 3 7 7 4 5 4 3 5 5
3. Sense of Discovery 7 6 1 5 4 5 5 3 6
4. Rewarding 5 7 1 5 5 5 5 2 2
5. Leads to New Questions 6 5 3 5 5 6 4 5 5
6. Challenging
7. Moments of Wonder
1. Coherent
2. Relevant 7 4 4 7 6 6 7 7 1 1 3 3 6 5 4 4 4 4 4 5 6 3 7 7 4 3 4 5 4 5 5 5 3 1 2 4
3. Engaging 6 7 4 4 6 7 4 3 5
4. Tedious
5. Useful
6. Effective
7. Interesting
8. Redundant
9. Easy to Understand 5 6 5 6 5 6 6 6 6 7 6 6 6 4 3 4 1 2 6 5 3 6 4 5 6 4 5 5 6 6 3 6 7 7 2 7 3 3 5 4 4 5 6 4 4 3 6 4 6 4 5 2 6 4
1. About how much time did it take you to finish this reading? 25 35 20 35 25 45 15 20 30
Curriculum 2 Quiz Question 1 1 1 1 1 1 1 1 1 0
Curriculum 2 Quiz Question 2 1 1 0 0 1 0 0 0 1
Curriculum 2 Quiz Question 3
Curriculum 2 Quiz Question 4 1 1 1 1 0 1 0 1 0 1 1 1 1 1 0 1 0 0