Supplemental Files

A Meta-Analysis of the Effectiveness of Intelligent Tutoring Systems on College Students’ Academic Learning

Supplemental Table 1¹
Substantive Features of the Studies Included

Study (independent sample) | ITS name | Intervention Conditions | Comparison Conditions | Subject | ITS duration | Assessment type | Measurement timing | Research design | Sample size² | Adjusted ES³ / Unadjusted ES⁴
VanLehn et al. (2007) (1) | Why2-Atlas/Why2-AutoTutor | ITS as principal instruction | human tutoring | physics | short term | specific | immediately following | experimental | 41 | -0.10
VanLehn et al. (2007) (1) | Why2-Atlas/Why2-AutoTutor | ITS as principal instruction | computerized materials | physics | short term | specific | immediately following | experimental | 45 | -0.33
VanLehn et al. (2007) (2) | Why2-AutoTutor | ITS as principal instruction | computerized materials | physics | short term | specific | immediately following | experimental | 48 | 0.88
VanLehn et al. (2007) (2) | Why2-AutoTutor | ITS as principal instruction | do nothing control | physics | short term | specific | immediately following | experimental | 51 | 0.88
VanLehn et al. (2007) (3) | Why2-AutoTutor | ITS as principal instruction | computerized materials | physics | short term | specific | delayed | experimental | 62 | 0.26
VanLehn et al. (2007) (4) | Why2-Atlas/Why2-AutoTutor | ITS as principal instruction | computerized materials | physics | short term | specific | immediately following | experimental | 103 | 0.10
VanLehn et al. (2007) (4) | Why2-Atlas/Why2-AutoTutor | ITS as principal instruction | printed text | physics | short term | specific | immediately following | experimental | 64 | -0.06
Lane & VanLehn (2005) | ProPL (dialogue-based intelligent tutoring systems) | ITS as principal instruction | computerized materials | computer science | short term | specific | immediately following | experimental | 25 | 0.58
Wang, Li, & Chang (2006) | CooTutor (coordinate tutor) | ITS as principal instruction | computer-assisted learning | computer science | 3-12 weeks | specific | immediately following | quasi-experimental | 22 | -0.32
Stylianou & Shapiro (2002) | Cognitive Tutor | ITS-integrated class instruction | traditional classroom instruction | mathematical subjects | one semester/above | embedded | end of semester | quasi-experimental | 38 | 0.52
Graesser et al. (2003a) | Why/AutoTutor | ITS as principal instruction | printed text | physics | short term | specific | immediately following | experimental | 29 | 1.23
Arnott, Hastings, & Allbritton (2008) | Research Methods Tutor (RMT) | ITS-supplemented class instruction | traditional classroom instruction | other subjects | ng | specific | end of semester | quasi-experimental | 125 | 0.01
Graesser et al. (2003b) | AutoTutor | ITS as principal instruction | printed text | computer literacy | short term | specific | end of semester | experimental | 20 | 0.23
Graesser et al. (2003b) | AutoTutor | ITS as principal instruction | do nothing control | computer science | short term | specific | end of semester | experimental | 20 | 0.99
Corbett (2001) | ACT Programming Tutor (APT) | ITS as principal instruction | computer-assisted learning | computer science | short term | specific | immediately following | quasi-experimental | 20 | 0.70
Miller & Butz (2004) | Interactive Multimedia Intelligent System (IMITS) | ITS as principal instruction | traditional classroom instruction | other subjects | one semester/above | embedded | end of school year | quasi-experimental | 83 | 0.53
Aberson et al. (2003) | Web Interface for Statistics Education (WISE) | ITS-assisted activities | self-reliant learning | statistics | short term | specific | immediately following | quasi-experimental | 25 | 0.78
Reif & Scott (1999) | Personal Assistants for Learning (PALs) | ITS-assisted homework | human tutoring | physics | short term | specific | immediately following | quasi-experimental | 30 | -0.77
Reif & Scott (1999) | Personal Assistants for Learning (PALs) | ITS-assisted homework | self-reliant learning | physics | short term | specific | immediately following | quasi-experimental | 30 | 0.85
Grubišić, Stankov, & Zitko (2006) | eXtended Tutor-Expert System (xTEx-Sys) | ITS as principal instruction | traditional classroom instruction | computer science | 3-12 weeks | specific | immediately following | quasi-experimental | 80 | 0.15
Livergood (1994) | Multimedia modified intelligent tutoring system | ITS as principal instruction | printed text | computer science | short term | specific | immediately following | experimental | 114 | 0.52
Livergood (1994) | Multimedia modified intelligent tutoring system | ITS as principal instruction | computer-assisted learning | computer science | short term | specific | immediately following | experimental | 136 | 0.62
Stankov, Glavinić, & Grubišić (2004) | DTEx-Sys: Distributed Tutor Expert System | ITS as principal instruction | human tutoring | computer science | one semester/above | embedded | delayed | experimental | 22 | -0.09
Stankov, Glavinić, & Grubišić (2004) | DTEx-Sys: Distributed Tutor Expert System | ITS as principal instruction | traditional classroom instruction | computer science | one semester/above | embedded | delayed | experimental | 22 | 0.90
Xu, Meyer, & Morgan (2009) | Assessment and Learning in Knowledge Spaces (ALEKS) | ITS as principal instruction | traditional classroom instruction | statistics | one semester/above | embedded | end of semester | quasi-experimental | 86 | 0.96
Hagerty & Smith (2005) | ALEKS-Assessment and Learning in Knowledge Spaces (ALEKS) | ITS-supplemented class instruction | traditional classroom instruction | mathematical subjects | ng | embedded | end of semester | quasi-experimental | 195 | 0.59
Hu et al. (2007) | ALEKS behavioral statistics | ITS as principal instruction | traditional classroom instruction | statistics | ng | embedded | end of semester | quasi-experimental | 473 | 0.17
Hampikian et al. (2007) | ALEKS | ITS as principal instruction | traditional classroom instruction | mathematical subjects | one semester/above | embedded | end of semester | quasi-experimental | 19 | 0.28
Baxter & Thibodeau (2011) | ALEKS Financial accounting course | ITS as principal instruction | traditional classroom instruction | business subjects | one semester/above | embedded | end of semester | quasi-experimental | 99 | 0.33
Conati & VanLehn (2000) | Self-explanation coach (SE-Coach) | ITS as principal instruction | computer-assisted learning | physics | short term | specific | immediately following | quasi-experimental | 56 | 0.17
Aberson et al. (2002) | Web Interface for Statistics Education Power Applet (WISE) | ITS-supplemented class instruction | traditional classroom instruction | statistics | short term | specific | end of semester | quasi-experimental | 25 | 1.43
Aberson et al. (2000) | Web Interface for Statistics Education Power Applet (WISE) | ITS as principal instruction | traditional classroom instruction | statistics | short term | specific | immediately following | experimental | 111 | -0.25
Bliwise (2005) | Web-based Tutorial for teaching introductory statistics | ITS-supplemented class instruction | traditional classroom instruction | statistics | ng | embedded | end of semester | quasi-experimental | 225 | 0.67
Morris (2001) | Link | ITS as principal instruction | do nothing control | statistics | short term | specific | immediately following | experimental | 34 | 0.57
Morris (2001) | Link | ITS as principal instruction | printed text | statistics | short term | specific | immediately following | experimental | 33 | -0.21
Koch & Gobell (1999) (1) | Design-Statistics Finder | ITS-assisted activities | self-reliant learning | statistics | short term | specific | meantime | experimental | 26 | 0.91
Koch & Gobell (1999) (2) | Design-Statistics Finder | ITS-assisted activities | printed text | statistics | short term | specific | immediately following | experimental | 41 | 0.92
Mitrovic & Ohlsson (1999) | Structured Query Language-Tutor (SQLTutor) | ITS-assisted activities | self-reliant learning | computer science | short term | embedded | end of semester | quasi-experimental | 46 | 0.78
Grubišić et al. (2009) | Extended Tutor-Expert System (xTEx-Sys) | ITS as principal instruction | traditional classroom instruction | computer science | 3-12 weeks | specific | immediately following | quasi-experimental | 39 | 0.30
Rosé et al. (2003) | Knowledge Construction Dialogues (KCDs) | ITS as principal instruction | computerized materials | physics | short term | specific | immediately following | experimental | 28 | 0.19
Grubišić, Stankov, & Hrepic (2008) | eXtended Tutor-Expert System (xTEx-Sys) | ITS-assisted activities | computer-assisted learning | physics | 3-12 weeks | specific | immediately following | quasi-experimental | 48 | 0.09
Heyden (1990) | DSM-III-R computer tutorial | ITS as principal instruction | printed text | other subjects | short term | specific | immediately following | quasi-experimental | 78 | 0.68
Chang et al. (2003) | Web-Soc tutoring system | ITS as principal instruction | printed text | computer science | short term | specific | immediately following | experimental | 48 | 0.84
Johnson, Phillips, & Chase (2009) | Transaction analysis and recording tutor | ITS-assisted homework | self-reliant learning | business subjects | short term | specific | meantime | quasi-experimental | 55 | 0.57
Phillips & Johnson (2011) | Transaction analysis and recording tutor | ITS-assisted homework | computer-assisted learning | business subjects | short term | specific | immediately following | quasi-experimental | 140 | 0.10
Shute & Glaser (1990) | Smithtown | ITS as principal instruction | traditional classroom instruction | business subjects | short term | specific | immediately following | quasi-experimental | 20 | -0.17
Shute & Glaser (1990) | Smithtown | ITS as principal instruction | do nothing control | business subjects | short term | specific | immediately following | quasi-experimental | 20 | 1.26
VanLehn et al. (2010) | Andes Physics Tutoring System | ITS-supplemented class instruction | traditional classroom instruction | physics | one semester/above | embedded | immediately following | quasi-experimental | 1066 | 0.47

Note. ES = effect size; ng = not given.
¹ Supplemental Table 1 presents the 48 effect sizes from the 39 independent studies, each indicating the effectiveness of the ITS under one of the 48 comparison conditions used in the studies. Thirty studies each had one comparison condition. Nine studies each provided effect sizes for two comparison conditions.
² The sample sizes reported in this table are the total sample sizes of each independent study.
³ These are adjusted overall effect sizes corresponding to a type of comparison condition within each independent sample. If a study provided both an adjusted and an unadjusted effect size, we chose the adjusted effect size to represent the study.
⁴ These are unadjusted overall effect sizes corresponding to a type of comparison condition within each independent sample.

Reference List of Included Studies

Aberson, C. L., Berger, D. E., Healy, M. R., & Romero, V. L. (2002). An interactive tutorial for teaching statistical power. Journal of Statistics Education, 10(3), [Online]. Retrieved from www.amstat.org/publications/jse/v10n3/aberson.html
Aberson, C. L., Berger, D. E., Healy, M. R., & Romero, V. L. (2003). Evaluation of an interactive tutorial for teaching hypothesis testing concepts. Teaching of Psychology, 30(1), 75-78. doi: 10.1207/S15328023TOP3001_12
Aberson, C. L., Berger, D. E., Healy, M. R., Kyle, D. J., & Romero, V. L. (2000). Evaluation of an interactive tutorial for teaching the central limit theorem. Teaching of Psychology, 32, 39-52. doi: 10.1207/S15328023TOP2704_08
Arnott, E., Hastings, P., & Allbritton, D. (2008). Research methods tutor: Evaluation of a dialogue-based tutoring system in the classroom. Behavior Research Methods, 40(3), 694-698. doi: 10.3758/BRM.40.3.694
Bliwise, N. G. (2005). Web-based tutorials for teaching introductory statistics. Journal of Educational Computing Research, 33(3), 309-325.
Chang, K.-E., Sung, Y.-T., Wang, K.-Y., & Dai, C.-Y. (2003).
Web_Soc: A socratic-dialect-based collaborative tutoring system on the World Wide Web. IEEE Transactions on Education, 46, 69-78. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1183669&tag=1
Conati, C., & VanLehn, K. (2000). Further results from the evaluation of an intelligent computer tutor to coach self-explanation. In G. Gauthier, C. Frasson, & K. VanLehn (Eds.), Intelligent Tutoring Systems: 5th International Conference (pp. 304-313). Montreal, Canada. doi: 10.1007/3-540-45108-0_34
Conati, C., Muldner, K., & Carenini, G. (2006). From example studying to problem solving via tailored computer-based meta-cognitive scaffolding: Hypotheses and design. Technology, Instruction, Cognition and Learning (TICL), Special Issue on Problem Solving Support in Intelligent Tutoring Systems, 4(2), 1-52. Retrieved from http://people.cs.ubc.ca/~conati/mypapers/TICL%204.2%20(Conati).pdf
Corbett, A. (2001). Cognitive computer tutors: Solving the two-sigma problem. In M. Bauer, P. J. Gmytrasiewicz, & J. Vassileva (Eds.), User Modeling: Proceedings of the Eighth International Conference, UM 2001 (pp. 137-147). Springer-Verlag Berlin Heidelberg.
Graesser, A. C., Jackson, G. T., Mathews, E. C., Mitchell, H. H., Olney, A., Ventura, M., … Tutoring Research Group. (2003a). Why/AutoTutor: A test of learning gains from a physics tutor with natural language dialog. In R. Alterman & D. Hirsh (Eds.), Proceedings of the Twenty-Fifth Annual Conference of the Cognitive Science Society (pp. 1-6). Mahwah, NJ: Erlbaum.
Graesser, A. C., Moreno, K. N., Marineau, J. C., Adcock, A. B., Olney, A. M., & Person, N. K. (2003b). AutoTutor improves deep learning of computer literacy: Is it the dialog or the talking head? In U. Hoppe, F. Verdejo, & J. Kay (Eds.), Artificial Intelligence in Education: Shaping the Future of Learning through Intelligent Technologies (pp. 47-54). Amsterdam: IOS Press.
Grubišić, A., Stankov, S., & Hrepic, Z.
(2008). Comparing the effectiveness of learning content management systems to intelligent tutoring systems. Proceedings of the IASK International Conference E-Activity and Learning Technologies & Inter TIC (pp. 93-100). Madrid, Spain.
Grubišić, A., Stankov, S., & Zitko, B. (2006). An approach to automatic evaluation of educational influence. Proceedings of the 6th WSEAS International Conference on Distance Learning and Web Engineering, Lisbon, Portugal, September 22-24, 2006.
Grubišić, A., Stankov, S., Rosić, M., & Zitko, B. (2009). Controlled experiment replication in evaluation of e-learning system’s educational influence. Computers & Education, 53(3), 591-602. doi: 10.1016/j.compedu.2009.03.014
Hagerty, G., & Smith, S. (2005). Using the web-based interactive software ALEKS to enhance college algebra. Mathematics and Computer Education, 39(3), 183-194. Retrieved from http://search.proquest.com/docview/235830320?accountid=10598
Hampikian, J., Guarino, J., Chyung, S. Y., Gardner, J., Moll, A., Pyke, P., & Schrader, C. (2007). Benefits of a tutorial mathematics program for engineering students enrolled in precalculus: A template for assessment. ASEE Annual Conference. Retrieved from http://www.icee.usm.edu/ICEE/conferences/asee2007/papers/1998_BENEFITS_OF_A_TUTORIAL_MATHEMATICS_PROGR.pdf
Heyden, D. C. (1990). A DSM-III-R computer tutorial for abnormal psychology. Teaching of Psychology, 13(3), 203-206. doi: 10.1207/s15328023top1703_21
Hu, X., Luellen, J. K., Okwumabua, T. M., Xu, Y., & Mo, L. (2007). Observational findings from a web-based intelligent tutoring system: Elimination of racial disparities in an undergraduate behavioral statistics course. Accepted as a paper presentation at the 2007 Annual Meeting of the American Educational Research Association (AERA). Chicago, IL; April 2007.
Johnson, B. G., & Phillips, F. (2011). Online homework versus intelligent tutoring systems: Pedagogical support for transaction analysis and recording.
Issues in Accounting Education, 26, 87-97.
Johnson, B. G., Phillips, F., & Chase, L. G. (2009). An intelligent tutoring system for the accounting cycle: Enhancing textbook homework with artificial intelligence. Journal of Accounting Education, 27, 30-39. doi: 10.1016/j.jaccedu.2009.05.001
Koch, C., & Gobell, J. (1999). A hypertext-based tutorial with links to the web for teaching statistics and research methods. Behavior Research Methods, Instruments, & Computers, 31, 7-13. doi: 10.3758/BF03207686
Lane, H. C., & VanLehn, K. (2005). Teaching the tacit knowledge of programming to novices with natural language tutoring. Computer Science Education, 15(3), 183-201. doi: 10.1080/08993400500224286
Livergood, N. D. (1994). A study of the effectiveness of a multimedia intelligent tutoring system. Journal of Educational Technology Systems, 22(4), 337-344.
Mitrovic, A., & Ohlsson, S. (1999). Evaluation of a constraint-based tutor for a database language. International Journal of Artificial Intelligence in Education, 10(3-4), 238-256. Retrieved from http://hdl.handle.net/10092/327
Morris, E. (2001). The design and evaluation of Link: A computer-based learning system for correlation. British Journal of Educational Technology, 32, 39-52. doi: 10.1111/1467-8535.00175
Reif, F., & Scott, L. A. (1999). Teaching scientific thinking skills: Students and computers coaching each other. American Journal of Physics, 67(9), 819-831. doi: 10.1119/1.19130
Rosé, C. P., Bhembe, D., Siler, S., Srivasteva, R., & VanLehn, K. (2003). Exploring the effectiveness of knowledge construction dialogues. In U. Hoppe, F. Verdejo, & J. Kay (Eds.), Artificial intelligence in education: Shaping the future of learning through intelligent technologies (pp. 497–499). Amsterdam, the Netherlands: IOS Press.
Shute, V. J., & Glaser, R. (1990).
A large-scale evaluation of an intelligent discovery world: Smithtown. Interactive Learning Environments, 1, 51-77. doi: 10.1080/1049482900010104
Stankov, S., Glavinić, V., & Grubišić, A. (2004). What is our effect size: Evaluating the educational influence of a web-based intelligent authoring shell? In IEEE international conference on intelligent engineering systems 2004 – INES 2004, Cluj-Napoca, Romania.
Stylianou, D. A., & Shapiro, L. (2002). Revitalizing algebra: The effect of the use of a cognitive tutor in a remedial course. Journal of Educational Media, 27(3), 147-171. doi: 10.1080/1358165022000081404
VanLehn, K., Graesser, A. C., Jackson, G. T., Jordan, P., Olney, A., & Rosé, C. P. (2007). When are tutorial dialogues more effective than reading? Cognitive Science, 31, 3-62. doi: 10.1080/03640210709336984
VanLehn, K., Van de Sande, B., Shelby, R., & Gershman, S. (2010). The Andes Physics Tutoring System: An experiment in freedom. Studies in Computational Intelligence, 308, 421-443. doi: 10.1007/978-3-642-14363-2_21
Wang, H.-C., Li, T.-Y., & Chang, C.-Y. (2006). A web-based tutoring system with styles-matching strategy for spatial geometric transformation. Interacting with Computers, 18(3), 331-355. doi: 10.1016/j.intcom.2005.11.002
Xu, Y. J., Meyer, K. A., & Morgan, D. D. (2009). A mixed-methods assessment of using an online commercial tutoring system to teach introductory statistics. Journal of Statistics Education, 17(2), 1-17.
Retrieved from http://www.amstat.org/publications/jse/v17n2/xu.pdf

Supplemental Table 2
Results of Testing for Moderators on Adjusted Effect Sizes

Variable Grade level Undergraduates Mixed Graduates students Prior knowledge2 Yes No Compensation3 Yes No Not given Duration Short term 3-12 weeks One semester or longer Not given Additional time4 Yes No Not given Sample size Less than 40 40-100 More than 100 Research design Experimental Quasi-experimental Research setting Real environment Laboratory Both Measurement timing Immediately after End of semester/school Delayed Meantime Outcome5 Single outcome Multiple outcomes Country US Non-US College type 4 year Multiple Other k1 g 23 2 1 .40 .16 -.32 18 8 .42 .22 6 16 4 .21 .43 .39 17 4 2 3 .32 .11 .50 .59 6 13 7 .51 .37 .22 Fixed Qb 2.32 .44 .24 .54 .36 .37 19 7 .41 .22 14 8 4 .42 .25 .47 .182 4 .212 1 .492 1 .107 . .314 1 .44 .25 .54 7.57 16 10 1 .30 .45 2.32 .24 .63 .28 .57 .350 .52 .33 .29 2.59 18 5 2 1 1 .50 .38 .22 1.42 15 10 1 .165 .37 .11 .50 .59 3.10 .24 .45 1 .23 .42 .37 4.86 10 16 .314 .42 .22 2.13 .53 .32 .36 g .39 .55 -.32 1.93 10 11 5 Rand p .056 5 .27 .63 .28 .57 .01 .937 . .39 .34 1.27 .260 1 .41 .22 1.84 .399 . .42 .36 .47 Variable Report type Journal article Conference paper Book chapter k1 g 18 7 1 .38 .36 .19 Fixed Qb .21 Rand p .901 g . .38 .36 .19

Notes. Qb denotes the heterogeneity status between all categories of a particular variable.
¹ The analyses were conducted on the second dataset. It involved the 26 adjusted effect sizes extracted from 26 studies.
² Prior knowledge refers to whether the samples had background knowledge of the subject or topic tutored or studied.
³ Compensation refers to whether the students received compensation for participating in the research.
⁴ Additional time refers to whether the intervention group spent more time on learning than the control group did.
⁵ Outcome refers to whether the effectiveness was measured through a single outcome or multiple outcomes.
Supplemental Table 3
Results of Testing for Moderators on Unadjusted Effect Sizes

Variable Grade level Undergraduates Mixed Graduates students Prior knowledge Yes No Not given Compensation Yes No Not given Duration Short term 3-12 weeks One semester or longer Not given Additional time Yes No Not given k1 g 32 3 2 .33 .30 .21 23 13 1 .30 .34 .70 Fixed Qb .37 22 4 7 4 .38 .22 .40 .12 10 18 9 .26 .43 .23 Random Qb .82 .623 .68 .37 .34 .70 1.17 .33 .30 .45 g .35 .54 .21 .95 7 23 7 p .831 .557 .887 .37 .32 .51 6.34 .096 6.88 .43 .22 .40 .12 4.33 .115 3.61 .35 .45 .23

Variable | k¹ | Fixed g | Fixed Qb | p | Random g | Random Qb
Sample size | | | 5.25 | .072 | | 3.56
Less than 40 | 14 | .43 | | | .43 |
40-100 | 15 | .39 | | | .40 |
More than 100 | 8 | .21 | | | .21 |
Research design | | | 2.62 | .106 | | 2.50
Experimental | 15 | .42 | | | .48 |
Quasi-experimental | 22 | .27 | | | .28 |
Research setting | | | 1.14 | .565 | | .80
Real environment | 24 | .30 | | | .31 |
Laboratory | 12 | .40 | | | .44 |
Both | 1 | .46 | | | .46 |
Measurement timing | | | 8.76 | .067 | | 4.94
Immediately | 22 | .33 | | | .34 |
End of semester | 10 | .27 | | | .29 |
Meantime | 2 | .26 | | | .40 |
Delayed | 2 | 1.04 | | | 1.01 |
End of school | 1 | .09 | | | .09 |
Outcome | | | .04 | .846 | | .44
Single outcome | 24 | .32 | | | .38 |
Multiple outcomes | 13 | .31 | | | .30 |
Country | | | .01 | .944 | | .12
US | 29 | .32 | | | .36 |
Non-US | 8 | .31 | | | .32 |
College type | | | .66 | .720 | | .66
4 year | 23 | .30 | | | .32 |
Multiple | 8 | .34 | | | .43 |
Other | 6 | .42 | | | .42 |
Report type | | | 1.21 | .550 | | .60
Journal article | 25 | .31 | | | .35 |
Conference paper | 9 | .28 | | | .32 |
Book chapter | 3 | .46 | | | .46 |

Notes. Qb denotes the heterogeneity status between all categories of a particular variable.
¹ The analyses were conducted on the third dataset. It involved the 37 unadjusted effect sizes extracted from 37 studies.
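The fixed-effects Qb and p values in Supplemental Table 3 can be related with a short calculation. The sketch below assumes the usual meta-analytic convention that Qb follows a chi-square distribution with degrees of freedom equal to the number of categories minus one; the supplement does not state this explicitly, and the helper names `chi2_sf` and `qb_p_value` are illustrative only. Under that assumption, the computed values agree with the reported p values for these four moderators to within rounding.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * i + 1)
    return total


def qb_p_value(qb: float, n_categories: int) -> float:
    """p value for a between-category Qb, assuming df = n_categories - 1."""
    return chi2_sf(qb, n_categories - 1)


if __name__ == "__main__":
    # (moderator, fixed Qb, number of categories, p as reported in Supplemental Table 3)
    rows = [
        ("Sample size", 5.25, 3, 0.072),
        ("Research design", 2.62, 2, 0.106),
        ("Measurement timing", 8.76, 5, 0.067),
        ("Duration", 6.34, 4, 0.096),
    ]
    for name, qb, n_cat, reported in rows:
        print(f"{name}: computed p = {qb_p_value(qb, n_cat):.3f}, reported p = {reported:.3f}")
```

Because Qb is rounded to two decimals in the table, small differences in the third decimal of p are expected.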