Assessing Student Critical Thinking and Information Literacy Skills

Michael Fosmire (Libraries), Ruth Wertz (ENE), Senay Purzer (ENE), Stephanie Gardner (Biology), Brian Dillman (Aviation Technology), Amy Van Epps (Libraries), Megan Sapp Nelson (Libraries), Karen Chang (Nursing), Bob Jacko (CE)

Overview: This project explored the effectiveness of using the Critical Thinking Assessment Test (CAT), developed by Tennessee Tech University and available for purchase, and the Critical Engineering Literacy Test (CELT), developed by researchers at Purdue University, for measuring student performance on critical thinking and information literacy skills. These assessments were considered candidates for providing baseline information for meeting core curriculum outcomes. The purpose of the study was to characterize how these assessments could be implemented, what resources they would require, and what value instructors placed on the results of those assessments.

Summary: The CAT and CELT critical thinking assessments both correlate with final course grades for the students tested. The CAT takes much longer to administer and grade, and student motivation, especially for test-retest, can be a problem. Additionally, the CAT showed significant bias toward English language proficiency. The CELT, on the other hand, did not show bias toward English language proficiency, gender, or race/ethnicity. The CELT takes significantly less time to administer and grade. It can be computer graded, while the CAT’s short-answer format requires human grading. In the courses studied, no significant gains were seen in the CAT. The CELT was not given as both pre- and post-test.

The Tools: Briefly, the CAT is a 15-question, scenario-based, pencil-and-paper, short-answer assessment. It takes approximately an hour for students to complete.
Since evaluating open-response answers involves qualitative judgment, each question is graded at least twice, and a third time if the first two graders disagree. It takes significant experience for graders to become proficient and efficient in the grading process, but by the end of the experiment, the graders were able to finish grading an assessment in 20 minutes.

The CELT is a scenario-based multiple-choice and short-answer assessment. It is available in online and pencil-and-paper formats. The multiple-choice portion can be graded automatically, while the short-answer portion can be graded at the discretion of the assessor. The short-answer questions ask students to state the reasoning behind each response they choose. Thus, this assessment can provide more depth than a simple multiple-choice assessment, but it can also be used in multiple-choice mode alone to extract quick results on student performance. Currently, the assessment has two scenarios, each with ten multiple-choice questions. It takes students about 20 minutes to complete a scenario, and grading of the short-answer questions can be done in 5 minutes.

Rationale: The CAT assessment has been validated in a number of environments, and it has been used on Purdue’s campus as a standardized assessment of critical thinking as part of an NSF grant to improve student outcomes in Biology. The CELT was developed by three of the members of this project as an easy-to-administer, easy-to-grade standard assessment that captures, in particular, the information literacy and critical reading skills of students. The CELT was originally developed for an audience of engineering undergraduates, so this project also sought to determine whether the assessment could be used to measure these skills in allied STEM disciplines.

Results: We conducted different combinations of assessments corresponding to the interests of the instructors involved in this project and the constraints of the classroom situations.
Some courses gave the CAT as both a pre- and post-assessment of student abilities, some paired the CAT and CELT assessments to examine correlations between the instruments, and two courses gave the CAT pre- and post-assessments and also administered the CELT. A summary of the deployment of assessments is given here:

Table 1: Distribution of Student Assessments

Course Section           CAT Pre   CAT Post   CELT v2.1   Paired CAT Pre & Post   Paired CELT v2.1 & CAT
Aviation Technology      104       --         82          --                      72
First Year Engineering   91        --         72          --                      69
Civil Engineering        56        51         --          47                      --
Nursing                  25        25         25          25                      25
Biology                  17        18         10          17                      9

In total, we distributed 387 copies of the CAT assessment and 189 copies of the CELT assessment. Descriptive statistics for the CAT and CELT are included in Appendix I.

Aviation Technology students scored lower than those in Biology, First Year Engineering, and Nursing. The lower scores of Aviation Technology students were consistent across both the CAT and the CELT. Motivational and deployment factors may have contributed to those lower scores: the instructor reported that, because of the constraints of the course timing, students felt rushed by the CAT, and that, because of the number of assessments from a variety of sources given as part of the course, motivation for the CELT might have been affected by ‘survey fatigue.’

The investigators gathered the most usable data from the First-Year Engineering course. For that course, the Spearman’s correlation matrices of final course grades against the CELT assessment items and the CAT assessment items are given in Appendices II and III, respectively. Overall, both assessments correlate significantly with the final course grade (see Table 2). The CAT and CELT share approximately 10% of the variance (i.e., rho squared). We found that 8 of the 18 CELT items had significant positive correlations with the final grade, while only 2 of the 15 CAT items correlated with the final grade.
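The shared-variance figure quoted above is the square of Spearman's rho between the two total scores. The following is a minimal Python sketch of that calculation, not the analysis code used in the study. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are illustrative, and the score lists are made-up values rather than study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up paired total scores for five students (NOT data from this study).
cat_totals = [12, 18, 21, 25, 30]
celt_totals = [9, 8, 13, 11, 16]

rho = spearman_rho(cat_totals, celt_totals)
shared_variance = rho ** 2  # proportion of rank variance the two scores share
print(f"rho = {rho:.2f}, shared variance = {shared_variance:.2f}")
```

Applied to the reported CAT-CELT correlation of .34 (Table 2), rho squared is .34 × .34 ≈ .12, in line with the "approximately 10%" stated in the text.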
Furthermore, the combination of CAT and CELT total scores explained a significant proportion of the variance in final grades for First-Year Engineering students, R² = .25, F(2, 66) = 10.70, p < .001. A simultaneous multiple linear regression showed that the CELT total score significantly predicted the final grade for first-year engineering students, b = 0.69, t(66) = 3.63, p < .001. The CAT total score, however, did not uniquely predict the students' final grade when considered together with the CELT.

Table 2: Spearman's Correlation Matrix of All Assessments for First-Year Engineering Students

              Final Grade    CAT Total      CELT Total
Final Grade   1.00
CAT Total     .31** (.009)   1.00
CELT Total    .49** (.000)   .34** (.004)   1.00

Note: N = 69; significance (two-tailed) shown in parentheses; *p < .05; **p < .01.

Table 3 compares the CELT total scores with the CAT pre- and post-test total scores. There was a significant correlation between the CELT and CAT for all courses except Aviation Technology and the CAT post-test for Biology. With the small number of students in the paired pool for the Biology course (N<10), it is not unexpected that it would be difficult to detect significant correlations.

Table 3: Nonparametric Spearman's Rho between CAT Total Scores and CELT Total Scores

                     CAT Pre-Test                           CAT Post-Test
                     Spearman's Rho (Significance)   N      Spearman's Rho (Significance)   N
Biology CELT         .71* (.046)                     8      .49 (.180)                      9
Nursing CELT         .57** (.003)                    25     .68** (.000)                    25
FYE CELT             .31** (.009)                    69     --                              --
Aviation Tech CELT   .20 (.094)                      72     --                              --
All Students CELT    .41** (.000)                    174    .59** (.000)                    34

Looking at the trends in pre- vs. post-test scores for the CAT assessment, Biology (t = 1.48, p = .159) and Nursing (t = .26, p = .796) showed no significant change, while the Civil Engineering course showed a significant decrease in scores (t = -5.65, p = .000). As mentioned above, there appeared to be a significant lack of motivation among those students to take the post-test seriously.
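The report gives only t and p values for the pre/post comparisons. Assuming they come from a paired-samples t-test on each student's pre- and post-test totals (an assumption, though Table 1 does list paired pre/post counts), the statistic can be sketched as below. The function name `paired_t` is illustrative and the scores are made up, not study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on the differences post - pre."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Made-up scores for four students (NOT data from this study).
pre_scores = [10, 12, 14, 16]
post_scores = [11, 14, 15, 19]

t_stat, df = paired_t(pre_scores, post_scores)
print(f"t({df}) = {t_stat:.2f}")
```

With this sign convention a negative t, such as the one reported for Civil Engineering, indicates that scores fell from pre-test to post-test. Converting t to a p-value requires the t distribution with n - 1 degrees of freedom, which this sketch leaves to a statistics package.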
In particular, the scores on items 1-3 increased (significantly for question 1), while all subsequent questions showed a decrease from pre-test to post-test (significantly for 5 of the 12 items).

Demographic Results

The CAT assessment did show a significant correlation with proficiency in English (see Figure 1). This showed up in the data for First-Year Engineering, which has a significant population of international students. The CAT assessment requires a significant amount of reading and extracting information from those readings, so it is conceivable that students with English as a second language would have more difficulty navigating this assessment. The difference between native and non-native English speakers was even more pronounced (Mean: Native = 19.1, Non-native = 14.00; p = .000). There was no significant difference by ethnicity, except that Asian students scored significantly lower than white students (an effect that disappeared once English proficiency was factored in). For First-Year Engineering students, females scored significantly better than males (Mean: Female = 19.58, Male = 17.75; p = .017).

Figure 1: CAT assessment score as a function of English proficiency (CAT Report from Tennessee Tech)

For the CELT assessment, no significant difference between students of different ethnicities was found. There was a significant difference between first-year engineering male and female students at the p < .05 level (Mean: Female = 12.2, Male = 10.7; p < .047). In addition, there was no significant correlation (or apparent trend) between students’ English proficiency and their CELT scores (see Table 4). Comparing native speakers with non-native speakers, there was no significant difference either (Mean: Native = 11.25, Non-native = 10.33; p = .867).

Table 4: CELT Scores by English Proficiency Level

English Proficiency   CELT 2.1 Mean Score   Std. Deviation   N
1                     11.000                --               1
2                     11.625                2.9262           4
3                     10.000                3.1379           14
4                     10.133                2.2398           15
5                     11.363                3.8884           131
Total                 11.139                3.6877           165

Discussion/Impact: This project indicated that there is a fair amount of overlap between information literacy and critical thinking skills as measured by the CAT and CELT assessments.

The CAT might be used effectively in small-course settings or as a sampling mechanism for measuring the baseline of student performance. Although Tennessee Tech indicated no difficulty with test-retest, the project participants noticed a challenge in maintaining student motivation in the second implementation. In particular, since there is only one form of the assessment, having the students work through the same questions in an hour-long assessment led to instances of fatigue with the assessment. This was especially evident in the results for the Civil Engineering course, where scores actually went up for the first few questions from pre- to post-test, but then the students seemed to lose patience, and the scores on the remaining items plummeted. The fact that the CAT does not have a ‘Form B’ means that students have to take the exact same assessment for re-testing, and this seems to be a significant barrier for students.

The CELT can be used on a larger scale. The depth of information gathered is not equivalent, but it is more scalable. There does not seem to be a ‘ceiling effect’ with this assessment, but it is also still undergoing development and does not have the same psychometric quality as the CAT. The CELT did not show ethnicity/language/gender differences at the same level as the CAT did. Additionally, the Biology, Nursing, and Engineering students’ scores were very similar, indicating that there isn’t a strong ‘engineering bias’ in the assessment as currently configured.

One question for further investigation is the relationship of verbal skills to the outcomes of this assessment.
The CAT only contains one simple calculation, and the CELT contains no calculations, so the focus is on critical reasoning rather than computation. While the assessments ask students to extract data and information from figures and text, both assessments focus on qualitative arguments. It would be helpful, for example, to compare SAT Verbal scores with results of CAT and CELT assessments to see if this conjecture is borne out.

Next Steps: Further development and dissemination of the CELT assessment tool has been funded by the NSF (Award: 1245998-DUE). In addition to instructors at Purdue, we have faculty from approximately a dozen external institutions interested in using the assessment in their curricula. The results of their implementations will be fed back into the development as well as used as generalizability evidence for the assessment tool.

Appendix I: Descriptive Statistics for CELT and CAT

Table A.1: Descriptive Statistics for CELT v2.1

Course Section           N     Min    Max (20)   M       SD
Aviation Technology      82    2.00   16.00      8.76    3.25
Biology                  10    7.50   17.50      12.35   2.60
First Year Engineering   72    5.50   19.50      12.83   2.64
Nursing                  25    8.00   20.00      13.26   2.92
All Students             189   2.00   20.00      11.10   3.60

Table A.2: Descriptive Statistics for CAT Pre-Tests

Course Section           N     Min   Max   M       SD
Aviation Technology      104   5     32    16.30   5.19
First Year Engineering   91    6     32    20.19   6.19
Civil Engineering        56    10    30    18.93   4.93
Nursing                  25    10    34    19.04   5.77
Biology                  17    13    31    19.88   5.16
All Students             291   5     34    18.45   5.74

Table A.3: Descriptive Statistics for CAT Post-Tests

Course Section           N     Min   Max   M       SD
Civil Engineering        51    4     27    15.75   5.54
Nursing                  25    8     31    19.24   6.43
Biology                  18    13    33    21.28   6.15
All Students             94    4     33    17.73   6.33

Appendix II: Spearman's Correlation Matrix of Final Course Grades to CELT v2.1 Items for First Year Engineering Students

(The lower-triangular matrix is shown in blocks of five columns; significance is given in parentheses.)

              Final Grade    CELT 1         CELT 2         CELT 3         CELT 4
Final Grade   1.00
CELT 1        .34** (.005)   1.00
CELT 2        .26* (.029)    .18 (.142)     1.00
CELT 3        .15 (.205)     -.05 (.704)    .31** (.010)   1.00
CELT 4        .05 (.663)     .26* (.033)    .04 (.763)     .06 (.608)     1.00
CELT 5        .19 (.109)     .31* (.010)    .01 (.942)     -.06 (.621)    .07 (.556)
CELT 6        .33** (.006)   .41** (.000)   .03 (.781)     .15 (.205)     .07 (.540)
CELT 7        -.09 (.466)    .03 (.800)     .06 (.633)     .00 (.989)     .23 (.053)
CELT 8        .27* (.025)    -.11 (.349)    .47** (.000)   .09 (.467)     -.05 (.670)
CELT 9        .12 (.326)     .10 (.416)     .17 (.152)     .21 (.081)     .06 (.609)
CELT 10       .10 (.414)     .13 (.269)     .08 (.528)     .19 (.113)     .19 (.113)
CELT 11       .24* (.046)    .05 (.668)     .11 (.347)     -.02 (.839)    -.02 (.839)
CELT 12       .31* (.011)    -.04 (.745)    .05 (.655)     .21 (.076)     .12 (.340)
CELT 13       .23 (.063)     .26* (.031)    .10 (.391)     .13 (.298)     .01 (.949)
CELT 14       .09 (.461)     -.04 (.738)    .10 (.434)     .04 (.730)     -.13 (.277)
CELT 15       .12 (.311)     .09 (.456)     .22 (.074)     .24* (.049)    .24* (.049)
CELT 16       .26* (.031)    .30* (.011)    .14 (.241)     .21 (.086)     .15 (.228)
CELT 17       .11 (.352)     .10 (.396)     .13 (.283)     -.04 (.763)    .05 (.659)
CELT 18       .26* (.031)    .26* (.031)    .19 (.125)     .10 (.411)     .04 (.730)
CELT Total    .49** (.000)   .52** (.000)   .47** (.000)   .41** (.000)   .40** (.001)

              CELT 5         CELT 6         CELT 7         CELT 8         CELT 9
CELT 5        1.00
CELT 6        .37** (.002)   1.00
CELT 7        .04 (.758)     -.11 (.366)    1.00
CELT 8        -.09 (.477)    -.25 (.038)    .12 (.323)     1.00
CELT 9        .14 (.266)     -.01 (.952)    .11 (.369)     -.02 (.899)    1.00
CELT 10       .20 (.105)     .10 (.408)     .18 (.135)     .11 (.358)     .26* (.030)
CELT 11       .11 (.351)     -.04 (.760)    .11 (.387)     .08 (.525)     -.02 (.865)
CELT 12       -.01 (.951)    .14 (.251)     -.09 (.481)    .23 (.053)     -.14 (.259)
CELT 13       .13 (.278)     .38** (.001)   -.04 (.749)    .06 (.609)     -.01 (.957)
CELT 14       -.07 (.542)    -.20 (.092)    -.02 (.890)    .32** (.007)   .03 (.806)
CELT 15       .16 (.181)     .11 (.366)     .13 (.286)     .09 (.448)     .19 (.117)
CELT 16       -.07 (.566)    .07 (.557)     -.01 (.910)    -.14 (.245)    .07 (.557)
CELT 17       .09 (.442)     .09 (.464)     -.06 (.633)    -.03 (.815)    .17 (.156)
CELT 18       -.07 (.542)    .11 (.358)     -.08 (.536)    .04 (.727)     .10 (.394)
CELT Total    .32** (.007)   .30* (.011)    .29* (.016)    .26* (.030)    .35** (.003)

              CELT 10        CELT 11        CELT 12        CELT 13        CELT 14
CELT 10       1.00
CELT 11       -.03 (.797)    1.00
CELT 12       .12 (.323)     .18 (.142)     1.00
CELT 13       -.03 (.836)    -.12 (.333)    .21 (.079)     1.00
CELT 14       -.07 (.545)    .05 (.701)     -.06 (.619)    -.04 (.736)    1.00
CELT 15       .11 (.384)     -.11 (.387)    -.13 (.286)    .22 (.070)     .08 (.536)
CELT 16       .01 (.942)     -.01 (.940)    .14 (.262)     .18 (.144)     -.10 (.410)
CELT 17       .14 (.235)     -.02 (.853)    .07 (.580)     .17 (.154)     -.10 (.434)
CELT 18       -.22 (.074)    -.07 (.562)    .16 (.177)     .02 (.883)     .19 (.121)
CELT Total    .26* (.033)    .20 (.101)     .26* (.028)    .39** (.001)   .23 (.054)

              CELT 15        CELT 16        CELT 17        CELT 18        CELT Total
CELT 15       1.00
CELT 16       .08 (.538)     1.00
CELT 17       .24* (.045)    .14 (.245)     1.00
CELT 18       .02 (.890)     .08 (.504)     .09 (.486)     1.00
CELT Total    .48** (.000)   .33** (.006)   .31** (.011)   .38** (.001)   1.00

Appendix III: Spearman's Correlation Matrix of Final Course Grades to CAT Items for First Year Engineering Students (N = 69)

(The lower-triangular matrix is shown in blocks of five columns; significance is given in parentheses.)

              Final Grade    CAT 1          CAT 2          CAT 3          CAT 4
Final Grade   1.00
CAT 1         .07 (.587)     1.00
CAT 2         .21 (.088)     .02 (.862)     1.00
CAT 3         .20 (.097)     .09 (.464)     .25* (.036)    1.00
CAT 4         .32** (.008)   .11 (.374)     .23 (.058)     .40** (.001)   1.00
CAT 5         .14 (.247)     -.02 (.894)    .24* (.045)    .13 (.294)     .21 (.085)
CAT 6         .08 (.528)     .17 (.165)     .21 (.083)     .23 (.054)     .34** (.004)
CAT 7         .24* (.044)    .20 (.105)     .22 (.068)     .41** (.000)   .42** (.000)
CAT 8         .06 (.619)     .26* (.031)    .12 (.337)     .09 (.446)     .08 (.527)
CAT 9         .15 (.216)     .07 (.549)     .19 (.117)     .33** (.006)   .30* (.013)
CAT 10        .13 (.286)     -.03 (.810)    .16 (.196)     .04 (.721)     .20 (.106)
CAT 11        -.04 (.714)    .20 (.101)     -.12 (.308)    .03 (.812)     -.10 (.428)
CAT 12        .09 (.441)     -.13 (.304)    -.03 (.784)    .03 (.789)     .30* (.012)
CAT 13        .14 (.262)     .13 (.302)     .19 (.114)     .20 (.107)     .36** (.003)
CAT 14        .18 (.136)     .36** (.002)   -.02 (.900)    .15 (.232)     .19 (.110)
CAT 15        .16 (.182)     -.02 (.888)    .11 (.374)     .25* (.037)    .52** (.000)
CAT Total     .31** (.009)   .31** (.009)   .39** (.001)   .55** (.000)   .69** (.000)

              CAT 5          CAT 6          CAT 7          CAT 8          CAT 9
CAT 5         1.00
CAT 6         .46** (.000)   1.00
CAT 7         .22 (.074)     .16 (.194)     1.00
CAT 8         -.03 (.826)    .17 (.161)     .04 (.773)     1.00
CAT 9         .19 (.111)     .28* (.019)    .31** (.010)   .24* (.043)    1.00
CAT 10        .27* (.026)    .18 (.142)     .20 (.101)     -.02 (.863)    .13 (.296)
CAT 11        .01 (.933)     -.10 (.408)    .06 (.605)     -.01 (.926)    -.08 (.539)
CAT 12        .14 (.241)     .16 (.179)     .22 (.070)     -.16 (.182)    .03 (.820)
CAT 13        .31** (.009)   .17 (.166)     .25* (.038)    .19 (.120)     .16 (.178)
CAT 14        .19 (.117)     .16 (.191)     .10 (.413)     .14 (.256)     .13 (.301)
CAT 15        .11 (.382)     .16 (.198)     .19 (.122)     .01 (.932)     .21 (.081)
CAT Total     .43** (.000)   .51** (.000)   .50** (.000)   .23 (.060)     .47** (.000)

              CAT 10         CAT 11         CAT 12         CAT 13         CAT 14
CAT 10        1.00
CAT 11        .08 (.538)     1.00
CAT 12        .25* (.038)    -.02 (.874)    1.00
CAT 13        .19 (.115)     .07 (.572)     .18 (.130)     1.00
CAT 14        .23 (.055)     .17 (.152)     .04 (.720)     .53** (.000)   1.00
CAT 15        .14 (.265)     .02 (.882)     .11 (.378)     .42** (.000)   .31* (.010)
CAT Total     .41** (.000)   .15 (.212)     .23 (.053)     .65** (.000)   .62** (.000)

              CAT 15         CAT Total
CAT 15        1.00
CAT Total     .56** (.000)   1.00
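The cell notation used in the correlation tables (rho, significance stars, two-tailed significance in parentheses) can be generated programmatically. The following is a minimal Python sketch, assuming the thresholds stated in the note to Table 2 (*p < .05; **p < .01). The function names `drop_leading_zero` and `format_cell` are hypothetical and not part of the original analysis.

```python
def drop_leading_zero(value, decimals):
    """Format a number the way the tables do: '.34' rather than '0.34'."""
    text = f"{value:.{decimals}f}"
    if text.startswith("0."):
        return text[1:]          # "0.34"  -> ".34"
    if text.startswith("-0."):
        return "-" + text[2:]    # "-0.09" -> "-.09"
    return text                  # e.g. "1.00" is left unchanged


def format_cell(rho, p):
    """Return a table cell such as '.34** (.005)' for a correlation and its p-value."""
    if p < 0.01:
        stars = "**"
    elif p < 0.05:
        stars = "*"
    else:
        stars = ""
    return f"{drop_leading_zero(rho, 2)}{stars} ({drop_leading_zero(p, 3)})"


# Three (rho, p) pairs taken from the Final Grade column of Appendix II.
for rho, p in [(0.34, 0.005), (0.26, 0.029), (-0.09, 0.466)]:
    print(format_cell(rho, p))
```

The same helper could be applied to the output of any correlation routine to produce cells in the layout used in Appendices II and III.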