Assessing Student Critical Thinking and Information Literacy Skills
Michael Fosmire (Libraries), Ruth Wertz (ENE), Senay Purzer (ENE), Stephanie Gardner (Biology), Brian
Dillman (Aviation Technology), Amy Van Epps (Libraries), Megan Sapp Nelson (Libraries), Karen Chang
(Nursing), Bob Jacko (CE)
Overview:
This project explored the effectiveness of using the Critical Thinking Assessment Test (CAT), developed
by Tennessee Tech University and available for purchase, and the Critical Engineering Literacy Test
(CELT), developed by researchers at Purdue University, for measuring student performance on critical
thinking and information literacy skills. These assessments were considered to be candidates for
providing baseline information for meeting core curriculum outcomes. The purpose of the study was to
characterize how these assessments could be implemented, what resources it would take and what value
instructors placed on the results of those assessments.
Summary:
The CAT and CELT critical thinking assessments both correlate with final course grades for the students
tested. The CAT takes much longer to administer and grade, and student motivation can be a problem,
especially on a re-test. Additionally, the CAT showed a significant bias related to English language
proficiency. The CELT, on the other hand, did not show bias related to English language proficiency,
gender, or race/ethnicity. The CELT takes significantly less time to administer and grade. It can be
computer graded, while the CAT’s short-answer format requires human grading. In the courses studied,
no significant gains were seen in the CAT. The CELT was not given as both a pre- and post-test.
The Tools:
Briefly, the CAT is a 15-question, scenario-based, pencil-and-paper, short-answer assessment. It takes
approximately an hour for students to complete. Because evaluating open-response answers involves a
qualitative judgment, each question is graded at least twice, and a third time if the first two
graders disagree on the score. It takes significant experience for graders to become proficient
and efficient in the grading process, but by the end of the project, the graders were able to finish
grading an assessment in 20 minutes.
The CELT is a scenario-based multiple-choice and short-answer assessment. It is available in online and
pencil-and-paper formats. The multiple-choice portion can be graded automatically, while the short answers
can be graded at the discretion of the assessor. The short answers ask students to state their
reasoning behind each response they chose. Thus, this assessment can provide more depth than a simple
multiple-choice assessment, but it can also be used in multiple-choice mode alone to obtain quick results
on student performance. Currently, the assessment has two scenarios, each with ten multiple-choice
questions. It takes students about 20 minutes to complete a scenario, and grading of the short-answer
questions can be done in 5 minutes.
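The grading figures above allow a rough comparison of grader effort. The following is a minimal sketch, not a measured result: it assumes that the 20-minute figure applies to each grading pass of a CAT, that every CAT receives exactly the minimum two passes, and that the 5-minute figure applies to each CELT administered. The counts are the 387 CAT and 189 CELT copies reported in the Results section; the function and constant names are illustrative.

```python
# Rough lower-bound estimate of grader time, using figures quoted in this report.
# Assumptions (not stated in the report): 20 minutes per CAT grading pass,
# exactly two passes per CAT, and 5 minutes of short-answer grading per CELT.
CAT_MINUTES_PER_PASS = 20
CAT_MIN_PASSES = 2
CELT_MINUTES_PER_TEST = 5


def grading_hours(n_tests: int, minutes_per_pass: float, passes: int = 1) -> float:
    """Total grader time in hours for n_tests assessments."""
    return n_tests * minutes_per_pass * passes / 60


cat_hours = grading_hours(387, CAT_MINUTES_PER_PASS, CAT_MIN_PASSES)
celt_hours = grading_hours(189, CELT_MINUTES_PER_TEST)

print(f"CAT:  at least {cat_hours:.0f} grader-hours")
print(f"CELT: about {celt_hours:.2f} grader-hours")
```

Under these assumptions the CAT needs at least 258 grader-hours against roughly 16 for the CELT short answers. Third grading passes and the slower pace early in the project would raise the CAT figure further.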
Rationale:
The CAT assessment has been validated in a number of environments, and it has been used on Purdue’s
campus as a standardized assessment of critical thinking as part of an NSF grant to improve student
outcomes in Biology. Three members of this project developed the CELT as a standard assessment that is
easy to administer and grade and that captures, in particular, the information
literacy and critical reading skills of students. The CELT was originally developed for an audience of
engineering undergraduates, so this project also sought to determine whether the assessment could be
used to measure these skills in allied STEM disciplines.
Results:
We conducted different combinations of assessments, corresponding to the interests of the instructors
involved in this project and the constraints of the classroom situations. Some courses gave the CAT as both a
pre- and post-assessment of student abilities, some paired the CAT and CELT assessments to examine
correlations between the instruments, and two courses gave the CAT pre- and post-assessments as well as
the CELT assessment. A summary of the deployment of assessments is given here:
Table 1: Distribution of Student Assessments
Course Section           CAT Pre   CAT Post   CELT v2.1   Paired CAT Pre & Post   Paired CELT v2.1 & CAT
Aviation Technology      104       --         82          --                      72
First Year Engineering   91        --         72          --                      69
Civil Engineering        56        51         --          47                      --
Nursing                  25        25         25          25                      25
Biology                  17        18         10          17                      9
In total, we distributed 387 copies of the CAT assessment and 189 copies of the CELT assessment.
Descriptive statistics for the CAT and CELT are included in Appendix I. Aviation Technology students
scored lower than those in Biology, First Year Engineering, and Nursing. The lower scores of Aviation
Technology students were consistent for both the CAT and CELT scores. There may have been some
motivational/deployment factors that led to those lower scores, as the instructor reported that, due to the
constraints of the course timing, students felt rushed by the CAT, and due to the number of assessments
given as part of the course, from a variety of sources, motivation for the CELT might have been impacted
by ‘survey fatigue.’
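As a quick consistency check, the totals quoted above can be recomputed from the per-course counts in Table 1. This is only an illustrative sketch; the dictionary layout and key names are assumptions made for the example.

```python
# Counts copied from Table 1; None marks an assessment that was not given.
table1 = {
    "Aviation Technology":    {"cat_pre": 104, "cat_post": None, "celt": 82},
    "First Year Engineering": {"cat_pre": 91,  "cat_post": None, "celt": 72},
    "Civil Engineering":      {"cat_pre": 56,  "cat_post": 51,   "celt": None},
    "Nursing":                {"cat_pre": 25,  "cat_post": 25,   "celt": 25},
    "Biology":                {"cat_pre": 17,  "cat_post": 18,   "celt": 10},
}


def column_total(column: str) -> int:
    """Sum one column of Table 1, treating 'not given' as zero."""
    return sum(row[column] or 0 for row in table1.values())


cat_total = column_total("cat_pre") + column_total("cat_post")
celt_total = column_total("celt")
print(cat_total, celt_total)  # 387 189
```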
The investigators gathered the most usable data from the First-Year Engineering course. For that course,
the Spearman's correlation matrices of final course grades against the CELT assessment items and against the CAT
assessment items are given in Appendices II and III, respectively. Overall, both assessments correlate
significantly with the final course grade (see Table 2). The CAT and CELT share approximately 10% of their
variance (i.e., rho squared). We found that 8 of the 18 CELT items had significant positive correlations
with the final grade, while only 2 of the 15 CAT items correlated significantly with the final grade.
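The "rho squared" figure can be reproduced directly from the coefficients reported in Table 2. The sketch below simply squares each Spearman coefficient; the variable names are illustrative, and treating a squared rank correlation as shared variance is the report's own interpretation rather than something the code establishes.

```python
# Spearman coefficients for First-Year Engineering (N = 69), copied from Table 2.
rho = {
    ("CAT", "CELT"): 0.34,
    ("CAT", "Final Grade"): 0.31,
    ("CELT", "Final Grade"): 0.49,
}

# Squaring each coefficient gives the proportion of (rank) variance shared.
shared = {pair: round(r ** 2, 3) for pair, r in rho.items()}

for (a, b), r2 in shared.items():
    print(f"{a} vs {b}: rho^2 = {r2:.3f} ({r2:.0%} shared variance)")
```

For the CAT and CELT totals, .34 squared is about .12, in line with the "approximately 10%" quoted above; the CELT shares roughly a quarter of its rank variance with the final grade.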
Furthermore, the combination of CAT and CELT total scores explained a significant proportion of variance
in final grades for First-Year Engineering students, R² = .25, F(2, 66) = 10.70, p < .001. A simultaneous
multiple linear regression showed that the CELT total score significantly predicted the final grade for
first-year engineering students, b = 0.69, t(66) = 3.63, p < .001. The CAT total score, however, did not
uniquely predict the students' final grade when considered in combination with the CELT.
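The reported R² and F statistic are linked by the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)) for a regression with k predictors and n cases. The sketch below applies it with k = 2 predictors (CAT and CELT totals) and n = 69 students, the N given for Table 2; the function names are illustrative.

```python
def f_from_r2(r2: float, k: int, n: int) -> float:
    """Overall F statistic for a regression with k predictors and n cases."""
    df_model, df_resid = k, n - k - 1
    return (r2 / df_model) / ((1 - r2) / df_resid)


def r2_from_f(f: float, k: int, n: int) -> float:
    """Invert the same identity: the R^2 implied by a given F statistic."""
    df_model, df_resid = k, n - k - 1
    return f * df_model / (f * df_model + df_resid)


f_value = f_from_r2(0.25, k=2, n=69)
r2_implied = r2_from_f(10.70, k=2, n=69)

print(f"F(2, 66) from R^2 = .25: {f_value:.2f}")
print(f"R^2 implied by F = 10.70: {r2_implied:.3f}")
```

With the rounded R² of .25 the identity gives F of about 11.0, close to the reported 10.70; the small gap may simply reflect rounding in the reported values.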
Table 2: Spearman's Correlation Matrix of All Assessments for First-Year Engineering Students
                Final Grade     CAT Total       CELT Total
Final Grade     1.00
CAT Total       .31** (.009)    1.00
CELT Total      .49** (.000)    .34** (.004)    1.00
Note: N = 69; significance (two-tailed) shown in parentheses; *p < .05; **p < .01.
Table 3 shows the correlations of the CELT total score with the CAT pre- and post-test total scores.
There was a significant correlation between the CELT and CAT for all courses except Aviation
Technology and the CAT post-test for Biology. With the small number of students in the paired pool for
the Biology course (N<10), it is not surprising that significant correlations were difficult to
detect.
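The effect of sample size on detecting a correlation can be illustrated with the common t approximation for a correlation coefficient, t = rho · sqrt((n − 2) / (1 − rho²)) with n − 2 degrees of freedom. This is a sketch under that assumption: the report does not state how its significance values were computed, and for very small samples statistical software may use exact tables instead, so results can differ slightly. The critical values are the standard two-tailed .05 values of Student's t.

```python
import math


def spearman_t(rho: float, n: int) -> float:
    """Approximate t statistic (df = n - 2) for testing rho against zero."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


def min_significant_rho(t_crit: float, n: int) -> float:
    """Smallest |rho| that reaches t_crit at sample size n (same approximation)."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)


# Two-tailed .05 critical values of Student's t, keyed by degrees of freedom.
T_CRIT = {6: 2.447, 7: 2.365, 23: 2.069}

bio_pre = spearman_t(0.71, 8)    # Biology, CAT pre-test vs. CELT (Table 3)
bio_post = spearman_t(0.49, 9)   # Biology, CAT post-test vs. CELT (Table 3)

print(f"Biology pre:  t = {bio_pre:.2f}, critical t(6) = {T_CRIT[6]}")
print(f"Biology post: t = {bio_post:.2f}, critical t(7) = {T_CRIT[7]}")
print(f"Minimum |rho| at n = 9:  {min_significant_rho(T_CRIT[7], 9):.2f}")
print(f"Minimum |rho| at n = 25: {min_significant_rho(T_CRIT[23], 25):.2f}")
```

Under this approximation, a correlation must be considerably larger in a group of 9 students than in a group of 25 before it reaches significance, which is consistent with the Biology post-test result (rho = .49) falling short while smaller coefficients in larger groups do not.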
Table 3: Nonparametric Spearman's Rho between CAT Total Scores and CELT Total Scores
                    CAT Pre-Test                           CAT Post-Test
Course (vs. CELT)   Spearman's Rho   Significance   N      Spearman's Rho   Significance   N
Biology             .71*             (.046)         8      .49              (.180)         9
Nursing             .57**            (.003)         25     .68**            (.000)         25
FYE                 .31**            (.009)         69     --               --             --
Aviation Tech       .20              (.094)         72     --               --             --
All Students        .41**            (.000)         174    .59**            (.000)         34
Looking at the trends in pre- vs. post-test scores for the CAT assessment, Biology (t=1.48, p=.159) and
Nursing (t=.26, p=.796) showed no significant change, while the Civil Engineering course showed a
significant decrease in scores (t=-5.65, p=.000). As mentioned above, there appeared to be a significant
lack of motivation for those students to take the post-test seriously. In particular, the scores on
items 1-3 increased (significantly for question 1), while all subsequent questions showed a decrease from
pre-test to post-test (significantly for 5 of the 12 items).
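The report does not name the test behind these t values. Assuming a paired-samples t-test on each student's post-test minus pre-test score, the calculation can be sketched as follows. The scores in the example are hypothetical and are not data from this study.

```python
import math
import statistics


def paired_t(pre, post):
    """Paired-samples t statistic and degrees of freedom (post minus pre)."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # sample SD of differences
    return mean_diff / std_err, n - 1


# Hypothetical CAT totals for four students (illustration only).
pre = [10, 12, 14, 16]
post = [11, 14, 15, 18]

t, df = paired_t(pre, post)
print(f"t({df}) = {t:.3f}")
```

With this sign convention a negative t, as reported for Civil Engineering, indicates that post-test scores were lower than pre-test scores.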
Demographic Results
The CAT assessment did show a significant correlation with proficiency in English (see Figure 1). This
showed up in the data for First-Year Engineering, as that course has a significant population of international
students. The CAT assessment requires a significant amount of reading and extracting information
from those readings, so it is conceivable that students with English as a second language would have
more difficulty navigating this assessment. The difference between native and non-native English speakers was even
more pronounced (Mean: Native=19.1, Non-native=14.00; p=.000). There was no significant difference
by ethnicity, except that Asian students scored significantly lower than white students (an effect that
disappeared once English proficiency was factored in). For First-Year Engineering students, females
scored significantly better than males (Mean: Female=19.58, Male=17.75; p=.017).
Figure 1: CAT assessment score as a function of English proficiency (CAT Report from Tennessee Tech)
For the CELT assessment, no significant difference between students of different ethnicities was found.
There was a significant difference between first-year engineering male and female students at the p<.05
level (Mean F=12.2; M=10.7; p<.047). In addition, there was no significant correlation (or apparent
trend) between students’ English proficiency and their CELT scores (see Table 4). Comparing native
speakers vs. non-native speakers, there was no significant difference either (Mean Native=11.25,
Non-native=10.33; p=.867).
Table 4: CELT Scores by English Proficiency Level
English proficiency    CELT 2.1 Mean Score   Std. Deviation   N
1                      11.000                --               1
2                      11.625                2.9262           4
3                      10.000                3.1379           14
4                      10.133                2.2398           15
5                      11.363                3.8884           131
Total                  11.139                3.6877           165
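The Total row of Table 4 can be checked against the group rows: the overall mean should equal the N-weighted average of the group means. A minimal sketch, with illustrative variable names:

```python
# (mean CELT 2.1 score, N) for each English proficiency level, copied from Table 4.
groups = {
    1: (11.000, 1),
    2: (11.625, 4),
    3: (10.000, 14),
    4: (10.133, 15),
    5: (11.363, 131),
}

total_n = sum(n for _, n in groups.values())
pooled_mean = sum(mean * n for mean, n in groups.values()) / total_n

print(f"N = {total_n}, weighted mean = {pooled_mean:.2f}")
```

The weighted mean agrees with the reported total of 11.139 to within the rounding of the group means. It also shows how heavily the total is driven by level 5, which accounts for 131 of the 165 students.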
Discussion/Impact:
This project indicated that there is a fair amount of overlap between information literacy and critical
thinking skills as measured by the CAT and CELT assessments.
CAT might be used effectively in small-course settings or as a sampling mechanism for measuring the
baseline of student performance. Although Tennessee Tech indicated no difficulty with test-retest, the
project participants noticed a challenge with maintaining motivation for students in the second
implementation. In particular, since there is only one form of the assessment, having the students work
through the same questions in an hour-long assessment led to instances of fatigue with the assessment.
It was especially evident in the results for the Civil Engineering course, where scores actually went up for
the first few questions from pre- to post-test, but then the students seemed to lose patience, and the
scores on the remaining items dropped. The fact that the CAT does not have a ‘Form B’ means that students have
to take the exact same assessment for re-testing, and this seems to be a significant barrier for students.
The CELT can be used on a larger scale. The depth of information gathered is not equivalent, but it is
more scalable. There does not seem to be a ‘ceiling effect’ with this assessment, but it is also still
undergoing development and does not have the same psychometric quality as the CAT. The CELT did
not show ethnicity/language/gender differences at the same level as the CAT did. Additionally, the Biology,
Nursing, and Engineering students' scores were very similar, indicating that there isn’t a strong
‘engineering-bias’ to the assessment as currently configured.
One question for further investigation is the relationship of verbal skills to the outcomes of these
assessments. The CAT contains only one simple calculation, and the CELT contains no calculations, so the
focus is on critical reasoning rather than computation. While the assessments ask students to extract
data and information from figures and text, both assessments focus on qualitative arguments. It would
be helpful, for example, to compare SAT Verbal scores with results of CAT and CELT assessments to see
if this conjecture is borne out.
Next Steps:
Further development and dissemination of the CELT assessment tool has been funded by the NSF
(Award: 1245998-DUE). In addition to instructors at Purdue, we have faculty from approximately a
dozen external institutions interested in using the assessment in their curricula. The results of their
implementations will be fed back into the development as well as used as generalizability evidence for the
assessment tool.
Appendix I: Descriptive Statistics for CELT and CAT
Table A.1: Descriptive Statistics for CELTv.2.1
Course Section           N     Min     Max (20)   M       SD
Aviation Technology      82    2.00    16.00      8.76    3.25
Biology                  10    7.50    17.50      12.35   2.60
First Year Engineering   72    5.50    19.50      12.83   2.64
Nursing                  25    8.00    20.00      13.26   2.92
All Students             189   2.00    20.00      11.10   3.60
Table A.2: Descriptive Statistics for CAT Pre-Tests
Course Section           N     Min   Max   M       SD
Aviation Technology      104   5     32    16.30   5.19
First Year Engineering   91    6     32    20.19   6.19
Civil Engineering        56    10    30    18.93   4.93
Nursing                  25    10    34    19.04   5.77
Biology                  17    13    31    19.88   5.16
All Students             291   5     34    18.45   5.74
Table A.3: Descriptive Statistics for CAT Post-Tests
Course Section           N     Min   Max   M       SD
Civil Engineering        51    4     27    15.75   5.54
Nursing                  25    8     31    19.24   6.43
Biology                  18    13    33    21.28   6.15
All Students             94    4     33    17.73   6.33
Appendix II: Spearman's Correlation Matrix of Final Course Grades to CELTv2.1 Items for First Year Engineering Students
            Grade  CELT1  CELT2  CELT3  CELT4  CELT5  CELT6  CELT7  CELT8  CELT9  CELT10 CELT11 CELT12 CELT13 CELT14 CELT15 CELT16 CELT17 CELT18 Total
Final Grade 1.00
CELT1       .34**  1.00
            (.005)
CELT2       .26*   .18    1.00
            (.029) (.142)
CELT3       .15    -.05   .31**  1.00
            (.205) (.704) (.010)
CELT4       .05    .26*   .04    .06    1.00
            (.663) (.033) (.763) (.608)
CELT5       .19    .31*   .01    -.06   .07    1.00
            (.109) (.010) (.942) (.621) (.556)
CELT6       .33**  .41**  .03    .15    .07    .37**  1.00
            (.006) (.000) (.781) (.205) (.540) (.002)
CELT7       -.09   .03    .06    .00    .23    .04    -.11   1.00
            (.466) (.800) (.633) (.989) (.053) (.758) (.366)
CELT8       .27*   -.11   .47**  .09    -.05   -.09   -.25   .12    1.00
            (.025) (.349) (.000) (.467) (.670) (.477) (.038) (.323)
CELT9       .12    .10    .17    .21    .06    .14    -.01   .11    -.02   1.00
            (.326) (.416) (.152) (.081) (.609) (.266) (.952) (.369) (.899)
CELT10      .10    .13    .08    .19    .19    .20    .10    .18    .11    .26*   1.00
            (.414) (.269) (.528) (.113) (.113) (.105) (.408) (.135) (.358) (.030)
CELT11      .24*   .05    .11    -.02   -.02   .11    -.04   .11    .08    -.02   -.03   1.00
            (.046) (.668) (.347) (.839) (.839) (.351) (.760) (.387) (.525) (.865) (.797)
CELT12      .31*   -.04   .05    .21    .12    -.01   .14    -.09   .23    -.14   .12    .18    1.00
            (.011) (.745) (.655) (.076) (.340) (.951) (.251) (.481) (.053) (.259) (.323) (.142)
CELT13      .23    .26*   .10    .13    .01    .13    .38**  -.04   .06    -.01   -.03   -.12   .21    1.00
            (.063) (.031) (.391) (.298) (.949) (.278) (.001) (.749) (.609) (.957) (.836) (.333) (.079)
CELT14      .09    -.04   .10    .04    -.13   -.07   -.20   -.02   .32**  .03    -.07   .05    -.06   -.04   1.00
            (.461) (.738) (.434) (.730) (.277) (.542) (.092) (.890) (.007) (.806) (.545) (.701) (.619) (.736)
CELT15      .12    .09    .22    .24*   .24*   .16    .11    .13    .09    .19    .11    -.11   -.13   .22    .08    1.00
            (.311) (.456) (.074) (.049) (.049) (.181) (.366) (.286) (.448) (.117) (.384) (.387) (.286) (.070) (.536)
CELT16      .26*   .30*   .14    .21    .15    -.07   .07    -.01   -.14   .07    .01    -.01   .14    .18    -.10   .08    1.00
            (.031) (.011) (.241) (.086) (.228) (.566) (.557) (.910) (.245) (.557) (.942) (.940) (.262) (.144) (.410) (.538)
CELT17      .11    .10    .13    -.04   .05    .09    .09    -.06   -.03   .17    .14    -.02   .07    .17    -.10   .24*   .14    1.00
            (.352) (.396) (.283) (.763) (.659) (.442) (.464) (.633) (.815) (.156) (.235) (.853) (.580) (.154) (.434) (.045) (.245)
CELT18      .26*   .26*   .19    .10    .04    -.07   .11    -.08   .04    .10    -.22   -.07   .16    .02    .19    .02    .08    .09    1.00
            (.031) (.031) (.125) (.411) (.730) (.542) (.358) (.536) (.727) (.394) (.074) (.562) (.177) (.883) (.121) (.890) (.504) (.486)
CELT Total  .49**  .52**  .47**  .41**  .40**  .32**  .30*   .29*   .26*   .35**  .26*   .20    .26*   .39**  .23    .48**  .33**  .31**  .38**  1.00
            (.000) (.000) (.000) (.000) (.001) (.007) (.011) (.016) (.030) (.003) (.033) (.101) (.028) (.001) (.054) (.000) (.006) (.011) (.001)
Note: Columns follow the same order as the rows (Grade = Final Grade; Total = CELT Total). Significance is shown in parentheses below each coefficient.
Appendix III: Spearman's Correlation Matrix of Final Course Grades to CAT Items for First Year Engineering Students (N = 69)
            Grade  CAT 1  CAT 2  CAT 3  CAT 4  CAT 5  CAT 6  CAT 7  CAT 8  CAT 9  CAT 10 CAT 11 CAT 12 CAT 13 CAT 14 CAT 15 Total
Final Grade 1.00
CAT 1       .07    1.00
            (.587)
CAT 2       .21    .02    1.00
            (.088) (.862)
CAT 3       .20    .09    .25*   1.00
            (.097) (.464) (.036)
CAT 4       .32**  .11    .23    .40**  1.00
            (.008) (.374) (.058) (.001)
CAT 5       .14    -.02   .24*   .13    .21    1.00
            (.247) (.894) (.045) (.294) (.085)
CAT 6       .08    .17    .21    .23    .34**  .46**  1.00
            (.528) (.165) (.083) (.054) (.004) (.000)
CAT 7       .24*   .20    .22    .41**  .42**  .22    .16    1.00
            (.044) (.105) (.068) (.000) (.000) (.074) (.194)
CAT 8       .06    .26*   .12    .09    .08    -.03   .17    .04    1.00
            (.619) (.031) (.337) (.446) (.527) (.826) (.161) (.773)
CAT 9       .15    .07    .19    .33**  .30*   .19    .28*   .31**  .24*   1.00
            (.216) (.549) (.117) (.006) (.013) (.111) (.019) (.010) (.043)
CAT 10      .13    -.03   .16    .04    .20    .27*   .18    .20    -.02   .13    1.00
            (.286) (.810) (.196) (.721) (.106) (.026) (.142) (.101) (.863) (.296)
CAT 11      -.04   .20    -.12   .03    -.10   .01    -.10   .06    -.01   -.08   .08    1.00
            (.714) (.101) (.308) (.812) (.428) (.933) (.408) (.605) (.926) (.539) (.538)
CAT 12      .09    -.13   -.03   .03    .30*   .14    .16    .22    -.16   .03    .25*   -.02   1.00
            (.441) (.304) (.784) (.789) (.012) (.241) (.179) (.070) (.182) (.820) (.038) (.874)
CAT 13      .14    .13    .19    .20    .36**  .31**  .17    .25*   .19    .16    .19    .07    .18    1.00
            (.262) (.302) (.114) (.107) (.003) (.009) (.166) (.038) (.120) (.178) (.115) (.572) (.130)
CAT 14      .18    .36**  -.02   .15    .19    .19    .16    .10    .14    .13    .23    .17    .04    .53**  1.00
            (.136) (.002) (.900) (.232) (.110) (.117) (.191) (.413) (.256) (.301) (.055) (.152) (.720) (.000)
CAT 15      .16    -.02   .11    .25*   .52**  .11    .16    .19    .01    .21    .14    .02    .11    .42**  .31*   1.00
            (.182) (.888) (.374) (.037) (.000) (.382) (.198) (.122) (.932) (.081) (.265) (.882) (.378) (.000) (.010)
CAT Total   .31**  .31**  .39**  .55**  .69**  .43**  .51**  .50**  .23    .47**  .41**  .15    .23    .65**  .62**  .56**  1.00
            (.009) (.009) (.001) (.000) (.000) (.000) (.000) (.000) (.060) (.000) (.000) (.212) (.053) (.000) (.000) (.000)
Note: Columns follow the same order as the rows (Grade = Final Grade; Total = CAT Total). Significance is shown in parentheses below each coefficient.