Analysis of the 2006 CAAP Assessment of General Education: Critical Thinking

Submitted by
Lou Milanesi
Executive Summary
UW-Stout administered the ACT CAAP Critical Thinking test as a standardized,
nationally validated measure of general education preparedness during the spring
semester of 2006. A total of 488 test scores were collected (97.6% of the target sample of
500) from upper division students (97.6% juniors and seniors; 46.8% females, 53.2%
males) whose instructors volunteered their classes for participation in the assessment.
Raw data from ACT were analyzed using the national ranking of Stout participants
(National Percent at or Below) to better understand the performance of our students
relative to national norms.
Strengths
• Overall, UW-Stout students’ ranking scores were statistically equal to the national
norm
• Mean ranking scores for the UW-Stout students who reported exerting at least
“moderate effort” (74.2% of participants) were significantly higher than the national
norm (54.1 vs. 50.0)
Opportunities for Improvement
• Raise the target benchmark to above the national norm
• Increase effort levels of UW-Stout students taking the test
Action Plan for Improvement
• Work to improve the context of the test administration such that students perceive
a clear motivation to do well on the test.
• Investigate raising the “stakes” of the testing from “low” to “moderate”
Detailed Results and Analyses
Synopsis of Results
As Table 1 illustrates, 49.4% of Stout students scored at or above the national median
score, and Figure 1 shows that the mean of Stout rankings was 49.7 (SD = 27.0);
combined, these demonstrate that the Stout sample is statistically equivalent to the
national sample (where, by definition, mean = median = 50.0). We currently find no
indication that the sampling method used contributed to overestimating Stout student
abilities; instead, given evidence described below regarding the impact of individual
motivation on performance, we suggest that these data represent a conservative estimate
of Stout student abilities when compared to the overall national data.
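The equivalence claim can be checked from summary statistics alone. Below is a minimal sketch in Python (using SciPy) of a one-sample t-test against the national mean of 50.0; the means, standard deviations, and sample sizes are taken from Figures 1 and 1A in the appendix.

    from math import sqrt
    from scipy import stats

    def one_sample_t(mean, sd, n, mu0=50.0):
        """Two-sided one-sample t-test computed from summary statistics."""
        t = (mean - mu0) / (sd / sqrt(n))
        p = 2 * stats.t.sf(abs(t), df=n - 1)
        return t, p

    # All Stout participants (Figure 1): t is about -0.25, p is about 0.80,
    # so the sample is statistically equal to the national norm.
    print(one_sample_t(49.69, 26.992, 488))

    # Participants reporting at least moderate effort (Figure 1A):
    # t is about 3.0, p < .01, significantly above the norm.
    print(one_sample_t(54.07, 25.597, 362))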
We analyzed differential performance across subgroups defined by several demographic
grouping variables available within the data captured by the CAAP instrument. These
analyses found no differences in performance on the central critical thinking measure
when we segmented by gender, junior/senior class standing, or whether students enrolled
at Stout as freshmen. There was a significant difference in mean performance
between those who spoke English as their first language (mean = 50.5, SD = 20.7) and
those who did not speak English as their first language (mean = 16.6, SD = 17.7);
however, there were only 11 individuals in the latter group. We were unable to perform
statistical analyses based on ethnicity due to the homogeneity of the Stout population and
the overlap with English as a second language among the few non-Caucasians.
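The English-language comparison can likewise be reproduced from the reported summary statistics. The sketch below uses Welch’s unequal-variance t-test; the group size of 477 for native speakers is an assumption (the report gives only the total sample of 488 and the 11 non-native speakers).

    from scipy import stats

    # Welch's t-test from summary statistics (unequal variances, very unequal ns)
    t, p = stats.ttest_ind_from_stats(
        mean1=50.5, std1=20.7, nobs1=477,  # English as first language (assumed n)
        mean2=16.6, std2=17.7, nobs2=11,   # English not first language
        equal_var=False,
    )
    print(t, p)  # t is roughly 6.3 -> significant, but n = 11 warrants caution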
We next performed multivariate analyses and found that both self-reported motivation
and self-reported GPA were significantly and independently related to performance on
the CAAP, with greater effort and higher GPA associated with higher levels of
performance on the critical thinking test. Regarding effort, Figure 2 indicates how much
effort Stout participants reported investing in taking the test, and Figure 3 shows the
relationship between effort and performance. On average, those who invested at least
moderate effort performed at or above the mean of the national sample, whereas those
who invested little or no effort (or declined to reveal their effort) fell consistently
short of the national average. Similarly for GPA, Figure 4 indicates
participants’ reported GPA levels, and Figure 5 displays the direct linear relationship
between GPA and performance on the CAAP test. Finally, Figure 6 presents the combined
effects of effort and GPA on performance and on standing relative to the national mean
(50.0).
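As an illustration of the multivariate analysis described above, the following sketch fits an ordinary least squares model predicting ranking from effort and GPA. The actual data file and coding scheme are not part of this report, so the example generates synthetic records (effort coded 0 through 3, GPA category midpoints) purely to show the method.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 488
    effort = rng.integers(0, 4, size=n)            # 0 = no effort ... 3 = tried my best
    gpa = rng.choice([2.25, 2.75, 3.25, 3.75], n)  # GPA category midpoints
    rank = 5 * effort + 12 * gpa + rng.normal(0, 20, n)  # synthetic outcome

    # OLS: rank ~ intercept + effort + gpa
    X = np.column_stack([np.ones(n), effort, gpa])
    beta, *_ = np.linalg.lstsq(X, rank, rcond=None)
    resid = rank - X @ beta
    r2 = 1 - resid.var() / rank.var()
    print(f"effort coef = {beta[1]:.1f}, GPA coef = {beta[2]:.1f}, R^2 = {r2:.2f}")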
Conclusions
Strengths: Simply stated, we found consistent support for asserting that, on the whole,
UW-Stout graduates are on par with their national peers regarding their critical thinking
abilities. Performance on the CAAP Critical Thinking test triangulated well with other
self-reported measures including GPA and individual effort invested in taking the test.
Moreover, these data replicate earlier results obtained using the same instrument in 2004
that, on the average, placed UW-Stout graduates’ critical thinking performance at or
slightly above the national average.
Opportunities for Improvement: While this analysis found that, as a group, UW-Stout
graduates match the average critical thinking abilities of their peers across the
nation, it also suggests that there is ample room for improvement. Future benchmarking
could include raising the average performance of Stout graduates to an above-average
target. This goal would be supported by simultaneously working to reduce the variability
observed in individual student performance, such that a greater percentage of
individuals score at or above the national average.
As described above, we found clear evidence linking individual motivation to
performance. Curiously, we found evidence that this effect seems to be mostly
attributable to the males in the test sample. Drilling down by gender and controlling for
the influence of GPA, we found that for females self-reported motivation had a near-zero
influence in predicting performance on the test (0.2% of variance); whereas under the same control
conditions, we found that for males self-reported motivation predicted 13.5% of test
performance. This gender effect is particularly troubling in that the males in the sample
also reported significantly lower GPA achievements than their female counterparts.
Therefore, since both genders are exposed to the same instruction, we conclude that
meaningful opportunities for improvement need to extend beyond curriculum revision
(e.g., to recruiting, engagement, and advising), though by no means exclude it.
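The gender comparison reported here is an R-squared increment: fit performance on GPA alone, then add self-reported motivation, and take the change in variance explained within each gender subsample. A minimal sketch of that computation (NumPy only; the inputs are arrays for one subsample):

    import numpy as np

    def r_squared(X, y):
        """R^2 from an OLS fit of y on X (X includes an intercept column)."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return 1 - resid.var() / y.var()

    def motivation_delta_r2(gpa, effort, rank):
        """Increment in R^2 when effort is added to a model containing GPA."""
        ones = np.ones_like(rank)
        base = r_squared(np.column_stack([ones, gpa]), rank)
        full = r_squared(np.column_stack([ones, gpa, effort]), rank)
        return full - base

    # Applied separately to the female and male subsamples, this yields the
    # 0.2% (females) and 13.5% (males) figures reported above.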
Performance on the ACT CAAP Critical Thinking test should be viewed holistically with
other general education data (such as the 2005 ACT writing assessment) to inform an
overarching strategic action plan for general education improvement: a plan that, where
appropriate, includes curriculum revision and extracurricular processes, each preceded
by systematic improvements to assessment practices. This analysis indicates that, while
UW-Stout students’ current performance on the CAAP is equal to the national norm, their
overall true ability is likely above the national norm but masked by inconsistencies in
the effort invested by students taking the test.
“Low Stakes” versus “High Stakes” Testing: Currently, UW-Stout employs a very low
stakes approach to general education testing; one where individual instructors “volunteer”
students in the sections of courses they teach as participants, and testing is conducted
solely within those classes. Students taking the CAAP fully understand that they will
experience no consequences for poor performance, whereas only a few appreciate the
potential advantages of doing well on the test. In contrast, a high stakes approach to
general education testing mandates performance benchmarks on an identified test and
imposes significant consequences on those who fall short of meeting them (denial of
admission, failure to advance to the next higher level of progress and/or failure to
graduate). Our inquiry to ACT revealed that a few institutions have already moved to
high stakes models. For example, all six state institutions in South Dakota test juniors
and provide intervention if necessary. San Jose State University and Fresno State
University also use ACT CAAP and both require a passing grade on the ACT CAAP
Writing test before students are allowed into an upper-level writing class which is
required for graduation. Our contact at ACT was quick to point out that ACT does not
provide a set cut score; however, in his experience most institutions set their cut scores at
one standard deviation below the national mean.
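As a purely hypothetical illustration of such a cut (ACT provides no set cut score, and the numbers below are placeholders rather than actual CAAP scale values): one standard deviation below the mean of a normal distribution sits near the 16th percentile, so a cut at that level would flag roughly one student in six nationally.

    from scipy.stats import norm

    national_mean, national_sd = 60.0, 5.0  # placeholder scaled-score values
    cut = national_mean - national_sd       # one SD below the national mean
    print(cut, norm.cdf(-1.0))              # 55.0, about 0.159 (16th percentile)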
Mandatory Assessment Days as a “Medium Stakes” testing alternative:
Several institutions have mandated “Assessment Days” wherein a variety of institutional
data are collected. Participation is mandatory, and students are individually assigned to a
specific assessment activity for the day. Through this assignment/sampling approach,
institutional assessment is simultaneously conducted at a variety of levels (program,
college/school, university) and participation is monitored and enforced. The “stakes” for
the student are contractually defined by the individual institution as a condition of
acceptance/enrollment to the school.
Examples of Assessment Day Models
Winona State University: http://www.winona.edu/AIR/info/info.htm
University of Southern Indiana: http://www.usi.edu/depart/instires/AssessmentDay.asp
James Madison University: http://www.jmu.edu/assessment/JMUAssess/Aday_Overview.htm
Eastern New Mexico University: http://www.enmu.edu/academics/excellence/assessment/students/day/index.shtml
Geneva College: http://www.geneva.edu/object/assess_day01.html
In my opinion, this approach could work well at UW-Stout if the following were
accomplished.
1. Establish a highly integrated and aligned central model of assessment, one that
a. Identifies all mandated and ongoing needs for information/data for both
external and internal stakeholders
b. Simplifies and better coordinates existing assessment practices
c. Defines a scope of assessment activities broad enough to engage a
significant number of students in the Assessment Day activities
2. Use the model developed above to determine the optimal timing of data
collection.
3. Support the process adequately through shared governance.
Suggested immediate improvements to the current approach to testing:
While the approaches to testing described above will need to be studied and discussed
before they can be deployed at UW-Stout, some immediate improvements can be made to
existing assessment practices to better engage students in the testing activity. For
example, the instructors who volunteer their students need to be better engaged in the
testing activity.
Historically, some instructors have been quite enthusiastic about the assessment activities
and openly endorsed them to their students prior to testing. However, other instructors
described the assessment activity as something that “had to be done” to satisfy others.
These differences in instructor attitudes likely affect student effort, which is
directly linked to performance on the test. Instructors could also help build
individual incentive to do well on the CAAP assessment by calling attention to the
recognition certificates ACT provides for above-average performance and how these
certificates can enhance student employment portfolios. Currently, we call attention to
the chance to win a prize that is randomly drawn from all participants. This approach has
two basic shortcomings. First, as in any raffle, most individuals do not expect to win;
second, and more importantly, eligibility to win is not contingent upon test performance.
Since the testing protocol described by ACT requires reading a standard script that does
little to inspire students, I suggest we provide instructors (or spokespersons) with a
presentation that emphasizes the personal benefits of doing well on the test.
Appendix
Table 1
Distribution of National Percent at or Below Rankings for Stout Participants in the
2006 ACT CAAP Critical Thinking Assessment

National Percent
at or Below     Frequency   Percent   Valid Percent   Cumulative Percent
 0                   1         0.2        0.2               0.2
 1                   2         0.4        0.4               0.6
 2                   4         0.8        0.8               1.4
 3                   8         1.6        1.6               3.1
 6                   6         1.2        1.2               4.3
 8                  11         2.3        2.3               6.6
11                  12         2.5        2.5               9.0
15                  23         4.7        4.7              13.7
19                  17         3.5        3.5              17.2
24                  18         3.7        3.7              20.9
29                  36         7.4        7.4              28.3
34                  56        11.5       11.5              39.8
40                  24         4.9        4.9              44.7
45                  29         5.9        5.9              50.6
50                  33         6.8        6.8              57.4
61                  27         5.5        5.5              62.9
66                  38         7.8        7.8              70.7
72                  55        11.3       11.3              82.0
79                  18         3.7        3.7              85.7
85                  15         3.1        3.1              88.7
90                  20         4.1        4.1              92.8
94                  14         2.9        2.9              95.7
98                  13         2.7        2.7              98.4
99                   8         1.6        1.6             100.0
Total              488       100.0      100.0
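For readers who want to check the figures quoted in the body of the report, the summary statistics can be recomputed directly from the frequency distribution above:

    # Recompute the quoted statistics from Table 1's frequency distribution
    ranks = [0, 1, 2, 3, 6, 8, 11, 15, 19, 24, 29, 34, 40, 45, 50,
             61, 66, 72, 79, 85, 90, 94, 98, 99]
    freqs = [1, 2, 4, 8, 6, 11, 12, 23, 17, 18, 36, 56, 24, 29, 33,
             27, 38, 55, 18, 15, 20, 14, 13, 8]

    n = sum(freqs)                                       # 488
    mean = sum(r * f for r, f in zip(ranks, freqs)) / n  # 49.69 (Figure 1)
    at_or_above = sum(f for r, f in zip(ranks, freqs) if r >= 50) / n
    print(n, round(mean, 2), round(100 * at_or_above, 1))  # 488 49.69 49.4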
Figure 1
Distribution of National Placement (Percent at or Below Individual's Score) for All
UW-Stout Students Tested
[Histogram; x-axis: National Percent at or Below (0 to 100), y-axis: Frequency.
Mean = 49.69, Std. Dev. = 26.992, N = 488]
Figure 1A
Distribution of National Placement (Percent at or Below Individual's Score) for Those
Reporting at Least “Moderate” Effort
[Histogram; x-axis: National Percent at or Below (0 to 100), y-axis: Frequency.
Mean = 54.07, Std. Dev. = 25.597, N = 362]
Figure 2
Self-reported Levels of Motivation within the Stout CAAP Participants
[Percentage of participants by response: No Report of Effort = 1.6%, Gave no effort =
8.0%, Gave little effort = 16.2%, Gave moderate effort = 45.5%, Tried my best = 28.7%]
Figure 3
Mean CAAP Performance across Self-reported Levels of Motivation
[Mean National Percent at or Below by self-reported effort level: Did not report
effort = 32.3, Gave no effort = 9.0, Gave little effort = 42.3, Gave moderate effort =
51.2, Tried my best = 58.6]
Table 2
Statistically Significant CAAP Performance Differences across Self-reported Levels of
Motivation

Effort Level Group      Statistically Lower     Statistically Equal     Statistically Higher
                        than effort level       to effort level         than effort level
NO effort               (none)                  (none)                  ALL OTHER GROUPS
LITTLE effort           NO                      Did not report          MODERATE & BEST
MODERATE effort         NO & LITTLE             Did not report & BEST   (none)
BEST effort             NO, LITTLE &            MODERATE                (none)
                        Did not report
Did not report effort   NO                      LITTLE & MODERATE       BEST
Figure 4
Self-reported GPA within the Stout CAAP Participants
[Percentage of participants by cumulative GPA category: 2.00 to 2.50 = 8.8%, 2.51 to
3.00 = 20.6%, 3.01 to 3.50 = 29.0%, 3.51 or above = 41.5%]
Figure 5
Mean CAAP Performance across Self-reported Levels of GPA
[Mean National Percent at or Below by cumulative GPA category: 2.00 to 2.50 = 33.1,
2.51 to 3.00 = 41.0, 3.01 to 3.50 = 52.5, 3.51 or above = 62.7]
Table 3
Statistically Significant CAAP Performance Differences across Self-reported GPA Levels

GPA Level Group   Statistically Lower   Statistically Equal   Statistically Higher
                  than GPA level        to GPA level          than GPA level
2.00 to 2.50      (none)                2.51 to 3.00          3.01 through 3.51 or above
2.51 to 3.00      (none)                2.00 to 2.50          3.01 through 3.51 or above
3.01 to 3.50      2.00 through 3.00     (none)                3.51 or above
3.51 or above     2.00 through 3.50     (none)                (none)
Figure 6
Mean CAAP Performance across Self-reported Levels of Effort, Clustered by Self-reported
Levels of GPA
[Clustered bar chart; x-axis: self-reported effort level, y-axis: mean National Percent
at or Below, with bars grouped by GPA level and shown relative to the national mean of
50.0]