STLF Report to CWSEI and Mathematics Department

STLF: Sandra Merchant
Period: 05/02/12 – 15/05/12
Submitted: 15/05/12
Specific activities performed by STLF
1) Professional development
• Attended the morning portion of the CWSEI end-of-year event (Apr 20).
2) MATH SEI general meetings/activity
• Met with Director Costanza Piccolo to review the status of my projects and plan for work
during my part-time employment (Jan 6, Mar 2).
3) Course-specific meetings/activities
Tracking Proof Skills Project (MATH 220 – Mathematical Proof and subsequent courses)
1. Started writing a paper about the proof skills diagnostic test. The paper will outline how
the test was developed and how it has been used so far to assess learning in MATH 220, and it
will include an item analysis of each question on the diagnostic test. I performed the item
analysis on a variety of subsets of the data, including pre-test and post-test data, analyzed
section by section, term by term, and with all terms pooled. The results of the item analysis
differ slightly depending on the data treatment. In our opinion the most relevant treatment is
the pooled pre-test, as this represents the general incoming population for MATH 220, whereas
post-test results may depend on instructional differences within the course. The item analysis
for the pooled pre-test is summarized in the following table:
Test item                  1    2    3    4a   4b   4c   4d   5    6    7    8    9    10   11   12   13   14   15
Difficulty index (P)       .60  .56  .70  .84  .62  .90  .95  .71  .77  .54  .85  .71  .59  .37  .24  .08  .43  .34
Discrimination index (D)   .36  .52  .50  .27  .40  .25  .10  .40  .33  .47  .31  .66  .52  .55  .41  .29  .61  .51
Item-to-total correlation  .08  .25  .23  .15  .09  .16  .04  .20  .15  .18  .19  .33  .21  .20  .18  .25  .26  .09
Difficulty index (P): the proportion of responses that are correct.
Discrimination index (D): an extreme-group method is used. The “high performing” group
consists of the top 21% of scores on the full test, and the “low performing” group consists
of the lowest 21% of scores on the full test. D is the difference between the proportion correct
in the high performing group and the proportion correct in the low performing group.
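The three statistics in the table can be computed along the following lines. This is a minimal sketch, not the analysis actually used for this report. It assumes a 0/1 score matrix (rows are students, columns are items). The function name `item_analysis` and its parameters are hypothetical. Ties at the group boundaries are broken by sort order. The item-to-total correlation is assumed to be the Pearson correlation between an item score and the uncorrected total score, which the report does not specify.

```python
import numpy as np

def item_analysis(scores, group_fraction=0.21):
    """Return (difficulty, discrimination, item_total) arrays, one value per item.

    scores: 2-D array-like of 0/1 values, rows = students, columns = items.
    group_fraction: share of students in each extreme group (0.21 in the report).
    """
    scores = np.asarray(scores, dtype=float)
    n_students, n_items = scores.shape
    totals = scores.sum(axis=1)

    # Difficulty index P: proportion of correct responses for each item.
    difficulty = scores.mean(axis=0)

    # Extreme groups: lowest and highest share of students by total score.
    n_group = max(1, int(round(group_fraction * n_students)))
    order = np.argsort(totals, kind="stable")
    low, high = order[:n_group], order[-n_group:]

    # Discrimination index D: proportion correct (high) minus proportion correct (low).
    discrimination = scores[high].mean(axis=0) - scores[low].mean(axis=0)

    # Item-to-total correlation (assumed: Pearson r with the uncorrected total).
    # An item that every student answers the same way yields NaN here.
    item_total = np.array(
        [np.corrcoef(scores[:, j], totals)[0, 1] for j in range(n_items)]
    )
    return difficulty, discrimination, item_total


# Tiny made-up example: 4 students, 3 items (not data from the report).
P, D, R = item_analysis([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]])
print(P, D, R)
```

With real class sizes the 21% cut rarely falls on a whole number of students, so the rounding rule and the handling of tied total scores are choices that should be stated alongside the results.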
2. Revised the basic proof skills diagnostic test according to the results of the item analysis.
Specifically, we removed two questions (4c and 4d) that were deemed too easy and that had
low discriminatory power. In addition, several questions were reworded and reformatted for
clarity, and the problems were grouped into two parts, each clearly labelled as requiring
either a single answer or multiple answers (in the term 2 post-test many students circled
multiple answers for questions with only one correct answer). Finally, we changed the
scoring method for questions 13, 14 and 15. The dichotomous scoring for these questions gave
a score of 0 if any errors were present (there are 8 parts to the question). The new scoring
method gives a score of 1 when at most one error is present. The item analysis for the
diagnostic test with these changes is shown in the table below (note: this table includes
Summer 2012 pre-test data; the previous table does not):
Test item                  1    2    3    4a   4b   5    6    7    8    9    10   11   12   13   14   15
Difficulty index           .62  .59  .69  .87  .61  .70  .78  .56  .87  .71  .62  .37  .26  .29  .50  .46
Discrimination index       .37  .60  .50  .23  .45  .42  .38  .52  .30  .62  .60  .55  .45  .45  .40  .33
Item-to-total correlation  .12  .26  .23  .13  .16  .20  .18  .19  .20  .32  .26  .19  .24  .26  .18  .05
We are now satisfied with the test items, as there is a broad range of difficulty levels and nearly
all items exceed the desired discrimination index of 0.30. In addition, the test as a whole has a
broad distribution of test scores (Ferguson's delta is 0.95) and reasonable overall difficulty
(mean score on the pre-test is 59%). We also think the test has high test-retest reliability, since
the correlation coefficient between the difficulty indices for the same question in Term 1 vs.
Term 2 is r = 0.92 (r² = 0.842).
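For reference, Ferguson's delta measures how evenly the total scores are spread over the possible score values: it is 1 when every possible score occurs equally often and 0 when all students receive the same score. The sketch below uses the standard formula, delta = (N² − Σ fᵢ²) / (N² − N²/(K+1)), where N is the number of students, K the number of items, and fᵢ the number of students with total score i. It assumes integer total scores from 0 to K. The function name is hypothetical and the example scores are made up, not taken from the diagnostic data.

```python
from collections import Counter

def fergusons_delta(total_scores, n_items):
    """Ferguson's delta for integer total scores on a test with n_items items."""
    n = len(total_scores)
    freq = Counter(total_scores)               # f_i: students per total score
    sum_sq = sum(f * f for f in freq.values())
    # Algebraically equal to (N^2 - sum f_i^2) / (N^2 - N^2 / (K + 1)).
    return (n * n - sum_sq) * (n_items + 1) / (n * n * n_items)


# Made-up examples on a 3-item test:
print(fergusons_delta([0, 1, 2, 3], n_items=3))   # every score used once
print(fergusons_delta([2, 2, 2, 2], n_items=3))   # all students tied
```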
3. Examined the learning gains on the basic proof skills diagnostic to see if we could detect an
effect on learning caused by the new workshops introduced in MATH 220 in Term 2. The
results are summarized in the following table:
                                 Full          Log           Alg           Rdg
2011 WT1 (control), N=62
  Pre-test mean (std error)      0.59 (.02)    0.65 (.02)    0.32 (.02)    0.38 (.02)
  Post-test mean (std error)     0.78 (.02)    0.85 (.02)    0.77 (.03)    0.69 (.03)
  Mean NLG (std error)           0.41 (.07)    0.49 (.09)    0.45 (.08)    0.20 (.11)
2011 WT2 (workshops), N=59
  Pre-test mean (std error)      0.59 (.02)    0.66 (.03)    0.52 (.04)    0.57 (.03)
  Post-test mean (std error)     0.72 (.02)    0.83 (.02)    0.56 (.04)    0.71 (.03)
  Mean NLG (std error)           0.32 (.05)    0.43 (.07)   -0.05 (.10)    0.35 (.07)
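The report does not spell out how the NLG values in the table above were computed. A common definition, assumed here, is the per-student normalized gain (post − pre) / (1 − pre), averaged over students, with the standard error of that mean. The sketch below illustrates this definition; the function names and example scores are hypothetical.

```python
import math

def normalized_gain(pre, post):
    """Normalized gain for one student; pre and post are fractions in [0, 1]."""
    if pre >= 1.0:
        return None  # undefined when the pre-test score is already perfect
    return (post - pre) / (1.0 - pre)

def mean_and_std_error(values):
    """Sample mean and standard error of the mean (needs at least 2 values)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Made-up (pre, post) pairs for three students, not data from the report.
pairs = [(0.50, 0.75), (0.60, 0.70), (0.40, 0.34)]
gains = [normalized_gain(pre, post) for pre, post in pairs]
gains = [g for g in gains if g is not None]
print(mean_and_std_error(gains))
```

Under this definition a student who scores lower on the post-test than on the pre-test has a negative gain, so a negative group mean is possible. The mean of per-student gains also generally differs from the gain computed from the group means. For the 2011 WT1 full test, for example, the group means give (0.78 − 0.59) / (1 − 0.59) ≈ 0.46 and the table reports 0.41. That would be consistent with per-student averaging, though other explanations (such as unmatched pre-test and post-test samples) cannot be ruled out from the report alone.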
Unfortunately, the diagnostic test did not detect any effect of the workshop treatment on the
learning of the students. Possible reasons for this include:
1. The workshops did not improve learning.
2. The diagnostic test is not sensitive to the type of improvement caused by the workshops.
3. There is too much "noise" in the data to detect the effect. That is, differences in class
composition, material covered, homework, etc. are large enough to mask the effect of the
workshops.
Most likely, a combination of the last two reasons is responsible. In particular, the
2011 WT1 group consisted of two sections that, despite similar pre-test scores, differed
substantially in their post-test scores. In fact, there often seem to be non-trivial
differences in post-test scores between sections. The table below summarizes the pre-test
scores and normalized learning gains (NLG) for all five sections in the 2011 academic year.
                                 Full           Log            Alg            Rdg
2011 SUM – 921, N=32
  Pre-test mean (std error)      0.62 (0.03)    0.67 (0.04)    0.55 (0.05)    0.61 (0.04)
  Mean NLG (std error)           0.11 (0.11)    0.13 (0.20)    0.14 (0.13)    0.01 (0.13)
2011 WT1 – 101, N=32
  Pre-test mean (std error)      0.58 (0.02)    0.66 (0.03)    0.30 (0.02)    0.37 (0.03)
  Mean NLG (std error)           0.32 (0.08)    0.45 (0.12)    0.29 (0.12)    0.17 (0.16)
2011 WT1 – 102, N=30
  Pre-test mean (std error)      0.60 (0.03)    0.65 (0.03)    0.35 (0.03)    0.38 (0.03)
  Mean NLG (std error)           0.51 (0.11)    0.52 (0.14)    0.65 (0.13)    0.34 (0.14)
2011 WT2 – 201, N=38
  Pre-test mean (std error)      0.62 (0.03)    0.67 (0.04)    0.51 (0.05)    0.62 (0.04)
  Mean NLG (std error)           0.37 (0.06)    0.42 (0.09)    0.16 (0.13)    0.43 (0.11)
2011 WT2 – 202, N=21
  Pre-test mean (std error)      0.55 (0.03)    0.63 (0.04)    0.54 (0.06)    0.47 (0.04)
  Mean NLG (std error)           0.23 (0.06)    0.46 (0.10)   -0.35 (0.17)    0.22 (0.08)
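As a rough check on the full-test comparisons above, two reported means can be compared using only their standard errors. The sketch below is not an analysis performed for this report. It assumes the two groups are independent and that a normal approximation is adequate, and the function name `z_from_summary` is hypothetical. The inputs are the full-test mean NLG values and standard errors from the two tables.

```python
import math

def z_from_summary(mean_a, se_a, mean_b, se_b):
    """z-statistic for the difference of two independent means, from summary values."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Full-test mean NLG: control (2011 WT1) vs. workshops (2011 WT2).
z_terms = z_from_summary(0.41, 0.07, 0.32, 0.05)

# Full-test mean NLG: section 102 vs. section 101, both within 2011 WT1.
z_sections = z_from_summary(0.51, 0.11, 0.32, 0.08)

print(round(z_terms, 2), round(z_sections, 2))
```

Both statistics come out at roughly 1.0 and 1.4, below the conventional two-sided threshold of 1.96. On the full test, the difference between the two sections of the same term is at least as large, relative to its uncertainty, as the difference between the control and workshop terms.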
Current Project Status (material was prepared by either STLF or other members of the MATH
SEI group)
MATH 220:
Learning Goals: Learning goals have previously been created for this course and are in use.
Assessments: The basic proof skills diagnostic pre-test was administered in the current session
(summer 2012) and the post-test will be given at the end of term.
New Methods/Materials: The basic proof skills diagnostic was revised.
Plan for immediate future work
MATH 220:
1. Compare workshop vs. control treatment on a question-by-question basis.
2. Assist with the workshops for the summer term and observe what learning is occurring in the
workshops.
3. Investigate differences in instruction and course materials for the various sections of the 2011
academic year. In particular, focus on the 2011 WT 1 section 102 class, which had high
learning gains.
4. Compare the workshop vs. control treatment on isomorphic final exam questions (if such a
question set exists!).
5. Perform student validation on the portions of the basic proof skills diagnostic that have not
been validated.
6. Establish a timeline for the remainder of the MATH 220 project.
Higher-Level Proof Courses (likely MATH 312 and MATH 342):
1. Create a detailed timeline for the development of the higher-level proof skills test.
2. Determine the key skills we would like to assess with a higher-level proof diagnostic.
Examine and possibly code some past MATH 312 and MATH 342 final exams as a start on
this.