
Journal of Assessment and Accountability in Educator Preparation
Volume 2, Number 1, February 2012, pp. 23-35
Evidence for Improved P-12 Student Learning and
Teacher Work Sample Performance from Pre-Internships to Student-Teaching Internships
Peter R. Denner, Shu-Yuan Lin, Julie R. Newsome,
Jack D. Newsome, Deborah L. Hedeen
Idaho State University

Author Note: Special thanks to the many teacher education faculty (too numerous to name here) who tirelessly scored Teacher Work Samples for the College of Education at Idaho State University between 2003 and 2009.

Correspondence: Peter Denner, College of Education, 921 South 8th Ave., Stop 8059, Idaho State University, Pocatello, ID 83209. Email: dennpete@isu.edu
Teacher Work Samples (TWSs) were examined for the P-12 student learning reported by pre-interns (TWS1)
and student-teaching interns (TWS2), and for the total TWS scores of the teacher candidates at the two
internship levels. Across sets of TWSs from different years, teacher candidates performed better overall as
student-teaching interns on their TWSs than the same candidates did as pre-interns. The percentages of
students achieving the lesson targets and showing learning gains were high for both pre-interns and student-teaching interns. The student-teaching interns reported a higher percentage of their students who showed
learning gains from lesson pre-assessments to post-assessments than the pre-interns, but fewer students
achieving the lesson targets on the post-assessments according to their stated success criteria. The latter
finding was partly due to the pre-interns selecting an easier learning goal for their first achievement target.
The TWS total scores were related positively to each of the student learning measures for the student-teaching
interns, but to only one of the measures for the pre-interns. Recent teacher candidates had better evidence for
their impacts on student learning and higher percentages of their students who showed learning gains than
teacher candidates who graduated five years earlier. The results support a positive influence of a sequential
teacher preparation program on the abilities of teacher candidates to meet targeted teaching standards and to
support student learning as they progress from pre-interns (TWS1) to student-teaching interns (TWS2).
During the last decade, teacher preparation programs
have responded to national and state mandates to set
rigorous standards for teacher training and to
demonstrate their accountability for high-quality
teacher preparation. An important aspect of accountability is establishing the link between teacher
candidates’ performance and evidence for their impacts
on the learning of the students they teach (Schalock,
Schalock, & Myton, 1998). Indeed, the National Council
for Accreditation of Teacher Education’s (NCATE,
2008) Professional Standards for the Accreditation of
Schools, Colleges, and Departments of Education
requires teacher preparation programs to provide
evidence of the impact of their candidates on the
learning of P-12 students. The NCATE requirement
emerged from value-added research showing that
teacher effects on student learning are both additive and
cumulative (see Marzano, 2003, Chapter 8 for a
review).
Schalock, Cowart, and Staebler (1993) proposed
that teacher impacts on student learning could be
examined in two ways: (a) as teacher effectiveness,
defined by Schalock et al. as positive impacts on
student achievement resulting from shorter-term
instruction, and (b) as teacher productivity, defined as
gains in student achievement on state-mandated
achievement tests resulting from longer-term
instruction. For teacher preparation programs to connect
candidate performance measures to long-term teacher
productivity, it is first essential for them to be able to
show the link between their candidate performance
measures and short-term teacher effectiveness.
Schalock (1987) was among the first to suggest that the
effectiveness of prospective teachers could be assessed
using a measure of learning gains as demonstrated in
the context of teacher work samples.
Responding to the call to connect candidate
performance to student learning, the Renaissance
Partnership for Improving Teacher Quality (Pankratz,
1999), as one of its strategies, adapted the Western
Oregon University (Schalock, Schalock, & Girod,
1997) Teacher Work Sample Methodology (TWSM).
In addition to documenting candidates’ abilities to plan,
deliver, and assess a standards-driven instructional
sequence, the Renaissance Teacher Work Sample
(RTWS) assessment requires teacher candidates to
profile their impacts on student learning, and to reflect
on the results of their instruction in order to increase
student learning. Consistent with the Western Oregon
University TWSM, the Renaissance approach (Denner,
Norman, Salzman, Pankratz, & Evans, 2004; Pankratz,
1999) has been to set specific criteria for quality
teaching performance in the RTWS standards-linked
scoring rubric. The criteria take into consideration the
significance of the learning goals, quality of the
assessments used to measure student learning, and the
candidates’ abilities to profile student performance
relative to the learning goals (Denner et al., 2004).
Teacher candidates are not held directly accountable by
the RTWS scoring criteria for their effectiveness.
Nevertheless, as part of their analysis of student
learning, teacher candidates are asked to report learning
gains and the number and percent of their students who
achieved two of the learning goals (or achievement
targets) of the lessons. Consequently, the relationship
between the RTWS total scores and these reported
measures of impact on student learning could be
examined.
The present study employed Idaho State
University’s (ISU) implementation of the RTWS. A
major purpose of the present study was to determine
whether TWS total scores are related to the teacher
candidates’ impact on P-12 student learning (teaching
effectiveness) as reported in their TWSs. McConney,
Schalock, and Schalock (1998), using the Western
Oregon University TWSM, reported positive links
between student learning as measured by an index of
pupil growth and instructional variables measured
within the context of their TWSM. One previous study
(Denner & Salzman, 2003), using a very small number
of the RTWSs, showed a positive trend across the
RTWS holistic scores of teacher candidates and the
percentage of their students who showed learning
gains. In an effort to extend the findings of the
previous investigations, this study investigated the
relationship between TWS total scores and teacher
effectiveness as measured by the percentage of the P-12
students meeting achievement targets on the post-assessment and the percentage of the P-12 students who
showed improvement from the pre-assessment to the
post-assessment in the context of the TWS lessons.
At ISU, teacher candidates learn about the essential
skills required for their TWSs in foundations
coursework and are required to write an initial TWS
(TWS1) during a pre-internship associated with a
general methods course (EDUC 309 Planning, Delivery
& Assessment, 6 credits) as a critical assessment for
entrance to student-teaching. The candidates complete
a second TWS (TWS2) during their student-teaching
internships as a critical assessment for program
completion. The candidates’ performance scores and
their student learning data are retained from both TWSs
as part of the unit assessment system. Because of the
availability of data from the two TWSs, the TWS total
scores and the reported measures of student learning
could be examined for changes from TWS1 to TWS2.
Expanding beyond the previous studies, a major
purpose of the present study was to determine whether
value was added to the candidates’ teaching
effectiveness as they progressed sequentially from pre-interns (TWS1) to student-teaching interns (TWS2) in
their teacher preparation programs.
Methods
Participants
The first set of participants consisted of 548 teacher
candidates who completed one or more Teacher Work
Samples (TWSs) at ISU during the period from fall
2005 through spring 2007. In this set, there were 288
pre-interns and 260 student-teaching interns. For the
pre-interns, 49.5% of the TWSs (TWS1s) were from
candidates in elementary education and 50.5% from
candidates in secondary education. For the student-teaching interns, 43.8% of the TWSs (TWS2s) were
from candidates in elementary education and 56.2%
from candidates in secondary education. Of the 548
teacher candidates, there were 152 teacher candidates
who completed both internships (both TWS1 and
TWS2) within the period from fall 2005 to spring 2007.
As a follow-up reexamination, the paired TWSs (TWS1
and TWS2) for a second set of 84 teacher candidates
who completed their student-teaching internships
(TWS2s) in calendar year 2008, and who had
previously completed a TWS1, were examined for
documentation of the candidates’ impacts on P-12
student learning. The 84 pairs of TWSs included 41
(48.8%) from candidates in elementary education and
43 (51.2%) from candidates in secondary education.
Sixty-three of these TWSs were submitted by female
teacher candidates (75.0%) and 21 of the TWSs were
submitted by male teacher candidates (25.0%).
In addition, a representative set of 20 paired TWSs
collected by Denner, Newsome, and Newsome (2005)
from teacher candidates at ISU who completed their
pre-internship during the spring of 2003 and who
completed their student-teaching internship during the
fall of 2003 were compared with the later sets of TWSs
with respect to the reported student learning measures.
The retrospective set of TWSs from 2003 consisted of
20 pairs of TWS1s and TWS2s. The paired set of
TWSs included nine (45.0%) from candidates in
elementary education and 11 (55.0%) from candidates
in secondary education. Fifteen (75.0%) of the TWS
pairs were submitted by female teacher candidates and
five (25.0%) were submitted by male teacher
candidates.
Measures
The teacher candidates in this study completed
their TWSs according to the guidelines employed at
ISU. The guidelines specified the standards to be
demonstrated and the tasks to be performed. (See
http://ed.isu.edu/depts/assistdean/assistdean_index.shtml for the targeted TWS standards, TWS guidelines, and
TWS scoring rubric.) A description of the tasks
required for the TWSs was presented in Denner,
Salzman, Newsome, and Birdsong (2003). Among the
required TWS tasks, all teacher candidates were asked
to profile and analyze student learning for at least two
of the achievement targets (learning goals) of the TWS
lessons.
The instructors of the courses associated with the
internships used the TWS scoring rubric to rate the
indicators of each standard, and then the eight
standards, on a three-point scale of 0 = Not Met, 1 =
Met Acceptable, or 2 = Met At Target. For this study,
the TWS total scores were determined by summing the
scores for each standard. Hence, the TWS total scores
could vary from 0 to 16 points.
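To make the scoring arithmetic concrete, the following is a minimal sketch in Python (not part of the study's procedures; the ratings shown are hypothetical) of how a TWS total score is tallied from the eight standard-level ratings:

```python
# Illustrative sketch only (hypothetical ratings, not the study's scoring software).
# Each of the eight targeted standards receives a rating of 0 (Not Met),
# 1 (Met Acceptable), or 2 (Met At Target), so totals range from 0 to 16.

standard_ratings = [2, 1, 2, 2, 1, 2, 2, 2]  # hypothetical ratings for one TWS


def tws_total_score(ratings):
    """Sum eight standard-level ratings (each 0-2) into a 0-16 TWS total."""
    if len(ratings) != 8 or any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("Expected eight ratings, each 0, 1, or 2.")
    return sum(ratings)


print(tws_total_score(standard_ratings))  # -> 14
```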
The additional measures employed in this study
came from the student learning measures reported by
the teacher candidates in their TWSs. The teacher
candidates reported the number and percent of their
students who achieved each of two featured
achievement targets (Target1 and Target2) and the
number and percent of their students who showed
improvement (learning gains) from the pre-assessment
to the post-assessment on those same achievement
targets (Target1 and Target2). Inspection of the TWSs
indicated the candidates typically set a criterion of 75%
or 80% of the possible points for achievement of the
two lesson targets on their post-assessments, depending
upon the possible points on the measure. Two
additional measures were generated for this study by
averaging the percentage of students showing gains and
the percentage of students achieving the targets across
Target1 and Target2.
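As an illustration only (hypothetical class data and a hypothetical 80% success criterion), the sketch below shows how such measures could be derived from one candidate's pre- and post-assessment scores for each featured target, and how the two targets are then averaged:

```python
# Illustrative sketch with hypothetical data; in the study, the candidates computed
# and reported these figures themselves in their TWSs.

def target_measures(pre_scores, post_scores, possible_points, criterion=0.80):
    """Return (% of students achieving the target, % showing pre-to-post gains)."""
    n = len(post_scores)
    achieved = sum(post >= criterion * possible_points for post in post_scores)
    improved = sum(post > pre for pre, post in zip(pre_scores, post_scores))
    return 100.0 * achieved / n, 100.0 * improved / n


# Hypothetical 20-point pre/post assessments for the two featured targets.
t1_achieve, t1_improve = target_measures([8, 10, 12, 14, 6], [16, 17, 15, 19, 12], 20)
t2_achieve, t2_improve = target_measures([9, 11, 10, 13, 7], [14, 18, 16, 17, 15], 20)

avg_percent_achieving = (t1_achieve + t2_achieve) / 2  # averaged across Target1, Target2
avg_percent_improving = (t1_improve + t2_improve) / 2
print(avg_percent_achieving, avg_percent_improving)
```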
This study did not examine measures of teacher
candidate demographics. Denner, Norman, and Lin
(2009) examined the effect of various demographic
characteristics of teacher candidates—including gender,
age, race/ethnicity, grade point average, and program—
on TWS performance levels. They found program
major was a consistent predictor of TWS performance
along with grade point average, but other demographic
factors were not primary predictors. As reported
previously, the proportions of teacher candidates in
elementary education and in secondary education were
similar across all of the sets of TWSs in this study.
In addition, no effort was made in this study to
control for lesson content, number of students taught,
student demographics, grade level, school setting, or
other factors that might affect teaching effectiveness as
measured in this study. The position of the developers
of the RTWS (see Denner et al., 2004), which followed
the lead of the developers of the Western Oregon
University TWSM (see Schalock et al., 1998), has been
that teacher candidates should consider such factors
when they plan their lessons and demonstrate positive
impacts on student learning regardless of such
situational and contextual factors. Indeed, Wright,
Horn, and Sanders (1997) have shown variables such as
class size and student heterogeneity exert little
influence on teacher effectiveness as measured by
academic gain. For the candidates for whom we had
scores from TWS1 and TWS2, as explained in Denner
et al. (2003), policies at ISU regarding internship
placements ensured that the same teacher candidate
taught a different topic to different students at a
different grade level in a separate semester and usually
at a different school, when he or she completed the two
TWSs. Again, no effort was made to control for any
situational or contextual factors when comparing the
TWS1 to the TWS2 performances of these teacher
candidates.
Scoring
The TWS scores used in this study were the scores
assigned by the program faculty of the courses the
candidates were required to take in conjunction with
their internships. All of the course instructors were
trained and experienced raters. A random sample of 50
TWSs was selected by internship level (25 TWS1s and
25 TWS2s). The 50 TWSs were rescored by a trained
and experienced rater who had not previously scored
any of the TWSs used in this study. The Pearson
correlation between the original TWS total scores and
the second ratings was r = .92, p < .001, indicating a
high level of inter-rater agreement and sufficient
scoring reliability for the purposes of this study.
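A minimal sketch of this kind of check is shown below (hypothetical ratings, using SciPy rather than whatever software the raters actually used); the Pearson correlation between the two sets of total scores serves as the index of inter-rater agreement:

```python
# Hypothetical first and second ratings of the same TWSs; for illustration only.
from scipy.stats import pearsonr

original_totals = [13, 15, 12, 16, 14, 11, 15, 13, 10, 16]
rescore_totals  = [13, 14, 12, 16, 15, 11, 15, 12, 10, 16]

r, p = pearsonr(original_totals, rescore_totals)
print(f"r = {r:.2f}, p = {p:.3f}")  # agreement between raters on TWS total scores
```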
The reliability of the achievement and
improvement percents reported by the teacher
candidates for their students in their TWSs was not
assessed. Because the ISU teacher candidates were
only required to report these measures, and they were
not required to demonstrate any level of achievement or
learning gains, there was no incentive for the teacher
candidates to make false reports. When submitting
their TWSs, the teacher candidates signed an affidavit
affirming that the reported work was completed by
them and was not being reported dishonestly.
Cooperating teachers and university supervisors
observed the candidates during the TWS lessons and
reviewed their TWS reports. No additional effort was
made to verify the accuracy of the achievement and
improvement data as reported by the teacher
candidates.
Procedures
The TWSs were completed to meet graduation
requirements for teacher education programs in the
College of Education at ISU. The TWS scores and
student learning measures are routinely collected and
entered into a database for the teacher education
programs. This study made use of the existing TWS
scores, the existing student learning measures (the
reported percent of students meeting Target1 and
Target2 and the reported percent of students who
showed improvement from the pre-assessment to the
post-assessment of Target1 and Target2) contained in
the database. The records of the candidates were
located as found sets by the principal investigator
within the database via a search of the records based on
the semester and year when the TWSs were completed.
The data were then exported from the database and
entered into SPSS® 16.0 for Windows® for data analysis.
The TWSs from calendar year 2003 had been collected
as part of a previous study (Denner et al., 2005). The
student learning measures were available from the
existing data files of the earlier study, but had not been
reported. The use of the existing data was approved by
the ISU Human Subjects Committee. Candidates in the
ISU teacher education programs were informed of the
purposes of the unit assessment system and the fact that
their assessment information may be used as part of
program evaluation studies.
Design
The design was descriptive, correlational, and
causal-comparative. Descriptive statistics for TWS
performance levels and the student learning measures
were calculated separately by internship level for the
TWS sets. Regression analyses and Pearson correlations were used to investigate the relations between the
participants' total TWS scores and the student learning
measures by internship level for the initial TWS set.
The effects of internship level on TWS performance
and the student learning measures were examined only
for the sets of teacher candidates that produced both
TWS1 and TWS2 during the periods of the study. The
effects were tested using correlated t-tests.
Primary dependent variables were total TWS
scores, average percentage of students achieving the
lesson targets, and the average percentage of students
showing improvement on the lesson targets. Other
dependent variables were the reported percent of
Improved Student Learning and TWS Performance
students achieving each of the two lesson targets
(Target1 and Target2), and the reported percent of
students showing improvement on the two lesson
targets. The level of significance was set at α = .05 for
statistical tests for separate dependent variables and
was held at α = .033 (FWE = .10) for statistical tests
performed for related dependent measures. Cohen’s d
was reported as the measure of effect size for all t-tests.
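As a sketch of the analysis plan only (hypothetical paired scores; the study itself used SPSS), a correlated t-test with Cohen's d for paired TWS1/TWS2 totals could look like this:

```python
# Illustrative paired (correlated) t-test with Cohen's d; all data are hypothetical.
import numpy as np
from scipy.stats import ttest_rel

tws1_totals = np.array([12, 13, 14, 11, 15, 13, 12, 14, 13, 15])  # hypothetical
tws2_totals = np.array([14, 15, 15, 13, 16, 14, 14, 15, 14, 16])  # hypothetical

t_stat, p_value = ttest_rel(tws2_totals, tws1_totals)
diffs = tws2_totals - tws1_totals
cohens_d = diffs.mean() / diffs.std(ddof=1)  # one common d for paired designs

print(f"t({len(diffs) - 1}) = {t_stat:.2f}, p = {p_value:.3f}, d = {cohens_d:.2f}")
```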
Results
Reported Impacts on Student Learning
Table 1 presents the means and standard deviations
for the student learning measures reported in the TWSs
by internship level (TWS1 and TWS2) by the 2005-2007 teacher candidates. The pre-interns reported an
average of 81.6% of their students met the achievement
targets on the post-assessments, whereas the average
percent reported for the students taught by the student-teaching interns was lower at 75.5%. In addition, an
average of 80.6% of the students taught by pre-interns
showed pre-assessment to post-assessment learning
gains, while 92.6% of the students taught by student-teaching interns showed learning gains. Although the pre-interns reported a higher percentage of students who
achieved Target1, they reported a lower percentage
of students who showed improvement on the same
target from pre-assessment to post-assessment. In
contrast, although the student-teaching interns reported
lower percentages of students achieving the criterion
they set for Target1 and Target2, they reported a higher
percentage of students showing improvement for both
achievement targets.
Relations of TWS Scores to the Student
Learning Measures
Table 1 also presents the means and standard
deviations for TWS performances of the 288 pre-interns who completed TWS1 between fall 2005 and
spring 2007 and for the 260 student-teaching interns
who completed TWS2 during the same period. The
mean TWS1 total score was 13.1 (SD = 2.3) and the
mean TWS2 total score was 14.7 (SD = 1.6).
For the 2005-2007 pre-interns, a linear regression
of the TWS1 total scores on the student learning
measures showed a statistically significant relationship
to the reported percent of students achieving Target2,
F(1, 275) = 5.78, MSE = 483.41, p = .017, r = .14, but
not to the reported percent of students achieving
Target1, F(1,274) = 0.07, MSE = 303.47, p = .796, r =
.02, or to the average percent of the students achieving
the targets, F(1,272) = 3.14, MSE = 230.44, p = .077, r
= .11. The regression for the pre-interns of their TWS1
total scores on the average percent of students showing
improvement (learning gains) on the achievement
targets from the pre-assessment to the post-assessment
did not yield a statistically significant relationship, F(1,
270) = 1.85, MSE = 239.35, p = .174, r = .08. There
was also no statistically significant relationship to the
reported percent of students showing improvement on
Target1, F(1, 276) = 0.53, MSE = 498.75, p = .466, r =
.04, or to the reported percent of students showing
improvement on Target2, F(1, 270) = 1.52, MSE =
354.94, p = .219, r = .08. Together, these results
indicate the TWS1 total scores of the pre-interns were
only positively related to achievement for the second
featured achievement target (Target2) and not to any of
the other student learning measures.
For the 2005-2007 student-teaching interns, the
regression analyses revealed that the TWS2 scores were
related statistically to each of the student learning
measures.
There was a statistically significant
relationship between the TWS2 total scores and the
reported percent of students achieving Target1, F(1,
251) = 4.64, MSE = 468.94, p = .032, r = .14, the
reported percent of students achieving Target2, F(1,
250) = 4.88, MSE = 517.67, p = .028, r = .14, and the
average percent of the students achieving the targets,
F(1,250) = 6.13, MSE = 400.70, p = .014, r = .16. A
statistically significant relationship was also shown for
the student-teaching interns between their TWS2 total
scores and the reported percent of students showing
improvement on Target1, F(1,252) = 5.05, MSE = 185.37,
p = .025, r = .14, the reported percent of students
showing improvement on Target2, F(1,250) = 12.93,
MSE = 243.69, p < .001, r = .22, and the average
percent of the students showing improvement, F(1,
250) = 10.87, MSE = 178.38, p = .001, r = .20.
Although these relationships were small, TWS
performance was positively linked to the P-12 student
learning impacts of the student-teaching interns.
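For readers who wish to reproduce this style of analysis with their own data, the sketch below shows a simple linear regression of one learning measure on TWS total scores, reporting F, MSE, p, and r. The numbers are hypothetical and the code uses statsmodels rather than the SPSS procedures used in the study.

```python
# Illustrative simple linear regression; the scores and percentages are hypothetical.
import numpy as np
import statsmodels.api as sm

tws_totals    = np.array([12, 13, 14, 15, 16, 13, 14, 15, 16, 12], dtype=float)
pct_improving = np.array([85, 88, 90, 93, 96, 87, 91, 94, 95, 84], dtype=float)

model = sm.OLS(pct_improving, sm.add_constant(tws_totals)).fit()
r = np.sign(model.params[1]) * np.sqrt(model.rsquared)  # correlation recovered from R^2

print(f"F(1, {int(model.df_resid)}) = {model.fvalue:.2f}, "
      f"MSE = {model.mse_resid:.2f}, p = {model.f_pvalue:.3f}, r = {r:.2f}")
```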
Table 1
Means and Standard Deviations for the Teacher Work Sample (TWS) Total Scores and Student Learning
Measures by Internship Level (TWS1 and TWS2) of all 2005-2007 Teacher Candidates and the Teacher
Candidates with Paired TWSs.
                                           All Teacher Candidates                     Teacher Candidates with TWS Pairs
                                           TWS1                 TWS2                  TWS1                 TWS2
Measures                                   n    M (SD)          n    M (SD)           n    M (SD)          n    M (SD)
TWS Total Scores                           288  13.1 (2.3)      260  14.7 (1.6)       152  13.3 (2.2)      152  14.9 (1.4)
Average Percent Achieving Lesson Targets   274  81.6 (15.2)     252  75.5 (20.2)      140  82.0 (15.4)     140  76.1 (20.9)
Percent Achieving Target1                  276  88.7 (17.4)     253  76.1 (21.8)      140  89.5 (16.2)     140  77.0 (22.5)
Percent Achieving Target2                  277  74.5 (22.2)     252  75.0 (22.9)      140  74.6 (22.4)     140  75.2 (23.0)
Average Percent Showing Improvement        272  80.6 (15.5)     252  92.6 (13.6)      140  80.1 (16.9)     140  92.4 (15.3)
Percent Improving on Target1               278  73.0 (22.3)     254  92.8 (13.7)      140  70.9 (24.3)     140  92.4 (15.2)
Percent Improving on Target2               272  88.4 (18.9)     252  92.4 (16.0)      140  89.3 (19.3)     140  92.4 (17.3)
Note. TWS stands for Teacher Work Sample. TWS1 was completed by pre-interns and TWS2 was completed
by student-teaching interns. Target1 refers to the first lesson achievement target featured in a TWS. Target2
refers to the second lesson achievement target featured in a TWS.
Table 2
Means and Standard Deviations for the 2008 and 2003 Student Learning Measures by Internship Level
(TWS1 and TWS2).
                                           2008                                  2003
                                           TWS1             TWS2                 TWS1 a                TWS2 a
Student Learning Measures             n    M (SD)           M (SD)          n    M (SD)           n    M (SD)
Average Percent Achieving Targets     84   81.6 (16.1)      79.2 (16.0)
Percent Achieving Target1             84   86.1 (21.6)      79.3 (19.4)     9    64.9 (25.5)      10   83.3 (22.4)
Percent Achieving Target2             84   77.1 (21.1)      79.0 (19.2)     9    70.6 (25.5)      9    80.7 (27.2)
Average Percent Improving             84   79.9 (16.0)      93.5 (11.3)
Percent Improving on Target1          84   72.5 (23.9)      94.0 (12.1)     13   84.7 (19.5)      17   84.6 (19.0)
Percent Improving on Target2          84   87.3 (17.0)      92.9 (12.0)     13   85.0 (18.6)      14   87.3 (22.4)
Note. TWS stands for Teacher Work Sample. TWS1 was completed when the candidates were pre-interns and TWS2 was completed when the same candidates were student-teaching interns. Target1 refers to the first lesson achievement target featured in a TWS. Target2 refers to the second lesson achievement target featured in a TWS.
a Valid n only for the 2003 teacher candidates out of 20 TWS pairs.
Effect of Sequential Internships on TWS
Performance
Performances were available for both TWS1 and
TWS2 for 152 of the teacher candidates from 2005-2007. The means and standard deviations are shown
in Table 1. The mean TWS total score of these 152
teacher candidates was 13.3 on TWS1 and 14.9 on
TWS2. The correlated test for the positive mean
difference of 1.6 for the paired TWS1 and TWS2
total scores was statistically significant, t(151) =
7.61, p < .001, d = 0.62. The effect size was considerable. Thus, the same teacher candidates were
shown to perform better overall as student-teaching
interns on their TWSs than they did as pre-interns.
Effect of Sequential Internships on Student
Learning
As can be seen in Table 1, all of the learning
measures were available from both TWS1 and TWS2
for 140 of the 152 teacher candidates from 2005-2007 with both TWSs available.
The student
learning measures were not adequately reported by
twelve of the 152 teacher candidates (7.9%). The
mean for the average percent of their students
achieving the lesson targets was M = 82.0 for TWS1
and M = 76.1 for TWS2. The correlated test for this
mean difference of -6.0% was statistically significant,
t(140) = -2.88, SE = 2.08, p = .005, d = 0.24. Hence,
these teacher candidates reported a higher percentage
of students who achieved the lesson targets on
average when they were pre-interns than when they
were student-teaching interns.
Separately, the
negative mean difference of -12.5% for Target1
was statistically significant, t(139) = -5.34, SE = 2.35,
p < .001, d = -0.45, but the positive mean difference
of 0.62% for Target2 was not statistically
significant, t(139) = 0.24, SE = 2.57, p = .809, d =
.02.
For the average percentage of their students
showing improvement, the mean of these 140 teacher
candidates was M = 80.1 on TWS1 and M = 92.4 on
TWS2. The correlated test for this mean difference
of 12.3% was statistically significant, t(140) = 6.72,
SE = 1.83, p < .001, d = 0.57. Looking at the
achievement targets separately, the positive mean
difference of 21.6% for Target1 was statistically
significant, t(139) = 8.93, SE = 2.41, p < .001, d =
.75, but the positive mean difference of 3.1% for
Target2 was not statistically significant, t(139) =
1.49, SE = 2.11, p = .140, d = .13. Hence, the teacher
candidates reported a higher percentage of students
who showed learning gains from the pre-assessment
to the post-assessment when they were student-teaching interns than when they were pre-interns, but
the main influence was for the first featured
achievement target.
Reported Impacts on Student Learning in
2008
The effects of the sequential internships on P-12
student learning were reexamined for the student-teaching interns in 2008. Table 2 presents the means
and standard deviations for the student learning
measures of the paired TWSs (TWS1 and TWS2) of
the 84 teacher candidates who completed their
student-teaching internships (TWS2s) in calendar
year 2008 and who had previously completed a
TWS1.
The student learning measures were
available from both TWSs for all 84 of the teacher
candidates. As can be seen from Table 2, a higher
percentage of P-12 students were reported to achieve
the lesson targets on average (M = 81.62 versus M =
79.16) when the teacher candidates were pre-interns
(TWS1) than when they were student-teaching
interns (TWS2). However, this difference was not
statistically significant, t(83) = 1.10, SE = 2.24, p =
.276, d = .12. Separately, the negative mean difference of
-6.83% for Target1 was statistically significant, t(83)
= -2.28, SE = 3.00, p = .025, d = .25, but the positive
mean difference of 1.92% for Target2 was not
statistically significant, t(83) = 0.68, SE = 2.84, p =
.501, d = .07. Similar to the teacher candidate
performances from fall 2005 through spring 2007,
there was a negative mean difference for the average
percent of students achieving the lesson targets
reported by teacher candidates when they were
student-teaching interns compared to when they were
pre-interns. However, the difference was negative and statistically significant only for Target1.
Similar to the earlier findings for the 2005-2007
teacher candidates with paired TWSs, Table 2 shows
the mean percentage of the students showing
improvement on both achievement targets was higher
when the teacher candidates from 2008 were student-teaching interns (TWS2) than when they were pre-
interns (TWS1). The means for Target1 were M =
72.5 for TWS1 and M = 94.0 for TWS2, and the
means for Target2 were M = 87.3 for TWS1 and M =
92.9 for TWS2. The correlated t-test for the positive
mean difference of 13.6% in the average percent of
students improving across both achievement targets
was statistically significant, t(83) = 7.51, SE = 1.80, p
< .001, d = .82. Separately, the correlated t-test for
the positive mean difference of 21.6% for Target1
was statistically significant, t(83) = 8.37, SE = 2.58, p
< .001, d = .91. The effect size was large. The
correlated t-test for the positive mean difference of
5.5% for Target2 was also statistically significant,
t(83) = 2.58, SE = 2.14, p = .012, d = .28. However,
this effect size was small. Together with the earlier
findings for the 2005-2007 teacher candidates, the
results indicate the teacher candidates had higher
percentages of P-12 students who showed learning
gains from pre-assessments to post-assessments when
they were student-teaching interns than when they
were pre-interns. Like the teacher candidates from
2005-2007, the effect was larger for the first featured
achievement target (Target1). Overall, the findings
indicate the teacher candidates increased their
abilities to impact student learning as they progressed
in a sequential teacher preparation program from a
pre-internship to a student-teaching internship. In
addition, the average percentage of P-12 students
reported to show improvement by the 2008 student-teaching interns was very high at 93.5%.
Reported Impacts on Student Learning in
2003
The effects of the sequential internships on the
evidence for P-12 student learning contained in
TWSs were also examined for a representative set of
20 paired TWSs from 2003. Table 2 contains the
means and standard deviations for the reported
student learning measures by internship level (TWS1
and TWS2). As can be seen from Table 2, the
number of TWSs containing the student learning
information varied from TWS1 to TWS2.
In
addition, the number of TWSs containing information
about the percentage of students achieving the lesson
targets was different from the number of TWSs with
information about the percentage of students
improving on the lesson targets.
From Table 2, it can also be seen that half or less
of the 40 TWSs in the set of 20 paired TWSs
contained information sufficient to determine the
percentages of students achieving each of the
achievement targets. Inspection of the means in
Table 2 indicates the average percentage of students
reported to achieve each of the lesson targets was
higher for TWS2 than for TWS1. However, the
number of TWSs in 2003 with data for both TWS1
and TWS2 for the two achievement targets was too
small to test the differences for statistical
significance.
Table 2 also shows the mean percentage of the
students reported to improve on both achievement
targets was much lower in the set of TWS2s from
2003 than in the set of TWS2s from 2008. Again, the
number of candidates with information about the
percentage of students improving on the achievement
targets in both of their TWSs was insufficient to test
meaningfully. It should be noted, however, that the
means shown in Table 2 indicate little difference in
2003 in the percentage of students showing
improvement on the two achievement targets from
TWS1 to TWS2.
Together, the results for the 2003 TWSs
indicated that ISU teacher candidates in 2003 did not
show consistent evidence for their impacts on student
learning. The percentage of the teacher candidates
(50% or less) showing evidence sufficient to
determine the percentage of their students achieving
the lesson targets was low for both pre-interns and
student-teaching interns. Although the percentage of
candidates showing evidence sufficient to determine
the percentages of their students improving on the
lesson targets was somewhat higher (around 65%),
the means shown in Table 2 did not reveal the
candidates were getting any better at improving
student learning from their pre-internship to their
student-teaching internship. This finding suggests
that the increase in teaching practice from the
candidates’ pre-internships to their student-teaching
internships did not increase the percentage of their
students reported to improve on their TWS
achievement targets. In addition, Table 2 shows the
student-teaching interns in 2008 reported higher
average percentages of their students with learning
gains (94.0% on the first achievement target and
92.9% on the second achievement target) than did the
student-teaching interns with available information in
2003 (84.6% for the first achievement target and
87.3% for the second achievement target). The differences
are revealing despite the fact that they could not be
tested statistically.
Discussion
Evidence for Teaching Effectiveness
Do Teacher Work Samples (TWSs) show
evidence for the teaching effectiveness of teacher
candidates? Consistent with the pioneering work at
Western Oregon University (McConney et al., 1998;
Schalock, 1987; Schalock et al., 1997), the findings
of this study confirm that TWSs provide evidence for
the teaching effectiveness of teacher candidates. The
2005-2007 student-teaching interns reported that an
average of 75.5% of their students met the criterion
(typically set between 75% and 80% of the possible
points on the post-assessment) for the achievement
targets, and 92.6% of their students showed learning
gains from pre-assessments to post-assessments.
These findings were replicated by the 2008 student-teaching interns, who reported an average of 79.2%
of their students met the achievement targets and
93.5% showed learning gains from their pre-assessments to their post-assessments. Hence, TWSs
do serve as a means of quality assurance, whereby
teacher candidates demonstrate their abilities to teach
so that students can learn.
Due to the statistical phenomenon known as
regression toward the mean and other factors
affecting the reliability of gain scores, it is doubtful
that any selected set of TWSs is going to include
learning gains that average 100%. In that light, the
average percentage of P-12 students showing
learning gains reported here (93.5% for the 2008
student-teaching interns) came very close to demonstrating that our student-teaching interns are able to teach so that all students
can learn in accordance with the expectation of the
NCATE (2008) Professional Standards for the
Accreditation of Schools, Colleges, and Departments
of Education. This finding is relevant to policy
makers who are considering a TWS assessment as a
requirement for teacher licensure. It is also relevant
to teacher preparation programs that use, or are
considering using, TWSs as a means to document the
teaching effectiveness of their program graduates.
Relation of TWS Performance to Teaching
Effectiveness
Is there a relationship between TWS performance
levels and the evidence for teaching effectiveness
exhibited in the TWSs of teacher candidates?
Consistent with the findings reported by McConney
et al. (1998), TWS performance levels of the 2005-2007 student-teaching interns in this study were
related positively to all of the evidence for student
learning contained within their TWSs. Dissimilar to
the findings reported by McConney et al. (1998, p.
357), where TWS measures were reported to explain
from 24.5% to 59.5% of the variance in their metrics
of student learning, the amounts of variance in the
percentages of students showing learning gains that
could be explained by the total TWS performance
levels were found to be much smaller in this study.
The TWS total scores of the student-teaching interns
accounted for only 2% to 5% of the variance of the
learning measures. The difference in the results was
undoubtedly linked to the different metrics that were
used to assess impacts on student learning.
McConney et al. (1998) used an adjusted index of
pupil growth that took into account both the
complexity of the achievement targets and the quality
of the assessments. The learning measures employed
in this study were not adjusted for these or any other
factors.
In contrast, for the pre-interns (TWS1), a positive
relationship was only shown for the second featured
achievement target, but not the first achievement
target. On the second featured achievement target,
those pre-interns who scored better on their TWS1
overall began to report higher achievement levels, but
the percentage of explained variance was very small
(only 2%). In addition, the TWS1 total scores were
not related to the evidence for learning gains. The
contrast between the results for TWS2 and the results
for TWS1 suggests the relationship between TWS
performance and teaching effectiveness increases as
candidates progress sequentially in their preparation
programs from being pre-interns to student-teaching
interns. This prospect will be explored later in the
context of other findings of this investigation.
Finally, although the McConney et al. (1998)
study and the present study both showed positive
relationships between TWS total scores and reported
measures of student learning contained within TWSs,
the amounts of explained variance imply the
assessment of teacher candidates’ abilities to meet the
targeted teaching standards is also considerably
separate from the issue of their teaching
effectiveness. Indeed, it has always been claimed
that the RTWS does not hold teacher candidates
directly accountable for their teaching effectiveness,
but is instead a measure of their abilities to meet the
targeted teaching standards (Denner et al., 2004).
The findings of this investigation support this claim.
Teaching Performance Gains
Do teacher candidates improve their abilities to
meet teaching standards in a sequential preparation
program that requires them to document their
teaching performances using TWSs in both a pre-internship and a student-teaching internship? The
findings of the present investigation revealed that
teacher candidates performed higher overall as
student-teaching interns on their TWSs than they did
as pre-interns. This supports a contribution resulting
from the sequential experiences of completing two
TWSs during two levels of internship experience.
However, this finding is contrary to our previous
studies (Denner & Lin, 2005; Denner et al., 2005;
Denner et al., 2003) that did not show overall
performance differences by internship level. The
differences in the findings were likely due in part to
the dissimilar methodologies of the studies, but may
also be attributed to program changes that have
occurred because of those earlier studies. (The most
notable change was a repositioning and redesign of a
course on diversity away from the student-teaching
internship to an earlier position in the program in
favor of a new student-teaching seminar focused on
teaching performance issues.) The finding of the
present study is important because it shows that
teacher preparation programs that employ two TWSs
can make a difference in the documented teaching
abilities of their teacher candidates with respect to
widely acknowledged teaching standards.
Of course, part of the improvement on TWS2
might be due to practice in completing the tasks
required by the TWSs, although this was not found in
our prior studies. For performance assessments,
improvements in abilities to meet standards are not
separable from improvements in abilities to execute
the authentic tasks and to supply the required
documentation used to measure those standards. As a
result, higher TWS performance is evidence of better
ability to meet the teaching standards.
This
interpretation is strengthened by the additional
evidence discussed next for improved teaching
effectiveness as the candidates progressed from pre-interns to student-teaching interns.
Improved Teaching Effectiveness
Do TWSs contain evidence of increased teaching
effectiveness as candidates progress from being pre-interns to student-teaching interns in a sequential
teacher preparation program? For the two recent sets
(2005-2007 and 2008) of paired TWS performances
(TWS1 and TWS2), the TWS documentation
revealed a higher percentage of P-12 students who
showed learning gains from the pre-assessments to
the post-assessments when teacher candidates were
student-teaching interns than when they were pre-interns. For both of the sets, the mean differences
were statistically significant and substantial. This
result supports an influence of a sequential teacher
preparation program on the abilities of teacher
candidates to support student learning as they
progress from pre-interns to student-teaching interns.
Other teacher preparation programs should consider
using a similar sequential internship model (pre-internship with TWS1 followed by a student-teaching
internship with TWS2).
The finding of a decline in the reported average
percent of students achieving the lesson targets when
the teacher candidates were student-teaching interns
compared to their TWSs as pre-interns was likely due
to the teacher candidates setting more challenging
achievement targets as student-teaching interns.
When the teacher candidates were pre-interns, they
tended to choose easier achievement targets. This
was particularly true for the first achievement target,
where a higher percentage of the students taught by
the pre-interns achieved the target on their post-assessments, but a smaller percentage of their
students showed learning gains from the pre-assessment to the post-assessment. Hence, rather
than a negative finding, this result may reflect an
improvement by the teacher candidates from their
pre-internships to their student-teaching internships
in setting appropriate achievement targets and
choosing challenging content. The issue of setting
appropriate achievement targets and its relation to
demonstrated impacts on student learning merits
further consideration and investigation.
While the sequential model was undoubtedly one
reason for the results, similar results were not present
in the TWSs from calendar year 2003. Although the
percentages of students reported to achieve the TWS
lesson targets were similar in both 2003 and 2008, in
calendar year 2003 many of the teacher candidates
were missing evidence in their TWSs, which was
much less the case from 2005 to 2007, and was not
the case at all in 2008. In addition, evidence for
improvement in the percentages of students showing
learning gains on the two lesson targets from the pre-internship (TWS1) to the student-teaching internship
(TWS2) was not present in the calendar year 2003
TWSs. The comparisons between 2003 and 2008
reveal that beyond any practice effects resulting from
the sequential internships, the use of two TWSs as
part of the assessment system has enabled us to
improve the quality of our teacher preparation
program in terms of the abilities of candidates to
teach so their students can learn. This is consistent
with the expectations of NCATE (2008) that focusing
on candidate performances as an important aspect of
unit accountability is a vehicle for program
improvement. It is also consistent with the vision of
Del Schalock, Mark Schalock and their colleagues at
Western Oregon University (Schalock et al., 1997;
Schalock et al., 1998; Schalock, 1987; Schalock et
al., 1993) that a Teacher Work Sample Methodology
that focused on student learning would lead to the
reform and improvement of teacher preparation
programs.
References
Denner, P. R., & Lin, S.-Y. (2005). Fairness and
aspects of the consequential validity of
performance assessments using a Teacher Work
Sample. In P. R. Denner (Chair), Fairness and
aspects of the consequential validity of performance assessments using a teacher work sample.
Symposium presented at the 59th annual meeting
of the American Association of Colleges for Teacher Education, Washington, DC.
Denner, P., Newsome, J., & Newsome, J. D. (2005, February). Generalizability of teacher work sample performance assessments across occasions of development. A research report presented at the annual meeting of the Association of Teacher Educators, Chicago, IL.
Denner, P., Norman, A., & Lin, S. (2009). Fairness and consequential validity of teacher work samples. Educational Assessment, Evaluation and Accountability, 21, 235-254. doi: 10.1007/s1109-008-9059-6
Denner, P. R., Norman, A. D., Salzman, S. A.,
Pankratz, R. S., & Evans, C. S. (2004). The
Renaissance Partnership teacher work sample:
Evidence supporting score generalizability,
validity, and quality of student learning
assessment. In E. M. Guyton & J. R. Dangel
(Eds.), Teacher education yearbook XII:
Research linking teacher preparation and student
performance (pp. 23-56). Dubuque, IA: Kendall/
Hunt.
Denner, P. R., & Salzman, S. A. (2003, January).
Ways Teacher Work Samples Impact the
Learning of All Students. In R. S. Pankratz
(Chair), Evidence of Teacher Work Sample
Impact on P-12 Student Learning, Teacher
Performance and Teacher Preparation Programs. Symposium conducted at the 55th annual
meeting of the American Association of Colleges for Teacher Education, New Orleans,
LA.
Denner, P. R., Salzman, S. A., Newsome, J. D., &
Birdsong, J. R. (2003). Teacher work sample
assessment: Validity and generalizability of
performances across occasions of development.
Journal for Effective Schools, 2(1), 29-48.
Marzano, R. J. (2003). What works in schools:
Translating research into action. Alexandria,
VA: Association for Supervision and Curriculum
Development.
McConney, A. A., Schalock, M. D., & Schalock, H.
D. (1998). Focusing improvement and quality
assurance: Work samples as authentic performance measures of prospective teachers’
effectiveness. Journal of Personnel Evaluation
in Education, 11, 343-363.
National Council for Accreditation of Teacher
Education. (2008). Professional standards for
the accreditation of schools, colleges, and
departments of education. Washington, DC:
Author.
Pankratz, R. (1999). Improving teacher quality through partnerships that connect teacher performance to student learning. Unpublished manuscript, Western Kentucky University.
Schalock, M. D. (1987). Teacher productivity: What
is it? How might it be measured? Can it be
warranted? Journal of Teacher Education, 38(5),
59-62.
Schalock, M. D., Cowart, B., & Staebler, B. (1993).
Teacher productivity revisited: Definition,
theory, measurement, and application. Journal of
Personnel Evaluation in Education, 7, 179-196.
Schalock, H. D., Schalock, M., & Girod, G. (1997).
Teacher work sample methodology as used at
Western Oregon State College. In J. Millman
(Ed.), Grading teachers, grading schools: Is student achievement a valid evaluation measure?
(pp. 15-45). Thousand Oaks, CA: Corwin Press.
Schalock, H. D., Schalock, M., & Myton, D. (1998,
February). Effectiveness—along with quality—
should be the focus. Phi Delta Kappan, 79, 468-470.
Wright, S. P., Horn, S. P., & Sanders, W. L. (1997).
Teacher and classroom context effects on student
achievement: Implications for teacher evaluation.
Journal of Personnel Evaluation in Education,
11, 57-67.
Authors
Dr. Peter R. Denner is the Associate Dean of
the College of Education at Idaho State
University. His current research interests are
focused on standards-based performance
assessments of teacher quality and the linking of
teacher performance assessments, particularly
Teacher Work Samples, to the learning of P-12
students.
Dr. Shu-Yuan Lin is an associate lecturer in the
Department of Educational Foundations in the
College of Education at Idaho State University.
Her current research interests and special
projects are focused on English as a second/new/foreign language instruction,
technology integration in K-16 instruction, and
cultural and linguistic diversity in education.
Dr. Julie R. Newsome is an associate professor
in the Department of Educational Foundations in
the College of Education at Idaho State
University. Her current research interests and
special projects are focused on performance
assessments of P-12 students and teacher
candidates and how these can demonstrate
teacher quality in the accreditation process.
Dr. Jack D. Newsome is the former Associate
Dean of the College of Education at Idaho State
University. Before his retirement in 2010, his
research interests were focused on standards-based performance assessments of teacher quality and the linking of teacher performance assessments, particularly Teacher Work Samples, to the learning of P-12 students.
Dr. Deborah L. Hedeen is the Dean of the
College of Education at Idaho State University.
Her current research interests are focused on
teacher candidate impact on student learning, P-16 seamless education and the Common Core
State Standards, and designing inclusive learning
environments for students with disabilities.