Journal of Assessment and Accountability in Educator Preparation
Volume 1, Number 1, June 2010, pp. 3-15

What Teacher Work Samples Reveal About Teacher Candidates’ Modifications and Adaptations for English Language Learners

Shu-Yuan Lin
Peter Denner
Angela Luckey
Idaho State University

The adaptations and modifications made by teacher candidates for their students who were English Language Learners (ELLs), as evidenced in their Teacher Work Samples (TWSs), were evaluated using a modified version of the Sheltered Instruction Observation Protocol (SIOP®; Echevarria, Vogt, & Short, 2004; 2008). The results demonstrated that TWSs could be analyzed from an alternative perspective. However, TWS scores overrated the teacher candidates’ abilities to make appropriate adaptations and modifications for ELLs. Student-teaching interns outperformed preinterns on the TWSs, but they did not outperform the preinterns on the modified SIOP® total scores. Nonetheless, when the indicators were examined separately, interns did perform better than the preinterns on two of the SIOP® indicators. A concurrent course on adaptations for diversity taken by the interns had a greater impact on the SIOP® scores when taught by a faculty member who had received some professional development on teaching ELLs.

The percentage of linguistically diverse students in schools in the United States who spoke a language other than English at home was about 20 percent in 2006, according to the National Center for Education Statistics (2008), and about one quarter of those students spoke English with difficulty. Hence, it is important for teacher preparation programs to meet this challenge and to ensure their teacher candidates are ready to address the needs of their students who are English Language Learners (ELLs). Unfortunately, after their review of the available literature, Lucas and Grinberg (2008) concluded that most teachers are inadequately prepared to teach their students who are ELLs.
As part of institutional and state outcomes assessment requirements, and state and national program accreditation requirements (NCATE, 2008), institutions preparing teachers now routinely evaluate the performances of their teacher candidates and the effectiveness of their teacher preparation programs to make data-driven decisions that foster program improvement. This study focused on what the data from teacher candidates’ Teacher Work Samples (TWSs) revealed about their abilities to make adaptations and modifications for English language learners (ELLs) to support the learning of all students.

Introduced at Western Oregon University during the 1990s (Schalock, Schalock, & Girod, 1997), a Teacher Work Sample (TWS) is a performance measure that requires teacher candidates to document their teaching abilities relative to targeted teaching standards, and to profile their impacts on student learning. The College of Education at Idaho State University contributed to the further extension of this assessment tool as part of its participation in the Renaissance Partnership for Improving Teacher Quality (see Denner, Norman, Salzman, Pankratz, & Evans, 2004). Guidelines, scoring rubrics, and benchmark levels for the Idaho State University TWS have been sufficiently tested to support the use of TWS scores for advancement decisions based on teacher candidates’ teaching performance levels (Denner, Salzman, & Bangert, 2001; Denner, Salzman, Newsome, & Birdsong, 2003).

Correspondence: Shu-Yuan Lin, Department of Educational Foundations, Stop 8059, College of Education, Idaho State University, Pocatello, ID 83209. Email: linshu@isu.edu
The TWS guidelines specify the teaching standards to be demonstrated and the tasks to be performed. They also direct the teacher candidates regarding the required documentation. Similar to the Renaissance TWS (Denner, Norman, Salzman, Pankratz, & Evans, 2004) and as specified in our prior studies (Denner, Salzman, & Bangert, 2001; Denner, Salzman, Newsome, & Birdsong, 2003), the required tasks for the Idaho State University TWS include: (1) description and analysis of the learning-teaching context, (2) specification of achievement targets for the instructional sequence that are aligned with state achievement standards, (3) formation of an assessment plan that includes both formative and summative assessment of the achievement targets, (4) documentation of lesson plans aligned to the achievement targets for at least six learning activities used in teaching the instructional sequence, (5) use of formative assessment data to make modifications to instruction, (6) analysis of student learning resulting from the instructional sequence for two of the achievement targets, and (7) reflection on the success of the instructional sequence with regard to student learning and future practice. The TWS guidelines also specify other requirements such as formatting and the quality of the written communication. All of the TWS tasks have a related subtask that expects the teacher candidates to show adaptations and modifications to meet individual student needs. This requirement starts with the analysis of the teaching/learning context, where the teacher candidates are expected to identify the characteristics of their student population and to assess the implications of those characteristics for their goal of helping all students to attain the achievement targets of the lessons featured in their TWS. Thus, the Idaho State University TWS is intended as a measure of candidates’ abilities to teach in ways that support the learning of all students (Denner, Salzman, & Bangert, 2001). 
However, because adaptations and modifications to meet student needs are embedded subtasks of the larger teaching processes demonstrated by the TWS, it is possible that the TWS scores of teacher candidates might not tell the full story regarding their preparation as educators—educators who are able to help every child to achieve. Hence, the central purpose of this investigation was to determine what TWS performances really tell us regarding teacher candidates’ preparation to impact the learning of all students, in particular their students who were ELLs. The specific target of this investigation was the types and quality of adaptations and modifications made by teacher candidates to meet the needs of their students who were ELLs. For that reason, this study was limited to teacher candidates who indicated in their analysis of their teaching/learning context that one or more students who were ELLs were present in the classrooms where they taught their TWS lessons. In this context, the designation of ELL is intended to reflect the official school counts of ELLs as annually reported by the school districts to the State Department of Education and the U.S. Department of Education. However, the counts supplied by the teacher candidates may not always have accurately reflected the official counts, and the investigators had no way to verify that they did. For the purposes of this study, however, this was immaterial. It was sufficient that the teacher candidates believed one or more students were present in their classroom whose first language was not English and whose proficiency in English might be different from the proficiency of the other students in the classroom. The central concern, given the candidates’ statements that one or more students who were ELLs were present, was whether they reported making any adaptations or modifications to support the learning of those students. 
The criteria applied to examine the TWS were from the Sheltered Instruction Observation Protocol (SIOP®) Model and the SIOP® protocol (Echevarria, Vogt, & Short, 2004; 2008). The SIOP® Model was developed for teachers, school administrators, teacher educators, and researchers as a resource for improving instruction for English language learners (Echevarria, Vogt, & Short, 2008). It is a scientifically investigated, evidence-based approach to sheltered lesson planning and delivery for students with limited English proficiency (Echevarria, Vogt, & Short, 2004; 2008). The SIOP® protocol is the observation instrument for rating lessons with respect to their implementation of the Model; it provides school administrators with a tool for observation of their teachers (Echevarria, Vogt, & Short, 2008). It is also a useful tool for university faculty members who supervise field experiences. Both the SIOP® Model and protocol have been widely implemented in school districts and universities across the United States and in several other countries (Echevarria, Vogt, & Short, 2008). The SIOP® protocol can be used to provide feedback to teachers or teacher candidates regarding their use of techniques necessary to make the instruction comprehensible to their students who are ELLs (Echevarria, Vogt, & Short, 2004; 2008). It consists of eight categories of adaptations/modifications and 30 features. The eight categories are Lesson Preparation, Building Background, Comprehensible Input, Strategies, Interaction, Practice/Application, Lesson Delivery, and Review/Assessment (Echevarria, Vogt, & Short, 2004; 2008). Because not all of the categories and features apply to or could be observed from the documentation provided in a TWS, this investigation employed a modified version of the SIOP® protocol to investigate the type and quality of the adaptations and modifications for ELLs exhibited by teacher candidates in their TWSs.
Additionally, this study was conducted to gain a perspective on the value added by a course on adaptations for diversity when taken concurrently with a senior-level, student-teaching internship. In general, but with some exceptions, teacher candidates at Idaho State University produce two TWSs during their teacher education program. The first TWS is completed as part of the requirements for a junior-level preinternship that accompanies a general methods course titled EDUC 309 Planning, Delivery, and Assessment (6 credits). At the time of this study, the second TWS was completed as part of a senior-level, student-teaching internship and the accompanying course, EDUC 402 Adaptations for Diversity (3 credits). The EDUC 402 course was designed to provide timely instruction on creating inclusive/differentiated “classroom environments, curricula, and educational experiences that enable all students to learn” (Idaho State University Undergraduate Catalog 2007-2008, p. 176). The interns were given feedback and assessed on their second TWS as part of the requirements of this course. Hence, the second TWS completed by the student-teaching interns should show an increased focus on adaptations/modifications when compared to the first TWS of the preinterns. The outcome has implications for the design of teacher preparation programs.

This study addressed several questions important to teacher preparation programs, particularly programs employing TWSs as part of their unit assessment systems. What do TWS scores tell us about the performance levels of teacher candidates regarding adaptations and modifications to support the learning of all students? Are the performance levels the same when teacher candidates’ TWSs are examined from the perspective of the SIOP® specialty indicators of appropriate adaptations and modifications for students who are ELLs?
Is a TWS a valid measure of the abilities of teacher candidates to teach in ways that support the learning of students who are ELLs? Does a course addressing adaptations for diversity, when taken concurrently with a student-teaching internship, enhance the abilities of teacher education candidates to address the needs of their students who are ELLs? Do changes need to be made to teacher preparation programs?

Method

Participants

The participants for this study were teacher education candidates who completed TWSs in the College of Education at Idaho State University, and who reported teaching at least one student who was an English Language Learner (ELL). The teacher candidates completed TWSs as part of the requirements for the junior-level preinternship or the senior-level student-teaching internship. For the academic year, there were 48 preinterns and 34 student-teaching interns with TWSs that met the criteria for this study. However, nine of the preinterns were also among the student-teaching interns (26%), because they completed both internships in the same academic year. To eliminate the overlap and to maintain independence between the internship levels for data analysis, it was decided that those nine teacher candidates would be included among the student-teaching interns only. Hence, the total number of participants for this investigation was 73 (39 preinterns and 34 student-teaching interns). Of the 39 preinterns, 32 (82.1%) were elementary education majors and 7 (17.9%) were secondary education majors. At the time of this study, special education majors at Idaho State University were required to double major in elementary education or secondary education; hence, they were not reported separately. Of the 34 student-teaching interns, 23 (67.6%) were elementary education majors and 11 (32.4%) were secondary education majors.
To be selected for this study, the teacher candidates had to have reported the presence of one or more English language learners among the students in the classrooms where they taught their TWS lessons. The preinterns included in this study taught a mean of 25.2 (SD = 5.1) students and a mean of 1.6 (SD = .94) students who were ELLs. The student-teaching interns included in this study taught a mean of 23.5 (SD = 4.8) students and a mean of 4.8 (SD = 5.1) students who were ELLs. For both groups of interns, the modal number of students taught who were ELLs was one student.

Measures

This program evaluation study made use of existing collected TWS performances and scores. Teacher Work Sample (TWS) scores have been established as a valid and dependable measure of teacher candidates’ teaching abilities relative to eight targeted teaching standards (Denner, Salzman, & Bangert, 2001; Denner, Salzman, Newsome, & Birdsong, 2003). The TWS scores consist of total scores across standards, and eight sub-scale scores assessing eight targeted teaching standards. (Copies of the guidelines and scoring rubric for the Idaho State University TWS are available upon request.) The course instructors who supervised the preinterns or student-teaching interns used an analytic scoring rubric to score the TWS. Each indicator on the rubric was rated on a three-point scale of 0 = Indicator Not Met, 1 = Indicator Met Acceptable, or 2 = Indicator Met At Target. The indicator ratings were summed and converted to ratings for each of the eight standards (subscales). Each standard score was expressed on a three-point scale of 0 = Standard Not Met, 1 = Standard Met Acceptable, or 2 = Standard Met At Target. A total score for each TWS was computed by summing the standard scores, which yielded a total possible score of 16 points.
A customized and simplified version of the SIOP® protocol (Echevarria, Vogt, & Short, 2004; 2008) was developed for use in this study to assess the teacher candidates’ inclusion of adaptations and modifications for their students who were ELLs in their TWSs. The SIOP® protocol has eight categories of performance and 30 indicators of the SIOP® Model features. The protocol uses a 5-point scale for rating each feature indicator, ranging from 0 to 4, with 0 indicating absence of implementation of the indicator and 4 indicating high implementation of the indicator. Scores on the protocol are generated by summing across the 30 feature indicators. Category scores are not computed separately for the eight performance categories, but they could be. Echevarria, Vogt, and Short (2004) reported that the SIOP® protocol has been shown to be a valid measure of sheltered instruction with high inter-rater reliability (r = .99). In addition, according to Echevarria, Vogt, and Short (2004), “experienced observers of classroom instruction (e.g., teacher education faculty who supervise student teachers) who were not specifically trained in the SIOP® model were able to use the protocol to distinguish high and low implementers of the model” (p. 215). However, because the SIOP® protocol was designed as an observation instrument, some of the specific indicators could not be applied in the context of a TWS. As a result, the investigators developed a simplified version of the SIOP® protocol. The eight indicators employed in this study were developed as composite indicators from the SIOP® protocol indicators (see Table 2 for the indicators). We selected and combined the protocol indicators (features) that in our judgment could be found in the context of the documentation required for a TWS. The composite indicators were organized around seven of the eight SIOP® protocol categories.
The SIOP® protocol category of Lesson Delivery was dropped from the rubric for this study because all of the SIOP® protocol indicators (features) for this category depended upon lesson observation. In addition, the final indicators from the SIOP® protocol category of Review/Assessment were split into two separate composite indicators. The first of those composite indicators focused on review activities in advance of assessment and the second composite indicator focused on modifications and adaptations to the assessments themselves. Thus, the rubric indicators for this study addressed eight types of adaptations and modifications: Preparation, Building Background, Comprehensible Input, Strategies, Interaction, Practice/Application, Review, and Assessment. However, instead of multiple indicators under each category, a single composite indicator was developed for each category out of the related SIOP® protocol indicators (features) that could be assessed via the documentation provided in a TWS. As a further simplification of the SIOP® protocol, instead of using the 5-point rating scale, a 3-point scale was used. The simplified scale took the anchor points (0 and 4) and the middle point (2) of the protocol scale descriptors, collapsed them across the indicators being combined, and then used the combined descriptors as the three-point scale descriptors. Each of the eight composite indicators was rated on a scale of 0 = Indicator Not Met, 1 = Indicator Met Acceptable, or 2 = Indicator Met At Target. This simplified SIOP® protocol was used to determine whether the teacher candidates created any instructional opportunities to meet the needs of their students who were ELLs. Adaptations or modifications did not need to be present in every one of the six TWS lessons for the indicators to be met. Instead, the indicators were judged across the six sequential lessons required for the Idaho State University TWS.
In addition, the indicator judgments took into consideration the entire documentation and all of the evidence provided by the TWS. The evaluations reported here were made by an assessor who had completed both the SIOP® I and SIOP® II training workshops from the SIOP® Institute. The assessor was also an experienced teacher of English as a new language, and was an experienced supervisor of teachers of English to non-native speakers. A second assessor, who had also completed both the SIOP® I and II training workshops from the SIOP® Institute, rescored all but one of the TWSs using the same rubric. For one TWS, the compact disc was damaged and could no longer be read. Both assessors hold doctoral degrees in education and teach courses on the social foundations of education and methods of teaching English as a new language. The inter-rater agreement for their total scores summing across the eight indicators was found to be r = .97 (N = 81), p < .01. Additional data for this study came from the existing college database. The data included information about each teacher candidate’s demographic characteristics and degree program, and the sections of the courses where they completed TWSs.

Procedures

The teacher candidates in this study completed their TWS according to the current guidelines employed at Idaho State University. The college collects the TWS scores and products at the end of each semester for each academic year. As part of the TWS process, teacher candidates submitted general demographic information about the students they taught while implementing their TWS lessons. This information was collected in the college database as documentation of the number of contacts teacher candidates have with diverse student populations.
As a result, a search of the existing database was performed for TWS performances of teacher candidates who completed TWSs during the fall or spring semester of the same academic year, and who indicated they taught the TWS lessons in a classroom that had one or more students who were ELLs present. Because the TWS products were also collected and retained in the Office of the Assistant Dean for Assessment, the TWSs that met the search criteria could be located and examined using the simplified SIOP® protocol described previously for the types and quality of the adaptations made for students who were ELLs.

Design

This study was a descriptive and cross-sectional investigation of the adaptations or modifications made for ELLs by teacher candidates at two internship levels, preinterns and student-teaching interns, as measured from their TWS performances. For the first part of the investigation, the frequency and percent of the teacher candidates’ TWS performance levels, overall and across the eight TWS standards, were reported separately for both the preinterns and the student-teaching interns. The TWS performance levels were then compared using chi-square analysis to determine whether there were any differences between the two internship levels in terms of their performances across the TWS standards. The overall TWS performance levels of the preinterns were compared to the student-teaching interns using an independent t-test. For the second part of the investigation, the frequencies and percents of the teacher candidates’ adaptations and modifications for their students who were ELLs were reported. Chi-square analysis was used to determine whether there were any differences between the two internship levels across the simplified SIOP® protocol indicators that assessed their adaptations or modifications for their students who were ELLs.
Finally, the overall performance levels of the preinterns on the simplified SIOP® protocol were compared to the overall performance levels of the student-teaching interns using an independent t-test. This was done to determine whether the type and quality of the adaptations and modifications for students who were ELLs were influenced by the student-teaching interns’ greater teaching experience and concurrent enrollment in the course EDUC 402 Adaptations for Diversity (3 credits). For the independent t-tests, the level of significance was set at .05, unadjusted for the number of statistical tests performed. For the chi-square tests, after weighing the risks of Type I and Type II decision errors, a family-wise error rate of .10 was set. A Bonferroni-type adjustment for the number of statistical tests performed in each set of chi-square analyses established a significance level of .013 for each of the chi-square tests. For all chi-square analyses, Cramér’s V coefficient was reported as a measure of effect size.

Results

TWS Performance Levels

The TWS performance levels by standard are presented in Table 1 by internship level. As can be seen from the table, nearly all teacher candidates were judged to meet the standards at the acceptable level or higher across all eight of the targeted standards. This means that across the teaching standards, both the preinterns and the student-teaching interns were judged to have demonstrated the ability to plan, deliver, and assess an instructional sequence aligned with state achievement standards, and to have demonstrated their ability to reflect on the impacts of their instruction on student learning. The positive ratings across the standards should also imply that our candidates are making adaptations or modifications to support the learning of all students.
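The Bonferroni-type adjustment described in the Design section can be checked with a few lines of arithmetic. The sketch below assumes eight chi-square tests per set (one per TWS standard, or one per simplified SIOP® indicator); that count is inferred from the analyses reported in this study rather than stated explicitly, and the function name is purely illustrative.

```python
def bonferroni_alpha(familywise_alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni-type adjustment."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return familywise_alpha / n_tests


# Assumption: eight chi-square tests in each set of analyses.
per_test_alpha = bonferroni_alpha(0.10, 8)
print(per_test_alpha)  # 0.0125, which the study reports rounded as .013
```

Under that assumption, a chi-square result is treated as statistically significant only when its p value falls below the per-test level.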
The mean TWS total score for the preinterns was M = 13.5 (SD = 2.1) and the mean total score for the student-teaching interns was M = 14.5 (SD = 1.8) out of the possible total score of 16 points. An independent t-test indicated that the difference between the means of the teacher candidates completing the two internships was statistically significant, t(71) = 2.26, p = .027, d = .53. The student-teaching interns (M = 14.5) outperformed (p < .05) the preinterns (M = 13.5) on the TWS assessment. This finding is contrary to previous investigations (Denner, Newsome, & Newsome, 2005; Denner, Salzman, Newsome, & Birdsong, 2003) that looked at the longitudinal development of TWS by the same teacher candidates across the two occasions of development. Hence, the findings may reflect a value-added improvement to the performance levels of the teacher candidates and also program improvement when compared to the results from earlier years.

Further examination of our candidates’ TWS performances by targeted teaching standard using chi-square analysis revealed no difference between the preinterns and the student-teaching interns with respect to description and analysis of the teaching/learning context (Standard 1), χ² (1, N = 73) = .11, p = .74, V = .04; setting of achievement targets (Standard 2), χ² (1, N = 73) = 1.01, p = .31, V = .12; quality of their assessment plans (Standard 3), χ² (1, N = 73) = .54, p = .46, V = .09; quality of their instructional sequence designs (Standard 4), χ² (1, N = 73) = .82, p = .37, V = .11; abilities to profile and analyze student learning (Standard 6), χ² (2, N = 73) = 5.42, p = .07, V = .27; abilities to reflect on the outcomes of their teaching (Standard 7), χ² (2, N = 73) = 5.00, p = .08, V = .26; or the quality and organization of their written communication (Standard 8), χ² (1, N = 73) = .00, p = .96, V = .01.
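The comparison of TWS total scores reported at the start of this section can be approximated from the published summary statistics alone. The sketch below assumes an equal-variance (pooled) independent t-test, which is consistent with the reported 71 degrees of freedom, and defines Cohen's d as the mean difference divided by the pooled standard deviation; the function name is hypothetical and is not taken from the study.

```python
import math


def pooled_t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance independent t statistic, degrees of freedom, and
    Cohen's d, computed from group means, SDs, and sample sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    t = (m2 - m1) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    d = (m2 - m1) / pooled_sd
    return t, df, d


# Preinterns: M = 13.5, SD = 2.1, n = 39; interns: M = 14.5, SD = 1.8, n = 34.
t_stat, df, cohen_d = pooled_t_and_d(13.5, 2.1, 39, 14.5, 1.8, 34)
print(f"t({df}) = {t_stat:.2f}, d = {cohen_d:.2f}")
```

Because the inputs are means and standard deviations rounded to one decimal place, the computed values come out slightly below the reported t(71) = 2.26 and d = .53; a gap of that size is what rounding of the inputs can produce.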
The only statistically significant difference between the preinterns and the student-teaching interns was with respect to the quality of their reflections during instruction on student learning progress and their subsequent modifications to instruction to meet students’ diverse needs (Standard 5), χ² (1, N = 73) = 8.59, p = .003, V = .34. On this standard, 91.2% of the student-teaching interns compared to 61.5% of the preinterns were judged to be at the target level. Taken together, it appears the overall difference in TWS performance between the preinterns and the student-teaching interns was largely due to their differences on Standard 5. This finding may mean the course EDUC 402 Adaptations for Diversity (3 credits) taken concurrently by the student-teaching interns had a positive effect on their TWS scores on Standard 5 by increasing the quality or quantity of their modifications for students’ diverse needs, as would be expected.

Adaptations and Modifications for English Language Learners

Table 2 shows the frequency and percent of the Idaho State University preinterns and student-teaching interns who met the criteria for adaptations and modifications for students who were ELLs. Inspection of Table 2 reveals the indicators were not met a high percentage of the time. For both the preinterns and the student-teaching interns, the lowest rated indicator was indicator four, which focused on instructional strategies. Seventy-four percent of both the preinterns and the student-teaching interns did not meet this indicator. For both the preinterns and the student-teaching interns, the highest rated indicator was indicator eight, which looked at adaptations or modifications to assessments. Twenty percent of the preinterns and 44% of the student-teaching interns met indicator eight at the target level. Still, in both cases, fewer than 70% of the teacher candidates met this indicator at an acceptable level or higher.
The contrast between the performance ratings of our teacher candidates in Table 1 and those in Table 2 raised concerns about the extent to which the standards-based judgments from the common scoring rubric implied that our teacher candidates were making appropriate adaptations or modifications to support the learning of all students, particularly their students who were ELLs.

Table 1
Number and Percent of Teacher Work Sample Performance Ratings by Targeted Standard for the Preinterns and Student-Teaching Interns Who Taught English Language Learners

                                  Preinterns       Interns
Standard  Rating                  n      %         n      %
1         2 = Met Target          30     76.9      25     73.5
          1 = Met Acceptable       9     23.1       9     26.5
          0 = Not Met              0      0.0       0      0.0
2         2 = Met Target          31     79.5      30     88.2
          1 = Met Acceptable       8     20.5       4     11.8
          0 = Not Met              0      0.0       0      0.0
3         2 = Met Target          32     82.1      30     88.2
          1 = Met Acceptable       7     17.9       4     11.8
          0 = Not Met              0      0.0       0      0.0
4         2 = Met Target          30     76.9      29     85.3
          1 = Met Acceptable       9     23.1       5     14.7
          0 = Not Met              0      0.0       0      0.0
5         2 = Met Target          24     61.5      31     91.2
          1 = Met Acceptable      15     38.5       3      8.8
          0 = Not Met              0      0.0       0      0.0
6         2 = Met Target          24     61.5      29     85.3
          1 = Met Acceptable      14     35.9       5     14.7
          0 = Not Met              1      2.6       0      0.0
7         2 = Met Target          15     38.5      21     61.8
          1 = Met Acceptable      22     56.4      13     38.2
          0 = Not Met              2      5.1       0      0.0
8         2 = Met Target          30     76.9      26     76.5
          1 = Met Acceptable       9     23.1       8     23.5
          0 = Not Met              0      0.0       0      0.0

TWS Standards
1. The teacher uses information from the learning-teaching context and knowledge of human development and learning to plan instruction and assessment.
2. The teacher uses their knowledge of subject matter to set important, challenging, varied, and meaningful achievement targets.
3. The teacher uses formal and informal assessment methods and strategies aligned with achievement targets to evaluate and advance student performance and determine teaching effectiveness.
4. The teacher plans and prepares instruction using a variety of instructional strategies to meet specific achievement targets, student characteristics and needs, and learning contexts.
5. The teacher reflects, during instruction, on student learning progress and modifies instruction and assessment to meet students’ diverse needs and experiences.
6. The teacher profiles student performances and analyzes and interprets assessment data to determine student progress.
7. The teacher reflects, after completion of the instructional sequence, on his or her instruction and on student learning and is continuously engaged in purposeful mastery of the art and science of teaching.
8. The teacher uses effective written communication skills.

Table 2
Number and Percent of the Modified SIOP® Ratings¹ by Indicator for the Preinterns and Student-Teaching Interns Who Taught English Language Learners

                                  Preinterns       Interns
Indicator Rating                  n      %         n      %
1         2 = Met Target           8     20.5       1      2.9
          1 = Met Acceptable      11     28.2      24     70.6
          0 = Not Met             20     51.3       9     26.5
2         2 = Met Target           2      5.1       4     11.8
          1 = Met Acceptable       8     20.5      12     35.3
          0 = Not Met             29     74.4      18     52.9
3         2 = Met Target           5     12.8       8     23.5
          1 = Met Acceptable      15     38.5      10     29.4
          0 = Not Met             19     48.7      16     47.1
4         2 = Met Target           7     17.9       7     20.6
          1 = Met Acceptable       3      7.7       2      5.9
          0 = Not Met             29     74.4      25     73.5
5         2 = Met Target           4     10.3       2      5.9
          1 = Met Acceptable      18     46.2      22     64.7
          0 = Not Met             17     43.6      10     29.4
6         2 = Met Target           3      7.7       1      2.9
          1 = Met Acceptable      21     53.8      23     67.6
          0 = Not Met             15     38.5      10     29.4
7         2 = Met Target           2      5.1       1      2.9
          1 = Met Acceptable      15     38.5      24     70.6
          0 = Not Met             22     56.4       9     26.5
8         2 = Met Target           8     20.5      15     44.1
          1 = Met Acceptable      15     38.5       8     23.5
          0 = Not Met             16     41.0      11     32.4

Modified SIOP® Indicators
1. Lesson Preparation - The teacher modifies the lesson to accommodate English language learners by developing a language objective, selecting supplementary materials, describing adaptations of content and creating activities for language practice in reading, writing, speaking, and/or listening.
2. Building Background - The teacher modifies the lesson to link new concepts to students’ background experiences, past learning and describes activities or strategies for emphasizing key vocabulary.
3. Comprehensible Input - Use techniques to make content concepts clear (e.g., modeling, visuals, hands-on activities, demonstration, gestures, body language).
4. Strategies - The teacher modifies the lesson by describing appropriate learning strategies to help students learn content concepts and develops questions to promote higher-order thinking skills.
5. Interaction - The teacher modifies the lesson to include grouping configurations for student interaction and discussion.
6. Practice and Application - The teacher modifies the lesson by providing hands-on materials and/or manipulatives, and activities for students to practice new content knowledge, apply content and language knowledge, and integrate all language skills (reading, writing, listening, and speaking) in the classroom.
7. Review - The teacher modifies the lesson by describing a comprehensive review of key vocabulary, key concepts, and assessment of lesson objectives attainment.
8. Assessment - Explanation of rules is provided in L1, simplified sentence, or visual aid.

¹ From Echevarria, Jana, et al. Making Content Comprehensible For English Learners: The SIOP® Model, 3/e. Published by Allyn and Bacon, Boston, MA. Copyright © 2008 by Pearson Education. Adapted by permission of the publisher. The full SIOP® protocol needed to be modified in this particular study because the researchers were examining written teacher lesson plans, not observing an enacted lesson, as the protocol was designed for. While we recommend that the SIOP® Model in its entirety be used in professional development for teacher lesson delivery, this review was limited to written materials.

The mean total score for the preinterns on the simplified SIOP® rubric was M = 4.7 (SD = 4.6) and the mean total score for the student-teaching interns was M = 6.0 (SD = 4.2) out of the possible total of 16 points. Clearly, neither group of interns performed at a high level overall on this assessment.
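The group means and standard deviations on the modified SIOP® total score are enough to approximate a pooled-variance comparison of the two internship levels. The following is a minimal sketch, not the authors' analysis: it assumes group sizes of 39 preinterns and 34 student-teaching interns (the column totals in the tables), and the function name `pooled_t_and_d` is hypothetical. Because the inputs are rounded to one decimal place, the output can differ slightly from statistics computed on the raw scores.

```python
import math

def pooled_t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Return (t, df, d) for two independent groups from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    diff = m2 - m1
    return diff / standard_error, df, diff / pooled_sd

# Rounded summary statistics: preinterns first, student-teaching interns second.
t, df, d = pooled_t_and_d(4.7, 4.6, 39, 6.0, 4.2, 34)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

On these rounded inputs the sketch gives a small standardized difference of a little under .3, which is in line with a nonsignificant result at this sample size.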
An independent t-test indicated that the difference between the means of the teacher candidates at the two internship levels was not statistically significant, t(71) = 1.21, p = .23, d = .28. Unfortunately, and in contrast to the TWS scores, this suggests little value was added by the course EDUC 402 Adaptations for Diversity (3 credits) taken concurrently by the student-teaching interns when their TWS performances were looked at from the perspective of adaptations and modifications that were made explicitly for their students who were ELLs. To further determine whether taking the course EDUC 402 Adaptations for Diversity while completing the senior-level student-teaching internship had any effect on the frequency of adaptations or modifications made for students who were ELLs, each of the simplified SIOP® protocol indicators was evaluated separately using a 3 (Indicator Not Met, Indicator Met Acceptable, and Indicator Met at Target) by 2 (Preinterns versus Student-Teaching Interns) chi-square analysis.

Lesson Preparation

For Indicator 1, which addressed preparation, the chi-square analysis revealed a statistically significant difference between the preinterns and the student-teaching interns, χ²(2, N = 73) = 14.2, p = .001, V = .44. The results showed that the preinterns were less likely to meet this indicator than the student-teaching interns (51.3% not met versus 26.5% not met, respectively). However, when the preinterns did meet the indicator, they were more likely to meet it at the target level (20.5% versus 2.9%, respectively). This finding is mixed and somewhat perplexing regarding whether the more experienced interns benefited from their concurrent enrollment in EDUC 402.

Building Background

For Indicator 2, which addressed building background knowledge, the preinterns and the student-teaching interns were not statistically different, χ²(2, N = 73) = 3.7, p = .16, V = .23.
In general, less than half of the teacher candidates met this indicator regardless of internship level.

Comprehensible Input

On Indicator 3, which looked at adaptations and modifications to ensure comprehensible input, the chi-square analysis did not reveal a statistically significant difference between the student-teaching interns and the preinterns, χ²(2, N = 73) = 1.6, p = .45, V = .15. In this case, 48.7% of the preinterns did not meet the indicator, while 47.1% of the student-teaching interns did not meet the indicator.

Strategies

Indicator 4 rated the adaptations or modifications our teacher candidates made to use a variety of question types to promote higher-order thinking skills and to provide opportunities for their students who were ELLs to use strategies. The chi-square analysis revealed no statistically significant difference between the preinterns and the student-teaching interns on this indicator, χ²(2, N = 73) = .15, p = .93, V = .05. This means that the student-teaching interns, despite concurrently taking a course that addressed adaptations for diversity, were no better than the preinterns at providing opportunities to use learning strategies or at using a variety of question types to promote higher-order thinking skills for their students who were ELLs. Moreover, most of the interns at both levels did not meet this indicator (74.4% not met for the preinterns and 73.5% not met for the student-teaching interns).

Interaction

Indicator 5 evaluated adaptations or modifications to the types of grouping configurations that were made to support interaction and discussion for their students who were ELLs. The chi-square comparing the preinterns with the student-teaching interns was not statistically significant, χ²(2, N = 73) = 2.55, p = .28, V = .19. In this case, more of the student-teaching interns met this indicator (70.6%) than the preinterns (56.4%), but the difference was not statistically significant.
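Each 3 by 2 chi-square comparison above can be recomputed directly from the counts in Table 2. The following is a minimal sketch using the Indicator 1 counts as an example. The function name `chi_square_independence` is hypothetical, and the p value uses the closed form exp(-x/2), which holds only for 2 degrees of freedom; this is a plain Pearson chi-square and is not necessarily the software procedure the authors used.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square, df, and Cramér's V for a table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and rating.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    v = math.sqrt(chi2 / (n * (min(len(row_totals), len(col_totals)) - 1)))
    return chi2, df, v

# Indicator 1 counts from Table 2, ordered [Not Met, Met Acceptable, Met Target].
indicator_1 = [
    [20, 11, 8],  # preinterns (n = 39)
    [9, 24, 1],   # student-teaching interns (n = 34)
]
chi2, df, v = chi_square_independence(indicator_1)
# For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
p = math.exp(-chi2 / 2) if df == 2 else None
print(f"chi2({df}, N = 73) = {chi2:.1f}, p = {p:.3f}, V = {v:.2f}")
```

Run on the Indicator 1 counts, the sketch reproduces the reported values for that indicator (14.2, .001, and .44) to the precision shown; substituting another indicator's row of counts gives the corresponding comparison.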
Practice/Application

Indicator 6 assessed whether the TWSs of our teacher candidates showed evidence of the use of hands-on materials and/or manipulatives, and additional activities for their students who were ELLs, so they could practice new content knowledge, apply content and language knowledge, and integrate language skills in the classroom. Once again, the chi-square analysis comparing the preinterns with the student-teaching interns was not statistically significant, χ²(2, N = 73) = 1.76, p = .42, V = .16. On the positive side, 61.5% of the preinterns and 70.6% of the student-teaching interns met this indicator at the level of acceptable or higher, although few of our teacher candidates were at the target level on this indicator (7.7% for the preinterns and 2.9% for the student-teaching interns).

Review

Indicator 7 looked at adaptations or modifications of the lesson designed for students who were ELLs that provided them with a comprehensive review of key vocabulary, key concepts, and lesson objectives. This time, the chi-square analysis comparing the preinterns with the student-teaching interns was statistically significant, χ²(2, N = 73) = 7.6, p = .02, V = .32. The student-teaching interns were most likely to meet this indicator at the acceptable level (70.6%), but the preinterns were most likely to not meet this indicator (56.4%). Unfortunately, only 2.9% of the student-teaching interns met this indicator at the targeted level.

Assessment

The final indicator (Indicator 8) looked at whether or not the teacher candidates made any adaptations or modifications to their assessments for their students who were ELLs, by providing explanation of the rules in the students’ first language, or by providing simplified sentences or visual aids.
The chi-square test to compare the preinterns with the student-teaching interns on this indicator was not statistically significant, χ²(2, N = 73) = 4.9, p = .09, V = .26, even though 44.1% of the student-teaching interns met this indicator at the target level when compared to only 20.5% of the preinterns. It was encouraging to note that the majority of the teacher candidates met this indicator at the acceptable level or higher (59% for the preinterns and 67.6% for the student-teaching interns).

Effect of EDUC 402 Instructor

After examination of the TWSs produced by the student-teaching interns, it appeared to the investigators that the TWSs of teacher candidates in some sections of EDUC 402 contained more adaptations or modifications for their ELL students than the TWSs of teacher candidates in other sections. When this was investigated further, it was discovered that the same professor had taught those sections of EDUC 402. Although we had not planned to look at the effect of the EDUC 402 instructor on the TWS performances of our teacher candidates, we realized that there might be a reason to do so, because it was also known to us that this instructor was the only EDUC 402 instructor who had completed SIOP® I training. As a result, we decided to compare the total scores received by this instructor’s teacher candidates (M = 7.50; SD = 3.96) with those of the other instructors combined (M = 4.25; SD = 3.77). The independent t-test revealed that the modified SIOP® protocol total scores were statistically significantly higher (M = 7.50 versus M = 4.25) for the teacher candidates who took EDUC 402 with the instructor who had completed SIOP® I training, t(32) = 2.44, p = .02, d = .84. Given Cohen’s (1988) benchmark criteria for interpretation, this was a large effect.
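The size of the instructor difference can be illustrated from the reported means and standard deviations alone. The sketch below is an approximation under a stated assumption: the sizes of the two instructor groups are not reported, so the two variances are weighted equally (root mean square of the SDs) rather than pooled by degrees of freedom. The function names are hypothetical, and the "negligible" label for values below .2 is an added convenience rather than one of Cohen's (1988) categories.

```python
import math

def cohens_d_equal_weight(m1, sd1, m2, sd2):
    """Cohen's d using the root mean square of the two SDs (assumes similar group sizes)."""
    return (m1 - m2) / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)

def cohen_label(d):
    """Label |d| with Cohen's (1988) conventional benchmarks of .2, .5, and .8."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label added for this sketch, not one of Cohen's terms

# SIOP-trained instructor's candidates versus the other instructors combined.
d = cohens_d_equal_weight(7.50, 3.96, 4.25, 3.77)
print(f"d = {d:.2f} ({cohen_label(d)})")
```

Under the equal-weight assumption the sketch yields a d of about .84, which falls above Cohen's .8 benchmark for a large effect.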
This instructor effect suggests that SIOP® training is beneficial for education faculty members, so that they can prepare teacher candidates to address the needs of diverse learners and to support the learning of all students.

Discussion

This study demonstrated that Teacher Work Sample (TWS) performances can be examined in multiple ways by taking a deeper look at teacher candidates’ abilities to make adaptations and modifications for individual student needs as evidenced in their TWSs. Although teacher candidates performed well overall on their TWSs when judged from the perspective of the regular TWS scoring rubric, when the same TWS performances were judged from the perspective of modified SIOP® protocol (Echevarria, Vogt, & Short, 2004; 2008) indicators of adaptations and modifications for English language learners, the teacher candidates did not perform well. The contrast of the performance ratings of the teacher candidates’ TWSs when viewed from the perspective of a different scoring rubric raised concerns about the extent to which the standards-based judgments from the regular TWS scoring rubric were valid indicators of teacher candidates’ abilities to make appropriate adaptations or modifications to support the learning of all students. In concert with the conclusions of Lucas and Grinberg (2008) and the recommendations of Lucas, Villegas, and Freedson-Gonzales (2008), the results of this study indicate that changes are needed to strengthen teacher education programs and greater attention should be focused on preparing all teachers to teach ELLs.

To be fair, it should be pointed out that TWS assessments were developed to measure a set of generic teaching processes and a limited set of targeted teaching standards (Denner, Norman, Salzman, Pankratz, & Evans, 2004; Denner, Salzman, & Bangert, 2001; Schalock, Schalock, & Girod, 1997). A TWS was never intended to be a measure of all of the important aspects of teaching (Denner, Salzman, & Bangert, 2001).
Clearly, inferences about teaching abilities based on the scores from a regular TWS scoring rubric must be limited to the targeted standards, since not every national, state, or institutional standard is evaluated or was meant to be evaluated by a TWS. As this study showed, inferences about the abilities of teacher candidates to meet specialty-area teaching standards, in this case standards related to supporting the achievement of English language learners, were not warranted. Indeed, similar to the present study, Pratt (2002), after analyzing fifty of the Western Oregon University TWSs against some of the National Council of Teachers of Mathematics (NCTM) standards, concluded there was only weak alignment or no alignment with those specialty-area standards. Hence, valid interpretation of TWS scores should be limited to the purposes for which they were intended and generalized only if additional supporting evidence justifies other uses.

Nevertheless, this study also demonstrated that TWS performances are a rich source of data that can be analyzed from alternative perspectives to provide a basis for program improvement. When we looked at the TWS performances of our teacher candidates using a rubric that focused on indicators of the extent to which they created instructional opportunities to meet the needs of their students who were ELLs, we found that every one of the indicators was met by some of the teacher candidates. Thus, TWS performances contain a great deal of information about candidates’ teaching capabilities, if teacher educators are willing to spend the time to look at them and to apply appropriate assessment tools.
Unfortunately, in the present study, the data also revealed that a high percentage of student-teaching interns at program completion did not meet the modified SIOP® protocol indicators for their students who were ELLs, and a very low percentage of them met the indicators at the target level (see Table 2 for the percentage meeting and not meeting each indicator in this study). Plainly, teacher preparation programs can do a better job of preparing teachers to meet the needs of their students who are ELLs.

The data from this study also confirmed the concern of Lucas et al. (2008, p. 10) that “issues of language are likely to get lost within diversity courses.” The findings indicate that the training provided by EDUC 402 Adaptations for Diversity, as a senior-level course that accompanied the student-teaching internship, came too late in the program and was not sufficiently focused on the needs of ELLs. Although a modest benefit on the TWS was found for student-teaching interns who had taken this course compared to the preinterns who had not yet taken it, there was no overall difference between the student-teaching interns and the preinterns on their modified SIOP® scores. As a result, a new junior-level course has been developed and approved to replace EDUC 402. The new course devotes one credit to the SIOP® Model (Echevarria, Vogt, & Short, 2004; 2008) for making adaptations and modifications for English language learners. The SIOP® Model portion of the course will be taught exclusively by college faculty members who have the knowledge and skills to teach ELLs or who have attended professional training workshops on teaching ELLs. In concert with the recommendation of Lucas and Grinberg (2008, p. 628) “to conduct research to get a better sense of where we are starting from” regarding the preparation of teachers to teach ELLs, this study has provided us with baseline data that can be used to determine whether the program changes have the intended effect.
Systematic study of the effects of program changes, to assure that the intended program strengthening occurs without adverse effects, is also in accordance with the expectations of the National Council for Accreditation of Teacher Education (2008) unit accreditation requirements. This study should encourage other teacher preparation programs to inquire into their teacher candidates’ levels of preparation for teaching students who are ELLs.

Finally, we found that the Adaptations for Diversity (EDUC 402) course instructor who had completed some professional training related to teaching ELLs had a greater effect on the overall preparation of the enrolled teacher candidates to make adaptations and modifications for their students who were ELLs than did the other course instructors. This finding supports the value of this type of professional development for other education faculty members. The finding also supports the concern expressed by Lucas et al. (2008) regarding the need for professional development for teacher educators. They found that teacher educators generally do not have the knowledge and skills to teach ELLs (Lucas & Grinberg, 2008). Lucas and Grinberg (2008) contend that topic and course changes in teacher education will have no impact if education faculty members teaching the courses do not have the knowledge and skills to prepare teachers to teach ELLs. They strongly recommend professional development be “an integral part of any effort to modify teacher education to prepare classroom teachers to teach ELLs” (p. 625). In confirmation of their contention, this study showed that our course that was partly designed to address the preparation of teachers to teach students who are ELLs was more effective when taught by the sole faculty member who had completed some relevant training. As a result of this study, a professional workshop on teaching ELLs was held for our education faculty.
We have also sent more of our education faculty members to receive such professional development training. In addition, more attention has been drawn to ELLs in the courses across the curriculum in our teacher education programs. Upon examination of the preparation levels of their teacher candidates, other teacher preparation programs will likely want to consider these and other types of professional development as recommended by Lucas et al. (2008) for their education faculty members preparing classroom teachers to be linguistically responsive teachers.

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Denner, P., Newsome, J., & Newsome, J. D. (2005, February). Generalizability of teacher work sample performance assessments across occasions of development. A research report presented at the Annual Meeting of the Association of Teacher Educators, Chicago, IL.

Denner, P. R., Norman, A. D., Salzman, S. A., Pankratz, R. S., & Evans, C. S. (2004). The Renaissance Partnership teacher work sample: Evidence supporting score generalizability, validity, and quality of student learning assessment. In E. M. Guyton & J. R. Dangel (Eds.), Teacher Education Yearbook XII: Research Linking Teacher Preparation and Student Performance. Dubuque, IA: Kendall/Hunt.

Denner, P. R., Salzman, S. A., & Bangert, A. W. (2001). Linking teacher assessment to student performance: A benchmarking, generalizability, and validity study of the use of teacher work samples. Journal of Personnel Evaluation in Education, 15(4), 287-307.

Denner, P. R., Salzman, S. A., Newsome, J. D., & Birdsong, J. R. (2003). Teacher work sample assessment: Validity and generalizability of performances across occasions of development. Journal for Effective Schools, 2(1), 29-48.

Echevarria, J., Short, D. J., & Vogt, M. (2008). Implementing the SIOP® through effective professional development and coaching. Boston, MA: Pearson.
Echevarria, J., Vogt, M., & Short, D. J. (2004). Making content comprehensible for English learners: The SIOP® Model (2nd ed.). Boston, MA: Pearson.

Echevarria, J., Vogt, M., & Short, D. J. (2008). Making content comprehensible for English learners: The SIOP® Model (3rd ed.). Boston, MA: Pearson.

Idaho State University. (2007). Idaho State University undergraduate catalog 2007-2008. Pocatello, ID: Author.

Lucas, T., & Grinberg, J. (2008). Responding to the linguistic reality of mainstream classrooms: Preparing all teachers to teach English language learners. In M. Cochran-Smith, S. Feiman-Nemser, & J. McIntyre (Eds.), Handbook of research on teacher education: Enduring issues in changing contexts (3rd ed., pp. 606-636). Mahwah, NJ: Erlbaum.

Lucas, T., Villegas, A. M., & Freedson-Gonzales, M. (2008). Linguistically responsive teacher education: Preparing classroom teachers to teach English language learners. Journal of Teacher Education, 59(4), 361-373.

National Center for Educational Statistics. (2008). The condition of education, Section 1. Participation in education: Elementary/secondary education, Indicator 7: Language minority school-age children. Washington, DC: U.S. Department of Education. Retrieved August 1, 2008 from http://nces.ed.gov/programs/coe/2008/section1/indicator07.asp

National Council for Accreditation of Teacher Education. (2008). Professional Standards for the Accreditation of Teacher Preparation Institutions. Washington, DC: NCATE.

Pratt, E. O. (2002). Aligning mathematics teacher work sample content with selected NCTM standards: Implications for preservice teacher education. Journal of Personnel Evaluation in Education, 16, 175-190.

Schalock, H. D., Schalock, M., & Girod, G. (1997). Teacher work sample methodology as used at Western Oregon State College. In J. Millman (Ed.), Grading teachers, grading schools: Is student achievement a valid evaluation measure? (pp. 15-45). Thousand Oaks, CA: Corwin Press.
Authors

Shu-Yuan Lin is an associate lecturer in the Department of Educational Foundations in the College of Education at Idaho State University. Her current research interests and special projects are focused on computer-based prewriting strategies, English as a second/new/foreign language instruction, standards-based teacher assessments, technology integration in K-16 instruction, and cultural and linguistic diversity in higher education.

Peter Denner is the assistant dean for assessment and a professor in the Department of Educational Foundations in the College of Education at Idaho State University. His current research interests are focused on standards-based performance assessments of teacher quality and the linking of teacher performance assessments, particularly Teacher Work Samples, to the learning of P-12 students.

Angela Luckey is an emeritus associate professor in the Department of Educational Foundations in the College of Education at Idaho State University. Her research interests focused on bilingual education and English as a second language instruction.