CIMA Research Brief March 2010

CLASSIC© Programming and Teacher Pedagogy: Achieving CREDE Standards for Effective Pedagogy

Research Brief
Center for Intercultural and Multilingual Advocacy (CIMA)
March 2010

Research to date on the effectiveness of the CLASSIC© program has focused primarily on changes in participants’ attitudes and self-efficacy. While results from survey- and interview-based work have been encouraging, the methods have most often been pre-experimental. This research has been very informative in shaping the program over the years; however, it does not provide rigorous empirical evidence of the effects on teachers’ actual practice. Current CIMA objectives require stronger evidence that the program is altering participants’ teaching practices in important ways, not only to affirm existing program features and identify possible shortfalls, but also to respond to the need for greater accountability.

If done correctly, systematic classroom observation can provide more reliable and objective measures than participant self-report and anecdotal accounts. Observational methods focus on behaviors that can be witnessed, rather than on subjective accounts or on what participants believe to be the case. The findings reported in this research brief are based on an observational measure of effective pedagogy for diverse learners. This research explores the impact of the CLASSIC© program on the teaching practices of the participants in their actual classroom settings.

CREDE’s Standards for Effective Pedagogy & the Standards Performance Continuum

The CREDE Standards for Effective Pedagogy identify five descriptors of the potent features for educational success of diverse at-risk populations (Tharp, Dalton, Estrada, 2000). The CREDE Standards for Effective Pedagogy are:

I. Standard I: Teachers and Students Producing Together (Joint Productive Activity, JPA)
II. Standard II: Developing Language and Literacy Across the Curriculum (LL)
III. Standard III: Making Meaning – Connecting School to Students’ Lives (Contextualization, CTX)
IV. Standard IV: Teaching Complex Thinking (CC)
V. Standard V: Teaching Through Instructional Conversation (IC)

CREDE designed and developed an observational measure of its five standards, the Standards Performance Continuum (SPC), to serve several purposes: a) as a measure of the effectiveness of its professional development programs; b) to provide developmental guidelines and constructive feedback for teachers; c) as a catalyst for school reform. The NEA has adopted the SPC for the purposes of measuring the effectiveness of professional development.

Tharp and Dalton (2007) proposed that the Five Standards for Effective Pedagogy provide a lens for disaggregating pedagogy from teaching, thereby clarifying the pedagogy’s functional value for strengthening teaching. To demonstrate the functional value of the pedagogy educators learn in the CLASSIC© program for strengthening their teaching practice, the Standards Performance Continuum was identified as a key tool for data collection. Additional elements of the SPC that aligned with the CLASSIC© program included: (a) emphasis on academic language development, (b) emphasis on contextualizing academic concepts within the experience and knowledge that students bring from home, community, and school, (c) emphasis on student engagement, and (d) pre-assessment of students’ background knowledge (Echevarria, Vogt, & Short, 2000; Herrera, Murry, & Morales Cabral, 2007).

The SPC defines five levels of enactment for each standard:

0) Not Observed - the standard is not present
1) Emerging - elements of the standard are implemented at a minimal level
2) Developing - the standard is partially implemented
3) Enacting - the standard is fully implemented
4) Integrating - at least three standards are implemented simultaneously

Additionally, these levels are operationally defined in the context of each standard individually so that ratings can be arrived at objectively based on well-defined criteria.

CLASSIC© Continuum of Best Practice (CCBP)

The educational philosophy described in CREDE’s explanation of standards and indicators is not fully reflected in the existing Standards Performance Continuum (SPC). For example, the SPC does not reflect critical concepts related to second language acquisition research and theory. Therefore, adaptations to the SPC were made to reflect CLASSIC© program fundamentals of effective practice. These program fundamentals, which align with and enhance the five CREDE standards, include: (a) low-risk learning environment (Krashen, 1981, 1982), (b) incorporation of content and language objectives (Echevarria, Vogt, & Short, 2000; TESOL, 2003), (c) grouping configurations that take all four dimensions of biography into account (Thomas & Collier, 1997; Herrera & Murry, 2005), and (d) use of native language in academic and linguistic development (Cummins, 1981; Escamilla, 2006).

To maintain the efficacy of the SPC developed by CREDE, the CLASSIC© program fundamentals were aligned to the five standards of the SPC. Once aligned, they were articulated as individual behaviors that could be measured according to the same five levels of enactment identified on the SPC. The example below, from Standard II (Language & Literacy Development), shows the indicator we added for the use of the native language in academic and linguistic development:

II. Language & Literacy Development: Native Language

Level             Descriptor
0 Not Observed    No evidence of native language in environment or instruction.
1 Emerging        Minimal evidence of native language in environment and/or instruction.
2 Developing      Occasional use of the native language during the lesson.
3 Enacting        Explicit support of students’ use of the native language during the lesson.
4 Integrating     Consistent, structured opportunities for students to use their native language as a resource during the lesson.

Changing Pedagogy

Two studies are presented here that describe the application of the CCBP to provide empirical evidence of the effects of the CLASSIC© professional development program on the classroom practices of inservice teachers. Study 1 compares teachers at the end of the program to teachers at the beginning of the program. Study 2 describes the effects of teaching using instructional strategies specifically designed to increase engagement and promote linguistic and academic development for culturally and linguistically diverse (CLD) students.

STUDY 1

The CCBP was used to explore differences between two groups of program participants: a group of teachers in their final course of the program (cohort 1), and a group at the start of their first course (cohort 2). It was hypothesized that cohort 1 would demonstrate higher levels on the standards than cohort 2, since these teachers were nearing the completion of the program and therefore had more experience and practice with all aspects of the program.

Method

Inservice teachers across 4 school districts in Northeast Kansas who were enrolled in the CLASSIC© program participated in the observations conducted in Study 1. Seventy cohort 1 teachers and 72 cohort 2 teachers were observed in the fall semester of 2009. There were no significant differences between the cohorts in the grade levels and content areas taught. By grade level, the sample comprised 38.5% grades K-3, 36.9% grades 4-8, and 24.6% grades 9-12. Reported years of teaching experience were also equivalent across cohorts; overall, 15.9% of participants reported 1-3 years of experience, 45.7% reported 4-10 years, 18.1% reported 11-15 years, and 20.3% reported 16 or more years.

Observers were trained on the CCBP rubric before going out into the field. Training consisted of describing the operational definitions for each of the items at each level in the scale. To arrive at agreement among raters, videos of classroom instruction were viewed and discussed in terms of how they might be rated. Each trainee rated an additional three videos individually, and six live field observations were jointly conducted. Data from each trainee for all nine observations were tested for inter-rater reliability. Alpha above .85 was achieved on all but 5 of 22 items. Ratings for the five standards subscales achieved alphas between .89 and .98, and an overall alpha of .98 was achieved on the composite score.

Participants were informed of, and consented to, having observers visit their classrooms during the course of the semester. Observations were scheduled with as little advance notice as possible to limit the amount of preparation and so capture the teachers’ typical instructional conditions. Observations were conducted in the teachers’ normal grade level and content area classroom settings. Of these classrooms, 96.3% contained at least one CLD student; the most commonly observed number of CLD students was 4, and the maximum number observed was 18. Similarly, 71.9% of the classrooms contained at least one ELL; 1-4 ELLs were most common, and 17 was the maximum. Observers made an effort to capture one complete lesson, and as such, the duration of the observations ranged from 30 to 90 minutes, with the majority lasting more than 45 minutes.

Results and Discussion

Table 1.1 shows descriptive statistics by cohort on each of the indicators in the rubric. This provides information about the average (mean) score on each of the measures and can be useful in describing the level the cohort 1 participants achieved as a whole on each of the CCBP indicators. For example, the average cohort 1 teacher at the end of the program attained between the Developing and Enacting level (mean of 2.41) on the Activity Connections indicator, but remained at the Emerging stage (mean of 0.86) on Native Language support. This information can be useful for reflecting on the success of the program in delivering on specific goals.

Summing the indicators under each standard and dividing by the number of indicators in that standard provides an average score for each of the five standards (Table 1.1). These scores are conceptually and empirically comparable to the original SPC rubric, where scores can range from 0 to 4 for each standard. For example, the mean cohort 2 score on the Contextualization standard was 1.20, meaning that the average cohort 2 teacher was observed to be at the Emerging stage. Working with average scores for each standard also reduces the number of factors that are tested statistically, thus limiting the familywise type I error rate.

Statistical tests of the differences between the two cohorts’ mean scores were conducted using the average scores for each standard (Table 1.2). Additionally, an overall average across all indicators was tested. This composite average was obtained by summing all 22 indicators and dividing by 22. It can be interpreted as an overall level of attainment on the measure of effective pedagogy and has the same levels as the individual indicators (0 to 4/Not Observed to Integrating). On average, cohort 1 teachers were observed to be at the Developing stage with a mean score of 1.94, while cohort 2 teachers were, on average, between the Emerging and Developing stage with a mean score of 1.59.

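The averaging just described can be sketched in a few lines of Python. This is an illustration rather than the authors’ scoring procedure: the names `RUBRIC`, `COHORT1_MEANS`, `standard_averages`, and `composite_average` are hypothetical, the grouping of indicators under standards is read from Table 1.1, and the cohort 1 indicator means from that table stand in for one teacher’s ratings. Because averaging is linear, the result equals the mean of per-teacher averages only if every teacher was rated on every indicator, which is assumed here.

```python
# Indicators grouped by standard, as read from Table 1.1 (22 indicators in total).
RUBRIC = {
    "I. Joint Productive Activity": [
        "Learning Environment", "Teacher Collaboration", "TPSI",
        "Partner/Grouping Determination", "Activity Connections",
    ],
    "II. Language & Literacy Development": [
        "Listening, Speaking, Reading, Writing",
        "Questioning, Rephrasing, Modeling",
        "Native Language", "Language/Literacy Background Knowledge",
    ],
    "III. Contextualization": [
        "Funds of Knowledge, Prior Knowledge, Academic Knowledge",
        "Assets/Community of Learners", "CLD Biography Connections",
    ],
    "IV. Challenging Activities": [
        "Accommodations", "Content Objectives & Language Objectives",
        "Standards/Expectations", "Affective Filter",
        "Feedback (formative assessment)",
    ],
    "V. Instructional Conversation": [
        "Eliciting Student Talk", "Known to Unknown", "BICS/CALP",
        "Revoicing", "Student Articulate Views",
    ],
}

# Cohort 1 indicator means from Table 1.1, used in place of one teacher's ratings.
COHORT1_MEANS = {
    "Learning Environment": 2.39, "Teacher Collaboration": 2.19, "TPSI": 2.07,
    "Partner/Grouping Determination": 1.36, "Activity Connections": 2.41,
    "Listening, Speaking, Reading, Writing": 2.36,
    "Questioning, Rephrasing, Modeling": 2.23,
    "Native Language": 0.86, "Language/Literacy Background Knowledge": 1.53,
    "Funds of Knowledge, Prior Knowledge, Academic Knowledge": 1.60,
    "Assets/Community of Learners": 1.41, "CLD Biography Connections": 1.54,
    "Accommodations": 2.19, "Content Objectives & Language Objectives": 0.90,
    "Standards/Expectations": 2.33, "Affective Filter": 2.37,
    "Feedback (formative assessment)": 2.67,
    "Eliciting Student Talk": 2.24, "Known to Unknown": 1.97, "BICS/CALP": 2.29,
    "Revoicing": 1.99, "Student Articulate Views": 1.83,
}


def standard_averages(ratings):
    """Sum the indicator scores under each standard and divide by their count."""
    return {
        standard: sum(ratings[name] for name in names) / len(names)
        for standard, names in RUBRIC.items()
    }


def composite_average(ratings):
    """Sum all indicator scores and divide by the number of indicators (22)."""
    names = [name for group in RUBRIC.values() for name in group]
    return sum(ratings[name] for name in names) / len(names)


for standard, score in standard_averages(COHORT1_MEANS).items():
    print(f"{standard}: {score:.3f}")
print(f"Composite: {composite_average(COHORT1_MEANS):.3f}")
```

The values this produces agree with the cohort 1 Average Score rows of Table 1.1 and with the composite mean of 1.94 to within rounding. Note that the composite is the average over all 22 indicators, not the average of the five standard averages; the two differ because the standards contain different numbers of indicators.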
The mean difference between cohorts was 0.35, corresponding to an effect size of d=0.79, which can be interpreted as an empirically large difference between groups. All tests attained statistical significance when corrected for the familywise type I error rate for running 6 tests at an overall alpha = .05. While some effects were larger than others (Joint Productive Activity effect of d=0.71, as compared to the Contextualization effect of d=0.52), all effect sizes were reasonably large, providing evidence of a higher level of enactment on all five standards for participants who were near the completion of the program (cohort 1).

Aside from the amount of experience in the program, the two cohorts were not significantly different on other, arguably important, variables (years of experience, grade levels, content areas, etc.). Although these groups could differ on some unaccounted-for characteristics, evidence of a significantly higher level of demonstrated standards in cohort 1 is nonetheless very encouraging.

STUDY 2

In Study 1, neither group was instructed to make any special accommodations to their lesson plan for the purposes of being observed. Study 2 looks at the levels of standards demonstrated by teachers experienced in the program who were using an instructional strategy that they learned in the CLASSIC© program. We predicted that these teachers would score even higher on our measure of pedagogical standards when observed while using strategies specifically designed to better accommodate the CLD student.

Method

All participants in Study 2 were cohort 1 teachers from Study 1 described above. The design for Study 2 consisted of a within-subjects comparison of instructional conditions: “business as usual” vs. strategy. Observations from Study 1 served as the “business as usual” condition. Participants were then observed a second time with explicit instructions to select and utilize a strategy they had learned in the program. This second observation served as the strategy condition. Observation durations ranged from 30 to 60 minutes, with the majority exceeding 45 minutes. Observers from Study 1 conducted all Study 2 observations.

Results and Discussion

Table 2.1 presents the means and standard deviations for the scores on each of the individual indicators in the CCBP. Again, these indicators were averaged for each of the five standards and for the composite score, as described in Study 1. Results of the dependent samples t-tests on the mean differences in these scores are provided in Table 2.2. Very large effect sizes (d > 1) were found, with the strategy condition producing higher scores than “business as usual” instruction for each of the five standards and on the composite score.

The effect of teachers’ use of an instructional strategy can also be conceptualized in terms of the levels attained on our measure of standards for effective pedagogy in diverse classrooms. When teachers utilized a strategy, we witnessed an average increase of anywhere from three quarters to over one whole point on the scale in each of the five standards. Indeed, the mean composite score placed the average teacher at the Developing level (1.99) under “business as usual” conditions, and at the Enacting level (2.82) when using a strategy.

GENERAL DISCUSSION

Our two observational studies provide evidence of significant increases in the level of pedagogical standards demonstrated as a function of: a) teachers’ participation in the CLASSIC© program; and b) the use of instructional strategies designed to increase the quality of instruction for CLD students. They also indicate that the effects of a) can be further increased through b).
Remember that teachers in Study 2 were the cohort 1 teachers who were nearing completion of the program, and while their typical instructional practices appeared to have improved as a result of their participation in the program, use of the strategies was effective in producing an even more salient difference.

Future research should focus on replicating these results with teachers participating in the CLASSIC© program at other sites across the country. Currently the program is being offered in six states (Kansas, Arkansas, Iowa, North Carolina, New Mexico, and Pennsylvania), presenting an opportunity to replicate these findings in a wider variety of demographic settings. More interestingly, we hope to extend these findings by exploring the relationship between pedagogical standards and student outcomes, both in terms of behaviors in the classroom and in relation to academic achievement. If the characteristics that we have deemed important to the effective instruction of diverse learners are impactful, then we should expect a significant, positive relationship between our measure of best practice and student behaviors that are conducive to learning (e.g. participation, attention, academic talk, etc.). These effects, in turn, should be positively related to improved academic outcomes.

Table 1.1
Study 1: Descriptive Statistics

Standard / Indicator                                        Cohort 1        Cohort 2
                                                            Mean   S.D.     Mean   S.D.
I. Joint Productive Activity
  Learning Environment                                      2.39   0.73     2.15   0.62
  Teacher Collaboration                                     2.19   0.84     1.99   0.96
  Total Group, Partner, Small Group, Individual (TPSI)      2.07   0.95     1.40   1.11
  Partner/Grouping Determination                            1.36   0.92     1.01   0.80
  Activity Connections                                      2.41   0.81     1.86   0.64
  I. Average Score                                          2.08   0.61     1.68   0.51
II. Language & Literacy Development
  Listening, Speaking, Reading, Writing                     2.36   0.74     2.08   0.69
  Questioning, Rephrasing, Modeling                         2.23   0.82     2.00   0.84
  Native Language                                           0.86   0.89     0.31   0.76
  Language/Literacy Background Knowledge                    1.53   0.86     1.26   0.61
  II. Average Score                                         1.74   0.52     1.41   0.46
III. Contextualization
  Funds of Knowledge, Prior Knowledge, Academic Knowledge   1.60   0.73     1.29   0.76
  Assets/Community of Learners                              1.41   1.10     0.88   0.89
  CLD Biography Connections                                 1.54   0.72     1.44   0.55
  III. Average Score                                        1.52   0.66     1.20   0.56
IV. Challenging Activities
  Accommodations                                            2.19   0.89     1.60   0.90
  Content Objectives & Language Objectives                  0.90   0.68     0.61   0.72
  Standards/Expectations                                    2.33   0.65     1.94   0.80
  Affective Filter                                          2.37   0.85     2.31   0.80
  Feedback (formative assessment)                           2.67   0.65     2.47   0.73
  IV. Average Score                                         2.09   0.51     1.79   0.52
V. Instructional Conversation
  Eliciting Student Talk                                    2.24   0.69     1.83   0.86
  Known to Unknown                                          1.97   0.88     1.47   0.95
  BICS/CALP                                                 2.29   0.75     1.79   0.77
  Revoicing                                                 1.99   0.81     1.81   0.74
  Student Articulate Views                                  1.83   0.92     1.54   0.73
  V. Average Score                                          2.06   0.56     1.69   0.58

N=142 (cohort 1 n=70; cohort 2 n=72)

Table 1.2
Study 1: Tests of the Difference Between Group Means

Standard                             Cohort 1       Cohort 2       Mean Diff.   t       p          d
                                     Mean (S.D.)    Mean (S.D.)    (1-2)
I. Joint Productive Activity         2.08 (0.61)    1.68 (0.51)    0.40         4.24    < .001*    0.71
II. Language & Literacy Development  1.74 (0.52)    1.41 (0.46)    0.33         4.00    < .001*    0.67
III. Contextualization               1.52 (0.66)    1.20 (0.56)    0.32         3.01    .002*      0.52
IV. Challenging Activities           2.09 (0.51)    1.79 (0.52)    0.31         3.51    .001*      0.58
V. Instructional Conversation        2.06 (0.56)    1.69 (0.58)    0.37         3.80    < .001*    0.63
Composite Average                    1.94 (0.47)    1.59 (0.41)    0.35         4.74    < .001*    0.79

df=140
* significant at p < .008 (Bonferroni correction for familywise error rate with alpha = .05/6 tests)

Table 2.1
Study 2: Descriptive Statistics

Standard / Indicator                                        Condition 1     Condition 2
                                                            Mean   S.D.     Mean   S.D.
I. Joint Productive Activity
  Learning Environment                                      2.43   0.78     3.26   0.76
  Teacher Collaboration                                     2.31   0.82     2.90   0.77
  Total Group, Partner, Small Group, Individual (TPSI)      2.17   0.90     3.00   1.08
  Partner/Grouping Determination                            1.40   0.92     1.91   1.00
  Activity Connections                                      2.45   0.86     3.41   0.73
  I. Average Score                                          2.15   0.62     2.90   0.61
II. Language & Literacy Development
  Listening, Speaking, Reading, Writing                     2.41   0.70     3.19   0.69
  Questioning, Rephrasing, Modeling                         2.33   0.78     3.03   0.73
  Native Language                                           0.78   0.84     1.22   1.00
  Language/Literacy Background Knowledge                    1.62   0.90     2.69   0.78
  II. Average Score                                         1.78   0.54     2.53   0.56
III. Contextualization
  Funds of Knowledge, Prior Knowledge, Academic Knowledge   1.64   0.72     2.79   0.81
  Assets/Community of Learners                              1.52   1.10     2.86   0.78
  CLD Biography Connections                                 1.57   0.52     2.71   0.80
  III. Average Score                                        1.57   0.68     2.79   0.56
IV. Challenging Activities
  Accommodations                                            2.19   0.91     3.00   0.70
  Content Objectives & Language Objectives                  0.91   0.71     1.14   0.51
  Standards/Expectations                                    2.41   0.65     3.45   0.60
  Affective Filter                                          2.33   0.87     3.52   0.76
  Feedback (formative assessment)                           2.66   0.69     3.05   0.63
  IV. Average Score                                         2.10   0.53     2.83   0.42
V. Instructional Conversation
  Eliciting Student Talk                                    2.33   0.66     3.14   0.66
  Known to Unknown                                          2.02   0.89     2.88   0.73
  BICS/CALP                                                 2.33   0.74     3.21   0.81
  Revoicing                                                 2.10   0.79     2.66   0.74
  Student Articulate Views                                  1.93   0.95     2.93   0.86
  V. Average Score                                          2.14   0.59     2.96   0.55

N=58
Condition: 1=business as usual; 2=CLASSIC© strategy

Table 2.2
Study 2: Within-Subjects Tests of the Mean Difference Between Conditions

Standard                             Condition 1    Condition 2    Mean Diff.   t       p          d
                                     Mean (S.D.)    Mean (S.D.)    (2-1)
I. Joint Productive Activity         2.15 (0.62)    2.90 (0.61)    0.74         9.05    < .001*    1.17
II. Language & Literacy Development  1.78 (0.54)    2.53 (0.56)    0.75         9.07    < .001*    1.19
III. Contextualization               1.57 (0.68)    2.79 (0.56)    1.21         10.85   < .001*    1.42
IV. Challenging Activities           2.10 (0.52)    2.83 (0.42)    0.73         9.46    < .001*    1.24
V. Instructional Conversation        2.14 (0.59)    2.96 (0.55)    0.82         9.08    < .001*    1.19
Composite Average                    1.99 (0.49)    2.82 (0.42)    0.82         12.51   < .001*    1.64

df=57
* significant at p < .008 (Bonferroni correction for familywise error rate with alpha = .05/6 tests)
Condition: 1=business as usual; 2=CLASSIC© strategy
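
For readers who want to check the statistics in Tables 1.2 and 2.2, the following Python sketch recomputes them from the reported summary values. The brief does not state which effect size formulas were used, so the choices here are assumptions: Cohen's d with a pooled standard deviation for the independent cohorts in Study 1, and d computed as t divided by the square root of n for the within-subjects design in Study 2. All function names are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d_pooled(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for independent groups (assumed formula, not confirmed by the brief)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic with df = n1 + n2 - 2."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def paired_d_from_t(t, n):
    """Within-subjects effect size d = t / sqrt(n) (assumed convention)."""
    return t / math.sqrt(n)


# Study 1 composite average (Table 1.2): cohort 1 n=70, cohort 2 n=72.
d_study1 = cohens_d_pooled(1.94, 0.47, 70, 1.59, 0.41, 72)
t_study1 = independent_t(1.94, 0.47, 70, 1.59, 0.41, 72)

# Study 2 composite average (Table 2.2): t=12.51 with df=57, so n=58.
d_study2 = paired_d_from_t(12.51, 58)

print(f"Study 1 composite: d = {d_study1:.2f}, t = {t_study1:.2f}")
print(f"Study 2 composite: d = {d_study2:.2f}")
print(f"Bonferroni threshold for 6 tests: {bonferroni_alpha(0.05, 6):.4f}")
```

Under these assumptions the Study 1 composite gives d of about 0.79 and t of about 4.73, close to the reported 0.79 and 4.74 (small differences are expected because the inputs are rounded to two decimals). The Study 2 composite gives d of about 1.64, matching Table 2.2, and the Bonferroni threshold of .05/6 is about .0083, consistent with the reported p < .008 criterion.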