HOW TO DO A STATE LONGITUDINAL EVALUATION
MATH AND SCIENCE PARTNERSHIPS PROGRAM
FEBRUARY 2011

WHAT IS A LONGITUDINAL EVALUATION?
• Definition: A longitudinal evaluation collects data on the same set of participants on a set of common measures over time to assess the extent to which these measures change.
• Example: Student achievement data in math are collected from third graders in Baltimore public schools in the spring of 2011. Achievement data from this same cohort would then be collected in the spring of 2012 and 2013. The measure of math achievement would be the Stanford Achievement Tests.
• Example: Teacher practices of middle school science teachers in Baltimore public schools are measured in the spring of 2011, 2012, and 2013. Teacher practices are measured by a classroom observation protocol such as the Reformed Teaching Observation Protocol.

KEY ELEMENTS OF A LONGITUDINAL STUDY
• Participants: Follow the same students or teachers over time. Collecting data each year on a similar, but not the same, set of students would not be considered longitudinal data. For example, collecting student achievement data from 3rd graders in one half of the elementary schools in the first year and then from 3rd graders in the other half of the schools the following year would not be longitudinal data.
• Measures: Use the same measures at each wave of the data collection. Example: For teacher practices, use the same instrument at every wave, such as the Reformed Teaching Observation Protocol.
• Data collection methods: Timing matters. Even when you collect data on the same participants using the same measure, the timing of the data collection has to be the same at each wave. The most obvious violation is collecting fall achievement data in the first year and then spring achievement data in the second year.

Approaches to an MSP Longitudinal Evaluation
• Two different levels to consider
• Within Each MSP Grant: Follow the same students/teachers over time within each grant.
Examine changes over time for that particular PD intervention. Consistency of population, measures, and methods pertains only to that particular grant.
• Across MSP Grantees: Follow the same set of grantees over time using a common set of measures. This makes the most sense to me for different subgroups of MSP grantees, such as the different models of PD identified in the recent MSP Annual Performance Report: Summer Institutes with follow-up, Summer Institute only, and School Year PD. Using the same measures (for example, of teacher practices), you would be able to examine how teacher practices change over time for particular subgroups of MSP grantees.
• At the State level, I would think that collecting longitudinal data across grantees would be of greater interest.

Why Should You Consider a Longitudinal Evaluation?
• Lots of work
  - Resources: Longitudinal evaluations require individuals with expertise in conducting such studies. Funds are needed to collect and analyze the data.
  - Time: Planning the study, monitoring the data collection, and conducting the analyses are time-consuming, particularly for a longitudinal study.
  - Long-term commitment: By its very nature, a longitudinal evaluation requires a multi-year commitment.
• Two main purposes of a longitudinal evaluation: Program process monitoring and program outcome monitoring (Rossi, Lipsey, and Freeman, Evaluation: A Systematic Approach).

Program Process Monitoring and Program Outcome Monitoring
• Program Process Monitoring: “Systematic and continual documentation of key aspects of program performance that assess whether the program is operating as intended.”
• Sample Process Monitoring Questions:
  - Are the duration and intensity of the professional development consistent over time? Is there a fall-off in the number of hours of PD received by teachers? By PD model?
  - Is the content of the professional development the same over time? Has the program changed its emphasis on the skill areas and teaching strategies?
  - Are the PD instructors and coaches the same over time? Is there a lot of turnover?
• Program Outcome Monitoring: “The continual measurement of intended outcomes of the program.”
• Sample Outcome Monitoring Questions:
  - What are the long-term achievement trends of students taught by MSP teachers? Do students who are positively impacted by MSP teachers sustain those gains in successive years?
  - Do MSP teachers use the practices taught in their MSP professional development? Do they continue to use these practices?

Audiences: Who Should Care?
• MSP Program directors:
  - Program fidelity
  - Duration and intensity of PD
  - Short- and long-term trends in teacher practices
  - Short- and long-term trends in student achievement
• Policymakers: Local, state (MSP state coordinators), and federal levels
  - Want to know what’s working and what approaches should be replicated and expanded
  - What approaches should not continue to receive support
  - What are the relative pay-offs, balancing costs and impacts, for different approaches to math and science PD
• Larger community of practitioners
  - Expand the evidence base on math and science professional development
  - How should their PD programs be modified to reflect best practices

Setting up a longitudinal data system
• What are the foundational requirements?
  - Resources (Takes time and money)
  - Expertise (Requires individuals trained in evaluation)
  - Intent of the program (Common set of goals and outcomes)
• Resources and Expertise: Every MSP grantee is required to conduct an independent evaluation of its program with a particular focus on outcomes. In addition, or as part of the evaluation, MSP grantees provide program implementation data in their annual performance reports. MSP grant funds are reserved to conduct the evaluation.
• MSP grantees share a common goal: Raise student achievement through high-quality PD that increases teacher knowledge, promotes best practices, and develops new and more effective approaches to math and science education.

Steps in Setting Up the System
• Next MSP State grant competition: Provides an excellent opportunity to set up this data system.
  - Very difficult to build a longitudinal data system retrospectively with former MSP grantees. There is too much variation in what data were collected, whom they were collected on, and how they were collected.
  - Difficult to build such a system in mid-stream with the current round of MSP grantees. In addition to the variation noted above, the number of years available to examine data trends will be limited.
  - Best approach is to build the system prospectively: Define requirements in terms of definitions, measures, and reporting for the evaluation before the program begins.

What might go in the next MSP Application Notice?
• Develop a section in the notice that requires (or encourages) the grantees to collect data on a common set of data elements, using the same measures and methods over time.
• Possible longitudinal data elements:
  - Length and intensity of the PD
  - Content of the PD
  - Teacher knowledge
  - Teacher practice
  - Student achievement
  - Costs
• Caution: Pick only a few data elements to be measured and collected. There are resource constraints and feasibility issues in collecting the same data across the grantees over time. At the grantee level, MSP grantees are fairly diverse in several ways: PD model, subject area, and grade level. They may share only a few common data elements.
• In the section on the evaluation requirements: (Decisions on this have to be made by the state depending on the direction and focus areas of its MSP grant competitions)
  - Identify a common set of data elements to be collected by all evaluators
  - List a common set of measures that will be used to collect these data.
    Example: Teacher practice, student achievement
  - Self-report vs. fact-based (Note: Validity of self-report vs. fact-based measures; attitudes/satisfaction vs. behavior)
  - Requiring vs. Encouraging: It may not be possible or feasible to require all grantees to collect the same data over time. However, it might be possible to use incentives to encourage grantees to participate. Some federal approaches:
    - Additional points for agreeing to do this as part of their evaluation (competitive preference)
    - Larger awards for grantees agreeing to participate
    - Sheltered competition: A portion of the funds is set aside for a competition in which innovative and promising PD models are being tested.

Analyzing and Reporting the Data
• Analysis and Reporting of the Data: While it’s the last part of any study, it’s probably the first thing you want to think about. The design of the data system (data elements, target population, data collection, and time period) is driven by the research questions you’re seeking to answer. Examples of research questions:
  - Do teacher practices degrade over time depending on the type of PD received?
  - What’s the difference in the change in teacher practices between those coming out of a summer institute only and those who have follow-up activities in addition to the summer institute?
  - To answer these questions, you would select an instrument to measure teacher practices that all MSP grantees would administer in the spring of the school year. In successive years, the same teachers’ practices would again be measured in the spring. Change scores could then be calculated to measure the extent to which teacher practices changed over time. Subgroup analyses by type of PD model could be conducted to compare the change over time among the models.
• MSP grantees are diverse in many ways: PD model, subject areas, and grade.
The choice of the research questions and the design of the data system to answer them will depend on the particular emphases that you choose to pursue in your grant competitions.
• Trend data across cohorts of MSP grantees: Although not longitudinal data as discussed here, data on a common set of measures across cohorts of MSP grantees would be very useful. Such data provide a long-term view of how PD practices and outcomes change over time. They would be particularly useful when you’re able to examine them by particular subgroups.

Some reality checks
• Longitudinal evaluations are costly and possibly not feasible within resource constraints or the time period of the grant.
• Getting evaluators to agree to use a common set of measures will be difficult, but not impossible. First, you can make this a requirement of the grant. However, buy-in is much preferred. Abt’s experience in providing technical assistance to evaluators (Striving Readers and i3) indicates a willingness among evaluators to learn and to modify their designs for the greater good.
• Evaluation expertise at the state level is needed to complete some of the steps in developing the grant solicitation, as well as to conduct monitoring/technical assistance and possibly reporting activities. Possible resources: university-based researchers (and their graduate students), regional labs, and the technical assistance available through the federal MSP office.

Final Thoughts
• Potential payoff from this data is great.
• Too often, we don’t know how well programs work over time. We don’t have good evidence about the sustainability of the impacts we might see in the first year. There is some evidence from the literature that teacher practices acquired through PD are not sustained over time.
• Often, we keep going over the same ground, reinventing the same approaches, not knowing if we’re on the right track.
• Building a better base of evidence is what the entire field of education seems to be moving towards.
This is certainly so at the federal level, but also at the state and local levels.
• I think it’s worth the investment of time and resources.
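
Worked illustration of the change-score analysis described under Analyzing and Reporting the Data: the Python sketch below computes each teacher's change in a practice score between two spring waves and then averages those changes by PD model. Everything in it is an assumption made for illustration: the record layout, the function names change_scores and mean_change_by_model, and the scores themselves, which are invented and are not MSP data. It is one simple way to set up the calculation, not a prescribed method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher_id, pd_model, spring_year, practice_score).
# All values are invented for illustration; they are not MSP data.
RECORDS = [
    ("T1", "Summer Institute with follow-up", 2011, 2.0),
    ("T1", "Summer Institute with follow-up", 2012, 3.0),
    ("T2", "Summer Institute with follow-up", 2011, 2.5),
    ("T2", "Summer Institute with follow-up", 2012, 3.0),
    ("T3", "Summer Institute only", 2011, 2.0),
    ("T3", "Summer Institute only", 2012, 2.5),
    ("T4", "Summer Institute only", 2011, 3.0),  # no 2012 wave
]


def change_scores(records, first_year, last_year):
    """Return {teacher_id: (pd_model, last score minus first score)}.

    Only teachers measured in both years are kept, because a longitudinal
    comparison needs the same participants at each wave.
    """
    waves_by_teacher = defaultdict(dict)
    model_of = {}
    for teacher_id, pd_model, year, score in records:
        waves_by_teacher[teacher_id][year] = score
        model_of[teacher_id] = pd_model
    return {
        teacher_id: (model_of[teacher_id], waves[last_year] - waves[first_year])
        for teacher_id, waves in waves_by_teacher.items()
        if first_year in waves and last_year in waves
    }


def mean_change_by_model(changes):
    """Average the change scores within each PD model subgroup."""
    grouped = defaultdict(list)
    for pd_model, delta in changes.values():
        grouped[pd_model].append(delta)
    return {pd_model: mean(deltas) for pd_model, deltas in grouped.items()}


changes = change_scores(RECORDS, 2011, 2012)
summary = mean_change_by_model(changes)
for pd_model, avg in sorted(summary.items()):
    print(f"{pd_model}: mean change = {avg:+.2f}")
```

In this sketch, a teacher who is missing a wave is dropped so that each subgroup mean reflects the same participants at both points in time. A real evaluation would also need to report that attrition and consider whether it differs by PD model.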