Written for CADRE by Abt Associates (Daphne Minner et al.), 2012

In 2011, the National Science and Technology Council reviewed how 13 federal agencies spent $3.4 billion in fiscal year 2010 to support STEM education. NSF was found to have made the largest investment in STEM education, and its DRK-12 program had the largest budget of its 6 educational research and development programs.

The compendium reviewed here (Part 1 of 2) focuses on 5 cohorts of DRK-12 projects (2008-2012) that used instruments designed to assess teacher practices, pedagogical content knowledge (PCK), and content knowledge. Its purpose is to provide an overview of the current status of STEM instrumentation commonly used in the U.S. and to provide resources useful to research and evaluation professionals.

Research Question: What are the instruments, constructs, and methods being used to study teacher outcomes within the DRK-12 portfolio?

Only extant, named instruments (as opposed to instruments being developed as part of a current proposal) were included.

Two Phases:
◦ Phase 1: A review of all proposals funded by DRK-12 in 2008-2012 revealed 295 eligible projects.
◦ Phase 2: Data were collected on instrument-specific information about reliability and validity evidence, development and piloting, accessibility of the instrument, administration, and constructs measured.

Because CADRE is funded through a cooperative agreement rather than a contract, the team could not access FastLane files and relied on materials provided by PIs. For 36 projects, materials were unavailable. For 8 of the 57 PCK instruments, the actual instruments were unavailable. Six instruments had to be purchased.

75 projects proposed to measure teacher practices, PCK, or content knowledge: 71% measured only 1 outcome, 24% measured 2 outcomes, and 5% measured all 3.

Instruments Identified
◦ Practices: 42
◦ PCK: 24
◦ Content Knowledge: 27

5 Categories of Instruments
1. Instructional Practices (Appendix A)
2. Instructional Practices plus Additional Constructs (Appendix B)
3. Instructional Beliefs (Appendix C)
Multidimensional
4. System-wide Reform Focused (Appendix D)
5. Discourse Focused (Appendix E)

Researchers need to be more cognizant about providing relevant psychometric information on the tools they use and develop, so that others can reliably implement those tools in their own projects. Instruments developed must go through rigorous reliability and validity testing. The compendium is an initial step towards the systematic assessment and improvement of STEM research tools.

Eleven instruments primarily assessed classroom instructional practices: seven observation protocols, three rubrics, and one survey.
◦ Predominantly designed for pre-K through middle school teachers (6, 55%)
◦ More focused on science (5, 45%) than mathematics (3, 27%) or technology (2, 18%)

The three science observation protocols capture variables ranging from the lesson’s temporal flow and the percentage of time students spend in different types of groupings to the extent of opportunity for students to engage in the various phases of the investigation cycle.

The two science scoring rubrics are intended to be applied to lesson artifacts and instructional materials that the teacher provides to students. They contain codes for student grouping, structure of lessons, use of scientific resources, hands-on opportunities through investigation, cognitive depth of the materials, encouragement of the scientific discourse community, opportunity for explanation/justification, and connections/applications to novel situations.

Across these eleven instruments, one had low reliability evidence, and four (36%) had acceptable or good evidence. The team was able to find validity evidence for only two instruments.
Instructional Strategies Classroom Observation Protocol
◦ Identifying sense of purpose; taking account of student ideas; engaging students with relevant phenomena; developing and using scientific ideas; promoting student thinking about phenomena, experiences, and knowledge

Scoop Notebook – Artifact rubric
◦ Portfolio assessment that captures: grouping, structure of lessons, use of scientific resources, hands-on, inquiry, cognitive depth, scientific discourse community, explanation/justification, assessment, connections/applications

11 instruments measure instructional practices in addition to one or two other constructs, namely:
◦ physical context
◦ demographics
◦ teacher content knowledge
◦ an aspect of classroom management

This more comprehensive nature is also reflected in the subject domains being assessed:
◦ 2 each, mathematics and science
◦ 5 both mathematics and science
◦ 1 technology
◦ 1 general teaching skills

Exist for many subjects.
The middle school version tests all sciences very generally, whereas the high school version breaks them apart by specific domain.
Sit-down test, 4 hours, 50 multiple-choice questions and 2 constructed-response questions.
The test is designed to provide evidence that an examinee has a basic working knowledge of teaching foundations.

Ratings are made after at least 3 hours of observation.
Ratings for each item are made on a 7-point scale.
Behavioral descriptors are present at the 1, 3, 5, and 7 levels.
Assesses the materials and instructional supports for math and science learning present.

Name: Mathematics Teaching Efficacy Belief Instrument (MTEBI, 2000); modified from STEBI, Riggs (California State U) & Enochs (Kansas State U), 1990
What’s measured: Personal math teaching efficacy (13 items) & math teaching outcome expectancy (8 items); extent to which teachers believe they have the capability to positively affect student achievement
Type: Survey (5-point Likert scale); MTEBI: 21 items (STEBI: 25 items)
Validity evidence: Construct
Grade level: Preservice (STEBI: elementary)

Name: Principles of Scientific Inquiry-Teacher (PSI-T); Campbell, Chapman (Utah State U) & AbdHamid (U of Iowa), 2010
What’s measured: Teacher (& student) perceptions of frequency of occurrence when students are responsible for each of 5 principles of scientific inquiry (NRC); extent to which students are experiencing inquiry in science classrooms
Type: Survey (5-point, 20 items) (teacher & student version)
Validity evidence: Content, Construct
Grade level: High

PS: Why is VNOS-C included?

Name: Local Systemic Change Classroom Observation Protocol (LSC); Horizon Research Inc., 2000
What’s measured: Overall quality of observed math/science lesson: lesson design; implementation; math/science content; classroom culture; likely impact on students' understanding
Type: Observation (5-point scale)

Name: Inside the Classroom: Teacher Interview Protocol
What’s measured: Teachers' perceptions of factors that influenced selection of lesson content and pedagogy
Type: Interview

Validity evidence: Content (good inter-rater % reliability)
Grade level: K-12

Thirteen instruments look at instructional practices and social aspects of classroom community (including class management).
Observation protocols
◦ Six are non-domain specific
◦ Three are math-specific
◦ Three are science-specific
◦ One measures both
◦ Seven demonstrated more than one type of validity (more than other categories)

Three scales
◦ Lesson design implementation
◦ Content -> PCK -> Propositional and Procedural
◦ Classroom culture (e.g., egalitarian student-teacher relationship)
High interrater % agreement
High validity
◦ Construct
◦ Content
◦ Predictive

Three domains
◦ Emotional
◦ Classroom organization
◦ Instructional support
High internal consistency
High interrater % agreement
Content validity

25 content tests: 12 general, 8 science, 3 math, 1 science and math, 1 technology.
◦ General Tests: American College Testing, GRE, ITBS-Iowa Test of Basic Skills, NAEP, PISA, PRAXIS, WEST-E.
◦ Science: MOSART, FACETS, IL Certification Testing System Study Guide-Science, FCI Force Concept Inventory Assessment, DTAMS-science: Diagnostic Science Assessment for Middle School Teachers, Classroom Test of Scientific Reasoning (Lawson).
◦ Math: MKT, M-SCAN, DTAMS-math: Diagnostic Math Assessment for Middle School Teachers.
◦ Science and Math: TIMSS
◦ Technology: TAGLIT: Taking a Good Look At Instructional Technology

By type: 12 student tests, 9 teacher tests, 2 surveys, 1 observation tool, 1 student test and teacher tool.
By level: 7 K-12, 4 elementary and middle, 4 postsecondary, 3 high school, 2 middle, 1 grades 4-9, 1 elementary, 1 middle and high, 1 high and postsecondary, 1 no level indicated.

• Each assessment is composed of 25 items: 20 multiple-choice and 5 open-response.
• Paper-and-pencil format
• Pre- and post-tests before and after workshops
• To determine growth in teachers' content knowledge
• To be completed by test-takers within an hour.
• Each assessment has 3-4 science sub-domains.
• Available for use free of charge.
• Scored for a fee of $10 per teacher per assessment; scoring includes scores on individual items, on each science sub-domain in the content area, and on four different knowledge types (memorized, conceptual understanding, higher-order thinking, pedagogical content knowledge).
http://louisville.edu/education/centers/crmstd/diag-sci-assess-middle

Free and can be accessed after completion of four online tutorials that explain test design, use, scoring, and interpretation of results. The multiple-choice items cover K–12 physical science and earth science content and K–8 life science content in the NRC NSES, and draw on the research literature about misconceptions concerning science concepts.

ASW – Analysis of Student Work: A rubric is used to score teachers’ evaluations of a standardized set of video cases of student problem solving.
LoU – Levels of Use Interviews: An interview determines how a change is being implemented in the classroom.
SEPUP – Group Interaction and Communication of Scientific Information Rubrics: Rubrics are used to grade student work on a variety of measures, including how students design and conduct an investigation, analyze data, understand concepts, evaluate evidence and identify tradeoffs, communicate scientific information, and work cooperatively in a group.

Detailed access information can be found for each instrument in Appendices H & I of the compendium.
Part 2 of the compendium (not covered here) details measurement of students’ content knowledge, reasoning skills, and psychological attributes.