Developing Instructionally Embedded Formative Assessments
William D. Schafer, University of Maryland

• This conference is about formative and interim assessments
• Formative is generally thought of as in opposition to summative
• Summative assessments are important but will be mentioned only as a way to differentiate formative assessments
• Interim assessments will also be mentioned in my remarks, but not much

Goals of the presentation:
– discuss how I feel it is best to think of formative assessments
– describe the characteristics of formative assessments I feel are most useful to teachers and students
– describe a possible way to generate formative assessments
– suggest what a state might do to bring that possibility to reality

• Formative and summative are terms originating in the program evaluation literature (Dunn & Mulvenon, 2009)
• Formative evaluation means judging how well implementation is proceeding in order to decide whether to continue as-is or to revise the program (Scriven, 1967)
• Summative evaluation means deciding whether the program should be certified, and perhaps disseminated
• These terms were first applied to assessment by Bloom (1969)
• Dunn and Mulvenon (2009) and Good (2011) have traced how different authors have defined formative assessment
• Good (2011) emphasized that the use of the assessment information, as opposed to characteristics of the assessment itself, is what should define the term “formative”
• But “formative assessment” is still a handy term, as long as we understand it to mean something like any tool providing information to aid in judging the success of instruction or learning
• We generally think of summative assessments as those used to make evaluative judgments following a period of instruction, such as a unit, a semester, a year, or even an entire schooling
• Nevertheless, I have been trying to think of a real-world example of an educational assessment that can’t be used formatively by someone to make decisions about future instruction and/or learning
• I can’t! (Can you?)
• Interim is a newer term; it essentially means tests like the statewide summative assessments administered during the school year
• They are used to see whether students are developing toward eventual proficiency
• Frankly, what educators are supposed to do with this information is not very clear to me
• Why test students before they have had the opportunity to learn the full academic year’s curriculum?
• And what is an interim use of assessment information that is neither formative nor summative?
• Do we really need this new term?
• Here, I want to set aside both interim and summative assessments and focus on formative assessment uses
• I’ll use the term “formative assessment” to mean assessments used for formative (instructional or learning) purposes
• An assessment provides information, and the use of that information is what makes the assessment formative
• I’ll first discuss the usefulness of formative assessments and then turn to a policy-level proposal for creating them
• If use defines a formative assessment, what should formative assessments look like to be most useful?
• The fundamental assumption of my presentation is that to be most useful, formative assessments should be embedded in instructional units
• This applies to both content and timing.
• A formative assessment should provide needed information to decision makers at optimal times
• You can in theory pick one up from one unit and drop it into another, but you are almost surely better off creating one for the unit you are working with
• This assumption implies that formative assessments should be instructionally embedded
• They may come in all reasonable assessment formats, such as
– homework assignments
– class work
– brief quizzes
– brief or extensive writing assignments (ideally coupled with rubrics)
– oral interactions, individually or in groups
• Good instructional and/or learning decision making requires good information
• Good information requires good assessments
• Good assessments are not trivial to develop, and neither are they trivial to evaluate
• Bad assessments can be dangerous, since they give bad information that affects future actions
• So we need a way to generate, evaluate, and disseminate instructionally embedded formative assessments
• Formative assessments are needed in much greater volume than summative assessments
• They are used more often, even though any one of them is used with fewer students
• Fortunately, their linkage with instruction may provide a way to develop them efficiently and effectively
• I will turn to a mechanism that could generate, evaluate, and disseminate formative assessments embedded in instructional units.
• I will then suggest steps a state could take to refine and establish the mechanism
• Because of the close relationship with instruction, I feel teachers should be creating formative assessments as they develop instructional units
• In my version of utopia, I envision a computerized database of instructional units that have been approved by the state and that can be searched by teachers to review available options for their upcoming instructional activities (Schafer & Moody, 2004)
• While the focus here is on instruction, the database would need to incorporate appropriate formative assessments
• I would like to see curriculum specialists, instructional specialists, and assessment specialists making recommendations about how teachers can enhance their personal instructional units from all three perspectives
• This will require assessment expertise
– In-service workshops could be created to help teachers develop and use formative assessments as part of their instructional units
– Rick Stiggins and his colleagues at the Assessment Training Institute have given us models of effective in-service experiences for teachers; these could be used more broadly
– Sue Brookhart (2011) has recently discussed the needed content
• A format is needed to express the unit plans, including
(a) individual lesson plans to support a unit plan, with goals tied explicitly to state curricula
(b) formative assessments designed to help the teacher make instructional decisions: initially motivating students, deciding whether or not to go on, determining whether students can generalize what they are learning to novel applications, etc.
(c) formative assessments for students, designed to help them understand what their learning goals should be, how well they are grasping the material, where they can get help if they need it, etc.
(d) summative assessments that can be used at the end of the instructional unit to certify achievement (e.g., to be used in assigning a grade for the unit)
• The units could be tried out and the data analyzed in order to document effectiveness
• There could be a peer review process like the one we use in our own research journals; peer reviews generate revise-and-resubmit requests, which is reasonable here and gives authors direction for improvement
• Each unit, including its formative assessments and a summative assessment at the end, can be certified by the enabling authority (e.g., a state education agency or a consortium) and made available electronically throughout a broad community of educators
• The units could be selected by teachers who feel they will be useful, perhaps with modifications, and reviewed as teacher-users feel they have something helpful to say
• Meaningful rewards for authoring teacher groups whose units are selected into the database could include
– recognition (e.g., a plaque for the teachers and the school, and an article in the local newspaper)
– money (e.g., a one-year increase in their salary steps)
• As with university faculty, recognition in terms of prestige, job performance reviews, and bonuses or raises can be a significant motivator.
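To make the envisioned database concrete, here is a minimal sketch of how a certified unit-plan record and a teacher-facing search might look. Everything here is a hypothetical illustration under the proposal's assumptions, not a specification: the field names (`subject`, `grade`, `status`, and so on) and the certification states are invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical record for one instructional unit in the state database.
# All field names are illustrative, not part of the proposal.
@dataclass
class UnitPlan:
    title: str
    subject: str                  # e.g., "math"
    grade: int
    goals: list = field(default_factory=list)       # tied to state curricula
    formative_assessments: list = field(default_factory=list)
    summative_assessment: str = ""
    status: str = "submitted"     # submitted -> revised -> certified

def search_units(units, subject=None, grade=None, certified_only=True):
    """Return units matching a teacher's search criteria."""
    results = []
    for u in units:
        if certified_only and u.status != "certified":
            continue                      # only state-certified units by default
        if subject is not None and u.subject != subject:
            continue
        if grade is not None and u.grade != grade:
            continue
        results.append(u)
    return results

units = [
    UnitPlan("Linear equations", "math", 8, status="certified"),
    UnitPlan("Photosynthesis", "science", 7, status="submitted"),
]
hits = search_units(units, subject="math", grade=8)
print([u.title for u in hits])
```

The design choice worth noting is that search variables (subject, grade, certification status) are exactly the kind of metadata the teacher focus groups described later would need to decide on.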
• Recognition vehicles also need to be worked out for the peers who reviewed, and ideally tried out, the units
• This process would involve some extra effort in the beginning
• In the end it would result in less work for teachers who capitalize on the unit plans in the database instead of creating their own unit and lesson plans from scratch
• I see the process being implemented by a seven-person Unit Review Team consisting of
– a curriculum specialist in each of four content areas (English, math, science, social studies)
– an instruction specialist at each of two levels (elementary, high school)
– an assessment specialist
• Finally, let’s turn to how a state can get started
• Some thoughts are described here, but a state will need to develop its own process, with modifications as its stakeholders review and, one hopes, sign on
• First, an overview of the process will be described
• It could be implemented by a seven-person committee like the one mentioned earlier
• Following the overview, seven concrete (but tentative) steps are suggested to facilitate implementation
• Teacher groups should evaluate and tweak this proposal.
• They should be asked to list the elements that should appear in a unit plan, based on what they would find useful and what they feel teacher groups could create
• A format should be a tangible result, which should be evaluated by curriculum, instruction, and assessment professionals
• A feasible and effective reward structure also needs to be proposed
• At the same time, principals could nominate experienced teacher groups (say, a dozen groups, in different content areas at different grades) who are willing to follow the format to prepare a submission
– The submissions could be compared with the format to see where either they or the format might be revised
– The units could be returned to the teacher groups for revision
• Other principals could solicit reviewing teachers who actually try the units out and make recommendations back to the Unit Review Team and, through them, to the development group for revisions
• This is much like a journal’s peer review process
• On the other end of the process, the structure of the database needs to be developed
• Teacher focus groups could recommend the characteristics they would find most helpful in searching the database
• Accepted units can be added to the database and plans made to solicit more
• So what can a state do to implement and facilitate this concept?
• Seven steps are described, most of which are cost-effective
• I’ll use “the state” to refer to any entity that might be appropriate, such as a consortium (hint … hint)
• First, the state should define its curricular goals
• This can be done through a forthright explanation of each of its constructs (combinations of content and grade level)
• I have elsewhere described the concept of assessment limits (Schafer & Moody, 2004; Schafer, in press)
• These are very specific elaborations of the potential scope of a state’s assessments
• They are at-most lists for test developers and at-least lists for teachers
• Maryland’s web sites have numerous examples,
e.g., http://mdk12.org/share/frameworks/CCSC_Math_gr8.pdf, where the assessment limits are called “Essential Skills and Knowledge”
• Second, the thinking processes that the state will assess also need elaboration
• What does the state feel students need to do with each of the content elements?
• Those that are important enough to test are important enough to circulate among practitioners
• We need to understand the intended content of the summative assessment before we can judge how well a unit and its associated assessments are aligned to appropriate content
• There are many ways to do this (Schafer, in press)
• It is a necessary element of an alignment study but should be developed as part of a curriculum rather than as part of assessment development
• Third, the state should express its summative assessments in terms of blueprints that specify the content, process, and difficulty distributions of the items on them
• Blueprints formalize the range of possible topics and activities that students might be asked to exhibit, as well as the levels of achievement they will find represented
• They are useful in evaluating the alignment of any assessment to the content (Schafer, in press)
• They can also be useful in evaluating the alignment of instruction with the content
• They are necessary to evaluate a unit, since the unit needs to have, at a minimum, learning goals consistent with the state’s in scope, depth, and “rigor”
• Fourth, a complete development of the state’s assessment scales would also be helpful
• This includes interpreting the achievement levels (degrees of success that students and teachers can strive for).
• This is often done using achievement level descriptions
• Perhaps more effective would be to include examples of performance at each level
• A reasonable way to do that would be to provide scale locations for all released items
• That would provide actual operational definitions of the achievement levels to go along with the usual verbal characterizations, which are often ambiguous
• Fifth, teachers need to have an understanding of how to build and use assessments formatively
• This includes assessments for both teacher and student insights (decision making)
• Development of assessment competencies (see Brookhart, 2011) through in-service workshops like those Rick Stiggins delivers through the Assessment Training Institute is a possible approach for a state
• But in the end, educators in the state might do best to take control of their own assessment learning, perhaps with the help of outside experts facilitated by the state
• Sixth, the state might convene teacher groups to study what kind of unit-plan database would be most helpful, both to search and then to use
– How specific should it be?
– What variables should be used to search it?
– What descriptions would be most useful in helping a teacher quickly decide whether a unit would be appropriate for his or her situation?
– How can the plans best be expressed?
– How can quality unit plans be encouraged without being onerous to produce?
– What incentives would work for teachers, principals, and the state?
• These questions can be explored in order to shape the database into a resource that can make meaningful change in what goes on in classrooms, and to capitalize on what segments of the teacher workforce already do well
• Seventh, the state needs to institute a review, revision, and re-review process
• This can be modeled after the peer review process and administered by the seven-person committee of experts in the content areas and in assessment development and use.
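The fourth step's idea of operationally defining achievement levels through scale locations of released items can be sketched briefly. The cut scores, level names, and item locations below are made-up numbers for illustration only; the point is just that once released items carry scale locations, each level acquires concrete exemplar items alongside its verbal description.

```python
import bisect

# Hypothetical cut scores between achievement levels on the state scale
# (invented numbers; a real state would publish its own).
CUTS = [400, 450, 500]
LEVELS = ["Basic", "Proficient", "Advanced", "Distinguished"]

def level_for(scale_location):
    """Map an item's scale location to the achievement level it exemplifies."""
    return LEVELS[bisect.bisect_right(CUTS, scale_location)]

# Released items with illustrative scale locations.
released_items = {"item_01": 385, "item_02": 460, "item_03": 510}

# Group released items by level: each level's items become operational
# examples of performance at that level, supplementing the (often
# ambiguous) verbal achievement level descriptions.
by_level = {}
for item, loc in released_items.items():
    by_level.setdefault(level_for(loc), []).append(item)

print(by_level)
```

A teacher browsing a level's exemplar items gets a far less ambiguous picture of "Proficient" or "Advanced" work than a verbal description alone provides, which is the argument the slide makes.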
• This sort of effort would have at least three desirable outcomes. It would
– increase teacher professionalism,
– avoid the loss of good developmental work done by teachers who retire or change grade levels or even careers,
– and certainly affect what goes on in schools and classrooms on a daily basis

References

• Brookhart, S. M. (2011). Educational assessment knowledge and skills for teachers. Educational Measurement: Issues and Practice, 30(1), 3-12.
• Dunn, K. E., & Mulvenon, S. W. (2009). A critical review of research on formative assessment: The limited scientific evidence of the impact of formative assessment in education. Practical Assessment, Research & Evaluation, 14(7). Available online: http://pareonline.net/getvn.asp?v=14&n=37.
• Good, R. (2011). Formative use of assessment information: It’s a process, so let’s say what we mean. Practical Assessment, Research & Evaluation, 16(3). Available online: http://pareonline.net/getvn.asp?v=16&n=3.
• Schafer, W. D. (in press). A process for systematic alignment of assessments to educational domains. In G. Schraw & D. R. Robinson (Eds.), Assessment of higher order thinking skills. New York, NY: Information Age Publishers.
• Schafer, W. D., & Moody, M. (2004). Designing accountability assessments for teaching. Practical Assessment, Research & Evaluation, 9(14). Available online: http://pareonline.net/getvn.asp?v=9&n=14.