Developing Language Assessments and Justifying their Use

Lyle F. Bachman
Department of Applied Linguistics
University of California, Los Angeles
lfb@humnet.ucla.edu

Presentation based on Bachman, L. F. & Palmer, A. S. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford: Oxford University Press.

When you need to assess your students, where do you begin?
a. The type of assessment task I will use
b. How I will maintain test security
c. How I can help my students do well so that they will succeed after they finish my class
d. How I can help my students do well so that they will make me look good
e. All of the above.

Teachers’ questions about assessment
Teachers almost always ask:
• When should I assess?
• How often should I assess?
• How should I assess?
Teachers seldom ask:
• What should I assess?
Teachers almost never ask:
• Why should I assess?

Topics in this presentation
• The purposes of teaching and assessment in the classroom
• Two modes of classroom assessment
• Deciding why we want to assess: identifying intended beneficial consequences
• Deciding why we want to assess: identifying decisions to be made
• Deciding what we want to assess: defining constructs
• Deciding when to assess
• Deciding how we want to assess: designing assessment tasks
• Relating classroom assessment to assessment purpose: 5 things to think about

Why do we teach?
The primary purposes of language teaching are to (e.g.):
• promote or facilitate learning;
• enhance learners’ linguistic, cognitive, emotional, and social development.

Why do we assess?
The primary purpose of classroom assessment is to gather information to help us make decisions that will lead to beneficial consequences for stakeholders (learners, teachers).

[Figure: chain linking Assessment Performance, Assessment Report, Interpretation(s) about learner’s language ability, Decision(s), and Beneficial Consequences]

[Figure: diagram linking Assessment (Information), Teaching and Learning (Consequences), and Evaluation (Decisions)]

Teaching & learning tasks, assessment tasks: primary purpose
Teaching & learning activities/tasks:
• Promote or facilitate learning
• Enhance learners’ linguistic, cognitive, emotional, and social development
Assessment activities/tasks:
• Gather information to inform decisions

Why do we assess?
In the classroom, we use language assessments to inform two kinds of decisions: formative and summative.
Formative decisions relate to making changes in teaching and learning activities in support of, or to promote or enhance, learning. Formative decisions are made during the processes of teaching and learning.

Why do we assess?
In the language classroom, we use assessments to inform two kinds of decisions: formative and summative.
Summative decisions relate to passing or failing students on the basis of their progress or achievement, or certifying them based on their level of ability. Summative decisions are made after the processes of teaching and learning.

Why do we assess?
Formative decisions: Teachers make decisions about:
• changing their teaching (materials, activities);
• presenting, revising, contextualizing, and scaffolding new material;
• placing learners into appropriate groups or levels;
• guiding their students’ learning;
• challenging and motivating their students to learn.

Why do we assess?
Formative decisions: Learners make decisions about making changes:
• in their approaches to or strategies of learning;
• in the particular areas on which they may need or want to place greater emphasis.

Why do we assess?
Summative decisions: Teachers and administrators make decisions about:
• which students pass and fail a course.
• which students are certified at a particular level of ability.

Modes of classroom assessment
Two modes of classroom assessment:
• Implicit
• Explicit

Modes of classroom assessment
Implicit mode (“dynamic assessment”, “on-line assessment”, “continuous assessment”):
Instantaneous and cyclical:
• assessment – decision – instruction;
• assessment – decision – instruction
Learners are largely unaware that assessment is taking place.
Used primarily for formative decisions.

Modes of classroom assessment
Explicit mode: assessment as “assessment”
• Separate activity from teaching
• Both teacher and learners know this activity is an assessment.
• Used for both formative and summative decisions.

Mode: Implicit
Characteristics:
• Continuous
• Instantaneous
• Cyclical
• Both teacher and students may be unaware that assessment is taking place
Purpose:
Formative decisions, e.g.:
• Correct or not correct student’s response
• Change form of questioning
• Call on another student
• Produce a model utterance
• Request a group response
Summative decisions, e.g.:
• Pass/fail decision based partly on “classroom participation or performance”

Mode: Explicit
Characteristics:
• Clearly distinct from teaching
• Both teacher and learners aware that assessment is taking place
Purpose:
Summative decisions, e.g.:
• Decide who passes the course
• Certify level of ability
Formative decisions, e.g.:
• Teacher: move on to next lesson or review current lesson
• Teacher: focus more on a specific area of content
• Student: spend more time on particular area of language ability
• Student: use a different learning strategy

Explicit mode of classroom assessment
• When do we assess?
• What are our intended consequences?
• What decisions do we need to make?
• What information about learners’ language ability do we need to collect?
• How will we collect this information?

When do we assess?
Whenever we need to make an instructional decision, or a decision about learners, we need to assess.

When do we assess?
Occasions for classroom assessment:
• Warm-up, revision (self-assessment, implicit assessment)
• Presentation (implicit assessment)
• Guided practice (implicit assessment)
• Independent practice (self-assessment)
• “Assessment” (explicit assessment)

Types of decisions for which language assessments are used
• Guiding teaching and learning
• Entrance, readiness
• Placement
• Achievement/progress
• Certification
• Selection (e.g., employment, immigration)

Uses of language assessments
Many of these decisions are “high stakes”. Need to ask:
1. What beneficial consequences do we want to bring about?
2. What decisions do we need to make to help promote the intended consequences?
3. What information about learners do we need to make the most appropriate decision?
4. How can we gather this information?
• Teachers’ judgments?
• Classroom assessments?
• Self-assessments?
• Formal tests?

[Figure: chain linking Intended Consequences, Decisions to be made, Interpretation(s) about learners’ language ability, Assessment Record, and Assessment Performance, labelled ASSESSMENT DEVELOPMENT along one side and ASSESSMENT INTERPRETATION AND USE along the other]

Uses of language assessments
Your turn: For your assessment project, answer these questions:
1. What beneficial consequences do I/we want to bring about?
2. What decisions do I/we need to make to help promote the intended consequences?
3. What information about learners do I/we need to make the most appropriate decision?
4. How can I/we gather this information?

Accountability
We must be able to justify the use we make of a language test. That is, we need to be ready if we are held accountable for the use we make of a language test. In other words, we need to be prepared to convince stakeholders that the intended uses of our test are justified.

Whom do we need to convince?
All stakeholders:
• Ourselves
• Our fellow teachers
• Test takers (our students)
• Program/department/university administrators
• Parents, guardians
• Other stakeholders (e.g., potential employers, funding agencies)

Uses of language assessments
Your turn: For your assessment project, describe the stakeholders.

How do we do this?
We need a conceptual framework that will enable us to justify the intended uses of our assessments. An “Assessment Use Argument” (AUA) provides such a framework.

How do we do this?
Two activities in justifying the uses of our assessments:
• Develop an Assessment Use Argument (AUA) that the intended uses of our assessment are justified, and
• Collect backing (evidence), or be prepared to collect backing, in support of the AUA.

Assessment Use Argument
Provides:
• the rationale and justification for the decisions we make in designing and developing the test, and
• the logical framework for linking our intended consequences and decisions to the test taker’s performance.

Parts of an Assessment Use Argument
Claims: statements about our intended interpretations and uses of test performance; claims have two parts:
• An outcome
• One or more qualities claimed for the outcome
Data: information on which the claim is based.

Parts of an Assessment Use Argument
Warrants: statements justifying the claims.
Rebuttals: statements about possible alternatives to the outcomes or to the qualities that are stated in the claims.
Backing: the evidence that we need to collect to support the claims and warrants in the AUA.

Articulating Claims for Intended Uses (Table 1 in the Handout)
[Figure: outcomes and the qualities claimed for them. Consequences: Beneficial. Decisions: Values-sensitive, Equitable. Interpretations about test taker’s language ability: Meaningful, Impartial, Generalizable, Relevant, Sufficient. Assessment Reports/Scores: Consistent. Assessment Performance.]

Qualities of Claims in an AUA
Claim 1
Outcome: Consequences
Quality: Beneficence
Articulate Claim 1: list and describe:
• The intended consequences
• The stakeholders

Qualities of Claims in an AUA
Generic version of Claim 1: The consequences of using an assessment and of the decisions that are made are beneficial to stakeholders.
{EXAMPLES OF CLAIM 1, pp. 2, 24}

Qualities of Claims in an AUA
Your turn: Adapt Claim 1 to your project.

Qualities of Claims in an AUA
Claim 2
Outcome: Decisions
Qualities:
• Values-sensitivity
• Equitability

Qualities of Claims in an AUA
Generic version of Claim 2: The decisions that are made on the basis of the interpretation take into consideration existing educational and societal values and relevant legal requirements and are equitable for those stakeholders who are affected by the decisions.
{EXAMPLES OF CLAIM 2, pp. 3, 27}

Qualities of Claims in an AUA
Your turn: Adapt Claim 2 to your project.

TAKE A BREAK!

What do we assess?
• Learning objectives
• “Content” of the syllabus or curriculum
• “Content” of lesson plans
• “Content” of teaching and learning materials
• “Content” of teaching & learning activities
• Language ability, proficiency

Qualities of Claims in an AUA
Claim 3
Outcome: Interpretation
Qualities:
• Meaningfulness
• Impartiality
• Generalizability
• Relevance
• Sufficiency

Qualities of Claims in an AUA
Generic version of Claim 3: The interpretations about the ability to be assessed are:
• meaningful with respect to a particular learning syllabus, a needs analysis of the abilities needed to perform tasks in the target language use (TLU) domain, or a general theory of language ability, or any combination of these,
• impartial to all groups of test takers,
• generalizable to the TLU domain,
• relevant to the decision to be made, and
• sufficient for the decision to be made.
{EXAMPLES OF CLAIM 3, pp. 5, 28}

Qualities of Claims in an AUA
Meaningfulness warrants define the ability we want to assess, with respect to one or more frames of reference, and specify the conditions under which test takers’ performance will be elicited.
Meaningfulness Warrant 1 provides a descriptive label and definition of the ability to be assessed.
Generic version of meaningfulness Warrant 1: The definition of the construct is based on a frame of reference such as a teaching syllabus, a needs analysis, or current research and/or theory of language use, and clearly distinguishes the construct from other, related constructs.
{EXAMPLES OF MEANINGFULNESS WARRANT 1, pp. 5, 28}

Qualities of Claims in an AUA
Meaningfulness Warrant 2 provides the conditions under which we will observe or elicit test takers’ performance.
Generic version of meaningfulness Warrant 2: The assessment task specifications clearly specify the conditions under which we will observe or elicit performance from which we can make inferences about the construct we intend to assess.
{EXAMPLES OF MEANINGFULNESS WARRANT 2, pp. 5, 28}

Qualities of Claims in an AUA
Your turn:
1. Adapt Claim 3 to your project.
2. Adapt meaningfulness Warrant 1 to your project.
3. Adapt meaningfulness Warrant 2 to your project.
4. Create an example assessment task for your project.

How do we assess?
Think about the following:
• Why we want to assess (decisions and consequences)
• What we want to assess (interpretations about learners’ language ability)
• The “target language use domains” to which we want the interpretations to generalize, e.g., the language classroom, or school (other classes)

How do we assess?
Use or design assessment tasks that correspond to tasks in the target language use domain.
• Teaching and learning tasks in the language classroom
• Language use tasks that learners need to perform in other classes in school
• Language use tasks that learners need to perform in the community or workplace

[Figure: diagram linking Assessment (Gather Information), Teaching & Learning (Consequences), and Evaluation (Decisions), annotated with the questions What? How? Why? When?]

Qualities of Claims in an AUA
Generalizability warrants describe:
• the TLU domain,
• the tasks in the TLU domain, and
• the correspondence between characteristics of the TLU task and the assessment task.
{EXAMPLES OF GENERALIZABILITY WARRANTS, pp. 12, 15}

Qualities of Claims in an AUA
Your turn:
1. Adapt the generalizability warrant for your project.
2. Specify the TLU domain for your project.
3. Describe the characteristics of a TLU task using the task characteristics template.
4. Create an assessment task and describe its characteristics using the task characteristics template.
5. Compare the characteristics of the TLU and assessment tasks.

Summary
Five things to think about before using an assessment

1. Begin with consequences.
What beneficial consequences do I want to bring about?
• How will using an assessment help my students improve their learning?
• How will using an assessment help me improve my teaching?
• How might using an assessment be detrimental to my students?

2. Consider decisions.
What decisions do I need to make?
• What decisions do I need to make to help my students improve their learning?
• What decisions do I need to make to improve my teaching?
• How can I make sure that my decisions are equitable and values-sensitive?

3. Identify the information you need.
What information about test takers do I need in order to make these decisions?
• Do I need to know if students have mastered the learning objectives of the lesson or the course?
• Do I need to know if students are ready for the next grade or level in the program?
• Do I need to know if students will be able to perform language use tasks in university, or in a job?

4. Consider the quality of the information you need.
How will I make sure that the information I collect about my students is:
• Meaningful (e.g., reflects the content of the lesson or course)
• Impartial (i.e., not biased for or against any particular student or group of students)
• Generalizable, relevant (i.e., tells me something about my students’ ability to use language in settings outside the test itself)
• Sufficient (i.e., provides enough information for me to make a decision)

5. Consider how you will get the information you need.
How can I get the information I need?
• Can I obtain this from observing students in my class?
• Do I need to make a conscious effort to informally assess my students more regularly and consistently?
• Do I need to give my students a formal assessment or test?
• How will I report the results of my observations or assessment (e.g., scores, profile of strengths and areas for improvement, verbal descriptions as feedback on their work)?
• How will I make sure that my reports are consistent?

Thank you!

Selected References
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing, 17(1), 1-42.
Bachman, L. F. (2004). Statistical analyses for language assessment. Cambridge: Cambridge University Press.
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 2(1), 1-34.
Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford: Oxford University Press.