Designing and Developing Useful Language Tests

Developing Language Assessments
and Justifying their Use
Lyle F. Bachman
Department of Applied Linguistics
University of California, Los Angeles
lfb@humnet.ucla.edu
Presentation based on Bachman, L. F. & Palmer, A. S. (2010).
Language assessment in practice: Developing language
assessments and justifying their use in the real world. Oxford:
Oxford University Press.
When you need to assess your
students, where do you begin?
a. The type of assessment task I will use
b. How I will maintain test security
c. How I can help my students do well so
that they will succeed after they finish
my class
d. How I can help my students do well so
that they will make me look good
e. All of the above.
Teachers’ questions about
assessment
Teachers almost always ask:
 When should I assess?
 How often should I assess?
 How should I assess?
Teachers seldom ask:
 What should I assess?
Teachers almost never ask:
 Why should I assess?
Topics in this presentation
 The purposes of teaching and assessment in the classroom
 Two modes of classroom assessment
 Deciding why we want to assess: identifying intended
beneficial consequences
 Deciding why we want to assess: identifying decisions to
be made
 Deciding what we want to assess: defining constructs
 Deciding when to assess
 Deciding how we want to assess: designing assessment
tasks
 Relating classroom assessment to assessment purpose: 5
things to think about
Why do we teach?
The primary purposes of language
teaching are to (e.g.):
promote or facilitate learning;
enhance learners’ linguistic, cognitive,
emotional, and social development
Why do we assess?
The primary purpose of classroom
assessment is:
to gather information
to help us make decisions
that will lead to beneficial consequences
for stakeholders (learners, teachers).
[Figure: chain linking, from top to bottom, Beneficial Consequences; Decision(s); Interpretation(s) about learner’s language ability; Assessment Report; Assessment Performance]
[Figure: cycle linking Assessment (information), Teaching and Learning (consequences), and Evaluation (decisions)]
Teaching & learning tasks, assessment tasks
Primary purpose:
 Teaching & learning activities/tasks
• Promote or facilitate learning
• Enhance learners’ linguistic, cognitive, emotional, and social development
 Assessment activities/tasks
• Gather information to inform decisions
Why do we assess?
In the classroom, we use language
assessments to inform two kinds of
decisions, formative and summative.
Formative decisions relate to making
changes in teaching and learning
activities in support of, or to promote or
enhance learning.
Formative decisions are made during the
processes of teaching and learning.
Why do we assess?
In the language classroom, we use
assessments to inform two kinds of
decisions, formative and summative.
Summative decisions relate to passing or
failing students on the basis of their
progress or achievement, or certifying
them based on their level of ability.
Summative decisions are made after the
processes of teaching and learning.
Why do we assess?
Formative decisions:
 Teachers make decisions about:
 changing their teaching (materials, activities);
 presenting, revising, contextualizing, and
scaffolding new material;
 placing learners into appropriate groups or
levels;
 guiding their students’ learning;
 challenging and motivating their students to
learn.
Why do we assess?
Formative decisions:
 Learners make decisions about making
changes:
 in their approaches to or strategies of learning;
 in the particular areas on which they may need
or want to place greater emphasis.
Why do we assess?
Summative decisions:
 Teachers and administrators make
decisions about:
 which students pass or fail a course;
 which students are certified at a particular
level of ability.
Modes of classroom assessment
Two modes of classroom assessment:
 Implicit
 Explicit
Modes of classroom assessment
Implicit mode: (“dynamic
assessment”, “on-line assessment”,
“continuous assessment”)
 Instantaneous and cyclical:
• assessment – decision – instruction;
• assessment – decision – instruction
 Learners are largely unaware that
assessment is taking place.
 Used primarily for formative decisions.
Modes of classroom assessment
Explicit mode: Assessment as
“assessment”
Separate activity from teaching
Both teacher and learners know this
activity is an assessment.
Used for both formative and
summative decisions.
Mode: Implicit
Characteristics:
 Continuous
 Instantaneous
 Cyclical
 Both teacher and students may be unaware that assessment is taking place
Purpose:
Formative decisions, e.g.:
 Correct or do not correct a student’s response
 Change form of questioning
 Call on another student
 Produce a model utterance
 Request a group response
Summative decisions, e.g.:
 Pass/fail decision based partly on “classroom participation” or performance
Mode: Explicit
Characteristics:
 Clearly distinct from teaching
 Both teacher and learners aware that assessment is taking place
Purpose:
Summative decisions, e.g.:
 Decide who passes the course
 Certify level of ability
Formative decisions, e.g.:
 Teacher: move on to next lesson or review current lesson
 Teacher: focus more on a specific area of content
 Student: spend more time on a particular area of language ability
 Student: use a different learning strategy
Explicit mode of classroom
assessment
 When do we assess?
 What are our intended consequences?
 What decisions do we need to make?
 What information about learners’
language ability do we need to collect?
 How will we collect this information?
When do we assess?
Whenever we need to make an
instructional decision, or a decision
about learners, we need to assess.
When do we assess?
Occasions for classroom assessment
Warm-up, revision (self-assessment, implicit
assessment)
Presentation (implicit assessment)
Guided practice (implicit assessment)
Independent practice (self-assessment)
“Assessment” (explicit assessment)
Types of decisions for which
language assessments are used
 Guiding teaching and learning
 Entrance, readiness
 Placement
 Achievement/progress
 Certification
 Selection (e.g., employment, immigration)
Uses of language assessments
Many of these decisions are “high stakes”.
Need to ask:
1. What beneficial consequences do we want to bring
about?
2. What decisions do we need to make to help promote the
intended consequences?
3. What information about learners do we need to make the
most appropriate decision?
4. How can we gather this information?
 Teachers’ judgments?
 Classroom assessments?
 Self assessments?
 Formal tests?
[Figure: ASSESSMENT DEVELOPMENT. Chain linking, from top to bottom, Intended Consequences; Decisions to be made; Interpretation(s) about learners’ language ability; Assessment Record; Assessment Performance]
[Figure: the same chain, labeled with both ASSESSMENT DEVELOPMENT and ASSESSMENT INTERPRETATION AND USE]
Uses of language assessments
 Your Turn:
 For your assessment project, answer these
questions:
1. What beneficial consequences do I/we want to
bring about?
2. What decisions do I/we need to make to help
promote the intended consequences?
3. What information about learners do I/we need to
make the most appropriate decision?
4. How can I/we gather this information?
Accountability
 We must be able to justify the use we
make of a language test.
 That is, we need to be ready if we are
held accountable for the use we make of
a language test.
 In other words, we need to be prepared
to convince stakeholders that the
intended uses of our test are justified.
Whom do we need to
convince?
All stakeholders:
 Ourselves
 Our fellow teachers
 Test takers (our students)
 Program/department/university
administrators
 Parents, guardians
 Other stakeholders (e.g., potential
employers, funding agencies)
Uses of language assessments
 Your Turn:
 For your assessment project, describe
the stakeholders.
How do we do this?
We need a conceptual framework
that will enable us to justify the
intended uses of our assessments.
An “Assessment Use Argument”
(AUA) provides such a framework.
How do we do this?
Two activities in justifying the uses of our
assessments:
 Develop an Assessment Use Argument
(AUA) that the intended uses of our
assessment are justified, and
 Collect backing (evidence), or be
prepared to collect backing in support of
the AUA.
Assessment Use Argument
Provides:
the rationale and justification for the
decisions we make in designing and
developing the test, and
the logical framework for linking our
intended consequences and decisions
to the test taker’s performance.
Parts of an Assessment Use
Argument
Claims: statements about our intended
interpretations and uses of test
performance; claims have two parts:
• An outcome
• One or more qualities claimed for the
outcome
Data: information on which the claim is
based.
Parts of an Assessment Use
Argument
Warrants: statements justifying the claims
Rebuttals: statements about possible
alternatives to the outcomes or to the
qualities that are stated in the claims.
Backing: the evidence that we need to
collect to support the claims and warrants
in the AUA.
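One way to keep track of these parts while drafting an AUA is to record each claim in a small data structure. The sketch below is only an illustration, not part of the Bachman & Palmer framework itself: the class name `Claim`, the method `needs_backing`, and the example wording are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    """One AUA claim: an outcome plus the qualities claimed for it."""
    outcome: str
    qualities: List[str]
    data: str = ""  # information on which the claim is based
    warrants: List[str] = field(default_factory=list)   # statements justifying the claim
    rebuttals: List[str] = field(default_factory=list)  # possible alternatives to the claim
    backing: List[str] = field(default_factory=list)    # evidence collected so far

    def needs_backing(self) -> bool:
        """True when warrants are stated but no evidence has been collected."""
        return bool(self.warrants) and not self.backing


# Hypothetical wording for Claim 1 (outcome: consequences, quality: beneficial).
claim_1 = Claim(
    outcome="Consequences",
    qualities=["Beneficial"],
    data="Decisions made on the basis of the assessment",
    warrants=["Using the assessment helps students improve their learning."],
)
print(claim_1.needs_backing())  # prints True: no backing has been recorded yet
```

Listing claims this way makes it easy to see which warrants still lack backing before the assessment is used.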
[Figure: outcomes claimed in an AUA and the qualities claimed for each]
 Consequences: Beneficial
 Decisions: Equitable, Values-sensitive
 Interpretations about test taker’s language ability: Meaningful, Impartial, Generalizable, Relevant, Sufficient
 Assessment Reports/Scores: Consistent
 Assessment Performance
Articulating Claims for Intended
Uses
(Table 1 in the Handout)
Qualities of Claims in an
AUA
Claim 1
 Outcome: Consequences
 Quality: Beneficence
Articulate Claim 1: list and
describe:
 The intended consequences
 The stakeholders
Qualities of Claims in an
AUA
Generic version of Claim 1:
The consequences of using an
assessment and of the decisions that
are made are beneficial to
stakeholders.
{EXAMPLES OF CLAIM 1, pp. 2, 24}
Qualities of Claims in an
AUA
Your turn:
Adapt Claim 1 to your project
Qualities of Claims in an
AUA
Claim 2
 Outcome: Decisions
 Qualities:
• Values-sensitivity
• Equitability
Qualities of Claims in an
AUA
Generic version of Claim 2:
The decisions that are made on the
basis of the interpretation take into
consideration existing educational
and societal values and relevant legal
requirements and are equitable for
those stakeholders who are affected
by the decisions.
{EXAMPLES OF CLAIM 2, pp. 3, 27}
Qualities of Claims in an
AUA
Your turn:
Adapt Claim 2 to your project
TAKE A BREAK!
What do we assess?
 Learning objectives
 “Content” of the syllabus or curriculum
 “Content” of lesson plans
 “Content” of teaching and learning
materials
 “Content” of teaching & learning activities
 Language ability, proficiency
Qualities of Claims in an
AUA
Claim 3
 Outcome: Interpretation
 Qualities:
• Meaningfulness
• Impartiality
• Generalizability
• Relevance
• Sufficiency
Qualities of Claims in an
AUA
Generic version of Claim 3:
The interpretations about the ability to be
assessed are:
 meaningful with respect to a particular learning
syllabus, a needs analysis of the abilities
needed to perform tasks in the target language use
(TLU) domain, a general theory of language ability,
or any combination of these,
 impartial to all groups of test takers,
 generalizable to the TLU domain,
 relevant to the decision to be made, and
 sufficient for the decision to be made.
{EXAMPLES OF CLAIM 3, pp. 5, 28}
Qualities of Claims in an
AUA
Meaningfulness warrants define the ability we
want to assess, with respect to one or more
frames of reference, and specify the conditions
under which test takers’ performance will be
elicited.
Meaningfulness Warrant 1 provides a descriptive
label and definition of the ability to be assessed.
Generic version of meaningfulness Warrant 1:
The definition of the construct is based on a
frame of reference such as a teaching syllabus, a
needs analysis, or current research and/or
theory of language use, and clearly
distinguishes the construct from other, related
constructs.
{EXAMPLES OF MEANINGFULNESS WARRANT 1,
pp. 5, 28}
Qualities of Claims in an
AUA
 Meaningfulness Warrant 2 provides the
conditions under which we will observe
or elicit test takers’ performance.
 Generic version of meaningfulness
Warrant 2: The assessment task
specifications clearly specify the
conditions under which we will observe
or elicit performance from which we can
make inferences about the construct we
intend to assess.
{EXAMPLES OF MEANINGFULNESS
WARRANT 2, pp. 5, 28}
Qualities of Claims in an
AUA
Your turn:
1. Adapt Claim 3 to your project.
2. Adapt meaningfulness Warrant 1
to your project.
3. Adapt meaningfulness Warrant 2
to your project.
4. Create an example assessment
task for your project.
How do we assess?
Think about the following:
 Why we want to assess (decisions and
consequences)
 What we want to assess (interpretations
about learners’ language ability)
 The “target language use domains” to
which we want the interpretations to
generalize
• Language classroom
• School—other classes
How do we assess?
Use or design assessment tasks that
correspond to tasks in the target language
use domain.
Teaching and learning tasks in the
language classroom
Language use tasks that learners need
to perform in other classes in school
Language use tasks that learners need
to perform in the community or work
place
[Figure: cycle linking Assessment (gather information), Teaching & Learning (consequences), and Evaluation (decisions), annotated with the questions What? How? Why? When?]
Qualities of Claims in an
AUA
Generalizability warrants describe:
 the TLU domain
 the tasks in the TLU domain, and
 the correspondence between the
characteristics of the TLU task and those of the
assessment task
{EXAMPLES OF GENERALIZABILITY
WARRANTS, pp. 12, 15}
Qualities of Claims in an
AUA
Your turn:
1. Adapt the generalizability warrant for your
project.
2. Specify the TLU domain for your project.
3. Describe the characteristics of a TLU task
using the task characteristics template.
4. Create an assessment task and describe its
characteristics using the task characteristics
template.
5. Compare the characteristics of the TLU and
assessment tasks.
Summary
Five things to think about before
using an assessment
1. Begin with consequences.
1. What beneficial consequences do I
want to bring about?
 How will using an assessment help my
students improve their learning?
 How will using an assessment help me
improve my teaching?
 How might using an assessment be
detrimental to my students?
2. Consider decisions.
2. What decisions do I need to make?
 What decisions do I need to make to help
my students improve their learning?
 What decisions do I need to make to
improve my teaching?
 How can I make sure that my decisions
are equitable and values sensitive?
3. Identify the information you
need.
3. What information about test takers do I
need in order to make these decisions?
 Do I need to know if students have mastered the
learning objectives of the lesson or the course?
 Do I need to know if students are ready for the
next grade or level in the program?
 Do I need to know if students will be able to
perform language use tasks in university, or in a
job?
4. Consider the quality of the
information you need.
4. How will I make sure that the information
I collect about my students is:
 Meaningful (e.g., reflects the content of the
lesson or course)
 Impartial (i.e., not biased for or against any
particular student or group of students)
 Generalizable, relevant (i.e., tells me something
about my students’ ability to use language in
settings outside the test itself)
 Sufficient (i.e., provides enough information for
me to make a decision)
5. Consider how you will get
the information you need.
5. How can I get the information I need?
 Can I obtain this from observing students in my
class?
 Do I need to make a conscious effort to informally
assess my students more regularly and
consistently?
 Do I need to give my students a formal
assessment or test?
 How will I report the results of my observations or
assessment? (e.g., scores, profile of strengths
and areas for improvement, verbal descriptions as
feedback on their work)
 How will I make sure that my reports are
consistent?
Thank you!
Selected References
Bachman, L. F. (1990). Fundamental considerations in
language testing. Oxford: Oxford University Press.
Bachman, L. F. (2000). Modern language testing at the
turn of the century: Assuring that what we count
counts. Language Testing, 17(1), 1-42.
Bachman, L. F. (2004). Statistical analyses for language
assessment. Cambridge: Cambridge University Press.
Bachman, L. F. (2005). Building and supporting a case for
test use. Language Assessment Quarterly, 2(1), 1-34.
Bachman, L. F., & Palmer, A. S. (2010). Language
assessment in practice: Developing language assessments
and justifying their use in the real world. Oxford:
Oxford University Press.