Dylan_WIliam-SSAT-Annual-Conference

Embedded formative assessment:
still more rhetoric than reality
National Conference of The Schools Network 2011
Dylan Wiliam
www.dylanwiliam.net
Origins and antecedents
Feedback (Wiener, 1948)
  Developing range-finders for anti-aircraft guns
  Effective action requires a closed system within which
    Actions taken within the system are evaluated
    Evaluation of the actions leads to modification of future actions
  Two kinds of loops
    Positive (bad: leads to collapse or explosive growth)
    Negative (good: leads to stability)
“Feedback is information about the gap between the actual level and the reference level of a system parameter which is used to alter the gap in some way” (Ramaprasad, 1983, p. 4)
Feedback and instructional correctives (Bloom)
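The two kinds of loop can be illustrated with a minimal sketch. It assumes a single system parameter, a fixed reference level, and a simple proportional update; the function name `run_loop`, the `gain` value and the starting numbers are hypothetical and do not come from the talk.

```python
def run_loop(start, reference, gain, steps, negative=True):
    """Simulate a simple feedback loop acting on one system parameter.

    At each step the gap between the reference level and the actual level
    is fed back into the system. Negative feedback shrinks the gap;
    positive feedback enlarges it.
    """
    level = start
    history = [level]
    for _ in range(steps):
        gap = reference - level
        # Negative feedback moves the level toward the reference;
        # positive feedback pushes it further away.
        level += gain * gap if negative else -gain * gap
        history.append(level)
    return history


# Hypothetical numbers: start at 15, reference level 20, gain 0.5.
stable = run_loop(start=15.0, reference=20.0, gain=0.5, steps=20, negative=True)
runaway = run_loop(start=15.0, reference=20.0, gain=0.5, steps=20, negative=False)

print("gap after negative feedback:", abs(20.0 - stable[-1]))
print("gap after positive feedback:", abs(20.0 - runaway[-1]))
```

With these assumed values the negative loop settles on the reference level, while the positive loop drifts ever further from it, which is the sense in which the slide calls one stable and the other explosive.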
What’s wrong with the feedback metaphor?
In education | In engineering
Feedback is any information given to the student about their current performance | That’s just data
… or at best, information that compares current performance with desired performance | That’s just a thermostat
Much rarer is information that can be used by learners to improve | That’s a feedback system
Feedback has complex effects
264 low and high ability grade 6 students in 12 classes in 4 schools; analysis of 132 students at top and bottom of each class
Same teaching, same aims, same teachers, same classwork
Three kinds of feedback: scores, comments, scores+comments
Feedback | Achievement | Attitude
Scores | no gain | High scorers: positive; low scorers: negative
Comments | 30% gain | High scorers: positive; low scorers: positive
Butler (1988) Br. J. Educ. Psychol., 58 1-14
Responses
Feedback | Achievement | Attitude
Scores | no gain | High scorers: positive; low scorers: negative
Comments | 30% gain | High scorers: positive; low scorers: positive
What do you think happened for the students given both scores and comments?
A. Gain: 30%; Attitude: all positive
B. Gain: 30%; Attitude: high scorers positive, low scorers negative
C. Gain: 0%; Attitude: all positive
D. Gain: 0%; Attitude: high scorers positive, low scorers negative
E. Something else
Students and grades
Feedback is not always effective
200 grade 5 and 6 Israeli students
Divergent thinking tasks
4 matched groups
  experimental group 1 (EG1): comments
  experimental group 2 (EG2): grades
  experimental group 3 (EG3): praise
  control group (CG): no feedback
Achievement
  EG1 > (EG2 ≈ EG3 ≈ CG)
Ego-involvement
  (EG2 ≈ EG3) > (EG1 ≈ CG)
Butler (1987) J. Educ. Psychol. 79 474-482
Feedback should feed forward
80 Grade 8 Canadian students learning to write major scales in Music
Experimental group 1 (EG1) given
  written praise
  list of weaknesses
  workplan
Experimental group 2 (EG2) given
  oral feedback
  nature of errors
  chance to correct errors
Control group (CG1) given
  no feedback
Achievement: EG2>(EG1≈CG)
Boulet et al. (1990) J. Educational Research 84 119-125
…and should leave learning with the learner
‘Peekability’ (Simmonds & Cope, 1993)
  Pairs of students, aged 9-11
  Angle and rotation problems
    class 1 worked on paper
    class 2 worked on a computer, using Logo
  Class 1 outperformed class 2
‘Scaffolding’ (Day & Cordón, 1993)
  2 grade 3 classes
    class 1 given ‘scaffolded’ response
    class 2 given solution when stuck
  Class 1 outperformed class 2
Effects of feedback
Kluger & DeNisi (1996)
Review of 3000 research reports
Excluding those:
  without adequate controls
  with poor design
  with fewer than 10 participants
  where performance was not measured
  without details of effect sizes
left 131 reports, 607 effect sizes, involving 12652 individuals
On average feedback does improve performance, but
  Effect sizes very different in different studies
  40% of effect sizes were negative
Getting feedback right is hard
Response type | Feedback indicates performance exceeds goal | Feedback indicates performance falls short of goal
Change behavior | Exert less effort | Increase effort
Change goal | Increase aspiration | Reduce aspiration
Abandon goal | Decide goal is too easy | Decide goal is too hard
Reject feedback | Feedback is ignored | Feedback is ignored
Feedback practice audit
How often do students receive ‘feedback’ in the form of scores, levels, sub-levels, or grades?
A. Key stages 1 to 3
B. Key stage 4
C. Key stage 5
1. Every week
2. Every two or three weeks
3. Every month or half-term
4. Termly/twice a year
5. Annually
Kinds of feedback (Nyquist, 2003)
Weaker feedback only
  Knowledge of results (KoR)
Feedback only
  KoR + clear goals or knowledge of correct results (KCR)
Weak formative assessment
  KCR + explanation (KCR+e)
Moderate formative assessment
  (KCR+e) + specific actions for gap reduction
Strong formative assessment
  (KCR+e) + activity
Effects of formative assessment (HE)
Kind of feedback | Count | Effect/sd
Weaker feedback only | 31 | 0.14
Feedback only | 48 | 0.36
Weaker formative assessment | 49 | 0.26
Moderate formative assessment | 41 | 0.39
Strong formative assessment | 16 | 0.56
Feedback practice audit 2
In your school, what proportion of feedback events involve students in responding to the feedback provided immediately, and in class?
1. Less than 10%
2. 10% to 30%
3. 30% to 70%
4. 70% to 90%
5. More than 90%
Unfortunately, humans are not machines…
Attribution (Dweck, 2000)
  Personalization (internal v external)
  Permanence (stable v unstable)
  Essential that students attribute both failures and success to internal, unstable causes (it’s down to you, and you can do something about it)
Dimension | Success | Failure
Personalization, internal | “I got a good grade because I did a good piece of work” | “I got a low grade because it wasn’t a very good piece of work”
Personalization, external | “I got a good grade because the teacher likes me” | “I got a low grade because the teacher doesn’t like me”
Stability, stable | “I got a good grade because I’m good at that subject” | “I got a bad grade because I’m no good at that subject”
Stability, unstable | “I got a good grade because I was lucky in the questions that came up” | “I got a bad grade because I hadn’t reviewed the material before the test”
Specificity, specific | “I’m good at that but that’s the only thing I’m good at” | “I’m no good at that but I’m good at everything else”
Specificity, global | “I’m good at that means I’ll be good at everything” | “I’m useless at everything”
Mindset
Views of ‘ability’
  fixed (IQ)
  incremental (untapped potential)
Essential that teachers inculcate in their students a view that ‘ability’ is incremental rather than fixed (by working, you’re getting smarter)
Force-field analysis (Lewin, 1954)
(+) What are the forces that will support or drive the adoption of formative assessment practices in your school/authority?
(-) What are the forces that will constrain or prevent the adoption of formative assessment practices in your school/authority?
“Flow”
A dancer describes how it feels when a performance is going well: “Your
concentration is very complete. Your mind isn’t wandering, you are not
thinking of something else; you are totally involved in what you are doing.
… Your energy is flowing very smoothly. You feel relaxed, comfortable and
energetic.”
A rock climber describes how it feels when he is scaling a mountain: “You
are so involved in what you are doing [that] you aren’t thinking of yourself
as separate from the immediate activity. … You don’t see yourself as
separate from what you are doing.”
A mother who enjoys the time spent with her small daughter: “Her
reading is the one thing she’s really into, and we read together. She reads
to me and I read to her, and that’s a time when I sort of lose touch with
the rest of the world, I’m totally absorbed in what I’m doing.”
A chess player tells of playing in a tournament: “… the concentration is
like breathing—you never think of it. The roof could fall in and, if it missed
you, you would be unaware of it.” (Csikszentmihalyi, 1990, pp. 53–54)
Motivation: cause or effect?
[Figure: chart of experience by challenge (low to high) and competence (low to high), with regions labelled apathy, worry, anxiety, arousal, flow, control, relaxation and boredom. Csikszentmihalyi (1990)]
Providing feedback that moves learning on
Key idea: feedback should:
  Cause thinking
  Provide guidance on how to improve
Comment-only marking
Focused marking
Explicit reference to mark schemes/scoring guide
Suggestions on how to improve:
  Not giving complete solutions
Re-timing assessment:
  e.g., three-quarters-of-the-way-through-a-unit test
A blossoming of research reviews…
Fuchs & Fuchs (1986)
Natriello (1987)
Crooks (1988)
Bangert-Drowns et al. (1991)
Dempster (1991, 1992)
Elshout-Mohr (1994)
Kluger & DeNisi (1996)
Black & Wiliam (1998)
Nyquist (2003)
Brookhart (2004)
Allal & Lopez (2005)
Köller (2005)
Brookhart (2007)
Wiliam (2007)
Hattie & Timperley (2007)
Shute (2008)
Effects of formative assessment
Standardized effect size: differences in means, measured in population standard deviations
Source | Effect size
Kluger & DeNisi (1996) | 0.41
Black & Wiliam (1998) | 0.4 to 0.7
Wiliam et al. (2004) | 0.32
Hattie & Timperley (2007) | 0.96
Shute (2008) | 0.4 to 0.8
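The definition of standardized effect size given on this slide can be written out as a short sketch. The scores and the population standard deviation below are invented purely for illustration and are not taken from any of the studies listed; the function name `standardized_effect_size` is likewise hypothetical.

```python
from statistics import mean


def standardized_effect_size(treatment_scores, control_scores, population_sd):
    """Difference in group means, expressed in population standard deviations."""
    return (mean(treatment_scores) - mean(control_scores)) / population_sd


# Hypothetical data: two small groups and an assumed population SD of 10.
treatment_scores = [62, 70, 75, 68, 80]   # mean 71
control_scores = [60, 66, 71, 64, 74]     # mean 67
effect = standardized_effect_size(treatment_scores, control_scores, population_sd=10)

print(f"effect size = {effect:.2f}")
```

On these made-up numbers a four-point difference in means against a standard deviation of ten gives an effect size of 0.4, which is the kind of figure reported in the table.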
Problems with effect sizes
Restriction of range
Sensitivity to instruction
Ambiguous comparisons
Definitions of formative assessment
“We use the general term assessment to refer to all those activities
undertaken by teachers—and by their students in assessing themselves—
that provide information to be used as feedback to modify teaching and
learning activities. Such assessment becomes formative assessment when
the evidence is actually used to adapt the teaching to meet student
needs” (Black & Wiliam, 1998 p. 140)
“the process used by teachers and students to recognise and respond to
student learning in order to enhance that learning, during the learning”
(Cowie & Bell, 1999 p. 32)
“assessment carried out during the instructional process for the purpose
of improving teaching or learning” (Shepard et al., 2005 p. 275)
“Formative assessment refers to frequent, interactive
assessments of students’ progress and understanding to identify
learning needs and adjust teaching appropriately” (Looney,
2005, p. 21)
“A formative assessment is a tool that teachers use to measure
student grasp of specific topics and skills they are teaching. It’s a
‘midstream’ tool to identify specific student misconceptions and
mistakes while the material is being taught” (Kahl, 2005 p. 11)
“Assessment for Learning is the process of seeking and interpreting evidence
for use by learners and their teachers to decide where the learners are in
their learning, where they need to go and how best to get there” (Broadfoot
et al., 2002 pp. 2-3)
Assessment for learning is any assessment for which the first priority in its
design and practice is to serve the purpose of promoting students’ learning.
It thus differs from assessment designed primarily to serve the purposes of
accountability, or of ranking, or of certifying competence. An assessment
activity can help learning if it provides information that teachers and their
students can use as feedback in assessing themselves and one another and in
modifying the teaching and learning activities in which they are engaged.
Such assessment becomes “formative assessment” when the evidence is
actually used to adapt the teaching work to meet learning needs. (Black et
al., 2004 p. 10)
Which of these is formative?
A. A science adviser uses test results to plan professional
development workshops for teachers
B. Teachers doing item-by-item analysis of KS2 math tests to
review their curriculum
C. A school tests students every 10 weeks to predict which
students are “on course” to pass a big test
D. A “three-fourths-of-the-way-through-a-unit” test
E. Exit pass question: “What is the difference between mass
and weight?”
F. “Sketch the graph of y equals one over one plus x squared
on your mini-dry-erase boards.”
What does formative assessment form?
Cycle length: Long, Medium, Short
Student involved assessment
Student engagement
Teacher cognition about learning
Curriculum alignment
Monitoring progress
Responsive classroom practice
Formative assessment: a new definition
“An assessment functions formatively to the extent
that evidence about student achievement elicited by
the assessment is interpreted and used to make
decisions about the next steps in instruction that are
likely to be better, or better founded, than the
decisions that would have been taken in the absence
of that evidence.” (Wiliam, 2009)

Formative assessment involves the creation of, and
capitalization upon, moments of contingency in the
regulation of learning processes.
Unpacking formative assessment
Key processes
  Establishing where the learners are in their learning
  Establishing where they are going
  Working out how to get there
Participants
  Teachers
  Peers
  Learners
Unpacking formative assessment
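Participant | Where the learner is going | Where the learner is | How to get there
Teacher | Clarifying, sharing and understanding learning intentions | Engineering effective discussions, tasks, and activities that elicit evidence of learning | Providing feedback that moves learners forward
Peer | Clarifying, sharing and understanding learning intentions | Activating students as learning resources for one another | Activating students as learning resources for one another
Learner | Clarifying, sharing and understanding learning intentions | Activating students as owners of their own learning | Activating students as owners of their own learning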
Five “key strategies”…
Clarifying, sharing, and understanding learning intentions
  curriculum philosophy
Engineering effective classroom discussions, tasks and activities that elicit evidence of learning
  classroom discourse, interactive whole-class teaching
Providing feedback that moves learners forward
  feedback
Activating students as learning resources for one another
  collaborative learning, reciprocal teaching, peer-assessment
Activating students as owners of their own learning
  metacognition, motivation, interest, attribution, self-assessment
Wiliam & Thompson (2007)
Unpacking formative assessment
Columns: Where the learner is going | Where the learner is | How to get there
Rows: Teacher | Peer | Learner
Clarifying, sharing and understanding learning intentions
Using evidence of achievement to adapt what happens in classrooms to meet learner needs
Clarifying, sharing, and
understanding learning intentions
Sharing learning intentions
3 teachers each teaching 4 Year 8 science classes in two US schools
14 week experiment
7 two-week projects, each scored 2-10
All teaching the same, except:
For a part of each week
  Two of each teacher’s classes discuss their likes and dislikes about the teaching (control)
  The other two classes discuss how their work will be assessed
White & Frederiksen, Cognition & Instruction, 16(1), 1998
Sharing learning intentions
Comprehensive Test of Basic Skills
Group | Low | Middle | High
Likes and dislikes | | |
Reflective assessment | | |
Outcomes
Who will benefit most from the reflective assessment?
1. Higher achievers
2. Average achievers
3. Lower achievers
4. All students will benefit equally
Sharing learning intentions
Comprehensive Test of Basic Skills
Group | Low | Middle | High
Likes and dislikes | 4.6 | 5.9 | 6.6
Reflective assessment | | |
Sharing learning intentions
Comprehensive Test of Basic Skills
Group | Low | Middle | High
Likes and dislikes | 4.6 | 5.9 | 6.6
Reflective assessment | 6.7 | 7.2 | 7.4
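The pattern in the completed table is easier to see once the differences are computed. The sketch below uses only the six scores shown on the slide; the variable names are hypothetical, and the figures are simple subtractions rather than anything reported by White & Frederiksen.

```python
# Mean project scores from the slide, by prior attainment band.
likes_and_dislikes = {"Low": 4.6, "Middle": 5.9, "High": 6.6}
reflective_assessment = {"Low": 6.7, "Middle": 7.2, "High": 7.4}

# Difference between the two conditions within each band.
difference = {
    band: round(reflective_assessment[band] - likes_and_dislikes[band], 1)
    for band in likes_and_dislikes
}

# Gap between the High and Low bands within each condition.
gap_control = round(likes_and_dislikes["High"] - likes_and_dislikes["Low"], 1)
gap_reflective = round(reflective_assessment["High"] - reflective_assessment["Low"], 1)

print(difference)
print("High-Low gap, likes and dislikes:", gap_control)
print("High-Low gap, reflective assessment:", gap_reflective)
```

Every band scores higher in the reflective assessment classes, the difference is largest for the Low band, and the gap between the High and Low bands is smaller in the reflective assessment condition.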
Sharing learning intentions
Explain learning intentions at start of lesson/unit:
  Learning intentions
  Success criteria
Consider providing learning intentions and success criteria in students’ language.
Use posters of key words to talk about learning:
  e.g., describe, explain, evaluate
Use planning and writing frames judiciously
Use annotated examples of different standards to “flesh out” assessment rubrics (e.g., lab reports)
Provide opportunities for students to design their own tests
Engineering effective discussions,
activities, and classroom tasks that elicit
evidence of learning
Eliciting evidence
Key idea: questioning should
  cause thinking
  provide data that informs teaching
Improving teacher questioning
  generating questions with colleagues
  closed v open
  low-order v high-order
  appropriate wait-time
Medicine Hat Tigers
A major junior (ice) hockey team playing in the Central Division of the Eastern Conference of the Western Hockey League in Canada
Players are aged from 15 to 20
  15 year olds are only allowed to play five games until their own season has ended
  Each team is allowed only three 20 year olds
  Total roster 25 players
Medicine Hat Tigers
[Figure: chart by month, Jan to Dec, with a vertical axis running from 0 to 8]
Eliciting evidence
Getting away from I-R-E
  basketball rather than serial table-tennis
  ‘No hands up’ (except to ask a question)
  ‘Hot Seat’ questioning
All-student response systems
  ABCD cards, Mini white-boards, Exit passes
Nothing new under the sun…
Eliciting evidence practice audit
In what proportion of lessons in your school would a teacher use an ‘all student response’ system at least every 30 minutes?
1. Less than 10%
2. 10% to 30%
3. 30% to 70%
4. 70% to 90%
5. More than 90%
Hinge questions
A hinge question is based on the important concept in a
lesson that is critical for students to understand before you
move on in the lesson.

The question should fall about midway during the lesson.

Every student must respond to the question within two
minutes.

You must be able to collect and interpret the responses
from all students in 30 seconds
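One way to picture the “collect and interpret in 30 seconds” requirement is as a quick tally of every student's choice. The sketch below is only an illustration: the responses are invented, and the 80% threshold for moving on is an assumption made here, not a figure from the talk.

```python
from collections import Counter


def summarize_responses(responses, correct_option, threshold=0.8):
    """Tally all-student responses to a hinge question.

    `threshold` is a hypothetical cut-off for deciding to move on;
    the slides do not specify one.
    """
    counts = Counter(responses)
    share_correct = counts[correct_option] / len(responses) if responses else 0.0
    return counts, share_correct, share_correct >= threshold


# Invented ABCD-card responses from ten students; "C" is taken as correct.
responses = list("CCBCACCDCC")
counts, share_correct, move_on = summarize_responses(responses, correct_option="C")

print(counts.most_common())
print(f"share correct: {share_correct:.0%}, move on: {move_on}")
```

The tally of wrong options matters as much as the share correct, because each distractor can point to a different misconception that the teacher may want to address before moving on.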
Questioning in maths: Diagnosis
In which of these right-angled triangles is $a^2 + b^2 = c^2$?
[Figure: six right-angled triangles, A to F, with sides labelled a, b and c in different arrangements]
Questioning in science: Diagnosis
The ball sitting on the table is not moving. It is not moving because:
A. no forces are pushing or pulling on the ball.
B. gravity is pulling down, but the table is in the way.
C. the table pushes up with the same force that gravity pulls down.
D. gravity is holding it onto the table.
E. there is a force inside the ball keeping it from rolling off the table.
Wilson & Draney, 2004
Questioning in English: Diagnosis (2)
Which of these is correct?
A. Its on its way.
B. It’s on its way.
C. Its on it’s way.
D. It’s on it’s way.
Questioning in English: Diagnosis (3)
Identify the adverbs in these sentences:
1. The boy ran across the street quickly.
   (A) (B) (C) (D) (E)
2. Jayne usually crossed the street in a leisurely fashion.
   (A) (B) (C) (D) (E)
3. Fred ran the race well but unsuccessfully.
   (A) (B) (C) (D) (E)
Questioning in English: Diagnosis (4)
Which of these is the best thesis statement?
A. The typical TV show has 9 violent incidents
B. The essay I am going to write is about violence on TV
C. There is a lot of violence on TV
D. The amount of violence on TV should be reduced
E. Some programs are more violent than others
F. Violence is included in programs to boost ratings
G. Violence on TV is interesting
H. I don’t like the violence on TV
Questioning in history: Diagnosis
Why are historians concerned with bias when
analyzing sources?
A. People can never be trusted to tell the truth
B. People deliberately leave out important details
C. People are only able to provide meaningful information
if they experienced an event firsthand
D. People interpret the same event in different ways,
according to their experience
E. People are unaware of the motivations for their actions
F. People get confused about sequences of events
Questioning in MFL: Diagnosis
Which of the following is the correct translation for “I give the book to him”?
A. Yo lo doy el libro.
B. Yo doy le el libro.
C. Yo le doy el libro.
D. Yo doy lo el libro.
E. Yo doy el libro le.
F. Yo doy el libro lo.
Key requirement: discriminate between
incorrect and correct cognitive rules
Version 1
There are two flights per day from Newtown to Oldtown. The first flight leaves Newtown each day at 9:20 and arrives in Oldtown at 10:55. The second flight from Newtown leaves at 2:15. At what time does the second flight arrive in Oldtown? Show your work.
Version 2
There are two flights per day from Newtown to Oldtown. The first flight leaves Newtown each day at 9:05 and arrives in Oldtown at 10:55. The second flight from Newtown leaves at 2:15. At what time does the second flight arrive in Oldtown? Show your work.
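The difference between the two versions can be checked with a short sketch. It compares the correct rule (work in hours and minutes) with one plausible incorrect rule, namely treating a clock time as an ordinary decimal number. That particular incorrect rule is an assumption made here for illustration; the slide does not name the misconception, and the function names are hypothetical.

```python
def to_minutes(clock):
    """Convert 'h:mm' to minutes."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def correct_arrival(first_dep, first_arr, second_dep):
    """Correct rule: add the true flight duration in minutes."""
    duration = to_minutes(first_arr) - to_minutes(first_dep)
    total = to_minutes(second_dep) + duration
    return f"{total // 60}:{total % 60:02d}"


def as_hundredths(clock):
    """Incorrect rule helper: read 'h:mm' as the decimal h.mm (in hundredths)."""
    return int(clock.replace(":", ""))


def decimal_rule_arrival(first_dep, first_arr, second_dep):
    """Incorrect rule: subtract and add the times as if they were decimals."""
    total = as_hundredths(second_dep) + as_hundredths(first_arr) - as_hundredths(first_dep)
    return f"{total // 100}:{total % 100:02d}"


for label, first_dep in (("Version 1", "9:20"), ("Version 2", "9:05")):
    right = correct_arrival(first_dep, "10:55", "2:15")
    wrong = decimal_rule_arrival(first_dep, "10:55", "2:15")
    print(label, "correct rule:", right, "| decimal rule:", wrong)
```

Under this assumed misconception, Version 1 gives the same answer for both rules, so a correct response says nothing about which rule the student used. Version 2 gives different answers, so it discriminates between the correct and incorrect rules.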
Activating students as owners of their
own learning
Self-assessment: Portugal
45 teachers studying for a Masters degree in Education, matched in age, qualifications and experience using the same curriculum scheme for the same amount of time
Control group (N=20) follow regular MA program | Experimental group (N=25) develop self-assessment with their students
117 students aged 8 years | 125 students aged 8 years
119 students aged 9 years | 121 students aged 9 years
77 students aged 10 - 14 years | 108 students aged 10 - 14 years
Fontana & Fernandes, Br. J. Educ. Psychol. 64: 407-417
Details of the intervention
Weeks | Intervention
1 to 2 | Individual choice from a range of work provided by the teacher. Student self-assessment using materials provided
3 to 6 | Children construct own problems like those in weeks 1 and 2 and select structured math apparatus to aid solutions
7 to 10 | Children presented with new learning objectives, and make up their own problems, without exemplars by the teacher
11 to 14 | Children set their own learning objectives, construct appropriate problems, and use appropriate self-assessment
15 to 20 | As weeks 1 to 14, but with less monitoring from the teacher and increased freedom of choice and personal responsibility
Impact on student achievement
Group | Pre-test | Post-test | Gain | Effect size
Control | 65.1 | 72.9 | 7.8 | 0.34
Experimental | 58.7 | 73.7 | 15.0 | 0.66
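The figures in this table can be cross-checked with a few lines of arithmetic. The sketch assumes that each effect size is the gain divided by a standard deviation; the slide does not say how the effect sizes were calculated, so the implied standard deviation is an inference made here, not a reported value.

```python
# (pre-test, post-test, reported effect size) from the slide.
rows = {
    "Control": (65.1, 72.9, 0.34),
    "Experimental": (58.7, 73.7, 0.66),
}

gains = {}
implied_sd = {}
for group, (pre, post, effect_size) in rows.items():
    gains[group] = round(post - pre, 1)
    # Assumption: effect size = gain / standard deviation.
    implied_sd[group] = gains[group] / effect_size

for group in rows:
    print(f"{group}: gain {gains[group]}, implied SD {implied_sd[group]:.1f}")

print("ratio of gains:", round(gains["Experimental"] / gains["Control"], 2))
```

Under that assumption both rows imply a standard deviation of roughly 23 points, so the table is internally consistent, and the experimental group's gain is close to double that of the control group.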
Students owning their own learning
Students assessing their own work:
  With mark schemes or scoring guides
  With exemplars
Self-assessment of understanding:
  Traffic lights
  Red/green discs
  Coloured cups
Activating students as learning
resources for one another
Benefits of structured interaction
15-yr-olds studying World History were tested on their understanding of material delivered in lessons
At the end of the lessons, students were given time to review their understanding of the material before they were tested
Half the students had been trained to pose questions as they listened to the lectures
 | Individual | Group
Unstructured | Independent review | Group discussion
Structured | Structured self-questioning | Structured peer-questioning
King, A. (1991). Applied Cognitive Psychology, 5(4), 331-346
Impact on achievement
[Figure: chart of score (vertical axis, 40 to 100) at three points (Pre, Post, 10-day) for four conditions: structured peer-questioning, structured self-questioning, group discussion and independent review]
King, A. (1991). Applied Cognitive Psychology, 5(4), 331-346
Students as learning resources
Students assessing their peers’ work:
  “Pre-flight checklist”
  “Two stars and a wish”
Training students to pose questions/identifying group weaknesses
End-of-lesson students’ review
Pulling it all together
Dual-pathway model (Boekaerts, 1993)
“It is assumed that students who are invited to participate in a learning activity use three sources of information to form a mental representation of the task-in-context and to appraise it:
1. current perceptions of the task and the physical, social, and instructional context within which it is embedded;
2. activated domain-specific knowledge and (meta)cognitive strategies related to the task; and
3. motivational beliefs, including domain-specific capacity, interest and effort beliefs.” (Boekaerts, 2006, p. 349)
Growth and well-being
Share learning goals with students so that they are able to monitor
their own progress toward them.
Promote the belief that ability is incremental rather than fixed;
when students think they can’t get smarter, they are likely to devote
their energy to avoiding failure.
Make it more difficult for students to compare themselves with
others in terms of achievement.
Provide feedback that contains a recipe for future action rather than
a review of past failures.
Use every opportunity to transfer executive control of the learning
from the teacher to the students to support their development as
autonomous learners, and as learning resources for one another
Use random questioning and all-student response systems to
provide high-quality evidence to the teacher about the
progress of learning
Force-field analysis (Lewin, 1954)
(+) What are the forces that will support or drive the adoption of formative assessment practices in your school/authority?
(-) What are the forces that will constrain or prevent the adoption of formative assessment practices in your school/authority?