Testing and Spacing: Keys to Enhancing Learning and Retention

Sean Kang
Department of Psychology, UCSD
TDLC Bootcamp
Aug 10, 2009
Purpose of Tests / Quizzes
• Traditionally, an assessment tool
• But testing does not merely measure the
contents of memory
• Taking a test can serve as a learning
opportunity, enhancing memory retention to a
greater extent than additional studying…
the testing effect
(also referred to as retrieval practice)
Spitzer (1939)
• 3,605 sixth-graders in Iowa
• Students read ~600-word article on the bamboo
plant
• 25-item multiple-choice test (no feedback)
• Varied the retention interval and frequency of
testing
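Spitzer (1939)
Testing schedule (T1, T2, T3 = a group's first, second, and third test)
Time After Studying (Days)
Group     0     1     7    14    21    28    63
1        T1    T2     -     -    T3     -     -
2        T1     -    T2     -     -     -    T3
3         -    T1     -    T2     -     -     -
4         -     -    T1     -    T2     -     -
5         -     -     -    T1     -    T2     -
6         -     -     -     -    T1     -    T2
7         -     -     -     -     -    T1     -
8         -     -     -     -     -     -    T1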
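Spitzer (1939)
Scores on each group's first test
Time After Studying (Days)
Group     0     1     7    14    21    28    63
1      13.2     -     -     -     -     -     -
2      13.2     -     -     -     -     -     -
3         -   9.6     -     -     -     -     -
4         -     -   7.9     -     -     -     -
5         -     -     -   7.0     -     -     -
6         -     -     -     -   6.5     -     -
7         -     -     -     -     -   6.8     -
8         -     -     -     -     -     -   6.4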
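Spitzer (1939)
Scores on all tests
Time After Studying (Days)
Group     0     1     7    14    21    28    63
1      13.2  13.1     -     -  12.2     -     -
2      13.2     -  11.8     -     -     -  10.7
3         -   9.6     -   8.9     -     -     -
4         -     -   7.9     -   8.2     -     -
5         -     -     -   7.0     -   7.1     -
6         -     -     -     -   6.5     -   7.1
7         -     -     -     -     -   6.8     -
8         -     -     -     -     -     -   6.4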
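The scores above allow a comparison at matched delays between groups that had already been tested and groups taking their first test. The following is a minimal Python sketch of that comparison. It assumes the group and day assigned to each score, which were reconstructed from the flattened slide table; the function and variable names are illustrative and do not come from Spitzer's paper.

```python
# Scores from the slide, keyed by group, then by day of test.
# Assumption: the group/day position of each score is reconstructed
# from the flattened slide table.
scores = {
    1: {0: 13.2, 1: 13.1, 21: 12.2},
    2: {0: 13.2, 7: 11.8, 63: 10.7},
    3: {1: 9.6, 14: 8.9},
    4: {7: 7.9, 21: 8.2},
    5: {14: 7.0, 28: 7.1},
    6: {21: 6.5, 63: 7.1},
    7: {28: 6.8},
    8: {63: 6.4},
}


def first_test_score(day):
    """Score of a group whose first test fell on `day`."""
    for by_day in scores.values():
        if min(by_day) == day:
            return by_day[day]
    raise KeyError(day)


def prior_test_advantage(group, day):
    """Score of a previously tested group minus the first-test score at the same delay."""
    return round(scores[group][day] - first_test_score(day), 1)


for group, day in [(1, 1), (2, 7), (1, 21), (2, 63)]:
    diff = prior_test_advantage(group, day)
    print(f"day {day}: group {group} scores {diff} points above first-time takers")
```

Under these assumptions, every previously tested group scores higher than the group tested for the first time at the same delay.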
The Testing Effect
• Journal of Educational Psychology, 1989:
• Dempster, F. N. (1992). Using tests to promote learning:
A neglected classroom resource. Journal of Research
& Development in Education, 25, 213–217.
• Resurgence of interest in the testing effect in recent
years
Roediger & Karpicke (2006)
• Stimuli: 2 prose passages from TOEFL prep
book (~260 words each)
• Learning condition (within-subjects):
– Restudy (two 7-min periods of study) vs. Test (7-min
period of study, followed by 7-min period of test)
• Retention interval (between-subjects):
– 5 min, 2 days, or 1 week
Does testing benefit memory for
non-verbal materials?
• Past research has focused exclusively on verbal
materials (or at least required verbal responses
at test)
Carpenter & Pashler (2007)
Karpicke & Roediger (2008)
• Stimuli: 40 Swahili-English word pairs
• Subjects studied and were tested on the Swahili
words in alternating blocks
d = 4.03
Testing effect: How does it work?
1. Additional (focused) presentation of material
2. Operations/processes engaged by an initial test
are also engaged during the final test, resulting
in positive transfer to same type of tests (i.e.,
practice effect)
3. Retrieval itself is a potent memory modifier, with
increasing retrieval demand/effort enhancing
later retention
Does test format matter?
Initial test type - Short Answer (SA), Multiple Choice (MC), Read Fact
Final, criterial test (SA, MC)
Corrective feedback given after each initial test question.
COMPETING PREDICTIONS:
1) Repeated exposure
2) Transfer appropriate processing
3) Retrieval “effort”
[Figure panels: predicted final test performance (MC, SA) following Initial MC vs. Initial SA]
Procedure
(Kang, McDermott, & Roediger, 2007), N=48
ENCODING
• Read 4 Current Directions articles, ~15 min each
INTERVENING EXPERIENCE (within-subjects, after each article; 8 items/condition)
• Multiple choice
• Short answer
• Read answer
• Control/filler
• Feedback provided after each test question
DELAY
• 3 days
FINAL TEST
• Mult. choice (16): 4 from each of the 4 prior conditions
• Short answer (16): 4 from each of the 4 prior conditions
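The item counts in this procedure can be cross-checked with a little arithmetic. The sketch below assumes that no item appears on both final test formats, which the slide implies but does not state; the variable names are illustrative.

```python
# Counts taken from the procedure slide.
conditions = ["Multiple choice", "Short answer", "Read answer", "Control/filler"]
final_formats = ["Mult. choice", "Short answer"]
ITEMS_PER_CONDITION = 8
ITEMS_PER_CONDITION_PER_FINAL_FORMAT = 4

# 4 conditions x 8 items each.
total_items = len(conditions) * ITEMS_PER_CONDITION

# Each final format draws 4 items from each of the 4 conditions.
items_per_final_format = len(conditions) * ITEMS_PER_CONDITION_PER_FINAL_FORMAT

# Assumption: the two final formats use disjoint items.
final_test_items = items_per_final_format * len(final_formats)

print(total_items, items_per_final_format, final_test_items)
```

With these counts, the two 16-item final tests together cover each of the 32 initial items exactly once.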
Sample Test Question
(E.g., after reading article on literacy acquisition by Rebecca Treiman)
Read Fact:
Young Joe is more likely to know the name of the letter ‘j’ than Alice or
Tom.
Short Answer:
Young Joe is more likely to know the _______ of the letter ‘j’ than Alice or
Tom.
Multiple-choice:
a. place of articulation
b. phoneme
c. name
d. sound
Testing enhanced later memory, and the enhancement
was greater when the initial test format was short answer
Proportion Correct on the final test, by INITIAL TEST condition
INITIAL TEST   None   Read statements   MC    SA
FINAL MC       .69    .83               .87   .94
FINAL SA       .27    .46               .53   .57
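The size of each condition's benefit can be expressed as a gain over the "None" control. The sketch below is a minimal Python illustration using the proportions on this slide. It assumes the four values in each row follow the legend order (None, Read statements, MC, SA); the function name is illustrative.

```python
# Proportion correct on the final test, from the slide.
# Assumption: values are listed in legend order (None, Read statements, MC, SA).
results = {
    "FINAL MC": {"None": 0.69, "Read statements": 0.83, "MC": 0.87, "SA": 0.94},
    "FINAL SA": {"None": 0.27, "Read statements": 0.46, "MC": 0.53, "SA": 0.57},
}


def gain_over_control(final_format):
    """Difference in proportion correct between each condition and the 'None' control."""
    row = results[final_format]
    baseline = row["None"]
    return {cond: round(p - baseline, 2) for cond, p in row.items() if cond != "None"}


for final_format in results:
    print(final_format, gain_over_control(final_format))
```

On both final test formats the gain is largest after an initial short answer test and smallest after reading the statements.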
COMPETING PREDICTIONS:
Transfer appropriate processing
Retrieval “effort”
[Figure panels: predicted Final Test performance (MC, SA) following Initial MC vs. Initial SA]
Does feedback matter?
Procedure
(Kang, McDermott, & Roediger, 2007), N=48
ENCODING
• Read 4 Current Directions articles, ~15 min each
INTERVENING EXPERIENCE (within-subjects, after each article; 8 items/condition)
• Multiple choice
• Short answer
• Read answer
• Control/filler
• Feedback provided after each test question
DELAY
• 3 days
FINAL TEST
• Mult. choice (16): 4 from each of the 4 prior conditions
• Short answer (16): 4 from each of the 4 prior conditions
Corrective feedback important, especially when initial
test performance is not high
Proportion Correct on the final test, by INITIAL TEST condition
INITIAL TEST   None   Read statements   MC    SA
FINAL MC       .74    .88               .87   .80
FINAL SA       .33    .51               .62   .48
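Setting the two result slides side by side shows which initial test format led to the best final performance in each case. The sketch below assumes that the second set of results comes from a version of the procedure without corrective feedback, which the slide's conclusion implies but the extracted text does not state. It also assumes the values follow the legend order (None, Read statements, MC, SA). The names `first_set`, `second_set`, and `best_initial_test` are illustrative.

```python
# Proportion correct on the final test, copied from the two result slides.
first_set = {
    "FINAL MC": {"None": 0.69, "Read statements": 0.83, "MC": 0.87, "SA": 0.94},
    "FINAL SA": {"None": 0.27, "Read statements": 0.46, "MC": 0.53, "SA": 0.57},
}
# Assumption: this set is from the version without corrective feedback.
second_set = {
    "FINAL MC": {"None": 0.74, "Read statements": 0.88, "MC": 0.87, "SA": 0.80},
    "FINAL SA": {"None": 0.33, "Read statements": 0.51, "MC": 0.62, "SA": 0.48},
}


def best_initial_test(results, final_format):
    """Return whichever initial test format (MC or SA) gave the higher final score."""
    row = results[final_format]
    return max(("MC", "SA"), key=row.get)


for label, results in [("first set", first_set), ("second set", second_set)]:
    for final_format in results:
        print(label, final_format, "->", best_initial_test(results, final_format))
```

In the first set the initial short answer test wins on both final formats; in the second set the initial multiple-choice test wins on both, and the initial short answer test falls below simply reading the statements.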
The Testing Effect
• Taking a test can be a potent learning event, often
yielding better long-term retention than additional
studying.
• Testing benefits learning of a diverse range of
materials, both verbal and nonverbal.
• Repeated retrieval practice augments the benefit.
• The size of the testing effect is modulated by test
format & feedback
– Tests requiring effortful retrieval are more effective at
enhancing retention, implicating retrieval as a causal
mechanism
– To maximize the benefit of testing, feedback should be
provided when initial test performance is low
The Spacing Effect
• Reviews are more effective when distributed or spaced
out, rather than massed (with total time equated)
• One of the most robust phenomena; observed with a
diverse range of materials / types of learning
• Ebbinghaus (1885):
– When learning to recite a list of 12 nonsense syllables: after 68
repetitions in one day, 7 more repetitions were required the next day
to relearn the list. After 38 repetitions spread across 3 days, however,
only 6 repetitions were required the following day to relearn it.
“…with any considerable number of repetitions a suitable distribution of
them over a space of time is decidedly more advantageous than the
massing of them at a single time.”
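The repetition counts in the Ebbinghaus example can be tallied directly. This is a small worked sketch using only the four numbers on the slide; the dictionary keys and the function name are illustrative.

```python
# Repetition counts from the slide.
massed = {"initial_repetitions": 68, "relearning_repetitions": 7}
spaced = {"initial_repetitions": 38, "relearning_repetitions": 6}


def total_repetitions(schedule):
    """Initial learning plus relearning repetitions."""
    return schedule["initial_repetitions"] + schedule["relearning_repetitions"]


saved = total_repetitions(massed) - total_repetitions(spaced)
print("massed total:", total_repetitions(massed))
print("spaced total:", total_repetitions(spaced))
print("repetitions saved by spacing:", saved)
```

The spaced schedule needed fewer initial repetitions and still required slightly fewer repetitions to relearn.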
The Spacing Effect
Inter-Study Interval (ISI): the gap between the first study episode and
the next study (or retrieval practice) episode
Spacing effect: Spaced > Massed
Lag effect: Comparison of different levels of spacing
Theoretical accounts
• Deficient processing theory
– At short ISI, processing of 2nd presentation is deficient; less
attention paid to an item that is relatively more familiar
• Encoding variability theory
– Item and its context stored at encoding;
– Context is assumed to undergo random drift;
– Average distance between any prior context and the current
context will increase with passing of time;
– Likelihood of successful retrieval depends on the distance
between context at test and context at encoding;
– As ISI increases, increased probability that test context will be
similar to at least one of the study/encoding contexts
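The encoding variability account can be illustrated with a toy simulation. The sketch below is not a published model. It assumes a one-dimensional context that drifts as a Gaussian random walk with unit-variance daily steps, two study episodes separated by the ISI, and a test after the RI. It measures how far the test context is from the nearer of the two study contexts. All names and parameter values are hypothetical.

```python
import random


def mean_nearest_context_distance(isi, ri, trials=20000, seed=0):
    """Average distance from the test context to the nearer of two study contexts.

    Toy assumption: context is a 1-D random walk with unit-variance daily
    steps, so the drift over n days is Gaussian with standard deviation sqrt(n).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        drift_isi = rng.gauss(0.0, isi ** 0.5)  # drift between study 1 and study 2
        drift_ri = rng.gauss(0.0, ri ** 0.5)    # drift between study 2 and test
        dist_to_study2 = abs(drift_ri)
        dist_to_study1 = abs(drift_isi + drift_ri)
        total += min(dist_to_study1, dist_to_study2)
    return total / trials


massed = mean_nearest_context_distance(isi=0, ri=10)
spaced = mean_nearest_context_distance(isi=7, ri=10)
print(f"massed (ISI 0): {massed:.2f}")
print(f"spaced (ISI 7): {spaced:.2f}")
```

In this toy setup the spaced schedule leaves the test context closer, on average, to at least one study context, because the two study contexts differ and either one may happen to lie near the test context.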
The Spacing Effect
• Is there an optimal ISI / gap?
• Does the answer depend on the RI?
Cepeda et al. (2006)
The Spacing Effect
• For RI >= 1 day, is a 1-day ISI/gap sufficient to produce
most/all of the benefit of spacing?
• Only a handful of studies provide multi-gap
comparisons, with RI >= 1 day.
Cepeda et al. (2009), Experiment 1
• N = 182
• Stimuli: 40 Swahili-English word pairs
• ISI / Gap (between-subjects):
– 0, 1, 2, 4, 7, and 14 days
• RI: 10 days
• Procedure
– Session 1: All items presented for study once, followed by
testing with feedback until all items successfully recalled 2x.
– Session 2: After appropriate gap, all items tested 2x with
feedback.
– Session 3: After 10-day RI, final test.
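The three-session procedure implies a simple schedule for each gap condition. A minimal Python sketch follows, counting days from Session 1; the function name `session_days` is illustrative.

```python
# Gap conditions and retention interval from the slide, in days.
GAPS_DAYS = [0, 1, 2, 4, 7, 14]
RI_DAYS = 10


def session_days(gap, ri=RI_DAYS):
    """Days (counted from Session 1) on which Sessions 1, 2, and 3 take place."""
    return (0, gap, gap + ri)


for gap in GAPS_DAYS:
    s1, s2, s3 = session_days(gap)
    print(f"gap {gap:>2} days: session 1 on day {s1}, session 2 on day {s2}, final test on day {s3}")
```

Because the RI is fixed at 10 days after Session 2, the final test falls later, measured from Session 1, for the longer gap conditions.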
Cepeda et al. (2009), Experiment 2
• N = 161
• Stimuli: 2 sets
– Obscure facts (e.g., Who invented snow golf? Rudyard Kipling)
– Photographs of not-well-known objects paired with facts
E.g., Name this model, in which Amelia Earhart made her ill-fated
last flight. Lockheed Electra.
• ISI / Gap (between-subjects):
– 0, 1, 7, 28, 84, and 168 days
• RI: 168 days
Cepeda et al. (2009), Conclusions
• Spacing benefits observed with RIs > 1 week
• Gap/ISI had non-monotonic effects on final test
performance; accuracy increased then decreased as
gap increased.
• For sufficiently long RIs, optimal gap/ISI > 1 day.
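To compare the two experiments on a common scale, each gap can be expressed as a fraction of the RI. The sketch below uses the gap and RI values listed on the slides and assumes the Experiment 2 gaps are in days, like its RI. It does not say which gap was optimal, since that result is not given in the text; the names are illustrative.

```python
# Gap conditions and retention intervals from the slides.
# Assumption: Experiment 2 gaps are in days.
experiments = {
    "Experiment 1": {"gaps": [0, 1, 2, 4, 7, 14], "ri": 10},
    "Experiment 2": {"gaps": [0, 1, 7, 28, 84, 168], "ri": 168},
}


def gap_to_ri_ratios(gaps, ri):
    """Each gap expressed as a fraction of the retention interval."""
    return [round(gap / ri, 3) for gap in gaps]


for name, design in experiments.items():
    print(name, gap_to_ri_ratios(design["gaps"], design["ri"]))
```

Both experiments sample gaps ranging from zero up to at least the full length of the RI.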
Cepeda et al. (2008)
• Experiment conducted on the internet
• N = 1,354
• 26 different combinations of gaps and RIs
• Stimuli: 32 obscure facts
• Procedure
– Session 1: Learn 32 facts to criterion of one correct recall of
each fact.
– Session 2: After appropriate gap, subjects tested 2x with
feedback.
– Session 3: After appropriate RI, final test.
Cepeda et al. (2008), Conclusions
• For each RI, final performance initially increased with
increasing gap, then fell as gap increased further.
• The effect of gap was very large: the optimal gap
provided a 64% increase (averaged across RIs) in final
recall, relative to the 0-day gap condition.
• As RI increases, the optimal gap also increases, but the
ratio of optimal gap to RI should decline.
• Using a gap that is longer than the optimal value costs
less than using a gap that is shorter than optimal.
Expanding vs. Equal Interval
Spaced Retrieval
• Landauer & Bjork (1978) demonstrated the advantage of
expanding over equal interval retrieval practice.
• But findings since then have been rather inconsistent,
with several instances of failures to replicate. E.g.,
Karpicke & Roediger (2007)
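The difference between the two schedules is easiest to see when the total spacing is held constant. The sketch below uses hypothetical gap values chosen only so that both schedules end at the same point; they are not taken from Landauer & Bjork (1978) or Karpicke & Roediger (2007), and the function names are illustrative.

```python
from itertools import accumulate


def retrieval_times(gaps):
    """Cumulative time of each retrieval attempt, given the gaps between attempts."""
    return list(accumulate(gaps))


def is_expanding(gaps):
    """True if every gap is longer than the one before it."""
    return all(earlier < later for earlier, later in zip(gaps, gaps[1:]))


# Hypothetical gaps (arbitrary time units), matched on total spacing.
equal_gaps = [5, 5, 5]
expanding_gaps = [1, 5, 9]

print("equal:    ", retrieval_times(equal_gaps))
print("expanding:", retrieval_times(expanding_gaps))
```

Matching the total spacing is a design choice that keeps the comparison between schedules from being confounded with the overall amount of spacing.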
Applications of Testing & Spacing
• Supermemo
www.supermemo.com
• Spaced Ed
www.spaceded.com