
Cognitive Psychology: A Student's Handbook

“This edition of Eysenck and Keane has further enhanced the status of
Cognitive Psychology: A Student’s Handbook, as a high benchmark that
other textbooks on this topic fail to achieve. It is informative and innovative, without losing any of its hallmark coverage and readability.”
Professor Robert Logie, School of Philosophy, Psychology and Language
Sciences, University of Edinburgh, United Kingdom
“The best student’s handbook on cognitive psychology – an indispensable
volume brought up-to-date in this latest edition. It explains everything from
low-level vision to high-level consciousness, and it can serve as an introductory text.”
Professor Philip Johnson-Laird, Stuart Professor of Psychology, Emeritus,
Princeton University, United States
“I first read Eysenck and Keane’s Cognitive Psychology: A Student’s
Handbook in its third edition, during my own undergraduate studies. Over
the course of its successive editions since then, the content – like the field of
cognition itself – has evolved and grown to encompass current trends, novel
approaches and supporting learning resources. It remains, in my opinion,
the gold standard for cognitive psychology textbooks.”
Dr Richard Roche, Senior Lecturer, Department of Psychology, Maynooth
University, Ireland
“Eysenck and Keane have once again done an excellent job, not only in
terms of keeping the textbook up-to-date with the latest studies, issues and
debates; but also by making the content even more accessible and clear
without compromising accuracy or underestimating the reader’s intelligence.
After all these years, this book remains an essential tool for students of cognitive psychology, covering the topic in the appropriate breadth and depth.”
Dr Gerasimos Markopoulos, Senior Lecturer, School of Science, Bath Spa
University, United Kingdom
“Eysenck and Keane’s popular textbook offers comprehensive coverage of
what psychology students need to know about human cognition. The textbook introduces the core topics of cognitive psychology that serve as the
fundamental building blocks to our understanding of human behaviour.
The authors integrate contemporary developments in the field and provide
an accessible entry to neighboring disciplines such as cognitive neuroscience
and neuropsychology.”
Dr Motonori Yamaguchi, Senior Lecturer, Department of Psychology,
University of Essex, United Kingdom
“The eighth edition of Cognitive Psychology by Eysenck and Keane provides possibly the most comprehensive coverage of cognition currently
available. The text is clear and easy to read with clear links to theory across
the chapters. A real highlight is the creative use of up-to-date real-world
examples throughout the book.”
Associate Professor Rhonda Shaw, Head of the School of Psychology,
Charles Sturt University, Australia
“Unmatched in breadth and scope, it is the authoritative textbook on cognitive psychology. It outlines the history and major developments within
the field, while discussing state-of-the-art experimental research in depth.
The integration of online resources keeps the material fresh and engaging.”
Associate Professor Søren Risløv Staugaard, Department of Psychology and
Behavioural Sciences, Aarhus University, Denmark
“Eysenck and Keane’s Cognitive Psychology provides comprehensive topic
coverage and up-to-date research. The writing style is concise and easy to
follow, which makes the book suitable for both undergraduate and graduate students. The authors use real-life examples that are easily relatable to
students, making the book very enjoyable to read.”
Associate Professor Lin Agler, School of Psychology, University of Southern
Mississippi Gulf Coast, United States
Cognitive Psychology
The fully updated eighth edition of Cognitive Psychology: A Student’s
Handbook provides comprehensive yet accessible coverage of all the key
areas in the field ranging from visual perception and attention through to
memory and language. Each chapter is complete with key definitions, practical real-life applications, chapter summaries and suggested further reading
to help students develop an understanding of this fascinating but complex
field.
The new edition includes:
● an increased emphasis on neuroscience
● updated references to reflect the latest research
● applied ‘in the real world’ case studies and examples.
Widely regarded as the leading undergraduate textbook in the field of
cognitive psychology, this new edition comes complete with an enhanced
accompanying companion website. The website includes a suite of learning
resources including simulation experiments, multiple-choice questions, and
access to Primal Pictures’ interactive 3D atlas of the brain. The companion
website can be accessed at: www.routledge.com/cw/eysenck.
Michael W. Eysenck is Professor Emeritus in Psychology at Royal
Holloway, University of London, United Kingdom. He is also Professorial
Fellow at Roehampton University, London. He is the best-selling author
of several textbooks including Fundamentals of Cognition (2018), Memory
(with Alan Baddeley and Michael Anderson, 2020) and Fundamentals of
Psychology (2009).
Mark T. Keane is Chair of Computer Science at University College Dublin,
Ireland.
Visit the Companion Website to access a range of interactive teaching and
learning resources, including access to Primal Pictures’ interactive 3D brain:
www.routledge.com/cw/eysenck
PRIMAL PICTURES
Revolutionizing medical education with anatomical solutions to fit every need
For over 27 years, Primal Pictures has led the way in offering premier 3D digital human anatomy solutions,
transforming how educators teach and students learn the complexities of human anatomy and medicine.
Our pioneering scientific approach puts quality, accuracy and detail at the heart of everything we do.
Primal’s experts have created the world’s most medically accurate and detailed 3D reconstruction of human
anatomy using real scan data from the NLM Visible Human Project®, as well as CT images and MRIs. With
advanced academic research and thousands of development hours underpinning its creation, our model
surpasses all other anatomical resources available.
To learn more about Primal’s cutting-edge solution for better learning outcomes and increased student
engagement visit www.primalpictures.com/students
COGNITIVE PSYCHOLOGY
A Student’s Handbook
Eighth Edition
MICHAEL W. EYSENCK
AND MARK T. KEANE
Eighth edition published 2020
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
52 Vanderbilt Avenue, New York, NY 10017
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2020 Michael W. Eysenck and Mark T. Keane
The right of Michael W. Eysenck and Mark T. Keane to be
identified as authors of this work has been asserted by them
in accordance with sections 77 and 78 of the Copyright,
Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted
or reproduced or utilised in any form or by any electronic,
mechanical, or other means, now known or hereafter
invented, including photocopying and recording, or in any
information storage or retrieval system, without permission
in writing from the publishers.
Trademark notice: Product or corporate names may be
trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
First edition published by Lawrence Erlbaum Associates 1984
Seventh edition published by Routledge 2015
Every effort has been made to contact copyright-holders.
Please advise the publisher of any errors or omissions, and
these will be corrected in subsequent editions.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record has been requested for this book
ISBN: 978-1-13848-221-0 (hbk)
ISBN: 978-1-13848-223-4 (pbk)
ISBN: 978-1-35105-851-3 (ebk)
Typeset in Times New Roman by
Servis Filmsetting Ltd, Stockport, Cheshire
Visit the companion website: www.routledge.com/cw/eysenck.
To Christine with love
(M.W.E.)
What moves science forward is argument, debate,
and the testing of alternative theories . . . A science without
controversy is a science without progress.
(Jerry Coyne)
Contents

List of illustrations
Preface
Visual tour (how to use this book)

1 Approaches to human cognition
  Introduction
  Cognitive psychology
  Cognitive neuropsychology
  Cognitive neuroscience: the brain in action
  Computational cognitive science
  Comparisons of major approaches
  Is there a replication crisis?
  Outline of this book
  Chapter summary
  Further reading

PART I
Visual perception and attention

2 Basic processes in visual perception
  Introduction
  Vision and the brain
  Two visual systems: perception-action model
  Colour vision
  Depth perception
  Perception without awareness: subliminal perception
  Chapter summary
  Further reading

3 Object and face recognition
  Introduction
  Pattern recognition
  Perceptual organisation
  Approaches to object recognition
  Object recognition: top-down processes
  Face recognition
  Visual imagery
  Chapter summary
  Further reading

4 Motion perception and action
  Introduction
  Direct perception
  Visually guided movement
  Visually guided action: contemporary approaches
  Perception of human motion
  Change blindness
  Chapter summary
  Further reading

5 Attention and performance
  Introduction
  Focused auditory attention
  Focused visual attention
  Disorders of visual attention
  Visual search
  Cross-modal effects
  Divided attention: dual-task performance
  “Automatic” processing
  Chapter summary
  Further reading

PART II
Memory

6 Learning, memory and forgetting
  Introduction
  Short-term vs long-term memory
  Working memory: Baddeley and Hitch
  Working memory: individual differences and executive functions
  Levels of processing (and beyond)
  Learning through retrieval
  Implicit learning
  Forgetting from long-term memory
  Chapter summary
  Further reading

7 Long-term memory systems
  Introduction
  Declarative memory
  Episodic memory
  Semantic memory
  Non-declarative memory
  Beyond memory systems and declarative vs non-declarative memory
  Chapter summary
  Further reading

8 Everyday memory
  Introduction
  Autobiographical memory: introduction
  Memories across the lifetime
  Theoretical approaches to autobiographical memory
  Eyewitness testimony
  Enhancing eyewitness memory
  Prospective memory
  Theoretical perspectives on prospective memory
  Chapter summary
  Further reading

PART III
Language

9 Speech perception and reading
  Introduction
  Speech (and music) perception
  Listening to speech
  Context effects
  Theories of speech perception
  Cognitive neuropsychology
  Reading: introduction
  Word recognition
  Reading aloud
  Reading: eye-movement research
  Chapter summary
  Further reading

10 Language comprehension
  Introduction
  Parsing: overview
  Theoretical approaches: parsing and prediction
  Pragmatics
  Individual differences: working memory capacity
  Discourse processing: inferences
  Discourse comprehension: theoretical approaches
  Chapter summary
  Further reading

11 Language production
  Introduction
  Basic aspects of speech production
  Speech planning
  Speech errors
  Theories of speech production
  Cognitive neuropsychology: speech production
  Speech as communication
  Writing: the main processes
  Spelling
  Chapter summary
  Further reading

PART IV
Thinking and reasoning

12 Problem solving and expertise
  Introduction
  Problem solving: introduction
  Gestalt approach and beyond: insight and role of experience
  Problem-solving strategies
  Analogical problem solving and reasoning
  Expertise
  Chess-playing expertise
  Medical expertise
  Brain plasticity
  Deliberate practice and beyond
  Chapter summary
  Further reading

13 Judgement and decision-making
  Introduction
  Judgement research
  Theories of judgement
  Decision-making under risk
  Decision-making: emotional and social factors
  Applied and complex decision-making
  Chapter summary
  Further reading

14 Reasoning and hypothesis testing
  Introduction
  Hypothesis testing
  Deductive reasoning
  Theories of “deductive” reasoning
  Brain systems in reasoning
  Informal reasoning
  Are humans rational?
  Chapter summary
  Further reading

PART V
Broadening horizons

15 Cognition and emotion
  Introduction
  Appraisal theories
  Emotion regulation
  Affect and cognition: attention and memory
  Affect and cognition: judgement and decision-making
  Judgement and decision-making: theoretical approaches
  Anxiety, depression and cognitive biases
  Cognitive bias modification and beyond
  Chapter summary
  Further reading

16 Consciousness
  Introduction
  Functions of consciousness
  Assessing consciousness and conscious experience
  Global workspace and global neuronal workspace theories
  Is consciousness unitary?
  Chapter summary
  Further reading

Glossary
References
Author index
Subject index
Illustrations

TABLES

1.1 Approaches to human cognition
1.2 Major techniques used to study the brain
1.3 Strengths and limitations of major approaches to human cognition
11.1 Involvement of working memory components in various writing processes
15.1 Effects of anxiety and depression on attentional bias (engagement and disengagement)

PHOTOS

Chapter 1
• Max Coltheart
• The magnetic resonance imaging (MRI) scanner
• Transcranial magnetic stimulation coil
• The IBM Watson and two human contestants (Ken Jennings and Brad Rutter)

Chapter 3
• Irving Biederman
• Heather Sellers

Chapter 6
• Alan Baddeley and Graham Hitch
• Endel Tulving

Chapter 7
• Henry Molaison

Chapter 8
• Jill Price
• World Trade Center attacks on 9/11
• Jennifer Thompson and Ronald Cotton

Chapter 11
• Iris Murdoch

Chapter 12
• Monty Hall
• Fernand Gobet
• Magnus Carlsen

Chapter 13
• Pat Croskerry
• Nik Wallenda

FIGURES

1.1 An early version of the information processing approach
1.2 Diagram to demonstrate top–down processing
1.3 Test yourself by naming the colours in each column
1.4 The four lobes, or divisions, of the cerebral cortex in the left hemisphere
1.5 Brodmann brain areas on the lateral and medial surfaces
1.6 The brain network and cost efficiency
1.7 The organisation of the “rich club”
1.8 The spatial and temporal resolution of major techniques and methods used to study brain functioning
1.9 Areas showing greater activation in a dead salmon when presented with photographs of people than when at rest
1.10 The primitive mock neuroimaging device used by Ali et al. (2014)
1.11 Architecture of a basic three-layer connectionist network
1.12 The main modules of the ACT-R cognitive architecture with their locations within the brain
1.13 The basic structure of the standard model of the mind involving five independent modules
2.1 Complex scene that requires prolonged perceptual processing to understand fully
2.2 Route of visual signals
2.3 Simultaneous contrast involving lateral inhibition
2.4 Some distinctive features of the largest visual cortical areas
2.5 Connectivity within the ventral pathway on the lateral surface of the macaque brain
2.6 (a) The single hierarchical model; (b) the parallel hierarchical model; (c) the three parallel hierarchical feedforward systems model
2.7 The percentage of cells in six different visual cortical areas responding selectively to orientation, direction of motion, disparity and colour
2.8 Visual motion inputs
2.9 Goodale and Milner’s (1992) perception-action model showing the dorsal and ventral streams
2.10 Lesion overlap in patients with optic ataxia
2.11 The Müller-Lyer illusion
2.12 The Ebbinghaus illusion
2.13 The hollow-face illusion. Left: normal and hollow faces with small target magnets on the forehead and cheek of the normal face; right: front view of the hollow mask that appears as an illusory face projecting forwards
2.14 Disruption of size judgements when estimated perceptually (estimation) or produced by grasping (grasping) in full or restricted vision
2.15 Historical developments in theories linking perception and action
2.16 Schematic diagram of the early stages of neural colour processing
2.17 Photograph of a mug showing enormous variation in the properties of the reflected light across the mug’s surface
2.18 “The Dress” made famous by its appearance on the internet
2.19 Observers’ perceptions of “The Dress”
2.20 An engraving by de Vries (1604/1970) in which linear perspective creates an effective three-dimensional effect when viewed from very close but not from further away
2.21 Examples of texture gradients that can be perceived as surfaces receding into the distance
2.22 Kanizsa’s (1976) illusory square
2.23 Accuracy of size judgements as a function of object type
2.24 (a) A representation of the Ames room; (b) an actual Ames room showing the effect achieved with two adults
2.25 Perceived distance. Top: stimuli presented to participants; bottom: example of the stimulus display
2.26 The body size effect: what participants in the doll experiment could see
2.27 Estimated contributions of conscious and subconscious processing to GY’s performance in exclusion and inclusion conditions in his normal and blind fields
2.28 The areas of most relevance to blindsight are the lateral geniculate nucleus and middle temporal visual area
2.29 The relationship between response bias in reporting conscious awareness and enhanced N200 on no-awareness correct trials compared to no-awareness incorrect trials (UC)
3.1 The kind of stimulus used by Navon (1977) to demonstrate the importance of global features in perception
3.2 The CAPTCHA used by Yahoo
3.3 The FBI’s mistaken identification of the Madrid bomber
3.4 Examples of the Gestalt laws of perceptual organisation: (a) the law of proximity; (b) the law of similarity; (c) the law of good continuation; and (d) the law of closure
3.5 An ambiguous drawing that can be seen as either two faces or as a goblet
3.6 The tendency to perceive an array of empty circles as (A) a rotated square or (B) a diamond
3.7 A task to decide which region in each stimulus is the figure
3.8 High and low spatial frequency versions of a place (a building)
3.9 Image of Mona Lisa revealing very low spatial frequencies (left), low spatial frequencies (centre) and high spatial frequencies (right)
3.10 An outline of Biederman’s recognition-by-components theory
3.11 Ambiguous figures
3.12 A brick wall that can be seen as something else
3.13 Object recognition involving two different routes: (1) a top-down route in which information proceeds rapidly to the orbitofrontal cortex; (2) a bottom-up route using the slower ventral visual stream
3.14 Interactive-iterative framework for object recognition
3.15 Recognising an elephant when a key feature (its trunk) is partially hidden
3.16 Accuracy and speed of object recognition for birds, boats, cars, chairs and faces by patient GG and healthy controls
3.17 Face-selective areas in the right hemisphere
3.18 An array of 40 faces to be matched for identity
3.19 The model of face recognition put forward by Bruce and Young (1986)
3.20 Damage to regions of the inferior occipito-temporal cortex, the anterior inferior temporal cortex and the anterior temporal pole
3.21 The approximate locations of the visual buffer in BA17 and BA18, of long-term memories of shapes in the inferior temporal lobe, and of spatial representations in the posterior parietal cortex
3.22 Dwell time for the four quadrants of a picture during perception and imagery
3.23 Slezak’s (1991, 1995) investigations into the effects of rotation on object recognition
3.24 The extent to which perceived or imagined objects could be classified accurately on the basis of brain activity in the early visual cortex and object-selective cortex
3.25 Connectivity during perception and imagery involving (a) bottom-up processing; and (b) top-down processing
4.1 The optic-flow field as a pilot comes in to land, with the focus of expansion in the middle
4.2 Graspable and non-graspable objects having similar asymmetrical features
4.3 The visual features of a road viewed in perspective
4.4 The far road “triangle” in (A) a left turn and (B) a right turn
4.5 Errors in time-to-contact judgements for the smaller and the larger object as a function of whether they were presented in their standard size, the reverse size (off-size) or lacking texture (no-texture)
4.6 The dorso-dorsal and ventro-dorsal streams showing their brain locations and forms of processing
4.7 Point-light sequences (a) with the walker visible and (b) with the walker not visible
4.8 Human detection and discrimination efficiency for human walkers presented in contour, point lights, silhouette and skeleton
4.9 Brain areas involved in biological motion processing
4.10 The main brain areas associated with the mirror neuron system plus their interconnections
4.11 The unicycling clown who cycled close to students walking across a large square
4.12 The sequence of events in the disappearing lighter trick
4.13 Participants’ fixation points at the time of dropping the lighter
4.14 Change blindness: an example
4.15 (a) Percentage of correct change detection as a function of form of change and time of fixation; also false alarm rate when there was no change. (b) Mean percentage correct change detection as a function of the number of fixations between target fixation and change of target and form of change
4.16 (a) Change-detection accuracy as a function of task difficulty and visual eccentricity. (b) The eccentricity at which change-detection accuracy was 85% correct as a function of task difficulty
4.17 An example of inattentional blindness: a woman in a gorilla suit in the middle of a game of passing the ball
4.18 An example of inattentional blindness: the sequence of events on the initial baseline trials and the critical trial
5.1 A comparison of Broadbent’s theory, Treisman’s theory, and Deutsch and Deutsch’s theory
5.2 Split attention. (a) Shaded areas indicate the cued locations; the near and far locations are not cued. (b) Probability of target detection at valid (left or right) and invalid (near or far) locations
5.3 A comparison of object-based and space-based attention
5.4 Object-based and space-based attention. (a) Possible target locations for a given cue. (b) Performance accuracy at the various target locations
5.5 Sample displays for three low perceptual load conditions in which the task required deciding whether a target X or N was presented
5.6 The brain areas associated with the dorsal or goal-directed attention network and the ventral or stimulus-driven network
5.7 A theoretical approach based on several functional networks of relevance to attention: fronto-parietal; default mode; cingulo-opercular; and ventral attention
5.8 An example of object-centred or allocentric neglect
5.9 Illegal and dangerous items captured by an airport security screener
5.10 Frequency of selection and identification errors when targets were present at trials
5.11 Performance speed on a detection task as a function of target definition (conjunctive vs single feature) and display size
5.12 Eye fixations made by observers searching for pedestrians
5.13 A two-pathway model of visual search
5.14 An example of a visual search task when considering feature integration theory
5.15 An example of temporal ventriloquism in which the apparent time of onset of a flash is shifted towards that of a sound presented at a slightly different timing from the flash
5.16 Wickens’s four-dimensional multiple-resource model
5.17 Threaded cognition theory
5.18 Patterns of brain activation: (a) underadditive activation; (b) additive activation; (c) overadditive activation
5.19 Effects of an audio distraction task on brain activity associated with a straight driving task
5.20 Dual-task (auditory and visual tasks) and single-task (auditory or visual task) conditions: reaction times for correct responses only over eight experimental sessions
5.21 Response times on a decision task as a function of memory-set size, display-set size and consistent vs varied mapping
5.22 Factors that are hypothesised to influence representational quality within Moors’ (2016) theoretical approach
6.1 The multi-store model of memory as proposed by Atkinson and Shiffrin (1968)
6.2 Short-term memory performance in conditions designed to create interference (repeated condition) or minimise interference (unique condition)
6.3 The working memory model showing the connections among its four components and their relationship to long-term memory
6.4 Phonological loop system as envisaged by Baddeley (1990)
6.5 Sites where direct electrical stimulation disrupted digit-span performance
6.6 Amount of interference on a spatial task and a visual task as a function of a secondary task (spatial: movement vs visual: colour discrimination)
6.7 Screen displays for the digit 6
6.8 Mean reaction times quintile-by-quintile on the anti-saccade task by groups high and low in working memory capacity
6.9 Schematic representation of the unity and diversity of three executive functions
6.10 Activated brain regions across all executive functions in a meta-analysis of 193 studies
6.11 Recognition memory performance as a function of processing depth (shallow vs deep) for three types of stimuli: doors, clocks, and menus
6.12 Distinctiveness. Percentage recall of the critical item (e.g., kiwi) and of the preceding and following items in the encoding, retrieval and control conditions
6.13 (a) Restudy causes strengthening of the memory trace formed after initial study; (b) testing with feedback causes strengthening of the memory trace; and (c) the formation of a second memory trace
6.14 (a) Final recall for restudy-only and test-restudy group participants; (b) recall performance in the CMR group as a function of whether the mediators were or were not retrieved
6.15 Mean recall percentage in Session 2 on Test 1 and Test 2 as a function of retrieval practice or restudy practice in Session 1
6.16 Schematic representation of a traditional keyboard
6.17 Mean number of completions in inclusion and exclusion conditions as a function of number of trials
6.18 Response times for participants showing a sudden drop in reaction times or not showing such a drop
6.19 The striatum is of central importance in implicit learning
6.20 A model of motor sequence learning
6.21 Sequential motor skill learning dependencies
6.22 Skilled typists’ performance when tested on a traditional keyboard
6.23 Forgetting over time as indexed by reduced savings
6.24 Methods of testing for proactive and retroactive interference
6.25 Percentage of items recalled over time for the conditions: no proactive interference, remember and forget
6.26 Percentage of words correctly recalled across 32 articles in the respond, baseline and suppress conditions
6.27 Proportion of words recalled in high- and low-overload conditions with intra-list cues, strong extra-list cues and weak extra-list cues
7.1 Damage to brain areas within and close to the medial temporal lobes producing amnesia
7.2 The standard account based on dividing long-term memory into two broad classes: declarative and non-declarative
7.3 Interactions between episodic memories, semantic memories and gist memories
7.4 (a) Locations of the hippocampus, the perirhinal cortex and the parahippocampal cortex; (b) the binding-of-item-and-context model
7.5 (A) Left lateral, (B) medial and (C) anterior views of prefrontal areas having greater activation to familiarity-based than recollection-based processes and areas showing the opposite pattern
7.6 Sample pictures on the recognition-memory test
7.7 (A) Areas activated for both episodic simulation and episodic memory; (B) areas more activated for episodic simulation than episodic memory
7.8 Accuracy of (a) object categorisation and (b) speed of categorisation at the superordinate, basic and subordinate levels
7.9 The hub-and-spoke model
7.10 Performance accuracy on tool function and tool manipulation tasks with anodal transcranial direct current stimulation to the anterior temporal lobe or to the inferior parietal lobule and in a control condition
7.11 Categorisation performance for pictures and words by healthy controls and patients with semantic dementia
7.12 Percentages of priming effect and recognition-memory performance of healthy controls and patients
7.13 Brain regions showing repetition suppression or response enhancement in a meta-analysis
7.14 Mean reaction times on the serial reaction time task by Parkinson’s disease patients and healthy controls
7.15 A processing-based memory model
7.16 Recognition memory for faces presented and tested in a fixed or variable viewpoint
7.17 Brain areas whose activity during episodic learning predicted increased recognition-memory performance (task-positive) or decreased performance (task-negative)
7.18 A three-dimensional model of memory: (1) conceptually or perceptually driven; (2) relational or item stimulus representation; (3) controlled or automatic/involuntary intention
7.19 Process-specific alliances including the left angular gyrus are involved in recollection of episodic memories and semantic processing
8.1 Brain regions activated by autobiographical, episodic retrieval and mentalising tasks including regions of overlap
8.2 Number of internal details specific to an autobiographical event recalled at various time delays (by controls and individuals with highly superior autobiographical memory)
8.3 Childhood amnesia based on data reported by Rubin and Schulkind (1997)
8.4 Temporal distribution of autobiographical memories across the lifespan
8.5 The knowledge structures within autobiographical memory, as proposed by Conway (2005)
8.6 The mean number of events participants could remember from the past 5 days and those they imagined were likely over the next 5 days
8.7 A model of the bidirectional relationships between neural networks involved in the construction and/or elaboration of autobiographical memories
8.8 Life structure scores (proportion negative, compartmentalisation, positive redundancy, negative redundancy) for patients with major depressive disorder, patients in remission from major depressive disorder and healthy controls
8.9 Four cognitive biases related to autobiographical memory recall that maintain depression and increase the risk of recurrence following remission
8.10 Examples of Egyptian and UK face-matching arrays
8.11 Size of the misinformation effect as a function of detail memorability in the neutral condition
8.12 Extent of misinformation effects as a function of condition for the original memory and endorsement of the misinformation presented previously
8.13 Eyewitness identification: test of face-recognition performance
8.14 A model of the component processes involved in prospective memory
8.15 Mean failures to resume an interrupted task and mean resumption times for the conditions: no-interruption, blank-screen interruption and secondary air traffic control task interruption
8.16 Self-reported memory vividness, memory details and confidence in memory for individuals with good and poor inhibitory control before and after repeated checking
8.17 The dual-pathways model of prospective memory (based on the multi-process framework) for non-focal and focal tasks separately
8.18 Example 1: top-down monitoring processes operating in isolation. Example 2: bottom-up spontaneous retrieval processes operating in isolation. Example 3: dual processes operating dynamically
8.19 (a) Sustained and (b) transient activity in the (c) left anterior prefrontal cortex for non-focal and focal prospective memory tasks
8.20 Frequency of cue-driven monitoring following the presentation of semantically related or unrelated cues
8.21 Different ways the instruction to press Q for fruit words was encoded
9.1 (a) Areas activated during passive music listening and passive speech listening; (b) areas activated more by listening to music than speech or the opposite
9.2 The main processes involved in speech perception and comprehension
9.3 A hierarchical approach to speech segmentation involving three levels or tiers
9.4 A model of spoken-word comprehension
9.5 Gaze probability for critical objects over the first 1,000 ms since target word onset for target neutral, competitor neutral, competitor constraining and unrelated neutral conditions
9.6 Mean target duration required for target recognition for words and sounds presented in isolation or within a general sentence context
9.7 The basic TRACE model, showing how activation between the three levels (word, phoneme and feature) is influenced by bottom-up and top-down processing
9.8 (a) Actual eye fixations on the object corresponding to a spoken word or related to it; (b) predicted eye fixations from the TRACE model
9.9 Mean reaction times for recognition of /t/ and /k/ phonemes in words and non-words
9.10 Fixation proportions to high-frequency target words during the first 1,000 ms after target onset
9.11 A sample display showing two nouns (“bench” and “rug”) and two verbs (“pray” and “run”)
9.12 Processing and repetition of spoken words according to the three-route framework
9.13 A general framework of the processes and structures involved in reading comprehension
9.14 Estimated reading ability over a 30-month period with initial testing at a mean age of 66 months for English, Spanish and Czech children
9.15 McClelland and Rumelhart’s (1981) interactive activation model of visual word recognition
9.16 The time course of inhibitory and facilitatory effects of priming
9.17 Basic architecture of the dual-route cascaded model
9.18 The three components of the triangle model and their associated neural regions: orthography, phonology and semantics
9.19 Mean naming latencies for high-frequency and low-frequency words that were irregular or regular and inconsistent
9.20 Key assumptions of the E-Z Reader model
10.1 Total sentence processing time as a function of sentence type
10.2 A model of language processing involving heuristic and algorithmic routes
10.3 Sentence reading times as a function of the way in which comprehension was assessed: detailed questions; superficial questions on all trials; or occasional superficial questions
10.4 The N400 responses to a critical word in correct and incorrect sentences
10.5 Response times for literally false, scrambled metaphor, and metaphor sentences in (a) written and (b) spoken conditions
10.6 Mean reaction times to verify metaphor-relevant and metaphor-irrelevant properties
10.7 Mean proportion of statements rated comprehensible with a response deadline of 500 or 1600 ms: literal, forward metaphors, reversed metaphors and scrambled metaphors
10.8 Sample displays seen from the listener’s perspective
10.9 Proportion of fixation on four objects over time
10.10 A theoretical framework for reading comprehension involving interacting passive and reader-initiated processes
10.11 Reaction times to name colours when the word presented in colour was predictable from the preceding text compared to a control condition
10.12 The construction–integration model
10.13 Forgetting functions for situation, proposition and surface information over a 4-day period
10.14 The RI-Val model showing the effects on comprehension of resonance, integration and validation over time
11.1 Brain areas activated during speech comprehension and production
11.2 Correlations between aphasic patients’ speech-production abilities and their ability to detect their own speech-production errors
11.3 Speech-production processes for picture naming, with median peak activation times
11.4 Speech-production processes: the timing of activation associated with different cognitive functions
11.5 Language-related regions and their connections in the left hemisphere
11.6 Semantic and syntactic errors made by: healthy controls and patients with no damage to the dorsal or ventral pathway, damage to the ventral pathway only, damage to the dorsal pathway only and damage to both pathways
11.7 A sample array with six different garments coloured blue or green
11.8 Architecture of the forward modelling approach to explaining audience design effects
11.9 Hayes’ (2012) writing model: (1) control level; (2) writing process level; and (3) resource level
11.10 The frequency of three major writing processes (planning, translating and revising) across the three phases of writing
11.11 Kellogg’s three-stage theory of the development of writing skill
11.12 Brain areas activated during handwriting tasks
11.13 The cognitive architectures for (a) reading and (b) spelling
11.14 Brain areas in the left hemisphere associated with reading, letter perception and writing
12.1 Explanation of the solution to the Monty Hall problem
12.2 Brain areas involved in (a) mathematical problem solving; (b) verbal problem solving; (c) visuo-spatial problem solving; and (d) areas common to all three problem types (conjunction)
12.3 The mutilated draughtboard problem
12.4 Flow chart of insight problem solving
12.5 (a) The nine-dot problem and (b) its solution
12.6 Two of the matchstick problems used by Knoblich et al. (1999) with cumulative solution rates
12.7 The multiplying billiard balls trick
12.8 The two-string problem
12.9 Some of the materials for participants instructed to mount a candle on a vertical wall in Duncker’s (1945) study
12.10 Mean percentages of correct solutions as a function of problem type and working memory capacity
12.11 The initial state of the five-disc version of the Tower of Hanoi problem
12.12 Tower of London task (two-move and five-move problems)
12.13 A problem resembling those used on the Raven’s Progressive Matrices
12.14 Relational reasoning: the probabilities of successful encoding, inferring, mapping and applying for low and high performers
12.15 Major processes involved in performance of numerous cognitive tasks
12.16 Summary of key brain regions and their associated functions in relational reasoning based on patient and neuroimaging studies
12.17 Mean strength of the first-mentioned chess move and the move chosen as a function of problem difficulty by experts and by tournament players
12.18 A theoretical framework of the main cognitive processes and potential errors in medical decision-making
12.19 Eye fixations of a pathologist given the same biopsy whole-slide image (a) starting in year 1 and (d) ending in year 4
12.20 Brain activation while diagnosing lesions in X-rays, naming animals and naming letters
12.21 Brain image showing areas in the primary motor cortex with differences in relative voxel size between trained children and non-trained controls: (a) changes in relative voxel size over time; (b) correlation between improvement in motor-test performance and change in relative voxel size
12.22 Brain image showing areas in the primary auditory area with differences in relative voxel size between trained children and non-trained controls: (a) changes in relative voxel size over time; (b) correlation between improvement in a melody-rhythm test and change in relative voxel size
12.23 Mean chess ratings of candidates, non-candidate grandmasters and all non-grandmasters as a function of number of games played
12.24 The main factors (genetic and environmental) influencing the development of expertise
13.1 Percentages of correct responses and various incorrect responses with the false-positive and benign cyst scenarios
13.2 Percentage of correct predictions of the judged frequencies of different causes of death based on the affect heuristic (overall dread score), affect heuristic and availability
13.3 Percentage of correct inferences on four tasks
13.4 A hypothetical value function
13.5 Ratings of competence satisfaction for the sunk-cost option and the alternative option for those selecting each option
13.6 Risk aversion for gains and risk seeking for losses on a money-based task by financial professionals and students
13.7 Percentages of participants adhering to cumulative prospect theory, the minimax rule, or unclassified with affect-poor and affect-rich problems (a) with or (b) without numerical information concerning willingness to pay for medication
13.8 Proportion of politicians and population samples in Belgium, Canada and Israel voting to extend a loan programme
13.9 A model of selective exposure: defence motivation and accuracy motivation
13.10 The five phases of decision-making according to Galotti’s theory
13.11 Klein’s recognition-primed decision model
14.1 Mean number of modus ponens inferences accepted as a function of relative strength of the evidence and strategy
14.2 The Wason selection task
14.3 Percentage acceptance of conclusions as a function of perceived base rate (low vs high), believability of conclusions and validity of conclusions
14.4 Three models of the relationship between the intuitive and deliberate systems: (a) serial model; (b) parallel model; and (c) logical intuition model
14.5 Proportion correct on incongruent syllogisms as a function of instructions and cognitive ability
14.6 The approximate time courses of reasoning and meta-reasoning processes during reasoning and problem solving
14.7 Brain regions most consistently activated across 28 studies of deductive reasoning
14.8 Relationships between reasoning task performance (accuracy) and inferior frontal cortex activity in the left hemisphere and the right hemisphere in (a) the low-load condition and (b) the high-load condition
14.9 Mean responses to the question, “How much risk do you believe climate change poses to human health, safety or prosperity?”
14.10 Effects of trustworthiness and others’ opinions on convincingness ratings
14.11 Mean-rated argument strength as a function of the probability of the outcome and how negative the outcome would be
14.12 Stanovich’s tripartite model of reasoning
15.1 The two-dimensional framework for emotion showing the two dimensions of pleasure–misery and arousal–sleep and the two dimensions of positive affect and negative affect
15.2 Brain areas activated by positive, negative and neutral stimuli
15.3 Brain areas showing greater activity for top-down than for bottom-up processing and those showing greater activity for bottom-up than for top-down processes
15.4 Multiple appraisal mechanisms used in emotion generation
15.5 Changes in self-reported horror and distress and in galvanic skin response between pre-training and post-training (for the watch condition and the appraisal condition)
15.6 A process model of emotion regulation based on five major types of strategy (situation selection, situation modification, attention deployment, cognitive change and response modulation)
15.7 Mean level of depression as a function of stress severity and cognitive reappraisal ability
15.8 A three-stage neural network model of emotion regulation
15.9 The incompatibility flanker effect (incompatible trials – compatible trials) on reaction times as a function of mood (happy or sad) and whether a global, local or mixed focus had been primed on a previous task
15.10 Two main brain mechanisms involved in the memory-enhancing effects of emotion: (1) the medial temporal lobes; (2) the medial, dorsolateral and ventrolateral prefrontal cortex
15.11 (a) Free and (b) cued recall as a function of mood state (happy or sad) at learning and at recall
15.12 Two well-known moral dilemma problems: (a) the trolley problem; and (b) the footbridge problem
15.13 The dorsolateral prefrontal cortex, located approximately in Brodmann areas 9 and 46, and the ventromedial prefrontal cortex, located approximately in Brodmann areas 10 and 11
15.14 Sensitivity to consequences, sensitivity to moral norms and preference for inaction vs action as a function of psychopathy (low vs high)
15.15 Driverless cars: moral decisions
15.16 Effects of mood manipulation (anxiety, sadness or neutral) on percentages of people choosing a high-risk job option
15.17 Mean buying price for a water bottle as a function of mood (neutral vs sad) and self-focus (low vs high)
15.18 The positive emotion “family tree” with the trunk representing the neural reward system and the branches representing nine semi-distinct positive emotions
15.19 Probability of selecting a candy bar by participants in a happy or sad mood as a function of implicit attitudes on the Implicit Association Test
15.20 Effects of mood states on judgement and decision-making
15.21 The emotion-imbued choice model
15.22 The dot-probe task
15.23 The emotional Stroop task
15.24 The impaired cognitive control account put forward by Joormann et al. (2007)
16.1 Mean scores for error detection on a proofreading task comparing unconscious goal vs no-goal control and low vs high goal importance
16.2 Awareness as a social perceptual model of attention
16.3 (a) Region in left fronto-polar cortex for which decoding of upcoming motor decisions was possible. (b) Decoding accuracy of these decisions
16.4 Undistorted and distorted photographs of the Brunnen der Lebensfreude in Rostock, Germany
16.5 Modulation of the appropriate frequency bands of the EEG signal associated with motor imagery in one healthy control and three patients
16.6 Activation patterns on a binocular-rivalry task when observers (A) reported what they perceived or (B) passively experienced rivalry
16.7 Three successive stages of visual processing following stimulus presentation
16.8 Percentage of trials on which participants reported awareness of the content of photographs under masked and unmasked conditions for animal and non-animal photographs
16.9 Five hypotheses about the relationship between attention and conscious awareness identified by Webb and Graziano
16.10 Event-related potential waveforms in the aware-correct, unaware-correct and unaware-incorrect conditions
16.11 Synchronisation of neural activity across cortical areas for consciously perceived words (visible condition) and non-perceived words (invisible condition) during different time periods
16.12 Integrated brain activity: (a) overall information sharing or integration across the brain for vegetative state, minimally conscious and conscious brain-damaged patients and healthy controls; (b) information sharing (integration) across short, medium and long distances within the brain for the four groups
16.13 Event-related potentials in the left and right hemispheres to the first of two stimuli by AC (a patient with severe corpus callosum damage)
16.14 Detection and localisation of circles presented to the left or right visual fields by two patients responding verbally, with the left or right hand
Preface
Producing regular editions of this textbook gives us a front-row seat from
which to observe all the exciting developments in our understanding of
human cognition. What are the main reasons for the rapid rate of progress
within cognitive psychology since the seventh edition of this textbook?
Below we identify two factors that have been especially important.
First, the overarching assumption that the optimal way to enhance our
understanding of cognition is by combining data and insights from several
different approaches remains exceptionally fruitful. These approaches
include traditional cognitive psychology; cognitive neuropsychology (study
of brain-damaged patients); computational cognitive science (development
of computational models of human cognition); and cognitive neuroscience
(combining information from behaviour and from brain activity). Note
that we use the term “cognitive psychology” in a broad or general sense to
cover all these approaches.
The above approaches all continue to make extremely valuable contributions. However, cognitive neuroscience deserves to be singled out –
it has increasingly been used with great success to resolve theoretical
controversies and to provide novel empirical data that foster theoretical
developments.
Second, there has been a steady increase in cognitive research of
direct relevance to real life. This is reflected in a substantial increase in
the number of boxes labelled “in the real world” in this edition compared to the previous one. Examples include eyewitness confidence, mishearing of song lyrics, multi-tasking, airport security checks and causes of
plane crashes. What is noteworthy is the increased quality of real-world
research (e.g., more sophisticated experimental designs; enhanced theoretical relevance).
With every successive edition of this textbook, the authors have had
to work harder and harder to keep up with the huge increase in the number of
research publications in cognitive psychology. For example, the first author
wrote parts of the book in far-flung places including Botswana, New
Zealand, Malaysia and Cambodia. His only regret is that book writing has
sometimes had to take precedence over sightseeing!
We would both like to thank the very friendly and efficient staff at
Psychology Press including Sadé Lee and Ceri McLardy.
We would also like to thank the anonymous reviewers who commented
on various chapters. Their comments were very useful when we embarked
on the task of revising the first draft of the manuscript. Of course, we are
responsible for any errors and/or misunderstandings that remain.
Michael Eysenck and Mark Keane
Visual tour
(how to use this book)
TEXTBOOK FEATURES
Listed below are the various pedagogical features that can be found both in
the margins and within the main text, with visual examples of the boxes to
look out for, and descriptions of what you can expect them to contain.
Key terms
Throughout the book, key terms are highlighted in the text and defined in
boxes in the margins, helping you to get to grips with the vocabulary fundamental to the subject being covered.
In the real world
Each chapter contains boxes within the main text that explore “real world”
examples, providing context and demonstrating how some of the theories
and concepts covered in the chapter work in practice.
Chapter summary
Each chapter concludes with a brief summary of each section of the chapter,
helping you to consolidate your learning by making sure you have taken in
all of the concepts covered.
Further reading
Also at the end of each chapter is an annotated list of key scholarly books,
book chapters and journal articles that you are encouraged to explore
through independent study to expand upon the knowledge you have gained
from the chapter and to plan for your assignments.
Links to companion website features
Whenever you see this symbol, look out for related supplementary material
amongst the resources for that chapter on the companion website at
www.routledge.com/cw/eysenck.
Glossary
An extensive glossary appears at the end of the book, offering a comprehensive list that includes all the key terms defined in boxes in the main text.
Chapter 1
Approaches to human cognition
INTRODUCTION
We are now well into the third millennium and there is ever-increasing
interest in unravelling the mysteries of the human brain and mind. This
interest is reflected in the substantial upsurge of scientific research within
cognitive psychology and cognitive neuroscience. In addition, the cognitive
approach has become increasingly influential within clinical psychology.
In that area, it is recognised that cognitive processes (especially cognitive
biases) play a major role in the development (and successful treatment) of
mental disorders (see Chapter 15).
In similar fashion, social psychologists increasingly focus on social
cognition, which concerns the role of cognitive processes in influencing
individuals’ behaviour in social situations. For example, suppose other
people respond with laughter when you tell them a joke. This laughter
is often ambiguous – they may be laughing with you or at you (Walsh
et al., 2015). Your subsequent behaviour is likely to be influenced by your
cognitive interpretation of their laughter.
What is cognitive psychology? It is concerned with the internal processes involved in making sense of the environment and deciding on
appropriate action. These processes include attention, perception, learning, memory, language, problem solving, reasoning and thinking. We can
define cognitive psychology as aiming to understand human cognition
by observing the behaviour of people performing various cognitive tasks.
However, the term “cognitive psychology” can also be used more broadly
to include brain activity and structure as relevant information for understanding human cognition. It is in this broader sense that it is used in the
title of this book.
Here is a simple example of cognitive psychology in action. Frederick
(2005) developed a test (the Cognitive Reflection Test) that included the
following item:
A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the
ball. How much does the ball cost? ___ cents
KEY TERMS

Social cognition
An approach within social psychology in which the emphasis is on the cognitive processing of information about other people and social situations.

Cognitive psychology
An approach that aims to understand human cognition by the study of behaviour; a broader definition also includes the study of brain activity and structure.
What do you think is the correct answer? Brañas-Garza et al. (2015)
found in a review of findings from 41,004 individuals that 68% produced
the wrong answer (typically 10 cents) and only 32% gave the right answer
(5 cents). Even providing financial incentives to produce the correct answer
failed to improve performance.
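To see why 5 cents is correct, it helps to write the problem as a simple equation (a worked check added here for illustration; it is not part of the original test). Let the ball cost x dollars; the bat then costs x + 1.00, so:

x + (x + 1.00) = 1.10, hence 2x = 0.10 and x = 0.05

The intuitive answer of 10 cents fails because it would make the bat $1.10 and the total $1.20.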
The above findings suggest most people will rapidly produce an incorrect answer (i.e., 10 cents) that is easily accessible and are unwilling to
devote extra time to checking that they have the right answer. However,
Gangemi et al. (2015) found many individuals producing the wrong answer had a "feeling of error", suggesting they experienced cognitive uneasiness about their answer. In sum, the intriguing findings on the Cognitive
Reflection Test indicate that we can fail to think effectively even on relatively simple problems. Subsequent research has clarified the reasons for
these deficiencies in our thinking (see Chapter 12).
The aims of cognitive neuroscientists overlap with those of cognitive
psychologists. However, there is one major difference between cognitive
neuroscience and cognitive psychology in the narrow sense. Cognitive neuroscientists argue convincingly we need to study the brain as well as behaviour while people engage in cognitive tasks. After all, the internal processes
involved in human cognition occur in the brain. Cognitive neuroscience
uses information about behaviour and the brain to understand human cognition. Thus, the distinction between cognitive neuroscience and cognitive
psychology in the broader sense is blurred.
Cognitive neuroscientists explore human cognition in several ways.
First, there are brain-imaging techniques of which functional magnetic
resonance imaging (fMRI) is probably the best-known. Second, there are
electrophysiological techniques involving the recording of electrical signals
generated by the brain. Third, many cognitive neuroscientists study the
effects of brain damage on cognition. It is assumed the patterns of cognitive impairment shown by brain-damaged patients can inform us about
normal cognitive functioning and the brain areas responsible for various
cognitive processes.
The huge increase in scientific interest in the workings of the brain
is mirrored in the popular media – numerous books, films and television
programmes communicate the more accessible and dramatic aspects of
cognitive neuroscience. Increasingly, media coverage includes coloured pictures of the brain indicating the areas most activated when people perform
various tasks.
KEY TERM

Cognitive neuroscience
An approach that aims to understand human cognition by combining information from behaviour and the brain.
Four main approaches
We can identify four main approaches to human cognition (see Table 1.1).
Note, however, there has been a substantial increase in research combining two (or even more) of these approaches. We will shortly discuss each
approach in turn and you will probably find it useful to refer back to
this chapter when reading the rest of the book. Hopefully, you will find
Table 1.3 (towards the end of this chapter) especially useful because it summarises the strengths and limitations of all four approaches.
TABLE 1.1 APPROACHES TO HUMAN COGNITION

1. Cognitive psychology: this approach involves using behavioural evidence to enhance our understanding of human cognition. Since behavioural data are also of great importance within cognitive neuroscience and cognitive neuropsychology, cognitive psychology's influence is enormous.
2. Cognitive neuropsychology: this approach involves studying brain-damaged patients to understand normal human cognition. It was originally closely linked to cognitive psychology but has recently also become linked to cognitive neuroscience.
3. Cognitive neuroscience: this approach involves using evidence from behaviour and the brain to understand human cognition.
4. Computational cognitive science: this approach involves developing computational models to further our understanding of human cognition; such models increasingly incorporate knowledge of behaviour and the brain. A computational model takes the form of an algorithm, which consists of a precise and detailed specification of the steps involved in performing a task. Computational models are designed to simulate or imitate human processing on a given task.
KEY TERM

Algorithm
A computational procedure providing a specified set of steps to problem solution; see heuristic.
COGNITIVE PSYCHOLOGY
We can obtain some perspective on the contribution of cognitive psychology
by considering what preceded it. Behaviourism was the dominant approach to
psychology throughout the first half of the twentieth century. The American
psychologist John Watson (1878–1958) is often regarded as the founder of
behaviourism. He argued that psychologists should focus on stimuli (aspects
of the immediate situation) and responses (behaviour produced by the participants in an experiment). This approach appears “scientific” because it
focuses on stimuli and responses, both of which are observable.
Behaviourists argued that internal mental processes (e.g., attention)
cannot be verified by reference to observable behaviour and so should be
ignored. According to Watson (1913, p. 165), behaviourism should “never
use the terms consciousness, mental states, mind, content, introspectively
verifiable and the like”. In stark contrast, as we have already seen, cognitive psychologists argue it is of crucial importance to study such internal
mental processes. Hopefully, you will be convinced that cognitive psychologists are correct when you read how the concepts of attention (Chapter 5)
and consciousness (Chapter 16) have been used fruitfully to enhance our
understanding of human cognition.
It is often claimed that behaviourism was overthrown by the “cognitive revolution”. However, the reality was less dramatic (Hobbs &
Burman, 2009). For example, Tolman (1948) was a behaviourist but he did
not believe internal processes should be ignored. He carried out studies in
which rats learned to run through a maze to a goal box containing food.
When Tolman blocked off the path the rats had learned to use, they rapidly
learned to follow other paths leading in the right general direction. Tolman
concluded the rats had acquired an internal cognitive map indicating the
maze’s approximate layout.
It is almost as pointless to ask “When did cognitive psychology start?”,
as to enquire “How long is a piece of string?”. However, 1956 was crucially
important. At a meeting at the Massachusetts Institute of Technology, Noam Chomsky presented his theory of language, George Miller discussed the magic number seven in short-term memory (Miller, 1956) and Allen Newell and Herbert Simon discussed the General Problem Solver (see Gobet and Lane, 2015). In addition, there was the first systematic attempt to study concept formation from the cognitive perspective (Bruner et al., 1956). The history of cognitive psychology from the perspective of its classic studies is discussed in Eysenck and Groome (2015a).

Several decades ago, most cognitive psychologists subscribed to the information-processing approach based loosely on an analogy between the mind and the computer (see Figure 1.1). A stimulus (e.g., a problem or task) is presented, which causes various internal processes to occur, leading eventually to the desired response or answer. Processing directly affected by the stimulus input is often described as bottom-up processing. It was typically assumed only one process occurs at a time: this is serial processing, meaning the current process is completed before the onset of the next one.

KEY TERMS

Bottom-up processing
Processing directly influenced by environmental stimuli; see top-down processing.

Serial processing
Processing in which one process is completed before the next one starts; see parallel processing.

Top-down processing
Stimulus processing that is influenced by factors such as the individual's past experience and expectations.
Figure 1.1
An early version of the information processing approach.

The above approach is drastically oversimplified. Task processing typically also involves top-down processing, which is processing influenced by the individual's expectations and knowledge rather than simply by the stimulus itself. Read what it says in the triangle (Figure 1.2). Unless you know the trick, you probably read it as "Paris in the spring". If so, look again: the word "the" is repeated. Your expectation it was a well-known phrase (i.e., top-down processing) dominated the information available from the stimulus (i.e., bottom-up processing).

Figure 1.2
Diagram to demonstrate top-down processing.

The traditional approach was also oversimplified in assuming processing is typically serial. In fact, more than one process typically occurs at the same time – this is parallel processing. We are much more likely to use parallel processing when performing a highly practised task than a new one (see Chapter 5). For example, someone taking their first driving lesson finds it very hard to control the car's speed, steer accurately and pay attention to other road users at the same time. In contrast, an experienced driver finds it easy.

There is also cascade processing: a form of parallel processing involving an overlap of different processing stages when someone performs a task. More specifically, later stages of processing are initiated before one or more earlier stages have finished. For example, suppose you are trying to work out the meaning of a visually presented word. The most thorough approach would involve identifying all the letters in the word followed by matching the resultant letter string against words you have stored in long-term memory. In fact, people often engage in cascade processing – they form hypotheses as to the word that has been presented before identifying all the letters (McClelland, 1979).
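The contrast between exhaustive serial processing and cascade processing can be made concrete with a small sketch (illustrative only: the mini-lexicon and function names below are invented for this example, not taken from McClelland's model):

```python
# Illustrative sketch: a toy contrast between serial and cascade word
# recognition (the mini-lexicon is invented for this example).

LEXICON = ["table", "tiger", "tango", "house"]

def serial_recognise(stimulus):
    # Serial processing: identify ALL the letters first, and only then
    # match the complete letter string against the lexicon.
    word = "".join(list(stimulus))
    return word if word in LEXICON else None

def cascade_recognise(stimulus):
    # Cascade processing: the word-matching stage starts while letters are
    # still being identified; a hypothesis is formed as soon as the letters
    # seen so far leave only one candidate (cf. McClelland, 1979).
    candidates = LEXICON
    for i, letter in enumerate(stimulus):
        candidates = [w for w in candidates if len(w) > i and w[i] == letter]
        if len(candidates) == 1:
            return candidates[0], i + 1  # decided before all letters were seen
    return (candidates[0] if candidates else None), len(stimulus)

print(serial_recognise("tiger"))   # -> 'tiger' (only after all five letters)
print(cascade_recognise("tiger"))  # -> ('tiger', 2): hypothesis after "ti"
```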
An important issue for cognitive psychologists is the task-impurity problem – most cognitive tasks require several processes, thus making it hard to interpret the findings. One approach to this problem is to consider various tasks all requiring the same process. For example, Miyake et al. (2000) used three tasks requiring deliberate inhibition of a dominant response:

(1) The Stroop task: name the colour in which colour words are presented (e.g., RED printed in green) and avoid saying the colour word (which has to be inhibited). You can see for yourself how hard this task is by naming the colours of the words shown in Figure 1.3.
(2) The antisaccade task: inhibit the natural tendency to look at a visual cue and instead look in the opposite direction. People typically take longer to perform this task than the control task of simply looking at the visual cue.
(3) The stop-signal task: respond rapidly to indicate whether each of a series of words is an animal or non-animal; on key trials, a computer-emitted tone indicates that the response should be inhibited.

KEY TERMS

Parallel processing
Processing in which two or more cognitive processes occur at the same time.

Cascade processing
Later processing stages start before earlier processing stages have been completed when performing a task.
Figure 1.3
Test yourself by naming the colours in each column. You should name the colours rapidly in the first three columns because there is no colour-word conflict. In contrast, colour naming should be slower (and more prone to error) when naming colours in the fourth and fifth columns.

Miyake et al. (2000) found all three tasks involved similar processes. They used complex statistical techniques (latent variable analysis) to extract what was common across the three tasks. This was assumed to represent a relatively pure measure of the inhibitory process. Throughout this book, we will discuss many ingenious strategies used by cognitive psychologists to identify the processes used in numerous tasks.
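The logic of extracting what several tasks share can be illustrated with simulated data (a minimal sketch assuming NumPy and scikit-learn; the numbers are invented and this is not Miyake et al.'s actual analysis):

```python
# Minimal sketch (simulated data): a single latent "inhibition" ability
# generates noisy scores on three tasks, and factor analysis recovers
# the common component those tasks share.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 500
inhibition = rng.normal(size=n)                        # latent ability (unobserved)
task_scores = np.column_stack([
    0.7 * inhibition + rng.normal(scale=0.7, size=n),  # "Stroop"
    0.6 * inhibition + rng.normal(scale=0.8, size=n),  # "antisaccade"
    0.5 * inhibition + rng.normal(scale=0.9, size=n),  # "stop-signal"
])

fa = FactorAnalysis(n_components=1)
factor = fa.fit_transform(task_scores).ravel()         # estimated common factor

# The recovered factor correlates strongly with the true latent ability.
print(round(abs(np.corrcoef(factor, inhibition)[0, 1]), 2))
```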
KEY TERMS

Ecological validity
The applicability (or otherwise) of the findings of laboratory studies to everyday settings.

Implacable experimenter
The situation in experimental research in which the experimenter's behaviour is uninfluenced by the participant's behaviour.
Strengths
Cognitive psychology was for many years the engine room of progress in
understanding human cognition and the other three approaches listed in
Table 1.1 have benefitted from it. For example, cognitive neuropsychology
became important 25 years after cognitive psychology. It was only when
cognitive psychologists had developed reasonable accounts of healthy
human cognition that the performance of brain-damaged patients could
be understood fully. Before that, it was hard to decide which patterns of
cognitive impairment were theoretically important.
In a similar fashion, the computational modelling activities of computational cognitive scientists are typically heavily influenced by pre-computational psychological theories. Finally, the great majority of theories driving research in cognitive neuroscience originated within cognitive psychology.
Cognitive psychology has not only had a massive influence on theorising across all four major approaches to human cognition. It has also had a
predominant influence on the development of cognitive tasks and on task
analysis (how a task is accomplished).
Limitations
In spite of cognitive psychology’s enormous contributions, it has several
limitations. First, our behaviour in the laboratory may differ from our
behaviour in everyday life. Thus, laboratory research sometimes lacks
ecological validity – the extent to which laboratory findings are applicable
to everyday life. For example, our everyday behaviour is often designed
to change a situation or to influence others’ behaviour. In contrast, the
sequence of events in most laboratory research is based on the experimenter’s predetermined plan and is uninfluenced by participants’ behaviour.
Wachtel (1973) used the term implacable experimenter to describe this
state of affairs.
We must not exaggerate problems associated with lack of ecological
validity. As we will see in this book, there has been a dramatic increase
in applied cognitive psychology in which the emphasis is on investigating topics of general importance. Such research often has good ecological
validity. Note that it is far better to carry out well-controlled experiments
under laboratory conditions than poorly controlled experiments under
naturalistic conditions. It is precisely because it is considerably easier for
researchers to exercise experimental control in the laboratory that so much
research is laboratory-based.
Second, theories in cognitive psychology are often expressed only in
verbal terms (although this is becoming less common). Such theories are
vague, making it hard to know precisely what predictions follow from
them and thus to falsify them. These limitations can largely be overcome by
computational cognitive scientists developing cognitive models specifying
precisely any given theory’s assumptions.
Third, difficulties in falsifying theories have led to a proliferation of
different theories on any given topic. For example, there are at least 12
different theories of working memory (see Chapter 6). Another reason for
the proliferation of rather similar theories is the “toothbrush problem”
(Mischel, 2008): no self-respecting cognitive psychologist wants to use
anyone else’s theory.
Fourth, the findings obtained using any given task or paradigm are
sometimes specific to that paradigm and do not generalise to other (apparently similar) tasks. This is paradigm specificity. It means some findings
are narrow in scope and applicability (Meiser, 2011). This problem can be
minimised by developing theories accounting for performance across several
tasks or paradigms. For example, Anderson et al. (2004; discussed later in
this chapter) developed a comprehensive theoretical architecture or framework known as the Adaptive Control of Thought-Rational (ACT-R) model.
Fifth, cognitive psychologists typically obtain measures of performance
speed and accuracy. These measures are very useful but provide only indirect evidence about internal cognitive processes. Most tasks are “impure” in
that they involve several processes, and it is hard to identify the number and
nature of processes involved on the basis of speed and accuracy measures.
KEY TERMS

Paradigm specificity
The findings with a given experimental task or paradigm are not replicated even when apparently very similar tasks or paradigms are used.

Lesion
Damage within the brain resulting from injury or disease; it typically affects a restricted area.
COGNITIVE NEUROPSYCHOLOGY
Cognitive neuropsychology focuses on the patterns of cognitive performance (intact and impaired) of brain-damaged patients having a lesion
(structural damage to the brain caused by injury or disease). According to
cognitive neuropsychologists, studying brain-damaged patients can tell us
much about cognition in healthy individuals.
The above idea does not sound very promising, does it? In fact,
however, cognitive neuropsychology has contributed substantially to our
understanding of healthy human cognition. For example, in the 1960s, most memory researchers thought the storage of information in long-term memory depended on previous processing in short-term memory (see
Chapter 6). However, Shallice and Warrington (1970) reported the case of
a brain-damaged man, KF. His short-term memory was severely impaired
but his long-term memory was intact. These findings played an important
role in changing theories of healthy human memory.
Since cognitive neuropsychologists study brain-damaged patients,
we might imagine they would be interested in the workings of the brain.
In fact, many cognitive neuropsychologists pay little attention to the
brain itself. According to Coltheart (2015, p. 198), for example, “Even
though cognitive neuropsychologists typically study people with brain
damage, . . . cognitive neuropsychology is not about the brain: it is about
information-processing models of cognition.”
An increasing number of cognitive neuropsychologists disagree with
Coltheart. They believe we should consider the brain, using techniques
such as magnetic resonance imaging to identify the brain areas damaged in
any given patient. They are also increasingly willing to study the impact of
brain damage on brain processes using various neuroimaging techniques.
Theoretical assumptions
KEY TERM

Modularity
The assumption that the cognitive system consists of many fairly independent or separate modules or processors, each specialised for a given type of processing.
Coltheart (2001) provided a very clear
account of the major assumptions of cognitive neuropsychology. Here we will discuss
these assumptions and briefly consider relevant evidence.
One key assumption is modularity,
meaning the cognitive system consists of
numerous modules or processors operating
fairly independently or separately of each
other. It is assumed these modules exhibit
domain specificity (they respond to only
one given class of stimuli). For example,
there may be a face-recognition module that
responds only when a face is presented.
Modular systems typically involve
serial processing with processing within one
module being completed before processing
starts in the next module. As a result, there
is very limited interaction among modules.
There is some support for modularity from
the evolutionary approach. Species with
larger brains generally have more specialised brain regions that could be involved in
modular processing. However, the notion
that human cognition is heavily modular
is hard to reconcile with neuroimaging evidence. The human brain possesses a moderately high level of connectivity (Bullmore &
Sporns, 2012; see p. 14), suggesting there is more parallel processing than
assumed by most cognitive neuropsychologists.
The second major assumption is that of anatomical modularity.
According to this assumption, each module is located in a specific brain
area. Why is this assumption important? Cognitive neuropsychologists
are most likely to make progress when studying patients with brain damage limited to a single module. Such patients may not exist if there is
no anatomical modularity. Suppose all modules were distributed across
large brain areas. If so, the great majority of brain-damaged patients
would suffer damage to most modules, making it impossible to work out
the number and nature of their modules.
There is evidence of anatomical modularity in the visual processing
system (see Chapter 2). However, there is less support for anatomical
modularity with most complex tasks. For example, consider the findings of
Yarkoni et al. (2011). Across over 3,000 neuroimaging studies, some brain
areas (e.g., dorsolateral prefrontal cortex; anterior cingulate cortex) were
activated in 20% of them despite the great diversity of tasks involved.
The third major assumption (the “universality assumption”) is that
“Individuals . . . share a similar or an equivalent organisation of their cognitive functions, and presumably have the same underlying brain anatomy”
9781138482210_COGNITIVE_PSYCHOLOGY_PRE_CHAP_1.indd 8
28/02/20 2:15 PM
9
Approaches to human cognition
(de Schotten and Shallice, 2017, p. 172). If this assumption (also common
within cognitive neuroscience) is false, we could not readily use the findings from individual patients to draw conclusions about the organisation
of other people’s cognitive systems or functional architecture.
There is accumulating evidence against the universality assumption.
Tzourio-Mazoyer et al. (2004) discovered substantial differences between
individuals in the location of brain networks involved in speech and language. Finn et al. (2015) found clear-cut differences between individuals in
functional connectivity across the brain, concluding that “An individual’s
functional brain connectivity profile is both unique and reliable, similarly
to a fingerprint” (p. 1669).
Duffau (2017) reviewed interesting research conducted on patients during surgery for epilepsy or a tumour. Direct electrical stimulation, which causes "a genuine virtual transient lesion" (p. 305), is applied invasively to the cortex. The patient is awakened and given various cognitive
tasks while receiving stimulation. Impaired performance when direct electrical stimulation is applied to a given area indicates that area is involved
in the cognitive functions assessed by the current task.
Findings obtained using direct electrical stimulation and other techniques (e.g., fMRI) led Duffau (2017) to propose a two-level model. At
the cortical level, there is high variability across individuals in structure
and function of any given brain areas. At the subcortical level (e.g., in premotor cortex), in contrast, there is very little variability across individuals.
The findings at the cortical level seem inconsistent with the universality
assumption.
The fourth assumption is subtractivity. The basic idea is that brain
damage impairs one or more processing modules but does not change
or add anything. The fifth assumption (related to subtractivity) is transparency (Shallice, 2015). According to the transparency assumption, the
performance of a brain-damaged patient reflects the operation of a theory
designed to explain the performance of healthy individuals minus the
impact of their lesion.
Why are the subtractivity and transparency assumptions important? Suppose they are incorrect and brain-damaged patients develop
new modules to compensate for their cognitive impairments. That would
greatly complicate the task of learning about the intact cognitive system
by studying brain-damaged patients. Consider pure alexia, a condition in
which brain-damaged patients have severe reading problems but otherwise
intact language abilities. These patients generally have a direct relationship
between word length and reading speed due to letter-by-letter processing
(Bormann et al., 2015). This indicates the use of a compensatory strategy
differing markedly from the reading processes used by healthy adults.
KEY TERM

Pure alexia
Severe problems with reading but not other language skills; caused by damage to brain areas involved in visual processing.
Research in cognitive neuropsychology
How do cognitive neuropsychologists set about understanding the cognitive
system? Of major importance is the search for dissociations, which occur
when a patient has normal performance on one task (task X) but is impaired
on a second one (task Y). For example, amnesic patients perform almost
normally on short-term memory tasks but are greatly impaired on many
long-term memory tasks (see Chapter 6). It is tempting (but dangerous!) to
conclude that the two tasks involve different processing modules and that
the module(s) needed on long-term memory tasks have been damaged by
brain injury.
Why must we avoid drawing sweeping conclusions from dissociations?
Patients may perform well on one task but poorly on a second one simply
because the second task is more complex. Thus, dissociations may reflect
differences in task complexity rather than the use of different modules.
One apparent solution to the above problem is to find double dissociations. A double dissociation between two tasks (X and Y) is obtained
when one patient performs normally on task X and is impaired on task Y
but another patient shows the opposite pattern. We cannot explain double
dissociations by arguing that one task is harder. For example, consider the
double dissociation that amnesic patients have impaired long-term memory
but intact short-term memory whereas other patients (e.g., KF discussed
above) have the opposite pattern. This double dissociation strongly suggests there is an important distinction between short-term and long-term
memory and that they involve different brain regions.
The approach based on double dissociations has various limitations.
First, it is generally based on the assumption that separate modules exist
(which may be misguided). Second, double dissociations can often be
explained in various ways and so provide only indirect evidence for separate modules underlying each task (Davies, 2010). For example, a double
dissociation between tasks X and Y implies the cognitive system used on
X is not identical to the one used on Y. Strictly speaking, the most we
can generally conclude is that “Each of the two systems has at least one
sub-system that the other doesn’t have” (Bergeron, 2016, p. 818). Third,
it is hard to decide which of the very numerous double dissociations that
have been discovered are theoretically important.
Finally, we consider associations. An association occurs when a
patient is impaired on tasks X and Y. Associations are sometimes taken
as evidence for a syndrome (sets of symptoms or impairments often
found together). However, there is a serious flaw in the syndrome-based
approach. An association may be found between tasks X and Y because
the mechanisms on which they depend are adjacent anatomically in the
brain rather than because they depend on the same underlying mechanism.
Thus, the interpretation of associations is fraught with difficulty.
KEY TERMS

Double dissociation
The finding that some brain-damaged individuals have intact performance on one task but poor performance on another task whereas other individuals exhibit the opposite pattern.

Association
The finding that certain symptoms or performance impairments are consistently found together in numerous brain-damaged patients.

Syndrome
The notion that symptoms that often co-occur have a common origin.

Case-series study
A study in which several patients with similar cognitive impairments are tested; this allows consideration of individual data and of variation across individuals.
Single case studies vs case series
For many years after the rise of cognitive neuropsychology in the 1970s,
most cognitive neuropsychologists made extensive use of single-case
studies. There were two main reasons. First, researchers can often gain
access to only one patient having a given pattern of cognitive impairment. Second, it was often assumed every patient has a somewhat different
pattern of cognitive impairment and so is unique. As a result, it would
be misleading and uninformative to average the performance of several
patients.
In recent years, there has been a move towards the case-series study.
Several patients with similar cognitive impairments are tested. After that,
the data of individual patients are compared and variation across patients
assessed.
The case-series approach is generally preferable to the single-case
approach for various reasons (Lambon Ralph et al., 2011; Bartolomeo
et al., 2017). First, it provides much richer data. With a case series, we
can assess the extent of variation between patients rather than simply being
concerned about the impairment (as in the single-case approach). Second,
with a case series, we can identify (and then de-emphasise) the findings
from patients who are “outliers”. With the single-case approach, in contrast, we do not know whether the one and only patient is representative
of patients with that condition or is an outlier.
KEY TERM

Diaschisis
The disruption to distant brain areas caused by a localised brain injury or lesion.
Strengths
Cognitive neuropsychology has several strengths. First, it allows us to draw causal inferences about the relationship between brain areas and cognitive processes and behaviour. In other
words, we can conclude (with moderate but not total confidence) that a
given brain area is crucially involved in performing certain cognitive tasks
(Genon et al., 2018).
Second, as Shallice (2015, pp. 387–388) pointed out, “A key intellectual strength of neuropsychology . . . is its ability to provide evidence
falsifying plausible cognitive theories.” Consider patients reading visually
presented words and non-words aloud. We might imagine patients with
damage to language areas would have problems in reading all words
and non-words. However, some patients perform reasonably well when
reading regular words (with predictable pronunciations) or non-words,
but poorly when reading irregular words (words with unpredictable pronunciations). Other patients can read regular words but have problems
with unfamiliar words and non-words. These fascinating patterns of
impairment have transformed theories of reading (Coltheart, 2015; see
Chapter 9).
Third, cognitive neuropsychology “produces large-magnitude phenomena which can be initially theoretically highly counterintuitive” (Shallice,
2015, p. 405). For example, amnesic patients typically have severely
impaired long-term memory for personal events and experiences but an
essentially intact ability to acquire and retain motor skills (Chapter 7).
These strong effects played a major role in memory researchers abandoning the notion of a single long-term memory system and replacing it with
more complex theories.
Fourth, in recent years, cognitive neuropsychology has increasingly
been combined fruitfully with cognitive neuroscience. For example, cognitive neuroscience has revealed that a given brain injury or lesion often
has widespread effects within the brain. This phenomenon is known as
diaschisis: “the distant neurophysiological changes directly caused by a
focal injury . . . these changes should correlate with behaviour” (Carrera
& Tononi, 2014, p. 2410). Discovering the true extent of the brain areas
adversely affected by a lesion facilitates the task of relating brain functioning to cognitive processing and task performance.
Limitations
KEY TERMS

Sulcus
A groove or furrow in the surface of the brain.

Gyrus
Prominent elevated area or ridge on the brain's surface; "gyri" is the plural.

Dorsal
Towards the top.

Ventral
Towards the bottom.

Rostral
Towards the front of the brain.
What are the limitations of the cognitive neuropsychological approach?
First, the crucial assumption that the cognitive system is fundamentally
modular is reasonable but too strong. There is less evidence for modularity
among higher-level cognitive processes (e.g., consciousness; focused attention) than among lower-level processes (e.g., colour processing; motion
processing). If the modularity assumption is incorrect, this has implications for the whole enterprise of cognitive neuropsychology (Patterson &
Plaut, 2009).
Second, other theoretical assumptions also seem too extreme. For
example, evidence discussed earlier casts considerable doubts on the
assumption of anatomical modularity and the universality assumption.
Third, the common assumption that the task performance of patients
provides relatively direct evidence concerning the impact of brain damage
on previously intact cognitive systems is problematic. Brain-damaged
patients often make use of compensatory strategies to reduce or eliminate
the negative effects of brain damage on cognitive performance. We saw an
example of such compensatory strategies earlier – patients with pure alexia
manage to read words by using a letter-by-letter strategy rarely used by
healthy individuals.
Hartwigsen (2018) proposed a model to predict when compensatory
processes will and will not be successful. According to this model, general
processes (e.g., attention; cognitive control; error monitoring) can be used
to compensate for the disruption of specific processes (e.g., phonological
processing) by brain injury. However, specific processes cannot be used to
compensate for the disruption of general processes. Hartwigsen discussed
evidence supporting his model.
Fourth, lesions can alter the organisation of the brain in several ways.
Dramatic evidence for brain plasticity is discussed in Chapter 16. Patients
whose entire left brain hemisphere was removed at an early age (known
as hemispherectomy) often develop good language skills even though language is typically centred in the left hemisphere (Blackmon, 2016).
There is the additional problem that a brain lesion can lead to changes
in the functional connectivity between the area of the lesion and distant,
intact brain areas (Bartolomeo et al., 2017). Thus, impaired cognitive performance following brain damage may reflect widespread reduced brain
connectivity as well as direct damage to a specific brain area. This complicates the task of interpreting the findings obtained from brain-damaged
patients.
COGNITIVE NEUROSCIENCE: THE BRAIN IN ACTION

Cognitive neuroscience involves the intensive study of brain activity as well as behaviour. Alas, the brain is extremely complicated (to put it mildly!). It consists of 100 billion neurons connected in very complex ways. We must consider how the brain is organised and how the different areas are described to understand research involving functional neuroimaging. Below we discuss various ways of describing specific brain areas.

KEY TERMS

Posterior
Towards the back of the brain.

Lateral
Situated at the side of the brain.

Medial
Situated in the middle of the brain.
Figure 1.4
The four lobes, or divisions, of the cerebral cortex in the left hemisphere.

Interactive feature: Primal Pictures' 3D atlas of the brain
First, the cerebral cortex is divided into
four main divisions or lobes (see Figure 1.4).
There are four lobes in each brain hemisphere:
frontal; parietal; temporal; and occipital. The
frontal lobes are divided from the parietal
lobes by the central sulcus (sulcus means
furrow or groove), and the lateral fissure separates the temporal lobes from the parietal
and frontal lobes. In addition, the parieto-occipital sulcus and pre-occipital notch divide
the occipital lobes from the parietal and temporal lobes. The main gyri (or ridges; gyrus
is the singular) within the cerebral cortex are
shown in Figure 1.4.
Researchers use various terms to describe accurately the brain area(s) activated during task performance:

● dorsal (or superior): towards the top
● ventral (or inferior): towards the bottom
● anterior (or rostral): towards the front
● posterior: towards the back
● lateral: situated at the side
● medial: situated in the middle.
The German neurologist Korbinian Brodmann (1868–1918) produced a brain map based on differences in the distributions of cell types across cortical layers (see Figure 1.5).

Figure 1.5
Brodmann brain areas on the lateral (top figure) and medial (bottom figure) surfaces.
He identified 52 areas. We will often refer to areas, for example, as BA17,
which means Brodmann Area 17, rather than Brain Area 17!
Within cognitive neuroscience, brain areas are often described with reference to their main functions. For example, Brodmann Area 17 (BA17) is
commonly called the primary visual cortex because it is strongly associated
with the early processing of visual stimuli.
KEY TERM

Connectome
A comprehensive wiring diagram of neural connections within the brain.
Brain organisation
In recent years, there has been considerable progress in identifying the
connectome: this is a “wiring diagram” providing a complete map of the
brain’s neural connections. Why is it important to identify the connectome?
First, as we will see, it advances our understanding of how the brain is
organised. Second, identifying the brain’s structural connections facilitates
the task of understanding how it functions. More specifically, the brain’s
functioning is strongly constrained by its structural connections. Third,
as we will see, we can understand some individual differences in cognitive
functioning with reference to individual differences in the connectome.
Bullmore and Sporns (2012) used information about the connectome
to address issues about brain organisation. They argued two major principles might determine its organisation. First, there is the principle of cost
control: costs (e.g., use of energy and space) would be minimised if the brain
consisted of limited, short-distance connections (see Figure 1.6). Second,
there is the principle of efficiency (efficiency is the ability to integrate information across the brain). This can be achieved by having very numerous
connections, many of which are long-distance (see Figure 1.6). These two
principles are in conflict – you cannot have high efficiency at low cost.
You might imagine it would be best if our brains were organised primarily on the basis of efficiency. However, this would be incredibly costly –
if all 100 billion brain neurons were interconnected, the brain would need
to be 12½ miles wide (Ward, 2015)! In fact, neurons mostly connect with
nearby neurons and no neuron is connected to more than about 10,000
other neurons. As a result, the human brain has a near-optimal trade-off
between cost and efficiency (see Figure 1.6). Thus, our brains are reasonably efficient while incurring a manageable cost.
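The cost-efficiency trade-off can be illustrated with toy graphs (a minimal sketch assuming the networkx library; the networks are artificial, not brain data). Global efficiency here is the standard graph measure: the average inverse shortest-path length between node pairs.

```python
# Toy illustration (artificial networks, not brain data): efficiency is the
# average inverse shortest-path length; "cost" is indexed here by edge count.
import networkx as nx

n = 60
lattice = nx.watts_strogatz_graph(n, k=4, p=0.0)              # short-range links only
small_world = nx.watts_strogatz_graph(n, k=4, p=0.1, seed=1)  # a few long-range rewirings
complete = nx.complete_graph(n)                               # maximal efficiency, huge cost

for name, g in [("lattice", lattice), ("small-world", small_world), ("complete", complete)]:
    print(f"{name:12s} edges={g.number_of_edges():5d} "
          f"global efficiency={nx.global_efficiency(g):.2f}")
# Rewiring a handful of links long-range raises efficiency sharply at the
# same edge count, echoing the near-optimal trade-off described above.
```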
Figure 1.6
The left panel shows a brain network low in cost efficiency; the right panel shows a brain network high in cost efficiency; the middle panel shows the actual human brain in which there is moderate efficiency at moderate cost. Nodes are shown as orange circles.
From Bullmore and Sporns (2012). Reprinted with permission of Nature Reviews.
Figure 1.7
The organisation of the "rich club". It includes the precuneus, the superior frontal cortex, insular cortex, and superior parietal cortex. The figure also shows connections between rich club nodes, connections between non-rich club nodes (local connections), and connections between rich club and non-rich club nodes (feeder connections).
There is an important distinction between modules (small areas of
tightly clustered connections) and hubs (regions having large numbers
of connections to other regions). This is an efficient organisation as can
be seen by analogy to the world’s airports – the needs of passengers are
best met by having numerous local airports (modules) and relatively few
major hubs (e.g., Heathrow in London; Changi in Singapore; Los Angeles
International Airport).
Collin et al. (2014) argued the brain’s hubs are strongly interconnected
and used the term “rich club” to refer to this state of affairs. The organisation of the rich club is shown in Figure 1.7: it includes the precuneus,
the superior frontal cortex, insular cortex and superior parietal cortex.
The figure also shows connections between rich club nodes, connections
between non-rich club nodes (local connections), and connections between
rich club and non-rich club nodes (feeder connections).
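The rich-club idea can likewise be sketched on an artificial hub-dominated network (a toy example assuming networkx; a Barabási–Albert graph stands in for the brain and nothing here reproduces Collin et al.'s analysis):

```python
# Toy sketch: the rich-club coefficient phi(k) is the density of connections
# among nodes whose degree exceeds k. Rising values at high k suggest that
# hubs preferentially interconnect. Real analyses normalise phi(k) against
# degree-matched random networks.
import networkx as nx

g = nx.barabasi_albert_graph(200, 3, seed=1)   # artificial hub-dominated network
rc = nx.rich_club_coefficient(g, normalized=False)
for k in (3, 6, 9, 12):
    if k in rc:
        print(f"phi({k}) = {rc[k]:.2f}")       # density among ever-richer "clubs"
```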
What light does a focus on brain network organisation shed on individual differences in cognitive ability? Hilger et al. (2017) distinguished
between global efficiency (i.e., efficiency of the overall brain network) and
nodal efficiency (i.e., efficiency of specific hubs or nodes). Intelligence was
unrelated to global efficiency. However, it was positively associated with
the efficiency of two hubs or nodes: the anterior insula and dorsal anterior
cingulate cortex. The anterior insula is involved in the detection of task-relevant stimuli whereas the dorsal anterior cingulate cortex is involved in
performance monitoring.
Techniques for studying brain activity: introduction
Technological advances mean we have numerous exciting ways of obtaining
detailed information about the brain’s functioning and structure. In principle, we can work out where and when specific cognitive processes occur in
the brain. This allows us to determine the order in which different brain
areas become active when someone performs a task. It also allows us to discover the extent to which two tasks involve the same brain areas.
Information concerning the main techniques for studying brain activity
is contained in Table 1.2. Which technique is the best? There is no single
(or simple) answer. Each technique has its own strengths and limitations,
TABLE 1.2 MAJOR TECHNIQUES USED TO STUDY THE BRAIN

• Single-unit recording: This technique (also known as single-cell recording) involves inserting a micro-electrode 1/10,000th of a millimetre in diameter into the brain to study activity in single neurons. It is very sensitive: electrical charges of as little as one-millionth of a volt can be detected.
• Event-related potentials (ERPs): The same stimulus (or very similar ones) are presented repeatedly, and the pattern of electrical brain activity recorded by several scalp electrodes is averaged to produce a single waveform. This technique allows us to work out the timing of various cognitive processes very precisely but its spatial resolution is poor.
• Positron emission tomography (PET): This technique involves the detection of positrons (atomic particles emitted from some radioactive substances). PET has reasonable spatial resolution but poor temporal resolution and measures neural activity only indirectly.
• Functional magnetic resonance imaging (fMRI): This technique involves imaging blood oxygenation using a magnetic resonance imaging (MRI) machine (described on p. 19). fMRI has superior spatial and temporal resolution to PET, but also provides an indirect measure of neural activity.
• Event-related functional magnetic resonance imaging (efMRI): This "involves separating the elements of an experiment into discrete points in time, so that the cognitive processes (and associated brain responses) associated with each element can be analysed independently" (Huettel, 2012, p. 1152). Event-related fMRI is generally very informative and has become more popular recently.
• Magneto-encephalography (MEG): This technique involves measuring the magnetic fields produced by electrical brain activity. It provides fairly detailed information at the millisecond level about the time course of cognitive processes, and its spatial resolution is reasonably good.
• Transcranial magnetic stimulation (TMS): This is a technique in which a coil is placed close to the participant's head and a very brief pulse of current is run through it. This produces a short-lived magnetic field that generally (but not always) inhibits processing in the brain area affected. When the pulse is repeated several times in rapid succession, we have repetitive transcranial magnetic stimulation (rTMS). rTMS is used very widely. It has often been argued that TMS or rTMS causes a very brief "lesion". This technique has (jokingly!) been compared to hitting someone's brain with a hammer. More accurately, TMS often causes interference because the brain area to which it is applied is involved in task processing as well as the activity resulting from the TMS stimulation.
• Transcranial direct current stimulation (tDCS): A weak electric current is passed through a given brain area for some time. The electric charge flows from a positive site (an anode) to a negative one (a cathode). Anodal tDCS increases cortical excitability and generally enhances performance. In contrast, cathodal tDCS decreases cortical excitability and mostly impairs performance.

KEY TERMS

Single-unit recording
An invasive technique for studying brain function, permitting the study of activity in single neurons.

Event-related potentials (ERPs)
The pattern of electroencephalograph (EEG) activity obtained by averaging the brain responses to the same stimulus (or very similar stimuli) presented repeatedly.

Positron emission tomography (PET)
A brain-scanning technique based on the detection of positrons; it has reasonable spatial resolution but poor temporal resolution.

Functional magnetic resonance imaging (fMRI)
A technique based on imaging blood oxygenation using an MRI machine; it provides information about the location and time course of brain processes.

Event-related functional magnetic resonance imaging (efMRI)
This is a form of functional magnetic resonance imaging in which patterns of brain activity associated with specific events (e.g., correct vs incorrect responses on a memory test) are compared.

Magnetoencephalography (MEG)
A non-invasive brain-scanning technique based on recording the magnetic fields generated by brain activity; it has good spatial and temporal resolution.
and so experimenters match the technique to the research question. Of key
importance, these techniques vary in the precision with which they identify
the brain areas active when a task is performed (spatial resolution) and
the time course of such activation (temporal resolution). Thus, they differ
in their ability to provide precise information concerning where and when
brain activity occurs.
Figure 1.8
The spatial and temporal resolution of major techniques and methods used to study brain functioning.
From Ward (2006), adapted from Churchland & Sejnowski (1988).
Techniques for studying the brain: detailed analysis
We have introduced the main techniques for studying the brain. In what
follows, we consider them in more detail.
Single-unit recording
The single-unit (or single-cell) recording technique is more fine-grained than any
other technique (see Chapter 2). However, it is invasive and so rarely used
with humans. An interesting exception is a study by Quiroga et al. (2005) on
epileptic patients with implanted electrodes to identify the focus of seizure
onset (see Chapter 3). A neuron in the medial temporal lobe responded
strongly to pictures of Jennifer Aniston (the actor from Friends) but not
to pictures of other famous people. We need to interpret this finding carefully. Only a tiny fraction of the neurons in that brain area were studied
and it is highly improbable that none of the others would have responded
to Jennifer Aniston.
Event-related potentials
Electroencephalography (EEG) is based on recordings of electrical brain
activity measured at several locations on the surface of the scalp. Very
small changes in electrical activity within the brain are picked up by scalp
electrodes and can be seen on a computer screen. However, spontaneous or
background brain activity can obscure the impact of stimulus processing on
the EEG recording.
The answer to the above problem is to present the same stimulus (or
very similar stimuli) many times. After that, the segment of the EEG following each stimulus is extracted and lined up with respect to the time
of stimulus onset. These EEG segments are then averaged together to
KEY TERMS

Transcranial magnetic stimulation (TMS)
A technique in which magnetic pulses briefly disrupt the functioning of a given brain area. It is often claimed that it creates a short-lived "lesion". More accurately, TMS causes interference when the brain area to which it is applied is involved in task processing as well as activity produced by the applied stimulation.

Transcranial direct current stimulation (tDCS)
A technique in which a very weak electrical current is passed through an area of the brain (often for several minutes); anodal tDCS often enhances performance, whereas cathodal tDCS often impairs it.

Electroencephalography (EEG)
Recording the brain's electrical potentials through a series of scalp electrodes.
produce a single waveform. This method produces event-related potentials
from EEG recordings and allows us to distinguish the genuine effects of
stimulation from background brain activity.
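A minimal simulation (invented numbers, assuming NumPy) shows how this averaging recovers a tiny evoked response from much larger background activity:

```python
# Minimal sketch (simulated data): averaging EEG segments time-locked to
# stimulus onset. The evoked response is tiny relative to background EEG on
# any single trial, but averaging cancels the random background activity
# and leaves the event-related potential (ERP).
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(0, 0.6, 0.002)                        # 600 ms epoch sampled at 500 Hz
true_erp = 2e-6 * np.exp(-((t - 0.4) / 0.05) ** 2)  # toy "N400-like" peak (2 microvolts)
trials = true_erp + rng.normal(scale=10e-6, size=(200, t.size))  # 200 noisy epochs

average = trials.mean(axis=0)                       # the ERP waveform
print(f"peak of averaged waveform at {t[np.argmax(average)] * 1000:.0f} ms")
```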
ERPs have excellent temporal resolution, often indicating when a
given process occurred to within a few milliseconds. The ERP waveform
consists of a series of positive (P) and negative (N) peaks, each described
with reference to the time in milliseconds after stimulus onset. Thus, for
example, N400 is a negative wave peaking at about 400 ms.
Behavioural measures (e.g., reaction times) typically provide only a
single measure of time on each trial, whereas ERPs provide a continuous
measure. However, ERPs do not indicate with precision which brain regions
are most involved in processing, in part because skull and brain tissue
distort the brain’s electrical fields. In addition, ERPs are mainly of value
when stimuli are simple and the task involves basic processes (e.g., target
detection) triggered by task stimuli. Finally, we cannot study the most
complex forms of cognition (e.g., problem solving) with ERPs because the
processes by participants would typically change with increased practice.
Positron emission tomography (PET)
Positron emission tomography (PET) is based on the detection of
positrons – atomic particles emitted by some radioactive substances.
Radioactively labelled water (the tracer) is injected into the body and
rapidly gathers in the brain’s blood vessels. When part of the cortex
becomes active, the labelled water moves there rapidly. A scanning device
measures the positrons emitted from the radioactive water which leads to
pictures of the activity levels in different brain regions.
PET has reasonable spatial resolution in that any active brain area
can be located to within 5–10 mm. However, it has very poor temporal
resolution – PET scans indicate the amount of activity in any given brain
region over approximately 30 seconds. As a consequence, PET has now
largely been superseded by fMRI (see p. 19).
The magnetic resonance imaging (MRI) scanner has proved an extremely valuable source of data in psychology.
Juice Images/Alamy Stock Photo.
Magnetic resonance imaging (MRI and fMRI)
Magnetic resonance imaging (MRI) involves using an MRI scanner containing a very large magnet (weighing up to 11 tons). A strong magnetic
field causes an alignment of protons (subatomic particles) in the brain. A
brief radio-frequency pulse is applied, which causes the aligned protons to
spin and then regain their original orientations giving up a small amount of
energy as they do so. The brightest regions in the MRI are those emitting
the greatest energy. MRI scans can be obtained from numerous angles but
tell us only about brain structure rather than its functions.
Happily, MRI can also be used to provide functional information in the
form of functional magnetic resonance imaging (fMRI). Oxyhaemoglobin
is converted into deoxyhaemoglobin when neurons consume oxygen, and
deoxyhaemoglobin produces distortions in the local magnetic field (sorry
this is so complex!). This distortion is assessed by fMRI and provides a
measure of the concentration of deoxyhaemoglobin in the blood.
Technically, what is measured in fMRI is known as BOLD (blood
oxygen-level-dependent contrast). Changes in the BOLD signal produced
by increased neural activity take time, so the temporal resolution of fMRI
is 2 or 3 seconds. However, its spatial resolution is very good (approximately 1 mm). Thus, fMRI has superior temporal and spatial resolution
to PET.
Suppose we want to understand why people remember some items but
not others. We can use event-related fMRI (efMRI), in which we consider
KEY TERMS

BOLD
Blood oxygen-level-dependent contrast; this is the signal measured by fMRI.

Neural decoding
Using computer-based analyses of patterns of brain activity to work out which stimulus an individual is processing.
CAN COGNITIVE NEUROSCIENTISTS READ OUR BRAINS/MINDS?
There is much current interest in neural decoding – “determining what stimuli or mental states
are represented by an observed pattern of neural activity” (Tong & Pratte, 2012, p. 483). This
decoding involves complex computer-based analysis of individuals’ patterns of brain activity and
has sometimes been described as “brain reading” or “mind reading”.
Kay et al. (2008) obtained impressive findings using neural decoding techniques. Two participants viewed 1,750 natural images and brain-activation patterns were obtained using fMRI. Computer-based approaches then analysed these patterns. After that, the participants were presented with
120 previously unseen natural images and fMRI data were collected. These fMRI data permitted
correct identification of the image being viewed on 92% of the trials for one participant and 72%
for the other. This is remarkable since chance performance was only 0.8%!
Huth et al. (2016) used more complex stimuli. They presented observers with clips taken from
several movies including Star Trek and Pink Panther 2 while using fMRI. Decoding was reasonably successful in identifying general object categories (e.g., animal), specific object categories (e.g., canine) and various actions (e.g., talk; run) presented in the movie clips.
Research on neural decoding can enhance our understanding of human visual perception.
However, successful decoding of an object using the pattern of brain activation in a given brain
region does not necessarily mean that region is causally involved in observers’ identification of
that object. Several reasons why we need to be cautious when interpreting findings from neural
decoding studies are discussed by Popov et al. (2018). For example, some aspects of brain activity in response to visual stimuli are irrelevant to the observer’s perceptual representation. In an
experiment, computer analysis of brain activity in macaques successfully classified various stimuli
presented to them that the macaques themselves could not distinguish (Hung et al., 2005).
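The logic of these decoding analyses can be made concrete with a small sketch in Python. Everything below is invented for illustration (synthetic “voxel” patterns, arbitrary sizes); it is not the far more sophisticated method of Kay et al. (2008) or Huth et al. (2016). A classifier is trained on activity patterns evoked by known stimuli and must then identify which stimulus produced new, held-out patterns.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_stimuli, n_trials, n_voxels = 5, 40, 100

# Assume each stimulus evokes its own characteristic (but noisy) pattern
# of activity across voxels.
prototypes = rng.normal(size=(n_stimuli, n_voxels))
labels = np.repeat(np.arange(n_stimuli), n_trials)
patterns = prototypes[labels] + rng.normal(scale=2.0, size=(n_stimuli * n_trials, n_voxels))

# Train on half of the trials and decode the held-out half.
train = np.arange(len(labels)) % 2 == 0
decoder = LogisticRegression(max_iter=1000).fit(patterns[train], labels[train])
print(f"decoding accuracy: {decoder.score(patterns[~train], labels[~train]):.0%}")

With five stimuli, chance performance is 20%; accuracy well above that indicates the voxel patterns carry stimulus information, which is the essence of “brain reading”.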
Suppose we want to understand why people remember some items but not others. We can use event-related fMRI (efMRI), in which we consider each participant’s patterns of brain activation for remembered and forgotten items. Wagner et al. (1998) recorded fMRI while participants learned a list of words. There was more brain activity during learning for words subsequently recognised than those subsequently forgotten. These findings suggest forgotten words were processed less thoroughly than remembered words during learning.
Evaluation
fMRI is the dominant technique within cognitive neuroscience. Its value
has increased over the years with the introduction of more powerful
MRI scanners. Initially, most scanners had a field strength of 1.5 T but
recently scanners with field strengths of up to 7 T have become available. As a result, submillimetre spatial resolution is now possible (Turner,
2016).
What are fMRI’s limitations? The main ones are as follows:
(1) It has relatively poor temporal resolution. As a result, it is uninformative about the order in which different brain regions (and cognitive processes) are used during performance of a task. However, other techniques within cognitive neuroscience can be used in conjunction with fMRI in order to achieve good spatial and temporal resolution.

(2) It provides an indirect measure of underlying neural activity. “The BOLD signal primarily measures the input and processing of neural information . . . but not the output signal transmitted to other brain regions” (Shifferman, 2015, p. 60).

(3) Various complex processes are used by researchers to take account of the fact that all brains differ. This involves researchers changing the raw fMRI data and poses the danger that “BOLD-fMRI neuroimages represent mathematical constructs rather than physiological reality” (Shifferman, 2015).

(4) There are important constraints on the visual stimuli that can be presented to participants lying in the scanner and on the types of responses they can make. There can be particular problems with auditory stimuli because the scanner is noisy.
Magneto-encephalography
The electric currents that the brain generates are associated with a magnetic
field. This magnetic field is assessed by magneto-encephalography (MEG), which involves placing at least 200 recording devices on the scalp. This technique has very good
spatial and temporal resolution. However, it is extremely expensive and this
has limited its use.
Transcranial magnetic stimulation
Transcranial magnetic stimulation (TMS) is a technique in which a coil (often in the shape of a figure of eight) is placed close to the participant’s head. A very brief (under 1 ms) but large pulse of current is run through it. This creates a short-lived magnetic field generally leading to inhibited processing in the directly affected area (about 1 cc in extent). More specifically, the magnetic field created leads to electrical stimulation in the brain. In practice, several magnetic pulses are typically given in rapid succession – this is repetitive transcranial magnetic stimulation (rTMS). Most research has used rTMS but we will often simply use the more general term TMS.
Transcranial magnetic stimulation coil.
University of Durham/Simon Fraser/Science Photo Library.

What is an appropriate control condition against which to compare the effects of TMS or rTMS? We could compare task performance with and without TMS. However, TMS creates a loud noise and muscle twitching at the side of the forehead, and these effects might lead to impaired performance. Applying TMS to a non-critical brain area (irrelevant for task performance) often provides a suitable control condition. It is typically predicted that task performance will be worse when TMS is applied to a critical area rather than a non-critical one because it produces a temporary “lesion” or interference to the area targeted.
Evaluation
What are TMS’s strengths? First, it permits causal inferences – if TMS
applied to a particular brain area impairs task performance, we can infer
that brain area is necessary for task performance. Conversely, if TMS has no effect on task performance, this suggests the brain area affected is not essential for it. In that respect, it resembles cognitive neuropsychology.
Second, TMS research is more flexible than cognitive neuropsychology.
For example, we can compare any given individual’s performance with and
without a “lesion” with TMS. This is rarely possible with brain-damaged
patients.
Third, TMS research is also more flexible than cognitive neuropsychology because the researcher controls the brain area(s) affected. In addition,
the temporary “lesions” created by TMS typically cover a smaller brain
area than patients’ lesions. This is important because the smaller the brain area affected, the easier it generally is to interpret task-performance findings.
Fourth, with TMS research, we can ascertain when a brain area is
most activated. For example, the presentation of a visual stimulus leads
to processing proceeding rapidly to higher visual levels (feedforward processing). According to Lamme (2010), conscious visual perception typically
requires subsequent recurrent processing proceeding in the opposite direction from higher levels to lower ones. Koivisto et al. (2011) used TMS to
disrupt recurrent processing. As predicted, this impaired conscious visual
perception.
What are the limitations of TMS research? First, the effects of TMS
are complex and not fully understood. When TMS was first introduced,
it was assumed it would disrupt performance. That is indeed the most common finding. However, Luber and Lisanby (2014) reviewed 61 studies in which TMS enhanced performance speed and/or accuracy!
How can TMS enhance performance? It sometimes increases neural
activity in areas adjacent to the one stimulated and so increases their processing efficiency (Luber & Lisanby, 2014). There is also much evidence
for compensatory flexibility (Hartwigsen, 2018) – disruption to cognitive
processing within a given brain area caused by TMS is compensated for by
the recruitment of other brain areas.
Second, it is hard to establish the precise brain areas affected by TMS.
For example, adverse effects of TMS on performance might occur because
it interferes with communication between two brain areas at some distance from the stimulation point. This issue can be addressed by combining TMS with neuroimaging techniques to clarify its effects (Valero-Cabré
et al., 2017).
Third, TMS can only be applied to brain areas lying beneath the skull and not covered by muscle. That limits its overall usefulness.
Fourth, there are safety issues with TMS. It has very occasionally
caused seizures in participants despite stringent rules designed to ensure
their safety.
Transcranial direct current stimulation (tDCS)
As mentioned earlier, anodal tDCS increases cortical excitability whereas
cathodal tDCS decreases cortical excitability. The temporal and spatial resolution of tDCS is lower than that of TMS (Stagg & Nitsche, 2011). However, anodal
tDCS has a significant advantage over TMS in that it often facilitates or
enhances cognitive functioning. As a result, anodal tDCS is increasingly
used to reduce adverse effects of brain damage on cognitive functioning
(Stagg et al., 2018). Some of these beneficial effects on cognition are relatively long-lasting. Another advantage of tDCS is that it typically causes
little or no discomfort.
Much progress has been made in understanding the complex mechanisms associated with tDCS. However, “Knowledge about the physiological effects of tDCS is still not complete” (Stagg et al., 2018, p. 144).
Overall strengths
Cognitive neuroscience has contributed substantially to our understanding of human cognition. We discuss supporting evidence for that statement
throughout the book in areas that include perception, attention, learning,
memory, language comprehension, language production, problem solving,
reasoning, decision-making and consciousness. Here we identify its major
strengths.
First, cognitive neuroscience has helped to resolve theoretical controversies and issues that had proved intractable with purely behavioural
studies (Mather et al., 2013). The main reason is that cognitive neuroscience
adds considerably to the information available to researchers (Poldrack &
Yarkoni, 2016). Below we briefly consider two examples:
(1) Listeners hearing degraded speech find it much more intelligible when it is accompanied by visually presented words matching (rather than not matching) the auditory input. There has been much theoretical controversy concerning when this visually presented information influences speech perception (see Chapter 9). Does it occur early and so directly influence basic auditory processes or does it occur late (after basic auditory processing has finished)? Wild et al. (2012) found there was more activity in brain areas involved in early auditory processing when the visual input matched the auditory input. This strongly suggests visual information directly influences basic auditory processes.

(2) There has been much theoretical controversy as to whether visual imagery resembles visual perception (see Chapter 3). Behavioural evidence has proved inconclusive. However, neuroimaging has shown two-thirds of the brain areas activated during visual perception are also activated during visual imagery (Kosslyn, 2005). Even brain areas involved in the early stages of visual perception are often activated during visual imagery tasks (Kosslyn & Thompson, 2003). Thus, there are important similarities. Functional neuroimaging has also revealed important differences between the processes involved in visual perception and imagery (Dijkstra et al., 2017b).
KEY TERM

Functional specialisation
The assumption that each brain area or region is specialised for a specific function (e.g., colour processing; face processing).
Second, the incredible richness of neuroimaging data means cognitive neuroscientists can (at least in principle) construct theoretical models accurately
mimicking the complexities of brain functioning. In contrast, cognitive
neuropsychology, for example, is less flexible and more committed to the
notion of a modular brain organisation.
Third, over 10,000 fMRI studies within cognitive neuroscience have
been published. Many meta-analyses based on these studies have been
carried out to understand brain-cognition relationships (Poldrack &
Yarkoni, 2016). Such meta-analyses “provide highly robust estimates of the
neural correlates of relatively specific cognitive tasks” (p. 592). At present,
this approach is limited because we do not know the pattern of activation
associated with any given cognitive process – data are coded with respect
to particular tasks rather than underlying cognitive processes.
Fourth, neuroimaging data can often be re-analysed based on theoretical developments. For example, early neuroimaging research on
face processing suggested it occurs mostly within the fusiform face
area (see Chapter 3). However, the assumption that face processing
involves a network of brain regions provides a more accurate account
(Grill-Spector et al., 2017). Thus, cognitive neuroscience can be self-correcting.
More generally, cognitive neuroscience has shown the assumption of
functional specialisation (each brain area is specialised for a different function) is oversimplified. We can contrast the notions of functional specialisation and functional integration (positive correlations of various brain
areas within a network). For example, conscious perception depends on
coordinated activity across several brain regions and so involves functional
integration (see Chapter 16).
Overall limitations
We turn now to general issues raised by cognitive neuroscience. We emphasise fMRI research because that technique has been used most often.
First, many cognitive neuroscientists over-interpret their findings by assuming one-to-one links between cognitive processes and brain areas. For example, activation in a particular small brain region (a “blob”) was interpreted as revealing the “love area” and another small region was interpreted as being the “face processing area”. This approach has been described (unflatteringly) as “blobology”.
Blobology is in decline. However, there is still undue reliance on
reverse inference – the involvement of a given cognitive process is inferred
from activation within a given brain region. For example, face recognition
is typically associated with activation within the fusiform face area, which
led many researchers to identify that area as specifically involved in face
processing. This is incorrect in two ways (see Chapter 3): (1) the fusiform
face area is activated in response to many different kinds of objects as well
as faces (Downing et al., 2006); and (2) several other brain areas (e.g.,
occipital face area) are also activated during face processing.
Second, cognitive neuroscience is rarely used to test cognitive theories. For example, Tressoldi et al. (2012) reviewed 199 studies published
in 8 journals. They found 89% of these studies focused on localising the brain areas associated with cognitive processes and only 11% tested a theory
of cognition.
There is some validity to Tressoldi et al.’s (2012) argument. However,
it has become less persuasive because of the rapidly increasing emphasis
within cognitive neuroscience on theory testing.
Third, it is hard to bridge the divide between psychological processes
and concepts and patterns of brain activation. As Harley (2012) pointed
out, we may never find brain patterns corresponding closely to psychological processes such as “attention” or “planning”. Harley (2012, p. 1372)
concluded as follows: “Our language and thought may not divide up in the
way in which the brain implements these processes.”
Fourth, it is sometimes hard to replicate findings within cognitive neuroscience. For example, Uttal (2012) compared the findings from different
brain-imaging meta-analyses on a given cognitive function. More specifically, he identified the Brodmann areas associated with a given cognitive function in two meta-analyses. Uttal then worked out the Brodmann
areas activated in both meta-analyses and divided this by the total number
of Brodmann areas identified in at least one meta-analysis. If the two
meta-analyses were in total agreement, the resultant figure would be 100%.
The actual figure varied between 14% and 51%!
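Uttal’s agreement measure is simply the number of Brodmann areas reported by both meta-analyses divided by the number reported by at least one of them (the Jaccard index). A brief illustration in Python, with invented area numbers:

meta_a = {6, 9, 10, 24, 32, 46}   # areas activated in one meta-analysis
meta_b = {6, 9, 44, 45, 46}       # areas activated in the other

agreement = len(meta_a & meta_b) / len(meta_a | meta_b)
print(f"{agreement:.0%}")         # 3 shared / 8 total = 38%, within Uttal's 14-51% range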
Uttal’s (2012) findings seem devastating – after all, we might expect
meta-analyses based on numerous studies to provide very reliable evidence. However, many apparent discrepancies occurred mainly because
one meta-analysis reported activation in fewer brain areas than the other.
This often happened because of stricter criteria for deciding a given brain
area was activated (Klein, 2014).
Fifth, false-positive findings (i.e., mistakenly concluding that random
activity in a given brain area is task-relevant) are common (Yarkoni et
al., 2010; see discussion later on p. 25). There are several reasons for this.
For example, researchers have numerous options when deciding precisely
how to analyse their fMRI data (Poldrack et al., 2017). In addition, most
neuroimaging studies produce huge amounts of data and researchers
sometimes fail to adjust required significance levels appropriately.

KEY TERM

Reverse inference
As applied to functional neuroimaging, it involves arguing backwards from a pattern of brain activation to the presence of a given cognitive process.
Bennett et al. (2009) provided an example
of a false-positive finding. They asked their
participant to determine the emotions shown
in photographs. When they did not adjust
required significance levels, there was significant evidence of brain activation (see Figure
1.9). Amusingly, the participant was a dead
salmon so we can be certain the “finding”
was a false positive.
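The statistical point is easy to demonstrate. In the toy simulation below (my construction, not Bennett et al.’s analysis), 10,000 “voxels” contain pure noise, yet hundreds pass an uncorrected p < .05 threshold; adjusting the significance level (here with a simple Bonferroni correction) removes them.

import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)
noise = rng.normal(size=(10000, 20))    # 10,000 voxels x 20 scans, no real signal

p_values = ttest_1samp(noise, 0.0, axis=1).pvalue
print((p_values < 0.05).sum())          # roughly 500 falsely "active" voxels
print((p_values < 0.05 / 10000).sum())  # typically 0 after Bonferroni correction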
Figure 1.9
Areas showing greater activation in a dead salmon when presented with photographs of people than when at rest.
From Bennett et al. (2009). With kind permission of the authors.

Sixth, most brain-imaging techniques reveal only associations between patterns of brain activation and behaviour. Such associations are purely correlational and do not establish the brain regions activated are necessary for task performance. For example,
brain activation might also be caused by participants engaging in unnecessary monitoring of their performance or attending to non-task stimuli.
Seventh, many cognitive neuroscientists previously assumed most
brain activity is driven by environmental or task demands. As a result, we
might expect relatively large increases in brain activity in response to such
demands. That is not the case. The increased brain activity when someone
performs a cognitive task typically adds less than 5% to resting brain activity. Surprisingly, several brain areas exhibit decreased activity when a cognitive task is performed. Of importance here is the default mode network
(Raichle, 2015). It consists of an interconnected set of brain regions (including the ventral medial and dorsal medial prefrontal cortex, and the posterior cingulate cortex) more active during rest than during performance of
a task. Its functions include mind wandering, worrying and daydreaming.
The key point is that patterns of brain activity in response to any given
cognitive task reflect increased activity associated with task processing and
decreased activity associated with reduced activity within the default mode
network. Such complexities complicate the interpretation of neuroimaging
data.
Eighth, cognitive neuroscience shares with cognitive psychology problems of ecological validity (applicability to everyday life) and paradigm specificity (findings not generalising across paradigms). Indeed, the problem of ecological validity may be greater in cognitive neuroscience. For example, participants in fMRI studies lie on their backs in claustrophobic and noisy conditions and have very restricted movement – not much like everyday life! Gutchess and Park (2006) found recognition memory was significantly worse in an MRI scanner than in an ordinary laboratory. Presumably the scanner provided a more distracting or anxiety-creating environment.

KEY TERM

Default mode network
A network of brain regions that is active “by default” when an individual is not involved in a current task; it is associated with internal processes including mind wandering, remembering the past and imagining the future.

Ninth, we must avoid what Ali et al. (2014) termed “neuroenchantment” – exaggerating the importance of neuroimaging to our understanding of cognition. Ali et al. provided a striking example of neuroenchantment. College students were exposed to a crudely built mock brain scanner (including a discarded hair dryer!) (see Figure 1.10). They were asked to think about
the answers to various questions (e.g., name a country). The mock neuroimaging device apparently “read their minds” and worked out exactly what they were thinking. Amazingly, three-quarters of the student participants believed this was genuine rather than being due to the researcher’s trickery!

Figure 1.10
The primitive mock neuroimaging device used by Ali et al. (2014).
COMPUTATIONAL COGNITIVE SCIENCE
KEY TERMS
Computational
modelling
This involves constructing
computer programs that
simulate or mimic human
cognitive processes.
Artificial intelligence
This involves developing
computer programs
that produce intelligent
outcomes.
There is an important distinction between computational modelling and
artificial intelligence. Computational modelling involves programming
computers to model or mimic human cognitive functioning. Thus, cognitive modellers “have the goal of understanding the human mind through
computer simulation” (Taatgen et al., 2016, p. 1). In contrast, artificial
intelligence involves constructing computer systems producing intelligent
outcomes but typically in ways different from humans. Consider Deep Blue,
the IBM computer that defeated the world chess champion Garry Kasparov
on 11 May 1997. Deep Blue processed up to 200 million positions per
second, which is vastly more than human chess players (see Chapter 12).
The IBM computer Watson also shows the power of artificial intelligence. This computer competed on the American quiz show Jeopardy
against two of the most successful human contestants ever on that show:
Brad Rutter and Ken Jennings. The competition took place between
14 and 16 February 2011, and Watson won the $1 million first prize.
Watson had the advantage over Rutter and Jennings of having access to
10 million documents (200 million pages of content). However, Watson
had the disadvantage of being less sensitive to subtleties contained in the
questions.
The IBM Watson and two human contestants (Ken Jennings and Brad Rutter).
Ben Hider/Getty Images.

In the past (and even nowadays), many experimental cognitive psychologists expressed their theories in vague verbal statements (e.g., “Information from short-term memory is transferred to long-term memory”). This
made it hard to provide precise predictions from the theory and to decide
whether the evidence fitted the theory. As Murphy (2011) pointed out,
verbal theories provided theorists with undesirable “wiggle room”. In contrast, a computational model “requires the researchers to be explicit about
a theory in a way that a verbal theory does not” (Murphy, 2011, p. 300).
Implementing a theory as a program is a good way to check it contains no
hidden assumptions or imprecise terms. This often reveals that the theory
makes predictions the theorist had not realised!
There are issues concerning the relationship between the performance
of a computer program and human performance (Costello & Keane, 2000).
For example, a program’s speed doing a simulated task can be affected
by psychologically irrelevant features (e.g., the power of the computer).
Nevertheless, the various materials presented to the program should result
in differences in program operation time correlating closely with differences
in participants’ reaction times with the same materials.
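This check is easy to express in code. In the sketch below the operation times and reaction times are invented; the point is only the form of the comparison:

import numpy as np

model_time = np.array([120, 180, 150, 240, 200])  # model operation time per item
human_rt = np.array([510, 640, 580, 770, 690])    # mean human RT (ms) per item

r = np.corrcoef(model_time, human_rt)[0, 1]
print(f"model-human correlation: r = {r:.2f}")    # a high r supports the model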
Types of models
Most computational models focus on specific aspects of human cognition.
For example, there are successful computational models providing accounts
of reading words and non-words aloud (Coltheart et al., 2001; Perry et al.,
2007, 2014; Plaut et al., 1996) (see Chapter 9).
More ambitious computational models provide cognitive
architectures – “models of the fixed structure of the mind” (Rosenbloom
et al., 2017, p. 2). Approximately 300 cognitive architectures have been
proposed over the years (Kotseruba & Tsotsos, 2018).
KEY TERM

Cognitive architecture
Comprehensive framework for understanding human cognition in the form of a computer program.

Note that a cognitive architecture typically has to be supplemented with the knowledge required to perform a given task to produce a fully fledged computational model (Byrne, 2012). Anderson et al. (2004) proposed an especially influential cognitive architecture in their Adaptive Control of Thought-Rational (ACT-R) (discussed on p. 30).

KEY TERMS
Connectionist models
Models in computational
cognitive science
consisting of
interconnected networks
of simple units or
nodes; the networks
exhibit learning through
experience and specific
items of knowledge
are distributed across
numerous units.
Connectionism
Connectionist models (also called neural network models) typically
consist of interconnected networks of simple units (or nodes) that exhibit
learning.
KEY TERMS

Neural network models
Computational models in which processing involves the simultaneous activation of numerous interconnected nodes (basic units).

Nodes
The basic units within a neural network model.

Back-propagation
A learning mechanism in connectionist models based on comparing actual responses to correct ones.

Connectionist or neural network models use elementary units or nodes connected together in structures or layers (see Figure 1.11). First, a layer of input nodes codes the input. Second, activation caused by input coding spreads to a layer of hidden nodes. Third, activation spreads to a layer of output nodes.

Of major importance, the basic model shown in Figure 1.11 can learn simple additions. If input node 1 is active, output node 1 will also become active. If input nodes 1 and 2 are active, output node 3 will become active. If all input nodes are active, output node 10 will become active and so on. The model compares the actual output against the correct output. If there is a discrepancy, the model learns to adjust the weights of the connections between the nodes to produce the correct output. This is known as backward propagation of errors or back-propagation, and it allows the model to learn the appropriate responses without being explicitly programmed to do so.
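The network just described can be implemented in a few dozen lines. In the sketch below, the sizes, learning rate and number of training sweeps are all illustrative choices: four input nodes stand for the numbers 1–4, the correct output node is the sum of whichever inputs are active (0–10), and the connection weights are adjusted by back-propagation.

import numpy as np

rng = np.random.default_rng(0)

# All 16 patterns over four binary input nodes, and the sum each should produce.
inputs = np.array([[(p >> i) & 1 for i in range(4)] for p in range(16)], float)
targets = (inputs * np.array([1, 2, 3, 4])).sum(axis=1).astype(int)
one_hot = np.eye(11)[targets]                      # output nodes for sums 0-10

W1 = rng.normal(0, 0.5, (4, 16))                   # input -> hidden weights
W2 = rng.normal(0, 0.5, (16, 11))                  # hidden -> output weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for sweep in range(10000):
    hidden = sigmoid(inputs @ W1)                  # activation spreads forwards
    logits = hidden @ W2
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)

    # Back-propagation: compare actual output with correct output and pass
    # the discrepancy backwards to adjust the connection weights.
    d_logits = (probs - one_hot) / len(inputs)
    d_hidden = (d_logits @ W2.T) * hidden * (1.0 - hidden)
    W2 -= hidden.T @ d_logits
    W1 -= inputs.T @ d_hidden

accuracy = (probs.argmax(axis=1) == targets).mean()
print(f"sums learned correctly: {accuracy:.0%}")   # typically 100%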
Numerous connectionist models have been constructed using the basic architecture shown in Figure 1.11. Recent connectionist models often have several hidden layers (they are called deep neural networks). Connectionist
models involve distributed representations whereas other computational
models involve localist representations (Bowers, 2017a). In the former
case, each node or unit responds to multiple stimulus categories and so
a given word or object is represented by the pattern of activation across
many nodes or units.
In the latter case, each node or unit responds most actively to a single
meaningful stimulus category (e.g., a given word or object).
Interest in the connectionist approach was triggered initially by
Rumelhart et al. (1986) and McClelland et al. (1986) with their parallel
distributed processing models. These models (like most models based on
distributed representations) exhibit learning. Other influential connectionist
models are Plaut et al.’s (1996) reading model (Chapter 9) and McClelland and Elman’s (1986) TRACE model of spoken word recognition (Chapter 9).
In contrast, computational models based
on localist representations often do not exhibit
learning because the representations contain
all the required information. Examples of
such models in this book include Coltheart
et al.’s (2001) reading model (Chapter 9) and
the speech production models of Dell (1986)
and Levelt et al. (1999; see Chapter 11).
Figure 1.11
Architecture of a basic three-layer connectionist network.

It has often been assumed localist models are biologically implausible because they seem
to imply a single neuron responds to stimuli
from a given category. However, many neurons show considerable selectivity because they respond to only a very small fraction of stimuli. For
example, consider a study by Quiroga et al. (2005) mentioned earlier. They
presented epileptic patients with 100 images including famous individuals
and landmark buildings. On average, responsive neurons responded to
only approximately 3% of the images. The issues are complex. However, it
is not clear localist models are less biologically plausible than distributed
models (Bowers, 2017b).
Evaluation
There are many successful connectionist or neural network models (some
are mentioned above). Such models (discussed at various points in this
book) have provided valuable insights into human cognition and the models
have become increasingly sophisticated over time. A general strength of
neural networks is that they “exhibit robust flexibility in the face of the
challenges posed by the real world” (Garson, 2016, p. 3). That is one reason
why neural networks can perform numerous different kinds of cognitive
tasks. Finally, there are intriguing similarities between the brain, with its
numerous units (neurons) and synaptic connections, and neural networks,
with their units (nodes) and connections.
What are the limitations of connectionist models? First, there is an
issue with the common assumption that connectionist models are distributed. If two words are presented at the same time, this can lead to superimposing two patterns over the same units or nodes, making it hard (or
impossible) to decide which activated units or nodes belong to which word.
This causes superposition catastrophe (Bowers, 2017a).
Second, there are many examples of neural networks that can make
associations and match patterns. However, it has proved much harder to
develop neural networks that can learn general rules (Garson, 2016).
Third, the analogy between neural networks and the brain is very
limited. In essence, the latter is hugely more complex than the former.
Fourth, back-propagation implies learning will be slow, whereas
humans sometimes exhibit one-trial learning (Garson, 2016). Furthermore,
there is little or no evidence of back-propagation in the human brain
(Mayor et al., 2014).
KEY TERMS
Production systems
These consist of very
large numbers of “IF . . .
THEN” production rules
and a working memory
containing information.
Production rules
“IF . . . THEN” or
condition-action rules
in which the action is
carried out whenever the
appropriate condition is
present.
Working memory
A limited-capacity system
used in the processing
and brief holding of
information.
Production systems
Production systems consist of numerous “IF . . . THEN” production rules.
Production rules can take many forms. However, an everyday example is:
“If the green man is lit up, then cross the road.” There is also a working
memory (i.e., a system holding information currently being processed).
If information from the environment that “green man is lit up” reaches
working memory, it will match the IF part of the rule in long-term memory
and trigger the THEN part of the rule (i.e., cross the road).
Production systems vary but generally have the following characteristics:

● numerous IF . . . THEN rules;
● a working memory containing information;
● a production system that operates by matching the contents of working memory against the IF parts of the rules and then executing the THEN parts;
● if information in working memory matches the IF parts of two rules, a conflict-resolution strategy selects one (the sketch below illustrates this matching-and-selection cycle).
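Here is a minimal sketch of those characteristics in Python; the rules, the contents of working memory and the conflict-resolution strategy (prefer the most specific matching rule) are all invented for illustration:

# Each rule pairs an IF part (facts that must be in working memory)
# with a THEN part (the action to perform).
rules = [
    ({"green man lit up"}, "cross the road"),
    ({"red man lit up"}, "wait at the kerb"),
    ({"green man lit up", "car approaching"}, "wait at the kerb"),
]

working_memory = {"green man lit up", "car approaching"}

# Matching: find every rule whose IF part is satisfied by working memory.
matched = [(cond, act) for cond, act in rules if cond <= working_memory]

# Conflict resolution: two rules match here, so pick the rule with the
# most specific (largest) condition -- one common strategy.
condition, action = max(matched, key=lambda rule: len(rule[0]))
print(action)  # -> "wait at the kerb"

Both the first and third rules match on the IF side, so conflict resolution must choose between them; preferring the more specific rule makes the pedestrian wait for the approaching car.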
Adaptive Control of Thought-Rational (ACT-R) and beyond
As mentioned earlier, Anderson et al. (2004) proposed ACT-R, which was
subsequently developed (e.g., Anderson et al., 2008). ACT-R assumes the
cognitive system consists of several modules (relatively independent subsystems). It combines computational cognitive science with cognitive neuroscience by identifying the brain regions associated with each module (see
Figure 1.12). Four modules are of special importance:
(1) Retrieval module: it maintains the retrieval cues needed to access information; its proposed location is the inferior ventrolateral prefrontal cortex.

(2) Imaginal module: it transforms problem representations to assist in problem solving; it is located in the posterior parietal cortex.

(3) Goal module: it keeps track of an individual’s intentions and controls information processing; it is located in the anterior cingulate cortex.

(4) Procedural module: it uses production (IF . . . THEN) rules to determine what action will be taken next; it is located at the head of the caudate nucleus within the basal ganglia.
Each module has a buffer associated with it containing a limited amount of
important information. How is information from these buffers integrated?
According to Anderson et al. (2004, p. 1058): “A central production
system can detect patterns in these buffers and take co-ordinated action.”
Figure 1.12
The main modules of the ACT-R (Adaptive Control of Thought-Rational) cognitive architecture with their locations within the brain. Reprinted from Anderson et al. (2008), with permission of Elsevier.
If several productions could be triggered by the information contained
in the buffers, one is selected based on the value or gain associated with
each outcome plus the amount of time or cost incurred in achieving that
outcome.
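In classic ACT-R this selection rule takes the form of an expected utility, U = PG - C, where P is the probability the production achieves the goal, G is the value of the goal and C is the cost (time) of the production. A toy illustration with invented numbers:

G = 20.0                                       # value of achieving the goal
productions = {
    "retrieve-answer": {"P": 0.9, "C": 2.0},   # P: success probability, C: cost
    "guess":           {"P": 0.3, "C": 0.1},
}
utility = {name: v["P"] * G - v["C"] for name, v in productions.items()}
print(max(utility, key=utility.get))           # -> retrieve-answer (16.0 vs 5.9)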
ACT-R represents an impressive attempt to provide a theoretical
framework for understanding information processing and performance on
numerous cognitive tasks. It is also impressive in seeking to integrate computational cognitive science with cognitive neuroscience.
What are ACT-R’s limitations? First, it is very hard to test such a
wide-ranging theory. Second, areas of prefrontal cortex (e.g., dorsolateral prefrontal cortex) generally assumed to be of major importance in
cognition are de-emphasised. Third, as discussed earlier, research within
cognitive neuroscience increasingly reveals the importance to cognitive processing of brain networks rather than specific regions. Fourth, in common
with most other cognitive architectures, ACT-R has a knowledge base that
is substantially smaller than that possessed by humans (Lieto et al., 2018).
This reduces the applicability of ACT-R to human cognitive performance.
Standard model of the mind
Dozens of cognitive architectures have been proposed and it is difficult to
compare them. Laird et al. (2017; see also Rosenbloom et al., 2017) recently
proposed a standard model emphasising commonalities among major cognitive architectures including ACT-R (see Figure 1.13).
Figure 1.13 may look unimpressive because it represents the model at
a very general level. However, the model contains numerous additional
assumptions (Laird et al., 2017). First, procedural memory has special
importance because it has access to the whole of working memory; in contrast, the other modules have access only to specific aspects of working
memory.
Figure 1.13
The basic structure of the standard model involving five independent modules: declarative long-term memory, procedural long-term memory, working memory, perception and motor. Declarative memory (see Glossary) stores facts and events whereas procedural memory (see Glossary) stores knowledge about actions.
Second, the crucial assumption is that there is a cognitive cycle lasting
approximately 50 ms per cycle. What happens is that: “Procedural memory
induces the selection of a single deliberate act per cycle, which can modify
working memory, initiate the retrieval of knowledge from long-term declarative memory, initiate motor actions . . ., and provide top-down influence
to perception” (Laird et al., 2017, p. 3). The cognitive cycle involves serial
processing but parallel processing can occur within any module as well as
between them.
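A toy simulation conveys the flavour of the cycle; the task, the memory contents and the function below are invented rather than taken from Laird et al. (2017):

working_memory = {"goal": "name a capital", "cue": "France"}
declarative_memory = {"France": "Paris"}

def procedural_step(wm):
    # One deliberate act per cycle, chosen by matching the state of WM.
    if "answer" in wm:
        return ("motor", wm["answer"])            # speak the answer
    return ("retrieve", wm["cue"])                # else query declarative LTM

time_ms = 0
while True:
    act, arg = procedural_step(working_memory)
    time_ms += 50                                 # one ~50 ms cognitive cycle
    if act == "retrieve":
        working_memory["answer"] = declarative_memory[arg]
    else:
        print(f"say '{arg}' after {time_ms} ms")  # -> say 'Paris' after 100 ms
        break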
The standard model is useful but incomplete. For example, it does not
distinguish between different types of declarative memory (e.g., episodic
and semantic memory; see Chapter 7). In addition, it does not account for
emotional influences on cognitive processing.
Links with other approaches
ACT-R represents an impressive attempt to apply computational models to
cognitive neuroscience. There has also been interest in applying such models
to data from brain-damaged patients. Typically, the starting point is to
develop a computational model accounting for the performance of healthy
individuals on some task. After that, aspects of the computational model
or program are altered to simulate “lesions”, and the effects on task performance are assessed. Finally, the lesioned model’s performance is compared
against that of brain-damaged patients (Dell & Caramazza, 2008).
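A sketch of the lesioning step, under the simplifying assumption that “damage” means silencing a random proportion of a model’s connections (this is an illustration, not the procedure of any specific published study):

import numpy as np

rng = np.random.default_rng(1)
weights = rng.normal(size=(16, 11))   # connection weights of an intact model

def lesion(w, proportion):
    # Simulate damage by zeroing a random subset of connections.
    mask = rng.random(w.shape) >= proportion
    return w * mask

for p in (0.0, 0.2, 0.5):
    damaged = lesion(weights, p)
    print(f"lesion {p:.0%}: {np.count_nonzero(damaged)}/{weights.size} connections survive")
    # The lesioned weights would then be run on the task and the model's
    # errors compared with those of brain-damaged patients.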
Overall strengths
Computational cognitive science has several strengths. First, the development of cognitive architectures can provide an overarching framework for
understanding the cognitive system. This would be a valuable achievement
given that much research in cognitive psychology is limited in scope and
suffers from paradigm specificity. Laird et al.’s (2017) standard model represents an important step on the way to that achievement.
Second, the scope of computational cognitive science has increased.
Initially, it was applied mainly to behavioural data. More recently, however,
it has been applied to functional neuroimaging data (e.g., Anderson et al.,
2004) and EEG data (Anderson et al., 2016a). Why is this important? As
Taatgen et al. (2016, p. 3) pointed out: “The link to neuroimaging is critical
in establishing that the hypothesised processing steps in cognitive models
have plausibility in reality.”
Third, rigorous thinking is required to develop computational models
because computer programs must contain detailed information about the
processes involved in performing any given task. In contrast, many theories within traditional cognitive psychology are vaguely expressed and the
predictions following from their assumptions are unclear.
Fourth, progress is increasingly made by using nested incremental modelling. In essence, a new model builds on the strengths of previous related
models while eliminating (or reducing) their weaknesses and accounting for
additional data. For example, Perry et al. (2007; Chapter 9) put forward a
connectionist dual-process model (CDP+) of reading aloud that improved
on the model on which it was based.
Overall limitations
What are the main limitations of the computational cognitive science
approach? First, there is Bonini’s paradox: as models become more accurate and complete, they can become as hard to understand as the complex
phenomena they are designed to explain. Conversely, models easy to understand are typically inaccurate and incomplete. Many computational modellers have responded to this paradox by focusing on the essence of the
phenomena and ignoring the minor details (Milkowski, 2016, p. 1459).
Second, many computational models are hard to falsify. The ingenuity
of computational modellers means many models can account for numerous
behavioural findings (Taatgen et al., 2016). This issue can be addressed by
requiring computational models to explain neuroimaging findings as well
as behavioural ones.
Third, some computational models are less successful than they appear.
One reason is overfitting in which a model accounts for noise in the data
as well as genuine effects (Ziegler et al., 2010). Overfitting often means a
model seems to account very well for a given data set but is poor at predicting new data (Yarkoni & Westfall, 2017).
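A standard toy demonstration (not taken from the studies cited) makes the point: fit the same noisy data with a simple and a highly flexible model, then test both on a fresh sample.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y_fit = 2 * x + rng.normal(scale=0.3, size=10)  # data used to fit the models
y_new = 2 * x + rng.normal(scale=0.3, size=10)  # fresh data, same true effect

for degree in (1, 9):                           # simple vs over-flexible model
    coeffs = np.polyfit(x, y_fit, degree)
    err_fit = np.mean((np.polyval(coeffs, x) - y_fit) ** 2)
    err_new = np.mean((np.polyval(coeffs, x) - y_new) ** 2)
    print(degree, round(err_fit, 3), round(err_new, 3))
# The degree-9 model fits its own sample almost perfectly (it captures the
# noise) but typically predicts the new sample worse than the straight line.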
Fourth, most computational models ignore motivational and emotional factors. Norman (1980) distinguished between a cognitive system
(the Pure Cognitive System) and a biological system (the Regulatory
System). Computational cognitive science typically de-emphasises the
Regulatory System even though it often strongly influences the Pure
Cognitive System. This issue can be addressed by developing computational models indicating how emotions modulate cognitive processes
(Rodriguez et al., 2016).
Fifth, many computational models are very hard to understand
(Taatgen et al., 2016). Why is this so? Addyman and French (2012, p. 332)
identified several reasons:
Everyone still programs in [their] own favourite programming language, source code is rarely made available, . . . even for other modellers, the profusion of source code in a multitude of programming
languages, writing without programming guidelines, makes it almost
impossible to access, check, explore, re-use or continue to develop
[models].
Computational modellers often fail to share their source codes and models
because they have perceived ownership over their own research and are
concerned about losing control over it (Fecher et al., 2015).
COMPARISONS OF MAJOR APPROACHES
We have discussed the major approaches to human cognition at length and
you may wonder which is the most useful and informative. However, that is
not the best way of thinking about the issues for various reasons:
(1) An increasing amount of research involves two or more different approaches.

(2) Each approach makes its own distinctive contribution and so all are required. By analogy, it would be pointless asking whether a driver is more or less useful than a putter for a golfer – both are essential.

(3) Each approach has its own limitations as well as strengths (see Table 1.3). The optimal solution in such circumstances is to use converging operations – several different research methods are used to address a given theoretical issue, with the strengths of one method balancing out the limitations of other methods. If different methods produce the same answer, that provides stronger evidence than could be obtained using a single method. If different methods produce different answers, further research is required to clarify matters. Note that using converging operations is more difficult and demanding than using a single approach (Brase, 2014).

KEY TERMS

Converging operations
An approach in which several methods with different strengths and limitations are used to address a given issue.

Replication
The ability to repeat a previous experiment and obtain the same (or similar) findings.
In writing this book, our coverage of each topic emphasises research most
enhancing our understanding. As a result, any given approach (e.g., cognitive neuroscience; cognitive neuropsychology) is strongly represented when
we discuss some topics but is much less well represented with other topics.
IS THERE A REPLICATION CRISIS?
Replication (the ability to repeat the findings of previous research using the
same or similar experimental methods) is of central importance to psychology (including cognitive psychology). In recent years, however, there have
been concerns about the extent to which findings can be replicated, leading
Shrout and Rodgers (2018, p. 487) to refer to “the replication crisis” in
psychology.
An important trigger for these concerns was an influential article in
which the replicability of 100 studies published in leading psychology
journals was assessed (Open Science Collaboration, 2015). Only 36% of
the findings reported in these studies were replicated. Within cognitive
psychology, only 21 out of 42 findings (50%) were replicated.
Only 50% of findings were reproduced! This is perhaps less problematical than it sounds. The complexities of research mean individual studies
can provide only an estimate of the “true” state of affairs rather than
definitive evidence (Stanley & Spence, 2014).
Why are there problems with replicating findings in cognitive psychology (and psychology generally)? A major reason is the sheer complexity of
human cognition – cognitive processing and performance are influenced by
numerous factors (many of which are not controlled or manipulated). As a
result, even replicated findings often differ considerably in terms of the size
of the effects obtained (Stanley et al., 2018).
Another reason is that experimenters sometimes use questionable
research practices exaggerating the true statistical significance of their data.
One example is p-hacking (selective reporting), in which “Researchers
conduct many analyses on the same data set and just report those that are
statistically significant” (Simmons et al., 2018, p. 255). Another example
involves researchers proposing hypotheses after research results are known
rather than before as should be the case (Shrout & Rodgers, 2018).
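The danger is easily quantified: if each analysis has a 5% chance of yielding a spuriously significant result, the chance of obtaining at least one such result across k independent analyses is 1 - 0.95^k.

for k in (1, 5, 20):
    print(k, round(1 - 0.95 ** k, 2))   # 1: 0.05, 5: 0.23, 20: 0.64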
TABLE 1.3 STRENGTHS AND LIMITATIONS OF MAJOR APPROACHES TO HUMAN COGNITION

Experimental cognitive psychology

Strengths
1. The first systematic approach to understanding human cognition
2. The source of most theories and tasks used by the other approaches
3. It is enormously flexible and can be applied to any aspect of cognition
4. It has produced numerous important replicated findings
5. It has strongly influenced social, clinical and developmental psychology

Limitations
1. Most cognitive tasks are complex and involve many different processes
2. Behavioural evidence provides indirect evidence concerning internal processes
3. Theories are sometimes vague and hard to test empirically
4. Findings sometimes do not generalise because of paradigm specificity
5. There is a lack of an overarching theoretical framework

Cognitive neuropsychology

Strengths
1. Double dissociations have provided strong evidence for various processing modules
2. Causal links can be shown between brain damage and cognitive performance
3. It has revealed unexpected complexities in cognition (e.g., language)
4. It transformed memory and language research
5. It straddles the divide between cognitive psychology and cognitive neuroscience

Limitations
1. Patients may develop major compensatory strategies not found in healthy individuals
2. Most of its theoretical assumptions (e.g., the mind is modular) seem too extreme
3. Detailed cognitive processes and their interconnectedness are often not specified
4. There has been excessive reliance on single-case studies
5. Brain plasticity complicates interpreting findings

Cognitive neuroscience: functional neuroimaging + ERPs + TMS

Strengths
1. Great variety of techniques offering excellent temporal or spatial resolution
2. Functional specialisation and brain integration can be studied
3. TMS is flexible and permits causal inferences
4. Rich data permit assessment of integrated brain processing as well as specialised functioning
5. Resolution of complex theoretical issues

Limitations
1. Functional neuroimaging techniques provide essentially correlational data
2. Much over-interpretation of data involving reverse inferences
3. There are many false positives and replication failures
4. It has generated very few new theories
5. Difficulty in relating brain activity to psychological processes

Computational cognitive science

Strengths
1. Theoretical assumptions are spelled out with precision
2. Comprehensive cognitive architectures have been developed
3. Computational models are increasingly used to model brain damage
4. Computational cognitive neuroscience is increasingly used to model patterns of brain activity
5. The emphasis on parallel processing fits well with functional neuroimaging data

Limitations
1. Many computational models do not make new predictions
2. There is some overfitting, which restricts generalisation to other data sets
3. It is sometimes hard to falsify computational models
4. Most computational models de-emphasise motivational and emotional factors
5. Researchers’ reluctance to share source codes and models inhibits progress
KEY TERM

Meta-analysis
A form of statistical analysis based on combining the findings from numerous studies on a given research topic.

One answer to problems with replicability is to use meta-analysis, in which findings from many studies are combined and integrated using various statistical techniques. This approach has the advantage of not exaggerating the importance of any single study, but has various potential problems (Sharpe, 1997):

(1) The “apples and oranges” problem: very different studies are often included within a meta-analysis.

(2) The “file drawer” problem: it is hard for researchers to publish non-significant findings. Since meta-analyses often ignore unpublished findings, the studies included may be unrepresentative.

(3) The “garbage in – garbage out” problem: poorly designed and conducted studies are often included along with high-quality ones.
The above problems can be addressed. Precise criteria for studies to be
included can reduce the first and third problems. The second problem can
be reduced by asking researchers to provide relevant unpublished data.
Watt and Kennedy (2017) identified a more important problem: many
researchers use somewhat subjective criteria for inclusion of studies in
meta-analyses, favouring those supporting their theoretical position and
rejecting those that do not. This creates confirmation bias (see Chapter 14).
The good news is that experimental psychologists (including cognitive psychologists) have responded very positively to the above problems.
More specifically, there has been a large increase in disclosure and pre-registration. Disclosure means that researchers “disclose all of their measures, manipulations, and exclusions” (Nelson et al., 2018, p. 518). In the
case of meta-analyses, it means researchers making available all the findings initially considered for inclusion. That would allow other researchers to conduct their own meta-analyses and check whether the outcome
remains the same.
Pre-registration involves researchers making publicly available all
decisions about sample size, hypotheses, statistical analyses and so on
before an experimental study is carried out. With respect to meta-analyses,
pre-registration involves making public the inclusion criteria for a metaanalysis, the methods of analysis and so on, before the findings of included
studies are known (Hamlin, 2017).
In sum, there are some genuine issues concerning the replicability of
findings within cognitive psychology. However, there are various reasons
why there is no “replication crisis”. First, numerous important findings
in cognitive psychology have been replicated dozens or even hundreds
of times as is clear from a very large number of meta-analytic reviews.
Second, as Nelson et al. (2018, p. 511) pointed out: “The scientific practices
of experimental psychologists have improved dramatically.”
OUTLINE OF THIS BOOK
One problem with writing a textbook of cognitive psychology is that virtually all the processes and systems in the cognitive system are interdependent. Consider, for example, a student reading a book to prepare for an
examination. The student is learning, but several other processes are going
on as well. Visual perception is involved in the intake of information from
the printed page, and there is attention to the content of the book.
In order for the student to benefit from the book, they must possess
considerable language skill, and must have extensive relevant knowledge
stored in long-term memory. There may be an element of problem solving
in the student’s attempts to relate the book’s content to the possibly conflicting information they have learned elsewhere. Decision-making may also
be involved when the student decides how much time to devote to each
chapter.
In addition, what the student learns depends on their emotional state.
Finally, the acid test of whether the student’s learning has been effective
comes during the examination itself, when the material from the book
must be retrieved and consciously evaluated to decide its relevance to the
examination question.
The words italicised in the previous three paragraphs indicate major
aspects of human cognition and form the basis of our coverage. In view
of the interdependence of all aspects of the cognitive system, we emphasise how each process (e.g., perception) depends on other processes and
structures (e.g., attention, long-term memory). This should aid the task of
understanding the complexities of human cognition.
CHAPTER SUMMARY
• Introduction. Cognitive psychology used to be unified by an
approach based on an analogy between the mind and the
computer. This information-processing approach viewed the mind
as a general-purpose, symbol-processing system of limited capacity.
Today there are four main approaches to human cognition:
experimental cognitive psychology; cognitive neuroscience;
cognitive neuropsychology; and computational cognitive science.
These four approaches are increasingly combined to provide an
enriched understanding of human cognition.
• Cognitive psychology. Cognitive psychology focuses on internal
mental processes whereas behaviourism focused mostly on
observable stimuli and responses. Cognitive psychologists assume
top-down and bottom-up processes are both involved in the
performance of cognitive tasks. These processes can be serial
or parallel. Various methods (e.g., latent-variable analysis) have
been used to address the task impurity problem and to identify
the processes within cognitive tasks. Cognitive psychology has
massively influenced theorising and the tasks used across all
major approaches to human cognition. In spite of its enormous
contributions, cognitive psychology sometimes lacks ecological
validity, suffers from paradigm specificity and is theoretically vague.
• Cognitive neuropsychology. Cognitive neuropsychology is
based on various assumptions including modularity, anatomical
modularity, uniformity of functional architecture and subtractivity.
Double dissociations provide reasonable (limited) evidence
for separate modules or systems. The case-study approach
is more informative than the single-case approach. Cognitive
neuropsychology is limited for several reasons: its assumptions are
mostly too strong; patients can develop compensatory strategies;
there is brain plasticity; brain damage can cause widespread
reduced connectivity within the brain; and it underestimates the
extent of integrated brain functioning.
• Cognitive neuroscience: the brain in action. Cognitive
neuroscientists study the brain as well as behaviour using
techniques varying in spatial and temporal resolution. Functional
neuroimaging techniques provide basically correlational evidence,
but transcranial magnetic stimulation (TMS) can indicate a given
brain area is necessarily involved in a given cognitive function. The
richness of the data obtained from neuroimaging studies permits
the assessment of functional specialisation and brain integration.
Cognitive neuroscience is a flexible and potentially self-correcting
approach. However, correlational findings are sometimes over-interpreted, underpowered studies make replication difficult, and
relatively few studies in cognitive neuroscience generate (or even
test) cognitive theories.
• Computational cognitive science. Computational cognitive
scientists develop computational models to understand human
cognition. Connectionist networks use elementary units or nodes
connected together. They can learn using rules such as backward
propagation. Production systems consist of production or “IF . . .
THEN” rules. ACT-R is a highly developed model based on
production systems. Computational models have increased in
scope to provide detailed theoretical accounts of findings from
cognitive neuroscience and cognitive neuropsychology. They have
shown progress via the use of nested incremental modelling.
Computational models are often hard to falsify, de-emphasise
motivational and emotional factors, and often lack biological
plausibility.
• Comparisons of different approaches. The major approaches
are increasingly used in combination. Each approach has its own
strengths and limitations, which makes it useful to use converging
operations. When two approaches produce the same findings, this
is stronger evidence than can be obtained from a single approach
on its own. If two approaches produce different findings, this
indicates further research is needed to clarify what is happening.
• Is there a replication crisis? There is increasing evidence that
many findings in psychology (including cognitive psychology)
are hard to replicate. However, this does not mean there is a
replication crisis. Meta-analyses indicate that numerous findings
have been successfully replicated many times. In addition,
experimental research practices have improved considerably in
recent years, which should increase successful replications in the
future.
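To make the production-system idea in the summary concrete, here is a minimal sketch (Python; the rule format, working-memory layout and names are illustrative inventions, not ACT-R's actual notation) of a recognise-act cycle that matches “IF . . . THEN” rules against working memory:

FACTS = {(2, 3): 5, (4, 4): 8}   # stored arithmetic facts ("declarative memory")

rules = [
    # IF the goal is to add two digits whose sum is a known fact,
    # THEN retrieve the sum and mark the goal as done.
    {"if": lambda wm: wm.get("goal") == "add" and (wm["a"], wm["b"]) in FACTS,
     "then": lambda wm: wm.update(answer=FACTS[(wm["a"], wm["b"])], goal="done")},
]

def cycle(wm):
    """One recognise-act cycle: fire the first rule whose IF-part matches
    the current contents of working memory (wm)."""
    for rule in rules:
        if rule["if"](wm):
            rule["then"](wm)
            return True
    return False   # no rule matched, so processing halts

wm = {"goal": "add", "a": 2, "b": 3}
while cycle(wm):
    pass
print(wm["answer"])   # -> 5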
FURTHER READING
Hartwigsen, G. (2018). Flexible redistribution in cognitive networks. Trends in
Cognitive Sciences, 22, 687–698. Gesa Hartwigsen discusses several compensatory
strategies used by brain-damaged patients and healthy individuals administered
transcranial magnetic stimulation (TMS).
Laird, J.E., Lebiere, C. & Rosenbloom, P.S. (2017). A standard model of the mind:
Toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics. AI Magazine, 38, 1–19. These authors
proposed a standard model based on the commonalities found among different
proposed cognitive architectures (e.g., ACT-R; Soar).
Passingham, R. (2016). Cognitive Neuroscience: A Very Short Introduction. Oxford:
Oxford University Press. Richard Passingham provides an accessible account of
the essential features of cognitive neuroscience.
Poldrack, R.A., Baker, C.I., Durnez, J., Gorgolewski, K.J., Matthews, P.M.,
Munafo, M.R., et al. (2017). Scanning the horizon: Towards transparent and
reproducible neuroimaging research. Nature Reviews Neuroscience, 18, 115–126.
Problems with research in cognitive neuroscience are discussed and proposals for
enhancing the quality and replicability of such research are put forward.
Shallice, T. (2015). Cognitive neuropsychology and its vicissitudes: The fate of
Caramazza’s axioms. Cognitive Neuropsychology, 32, 385–411. Tim Shallice discusses strengths and limitations of various experimental approaches within cognitive neuropsychology.
Shrout, P.E. & Rodgers, J.L. (2018). Psychology, science, and knowledge construction: Broadening perspectives from the replication crisis. Annual Review of
Psychology, 69, 487–510. Patrick Shrout and Joseph Rodgers discuss the numerous ways in which improving research practices are reducing replication problems.
Taatgen, N.A., van Vugt, M.K., Borst, J.P. & Mehlhorn, K. (2016). Cognitive modelling at ICCM: State of the art and future directions. Topics in Cognitive Science,
8, 259–263. Niels Taatgen and his colleagues discuss systematic improvements in
computational cognitive models.
Ward, J. (2015). The Student’s Guide to Cognitive Neuroscience (3rd edn). Hove,
UK: Psychology Press. The first five chapters of this textbook provide detailed
information about the main techniques used by cognitive neuroscientists.
PART I
Visual perception and attention

What is “perception”? According to Twedt and Parfitt (2018, p. 1),
“Perception is the study of how sensory information is processed into perceptual experience . . . all senses share the common goal of picking up
sensory information from the external environment and processing that
information into a perceptual experience.”
Our main emphasis in this section of the book is on visual perception,
which is of enormous importance in our everyday lives. It allows us to move
around freely, to see other people, to read magazines and books, to admire
the wonders of nature, to play sports and to watch movies and television. It
also helps to ensure our survival. If we misperceive how close cars are to us
as we cross the road, the consequences could be fatal. Unsurprisingly, far
more of the cortex (especially the occipital lobes at the back of the head) is
devoted to vision than to any other sensory modality.
Visual perception seems so simple and effortless, we typically take it for
granted. In fact, however, it is very complex, and numerous processes
transform and interpret sensory information. Relevant evidence comes
from researchers in artificial intelligence who tried to program computers
to “perceive” the environment. In spite of their best efforts, no computer
can match more than a fraction of the skills of visual perception that we
possess. For example, humans are much better than computer programs at deciphering the distorted interconnected characters (commonly known as CAPTCHAs) used to control access to many websites.
There is a rapidly growing literature on visual perception (especially from
the cognitive neuroscience perspective). The next three chapters provide
detailed coverage of the main issues. Chapter 2 focuses on basic processes
in visual perception with an emphasis on the great advances made in understanding the underlying brain mechanisms. Of importance, we see in this
chapter that the processes leading to object recognition differ from those
guiding vision for action. Finally, this chapter discusses important aspects
of visual perception (e.g., colour perception; perception without awareness;
depth perception).
Chapter 3 focuses on the processing underlying our ability to identify
objects in the world around us. Initially, we discuss perceptual organisation
and how we decide which parts of the visual input belong together and so
form an object. We then move on to theories of object recognition including a discussion of the relevant behavioural and neuroscientific evidence.
Are the same recognition processes involved across all types of objects?
This issue remains controversial. However, most experts agree that face
recognition differs in important ways from the recognition of most other
objects. Accordingly, face recognition is discussed separately from the recognition of other objects.
The final part of Chapter 3 is concerned with another controversial issue –
whether the main processes involved in visual imagery are the same as
those involved in visual perception. As you will see, it is arguable that this
controversy has largely been resolved (turn to Chapter 3 to find out how!).
The central focus of Chapter 4 is on how we process a constantly changing
environment and manage to respond appropriately to those changes. Of
major importance is our ability to predict the speed and direction of objects
and to move towards our goal whether walking or driving. Other topics discussed in Chapter 4 are our ability to reach for (and grasp) objects and our
ability to make sense of other people’s movements.
There are major links between visual perception and attention. The final
topic in Chapter 4 is concerned with the notion that we may need to attend
to an object to perceive it consciously. Attentional failures can prevent us
from noticing changes in objects or the presence of an unexpected object.
However, failures to notice changes in objects also depend on the limitations of peripheral vision.
Issues relating directly to attention are discussed thoroughly in Chapter 5.
This chapter starts with the processes involved in focused attention in the
visual and auditory modalities. We next consider how we use visual processes when engaged in the everyday task of searching for some object
(e.g., a pair of socks in a drawer). We then consider research on disorders
of visual attention in brain-damaged individuals, research that has greatly
increased our understanding of visual attention in healthy individuals. After
that, we discuss the factors determining the extent to which we can do two
things at once (i.e., multi-tasking). This involves a consideration of the role
played by “automatic” processes.
In sum, the area spanning visual perception and attention is among the most exciting and important within cognitive psychology and cognitive neuroscience. Tremendous progress has been made in unravelling the complexities of perception and attention over the past decade. The choicest fruits of
that endeavour are set before you in the four chapters forming this section
of the book.
Chapter 2
Basic processes in visual perception
INTRODUCTION
Considerable progress has been made in understanding visual perception in recent years. Much of this progress is due to cognitive neuroscientists, thanks to whom we now have a good knowledge of the visual brain.
Initially, we consider the main brain areas involved in vision and their functions. Then we discuss theories of brain systems in vision, followed by a
detailed analysis of basic aspects of visual perception (e.g., colour perception; depth perception). Finally, we consider whether perception can occur
without conscious awareness.
The specific processes we use in visual perception depend on what we
are looking at and on our perceptual goals (i.e., what we are looking for)
(Hegdé, 2008).

Figure 2.1
Complex scene that requires prolonged perceptual processing to understand fully. Study the picture and identify the animals within it. Reprinted from Hegdé (2008). Reprinted with permission of Elsevier.

On the one hand, we can sometimes perceive the gist of a natural scene extremely rapidly (Thorpe et al., 1996). Observers saw photographs containing (or not containing) an animal for only 20 ms. Event-related potentials (ERPs: see Glossary) indicated the presence of an animal
was detected within about 150 ms. On the other hand, look at the photograph in Figure 2.1 and decide how many animals are present. It probably
took you several seconds to perform this task. Bear in mind the diversity of
visual perception as you read this and the following two chapters.
KEY TERMS
Retinal ganglion cells
Retinal cells providing the output signal from the retina.
Retinotopy
The notion that there is a mapping between receptor cells in the retina and points on the surface of the visual cortex.
Interactive feature: Primal Pictures’ 3D atlas of the brain
VISION AND THE BRAIN
In this section we consider the brain systems involved in visual perception.
Visual processing occurs in at least 30 distinct brain areas (Felleman & Van
Essen, 1991). The visual cortex consists of the entire occipital cortex at the
back of the brain and also extends well into the temporal and parietal lobes.
To understand visual processing in the brain fully, however, we first need to
consider briefly what happens between the eye and the cortex.
From eye to cortex
There are two types of visual receptor cells in the retina: cones and rods.
Cones are used for colour vision and sharpness of vision (see section on
colour vision, pp. 64–71). Patients with rod monochromatism have no
detectable cone function, resulting in total colour blindness (Tsang &
Sharma, 2018).
There are 125 million rods concentrated in the outer regions of the
retina. Rods are specialised for vision in dim light. Many differences
between cones and rods stem from the fact that a retinal ganglion cell
receives input from only a few cones but from hundreds of rods. Thus,
only rods produce much activity in retinal ganglion cells in poor lighting
conditions.
The main pathway between the eye and the cortex is the retina-­
geniculate-striate pathway. It transmits information from the retina to
V1 and then V2 (both discussed shortly) via the lateral geniculate nuclei
(LGNs) of the thalamus. The entire retina-geniculate-striate system is
organised similarly to the retinal system. For example, two stimuli adjacent to each other in the retinal image will also be adjacent at higher levels
within that system. The technical term is retinotopy: retinal receptor cells are
mapped to points on the surface of the visual cortex.
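Retinotopy can be made concrete with a toy calculation. The sketch below (Python; the complex-logarithm mapping is one simple illustrative model from the vision literature, not the book's own formula) shows that two points adjacent in the retinal image remain adjacent in a schematic cortical map:

import numpy as np

def cortical_position(x, y, a=0.5):
    """Map a retinal point (x, y) to a schematic cortical position.
    Adjacent retinal points map to adjacent cortical points."""
    w = np.log(complex(x, y) + a)   # the offset a keeps the fovea finite
    return w.real, w.imag

# Two stimuli adjacent in the retinal image ...
p1 = cortical_position(1.0, 0.0)
p2 = cortical_position(1.1, 0.0)
# ... remain adjacent in the cortical map (the distance is small).
print(np.hypot(p1[0] - p2[0], p1[1] - p2[1]))   # ~0.06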
Each eye has its own optic nerve and the two optic nerves meet at
the optic chiasm. At this point the axons from the outer halves of each
retina proceed to the brain hemisphere on the same
side, whereas those from the inner halves cross over and go to the other
hemisphere. As a result, each side of visual space is represented within the
opposite brain hemisphere. Signals then proceed along two optic tracts
within the brain. One tract contains signals from the left half of each eye
and the other signals from the right half (see Figure 2.2).
After the optic chiasm, each optic tract proceeds to the lateral geniculate nucleus, which is part of the thalamus. Nerve impulses then travel along the optic radiation, finally reaching V1 in primary visual cortex within the occipital lobe at the back of the head before spreading out to nearby visual cortical areas such as V2.
Figure 2.2
Route of visual signals. Note that signals reaching the left visual cortex come from the left sides of the two retinas, and signals reaching the right visual cortex come from the right sides of the two retinas.
There are two relatively independent channels or pathways within the
retina-geniculate-striate system:
(1) The parvocellular (or P) pathway: it is most sensitive to colour and to fine detail; most of its input comes from cones.
(2) The magnocellular (or M) pathway: it is most sensitive to movement information; most of its input comes from rods.
As stated above, these two pathways are only relatively independent. In
fact, there are numerous interconnections between them and the entire
visual system is extremely complex. For example, there is clear intermingling of the two pathways in V1 (Leopold, 2012). Ryu et al. (2018, p. 707)
studied brain activity in V1 when random-dot images were presented: “The
local V1 sites receiving those parallel inputs [from the P and M pathways]
are densely linked with one another via horizontal connections [which] are
organised in complicated yet systematic ways to subserve the multitude of
representational functions of V1.”
Finally, there is also a koniocellular pathway. However, its functions
are still not well understood.
Early visual processing: V1 and V2
We start with three general points. First, to understand visual processing in the primary visual cortex (V1 or BA17) and the secondary visual
cortex (V2 or BA18), we must consider the notion of a receptive field. The receptive field for any given neuron is the retinal region where light affects
its activity. The receptive field can also refer to visual space because it is
mapped in a one-to-one manner onto the retinal surface.
KEY TERM
Receptive field
The region of the retina in which light influences the activity of a particular neuron.
KEY TERM
Lateral inhibition
Reduction of activity in
one neuron caused by
activity in a neighbouring
neuron.
Second, neurons often influence each other. For example, there is
lateral inhibition, where reduced activity in one neuron is caused by activ-
ity in a neighbouring neuron. Lateral inhibition increases the contrast at
the edges of objects, making it easier to identify the dividing line between
objects. The phenomenon of simultaneous contrast depends on lateral inhibition (see Figure 2.3). The two central squares are physically identical but
the one on the left appears lighter. This difference is due to simultaneous contrast produced
because the left surround is much darker than
the right surround.

Figure 2.3
The square on the right looks darker than the identical square on the left because of simultaneous contrast involving lateral inhibition. From Lehar (2008). Reproduced with permission of the author.
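Lateral inhibition can be illustrated with a toy computation (Python; the luminance values and the inhibition weight of 0.2 per neighbour are arbitrary illustrative choices). Each unit's response is its own input minus a fraction of its neighbours' inputs, which exaggerates the response difference at the edge between a dark and a light region:

import numpy as np

# Luminance across a step edge: a dark region then a light region.
luminance = np.array([10., 10., 10., 10., 40., 40., 40., 40.])

# Each unit is excited by its own input and inhibited by its two
# neighbours (illustrative inhibition weight: 0.2 per neighbour).
kernel = np.array([-0.2, 1.0, -0.2])
response = np.convolve(luminance, kernel, mode="same")

print(response)
# Printed response: [ 8.  6.  6.  0. 30. 24. 24. 32.]
# Interior units settle at 6 (dark) and 24 (light), but at the edge the
# dark side dips to 0 and the light side overshoots to 30: the contrast
# at the dividing line is exaggerated (cf. simultaneous contrast).
# (The outermost values are distorted by zero padding.)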
Third, early visual processing involves
large areas within the primary visual cortex
(V1) and secondary visual cortex (V2).
For example, Hegdé and Van Essen (2000)
found in macaques that one-third of V2 cells
responded to complex shapes and differences
in size and orientation.
Two pathways
As we have just seen, neurons from the P and
M pathways mainly project to V1 (primary
visual cortex). What happens after V1? The P
pathway associates with the ventral pathway
or stream that proceeds to the inferotemporal
cortex. In contrast, the M pathway associates
with the dorsal pathway or stream that proceeds to the posterior parietal
cortex. Note the above assertions oversimplify a complex reality.
We discuss the ventral and dorsal pathways in detail shortly. It is
assumed the ventral or “what” pathway culminating in the inferotemporal cortex is mainly concerned with form and colour processing and with
object recognition. In contrast, the dorsal or “how” pathway culminating
in the parietal cortex is more concerned with motion processing. As we
will see later, there are extensive interactions between the two pathways.
The nature of such interactions was reviewed by Rossetti et al. (2017; see
Figure 2.15 in this chapter).
Galletti and Fattori (2018) argued that visual processing is more flexible than implied by the notion of two interacting pathways or streams.
We should not conceive the cortical streams as fixed series of interconnected cortical areas in which each area belongs to one stream . . ., but
[rather] as interconnected neuronal networks, often involving the same
neurons, that are involved in a number of functional processes and
whose activation changes ­dynamically according to the context.
(p. 203)
Organisation of the visual brain
Figure 2.4
Some distinctive features of the largest visual cortical areas. The relative size of the boxes reflects the relative area of different regions. The arrows labelled with percentages show the proportion of fibres in each projection pathway. The vertical position of each box represents the response latency of cells in each area, as measured in single-unit recording studies. IT = inferotemporal cortex; MT = medial or middle temporal cortex; MST = medial superior temporal cortex. All areas are discussed in detail in the text. From Mather (2009). Copyright 2009 George Mather. Reproduced with permission.

A more detailed picture of the brain areas involved in visual processing is given in Figure 2.4. V3 is generally assumed to be involved in form
processing, V4 in colour processing and V5/MT in motion processing (all
discussed in more detail pp. 49–54). The ventral stream includes V1, V2, V3,
V4 and the inferotemporal cortex, whereas the dorsal stream proceeds from
V1 via V3 and MT (medial temporal cortex) to MST (medial superior temporal cortex).
Figure 2.4 reveals three important points. First, there are complex
interconnections among visual cortical areas. Second, the brain areas
within the ventral pathway are more than twice as large as those within
the dorsal pathway. Third, cells in the lateral geniculate nucleus respond
fastest when a visual stimulus is presented followed by activation of cells
in V1. However, cells are activated in several other areas (V3/V3A; MT;
MST) very shortly thereafter.
Figure 2.4 shows the traditional hierarchical view of the major brain
areas involved in visual processing. It is supported by anatomical evidence (see proportions of fibres projecting up the hierarchy in the figure).
Nevertheless, this view is oversimplified. Here we consider three of its main
limitations.
First, Kravitz et al. (2013) disagreed with the traditional view that the ventral pathway or stream involves a serial hierarchy proceeding from simple to complex. Instead, they argued it consists of several overlapping recurrent networks (see Figure 2.5). There are connections in both directions between the components within these networks. As Hegdé (2018, p. 902) argued, “Various regions of the visual system process information not in a strict hierarchical manner but as parts of various dynamic brain-wide networks.”

Figure 2.5
Connectivity within the ventral pathway on the lateral surface of the macaque brain. Brain areas involved include V1, V2, V3 and V4, the middle temporal (MT)/medial superior temporal (MST) complex, the superior temporal sulcus (STS) and the inferior temporal cortex (TE). From Kravitz et al. (2013). Reprinted with permission of Elsevier.

Second, there is an initial “feedforward sweep” proceeding through the visual areas starting with V1 and then V2 (shown by the directional arrows in Figure 2.4). This is followed by recurrent or top-down processing proceeding in the opposite direction (not shown in Figure 2.4). Several theorists (e.g., Lamme, 2018; see Chapter 16) assume recurrent processing is of major importance for conscious visual perception because it integrates information across different visual areas. Note that visual imagery depends on several top-down processes resembling those used in visual perception (see Chapter 3).
Hurme et al. (2017) obtained support for the above assumptions. They
applied transcranial magnetic stimulation (TMS; see Glossary) to V1 at 60
ms to suppress feedforward processing and at 90 ms to suppress recurrent
processing. As predicted, early V1 activity was necessary for conscious and
unconscious vision but late V1 activity was necessary only for conscious
vision.
Third, Zeki (2016) distinguished three hierarchical models of the visual
brain (see Figure 2.6). Model (a) was proposed first and model (c), the
one favoured by Zeki, was proposed most recently. His central argument
is that, “Parallel processing . . . is much more ubiquitous than commonly
supposed” (p. 2515). Thus, models such as the one shown in Figure 2.4 are
inadequate because they de-emphasise parallel processing.

Figure 2.6
(a) The single hierarchical model where all brain areas after V1 are considered jointly as “visual association cortex”; (b) the parallel hierarchical model which is a hierarchy of processing areas running serially from V1 through V2 to V3 but with much parallel processing; (c) the three parallel hierarchical feedforward systems model with a strong emphasis on parallel rather than serial processing. From Zeki (2016).
Functional specialisation
Zeki (1993, 2001) proposed a functional specialisation theory where different cortical areas are specialised for different visual functions. On this view, the visual system resembles a team of workers, each working alone to solve part of a complex problem; the results of their labours are then combined to produce coherent visual perception. This view is consistent with Zeki’s (2016) emphasis on parallel processing within the visual brain.
What are the advantages of functional specialisation? First, object
attributes can occur in unpredictable combinations (Zeki, 2005). For
example, a green object may be a car, a sheet of paper or a leaf, and a
car can be red, black or green. Thus, we often need to process all of an
object’s attributes for accurate perception. Second, the required processing
differs considerably across attributes (Zeki, 2005). For example, motion
processing involves integrating information across time whereas form or
shape processing involves considering the spatial relationship of elements
at a given moment.
Here are the main functions Zeki ascribed to the brain areas shown in
Figure 2.6:
● V1 and V2: They are involved at an early stage of visual processing. They contain different groups of cells responsive to colour and form.
● V3 and V3A: Cells in these areas respond to form (especially the shapes of objects in motion) but not colour.
● V4: The majority of cells in this area respond to colour; many are also responsive to line orientation.
● V5: This area is specialised for visual motion. In studies with macaque monkeys, Zeki found all the cells in this area responded to motion but not colour. In humans, the areas specialised for visual motion are referred to as MT and MST.
KEY TERM
Achromatopsia
A condition caused by brain damage in which there is very limited colour perception but form and motion perception are relatively intact.
Zeki assumed colour, motion and form are processed in anatomically separate visual areas. The relevant evidence is discussed below.
Form processing
Brain areas involved in form processing in humans include V1, V2, V3 and
V4, culminating in the inferotemporal cortex (Kourtzi & Connor, 2011).
Neurons in the inferotemporal cortex respond to specific semantic categories (e.g., animals; body parts; see Chapter 3) and are also involved in form processing. Baldassi et al. (2013) found
in monkeys that many neurons within the anterior inferotemporal cortex
responded on the basis of form or shape (e.g., round; star-like; horizontally
thin) rather than object category.
Pavan et al. (2017) investigated the role of early visual areas in form
processing using repetitive transcranial magnetic stimulation (rTMS; see
Glossary) to disrupt processing. With static stimuli, rTMS delivered to
early visual areas (V1/V2) disrupted form processing whereas rTMS delivered to V5/MT did not.
If form processing occurs in different brain areas from colour and
motion processing, we might anticipate some patients would have severely
impaired form processing but intact colour and motion processing. Some
support was reported by Gilaie-Dotan (2016a). She studied LG, a man with
visual form agnosia (see Glossary). LG had deficient functioning within
V2 and V3 (although no obvious brain damage) associated with impaired
form processing and object recognition, but relatively intact perception of
colour and biological motion. However, such cases are very rare. As Zeki
(1993) pointed out, brain damage sufficient to almost eliminate form perception would typically be so widespread that the patient would be blind.
Colour processing
The assumption that V4 (located within the ventral visual pathway) is specialised for colour processing has been tested in several ways. These include
studying brain-damaged patients, using brain-imaging techniques, and
using transcranial magnetic stimulation to produce a temporary “lesion”
(see pp. 20–22).
If V4 is specialised for colour processing, patients with damage to
that area should exhibit minimal colour perception with fairly intact
form and motion perception and ability to see fine detail. This is approximately the case in achromatopsia (also known as cerebral achromatopsia) although cases involving total achromatopsia are very rare (Zihl &
Heywood, 2016).
Bouvier and Engel (2006) found in a meta-analysis that a small brain
area within the ventral (bottom) occipital cortex in (or close to) area V4
was damaged in nearly all cases of achromatopsia. However, the loss of
colour vision was typically only partial, implying other areas are also
directly involved in colour processing.
Lafer-Sousa et al. (2016) identified three brain areas in the ventral
visual pathway responding more strongly to video clips of various objects
presented in colour than in black-and-white. Of importance, different
brain areas were associated with colour and shape processing: colour areas
responded comparably to intact and scrambled objects.
Bannert and Bartels (2018) studied brain activity in several brain areas
(including V1, V2, V3 and V4) while participants viewed abstract colour
stimuli or formed visual images of colour objects (e.g., tomato; banana).
The colour of visually presented stimuli could be worked out from brain
activity in every brain area studied, whereas the colour of imagined stimuli
could only be worked out from brain activity in V4. These findings suggest
that a network including several brain areas is involved in colour processing, but V4 is of special importance within that network.
Finally, note that V4 is a relatively large area. As such, it is involved in
the processing of texture, form and surfaces as well as colour (Winawer &
Witthoft, 2015).
KEY TERM
Akinetopsia
A brain-damaged condition in which motion perception is severely impaired even though stationary objects are perceived reasonably well.
Motion processing
Area V5 (also known as motion processing area MT) is heavily involved in
motion processing. Functional neuroimaging studies indicate that motion
processing is associated with activity in V5/MT (Zeki, 2015). However, such
studies cannot show V5 (or MT) is necessary for motion perception. More
direct evidence was reported by McKeefry et al. (2008) using transcranial
magnetic stimulation (see Glossary) to disrupt motion perception. TMS,
applied to V5/MT, produced a subjective slowing of stimulus speed and
impaired observers’ ability to discriminate between different speeds.
Further evidence of the causal role of V5 in motion perception was
obtained by Vetter et al. (2015). Observers could not predict the motion of
a moving target when TMS was applied to V5.
Additional evidence that the area V5/MT is important in motion processing comes from research on patients with akinetopsia. Akinetopsia
is an exceptionally rare condition where stationary objects are perceived
fairly normally but motion perception is grossly deficient (Ardila, 2016).
Zihl et al. (1983) studied LM, a woman with akinetopsia who had suffered
bilateral damage to the motion area (V5/MT). She could locate stationary
objects by sight, had good colour discrimination and her binocular vision
was normal. However, her motion perception was grossly deficient:
She had difficulty . . . in pouring tea or coffee into a cup because the
fluid appeared to be frozen, like a glacier. . . . In a room where more
than two people were walking, . . . “people were suddenly here or there
but I have not seen them moving”.
Zihl and Heywood (2015) discussed additional findings relating to LM.
Even though her motion perception was extremely poor, she still retained
limited ability to distinguish between moving and stationary stimuli.
Heutink et al. (2018) studied TD, a patient with akinetopsia due to
damage to V5. She was severely impaired at perceiving the direction of
high-speed visual motion but not low-speed motion, suggesting V5 is less
important for processing low-speed than high-speed motion.
V5 (MT) is not the only brain area involved in motion processing.
There is also the area MST just above V5/MT. Vaina (1998) studied two
patients with damage to MST. Both patients had various problems relating
to motion perception. One patient (RR) “frequently bumped into people,
corners and things in his way, particularly into moving targets (e.g., people
walking)” (Vaina, 1998, p. 498). These findings suggest MST is involved in
the visual guidance of walking. Chaplin et al. (2018) found that the direction of motion of a stimulus in healthy individuals could be inferred by
taking account of activation within both MT and MST.
The notions that motion perception depends almost exclusively on
V5/MT and MST and that those areas only process information relating
to motion are both oversimplifications for various reasons. First, several
areas outside V5/MT and MST are involved in motion perception. For
example, consider biological motion perception (see Chapter 4). Such perception involves several additional areas including the superior temporal
sulcus, superior temporal gyrus and inferior frontal gyrus (Thompson &
Parasuraman, 2012; Pavlova et al., 2017).
Second, Heywood and Cowey (1999; see Figure 2.7) found that approximately 60% of cells within V5/MT respond to binocular disparity (difference between the retinal images in the left and right eyes; see Glossary)
and 50% of cells within V5/MT respond to stimulus orientation. However,
V5/MT is especially important with respect to direction of motion with
approximately 90% of cells responding.
Figure 2.7
The percentage of cells in six different visual cortical areas responding selectively to orientation, direction of motion, disparity and colour. From Heywood and Cowey (1999).

Figure 2.8
(a) Speeded inputs to MT/V5 bypassing the hierarchy; (b) motion pathway outputs according to function. Visual motion inputs proceed rapidly from subcortical areas and V1 directly to MT/V5 and from there to MST; information is then transferred to several other brain regions. [LGN = lateral geniculate nucleus; Plv = pulvinar]. From Gilaie-Dotan (2016b). Reprinted with permission of Elsevier.

Third, we should distinguish between different types of motion perception (e.g., first-order and second-order motion perception). With first-order
displays, the moving shape differs in luminance (intensity of reflected light)
from its background. For example, the shape might be dark whereas the
background is light. With second-order displays, there is no difference in
luminance between the moving shape and the background.
In everyday life, we encounter second-order displays infrequently
(e.g., movement of grass in a field caused by the wind). Some patients
have intact first-order motion perception but impaired second-order
motion perception, whereas others exhibit the opposite pattern (Gilaie-Dotan, 2016a). Thus, not all forms of motion perception involve similar underlying processes.
Gilaie-Dotan (2016b) studied motion processing within the brain (see
Figure 2.8). In essence, information about visual motion inputs bypasses
several visual areas (e.g., V2, V4) and rapidly reaches MT/V5. It then continues rapidly to MST, after which information is transferred to several
other areas.
Gilaie-Dotan (2016b, p. 379) pointed out that visual motion perception
“has been continually associated with and considered part of the dorsal
pathway”. For example, Milner and Goodale’s (1995, 2008) perception-action model emphasises close links between visual motion processing and perception and the dorsal (“how”) pathway (see pp. 56–57). Gilaie-Dotan accepted the dorsal pathway is of major importance. However, she
argued persuasively that the ventral (“what”) pathway is also involved
in motion perception. For example, efficient detection of visual motion
requires having an intact right ventral visual cortex (Gilaie-Dotan et al.,
2013).
KEY TERM
Binding problem
The issue of integrating different types of information to produce coherent visual perception.
Binding problem
Zeki’s theoretical approach poses the obvious problem of how information about an object’s motion, colour and form is combined and integrated
to produce coherent perception. This is the binding problem: “How the
brain brings together what it has processed . . . in its different hierarchically
organised parallel processing systems . . . to give us our unitary experience
of the visual world” (Zeki, 2016, p. 3521). One aspect of this problem is that
object-related processing in different visual areas ends at different times,
thus making it harder to integrate these outputs in visual perception.
There may be continuous integration of information starting during
early stages of visual processing. Seymour et al. (2009) presented observers with red or green dots rotating clockwise or counterclockwise. Colour-motion conjunctions were processed in several brain areas including
V1, V2, V3, V3A/B, V4 and V5/MT+. Seymour et al. (2016) found that
binding of information about object form and colour occurred as early as
V2. These findings contradict the traditional assumption that “The visual
system initially extracts borders between objects and their background and
then ‘fills in’ colour” (Seymour et al., 2016, p. 1997).
Ghose and Ts’o (2017) reviewed research indicating progressively more
integration of different kinds of information (and thus less functional specialisation) during visual processing. They concluded:
In V2, we see an increase in the overlap of cortically generated selectivities such as orientation and colour . . . in V4 we see extensive overlap
among colour, size, and form, and the existence of . . . a combination
of colour and orientation . . . not present in earlier areas.
(p. 17)
So far we have focused on increases in integration of information as processing proceeds from early to late visual areas (feedforward processing). However, conscious visual perception generally depends crucially on
recurrent processing (feedback from higher to lower visual brain areas).
Observers’ expectations influence recurrent processing and it is arguable
that expectations (e.g., bananas will be yellow) facilitate the binding or integration of different kinds of visual information.
The binding-by-synchrony hypothesis (e.g., Singer & Gray, 1995) provides an influential solution to the binding problem. According to this
hypothesis, detectors responding to features of a single object fire in synchrony. Of relevance, widespread synchronisation of neural activity is associated with conscious visual awareness (e.g., Gaillard et al., 2009; Melloni
et al., 2007; see Chapter 16).
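The core idea of the binding-by-synchrony hypothesis can be illustrated with a toy simulation (Python; the spike probabilities and durations are arbitrary illustrative choices). Detectors driven by the same object fire in synchrony and so are perfectly correlated at zero lag, whereas a detector responding to a different object is not:

import numpy as np

rng = np.random.default_rng(0)

# Toy spike trains (True = spike) for three feature detectors, 1,000 time bins.
drive = rng.random(1000) < 0.2           # shared drive from a single object
colour_unit = drive.copy()               # e.g., a "red" detector
motion_unit = drive.copy()               # e.g., a "rotating clockwise" detector
other_unit = rng.random(1000) < 0.2      # detector driven by a different object

def zero_lag_corr(a, b):
    """Correlation between two spike trains at zero time lag."""
    return np.corrcoef(a.astype(float), b.astype(float))[0, 1]

print(zero_lag_corr(colour_unit, motion_unit))   # 1.0: synchronous -> bound
print(zero_lag_corr(colour_unit, other_unit))    # ~0.0: not bound together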
The synchrony hypothesis is oversimplified. There is the largely unresolved issue of explaining why and how synchronised activity occurs across
visual areas. The fact that visual object processing occurs in widely distributed areas of the brain makes it implausible that precise synchrony could
be achieved.
Finally, note there are various binding problems. As Feldman (2013)
pointed out, one problem is how visual features are bound together.
Another problem is how we bind together information over successive eye
movements to perceive a stable visual world.
Within the above broader context, it is clear several lines of research
are relevant. For example, observers must decide which parts of the
visual information available at any given time belong to the same object.
The gestaltists put forward several laws describing how this happens (see
Chapter 3). Research on visual search (detecting target stimuli among distractors) is also relevant (see Chapter 5). This research shows the important
role of selective attention in combining features close together in time and
space.
KEY TERM
Ventral stream
The part of the visual processing system involved in object perception and recognition and the formation of perceptual representations.
Evaluation
Zeki’s functional specialisation theory is an ambitious and influential
attempt to provide a coherent theoretical framework. As discussed later,
Zeki’s assumption that motion processing typically proceeds somewhat
independently of other types of visual process has reasonable empirical
support.
What are the limitations of Zeki’s theoretical approach? First, the
brain areas involved in visual processing are less specialised than implied
theoretically. As mentioned earlier on p. 52, Heywood and Cowey (1999)
considered the percentage of cells in each visual cortical area responding selectively to various stimulus characteristics (see Figure 2.7). Cells in
several areas responded to orientation, disparity and colour. Specialisation
was found only with respect to responsiveness to direction of stimulus
motion in MT.
Second, the visual brain is substantially more complex than assumed
by Zeki. There are far more brain areas devoted to visual processing than
shown in Figure 2.4, and each brain area has connections to numerous
other areas (Baker et al., 2018). For example, V1 is connected to at least
50 other areas! What is also de-emphasised in Zeki’s approach is the importance of brain networks and the key role played by recurrent processing.
Third, the binding problem (or problems) has not been solved. However,
integrated visual perception undoubtedly depends on both bottom-up (feedforward) processes and top-down (recurrent) processes (see Chapter 3).
TWO VISUAL SYSTEMS: PERCEPTION-ACTION MODEL
What are the major functions of the visual system? Historically, the most
popular answer was that it provides us with an internal (and conscious) representation of the external world. In contrast, Goodale and Milner (1992)
and Milner and Goodale (1995, 2008) argued in their perception-action
model that there are two visual systems, each fulfilling a different function or
purpose. First, there is the vision-for-perception (or “what”) system based
on the ventral stream or pathway. It is used when we decide whether an
object is a cat or a buffalo or when admiring a magnificent landscape. Thus,
it is used to identify objects.
Figure 2.9
Goodale and Milner’s (1992) perception-action model showing the dorsal and ventral streams. (SC = superior colliculus; LGNd = dorsal lateral geniculate nucleus; V1+ = early visual areas.) From de Haan et al. (2018). Reprinted with permission of Elsevier.
Second, there is the vision-for-action (or “how”) system based on the
dorsal stream or pathway (see Figure 2.9) used for visually guided action.
It is used when running to return a ball at tennis or when grasping an
object. When we grasp an object, we must calculate its orientation and
position with respect to ourselves. Since observers and objects often move
relative to each other, orientation and position need to be worked out
immediately prior to initiating a movement.
Milner (2017, p. 1297) summarised key differences between the two
systems:
The ventral stream . . . mediates the transformations of the contents
of the visual signal into the mental furniture that guides memory, recognition and conscious perception. In contrast, the dorsal stream . . .
mediates the visual guidance of action, primarily in real time.
Schenk and McIntosh (2010) identified four major differences between the
two processing streams:
KEY TERMS
Dorsal stream
The part of the visual processing system most involved in visually guided action.
Allocentric coding
Visual or spatial coding of objects relative to each other; see egocentric coding.
Egocentric coding
Visual or spatial coding dependent on the position of the observer’s body; see allocentric coding.
(1) The ventral stream underlies vision for perception whereas the dorsal stream underlies vision for action.
(2) There is allocentric coding (object-centred; coding the locations of objects relative to each other) in the ventral stream but egocentric coding (body-centred; coding relative to the observer’s own body) in the dorsal stream; a coordinate sketch follows this list.
(3) Representations in the ventral stream are sustained over time whereas those in the dorsal stream are short-lasting.
(4) Processing in the ventral stream generally (but not always) leads to conscious awareness, whereas processing in the dorsal stream does not.
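As promised in point (2), here is a minimal sketch (Python; the coordinate conventions are illustrative assumptions, not part of the model) of the difference between the two coding schemes: an egocentric code is obtained from an allocentric one by translating into the observer's position and rotating by the observer's heading, so it changes whenever the observer moves.

import numpy as np

def to_egocentric(obj_xy, observer_xy, heading_rad):
    """Convert an allocentric (world-centred) position into egocentric
    (body-centred) coordinates: translate to the observer's position,
    then rotate by the observer's heading (forward ends up along +x)."""
    dx, dy = np.subtract(obj_xy, observer_xy)
    c, s = np.cos(-heading_rad), np.sin(-heading_rad)
    return (c * dx - s * dy, s * dx + c * dy)

# A cup 2 m north of the origin; observer 1 m north of the origin, facing north.
print(to_egocentric((0.0, 2.0), (0.0, 1.0), heading_rad=np.pi / 2))
# -> approximately (1.0, 0.0): 1 m straight ahead. Moving the observer changes
# the egocentric code, while the allocentric code of the cup stays the same.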
Two other differences have been suggested. First, processing in the dorsal
stream is faster. Second, ventral stream processing depends more on input
from the fovea (the central part of the retina used for detecting detail).
Milner and Goodale originally implied that the dorsal and ventral
streams were largely independent of each other. However, they have
increasingly accepted that the two streams often interact. For example,
Milner argued the influence of the ventral stream on dorsal stream processing “seems to carry visual and semantic complexity, thereby allowing us to
bring meaning to our actions” (Milner, 2017, p. 1305). The key issue of
the independence (or interdependence) of the two streams is discussed on
pp. 62–63.
Findings: brain-damaged patients
KEY TERM
Optic ataxia
A condition in which there are problems making visually guided movements in spite of reasonably intact visual perception.
We can test Milner and Goodale’s theory by studying brain-damaged
patients. Patients with damage to the dorsal pathway should have reasonably intact vision for perception but severely impaired vision for action. The
opposite pattern of intact vision for action but very poor vision for perception should be found in patients having damage to the ventral pathway.
Thus, there should be a double dissociation (see Glossary).
Optic ataxia
Patients with optic ataxia have damage to the posterior parietal cortex
(forming part of the dorsal stream; see Figure 2.10). Some evidence suggests
patients with optic ataxia are poor at making precise visually guided movements although their vision and ability to move their arms are reasonably
intact. As predicted, Perenin and Vighetto (1988) found patients with optic
ataxia had great difficulty in rotating their hands appropriately when reaching towards (and into) a large oriented slot.
Patients with optic ataxia do not all conform to the simple picture
described above. First, somewhat different regions of posterior parietal
cortex are associated with reaching and grasping movements and some
patients have greater problems with one type of movement than the other
(Vesia & Crawford, 2012).
Second, it is oversimplified to assume patients have intact visual perception but impaired visually guided action. Pisella et al. (2006) obtained
much less evidence for impaired visually guided action in central compared
to peripheral vision. This finding is consistent with evidence indicating
many optic ataxics can drive effectively.
Figure 2.10
Lesion overlap (purple = >40% overlap; orange = >60% overlap) in patients with optic ataxia. (SPL = superior parietal lobule; SOG = superior occipital gyrus; Pc = precuneus.) From Vesia and Crawford (2012). Reprinted with permission of Springer.
Third, patients with optic ataxia have some impairment in vision for
perception (especially in peripheral vision). Bartolo et al. (2018) found such
patients had an impaired ability on the perceptual task of deciding whether
a target was reachable and they also had problems on tasks requiring vision
for action. Thus, patients with optic ataxia have difficulties in combining
information from the dorsal and ventral streams.
Fourth, Rossetti and Pisella (2018) concluded as follows from their
review: “Optic ataxia is not a visuo-motor deficit and there is no dissociation between perception and action capacities in optic ataxia” (p. 225).
KEY TERM
Visual form agnosia
A condition in which there are severe problems in shape perception (what an object is) but apparently reasonable ability to produce accurate visually guided actions.
Visual form agnosia
Interactive exercise: Müller-Lyer
What about patients with damage only to the ventral stream? Of relevance
are some patients with visual form agnosia, a condition involving severe
problems with object recognition even though visual information reaches
the visual cortex (see Chapter 3). The most-studied visual form agnosic is
DF, whose brain damage is in the ventral stream (James et al., 2003). For
example, her activation in that stream was no greater when presented with
object drawings than with scrambled line drawings. However, she showed
high levels of activation in the dorsal stream when grasping for objects.
Goodale et al. (1994) found DF was very poor at a visual perception
task that involved distinguishing between two shapes with irregular contours. However, she grasped these shapes firmly between her thumb and
index finger. Goodale et al. concluded DF “had no difficulty in placing her
fingers on appropriate opposition points during grasping” (p. 604).
Himmelbach et al. (2012) re-analysed DF’s performance based on data
in Goodale et al. (1994). DF’s performance was substantially inferior to
that of healthy controls. Similar findings were obtained when DF’s performance on other grasping and reaching tasks was compared against controls. Thus, DF had greater difficulties with visually guided action than
previously believed.
Rossit et al. (2018) found DF had impaired peripheral (but not central)
reaching, which is the pattern associated with optic ataxia. DF also had
significant impairment in the fast control of reaching movements (also
associated with optic ataxia). Rossit et al. (p. 15) concluded: “We can no
longer assume that DF’s dorsal visual stream is intact
and that she is spared in visuo-motor control tasks, as
she also presents clear signs of optic ataxia.”
Visual illusions
Figure 2.11
The Müller-Lyer illusion.
There are numerous visual illusions, of which the
Müller-Lyer (see Figure 2.11) is one of the most famous.
The vertical line on the left looks longer than the one
on the right although they are the same length. The
Ebbinghaus illusion (see Figure 2.12) is also well known.
The central circle surrounded by smaller circles looks
smaller than a central circle of the same size surrounded
by larger circles although the two central circles are the
same size.
How has the human species flourished if
our visual perceptual processes are apparently
very prone to error? Milner and Goodale
(1995) argued the vision-for-perception system
processes visual illusions and provides visual
judgements. In contrast, we mostly use the
vision-for-action system when walking close to
a precipice or dodging cars. These ideas led to
a dramatic prediction: actions (e.g., pointing;
grasping) using the vision-for-action system
should be unaffected by most visual illusions.
Findings
Bruno et al. (2008) conducted a meta-analytic
review of Müller-Lyer studies where observers pointed rapidly at one figure (using the
vision-for-action system). The mean illusion
effect was 5.5%. In contrast, the mean illusion
effect was 22.4% when observers provided
verbal estimations of length (using the vision-for-perception system). The perception-action
model is supported by this large difference.
However, the model seems to predict there
should have been no illusion effect at all with
pointing.
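For concreteness, an illusion effect of the kind reported in such meta-analyses can be expressed as a percentage of the true length. The sketch below (Python) uses made-up numbers purely for illustration; conventions for computing the effect vary across studies.

def illusion_effect(judged_a, judged_b, true_length):
    """Illusion magnitude as a percentage of the true length: the difference
    between judgements of two physically identical lines shown in the two
    illusion-inducing contexts (conventions vary across studies)."""
    return 100 * (judged_a - judged_b) / true_length

# Hypothetical data for a 100 mm line: verbal estimates of 112 mm (fins-out)
# versus 90 mm (fins-in) ...
print(illusion_effect(112, 90, 100))   # 22.0 (% of true length)
# ... compared with rapid pointing responses landing at 103 mm versus 97 mm.
print(illusion_effect(103, 97, 100))   # 6.0 (% of true length)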
Figure 2.12
The Ebbinghaus illusion.

With the Ebbinghaus illusion, the illusion is often much stronger with visual judgements using the vision-for-perception system than with grasping movements using
the vision-for-action system (Whitwell & Goodale, 2017). Knol et al. (2017)
explored the Ebbinghaus illusion in more detail. As predicted theoretically,
only visual judgements were influenced by the distance between the target
and the context.
Support for the perception-action model has been reported with the
hollow-face illusion, a realistic hollow mask resembling a normal face
(see Figure 2.13; visit the website: www.richardgregory.org/experiments).
Króliczak et al. (2006) placed a target (a small magnet) on the face mask
or a normal face. Here are two tasks they used:
(1) Draw the target position (using the vision-for-perception system).
(2) Make a fast, flicking finger movement to the target (using the vision-for-action system).
There was a strong illusion effect when observers drew the target position,
whereas their performance was very accurate (i.e., illusion-free) when they
made a flicking movement. Both findings were as predicted theoretically.
Króliczak et al. (2006) also had a third condition where observers
made a slow pointing finger movement to the target and so the vision-for-action system was involved. However, there was a fairly strong illusory
effect. Why was this? Actions may involve the vision-for-perception system as well as the vision-for-action system when preceded by conscious cognitive processes.
KEY TERM
Hollow-face illusion
A concave face mask is misperceived as a normal face when viewed from several feet away.
Figure 2.13
Left: normal and hollow faces with small target magnets on the forehead and cheek of the normal face. Right: front view of the hollow mask that appears as an illusory face projecting forwards. Króliczak et al. (2006). Reprinted with permission of Elsevier.
KEY TERM
Proprioception
An individual’s awareness of the position and orientation of parts of their body.
Various problematical issues for the perception-action model have
accumulated. First, the type of action is important. Franz and Gegenfurtner
(2008) found the mean illusory effect with the Müller-Lyer was 11.2% with
perceptual tasks, compared to 4.4% with full visual guidance of the hand
movement. In contrast, grasping when observers could not monitor their
hand movements was associated with an illusory effect of 9.4%, perhaps
because action programming required the ventral stream.
Second, illusion effects assessed by grasping movements often decrease
with repeated practice (Kopiske et al., 2017). Kopiske et al. argued people
use feedback from their inaccurate grasping movements on early trials to
reduce illusion effects later on.
Third, illusion effects are often greater when grasping or pointing movements are made following a delay (Hesse et al., 2016). The ventral stream
(vision-for-perception) may be more likely to be involved after a delay.
The various interpretive problems with previous research led Chen
et al. (2018a) to use a different approach. In their key condition, observers
had restricted vision (they viewed a sphere coated in luminescent paint in
darkness through a pinhole). They estimated the sphere’s size by matching
the distance between their thumb and forefinger to that size (perception)
or they grasped the sphere (action). Their non-grasping hand was in their
lap or directly below the sphere. In the latter condition, observers could
make use of proprioception (awareness of the position of one’s body
parts).
Size judgements were very accurate in perception and action with full
vision (see Figure 2.14). However, the key finding was that proprioceptive
information about distance produced almost perfect performance when
observers grasped the sphere but not when providing a perceptual estimate.
These findings indicate a very clear difference in the processes underlying
vision-for-perception and vision-for-action.
In sum, there is some support for the predictions of the original
perception-action model. However, illusory effects with visual judgements and
with actions are more complex and depend on many more factors than
assumed by that model. Attempts by Milner
and Goodale to accommodate such complexities are discussed below.
Milner and Goodale (2008) argued most tasks requiring observers to grasp an object involve some processing in the ventral stream in addition to the dorsal stream. They reviewed research showing that involvement of the ventral stream is especially likely in the following circumstances:
(1) Memory is required (e.g., there is a time lag between the offset of the stimulus and the start of the grasping movement).
(2) Time is available to plan the forthcoming movement (e.g., Króliczak et al., 2006).
(3) Planning which movement to make is necessary.
(4) The action is unpractised or awkward.

Figure 2.14
Disruption of size judgements when estimated perceptually (estimation) or produced by grasping (grasping) in full or restricted vision when there was proprioception (withPro) or no proprioception (noPro). (DI = disruption index.) From Chen et al. (2018a). Reprinted with permission of Elsevier.
According to the perception-action model, actions are most likely to
require the ventral stream when they involve conscious processes. Creem
and Proffitt (2001) supported this notion. They started by distinguishing
between effective and appropriate grasping. For example, we can grasp
a toothbrush effectively by its bristles but appropriate grasping involves
accessing stored knowledge about the object and so often requires the
ventral stream. As predicted, appropriate grasping was much more
adversely affected than effective grasping by disrupting participants’ ability
to retrieve object knowledge.
van Polanen and Davare (2015) reviewed research on factors controlling skilled grasping. They concluded:
The ventral stream seems to be gradually more recruited as information about the object from pictorial cues or memory is needed to
control the grasping movement, or if conceptual knowledge about
more complex objects that are used every day or tools needs to be
retrieved for allowing the most appropriate grasp.
(p. 188)
Dorsal stream: conscious awareness
According to the two systems approach, ventral stream processing is generally accessible to consciousness whereas dorsal stream processing is not.
For example, it is assumed that the ventral stream (and conscious processing) is often involved in motor planning (Milner & Goodale, 2008). There is some support for these predictions from the model (Milner, 2012). As we will see, however, recent evidence is mostly inconsistent with them.
Ludwig et al. (2016) assessed the involvement of the dorsal and ventral
streams in conscious visual perception using a different approach. The visibility of visual targets presented to one eye was manipulated by varying
the extent to which continuous flash suppression (rapidly changing stimuli
presented to the other eye) impaired the processing of the targets.
There were two main findings. First, there was a tight coupling
between visual awareness of target stimuli and ventral stream processing.
Second, there was a much looser coupling between target awareness and
dorsal stream processing. The first finding is consistent with the two visual
systems hypothesis. However, the second finding suggests dorsal processing is more relevant to conscious visual perception than assumed by that
hypothesis.
According to the perception-action model, manipulations (e.g., continuous flash suppression) preventing conscious perception should nevertheless permit more processing in the dorsal than the ventral stream. However,
neuroimaging studies have typically obtained no evidence that neural activity in the dorsal stream is greater than in the ventral stream when observers
lack conscious awareness of visual stimuli (Hesselmann et al., 2018).
Two pathways: update
The perception-action model was originally proposed before neuroimaging
and other techniques had clearly indicated the great complexity of the brain
networks involved in perception and action (de Haan et al., 2018). Recent
research has led to developments of the perception-action model in two
main ways. First, we now know much more about the various interactions
between processing in the dorsal and ventral streams. Second, there are
more than two visual processing streams. Rossetti et al. (2017) show how
theoretical conceptualisations of the relationship between visual perception
and action have become more complex (see Figure 2.15).
We have seen that the ventral pathway is often involved in visually
guided action. There is also increasing evidence the dorsal pathway is
involved in visual object recognition (Freud et al., 2016). For example,
patients with damage to the ventral pathway often retain some sensitivity
to three-dimensional (3-D) structural object representations (Freud et al.,
2017a). Zachariou et al. (2017) applied transcranial magnetic stimulation
to posterior parietal cortex within the dorsal pathway to disrupt processing. TMS disrupted the holistic processing (see Glossary) of faces, suggesting the dorsal pathway is involved in face recognition.
More supporting evidence was reported by Freud et al. (2016). They
studied shape processing, which is of central importance in object recognition and so should depend primarily on the ventral pathway. However, the
ventral and dorsal pathways were both sensitive to shape. The observers’
ability to recognise objects correlated with the shape sensitivity of regions
within the dorsal pathway. Thus, dorsal path activation was of direct relevance to shape and object processing.
How many visual processing streams are there? There is evidence that
actions towards objects depend on two partially separate dorsal streams
(Sakreida et al., 2016; see Chapter 4). First, there is a dorso-dorsal stream
(the “grasp” system) used to grasp objects rapidly. Second, there is a ventro-dorsal stream that makes use of memorised object knowledge and operates more slowly than the first stream.

Figure 2.15
Historical developments in theories linking perception and action. Row 1: the intuitive notion that action is preceded by conscious perception. Row 2: Goodale and Milner’s original two systems’ theory. Row 3: interaction between the two anatomical pathways and perceptual and visual processes. Row 4: evidence that processing in primary motor cortex is preceded by interconnections between dorsal (green) and ventral (red) pathways.
From Rossetti et al. (2017). Reprinted with permission of Elsevier.
Haak and Beckmann (2018) investigated the connectivity patterns
among 22 visual areas, discovering these areas “are organised into not
two but three visual pathways: one dorsal, one lateral, and one ventral”
(p. 82). Their findings thus provide some support for the emphasis within
the perception-action model on dorsal and ventral streams. Haak and
Beckmann speculated that the new lateral pathway may “incorporate . . .
aspects of vision, action and language” (p. 81).
Overall evaluation
Milner and Goodale’s theoretical approach has been hugely influential. Their central assumption that there are two visual systems (“what”
and “how”) is partially correct. It has received inconsistent support from
research on patients with optic ataxia and visual agnosia. Earlier we discussed achromatopsia (see Glossary) and akinetopsia (see Glossary). The
former condition depends on damage to the ventral pathway and the latter
condition on damage to the dorsal pathway (Haque et al., 2018). As predicted theoretically, many visual illusions are much reduced in extent when
observers engage in action-based performance (e.g., pointing; grasping).
What are the model’s limitations? First, evidence from brain-damaged
patients provides relatively weak support for it. In fact, “The idea of a
double dissociation between optic ataxia and visual form agnosia, as
cleanly separating visuo-motor from visual perceptual functions, is no
longer tenable” (Rossetti et al., 2017, p. 130).
Second, findings based on visual illusions provide only partial support
for the model. The findings generally indicate that illusory effects are greater
with perceptual judgements than actions but there are many exceptions.
Third, the model exaggerates the independence of the two visual systems.
For example, Janssen et al. (2018) reviewed research on 3-D object perception and found strong effects of the dorsal stream on the ventral stream.
As de Haan et al. indicated,
The prevailing evidence suggests that cross-talk [interactions between
visual systems] is the norm rather than the exception . . . [There is] a
flexible and dynamic pattern of interaction between visual processing
areas in which visually processing networks may be created on-the-fly
in a highly task-specific manner.
(de Haan et al., 2018, p. 6)
Fourth, the notion there are only two visual processing streams is an oversimplification. Earlier on pp. 62–63 we discussed two attempts (Haak &
Beckmann, 2018; Sakreida et al., 2016) to develop more complete accounts.
COLOUR VISION
Why do we have colour vision? After all, if you watch an old black-and-white movie on television you can easily understand the moving images. One reason is that colour often makes an object stand out from its surroundings, making it easier to identify. Chameleons very sensibly change
colour to blend in with the background, thus reducing their chances of
being detected by predators.
Colour perception also helps us to recognise and categorise objects.
For example, it is useful when deciding whether a piece of fruit is under- or
overripe. Predictive coding (processing primarily aspects of sensory input
that violate the observer’s predictions) is also relevant (Huang & Rao,
2011). Colour vision allows observers to focus rapidly on any aspects of
the incoming visual input (e.g., discolouring) discrepant with predictions
based on ripe fruit.
There are three main qualities associated with colour:
(1) Hue: the colour itself and what distinguishes red from yellow or blue.
(2) Brightness: the perceived intensity of light.
(3) Saturation: this allows us to determine whether a colour is vivid or pale; it is influenced by the amount of white present.
Trichromacy theory
Retinal cones are specialised for colour vision. Cone receptors contain
light-sensitive photopigment allowing them to respond to light. According to
the trichromatic [three-coloured] theory, there are three kinds of receptors:
(1) One type is especially sensitive to short-wavelength light and generally responds most strongly to stimuli perceived as blue.
(2) A second type of cone receptor is most sensitive to medium-wavelength light and responds greatly to stimuli generally seen as yellow-green.
(3) A third type of cone responds most to long-wavelength light such as that reflected from stimuli perceived as orange-red.
KEY TERMS
Dichromacy
A deficiency in colour
vision in which one of
the three cone classes is
missing.
Negative afterimages
The illusory perception
of the complementary
colour to the one that has
just been fixated; green
is the complementary
colour to red and blue is
complementary to yellow.
How do we see other colours? According to the theory, most stimuli activate two or all three cone types. The colour we perceive is determined by
their relative stimulation levels. Evolution has equipped us with three types
of cones because that produces a very efficient system – we can discriminate
millions of colours even with so few cone types.
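One way to see why three cone classes suffice (our illustration rather than part of the original theory’s wording): each class delivers a single activation level, so any light the eye receives is encoded as just three numbers (L, M, S). Perceived colour corresponds to the relative sizes of these numbers, which is why two physically different mixtures of wavelengths producing the same triplet look identical.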
Many forms of colour deficiency are consistent with trichromacy
theory. Most individuals with colour deficiency have dichromacy, in which
one cone class is missing. In red-green dichromacy (the most common
form) there are abnormalities in the retinal pigments sensitive to medium
or long wavelengths. Individuals with red-green dichromacy differ from
intact observers in perceiving far fewer colours. However, their colour constancy (see Glossary) is almost at normal levels (Álvaro et al., 2017).
The density of cones (the retinal cells responsible for colour vision) is
far higher in the fovea (see Glossary) than the periphery. However, there
are enough cones in the periphery to permit accurate peripheral colour
judgements if colour patches are reasonably large (Rosenholtz, 2016).
The crucial role of cones for colour vision explains the following
common phenomenon: “The sunlit world appears in sparkling colour, but
when night falls . . . we see the world in 50 shades of grey” (Kelber et
al., 2017, p. 1). In dim light, the cones are not activated and our vision
depends almost entirely on rods.
Opponent-process theory
Trichromacy theory does not explain what happens after activation of the
cone receptors. It also fails to account for negative afterimages. If you
stare at a square of a given colour for several seconds and then shift your
gaze to a white surface, you see a negative afterimage in the complementary colour (complementary colours produce white when combined). For
example, a green square produces a red afterimage, whereas a blue square
produces a yellow afterimage.
Hering (1878) explained negative afterimages. He identified three types
of opponent processes in the visual system. One opponent process (red-green channel) produces perception of green when responding one way
and red when responding the opposite way. A second opponent process
(blue-yellow channel) produces perception of blue or yellow in the same
way. The third opponent process (achromatic channel) produces the perception of white at one extreme and black at the other.
What is the value of these three opponent processes? The three dimensions associated with opponent processes provide maximally independent
representations of colour information. As a result, opponent processes
provide very efficient encoding of chromatic stimuli.
Much research supports the notion of opponent processes. First,
there is strong physiological evidence for the existence of opponent cells
(Shevell & Martin, 2017). Second, the theory accounts for negative afterimages (discussed above). Third, the theory claims it is impossible to see blue
and yellow together or red and green, but the other colour combinations
can be seen. That is precisely what Abramov and Gordon (1994) found.
Fourth, opponent processes explain some types of colour deficiency. Red-green deficiency occurs when the red-green channel cannot be used, and
blue-yellow deficiency occurs when individuals cannot make effective use
of the blue-yellow channel.
Dual-process theory
Hurvich and Jameson (1957) proposed a dual-process theory combining the
ideas discussed so far. Signals from the three cone types identified by trichromacy theory are sent to the opponent cells (see Figure 2.16). There are
three channels:
(1) The achromatic [non-colour] channel combines the activity of the medium- and long-wavelength cones.
(2) The blue-yellow channel represents the difference between the sum of the medium- and long-wavelength cones on the one hand and the short-wavelength cones on the other. The direction of difference determines whether blue or yellow is seen.
(3) The red-green channel represents the difference between activity levels in the medium- and long-wavelength cones. The direction of this difference determines whether red or green is perceived.

Figure 2.16
Schematic diagram of the early stages of neural colour processing. Three cone classes (red = long; green = medium; blue = short) supply three “channels”. The achromatic (light-dark) channel receives non-spectrally opponent input from long- and medium-cone classes. The two chromatic channels receive spectrally opponent inputs to create the red-green and blue-yellow channels.
From Mather (2009). Copyright 2009 George Mather. Reproduced with permission.
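Expressed algebraically (our simplified summary; the theory specifies the pattern of excitatory and inhibitory inputs rather than exact weightings), with L, M and S denoting the long-, medium- and short-wavelength cone signals, the three channels are:

\[ \text{achromatic} = L + M, \qquad \text{blue-yellow} = (L + M) - S, \qquad \text{red-green} = L - M \]

The sign of each difference determines which colour in the opponent pair is perceived.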
Overall evaluation
Dual-process theory has much experimental support. However, it is oversimplified in several ways (Shevell & Martin, 2017). First, there are complex interactions between the channels. For example, short-wavelength cones are activated even in conditions where it would be expected that only the red-green channel (involving medium- and long-wavelength cones) would be active (Conway et al., 2018). Second, the proportions of different cone types vary considerably across individuals but this typically has surprisingly little effect on colour perception. Third, the arrangement of cone types in the eye is fairly random. This seems odd because it presumably makes it hard for colour-opponent processes to work effectively.
More generally, much research has focused on colour perception and
other research has focused on how nerve cells respond to light of different
wavelengths. What has proved difficult is to relate these two sets of findings directly to each other. So far there is only limited convergence between
psychological and physiological research
(Shevell & Martin, 2017).
KEY TERMS
Colour constancy
The tendency for an
object to be perceived as
having the same colour
under widely varying
viewing conditions.
Illuminant
A source of light
illuminating a surface or
object.
Mutual illumination
The light reflected from
the surface of an object
impinges on the surface
of a second object.
Colour constancy
Colour constancy is the tendency for a surface
or object to be perceived as having the same
colour when there are changes in the wavelengths contained in the illuminant (the light
source illuminating the surface or object).
Colour constancy indicates colour vision does
not depend solely on the wavelengths of the
light reflected from objects. Learn more about
colour constancy on YouTube: “This is Only
Red by Vsauce”.
Why is colour constancy important? If we
lacked colour constancy, the apparent colour
of familiar objects would change dramatically
when the lighting conditions altered. This
would make it very hard to recognise objects
rapidly and accurately.
Attaining reasonable levels of colour
constancy is an impressive achievement.
Look at the object in Figure 2.17. It is immediately recognisable as a blue mug even though several other colours can be perceived. The wavelengths of light depend on the mug itself, the illuminant and reflections from other objects onto the mug’s surface (mutual illumination).
Figure 2.17
Photograph of a mug showing enormous variation in the
properties of the reflected light across the mug’s surface. The
patches at the top of the figure show image values from the
locations indicated by the arrows.
From Brainard and Maloney (2011). Reprinted with permission of the
Association for Research in Vision and Ophthalmology.
How good is colour constancy?
Case study:
Colour constancy
Colour constancy is often reasonably good. For example, Granzier et al.
(2009a) assessed colour constancy for six similarly coloured papers in
various indoor and outdoor locations differing substantially in lighting conditions. They found 55% of the papers were identified correctly. This represents good performance given the similarities among the papers and the
large differences in lighting conditions.
Reeves et al. (2008) distinguished between our subjective experience
and our judgements about the world. For example, as you walk towards
a fire, it feels increasingly hot subjectively. However, how hot you judge
the fire to be is unlikely to change. Reeves et al. found colour constancy
with non-naturalistic (artificial) stimuli was much greater when observers judged the objective similarity of two stimuli seen under different illuminants than when rating their subjective similarity. Radonjić and Brainard (2016) obtained similar findings with naturalistic stimuli. However, colour
constancy was higher overall with naturalistic stimuli because such stimuli
provided more cues to guide performance.
Estimating scene illumination
The wavelengths of light reflected from an object are greatly influenced
by the illuminant (light source). High levels of colour constancy could be
achieved if observers made accurate illuminant estimates. However, they
often do not, especially when the illuminant’s characteristics are unclear
(Foster, 2011). For example, there are substantial individual differences in
the perceived illuminant (and perceived colour) of the famous dress discussed in the Box.
Colour constancy should be high when illuminant estimation is accurate (Brainard & Maloney, 2011). Bannert and Bartels (2017) tested this
prediction. Observers were presented with visual scenes using three different illuminants, and cues within the scenes were designed to facilitate
colour constancy. Bannert and Bartels used functional magnetic resonance
imaging (fMRI) to assess the neural encoding of each scene.
What did Bannert and Bartels (2017) find? Their key finding was that,
“The neural accuracy of encoding the illuminant of a scene [predicted] the
behavioural accuracy of constant colour perception” (p. 357). Thus, colour
constancy was high when the illuminant was processed accurately.
Local colour contrast
Land (1986) proposed retinex theory, according to which we perceive a
surface’s colour by comparing its ability to reflect short-, medium- and
long-wavelength light against that of adjacent surfaces. Thus, we make use
of local colour contrast. Kraft and Brainard (1999) studied colour constancy for complex visual scenes. Under full viewing conditions, colour
constancy was 83% even with large changes in illumination. When local
contrast could not be used, however, colour constancy dropped to 53%.
Foster and Nascimento (1994) developed Land’s ideas into an influential theory based on local contrast. We can see the nature of their big
IN THE REAL WORLD: WHAT COLOUR IS “THE DRESS”?
On 7 February 2015, Cecilia Bleasdale took a photograph of the dress she intended to wear at
her daughter’s imminent wedding (see below) and posted it on the internet. It caused an almost
immediate sensation because observers disagreed vehemently concerning the dress’s colour.
What colour do you think the dress is (see Figure 2.18)? Wallisch (2017) found 59% of observers
said the dress was white and gold and 27% said it was black and blue. How can we explain these
individual differences? Wallisch argued the illumination of the dress is ambiguous: the upper part
of the dress implies illumination by daylight whereas the lower part implies artificial illumination.
Many theories predict the perceived colour of an object depends on its assumed illumination (discussed on p. 62). If so, observers assuming the dress is illuminated by natural light should perceive
it as white and gold. In contrast, those assuming artificial illumination should perceive it as black
and blue.
What did Wallisch (2017) find? As predicted, observers assuming the dress was illuminated by
natural light were much more likely than those assuming artificial light to perceive the dress as
white/gold (see Figure 2.19).
Figure 2.18
“The Dress” made famous by its
appearance on the internet.
From Rabin et al. (2016).
Figure 2.19
The percentage of observers perceiving “The Dress” to be white and gold as a function of whether they assumed illumination by natural light, assumed artificial light, or were unsure.
From Wallisch (2017).
discovery through an example. Suppose there are two illuminants and two
surfaces. If surface 1 led to the long-wavelength or red cones responding three times as much with illuminant 1 as illuminant 2, then the same
threefold difference was also found with surface 2. Thus, the ratio of cone
responses was essentially invariant across different illuminations. Hence,
cone-excitation ratios can be used to eliminate the illuminant’s effects and
so increase colour constancy.
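A worked example with illustrative numbers (ours, not taken from Foster and Nascimento): suppose the long-wavelength cones respond at 30 units to surface 1 and 9 units to surface 2 under illuminant 1, but at 10 and 3 units respectively under illuminant 2. Every absolute response changes threefold, yet the cone-excitation ratio between the two surfaces is unchanged:

\[ \frac{30}{9} \approx 3.3 \quad \text{and} \quad \frac{10}{3} \approx 3.3 \]

A visual system relying on such ratios would therefore judge the relative colours of the two surfaces to be the same under both illuminants.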
Much evidence indicates cone-excitation ratios are important (Foster,
2011, 2018). For example, Nascimento et al. (2004) obtained evidence
suggesting the level of colour constancy in different conditions could be
predicted on the basis of cone-excitation ratios.
Foster and Nascimento’s (1994) theory provides an elegant account
of illuminant-independent colour constancy in simple visual environments. However, it has limited value in complex visual environments. For
example, colour constancy for a given object can become harder because of
reflections from other objects (see Figure 2.17) or because multiple sources
of illumination are present together.
The theory is generally less applicable to natural scenes than artificial
laboratory scenes. For example, the illuminant often changes more rapidly
in natural scenes (e.g., clouds change shape, which influences the shadows
they cast) (Nascimento et al., 2016). In addition, there are dramatic changes
in the level and colour of natural illuminants over the course of the day.
In sum, cone-excitation ratios are most likely to be almost invariant,
“provided that sampling is from points close together in space or time . . .,
or from points separated arbitrarily but undergoing even changes in
illumination” (Nascimento et al., 2016, p. 44).
KEY TERM
Chromatic adaptation
Changes in visual
sensitivity to colour stimuli
when the illumination
alters.
Effects of familiarity
Colour constancy is influenced by our knowledge of the familiar colours of
objects (e.g., bananas are yellow). Hansen et al. (2006) asked observers to
view photographs of fruits and to adjust their colour until they appeared
grey. There was over-adjustment. For example, a banana still looked yellowish to observers when it was actually grey, leading them to adjust its
colour to a slightly bluish hue. Such findings may reflect an influence of
familiar colour on subjective colour perception. Alternatively, familiar colour
may primarily influence observers’ responses rather than their perception
(e.g., our knowledge that bananas are yellow may bias us to report them as
more yellow than they actually appear).
Vandenbroucke et al. (2016) investigated the above issue. Observers
viewed an ambiguous colour intermediate between red and green presented
on typically red (e.g., tomato) or green (e.g., pine tree) objects. Familiar
colour influenced colour perception. Of most importance, neural responses
in various visual areas (e.g., V4, which is much involved in colour processing) were influenced by familiar colour. Neural responses corresponded
more closely to those associated with red objects when the object was typically red than when it was typically green and more closely to those found
with green objects when it was typically green. Thus, familiar colour had a
direct influence on perception early in visual processing.
Chromatic adaptation
One reason we have reasonable colour constancy is because of chromatic
adaptation – an observer’s visual sensitivity to a given illuminant decreases
over time. If you stand outside after nightfall, you may be surprised by the
apparent yellowness of the artificial light in people’s houses. However, this
is not the case if you spend some time in a room illuminated by artificial
light. Lee et al. (2012b) found some aspects of chromatic adaptation occurred within six seconds. Such rapid adaptation increases colour constancy.
Evaluation
In view of the complexity of colour constancy, it is unsurprising the visual
system adopts an “all hands on deck” approach in which several factors
contribute to colour constancy. Of major importance are cone-excitation
ratios that remain almost invariant across changes in illumination. In
addition, top-down factors (e.g., our memory for the familiar colours of
common objects) also play a role.
What are the limitations of theory and research on colour constancy?
First, we lack a comprehensive theory of how the various factors combine.
Second, most research has focused on relatively simple artificial visual
environments. In contrast, “The natural world is optically unconstrained.
Surface properties may vary from one point to another, and reflected
light may vary from one instant to the next” (Foster, 2018, p. B192). As a
result, the processes involved in trying to achieve colour constancy in more
complex environments are poorly understood.
Third, more research is needed to understand why colour constancy
depends greatly on the precise instructions given to observers. Fourth, as
Webster (2016, p. 195) pointed out, “There are pronounced [individual]
differences in almost all measures of colour appearance . . . the basis for
these differences remains uncertain.”
KEY TERMS
Monocular cues
Cues to depth that can be
used by one eye but can
also be used by both eyes
together.
Binocular cues
Cues to depth that
require both eyes to be
used together.
Oculomotor cues
Cues to depth produced
by muscular contractions
of the muscles around
the eye; use of such cues
involves kinaesthesia (also
known as the muscle
sense).
DEPTH PERCEPTION
A major accomplishment of visual perception is the transformation of the
two-dimensional retinal image into perception of a three-dimensional world
seen in depth. The construction of 3-D representations is very important
if we are to pick up objects, decide whether it is safe to cross the road and
so on.
Depth perception depends on numerous visual and other cues (discussed below). All cues provide ambiguous information and so we would
be ill-advised to place total reliance on any single cue. Moreover, different
cues often provide conflicting information. When you watch a movie, some
cues (e.g., stereo ones) indicate everything you see is at the same distance.
In contrast, other cues (e.g., perspective; shading) indicate some objects
are closer.
In real life, depth cues are often provided by movement of the observer
or objects in the visual environment and some cues are non-visual (e.g.,
object sounds). Here, however, the main focus will be on visual depth cues
available when the observer and environmental objects are static.
Cues to depth perception are monocular, binocular and oculomotor.
Monocular cues require only one eye but can also be used with two eyes.
The fact that the world still retains a sense of depth with one eye closed
indicates clearly that monocular cues exist. Binocular cues involve both
eyes used together. Finally, oculomotor cues depend on sensations of
contractions of the muscles around the eye. Use of these cues
involves kinaesthesia (the muscle sense).
KEY TERM
Texture gradient
The rate of change of
texture density from the
front to the back of a
slanting object.
Monocular cues
Monocular cues to depth are called pictorial cues because they are used by
artists. Of particular importance is linear perspective, which artists use to
create the impression of three-dimensional scenes on two-dimensional canvases. Linear perspective (grounded in laws of optics and geometry) is based
on various principles. For example, parallel lines pointing away from us
converge (e.g., motorway edges) and objects reduce in size as they recede
into the distance.
Tyler (2015) argued that linear perspective is only really effective in
creating a powerful 3-D effect when viewed from the point from which the
artist constructed the perspective. This is typically very close to the picture
as can be seen in a drawing by the Dutch artist Jan Vredeman de Vries
(see Figure 2.20).
Texture is another monocular cue. Most objects (e.g., carpets; cobblestone roads) possess texture, and textured objects slanting away from us
have a texture gradient (Gibson, 1979; see Figure 2.21). This is a gradient
(rate of change) of texture density as you look from the front to the back
of a slanting object with the gradient changing more rapidly for objects
slanted steeply away from the observer. Sinai et al. (1998) found observers
judged the distances of nearby objects better when the ground was uniformly textured than when there was a gap (e.g., a ditch) in the texture
pattern.
Texture gradient is a limited cue because the perceived slant depends
on the direction of the gradient. For reasons that are unclear, ground
patterns are perceived as less slanted than equivalent ceiling or sidewall
patterns (Higashiyama & Yamazaki, 2016).
Figure 2.20
An engraving by de Vries
(1604/1970) in which linear
perspective creates an
effective three-dimensional
effect when viewed from
very close but not from
further away.
From Todorović (2009).
Copyright 1968 by Dover
Publications. Reprinted with
permission from Springer.
Another monocular cue is interposition where a nearer
object hides part of a more distant one. The strength of
this cue can be seen in Kanizsa’s (1976) illusory square
(see Figure 2.22). There is a strong impression of a
yellow square in front of four purple circles even though
many of its contours are missing. This depends on processes that relatively “automatically” complete boundaries using the available information (e.g., incomplete circles).
Another useful cue is familiar size (discussed more
fully later). If we know an object’s size, we can use its
retinal image size to estimate its distance. However, we
can be misled. Ittelson (1951) had observers view playing
cards through a peephole restricting them to monocular
vision. The perceived distance was determined almost
entirely by familiar size. For example, playing cards double the usual size were perceived as being twice as far away from the observers as was actually the case.

Figure 2.21
Examples of texture gradients that can be perceived as surfaces receding into the distance.
From Bruce et al. (2003).

We turn now to blur. There is no blur at fixation point and it increases more rapidly at closer distances
than ones further away. Held et al. (2012) found blur
was an effective depth cue (especially at longer distances).
However, observers may simply have learned to respond
that the blurrier stimulus was further away. Langer and
Siciliano (2015) provided minimal training and obtained
little evidence blur was used as a depth cue. They argued
blur provides ambiguous information: an object can
appear blurred because it is in peripheral vision rather
than because it is far away.
Finally, there is motion parallax, which involves “transformations of the retinal image that are created . . . both when the observer moves (observer-produced parallax) and when objects move with respect to the observer (object-produced parallax)” (Rogers, 2016, p. 1267). For example, when you look out of the window of a moving train, nearby objects appear to move in the opposite direction but distant objects in the same direction. Rogers and Graham (1979) found motion parallax on its own can produce accurate depth judgements.

Figure 2.22
Kanizsa’s (1976) illusory square.

Most research demonstrating the value of motion parallax as a depth cue has used very simple random-dot displays. However,
Buckthought et al. (2017) found comparable effects in more complex and
naturalistic conditions.
Cues such as linear perspective, texture gradient and interposition allow observers to perceive depth even in two-dimensional displays. However, research with computer-generated two-dimensional displays has found depth is often underestimated (Domini et al., 2011). Such displays provide cues to flatness (e.g., binocular disparity, accommodation and vergence, all discussed on pp. 74–75) that may reduce the impact of cues suggesting depth.

KEY TERM
Motion parallax
A depth cue based on movement in one part of the retinal image relative to another.
Binocular cues

KEY TERMS
Binocular disparity
A depth cue based on the slight disparity in the two retinal images when an observer views a scene; it is the basis for stereopsis.
Stereopsis
Depth perception based on the small discrepancy in the two retinal images when a visual scene is observed (binocular disparity).
Autostereogram
A complex two-dimensional image perceived as three-dimensional when not focused on for a period of time.
Amblyopia
A condition in which one eye sends an inadequate input to the visual cortex; colloquially known as lazy eye.
Depth perception does not depend solely on monocular and oculomotor
cues. It can also be achieved by binocular disparity, which is the slight difference or disparity in the images projected on the retinas of the two eyes
when you view a scene (Welchman, 2016). Binocular disparity produces
stereopsis (the ability to perceive the world three-dimensionally).
The great subjective advantage of binocular vision was described by
Susan Barry (2009, pp. 94–132), a neuroscientist who recovered binocular
vision in late adulthood:
[I saw] palpable volume[s] of empty space . . . I could see, not just infer,
the volume of space between tree limbs . . . the grape was rounder and
more solid than any grape I had ever seen . . . Objects seemed more
solid, vibrant, and real.
Stereopsis is very powerful at short distances. However, the disparity or discrepancy in the retinal images of objects decreases by a factor of 100 as
their distance from an observer increases from 2 to 20 metres. Thus, stereopsis rapidly becomes less available at greater distances. While stereopsis
provides valuable information at short distances, we must not exaggerate its importance. Bülthoff et al. (1998) found observers’ recognition of
familiar objects was not adversely affected when stereoscopic information
was scrambled. Indeed, observers were unaware the depth information was
scrambled!
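The rapid fall-off of stereopsis with distance follows from the geometry of binocular viewing (a standard approximation we add here for clarity): for interocular separation a and a small depth interval \( \Delta d \) at viewing distance D, the disparity is roughly

\[ \delta \approx \frac{a \, \Delta d}{D^2} \]

so multiplying the viewing distance by 10 (from 2 to 20 metres) divides the disparity by 100.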
Stereopsis involves matching features in the inputs to the two eyes.
This process is fallible. For example, consider an autostereogram (a
two-dimensional image containing depth information so it appears three-dimensional when viewed appropriately; the Wikipedia entry for autostereogram provides examples).
With autostereograms, the same repeating 2-D pattern is presented to
each eye. If there is a dissociation of vergence and accommodation, two
adjacent patterns will form an object apparently at a different depth from
the background. Some individuals are better than others at perceiving 3-D
objects in autostereograms because of individual differences in binocular
disparity, vergence and accommodation (Gómez et al., 2012).
The most common reason for impaired stereoscopic depth perception is amblyopia (one eye exhibits poor visual acuity; also known as lazy
eye). However, deficient stereoscopic depth perception can also result from
damage to various cortical areas (Bridge, 2016). As Bridge concluded,
intact stereoscopic depth perception requires the following: “(i) both eyes
aligned and functional; (ii) control over the eye muscles and vergence to bring the images into alignment; (iii) initial matching of retinal images; and (iv)
integration of disparity information” (p. 2).
Oculomotor cues
The pictorial cues discussed so far can all be used as well by one-eyed individuals as by those with intact vision. Depth perception also
depends on oculomotor cues based on perceiving muscle contractions
around the eyes. One such cue is vergence (the eyes turn inwards more when focusing on very close objects than on ones further away). Another oculomotor cue is accommodation. It refers to the variation in optical power produced
by the thickening of the eye’s lens when someone focuses on a close object.
Vergence and accommodation are both very limited. First, they only
provide information about the distance of a single object at any given time.
Second, they are both of value only when judging the distance of close
objects. Even then, the information they provide is not very accurate.
Cue combination or integration
So far we have considered depth cues one by one. In the real world,
however, we typically have access to many depth cues. How do we use these
cues? One possibility is additivity (combining or integrating information
from all cues) and another possibility is selection (using information from
only a single cue) (Bruno & Cutting, 1988).
How could we maximise the accuracy of our depth perception? Jacobs
(2002) argued we should assign more weight to reliable cues. Since cues
reliable in one context may be less so in a different context, we should be
flexible when assessing cue reliability. These considerations led Jacobs to
propose two hypotheses:
(1) Less ambiguous cues (i.e., those providing consistent information) are regarded as more reliable than more ambiguous ones.
(2) A cue is regarded as reliable if inferences based on it are consistent with those based on other available cues.

KEY TERMS
Vergence
A cue to depth based on the inward focus of the eyes with close objects.
Accommodation
A depth cue based on changes in optical power produced by thickening of the eye’s lens when an observer focuses on close objects.
Other theoretical approaches resemble that of Jacobs (2002). For example,
Rohde et al. (2016, p. 36) discuss Maximum Likelihood Estimation, which
is “a rule used . . . to optimally combine redundant estimates of a variable [e.g., object distance] by taking into consideration the reliability of each
estimate and weighting them accordingly”. We can extend this approach
to include prior knowledge (e.g., natural light typically comes from above;
many familiar objects have a typical size).
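In its standard mathematical form (our rendering of the rule Rohde et al. describe verbally), if cue i provides estimate \( d_i \) of, say, object distance with variance \( \sigma_i^2 \), the maximum-likelihood combined estimate is a reliability-weighted average:

\[ \hat{d} = \sum_{i=1}^{n} w_i d_i, \qquad w_i = \frac{1/\sigma_i^2}{\sum_{j=1}^{n} 1/\sigma_j^2} \]

A cue twice as reliable as another thus receives twice the weight, and the combined estimate is more reliable than any single cue on its own.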
Finally, there are ideal-observer models (e.g., Landy et al., 2011;
Jones, 2016). Many of these models are based on the Bayesian approach
(see Chapter 13), in which initial probabilities are altered by new data
or information (e.g., presentation of cues). Ideal-observer models involve
making assumptions about the optimal way of combining the cue and
other information available and comparing that against observers’ actual
performance.
As we will see, experimentation has benefitted from advances in virtual
reality technologies. These advances permit researchers to control visual
cues very precisely, thus permitting clear-cut tests of many hypotheses.
Findings
Evidence supporting Jacobs’ (2002) first hypothesis was reported by Triesch
et al. (2002). Observers in a virtual reality situation tracked an object
defined by colour, shape and size. On each trial, two attributes were unreliable or inconsistent (their values changed frequently). Observers attached
increasing weight to the reliable or consistent cue and less to the unreliable
cues during each trial.
Evidence supporting Jacobs’ (2002) second hypothesis was reported
by Atkins et al. (2001). Observers in a virtual reality environment viewed
and grasped elliptical cylinders. There were three cues to cylinder depth:
texture, motion and haptic (relating to the sense of touch).
When the haptic and texture cues indicated the same cylinder depth
but the motion cue indicated a different depth, observers made increasing
use of the texture cue and decreasing use of the motion cue. When the
haptic and motion cues indicated the same cylinder depth but the texture
cue did not, observers increasingly relied on the motion cue rather than the
texture cue. Thus, whichever visual cue correlated with the haptic cue was
preferred, and this preference increased with practice.
Much research suggests observers integrate cue information according
to the additivity notion: they take account of most (or all) cues but attach
additional weight to more reliable ones (Landy et al., 2011). However, these
conclusions are based primarily on studies involving only small conflicts in
the information provided by each cue.
What happens when two or more cues are in strong conflict? Observers
typically rely heavily (or even exclusively) on only one cue, i.e., they use
the selection strategy as defined by Bruno and Cutting (1988; see p. 75).
This makes sense. Suppose one cue suggests an object is 10 metres away
but another cue suggests it is 90 metres away. It is probably not sensible
to split the difference and decide it is 50 metres away! We use the selection
strategy at the movies – perspective and texture cues produce a 3-D
effect, whereas we largely ignore cues (e.g., binocular disparity) indicating
everything on the screen is the same distance from us.
Relevant evidence was reported by Girshick and Banks (2009) in a
study on slant perception. When there was a small conflict between the information provided by binocular disparity and texture gradient cues, observers used information from both. However, when there was a large conflict between these cues, perceived slant was determined exclusively by one cue (binocular disparity or texture gradient). Interestingly, the observers were not consciously aware of the large conflict between the cues.
Do observers combine information from different cues to produce
optimal performance (i.e., accurate depth perception)? Lovell et al. (2012)
compared the effects of binocular disparity and shading on depth perception. Overall, binocular disparity was the more informative cue to depth,
but Lovell et al. tested the effects of making it less reliable. Information
from the cues was combined optimally, with observers consistently
attaching more weight to reliable cues.
Many other studies have also reported that observers’ depth perception is close to optimal. However, there are several studies where observers
performed less impressively (Rahnev & Denison, 2018). For example, Chen
and Tyler (2015) carried out a similar study to that of Lovell et al. (2012).
Observers’ depth judgements were strongly influenced by shading but made
very little use of binocular disparity information.
KEY TERM
Haptic
Relating to the sense of
touch.
Evaluation
Much has been learned about the numerous cues observers use to estimate
depth or distance. Information from different depth cues is typically combined or integrated in studies assessing depth perception. There is also evidence that one cue often dominates the others when different cues conflict
strongly.
Overall, as Brenner and Smeets (2018, p. 385) concluded, “By combining the many sources of information in a clever manner people obtain
quite reliable judgments that are not too sensitive to violations of the
assumptions of the individual sources of depth information.” More specifically, observers generally attach most weight to cues providing reliable
information consistent with that provided by other cues. If a cue becomes
more or less reliable over time, observers generally increase or decrease its
weighting appropriately. Overall, depth perception often appears close to
optimal.
What are the limitations of theory and research on cue integration?
First, we typically estimate distance in real-life settings where numerous
cues are present and there are no large conflicts among them. In contrast,
laboratory settings often provide only a few cues and these cues sometimes provide very discrepant information. The unfamiliarity of laboratory
settings may sometimes cause suboptimal performance by observers and
reduce generalisation to everyday life (Landy et al., 2011).
Second, the assumption that observers process several essentially
independent cues before integrating all the information is dubious. It may
apply when observers view a very limited and artificial visual display.
However, natural environments typically provide observers with very rich
information. In such environments, visual processing probably depends
more on a global assessment of the overall structure of the environment and less on processing of specific depth cues than usually assumed
(Sedgwick & Gillam, 2017). There are also issues concerning the meaning
of the word “cue”. For example, “Stereopsis is not a cue. It encompasses
all the ways images of a scene differ in the two eyes” (Sedgwick & Gillam,
2017, p. 81).
Third, ideal-observer models differ in the assumptions used to compute
“ideal” performance and the meaning of “optimal” combining of cues in
depth perception (Rahnev & Denison, 2018). Most models focus on the
accuracy of depth-perception judgements. However, there are circumstances (e.g., presence of a fierce wild animal) where rapid if somewhat
inaccurate judgements are preferable. More generally, humans focus on
“computational efficiency” – our goal is to maximise reward while minimising the computational costs of visual processing (Summerfield & Li,
2018). Thus, optimality of depth-perception judgements does not depend
solely on performance accuracy.
KEY TERM
Size constancy
Objects are perceived
to have a given size
regardless of the size of
the retinal image.
Size constancy
Size constancy is the tendency for any given object to appear the same
size whether its size in the retinal image is large or small. For example, if
someone walks towards you, their retinal image increases progressively but
their apparent size remains the same.
Why do we show size constancy? Many factors are involved. An
object’s apparent distance is especially important when judging its size. For
example, an object may be judged to be large even though its retinal image
is very small provided it is far away. According to the size-distance invariance hypothesis (Kilpatrick & Ittelson, 1953), perceived size is proportional to perceived distance.
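In symbols (our paraphrase of the hypothesis), if \( \theta \) is the visual angle subtended by an object’s retinal image and \( D' \) its perceived distance, then perceived size is

\[ S' \propto \theta \times D' \]

Thus an object casting a small retinal image can still be perceived as large provided its perceived distance is great, and an error in perceived distance (as in the Ames room discussed below) produces a corresponding error in perceived size.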
Findings
Haber and Levin (2001) argued that an object’s perceived size depends on
memory of its familiar size as well as perceptual information concerning its
distance. Initially, observers estimated the sizes of common objects with great
accuracy from memory. Then they saw various objects at close (0–50 metres)
or distant (50–100 metres) viewing range and made size judgements. Some
familiar objects were almost invariant in size (e.g., bicycle), whereas others varied in size (e.g., television set); there were also unfamiliar stimuli (e.g., ovals).
What findings would we expect? If familiar size is important, size judgements should be more accurate for objects of invariant size than those of
variable size, with size judgements least accurate for unfamiliar objects. If
distance perception is all-important (and known to be more accurate for
nearby objects), size judgements should be better for all object categories
at close viewing range.
Haber and Levin (2001) found that size judgements were much better
with objects having an invariant size than those having a variable size (see
Figure 2.23). In addition, the viewing distance had a minimal effect on size
judgements. Both of these findings are contrary to predictions from the
size-distance invariance hypothesis.
If size judgements depend on perceived distance, size constancy should
not be found when an object’s perceived distance differs considerably from
Figure 2.23
Accuracy of size
judgements as a function
of object type (unfamiliar;
familiar variable size;
familiar invariant size) and
viewing distance (0–50
metres vs 50–100 metres).
Based on data in Haber and
Levin (2001).
Figure 2.24
(a) A representation of the
Ames room; (b) an actual
Ames room showing the
effect achieved with two
adults.
Photo Peter Endig/dpa/Corbis.
its actual distance. The Ames room (Ames, 1952; see Figure 2.24) provides a good example. It has a peculiar shape: the floor slopes and the
rear wall is not at right angles to the adjoining walls. Nevertheless, the
Ames room creates the same retinal image as a normal rectangular room
when viewed monocularly through a peephole. The fact that one end of
the rear wall is much further away from the viewer is disguised by making
it much higher.
The cues suggesting the rear wall is at right angles to observers are so
strong they mistakenly assume two adults standing in the corners by the
KEY TERM
Ames room
A very distorted room that
nevertheless looks normal
under certain viewing
conditions.
rear wall are at the same distance (see photograph). They thus estimate
the size of the nearer adult as much greater than that of the adult further
away. See the Ames room on YouTube: “Ramachandran – Ames room
illusion explained”.
The illusion effect with the Ames room is so great someone walking
backwards and forwards in front of the rear wall seems to grow and shrink
as they move! Thus, perceived distance apparently determines perceived
size. However, this effect is reduced when the person walking along the
rear wall is a man and the observer is a female having a close emotional
relationship with him. This is known as the Honi phenomenon because it
was first experienced by a woman (whose nickname was Honi) when she
saw her husband in the Ames room.
Similarly dramatic findings were reported by Glennerster et al. (2006).
Participants walked through a virtual-reality room as it expanded or contracted considerably. Even though they had detailed information from
motion parallax and stereo cues to indicate the room’s size was changing, no
participants noticed the changes! There were large errors in participants’
judgements of the sizes of objects at longer distances because of their powerful expectation the size of the room would not alter.
Some evidence discussed so far has been consistent with the assumption of the size-distance invariance hypothesis that perceived size depends
on perceived distance. However, many other findings are inconsistent
(Kim, 2017b). For example, Kim et al. (2016) obtained size and distance
estimates from observers for objects placed in various tunnels. Size and
distance were perceived independently (i.e., depended on different factors).
In contrast, the size-distance invariance hypothesis predicts that perceived
size and perceived distance should depend on each other and thus should
not be independent.
Kim (2018) obtained similar findings when observers viewed a virtual
object presented stereoscopically. Their size judgements were more accurate than their distance judgements, with each judgement depending on its
own information source.
More evidence inconsistent with the size-distance invariance hypothesis was reported by Makovski (2017). Participants were presented with
stimuli such as those shown in Figure 2.25 on a monitor. Even though perceived distance was the same for all stimuli, “open” objects (having missing
boundaries) were perceived as much larger than “closed” objects (with all
boundaries intact). This is the open-object illusion in which observers
extend the missing boundaries. This may resemble our common perception
that open windows make a room seem larger.
Van der Hoort et al. (2011) found evidence for the body size
effect, in which the size of a body mistakenly perceived to be one’s
own influences the perceived sizes of objects. Participants equipped with
head-mounted displays connected to CCTV cameras saw the environment from the perspective of a doll (see Figure 2.26). The doll was small or
large.
Van der Hoort et al. (2011) found objects were perceived as larger
and further away when the doll was small than when it was large. These
effects were greater when participants misperceived the body as their own
(this was achieved by having the bodies of the participants and the doll
KEY TERMS
Honi phenomenon
The typical apparent
size changes when an
individual walks along
the rear wall of the Ames
room are reduced when
female observers view a
man to whom they are
very close emotionally.
Open-object illusion
The misperception that
objects with missing
boundaries are larger
than objects the same
size without missing
boundaries.
Body size effect
An illusion in which
misperception of one’s
own bodily size causes the
perceived size of objects
to be misjudged.
touched at the same time). Thus, size and distance perception depend partly on our lifelong experience of seeing everything from the perspective of our own body.
Tajadura-Jiménez et al. (2018) extended the above findings. Participants experienced having the body of a 4-year-old child or an adult with the body scaled down to match the height of the child’s body. Object size was overestimated more in the child-body condition, indicating that object size is influenced by higher-level cognitive processes (i.e., age perception).
Evaluation
Figure 2.25
Top: stimuli presented to participants; bottom: example of the stimulus display.
From Makovski (2017).

Size perception and size constancy sometimes depend on perceived distance. Some of the strongest evidence comes from research where misperceptions of distance (e.g., in the Ames room; in virtual environments) produce systematic distortions in perceived size. However, several other factors also influence size perception. These include familiar size, one’s perceived body size and whether objects do (or do not) contain missing boundaries.
boundaries.
What are the limitations of research and theory on size perception? First, psychologists have discovered fewer sources of information accounting for size perception than depth perception. In addition, as Kim (2017b, p. 2) pointed out, "The efficacy of the few information sources that have been identified for size perception is questionable." Second, while the size-distance invariance hypothesis remains influential, there is a "vast literature demonstrating independence of perceived size and distance" (Kim, 2018, p. 17).

Figure 2.26
What participants in the doll experiment could see. From the viewpoint of a small doll, objects such as a hand look much larger than when seen from the viewpoint of a large doll. This exemplifies the body size effect.
From Van der Hoort et al. (2011). Public Library of Science. With kind permission from the author.
PERCEPTION WITHOUT AWARENESS:
SUBLIMINAL PERCEPTION
Can we perceive aspects of the visual world without any conscious awareness that we are doing so? In other words, is there such a thing as subliminal perception (stimulus perception occurring even though the stimulus is below the threshold of conscious awareness)? Common sense suggests the answer is "No". However, much research evidence suggests the answer is "Yes". We must nevertheless use terms carefully: a thermostat responds appropriately to temperature changes and so could be said to exhibit unconscious perception!
Much important evidence has come from blindsight patients with
damage to early visual cortex (V1), an area of crucial importance to
visual perception (discussed on pp. 45–46). Blindsight refers to patients' ability to "detect, localise, and discriminate visual stimuli in their blind field, despite denying being able to see the stimuli" (Mazzi et al., 2016, p. 1).

KEY TERM
Subliminal perception
Perceptual processing occurring below the level of conscious awareness that can nevertheless influence behaviour.
In what follows, we initially consider blindsight patients. After that,
we discuss evidence of subliminal perception in healthy individuals.
KEY TERM
Blindsight
The ability to respond appropriately to visual stimuli in the absence of conscious visual experience in patients with damage to the primary visual cortex.
Blindsight
Many British soldiers in the First World War who had been blinded by
gunshot wounds that destroyed their primary visual cortex (V1 or BA17)
were treated by George Riddoch, a captain in the Royal Army Medical
Corps. These soldiers responded to motion in those parts of the visual field
in which they claimed to be blind. The apparently paradoxical nature of
their condition was neatly captured by Weiskrantz et al. (1974), who coined
the term “blindsight”.
How is blindsight assessed? Various approaches have been taken but
there are generally two measures. First, there is a forced-choice test in
which patients guess (e.g., stimulus present or absent?) or point at stimuli
they cannot see. Second, there are patients’ subjective reports that they
cannot see stimuli presented to their blind region. Blindsight is typically
defined by an absence of self-reported visual perception accompanied by
above-chance performance on the forced-choice test.
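This two-measure logic can be made concrete with a short calculation. The sketch below is illustrative only: the trial numbers are hypothetical and do not come from any particular patient. It uses a binomial test to ask whether forced-choice guessing in the "blind" field exceeds chance:

from scipy.stats import binomtest

# Hypothetical data: 70 correct guesses in 100 two-alternative
# forced-choice trials ("stimulus present or absent?") in the blind field.
n_trials, n_correct, chance = 100, 70, 0.5

result = binomtest(n_correct, n_trials, p=chance, alternative="greater")
print(f"Proportion correct: {n_correct / n_trials:.2f}")
print(f"p-value against chance guessing: {result.pvalue:.4f}")

# Blindsight is inferred when performance is reliably above chance
# (small p-value) AND the patient reports seeing nothing at all.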
IN THE REAL WORLD: BLINDSIGHT PATIENT DB
Much early research on blindsight involved a patient, DB, who was studied intensively by Larry Weiskrantz and remains one of the most thoroughly studied blindsight patients (see Weiskrantz, 2010, for a historical review). DB was blind in the lower part of his left visual field as a result of surgery removing part of his right occipital cortex (including most of the primary visual cortex, BA17) to relieve his frequent and very severe migraine attacks. DB could detect the presence of an
object and could indicate its approximate location by pointing. He could also discriminate between
moving and stationary objects and could distinguish vertical from horizontal lines. However, DB’s
abilities were limited – he could not distinguish between different-sized rectangles or between
triangles having straight and curved sides. Such findings suggest DB processed only low-level features of visual stimuli and could not discriminate form.
We have seen DB showed some ability to perform various visual tasks. However, he reported
no conscious experience in his blind field. According to Weiskrantz et al. (1974, p. 721), “When
he was shown a video film of his reaching and judging orientation of lines [by presenting it to his
intact visual field], he was openly astonished.”
Campion et al. (1983) pointed out that DB and other blindsight patients are only partially blind.
They favoured the stray-light hypothesis, according to which patients respond to light reflected
from the environment onto areas of the visual field still functioning. This hypothesis implies DB
should have shown reasonable visual performance when objects were presented to his blind spot
(the area where the optic nerve passes through the retina). However, DB could not detect objects
presented to his blind spot.
We must not exaggerate patients’ preserved visual abilities. Indeed,
their visual abilities in their blind field are so poor that a seeing person
with comparable impairment would be legally classified as blind.
What do blindsight patients experience?
It is surprisingly hard to decide exactly what blindsight patients experience when presented with visual stimuli to their blind field. For example,
the blindsight patient GY described his experiences as “similar to that of a
normally sighted man who, with his eyes shut against sunlight, can perceive
the direction of motion of a hand waved in front of him” (Beckers & Zeki,
1995, p. 56).
On another occasion GY was asked about his qualia (sensory experiences). He said, “That [experience of qualia] only happens on very easy
trials, when the stimulus is very bright. Actually, I’m not sure I really have
qualia then” (Persaud & Lau, 2008, p. 1048).
There is an important distinction between type-1 and type-2 blindsight. Type-1 blindsight occurs when patients have no conscious awareness
of visual stimuli presented to the blind field. In contrast, type-2 blindsight occurs when patients have some residual awareness (although very
different from that of healthy individuals). For example, a patient, EY,
“sensed a definite pinpoint of light”, although “it looks like nothing at all”
(Weiskrantz, 1980). Another patient, GY, said, “You don’t actually ever
sense anything or see anything . . . it’s more an awareness but you don’t
see it” (Weiskrantz, 1997). Many patients exhibit type-1 blindsight on some
occasions but type-2 blindsight on others.
Findings: evidence for blindsight
Numerous studies have assessed the perceptual abilities of blindsight
patients. Here we briefly consider three illustrative studies. As indicated
already, blindsight patients often perform better when guessing an object’s
direction of motion than its perceptual qualities (e.g., form; colour). For
example, Chabanat et al. (2019) studied a blindsight patient, SA. He was
correct 98% of the time when reporting an object’s direction of motion but
performed at chance level when reporting its colour.
GY (discussed earlier) is a much-studied blindsight patient. He has
extensive damage to the primary visual cortex in the left hemisphere. In
one study (Persaud & Cowey, 2008), GY was presented with a stimulus in the upper or lower part of his visual field. On inclusion trials, he
was instructed to report the part of the visual field to which the stimulus had been presented. On exclusion trials, GY was instructed to report
the opposite of its actual location (e.g., “up” when it was in the lower
part).
GY tended to respond with the real rather than the opposite location on both exclusion and inclusion trials, suggesting he had access to location information but lacked any conscious awareness of it (see Figure 2.27). In contrast, healthy individuals showed a large difference in performance on inclusion and exclusion trials, indicating they had conscious access to location information.
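The logic behind such estimates can be written in process-dissociation style equations (in the spirit of Jacoby's procedure; this is our illustrative formalisation, and Persaud and Cowey's exact model may differ). If conscious location knowledge $C$ always controls the instructed response, while unconscious knowledge $U$ drives a response towards the actual location whenever conscious knowledge fails, then

$$P(\text{correct} \mid \text{inclusion}) = C + U(1-C), \qquad P(\text{actual location} \mid \text{exclusion}) = U(1-C)$$

so $C$ can be estimated by subtracting the exclusion failure rate from inclusion accuracy. For GY's blind field $C$ is close to zero: his responses tracked the stimulus location whether or not he was instructed to respond with the opposite location.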
Figure 2.27
Estimated contributions of
conscious and subconscious
processing to GY’s
performance in exclusion
and inclusion conditions in
his normal and blind fields.
Reprinted from Persaud and
Cowey (2008). Reprinted with
permission from Elsevier.
Persaud et al. (2011) manipulated the stimuli presented to GY so his
visual performance was comparable in both fields. However, GY indicated
conscious awareness of far more stimuli in the intact field than the blind
one (43% of trials vs 3%, respectively). GY had substantially more activation in the prefrontal cortex and parietal areas to targets presented in the
intact field, suggesting those targets were processed much more thoroughly.
Blindsight vs degraded conscious vision
Some researchers argue blindsight patients exhibit degraded vision rather
than a total absence of conscious awareness of “blind” field stimuli. For
example, Overgaard et al. (2008) asked a blindsight patient, GR, to decide
whether a triangle, circle or square had been presented to her blind field.
In one experiment, GR simply responded “yes” or “no”. In another experiment, Overgaard et al. used a 4-point Perceptual Awareness Scale: “clear
image”, “almost clear image”, “weak glimpse” and “not seen”.
Using the yes/no measure, GR indicated she had not seen the stimulus on 79% of trials. However, she identified it correctly 46% of the time.
These findings suggest the presence of type-1 blindsight. With the 4-point
scale, in contrast, GR was correct 100% of the time when she had a clear
image, 72% of the time when her image was almost clear, 25% when she
had a weak glimpse and 0% when the stimulus was not seen. If the “clear
image” and “almost clear image” data are combined, GR claimed awareness of the stimulus on 54% of trials, on 83% of which she was correct.
Thus, the use of a sensitive method (the 4-point scale) suggested much of
GR’s apparent blindsight reflected degraded conscious vision.
Ko and Lau (2012) argued blindsight patients have more conscious
visual experience than usually assumed. Their key assumption was as
follows: “Blindsight patients may use an unusually conservative criterion for detection, which results in them saying ‘no’ nearly all the time
to the question of ‘do you see something?’” (Ko & Lau, 2012, p. 1402).
This excessive caution may occur in part because damage to the prefrontal
cortex impairs their ability to set the criterion for visual detection appropriately. Their excessive conservatism or caution may explain why the
reported visual experience of blindsight patients is so discrepant from their
forced-choice perceptual performance.
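In signal-detection terms, Ko and Lau's argument concerns the criterion rather than sensitivity. With hit rate $H$ (responding "yes" when a stimulus is present), false-alarm rate $F$, and $z$ the inverse of the standard normal distribution function, the standard indices are:

$$d' = z(H) - z(F), \qquad c = -\tfrac{1}{2}\,[\,z(H) + z(F)\,]$$

On this account, a patient may retain positive sensitivity ($d' > 0$) in the "blind" field while adopting a very conservative criterion (large positive $c$), and so almost never reports seeing anything despite above-chance forced-choice performance.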
Ko and Lau’s (2012) theoretical position is supported by Overgaard
et al.’s (2008) finding (discussed on p. 84) that blindsight patients were
very reluctant to admit to having seen stimuli presented to their blind
field. They also cited research supporting their assumption that blindsight
patients often have prefrontal damage.
Mazzi et al. (2016) carried out a study resembling that of Overgaard et al. (2008) on another blindsight patient, SL, who showed no activity in the primary visual cortex (V1). SL decided which of two features (e.g., red or
green colour) was present in a stimulus. When she indicated whether she
had seen the stimulus or was merely guessing, her guessing performance
was significantly above chance, suggestive of type-1 blindsight. However,
when she indicated her awareness using the 4-point Perceptual Awareness
Scale, her visual performance was at chance level when she reported no
awareness of the stimulus. These findings suggest an absence of blindsight. The title of Mazzi et al.’s article provides the take-home message:
“Different measures tell a different story” (p. 1).
What can we conclude? Overgaard and Mogensen (2015, p. 37) argued
that “rudimentarily analysed visual information is available in blindsight”
but typically does not lead to conscious awareness. However, such information can produce conscious awareness if the patient uses much effort
and top-down control.
Two findings support this approach. First, blindsight patients generally do not regard their experiences as “visual” because they differ so
much from normal visual perception. Second, there is much evidence
(Overgaard & Mogensen, 2015) that blindsight patients show enhanced
visual performance (and sometimes subjective awareness) after training.
This occurs because they make increasingly effective use of the rudimentary visual information available to them.
Blindsight and the brain
As indicated above, the main brain damage in blindsight patients is to V1
(the primary visual cortex). As we saw earlier in the chapter (p. 47), visual
processing typically proceeds from V1 (BA17) to other brain areas (e.g.,
V2, V3, V4; see Figure 2.4). Of importance, stimuli presented to the “blind”
field often produce some activation in these other brain areas. However,
this activation is not associated with visual awareness in blindsight patients.
On p. 48 we discussed research by Hurme et al. (2017)
designed to clarify the role of V1 (the primary visual cortex) in the visual
perception of healthy individuals. Transcranial magnetic stimulation
applied to the primary visual cortex to reduce its efficiency disrupted
unconscious and conscious vision. In a similar study, Hurme et al. (2019)
found TMS applied to V1 prevented conscious and unconscious motion
perception in healthy individuals.
In view of the above findings, how is it that many blindsight patients
provide evidence of unconscious visual and motion processing? Part of the
answer lies within the lateral geniculate nucleus of the thalamus, an intermediate relay station between the eye and V1 (see Figure 2.28). Ajina et al.
(2015) divided patients with V1 damage into those with or without blindsight. All those with blindsight had intact connections between LGN and
MT/V5 (blue arrow in the figure) whereas those connections were impaired in patients without blindsight. This finding matters given the crucial importance of MT/V5 for motion perception.

Figure 2.28
The areas of most relevance to blindsight are the lateral geniculate nucleus (LGN) and the middle temporal visual area (MT/V5). The structure close to the LGN is the pulvinar. The figure also labels areas V1-V4, TE and TEO, the superior colliculus, and the dorsal and ventral streams.
From Tamietto and Morrone (2016).
Celeghin et al. (2019) reported a meta-analysis (see Glossary) providing
a fuller account of the brain areas associated with patients’ visual processing. They identified 14 such areas. Some of these areas (e.g., the LGN; the
pulvinar) are critical for non-conscious motion perception, whereas others
(e.g., superior temporal gyrus; amygdala) are involved in non-conscious
emotion processing. Overall, the meta-analysis strongly suggested that
blindsight typically consists of several non-conscious visual abilities rather
than one. Of interest, prefrontal areas (e.g., dorsolateral prefrontal cortex)
often associated with conscious visual perception (see Chapter 16) were
not activated during visual processing by blindsight patients. These findings support the view that visual processing in these patients is typically
unaccompanied by conscious experience.
Finally, Celeghin et al. (2019) discussed evidence that there is substantial reorganisation of brain connectivity in many blindsight patients following damage to V1 (primary visual cortex). For example, consider the
blindsight patient, GY, whose left V1 was destroyed. He has nerve fibre
connections between the undamaged right lateral geniculate nucleus and
the contralesional (opposite side of the body) visual motion area MT/V5
(Bridge et al., 2008) – connections not present in healthy individuals. Such
reorganisation helps to explain the visual abilities displayed by blindsight
patients.
Evaluation
Much has been learned about the nature of blindsight. First, two main
types of blindsight have been identified. Second, evidence for the existence
of blindsight often depends on the precise measure of visual awareness
used. Third, brain connections important in blindsight (e.g., between the
lateral geniculate nucleus and MT/V5) have been discovered. Fourth, the
visual abilities of many blindsight patients probably depend on the reorganisation of connections within the brain following damage to the primary
visual cortex. Fifth, the assumption that visual processing is rudimentary in
blindsight patients explains many findings. Sixth, research on blindsight has
shed light on the many visual pathways that bypass V1 but whose functioning can be overshadowed by pathways involving V1 (Celeghin et al., 2019).
What are the limitations of research in this area? First, there are considerable differences among blindsight patients, with several apparently
possessing some conscious visual awareness in their allegedly blind field.
Second, many blindsight patients have more conscious visual experience
in their “blind” field than appears from yes/no judgements about stimulus
awareness. This probably happens because they are excessively cautious
about claiming to have seen a stimulus (Mazzi et al., 2016; Overgaard
et al., 2008).
Third, the extent to which blindsight patients have degraded vision
remains controversial. Fourth, the existence of reorganisation within the
brain in blindsight patients (e.g., Bridge et al., 2008) may limit the applicability of findings from such patients to healthy individuals.
Subliminal perception
In research on subliminal perception in visually intact individuals, a performance measure of perception (e.g., enhanced speed or accuracy of responding) is typically compared with an awareness measure. We can distinguish
between subjective and objective measures of awareness: subjective measures involve self-reports concerning observers’ awareness, whereas objective
measures involve forced-choice responses (e.g., did the stimulus belong to
category A or B?) (Hesselmann, 2013). As Shanks (2017, p. 752) argued,
“Unconscious processing [subliminal perception] is inferred when above-chance performance is combined with null awareness.”
For example, Naccache et al. (2002) had observers decide rapidly
whether a visible target digit was smaller or larger than 5. Unknown to
them, an invisible masked digit on the same side of 5 as the target (congruent) or the other side (incongruent) was presented immediately before the
target. There were two main findings. First, responses to the target digits
were faster on congruent than incongruent trials (performance measure).
Second, no participants reported seeing any masked digits (subjective
awareness measure) and their performance was at chance level when guessing whether masked digits were below or above 5 (objective awareness
measure). These findings suggested the existence of subliminal perception.
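The two measures in such priming studies can be summarised in a few lines of analysis code. This is a hypothetical sketch (all numbers and variable names are invented for illustration; it is not Naccache et al.'s actual analysis):

import numpy as np

# Hypothetical mean reaction times (ms) to the visible target digit
rt_congruent = np.array([452, 438, 467, 449, 455])
rt_incongruent = np.array([478, 471, 490, 466, 483])

# Performance measure: the congruency (priming) effect.
# A positive effect implies the masked digit was processed.
priming_ms = rt_incongruent.mean() - rt_congruent.mean()
print(f"Priming effect: {priming_ms:.0f} ms")

# Objective awareness measure: guessing whether the masked digit was
# above or below 5 should sit at 50% if perception is truly subliminal
# (in practice tested with a binomial test, as sketched earlier).
guess_accuracy = 0.51  # hypothetical proportion correct
print(f"Guessing accuracy: {guess_accuracy:.2f} (chance = 0.50)")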
Findings
Persaud and McLeod (2008) tested the notion that only information perceived with awareness can control our actions. They presented the letter “b”
or “h” for 10 ms (short interval) or 15 ms (long interval). In the key condition, participants were instructed to respond with the letter not presented.
For example, if they were aware “b” had been presented, they would say
“h”. The rationale was that only participants consciously aware of the letter
could inhibit saying it.
Persaud and McLeod (2008) found participants responded correctly
with the non-presented letter on 83% of long-interval trials indicating reasonable conscious awareness. In contrast, participants responded correctly
on only 43% of short-interval trials (significantly below chance) suggesting
some stimulus processing but an absence of conscious awareness.
An important issue is whether perceptual awareness is all-or-none
(i.e., present or absent) or graded (i.e., varying in extent). Evidence suggesting it is graded was reported by Sandberg et al. (2010). One of four
shapes was presented very briefly followed by masking. Observers made
a behavioural response (deciding which shape had been presented) followed by one of three subjective measures: (1) clarity of perceptual experience (the Perceptual Awareness Scale); (2) confidence in their decision;
and (3) wagering variable amounts of money on having made the correct
decision.
What did Sandberg et al. (2010) find? First, above-chance task performance sometimes occurred without reported awareness with all three
subjective measures. Second, the Perceptual Awareness Scale predicted performance better than the other measures, probably because it was the most
sensitive measure of conscious experience.
The partial awareness hypothesis (Kouider et al., 2010) potentially
explains graded perceptual experience. According to this hypothesis, perceptual awareness can be limited to low-level features (e.g., colour) while
excluding high-level features (e.g., face identity). Supportive evidence was
reported by Gelbard-Sagiv et al. (2016) with faces coloured blue or green.
They used continuous flash suppression (CFS): a stimulus presented to one
eye cannot be seen consciously when rapidly changing patterns are presented to the other eye. Observers often had conscious awareness of the
colour of faces they could not identify.
Koivisto and Grassini (2016) presented stimuli to one of four locations. Observers then made a forced-choice response concerning the stimulus location and rated their subjective visual awareness of the stimulus
on a 3-point version of the Perceptual Awareness Scale (discussed above).
Of central importance was the no-awareness category (i.e., “I did not see
any stimulus”). The finding that observers were correct on 38% of trials
associated with no awareness (chance performance = 25%) was apparent
evidence for subliminal perception.
However, there is an alternative explanation. According to Koivisto and
Grassini (2016, p. 241), the above finding occurred mainly when “observers
were very weakly aware of the stimulus, but behaved conservatively and
claimed not having seen it”. This conservatism is known as response bias.
Two findings supported this explanation. First, nearly all the observers
showed response bias on no-awareness trials (see Figure 2.29).
Second, Koivisto and Grassini (2016) used event-related potentials.
The N200 (a negative wave 200 ms after stimulus presentation) is typically substantially larger for stimuli associated with awareness. Of key
importance, the N200 was greater on no-awareness correct trials than
9781138482210_COGNITIVE_PSYCHOLOGY_PART_1.indd 88
28/02/20 6:43 PM
89
Basic processes in visual perception
Figure 2.29
The relationship between
response bias in reporting
conscious awareness (C)
and enhanced N200 on
no-awareness correct trials
compared to no-awareness
incorrect trials (UC).
–6
r = –0.53
UC (µV)
–4
From Koivisto and Grassini
(2016). Reprinted with
permission of Elsevier.
–2
0
2
–0.5
0
0.5
1
1.5
C
no-awareness incorrect trials for observers with high-response bias but not
those with low-response bias (see Figure 2.29).
In sum, Koivisto and Grassini (2016) provided a coherent explanation for the finding that visual performance was well above chance on
no-awareness trials. Observers often had weak conscious awareness on
correct no-awareness trials (indicated by the N200 findings). Such weak
conscious awareness occurred most frequently among those most biased
against claiming to have seen the stimulus.
Neuroimaging research has consistently shown that stimuli of which
the observers are unaware nevertheless produce activation in several brain
areas. In one study (Rees, 2007), activation was assessed in brain areas
associated with face processing and with object processing while invisible
pictures of faces or houses were presented. The identity of the picture (face
vs house) could be predicted with almost 90% accuracy from patterns of
brain activation. Thus, subliminal stimuli can be processed reasonably
thoroughly by the visual system.
Research focusing on differences in brain activation between conditions where there is (or is not) conscious perceptual awareness is discussed
thoroughly in Chapter 16. Here we will mention two major findings. First,
there is much less integrated or synchronised brain activation when there is
no conscious perceptual awareness (e.g., Godwin et al., 2015; Melloni et al.,
2007). Second, activation of areas within the prefrontal cortex (involved in
integrating brain activity) is much greater for consciously perceived visual
stimuli than those not consciously perceived (e.g., Gaillard et al., 2009;
Godwin et al., 2015).
What do these findings mean? They strongly suggest processing is
predominantly limited to low-level features (e.g., colour; motion) when
stimuli are not consciously perceived, which is consistent with the partial awareness hypothesis (Kouider et al., 2010).
Evaluation
Evidence for unconscious or subliminal perception has been reported in
numerous studies using numerous tasks. Some evidence is behavioural (e.g.,
Naccache et al., 2002; Persaud & McLeod, 2008) and some is based on patterns of brain activity (e.g., Melloni et al., 2007; Rees, 2007). The latter line
of research suggests there can be considerable low-level processing of visual
stimuli in the absence of conscious visual awareness. In spite of limitations
of research in this area (see below), there is reasonably strong evidence for
subliminal perception.
What are the limitations of research on subliminal perception? First,
measures of conscious awareness vary in sensitivity. As a consequence, it
is relatively easy for researchers to apparently demonstrate the existence
of subliminal perception by using an insensitive measure (Rothkirch &
Hesselmann, 2017).
Second, many researchers focus on observers whose verbal reports
show a lack of awareness. That would be appropriate if such reports were
totally reliable. However, such reports are somewhat unreliable, meaning that some observers would report awareness if they provided a second verbal
report (Shanks, 2017). In addition, limitations of attention and memory
may sometimes cause observers to omit some of their conscious experience from their verbal reports (Lamme, 2010).
Third, many claimed demonstrations of subliminal perception are
flawed because of the typical failure to consider and/or control response
bias (Peters et al., 2016). In essence, observers with response bias may
claim to have no conscious awareness of visual stimuli when they actually
have partial awareness (Koivisto & Grassini, 2016).
Fourth, Breitmeyer (2015) identified 24 different methods used to make
visual stimuli inaccessible to visual awareness. Neuroimaging and other
techniques have been used to estimate the amount of unconscious processing associated with each method. Some methods (e.g., object-substitution
masking: a visual stimulus is replaced by dots surrounding it) are associated with much more unconscious processing than others (e.g., binocular
rivalry, see Glossary). Of key relevance here, the likelihood of obtaining
evidence for subliminal perception depends substantially on the method
used to suppress visual awareness.
CHAPTER SUMMARY
• Vision and the brain. In the retina, there are cones (specialised
for colour vision) and rods (specialised for motion detection).
The retina-geniculate-striate pathway between the eye and
cortex is divided into partially separate P and M pathways. The
dorsal stream (associated with the M pathway) terminates in
the parietal cortex and the ventral stream (associated with the
P pathway) terminates in the inferotemporal cortex. There are numerous interactions between the two pathways and the two streams.
According to Zeki’s functional specialisation theory, different
cortical areas are specialised for different visual functions (e.g.,
form; colour; motion). This is supported by findings from patients
with selective visual deficits (e.g., achromatopsia; akinetopsia).
However, much visual processing depends on large brain networks
rather than specific areas and Zeki de-emphasised the importance
of top-down (recurrent) processing. It remains unclear how we
integrate the outputs of different visual processes (the binding
problem). However, selective attention, synchronised neural activity
and combining bottom-up (feedforward) processing and top-down
(recurrent) processing all play a role.
• Two visual systems: perception-action model. Milner and
Goodale identified a vision-for-perception system based on
the ventral stream and a vision-for-action system based on the
dorsal stream. There is limited (and inconsistent) support for
the predicted double dissociation between patients with optic
ataxia (damage to the dorsal stream) and visual form agnosia
(damage to the ventral stream). Illusory effects found when
perceptual judgements are made (ventral stream) are often much
reduced when grasping or pointing responses are used (dorsal
stream).
However, such findings are often hard to interpret, and
visually guided action often relies more on the ventral stream than
acknowledged theoretically. More generally, the two visual systems
interact with each other much more than previously assumed and
there are probably more than two visual pathways.
• Colour vision. Colour vision helps us detect objects and make
fine discriminations among them. According to dual-process
theory, there are three types of cone receptors and three types
of opponent processes (green-red; blue-yellow; white-black). This
theory explains negative afterimages and colour deficiencies but is
oversimplified. Colour constancy occurs when a surface’s perceived
colour remains the same when the illuminant changes. Colour
constancy is influenced by our ability to assess the illuminant
accurately; local colour contrast; familiarity of object colour;
chromatic adaptation; and cone-excitation ratios. Most theories are
more applicable to colour vision with simple artificial stimuli than
complex objects in the natural world.
• Depth perception. There are numerous monocular cues to depth
(e.g., linear perspective; texture; familiar size) plus oculomotor
and binocular cues. Cues are sometimes combined additively in
depth perception. However, more weight is generally given to
reliable cues than unreliable ones with weightings changing if a
cue’s reliability alters. However, one cue often dominates all others
when different cues conflict strongly. It is often assumed that
observers generally combine cues near-optimally, but it is hard to
define “optimality”. The assumption that observers process several
independent cues prior to integrating all the information is probably wrong in natural environments, which provide rich information about overall environmental structure.
Size perception is sometimes strongly influenced by perceived
distance as predicted by the size-distance invariance hypothesis.
However, the impact of familiar size on depth perception cannot
be explained by that hypothesis. More generally, perceived size
and perceived distance often depend on different factors.
• Perception without awareness: subliminal perception. Patients
with extensive damage to V1 sometimes suffer from blindsight.
This is a condition involving some ability to respond to visual
stimuli in the absence of normal conscious visual awareness
(especially motion detection). There is no conscious awareness in
type-1 blindsight but some residual awareness in type-2 blindsight.
Blindsight patients are sometimes excessively cautious when
reporting their conscious experience. The visual abilities of some
blindsight patients probably depend on reorganisation of brain
connections following brain damage.
There is much behavioural and neuroimaging evidence for
subliminal perception in visually intact individuals. However,
there are problems of interpretation caused by insensitive (and
unreliable) measures of self-reported awareness. Some observers
may show apparent subliminal perception because they have a
response bias leading them to claim no conscious awareness of
visual stimuli of which they actually have limited awareness.
FURTHER READING
Brenner, E. & Smeets, J.B.J. (2018). Depth perception. In J.T. Serences (ed.),
Stevens’ Handbook of Experimental Psychology and Cognitive Neuroscience, Vol.
2: Sensation, Perception, and Attention (4th edn; 385–414). New York: Wiley.
The authors provide a comprehensive account of theory and research on depth
perception.
de Haan, E.H.F., Jackson, S.R. & Schenk, T. (2018). Where are we now with
“what” and “how”? Cortex, 98, 1–7. Edward de Haan and his colleagues provide
an evaluation of the perception-action model.
Goldstein, E.B. & Brockmole, J. (2017). Sensation and Perception (10th edn).
Boston: Cengage. There is coverage of key areas within visual perception in this
introductory textbook.
Naccache, L. (2016). Chapter 18: Visual consciousness: A “re-updated” neurological tour. Neurology of Consciousness (2nd edn; pp. 281–295). Lionel Naccache
provides a theoretical framework within which to understand blindsight and
other phenomena associated with visual consciousness.
Shanks, D.R. (2017). Regressive research: The pitfalls of post hoc data selection
in the study of unconscious mental processes. Psychonomic Bulletin & Review,
24, 752–775. David Shanks discusses some issues relating to research claiming to
provide evidence for subliminal perception.
Tong, F. (2018). Foundations of vision. In J.T. Serences (ed.), Stevens’ Handbook
of Experimental Psychology and Cognitive Neuroscience, Vol. 2: Sensation,
Perception, and Attention (4th edn; pp. 1–62). New York: Wiley. Frank Tong
provides a comprehensive account of the visual system and its workings.
Witzel, C. & Gegenfurtner, K.R. (2018). Colour perception: Objects, constancy,
and categories. Annual Review of Vision Science, 4, 475–499. Christoph Witzel
and Karl Gegenfurtner discuss our current knowledge of colour perception.
Chapter 3
Object and face recognition
INTRODUCTION
Tens of thousands of times every day we identify or recognise objects in the
world around us. At this precise moment, you are looking at this book. If
you raise your eyes, perhaps you can see a wall and windows. Object recognition typically happens so effortlessly it is hard to believe it is actually
a complex achievement. Evidence of its complexity comes from numerous
unsuccessful attempts to program computers to “perceive” the environment. However, computer programs that are reasonably effective at recognising complicated two-dimensional patterns have been developed.
Why is visual perception so complex? First, objects often overlap
and so we must decide where one object ends and the next one starts.
Second, numerous objects (e.g., chairs; trees) vary enormously in their
visual ­properties (e.g., colour; size; shape) and so it is hard to assign such
diverse stimuli to the same category. Third, we recognise objects almost
regardless of orientation (e.g., we can easily identify a plate that appears
elliptical).
We can go beyond simply identifying objects. For example, we can
generally describe what an object would look like from different angles,
and we also know its uses and functions. All in all, there is much more to
object recognition than might be supposed (than meets the eye?).
What is discussed in this chapter? The overarching theme is to unravel
the mysteries associated with recognising three-dimensional objects.
However, we initially discuss how two-dimensional patterns are recognised.
Then the focus shifts to how we decide which parts of the visual world
belong together and thus form separate objects. This is a crucial early
stage in object recognition. After that, general theories of object recognition are evaluated against the available neuroimaging and behavioural
evidence.
Face recognition (vitally important in our everyday lives) differs in
important ways from object recognition. Accordingly, we discuss face recognition in a separate section. Finally, we consider whether the processes
involved in visual imagery resemble those involved in visual perception.
Other issues relating to object recognition (e.g., depth perception; size constancy) were discussed in Chapter 2.
PATTERN RECOGNITION
KEY TERM
Pattern recognition
The ability to identify or categorise two-dimensional patterns (e.g., letters; fingerprints).

We spend much of our time (e.g., when reading) engaged in pattern recognition – the identification or categorisation of two-dimensional patterns. Much research has considered how alphanumeric patterns (alphabetical and numerical symbols) are recognised. A key issue is the flexibility
­patterns. Much research has considered how alphanumeric patterns (alphabetical and numerical symbols) are recognised. A key issue is the flexibility
of the human perceptual system (e.g., we can recognise the letter “A” rapidly
and across wide variations in orientation, typeface, size and writing style).
Patterns can be regarded as consisting of a set of specific features or
attributes (Jain & Duin, 2004). For example, the key features of the letter
“A” are two straight lines and a connecting cross-bar. An advantage of this
feature-based approach is that visual stimuli varying greatly in size, orientation and minor details can be identified as instances of the same pattern.
Many feature theories assume pattern recognition involves processing
specific features followed by more global or general processing to integrate feature information. However, Navon (1977) argued global processing often precedes more specific processing. He presented observers with
stimuli such as the one shown in Figure 3.1. On some trials, they decided
whether the large letter was an “H” or an “S”; on others, they decided whether the small letters were Hs or Ss.
Navon (1977) found performance speed with the small letters was
greatly slowed when the large letter differed from the small letters. However,
decision speed with the large letters was uninfluenced by the nature of the
small letters. Navon concluded we often see the forest (global structure)
before the trees (features).
There are limitations with Navon’s (1977) research and conclusions.
First, Dalrymple et al. (2009) found performance was faster at the level of
the small letters than the large letter when the small letters were relatively
large and spread out. Thus, attentional processes influence performance.
Second, Navon failed to distinguish adequately between encoding (neuronal responses
triggered by visual stimuli) and decoding
(conscious perception of those stimuli) (Ding
et al., 2017). Encoding typically progresses
from lower-level representations of simple
features to higher-level representations of
more complex features (Felleman & Van
Essen, 1991). In contrast, Ding et al. (2017,
p. E9115) found, “The brain prioritises decoding of higher-level features because they are . . . more invariant and categorical, and thus easier to . . . maintain in noisy working memory.” Thus, Navon’s (1977) conclusions may be more applicable to visual decoding (conscious perception) than to the preceding internal neuronal responses.

Figure 3.1
The kind of stimulus used by Navon (1977) to demonstrate the importance of global features in perception.
Feature detectors
If presentation of a visual stimulus leads to detailed processing of its basic
features, we should be able to identify cortical cells involved in such processing. Hubel and Wiesel (1962) studied cells in parts of the occipital cortex
involved in visual processing. Some cells responded in two different ways to
a spot of light depending on which part of its receptive field was stimulated:
(1) An “on” response with an increased rate of firing when the light was on.
(2) An “off” response with the light causing a decreased rate of firing.
Hubel and Wiesel (e.g., 1979) discovered two types of neuron in the primary
visual cortex: simple cells and complex cells. Simple cells have “on” and
“off” rectangular regions. These cells respond most to dark bars in a light
field, light bars in a dark field, or straight edges between areas of light and
dark. Any given cell responds strongly only to stimuli of a particular orientation and so its responses could be relevant to feature detection.
Complex cells resemble simple cells in responding maximally to
straight-line stimuli in a particular orientation. However, complex cells
have large receptive fields and respond more to moving contours. Each
complex cell is driven by several simple cells having the same orientation
preference and closely overlapping receptive fields (Alonso & Martinez,
1998). There are also end-stopped cells. Their responsiveness depends on
stimulus length and orientation. In sum, Hubel and Wiesel envisaged, “A
hierarchically organised visual system in which more complex visual features are built (bottom-up) from more simple ones” (Ward, 2015, p. 111).
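Simple-cell receptive fields are often idealised as Gabor filters: an orientation-tuned sinusoid under a Gaussian envelope. The sketch below illustrates orientation selectivity under this standard idealisation (it is not Hubel and Wiesel's own formalism, and all parameter values are arbitrary):

import numpy as np

def gabor(size=21, theta=0.0, wavelength=8.0, sigma=4.0):
    """Gabor kernel: a sinusoidal grating at orientation theta
    (radians), windowed by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)  # rotate coordinates
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * x_rot / wavelength)

# A vertical light bar in a dark field
image = np.zeros((21, 21))
image[:, 9:12] = 1.0

# The model "response" is the dot product of stimulus and receptive
# field: it is largest when the bar matches the filter's preferred
# orientation (here, the 0-degree filter has vertical stripes).
for deg in (0, 45, 90):
    response = np.sum(gabor(theta=np.radians(deg)) * image)
    print(f"{deg:3d}-degree filter response: {response:7.2f}")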
Hubel and Wiesel’s account is limited in several ways:
(1) The cells they identified provide ambiguous information because they respond comparably to different stimuli (e.g., a horizontal line moving rapidly and a nearly horizontal line moving slowly). Observers must combine information from numerous neurons to remove ambiguities.
(2) Neurons differ in their responsiveness to different spatial frequencies and several phenomena in visual perception depend on this differential responsiveness (discussed on pp. 104–105).
(3) As Schulz et al. (2015, p. 1022) pointed out, “The responses of cortical neurons [in the primary visual cortex] to repeated presentations of a stimulus are highly variable.” This variability complicates pattern recognition.
(4) Pattern recognition and object recognition depend on top-down processes triggered by expectations and context (e.g., Goolkasian & Woodberry, 2010; discussed on pp. 111–116) as well as on the bottom-up processes emphasised by Hubel and Wiesel.
PERCEPTUAL ORGANISATION
Our visual environment is typically complex and confusing with many
objects overlapping others, thus making it hard to achieve perceptual segregation of visual objects. How this is done was first studied systematically
IN THE REAL WORLD: HOW CAN WE DISCOURAGE SPAMMERS?
KEY TERM
CAPTCHA
A Completely Automated Turing Test to tell Computers and Humans Apart involving distorted characters connected together; it is often used to establish that the user of an internet website is human rather than an automated system.

Virtually everyone has received a substantial amount of spam (unwanted emails). Spammers use bots (robots running automated tasks over the internet) to send emails to thousands of individuals for various money-making purposes (e.g., fake sweepstake entries).
A CAPTCHA (Completely Automated Turing test to tell Computers and Humans Apart) is commonly used to discourage spammers. The intention is to ensure a website user is human by providing a test humans can solve but automated computer-based systems cannot. The CAPTCHA in Figure 3.2 is typical in consisting of distorted characters connected together horizontally. In principle, the study of CAPTCHAs can shed light on the strengths of human pattern recognition.
Computer programs to solve CAPTCHAs generally involve a segmentation phase to locate the characters followed by a recognition phase where each character is identified. Many computer programs can recognise individual characters even when very distorted but their performance is much worse at segmenting connected characters. Overall, the performance of most computer programs at solving CAPTCHAs was poor until fairly recently.

Figure 3.2
The CAPTCHA used by Yahoo.
From Gao et al. (2012).

Nachar et al. (2015) devised a computer program focusing on edge corners (an edge corner is the intersection of two straight edges). Such corners are relatively unaffected by the distortions and overlaps of characters found in CAPTCHAs. Nachar et al.’s approach proved successful, allowing them to solve 57% of CAPTCHAs resembling the one shown in Figure 3.2.
There are two take-home messages. First, the difficulties encountered in devising computer programs to solve CAPTCHAs indicate humans have excellent pattern-recognition abilities. Second,
edge corners provide an especially valuable source of information in pattern recognition. Of relevance, successful camouflage in many species depends heavily on markings that break up an
animal’s edges, making it less visible (Webster, 2015).
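The segmentation-then-recognition pipeline described in this box can be sketched in outline. This is a deliberately simplified illustration of the two phases (the character classifier is a placeholder supplied by the caller; real solvers, such as Nachar et al.'s corner-based program, are far more sophisticated):

import numpy as np
from scipy import ndimage

def solve_captcha(image, classify_char):
    """Two-phase pipeline: segment the image into candidate
    characters, then recognise each segment in turn."""
    # Segmentation phase: threshold, then find connected components.
    # This is precisely the step that connected, overlapping
    # characters are designed to defeat.
    binary = image > image.mean()
    labels, n_components = ndimage.label(binary)
    segments = []
    for i in range(1, n_components + 1):
        ys, xs = np.where(labels == i)
        patch = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        segments.append((xs.min(), patch))
    # Recognition phase: classify each segment, left to right.
    return "".join(classify_char(patch)
                   for _, patch in sorted(segments, key=lambda s: s[0]))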
IN THE REAL WORLD: FINGERPRINTING
An important form of real-world pattern recognition involves experts matching a criminal’s fingerprints (latent print) against stored fingerprint records. Automatic fingerprint identification systems
(AFISs) scan huge databases. This typically produces a small number of possible matches to the
fingerprint obtained from the crime scene ranked by similarity to the criminal’s fingerprint. Experts
then decide which database fingerprint (if any) matches the criminal’s.
We might imagine experts are much better at fingerprint matching than novices because their
analytic (slow, deliberate) processing is superior. However, Thompson and Tangen (2014) found
experts greatly outperformed novices when pairs of fingerprints were presented for only 2 seconds,
forcing them to rely heavily on non-analytic (fast and relatively “automatic”) processing. However,
when fingerprint pairs were presented for 60 seconds, experts showed a greater performance
improvement than novices (19% vs 7%, respectively). Thus, experts have superior analytic and
non-analytic processing.
According to signal-detection theory, experts may surpass novices in their ability to discriminate
between matching and non-matching prints. Alternatively, they may simply have a more lenient
response bias than novices. If so, they would tend to respond “match” to every pair of prints.
Good discrimination is associated with many “hits” (responding “match” on match trials) plus a
low false-alarm rate (not responding “match” on non-match trials). In contrast, a lenient response
criterion is associated with many false alarms.
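This discrimination/bias distinction can be quantified with the standard signal-detection indices d' and c. In the sketch below the hit rates are hypothetical, chosen only to mirror the contrast described in the text (Thompson et al. reported false-alarm rates, not full hit/false-alarm pairs):

from scipy.stats import norm

def sdt_indices(hit_rate, fa_rate):
    """d' measures discrimination; criterion c measures response bias
    (negative c = lenient, positive c = conservative)."""
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)

# Hypothetical novice vs expert, ordered like Thompson et al.'s
# false-alarm data on similar non-matching pairs (57% vs 1.65%)
for label, hits, false_alarms in [("novice", 0.80, 0.57),
                                  ("expert", 0.80, 0.02)]:
    d_prime, criterion = sdt_indices(hits, false_alarms)
    print(f"{label}: d' = {d_prime:.2f}, c = {criterion:.2f}")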
Thompson et al. (2014) found novices made false alarms on 57% of trials on which two prints
were similar but did not match, whereas experts did so on only 1.65% of trials. Thus, experts have
a much more conservative response criterion as well as much better discrimination between matching and non-matching prints.
It is often assumed expert fingerprint identification is very accurate. However, experts listing the
minutiae (features) on fingerprints on two occasions showed total agreement between their assessments only 16% of the time (Dror et al., 2012). Nevertheless, experts are much less likely than
non-experts to decide incorrectly that two fingerprints from the same person are from different
individuals (Champod, 2015).
Fingerprint identification is often complex. As an example, try to decide whether the fingerprints
in Figure 3.3 come from the same person. Four fingerprinting experts said the fingerprint on the
right was from the same person as the one on the left (Ouhane Daoud, the bomber involved in
the terrorist attack in Madrid on 11 March 2004). In fact, the one on the right came from Brandon
Mayfield, an American lawyer who was falsely arrested.
Experts’ mistakes are often due to the incompleteness of the fingerprints found at crime scenes.
However, top-down processes also contribute. Experts’ errors often involve forensic confirmation
bias: “an individual’s pre-existing beliefs, expectations, motives, and situational context influence
the collection, perception, and interpretation of evidence” (Kassin et al., 2013, p. 45).
Dror et al. (2006) found evidence of forensic confirmation bias. Experts were asked to judge
whether two fingerprints matched, having been told, incorrectly, that they were the ones mistakenly
matched by the FBI as the Madrid bomber. In fact, these experts had judged these fingerprints to
be a clear and definite match several years earlier. The misleading information provided led 60%
of them to judge the prints to be definite non-matches! Thus, top-down processes triggered by
contextual information can distort fingerprint
identification.
Langenburg et al. (2009) studied the effects
of context (e.g., alleged conclusions of internationally respected experts) on fingerprint
identification. Experts and non-experts were
both influenced by contextual information
(and so showed confirmation bias). However,
non-experts were influenced more.
Figure 3.3
The FBI’s mistaken identification of the Madrid bomber. The fingerprint from the crime scene is on the left. The fingerprint of the innocent suspect (positively identified by fingerprint experts) is on the right.
From Dror et al. (2006). Reprinted with permission from Elsevier.

The above studies on confirmation bias manipulated context very directly and explicitly. Searston et al. (2016) found a more subtle context effect based on familiarity. Novice participants were presented initially with a series of cases and fingerprint pairs and given feedback as to whether the fingerprints matched or not. Then they were presented with various cases very similar to those seen previously
and decided whether the fingerprint pairs matched. The participants exhibited response bias during
the second part of the experiment: their decisions (i.e., match or no-match) tended to correspond
to the correct decisions associated with similar (but not identical) cases encountered earlier.
In sum, experts typically outperform novices at fingerprint matching because they have superior
discrimination ability and a more conservative response criterion. However, even experts are influenced by irrelevant or misleading contextual information and often show evidence of confirmation
bias. Worryingly, among forensic experts (including fingerprinting experts), only 52% regarded bias as
a matter for concern and even fewer (26%) believed their own judgements were influenced by bias.
by the gestaltists, German psychologists (including Koffka, Köhler and
Wertheimer) who emigrated to the United States between the two World
Wars. Their fundamental principle was the law of Prägnanz – we typically
perceive the simplest possible organisation of the visual field.
Most of the gestaltists’ other laws can be subsumed under the law of
Prägnanz. Figure 3.4(a) illustrates the law of proximity (visual elements
close in space tend to be grouped together). Figure 3.4(b) shows the law of
similarity (similar elements tend to be grouped together).
We see two crossing lines in Figure 3.4(c) because, according to the law of good continuation, we group together those elements requiring the fewest changes or interruptions in straight or smoothly curving lines.
Finally, Figure 3.4(d) illustrates the law of closure: the missing parts of a
figure are filled in to complete the figure (here a circle).
We might dismiss these principles as “mere textbook curiosities”
(Wagemans et al., 2012a, p. 1180). However, the various grouping principles “pervade virtually all perceptual experiences because they determine
the objects and parts that people perceive in their environment” (Wagemans
et al., 2012a, p. 1180).
The gestaltists emphasised figure-ground segmentation in perception. The figure is perceived as having a distinct form or shape whereas
the ground lacks form. In addition, the figure is perceived as being in
front of the ground and the contour separating the figure from ground “belongs” to the figure. Check these claims with the faces-goblet illusion (see Figure 3.5). When the goblet is perceived as the figure, it seems to be in front of a dark background. Faces are in front of a light background when forming the figure.

KEY TERMS
Law of Prägnanz
The notion that the simplest possible organisation of the visual environment is perceived; proposed by the gestaltists.
Figure-ground segmentation
The perceptual organisation of the visual field into a figure (object of central interest) and a ground (less important background).

Figure 3.4
Examples of the Gestalt laws of perceptual organisation: (a) the law of proximity; (b) the law of similarity; (c) the law of good continuation; and (d) the law of closure.

Figure 3.5
An ambiguous drawing that can be seen as either two faces or as a goblet.
What determines which region is identified as the figure and which as the ground?
Regions that are convex (curving outwards),
small, surrounded and symmetrical are most
likely to be perceived as figures (Wagemans
et al., 2012a). For example, Fowlkes et al.
(2007) found with images of natural scenes
that regions identified by observers as figures
were generally smaller and more convex than
ground regions.
Finally, the gestaltists argued perceptual grouping and organisation are innate or intrinsic to the brain. As a result, they de-emphasised the importance of past experience.
Findings
The gestaltists’ approach was limited because they mostly used artificial
figures, making it important to see whether their findings apply to more
realistic stimuli. Geisler et al. (2001) used pictures to study the contours
of flowers, rivers, trees and so on. They discovered object contours could
be calculated accurately using two principles that were different from those
emphasised by the gestaltists:
(1) Adjacent segments of any contour typically have very similar orientations.
(2) Segments of any contour that are further apart generally have somewhat different orientations.
Geisler et al. (2001) asked observers to decide which of two complex patterns presented together contained a winding contour. Task performance
was well predicted by the two key principles described above.
Elder and Goldberg (2002) analysed the statistics of natural contours
and obtained findings largely consistent with Gestalt laws. Proximity was a
very powerful cue when deciding which contours belonged to which objects.
There was also a small contribution from similarity and good continuation.
Numerous cues influence figure-ground segmentation and the perception of object boundaries with natural scenes. Mély et al. (2016) found
colour and luminance (see Glossary) strongly influenced the perception of
object boundaries. Perception of object boundaries was more accurate when several cues were combined than with any single cue
in isolation.
In sum, there is some support for Gestalt laws in natural scene
perception. However, figure-ground segmentation is more complex in natural scenes than in most artificial figures, and so the Gestalt approach is
oversimplified.
The gestaltists failed to discover several principles of perceptual organisation. For example, Palmer and Rock (1994) proposed the principle of
uniform connectedness. According to this principle, any connected region
having uniform visual properties (e.g., colour; texture; lightness) tends to
be organised as a single perceptual unit. Palmer and Rock found grouping
by uniform connectedness dominated proximity and similarity when there
was a conflict.
Pinna et al. (2016) argued that the gestaltists de-emphasised the role
of dissimilarity in perceptual organisation. Consider Figure 3.6. The perception of the empty circles as a rotated square or a diamond is strongly
influenced by the location of the dissimilar element (i.e., the black circle).
This illustrates the principle of accentuation: “Elements group in the same
oriented direction of the dissimilar element placed . . . outside a whole set
of continuous/homogeneous components” (Pinna et al., 2016, p. 21).
Much processing involved in perceptual organisation occurs very
rapidly. Williford and von der Heydt (2016) discovered signals from
neurons in V2 (see Chapter 2) relating to figure-ground organisation
emerged within 70 ms of stimulus presentation for complex natural scenes
as well as for simple figures. This extremely rapid processing is consistent
with the gestaltists’ assumption that perceptual organisation is due to
innate factors but may also reflect massive experience in object recognition.
The role of learning was discussed by Bhatt and Quinn (2011). Infants
as young as 3 or 4 months show grouping by continuation, proximity and
connectedness, which is apparently consistent with the Gestalt position.
However, other grouping principles (e.g., closure) were used only later in
infancy, and infants typically made increased use of grouping principles
over time. Thus, learning is important.
KEY TERM

Uniform connectedness
The notion that adjacent regions in the visual environment having uniform visual properties (e.g., colour) are perceived as a single perceptual unit.

Figure 3.6
The dissimilar element (black circle) accentuates the tendency to perceive the array of empty circles as (A) a rotated square or (B) a diamond.
From Pinna et al., 2016.
According to the gestaltists, perceptual grouping occurs rapidly and
should be uninfluenced by attentional processes. The evidence is mixed.
Rashal et al. (2017) conducted several experiments. Attention was not
required with grouping by proximity or similarity in colour. However,
attention was required with grouping by similarity in shape. In general,
attention was more likely to be required when the processes involved in perceptual grouping were relatively complex. Overall, the processes involved
in perceptual grouping are much more complicated and variable than the
gestaltists had assumed.
The gestaltists also assumed figure-ground segmentation is innate and
so not reliant on past experience or learning. Barense et al. (2012) reported
contrary evidence. Amnesic patients (having severe memory problems)
and healthy controls were presented with various stimuli, some containing
parts of well-known objects (see Figure 3.7). In other stimuli, the object
parts were rearranged. The task was to decide which region of each stimulus was the figure.
The healthy controls identified the regions containing familiar objects
as figures more often than those containing rearranged parts. In contrast,
the amnesic patients showed no difference between the two types of stimuli
because they experienced difficulty in identifying the objects presented.
Thus, figure-ground segmentation can depend
on past experience and memory (i.e., object
familiarity).
Several recent theories explain perceptual grouping and figure-ground segmentation. For example, consider Froyen et al.'s (2015) Bayesian hierarchical grouping model, according to which observers initially form "beliefs" concerning the objects to be expected in the current context. In addition, their visual system assumes the visual image consists of a mixture of objects. The information available in the image is then used to change the subjective probabilities of different grouping hypotheses to make optimal use of that information. Of key importance, observers use their learned knowledge of patterns and objects (e.g., visual elements close together generally belong to the same object).

The above approach exemplifies theories based on Bayesian inference (see Glossary). Their central assumption is that the initial subjective probabilities associated with various hypotheses as to the organisation of objects within a visual image change on the basis of the information it provides. This approach is much more realistic than the gestaltists' relatively cut-and-dried approach.

Figure 3.7
(Panel titles: experimental stimuli – intact familiar configurations; control stimuli – part-rearranged novel configurations.) The top row shows intact familiar shapes (from left to right: a guitar, a standing woman, a table lamp). The bottom row shows the same objects but with the parts rearranged. The task was to decide which region in each stimulus was the figure.
From Barense et al. (2012). Reprinted with permission of Oxford University Press.
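The Bayesian updating at the heart of such models can be illustrated in a few lines of Python (a minimal sketch of Bayes' rule applied to two grouping hypotheses; the hypothesis names and all numbers are invented for illustration, and this is not Froyen et al.'s actual implementation).

```python
def bayes_update(priors, likelihoods):
    """Update subjective probabilities of grouping hypotheses given image data.

    priors: {hypothesis: P(h)}; likelihoods: {hypothesis: P(data | h)}.
    Returns the normalised posteriors P(h | data) via Bayes' rule.
    """
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}

# Context makes "one object" slightly more plausible a priori; the image
# data (e.g., elements lying close together) favour it strongly.
priors = {"one object": 0.6, "two objects": 0.4}
likelihoods = {"one object": 0.9, "two objects": 0.3}
print(bayes_update(priors, likelihoods))
# {'one object': 0.818..., 'two objects': 0.181...}
```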
Evaluation
What are the strengths of the Gestalt approach? First, the gestaltists
focused on key issues (e.g., figure-ground segmentation). Second, nearly all
their grouping laws (and the notion of figure-ground segmentation) have
stood the test of time and are applicable to natural scenes as well as artificial figures. Third, the notion that observers perceive the simplest possible organisation of the visual environment has proved very fruitful. Many
recent theories are based on the assumption that striving for simplicity is
central to visual perception (Jäkel et al., 2016).
What are the approach's limitations? First, the gestaltists de-emphasised the importance of past experience and learning. As Wagemans
et al. (2012b, p. 1229) pointed out, the gestaltists “focused almost exclusively on processes intrinsic to the perceiving organism . . . The environment itself did not interest [them]”.
Second, the gestaltists produced descriptions of important perceptual
phenomena but not adequate explanations. Recently, however, such explanations have been provided.
Third, nearly all the evidence the gestaltists provided was based on
two-dimensional drawings. The greater complexity of real-world scenes
(e.g., important parts of objects hidden or occluded) means additional
explanatory assumptions are required.
Fourth, the gestaltists did not discover all the principles of perceptual organisation. Among such undiscovered principles are uniform
connectedness, the principle of accentuation and generalised common fate (e.g., when elements of a visual scene become brighter or darker
together, they are grouped together). More generally, the gestaltists did
not ­appreciate the sheer complexity of the processes involved in perceptual grouping.
Fifth, the gestaltists focused mostly on drawings involving only one
Gestalt law. With natural scenes, several laws often operate simultaneously
and interact in complex ways not predicted by the gestaltists (Jäkel et al.,
2016).
Sixth, the gestaltists’ approach was too inflexible. They did not realise
perceptual grouping and figure-ground segregation depend on complex
interactions between basic (and possibly innate) processes and past experience (Rashal et al., 2017).
APPROACHES TO OBJECT RECOGNITION
Object recognition (identifying objects in the visual field) is enormously
important if we are to interact effectively with the environment. We start
with basic aspects of the human visual system followed by major theories of
object recognition.
Perception-action model
Milner and Goodale’s (1995, 2008) perception-action model (discussed in
Chapter 2) is relevant to understanding object perception. It is based on a
distinction between ventral (or “what”) and dorsal (or “how”) streams (see
Figure 2.9), with the latter providing visual guidance for action (e.g., grasping). They argued object recognition and perception depend primarily on
the ventral stream. This stream is hierarchically organised. Visual processing
basically proceeds from the retina through several areas including the lateral
geniculate nucleus, V1, V2 and V4, culminating in the inferotemporal cortex.
The importance of the ventral stream is indicated by research showing
object recognition can be reasonably intact after damage to the dorsal
stream (Goodale & Milner, 2018). However, object recognition involves
numerous interactions between the ventral and dorsal streams (Freud et al.,
2017b).
Spatial frequency
Visual perception develops over time even though it seems instantaneous
(Hegdé, 2008). The visual processing involved in object recognition typically proceeds in a coarse-to-fine way with initial coarse or general processing followed by fine or detailed processing. As a result, we can perceive
visual scenes at a very general level and/or at a fine-grained level.
How does coarse-to-fine processing occur? Numerous cells in the
primary visual cortex respond to high spatial frequencies and capture fine
detail in the visual image. Numerous others respond to low spatial frequencies and capture coarse information in the visual image.
Low spatial frequency information (often relating to motion and/
or spatial location) is transmitted rapidly to higher-order brain areas via
the fast magnocellular system using the dorsal visual stream (discussed in
Chapter 2). Awasthi et al. (2016) used red light to produce magnocellular
suppression. As predicted, this interfered with the low spatial frequency
components of face perception.
In contrast, high spatial frequency information (often relating to
colour, shape and other aspects of object recognition) is transmitted relatively slowly via the parvocellular system using the ventral visual stream
(see Chapter 2). This speed difference explains why coarse processing typically precedes fine processing, although conscious perception is typically
based on integrated low and high spatial information.
We can observe the effects of varying
spatial frequency by comparing images consisting only of low or high spatial frequency
(see Figure 3.8). You probably agree it is considerably easier to achieve object recognition
with the high spatial frequency image.
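Images like these are standardly produced by spatial frequency filtering. The following sketch (assuming NumPy and SciPy are available; it illustrates the general technique, not the exact procedure of any study cited here) uses a Gaussian blur as a low-pass filter and takes the residual as the high-pass image.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_spatial_frequencies(image, sigma=8.0):
    """Split a greyscale image into low and high spatial frequency versions.

    Blurring with a Gaussian keeps only low spatial frequencies (the coarse
    layout of the scene); subtracting the blur from the original leaves the
    high spatial frequencies (fine detail and edges).
    """
    image = image.astype(float)
    low = gaussian_filter(image, sigma=sigma)   # low-pass: coarse structure
    high = image - low                          # high-pass: fine detail
    return low, high

# With a real photograph, `low` looks blurred and `high` resembles an
# edge sketch; a random array stands in for the image here.
img = np.random.rand(128, 128)
low, high = split_spatial_frequencies(img, sigma=8.0)
```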
Figure 3.8
High and low spatial frequency versions of a place (a building).
From Awasthi et al. (2016).

Findings

Musel et al. (2012) presented participants with very brief (150 ms) scenes proceeding from coarse (low spatial frequency) to fine (high spatial frequency) or vice versa (sample videos can be viewed at DOI 10.1371/journal.pone.003893). Performance (deciding whether each scene was an outdoor or indoor one) was faster with the coarse-to-fine sequence, a finding subsequently replicated by Kauffmann et al. (2015). These findings suggest the visual processing of natural scenes is predominantly coarse to fine. This sequence may be more effective because low spatial frequency information is used to generate plausible interpretations of the visual input.

Figure 3.9
Image of Mona Lisa revealing very low spatial frequencies (left), low spatial frequencies (centre) and high spatial frequencies (right).
From Livingstone (2000). By kind permission of Margaret Livingstone.
There is considerable evidence that global (general) processing often
precedes local (specific) processing (discussed on p. 95). Much research
has established an association between processing low spatial frequencies and global perception and between processing high spatial frequencies and local perception. However, use of low and high spatial frequency
information in visual processing is often very flexible and is influenced by
task demands. Flevaris and Robertson (2016, p. 192) reviewed research
showing, “Attention to global and local aspects of a display biases the flexible selection of relatively lower and relatively higher SFs [spatial frequencies] during image processing.”
Finally, we can explain the notoriously elusive smile of Leonardo da Vinci's Mona Lisa with reference to spatial frequencies. Livingstone (2000) produced images of that painting with different spatial frequencies. Mona Lisa's smile is much more obvious in the two low spatial frequency images (see Figure 3.9). Livingstone pointed out that our central or foveal vision is dominated by higher spatial frequencies compared with our peripheral vision. As a result, "You can't catch her smile by looking at her mouth. She smiles until you look at her mouth" (p. 1299).
Historical background: Marr’s computational approach
David Marr (1982) proposed a very influential theory. He argued object
recognition involves various processing stages and is much more complex
than had previously been thought. More specifically, Marr claimed observers construct various representations (descriptions) providing increasingly
detailed information about the visual environment:
● Primal sketch: this provides a two-dimensional description of the main light intensity changes in the visual input, including information about edges and contours.
● 2½-D sketch: this incorporates a description of the depth and orientation of visible surfaces using information from shading, texture, motion and binocular disparity. It resembles the primal sketch in being viewer-centred or viewpoint-dependent (i.e., it is influenced by the angle from which the observer sees objects or the environment).
● 3-D model representation: this describes objects' shapes and their relative positions three-dimensionally; it is independent of the observer's viewpoint and so is viewpoint-invariant.
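To make the primal sketch less abstract, the toy example below computes a map of local light-intensity changes from an image. It is only a crude stand-in: Marr's actual proposal detected edges via zero-crossings of filtered images, whereas this sketch simply takes the gradient magnitude.

```python
import numpy as np

def intensity_change_map(image):
    """Map of local light-intensity changes, a crude stand-in for the edge
    information in Marr's primal sketch (his scheme used zero-crossings of
    filtered images, not this simple gradient)."""
    gy, gx = np.gradient(image.astype(float))  # vertical/horizontal changes
    return np.hypot(gx, gy)                    # gradient magnitude per pixel

# Strong values mark edges and contours, the raw material of the
# two-dimensional primal sketch.
img = np.zeros((64, 64))
img[:, 32:] = 1.0                              # a vertical light/dark edge
edges = intensity_change_map(img)
print(edges.max() > 0)                         # True: the edge is detected
```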
Why has Marr’s theoretical approach been so influential? First, he successfully combined ideas from neurophysiology, anatomy and computer
vision (Mather, 2015). Second, he was among the first to realise the enormous complexity of object recognition. Third, his distinction between
viewpoint-dependent and viewpoint-invariant representations triggered much subsequent research (discussed on pp. 109–111).
What are the limitations of Marr’s approach? First, he focused excessively on bottom-up processes. Marr (1982, p. 101) admitted, “Top-down
processing is sometimes used and necessary.” However, he de-emphasised
the major role expectations and knowledge play in object recognition
­(discussed in detail on pp. 111–116).
Second, Marr assumed that “Vision tells the truth about what is out
there” (Mather, 2015, p. 44). In fact, there are numerous exceptions. For
example, people observed from a tall building (e.g., the Eiffel Tower) seem
very small. Another example is the vertical-horizontal illusion – ­observers
typically overestimate the length of a vertical line when it is compared
against a horizontal line of the same length (e.g., Gavilán et al., 2017).
Third, many processes proposed by Marr are incredibly complex computationally. As Mather (2015, p. 44) pointed out, “The computations
required to produce view-independent 3-D object models are now thought
by many researchers to be too complex.”
Biederman’s recognition-by-components theory
Biederman’s (1987) recognition-by-components theory developed Marr’s
theoretical approach. His central assumption was that objects consist of
basic shapes or components known as “geons” (geometric ions); examples
include blocks, cylinders, spheres, arcs and wedges. Biederman claimed
there are approximately 36 different geons, which sounds suspiciously low
to provide descriptions of all objects. However, geons can be combined in
almost endless ways. For example, a cup is an arc connected to the side of a
cylinder. A pail involves the same two geons but with the arc connected to
the top of the cylinder.
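The combinatorial power of a small geon alphabet is easy to check with back-of-the-envelope arithmetic (the number of spatial relations below is an illustrative assumption, not Biederman's exact figure):

```latex
% G geons and R qualitative spatial relations (e.g., ABOVE, SIDE-OF)
% allow G x G x R ordered two-geon arrangements. Illustratively:
\[
\underbrace{36}_{\text{first geon}} \times \underbrace{36}_{\text{second geon}}
\times \underbrace{50}_{\text{relations (assumed)}} = 64{,}800
\]
```

Adding a third geon multiplies the count again, so even a 36-geon alphabet supports a vast space of structural descriptions.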
Figure 3.10 shows the key features of recognition-by-components
theory. We have already considered the stage where the components or
geons of an object are determined. When this information is available, it is
matched with stored object representations or structural models consisting
of information about the nature of the relevant geons, their orientations,
sizes and so on. Whichever stored representation fits best with the geon-based information obtained from the visual object determines which object
is identified by observers.
As indicated in Figure 3.10, the first step
in object recognition is edge extraction in
which various aspects of the visual stimulus
(e.g., luminance; texture; colour) are processed, leading to a description of the object
resembling a line drawing. After that, decisions are made as to how the object should
be segmented to establish its geons.
Which edge information should observers focus on? According to Biederman (1987),
non-accidental image properties are crucial.
These are aspects of the visual image that
are invariant across different viewing angles.
Examples include whether an edge is straight
or curved and whether a contour is concave
(hollow) or convex (bulging), with the former of particular importance. Biederman assumed the geons of a visual object are constructed from various non-accidental or invariant properties.
This part of the theory leads to the key
prediction that object recognition is typically viewpoint-invariant (i.e., objects can
be recognised equally easily from nearly all
viewing angles). The argument is that object
recognition depends crucially on the identification of geons, which can be identified from
numerous viewpoints. Thus, object recognition is difficult only when one or more geons
are hidden from view.
How do we recognise objects in suboptimal viewing conditions (e.g., an intervening
object obscures part of the target object)?
First, non-accidental properties can still be
detected even when only parts of edges are
visible. Second, if the concavities of a contour
are visible, there are mechanisms for restoring
the missing parts of the contour. Third, we
can recognise many objects when some geons
are missing because there is much redundant information under optimal viewing
conditions.
Findings
Non-accidental properties play a vital role in
object recognition (Parker & Serre, 2015). For
example, it is easier to distinguish between
two objects differing in non-accidental properties. In addition, neuroimaging studies reveal greater neural responses to changes in non-accidental properties than to other visual changes. Rolls and Mills (2018) developed a model of object recognition showing how non-accidental properties of objects can promote viewpoint-invariant object recognition.

Figure 3.10
An outline of Biederman's recognition-by-components theory.
Adapted from Biederman (1987).
There is general agreement that an object’s contour or outline is
important in object recognition. For example, camouflage in many animal
species is achieved by markings breaking up and distorting contour information (Webster, 2015). There is also general agreement that concavities
and convexities are especially informative regions of an object’s contour.
However, the evidence relating to Biederman’s (1987) assumption that
concavity is more important than convexity in object recognition is mixed
(Schmidtmann et al., 2015).
In their own study, Schmidtmann et al. (2015) focused specifically on
shape recognition using unfamiliar shapes. Shape recognition depended
more on information about convexities than concavities (although concavity information had some value). They argued convexity information
is likely to be more important because convexities reveal an object’s outer
boundary.
According to the theory, object recognition depends on edge rather
than surface information (e.g., colour). However, Sanocki et al. (1998)
argued that edge-extraction processes are less likely to produce accurate
object recognition when objects are presented in the context of other objects
rather than on their own. This is because it can be hard to decide which
edges belong to which objects when several objects are presented together.
Sanocki et al. presented observers briefly with objects, with line drawings,
or full-colour photographs of objects. As predicted, object recognition was
much worse with the line drawings than the full-colour photographs when
objects were presented in context.
A key theoretical prediction is that object recognition is typically
viewpoint-invariant. Biederman and Gerhardstein (1993) supported this
prediction when familiar objects presented at different angles were named
rapidly. However, numerous other studies have failed to obtain evidence
for viewpoint-invariance. This is especially the case with unfamiliar objects
differing from familiar objects in not having previously been viewed from
multiple viewpoints (discussed in the next section on pp. 109–111).
Evaluation
Biederman’s (1987) recognition-by-components theory has been very influential. It indicates how we can identify objects despite substantial differences among the members of most categories in shape, size and orientation.
The assumption that non-accidental properties of stimuli and geons play a
role in object recognition has received much support.
What are the theory’s limitations? First, it focuses predominantly
on bottom-up processes triggered directly by the stimulus input. As a
result, it de-emphasises the impact on object recognition of top-down
processes based on expectation (Trapp & Bar, 2015; discussed further on
pp. ­111–116). Second, the theory accounts only for fairly unsubtle perceptual discriminations. It cannot explain how we decide whether an animal
is, for example, a particular breed of dog or cat. Third, the notion that
objects consist of invariant geons is too inflexible. As Hayward and Tarr
(2005, p. 67) pointed out, “You can take almost any object, put a working
light-bulb on the top, and call it a lamp.”
Does viewpoint influence object recognition?
Form a visual image of a bicycle. Your image probably involved a side view
with both wheels clearly visible. We can use this example to discuss a theoretical controversy. Consider an experiment where some participants see a
photograph of a bicycle in the typical (or canonical) view as in your visual
image, whereas others see a photograph of the same bicycle viewed end-on
or from above. Would those given the typical view identify the object as a
bicycle fastest?
We will address the above question shortly. Before that, we must
discuss two key terms mentioned earlier. If object recognition is equally
rapid and easy regardless of viewing angle, it is viewpoint-invariant. In contrast, if object recognition is faster and easier when objects are seen from
certain angles, it is viewer-centred or viewpoint-dependent. Another important distinction is between categorisation (e.g., is the object a dog?) and
identification (e.g., is the object a poodle?), which requires within-category
discriminations.
Findings
Milivojevic (2012) reviewed behavioural research in this area. Object recognition is typically uninfluenced by an object’s orientation when categorisation is required (i.e., it is viewpoint-invariant). In contrast, object
recognition is significantly slower if an object’s orientation differs from
its canonical or typical viewpoint when identification is required (i.e.,
it is viewer-centred). Hamm and McMullen (1998) reported supporting
findings. Changes in viewpoint had no effect on speed of object recognition when categorisation was required (e.g., deciding an object was a car).
However, there were clear effects of changing viewpoint with identification
(e.g., deciding whether an object was a taxi).
Small (or non-significant) effects of object orientation on categorisation time do not necessarily indicate orientation has not affected internal
processing. Milivojevic et al. (2011) found stimulus orientation had only
small effects on speed and accuracy of categorisation. However, early components of the event-related potentials (ERPs; see Glossary) were larger
when stimuli were not in the upright position. Thus, stimulus orientation
had only modest effects on task performance but perceptual processing was
less demanding with upright stimuli.
Neuroimaging research has enhanced our understanding of object recognition (Milivojevic, 2012). With categorisation tasks, brain activation is
mostly very similar regardless of object orientation. However, orientation
influences brain activity early in processing suggesting initial processing is
viewpoint-dependent.
With identification tasks, there is typically greater activation of areas
within the inferior temporal cortex when objects are not in their typical
or canonical orientation (Milivojevic, 2012). This finding is unsurprising
since the inferotemporal cortex is heavily involved in object recognition
(Gauthier & Tarr, 2016). Identification may require additional processing
(e.g., more detailed processing of object features) for objects presented in
unusual orientations.
Learning influences the extent to which object recognition is viewpoint-dependent or viewpoint-invariant. Zimmermann and Eimer (2013) presented unfamiliar faces on 640 trials. Face recognition was viewpoint-dependent initially but became more viewpoint-invariant thereafter. Learning caused more information about each face to be stored in long-term memory and this facilitated rapid access to visual face memory regardless of facial orientation.
Etchells et al. (2017) also studied the effects of learning on face recognition. During learning, observers were repeatedly shown one or two views
of unfamiliar faces. Subsequently they were shown a novel view of these
faces. There was evidence of viewpoint-invariant face recognition when
learning had been based on two different views but not when it had been
based on only a single view.
Related research was reported by Weibert et al. (2016). They found
evidence of a viewpoint-invariant response in face-selective regions of
the medial temporal lobe with familiar (but not unfamiliar) faces. Thus,
viewpoint-invariant responses during object recognition are more frequent for faces for which observers have stored considerable relevant information.
Evidence of viewpoint-dependent or viewpoint-invariant responses
within the brain often depends on the precise brain areas studied. Erez
et al. (2016) found viewpoint-dependent responses in several visual areas
(e.g., fusiform face area) but viewpoint-invariant responses in the perirhinal
cortex. There is more evidence for viewpoint-invariant brain responses late
rather than early in visual processing. Why is that? As Erez et al. (p. 2271)
argued, “Representations of low-level features are transformed into more
complex and invariant representations as information flows through successive stages of [processing].”
Most research is limited because object recognition is typically assessed in only one context, which may prompt either viewpoint-invariant or viewpoint-dependent recognition performance. Tarr and Hayward (2017) argued this approach can misleadingly suggest observers store only viewpoint-invariant or viewpoint-dependent information. Accordingly, they used various contexts. Observers originally learned the identities of novel objects that could be discriminated by viewpoint-invariant information. As predicted, they exhibited viewpoint-invariant object recognition when tested. When the testing context was changed to make it hard to continue to use that approach, observers shifted to exhibiting viewpoint-dependent behaviour.

The central conclusion from the above findings is that: "Object representations are neither viewpoint-dependent nor viewpoint-invariant, but rather encode multiple kinds of information . . . deployed in a flexible manner appropriate to context and task" (Tarr & Hayward, 2017, p. 108). Thus, visual object representations contain richer and more variegated information than typically assumed on the basis of limited testing conditions.
Conclusions
As Gauthier and Tarr (2016, p. 179) concluded: “Depending on the experimental conditions and which parts of the brain we look at, one can
obtain data supporting both the structural-description (i.e., the viewpoint-invariant) and the view-based [viewpoint-dependent] approaches." There
has been progress in identifying factors (e.g., is categorisation or identification required? is the object familiar or unfamiliar?) influencing whether
object recognition is viewpoint-invariant or viewpoint-dependent.
Gauthier and Tarr (2016, p. 379) argued researchers should address
the following question: “What is the nature of the features that comprise high-level visual representations and lead to image-dependence or
image-invariance?” Thus, we should focus more on why object recognition
is viewpoint-invariant or viewpoint-dependent. As yet, “The exact and
fine-grained features of object representations are still unknown and are
not easily resolved” (Gauthier & Tarr, 2016, p. 379).
OBJECT RECOGNITION: TOP-DOWN PROCESSES
Historically, most theorists (e.g., Marr, 1982; Biederman, 1987) studying
object recognition emphasised bottom-up processes. Apparent support can
be found in the hierarchical nature of visual processing. As Yardley et al.
(2012, p. 4) pointed out,
Traditionally, visual object recognition has been taken as mediated by a
hierarchical, bottom-up stream that processes an image by systematically
analysing its individual elements and relaying this information to the
next areas until the overall form and identity are determined.
The above account, assuming a feedforward hierarchy of processing stages
from visual cortex through to inferotemporal cortex, is oversimplified.
There are as many backward projecting neurons (associated with top-down
processing) as forward projecting ones throughout most of the visual system
(Gilbert & Li, 2013). Up to 90% of the synapses from incoming neurons to
primary visual cortex (involved in early visual processing) originate in the
cortex and thus reflect top-down processes. Recurrent processing (a form
of top-down processing) from higher to lower brain areas is often necessary
for conscious visual perception (van Gaal & Lamme, 2012; see Chapter 16).
Top-down processes should have their greatest impact on object recognition when bottom-up processes are relatively uninformative (e.g., when
observers are presented with degraded or briefly presented stimuli). Support
for this prediction is discussed on p. 112.
Findings
Evidence for the involvement of top-down processes in visual perception was reported by Goolkasian and Woodberry (2010). They presented
observers with ambiguous figures immediately preceded by primes relevant
to one interpretation (see Figure 3.11). The primes systematically biased the
interpretation of the ambiguous figures via top-down processes.
Figure 3.11
(Labels within the figure: Young boy; Peacock feathers; Words on a page.) Ambiguous figures (e.g., Eskimo/Indian, Liar/Face) were preceded by primes (e.g., Winter Scene, Tomahawk) relevant to one interpretation of the following figure.
From Goolkasian and Woodberry (2010). Reprinted with permission from the Psychonomic Society 2010.
Viggiano et al. (2008) obtained strong evidence that top-down processes within the prefrontal cortex influence object recognition. Observers
viewed blurred or non-blurred photographs of living and non-living objects.
On some trials, repetitive transcranial magnetic stimulation (rTMS; see
Glossary) was applied to the dorsolateral prefrontal cortex to disrupt top-down processing. rTMS slowed object recognition time only with blurred
photographs. Thus, top-down processes were directly involved in object
recognition when the sensory information available to bottom-up processes
was limited.
Controversy
Firestone and Scholl (2016) argued in a review, “There is . . . no evidence
for top-down effects of cognition on visual perception.” They claimed that
top-down processes often influence response bias, attention or memory
rather than perception itself.
First, consider ambiguous or reversible figures (e.g., the faces-goblet
illusion shown in Figure 3.5). Observers alternate between the two possible
interpretations (e.g., faces vs goblet). The dominant one at any moment
depends on their direction of attention but not necessarily on top-down
processes (Long & Toppino, 2004).
Second, Auckland et al. (2007) presented observers briefly with a
target object (e.g., playing cards) surrounded by four context objects. When
the context objects were semantically related to the target (e.g., dice; chess
pieces; plastic chips; dominoes), the target was recognised more often than
when they were semantically unrelated. This finding depended in part on
response bias (i.e., guesses based on context) rather than perceptual information about the target. (More evidence of response bias is discussed in
the Box).
Firestone and Scholl's (2016) article has provoked much controversy. Lupyan (2016, p. 40) attacked their tendency to attribute apparent top-down effects on perception to attention, memory and so on: "This 'It's not perception, it's just X' reasoning assumes that attention, memory, and so forth can be cleanly split from perception proper." In fact, all these processes interact dynamically and so attention, perception and memory are not clearly separate.
KEY TERM
Shooter bias
The tendency for unarmed
black individuals to be
more likely than unarmed
white individuals to be
shot.
IN THE REAL WORLD: SHOOTER BIAS
Shooter bias is shown by “More shooting errors for unarmed black than white suspects” (Cox &
Devine, 2016, p. 237). Black Americans are more than twice as likely as white Americans to be
unarmed when killed by the police (Ross, 2015). For example, on 22 November 2014, a police
officer in Cleveland, Ohio, shot dead a 12-year-old black male (Tamir Rice) playing with a replica
pistol.
Shooter bias may reflect top-down influences on visual perception. Payne (2006) presented a
white or black face followed by the very brief presentation of a gun or tool. When participants
made a rapid response, they indicated falsely they had seen a gun more often when the face was
black.
Shooter bias reflects top-down effects based on inaccurate racial stereotypes associating black
individuals with threat (e.g., Azevedo et al., 2017). This bias might be due to direct top-down
effects on perception: objects are more likely to be misperceived as guns if held by black individuals. Alternatively, shooter bias may reflect response bias (the expectation someone has a gun
is greater if that person is black rather than white): there is no effect on perception but shooters
require less perceptual evidence to shoot a black individual.
Azevedo et al. (2017) found a briefly presented weapon (a gun) was more accurately perceived
when preceded by a black face than a white one. However, the opposite was the case when a tool
was presented. These findings were due to response bias rather than perception.
Moore-Berg et al. (2017) asked non-black participants to decide rapidly whether or not to shoot
an armed or unarmed white or black person of high or low socio-economic status. There was shooter
bias: participants were biased towards shooting if the individual was black, of low socio-economic
status, or both. This shooter bias mostly reflected a response bias against shooting a white person
of high socio-economic status (probably because of a low level of perceived danger).
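How do researchers tell a perceptual effect from a response bias? The standard tool is signal detection theory, sketched below (the hit and false-alarm rates are invented for illustration): sensitivity d′ reflects the quality of the perceptual evidence, while the criterion c reflects how willing the observer is to respond "gun".

```python
from statistics import NormalDist

def sdt_measures(hit_rate, false_alarm_rate):
    """Standard signal-detection measures.

    d' (sensitivity) = z(H) - z(FA); criterion c = -(z(H) + z(FA)) / 2.
    A genuinely perceptual effect changes d'; a pure response bias
    changes c while leaving d' (roughly) intact.
    """
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -(z(hit_rate) + z(false_alarm_rate)) / 2
    return d_prime, criterion

# Hypothetical data: similar sensitivity in both conditions, but a much
# more liberal criterion (more "gun" responses) in the second.
print(sdt_measures(0.80, 0.20))  # d' ≈ 1.68, c = 0.0
print(sdt_measures(0.90, 0.40))  # d' ≈ 1.53, c ≈ -0.51 (liberal)
```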
Further findings
Howe and Carter (2016) identified two perception-like phenomena driven by top-down processes. First, there are visual hallucinations, which are found in schizophrenic patients. Hallucinations are experienced as actual perceptions even though the relevant object is not present and so they cannot depend on bottom-up processes. Second, there is visual imagery (discussed on pp. 130–137). Visual imagery recruits several of the processes involved in visual perception. Like hallucinations, visual imagery occurs in the absence of bottom-up processes because the relevant object is absent.
Lupyan (2017) discussed numerous top-down effects on visual perception in studies avoiding the problems identified by Firestone and Scholl (2016). Look at Figure 3.12, which apparently shows an ordinary brick wall. If that is what you see, have another look. This time see whether you can spot the object mentioned at the end of the Conclusions section. Once you have spotted the object, it becomes impossible not to see it afterwards. Here we have powerful effects of top-down processing on perception based on knowledge of what is in the photograph.

Figure 3.12
A brick wall that can be seen as something else.
From Plait (2016).
Conclusions
As Firestone and Scholl (2016) argued, it is hard to demonstrate top-down
processes directly influence perception rather than attention, memory or
response bias. However, many studies have shown such a direct influence.
As Yardley et al. (2012, p. 1) pointed out, “Perception relies on existing
knowledge as much as it does on incoming information.” Note, however,
the influence of top-down processes is generally greater when visual stimuli
are degraded. By the way, the hard-to-spot object in the photograph is a
cigar!
Theories emphasising top-down processes
Bar et al. (2006) found greater activation of the orbitofrontal cortex
(part of the prefrontal cortex) when object recognition was hard than
when it was easy. This activation occurred 50 ms before activation in recognition-related regions of the temporal cortex, and so seemed important for object recognition. In Bar et al.'s model, object recognition depends on top-down processes involving the orbitofrontal cortex and bottom-up processes involving the ventral visual stream (see Figure 3.13;
and Chapter 2).
Trapp and Bar (2015) developed this model claiming that visual input rapidly elicits various hypotheses concerning what has been presented. Subsequent top-down processes associated with the orbitofrontal cortex select relevant hypotheses and suppress irrelevant ones. More specifically, the orbitofrontal cortex uses contextual information to generate hypotheses and resolve competition among hypotheses. Palmer (1975) showed the importance of context. He presented a picture of a scene (e.g., a kitchen) followed by the very brief presentation of the picture of an object. The object was recognised more often when relevant to the context (e.g., a loaf) than when irrelevant (e.g., a drum).
Interactive-iterative framework
Baruch et al. (2018) argued that previous theorists had not appreciated the full complexities
of interactions between bottom-up and top-down processes in object recognition. They rectified this situation with their interactive-iterative framework (see Figure 3.14).
According to this framework, observers
typically form hypotheses concerning object
identity based on their goals, knowledge and
the environmental context. Of importance,
these hypotheses are often formed before the
object is presented. Observers discriminate
among competing hypotheses by attending to a distinguishing feature of the object.
For example, if your tentative hypothesis
was elephant, you might allocate attention
to the expected location of its trunk. If that
failed to provide the necessary information
because that area was partially hidden (see
Figure 3.15), you might then attend to other
features (e.g., size and shape of the leg; skin
texture).
In sum, Baruch et al. (2018) emphasised
two related top-down processes strongly influencing object recognition. First, observers
form hypotheses about the possible identity
of an object prior to (or in interaction with)
the visual input. Second, observers direct
their attention to object parts likely to be
maximally informative concerning its identity.
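The framework's control flow can be captured in a toy simulation (a schematic sketch with invented numbers, loosely modelled on the tass/grout experiment described below, not Baruch et al.'s implementation): hypotheses guide attention to the most diagnostic visible feature, the resulting evidence updates the hypotheses, and attention is redirected when a feature is occluded.

```python
# Toy world: two fish types differing on "tail" and "mouth" features.
LIKELIHOOD = {                       # P(feature present | fish type)
    "tass":  {"tail": 0.9, "mouth": 0.7},
    "grout": {"tail": 0.1, "mouth": 0.3},
}

def recognise(observe, priors, threshold=0.95, max_cycles=5):
    """Interactive-iterative loop: attend to the most diagnostic visible
    feature, update the hypotheses with the evidence, repeat until one
    hypothesis dominates."""
    probs = dict(priors)
    best = max(probs, key=probs.get)
    for feature in ["tail", "mouth"] * max_cycles:  # tail first: most diagnostic
        value = observe(feature)                    # bottom-up data extraction
        if value is None:                           # feature occluded:
            continue                                # redirect attention
        for h in probs:                             # Bayesian update
            p_obs = LIKELIHOOD[h][feature]
            probs[h] *= p_obs if value else 1 - p_obs
        total = sum(probs.values())
        probs = {h: p / total for h, p in probs.items()}
        best = max(probs, key=probs.get)
        if probs[best] > threshold:                 # object identified
            break
    return best, round(probs[best], 3)

# A "tass" whose tail region is hidden: attention shifts to the mouth, and
# repeated glances at it accumulate evidence until the threshold is crossed.
print(recognise(lambda f: None if f == "tail" else True,
                {"tass": 0.5, "grout": 0.5}))       # ('tass', 0.967)
```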
Figure 3.13
In this modified version of Bar et al.'s (2006) theory, it is assumed that object recognition involves two different routes: (1) a top-down route in which information proceeds rapidly to the orbitofrontal cortex, which is involved in generating predictions about the object's identity; (2) a bottom-up route using the slower ventral visual stream.
From Yardley et al. (2012). Reprinted with permission from Springer.
Figure 3.14
Interactive-iterative framework for object recognition with top-down processes shown in dark green and bottom-up processes in brown. (Flowchart labels: pre-existing dynamically changing context; goals, knowledge and context-based expectancies; hypothesis/es regarding object identity; object identified? yes – response, no – guidance of attention to distinguishing features; visual data extraction from visual input; potential conflict; capture of attention by salient features.)
From Baruch et al. (2018). Reprinted with permission of Elsevier.
Findings
According to the interactive-iterative framework, expectations can exert top-down influences on processing even before a visual stimulus is presented. Kok et al. (2017) obtained support for this prediction. Observers expecting a given stimulus produced a neural signal resembling that generated by the actual presentation of the stimulus shortly before it was presented.
Baruch et al. (2018) tested various predictions from their theoretical framework. In one experiment, participants decided which of two types of artificial fish (tass or grout) had been presented. The two fish types differed with respect to distinguishing features associated with the tail and the mouth, with the tail being easier to discriminate. As predicted, participants generally attended more to the tail than the mouth region from stimulus onset. When much of the tail region was hidden from view, participants redirected their attention to the mouth region.

Figure 3.15
Recognising an elephant when a key feature (its trunk) is partially hidden.
From Baruch et al. (2018). Reprinted with permission of Elsevier.
Summary
Numerous theorists have argued that object recognition depends on
top-down processes as well as bottom-up ones. Baruch et al.'s (2018) interactive-iterative framework extends such ideas by identifying how these
two types of processes interact. Of central importance, top-down processes
influence the allocation of attention, and the allocation of attention influences subsequent bottom-up processing.
FACE RECOGNITION
There are two main reasons for devoting a section to face recognition.
First, recognising faces is of enormous importance to us, since we generally identify individuals from their faces. Form a visual image of someone
important in your life – it probably contains detailed information about
their face.
Second, face recognition differs importantly from other forms of object
recognition. As a result, we need theories specifically devoted to face recognition rather than simply relying on theories of object recognition.
KEY TERM
Holistic processing
Processing that involves
integrating information
from an entire object
(especially faces).
Face vs object recognition
How does face recognition differ from object recognition? There is more
holistic processing in face recognition. Holistic processing involves “integration across the area of the face, or processing of the relationships
between features as well as, or instead of, the features themselves” (Watson
& Robbins, 2014, p. 1). Holistic processing is faster because facial features
are processed in parallel rather than individually. Caharel et al. (2014)
found faces can be categorised as familiar or unfamiliar within approximately 200 ms. Holistic processing is also more reliable than feature processing because individual facial features (e.g., mouth shape) are subject to
change.
Relevant evidence comes from the face inversion effect: faces are
much harder to identify when presented inverted or upside-down rather
than upright (Bruyer, 2011). This effect probably reflects difficulties in
processing inverted faces holistically. There are surprisingly large effects
of face inversion within the brain – Rosenthal et al. (2017, p. 4823) found
face inversion “induces a dramatic functional reorganisation across related
brain networks”.
In contrast, adverse effects of inversion are often much smaller with
non-face objects. For example, Klargaard et al. (2018) found there was a
larger inversion effect for faces than for cars. However, it can be argued we
possess expertise in face recognition and so we should consider individuals
possessing expertise with non-face objects. The findings are mixed. Rossion
and Curran (2010) found car experts had a much smaller inversion effect
for cars than faces. However, those with the greatest expertise showed a
greater inversion effect for cars. In contrast, Weiss et al. (2016) found horse
experts had no inversion effect for horses.
More evidence suggesting faces are special comes from the part-whole
effect – it is easier to recognise a face part when presented within a
whole face rather than in isolation. Farah (1994) studied this effect using
drawings of faces and houses. Participants' ability to recognise face parts was much better when whole faces were presented rather than only
a single feature (i.e., the part-whole effect). In contrast, recognition performance for house features was very similar in whole and single-feature
conditions.
Richler et al. (2011) explored the hypothesis that faces are processed
holistically by using composite faces. Composite faces consist of a top half
and a bottom half that may or may not be from the same face. The task
was to decide whether the top halves of two successive composite faces
were the same or different. Performance was worse when the bottom halves
of the two composite faces were different. This composite face effect suggests people find it hard to ignore the bottom halves and thus that face
processing is holistic.
Finally, accurate face recognition is so important to humans we might
expect to find holistic processing of faces even in young children. As predicted, children aged between 3 and 5 show holistic processing (McKone
et al., 2012).
In sum, face recognition (even in young children) involves holistic processing. However, it remains unclear whether the processing differences
between faces and other objects occur because faces are special or because
we have dramatically more expertise with faces than most other object categories. Relevant evidence was reported by Ross et al. (2018). When participants were presented with car pictures, car experts formed more holistic
representations within the brain than did car novices. The role played by
expertise is discussed further shortly.
KEY TERM
Face inversion effect
The finding that faces are
much harder to recognise
when presented upside
down; the effect of
inversion is less marked
(or absent) with other
objects.
Part-whole effect
The finding that a face
part is recognised more
easily when presented in
the context of a whole
face rather than on its
own.
Prosopagnosia

KEY TERM

Prosopagnosia
A condition (also known as face blindness) in which there is a severe impairment in face recognition but much less impairment of object recognition; it is often the result of brain damage (acquired prosopagnosia) but can also be due to impaired development of face-recognition mechanisms (developmental prosopagnosia).
Much research has involved brain-damaged patients with severely impaired
face processing. Such patients suffer from prosopagnosia (pros-uh-pag-NO-see-uh), coming from the Greek words for "face" and "without knowledge". Prosopagnosia is also known as "face blindness".
Prosopagnosia is a heterogeneous or diverse condition with the precise
problems of face and object recognition varying across patients. It can
be caused by brain damage (acquired prosopagnosia) or can occur in the
absence of any obvious brain damage (developmental prosopagnosia).
Acquired prosopagnosics differ in terms of their specific face-processing
deficits and brain areas involved (discussed later).
Studying prosopagnosics is of direct relevance to the issue of whether
face recognition involves specific or specialised processes absent from
object recognition. If prosopagnosics invariably have great impairments
in object recognition, it would suggest face and object recognition involve
similar processes. In contrast, if some prosopagnosics have intact object
recognition, it would imply the processes underlying the two forms of recognition are different.
Farah (1991) reviewed research on patients with acquired prosopagnosia. All these patients also had more general problems with object
recognition. However, some exceptions have been reported. Moscovitch
IN REAL LIFE: HEATHER SELLERS
We can understand the profound problems prosopagnosics suffer in everyday life by considering Heather Sellers (see YouTube: "You Don't Look Like Anyone I Know"). She is an American woman with severe prosopagnosia. When she was a child, she became separated from her mother at a grocery store. When reunited with her mother, she did not initially recognise her.

Heather Sellers still has difficulty in recognising her own face. Heather: "A few times I have been in a crowded elevator with mirrors all around and a woman will move, and I will go to get out the way and then realise 'oh that woman is me'." Such experiences made her very anxious.

Surprisingly, Heather Sellers was 36 before she realised she had prosopagnosia. Why was this? As a child, she became very skilled at identifying people by their hair style, body type, clothing, voice and gait. In spite of these skills, she has occasionally failed to recognise her own husband! According to Heather Sellers, "Not being able to reliably know who people are – it feels terrible like failing all the time."
et al. (1997) studied CK, a man with object agnosia (impaired object recognition). He performed comparably to healthy controls on several face-recognition tasks including photos, caricatures and cartoons.
Geskin and Behrmann (2018) reviewed the literature on patients with
developmental prosopagnosia. Out of 238 cases, 80% had impaired object
recognition but 20% did not. Thus, several patients had impaired face
recognition but not object recognition. We would have a double dissociation (see Glossary) if we could find individuals with developmental object
agnosia but intact face recognition. Germine et al. (2011) found a female
(AW), who had preserved face recognition but impaired object recognition
for many categories of objects.
Overall, far more individuals have impaired face recognition (prosopagnosia) but relatively intact object recognition than have impaired object
recognition but intact face recognition. These findings suggest that, “Face
recognition is an especially difficult instance of object recognition where
both systems [i.e., face and object recognition] rely on a common mechanism” (Geskin & Behrmann, 2018, p. 18). Face recognition is hard in part
because it involves distinguishing among broadly similar category members
(e.g., two eyes; nose; mouth). In contrast, object recognition often only
involves identifying the relevant category (e.g., cat; car). According to this
viewpoint, prosopagnosics would perform poorly if required to make fine-grained perceptual judgments with objects.
An alternative interpretation emphasises expertise (Wang et al., 2016).
Nearly everyone has more experience (and expertise) at recognising faces
than the great majority of other objects. It is thus possible that brain
damage in prosopagnosics affects areas associated with expertise generally rather than specifically faces (the expertise hypothesis is discussed on
pp. 122–124).
Findings
In spite of their poor conscious or explicit recognition of faces, many
prosopagnosics show evidence of covert recognition (face processing
without conscious awareness). For example, Eimer et al. (2012) found
developmental prosopagnosics were much worse than healthy controls at
explicit recognition of famous faces (27% vs 82% correct, respectively).
However, famous faces produced brain activity in half the developmental prosopagnosics indicating the relevant memory traces were activated
(covert recognition). These prosopagnosics have very poor explicit recognition performance because brain areas containing more detailed information
about the famous individuals were not activated.
Busigny et al. (2010b) compared the first two interpretations discussed
above by using object-recognition tasks requiring complex within-category
distinctions for several categories: birds, boats, cars, chairs and faces. A
male patient (GG) with acquired prosopagnosia was as accurate as controls with each non-face category (see Figure 3.16). However, he was substantially less accurate than controls with faces (67% vs 94%, respectively).
Thus, GG apparently has a face-specific impairment rather than a general
inability to recognise complex stimuli.
Figure 3.16
Accuracy and speed of
object recognition for birds,
boats, cars, chairs and faces
by patient GG and healthy
controls.
From Busigny et al. (2010b).
Reprinted with permission from
Elsevier.
Busigny et al. (2010a) reviewed previous findings suggesting many
patients with acquired prosopagnosia have essentially intact object recognition. However, this research was limited because the difficulty of the
recognition decisions required of the patients was not controlled systematically. Busigny et al. manipulated the similarity between target items and
distractors on an object-recognition task. Increasing similarity had comparable effects on PS (a patient with acquired prosopagnosia) and healthy
controls. In contrast, PS performed very poorly on a face-recognition task
which was very easy for healthy controls.
Why is face recognition so poor in prosopagnosics? Busigny et al.
(2010b) tested the hypothesis that they have great difficulty with holistic
processing. A prosopagnosic patient (GG) did not show the face inversion
or composite face effects suggesting he does not perceive individual faces
holistically (an ability enhancing accurate face recognition). In contrast,
GG’s object recognition was intact perhaps because holistic processing was
not required.
Van Belle et al. (2011) also investigated the deficient holistic processing hypothesis. GG’s face-recognition performance was poor when holistic
processing was possible. However, it was intact when it was not possible to
use holistic processing (only one part of a face was visible at a time).
Finally, we consider the expertise hypothesis. According to this hypothesis, faces differ from most other categories of objects in that we have more expertise in identifying faces. As a result, apparent differences between faces and other objects in processes and brain mechanisms may mostly reflect differences in expertise. This hypothesis is discussed further below on pp. 122–124. Barton and Corrow (2016) reported evidence consistent with this hypothesis in patients with acquired prosopagnosia who had expertise in car recognition and reading prior to their brain damage. These patients had impairments in car recognition and aspects of visual word reading, suggesting they had problems with objects for which they had possessed expertise (i.e., objects of expertise). Contrary evidence was reported by Weiss et al. (2016), who studied a patient (OH) with developmental prosopagnosia. In spite of severely impaired face-recognition ability, she displayed superior recognition skills for horses (she had spent 15 years working with them). Thus, visual expertise can be acquired independently of the mechanisms responsible for expertise in face recognition.
In sum, the finding that many prosopagnosics have face-specific
impairments is consistent with the hypothesis that face recognition involves
special processes. However, more general recognition impairments have
also often been reported and are apparently inconsistent with that hypothesis. There is also some support for the expertise hypothesis but again the
findings are mixed.
Fusiform face area
KEY TERM
Fusiform face area
An area that is associated
with face processing;
the term is somewhat
misleading given that the
area is also associated
with processing other
categories of objects.
If faces are processed differently to other objects, we would expect to find
brain regions specialised for face processing. The fusiform face area (FFA)
in the ventral temporal cortex has (as its name strongly implies!) been identified as such a brain region.
The fusiform face area is indisputably involved in face processing. Downing et al. (2006) found the fusiform face area responded more
strongly to faces than to any of 18 object categories (e.g., tools; fruits; vegetables). However, other brain regions, including the occipital face area (OFA) and the superior temporal sulcus (STS), are also face-selective (Grill-Spector et al., 2017; see Figure 3.17). Such findings indicate that face processing depends on one or more brain networks rather than simply on the fusiform face area.
Even though several brain areas are face-selective, the fusiform
face area has been regarded as having special importance. For example,
Axelrod and Yovel (2015) considered brain activity in several face-selective regions when observers were shown photos of Leonardo
DiCaprio and Brad Pitt. The fusiform face area was the only region in
which the pattern of brain activity differed significantly between these
actors. However, Kanwisher et al. (1997) found only 80% of their participants had greater activation within the fusiform face area to faces than
to other objects.
In sum, the fusiform face area plays a major role in face processing and recognition for most (but probably not all) individuals. However, face processing depends on a brain network, including several areas in addition to the fusiform face area (see Figure 3.17).

Figure 3.17
Face-selective areas in the right hemisphere, shown in (a) dorsal and (b) ventral views. OFA = occipital face area; FFA = fusiform face area; pSTS-FA and aSTS-FA = posterior and anterior superior temporal sulcus face areas; IFG-FA = inferior frontal gyrus face area; ATL-FA = anterior temporal lobe face area.
From Duchaine and Yovel (2015).

Note also that the fusiform
face area is activated when we process numerous types of non-face objects.
Finally, face-processing deficits in prosopagnosics are not limited to the
fusiform face area. For example, developmental prosopagnosics had less
selectivity for faces than healthy controls in 12 different face areas (including the fusiform face area) (Jiahui et al., 2018).
Expertise hypothesis
According to advocates of the expertise hypothesis (e.g., Wang et al., 2016;
discussed on p. 119), major differences between face and object processing
should not be taken at face value (sorry!). According to this hypothesis,
the brain and processing mechanisms allegedly specific to faces are also
involved in processing and recognising all object categories for which we
possess expertise. Thus, we should perhaps relabel the fusiform face area as
the “fusiform expertise area”.
Why is expertise so important in determining face and object processing? One reason is that expertise leads to greater holistic or integrated processing. For example, chess experts can very rapidly use holistic processing
based on their relevant stored knowledge to understand complex chess
positions (see Chapter 12).
Three main predictions follow from the expertise hypothesis:
(1) Holistic or configural processing is not unique to faces but should be found for any objects of expertise.
(2) The fusiform face area should be highly activated when observers recognise the members of any category for which they possess expertise.
(3) If the processing of faces and of objects of expertise involves similar processes, then processing objects of expertise should interfere with face processing.
Findings
The first prediction is plausible. Wallis (2013) tested a model of object recognition to assess the effects of prolonged exposure to any given stimulus
category. The model predicted that many phenomena associated with face
processing (e.g., holistic processing; inversion effect) would be found for
any stimulus category for which observers had expertise. Repeated simultaneous presentation of the same features (e.g., nose; mouth; eyes) gradually
increases holistic processing. Wallis concluded a single model can explain
object and face recognition.
There is some support for the first prediction in research on detection
of abnormalities in medical images (see Chapter 12). Kundel et al. (2007)
found experts generally fixated on an abnormality in under 1 second, suggesting they used very fast, holistic processes. However, as we saw earlier,
experts with non-face objects often have a small inversion effect (assumed
to reflect holistic processing). McKone et al. (2007) found such experts
rarely show the composite effect (also assumed to reflect holistic processing; discussed on p. 117).
We turn now to the second prediction. In a review, McKone et al.
(2007) found a modest tendency for the fusiform face area to be more
activated by objects of expertise than other objects. However, larger activation effects for objects of expertise were found outside the ­fusiform face
area than inside it. Support for the second prediction was reported by
McGugin et al. (2014): activation to car stimuli within the ­fusiform face
area was greater in participants having greater car expertise.
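Studies in this tradition treat expertise as a continuous individual-difference variable and ask whether it predicts the neural response. A minimal sketch of that analytic logic in Python, with made-up expertise scores and FFA responses (every name and number below is illustrative rather than taken from McGugin et al.):

```python
import numpy as np
from scipy import stats

# Hypothetical data: one value per participant.
car_expertise = np.array([0.2, 0.5, 0.9, 1.4, 1.8, 2.3, 2.9, 3.1])          # e.g., d' on a car-matching task
ffa_response = np.array([0.15, 0.22, 0.31, 0.28, 0.45, 0.51, 0.60, 0.58])   # % signal change to cars in FFA

# A positive expertise-activation correlation is what the second prediction requires.
r, p = stats.pearsonr(car_expertise, ffa_response)
print(f"r = {r:.2f}, p = {p:.3f}")
```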
McGugin et al. (2018) argued that we can test the second prediction
by comparing individuals varying in face-recognition ability (or expertise). As predicted, those with high face-recognition ability exhibited more
face-selective activation within the fusiform face area than those having
low ability.
Bilalić (2016) found chess experts had more activation in the fusiform
face area than non-experts when viewing chess positions but not single
chess pieces. He concluded, “The more complex the stimuli, the more likely
it is that the brain will require the help of the FFA in grasping its essence”
(p. 1356).
It is important not to oversimplify the issues here. Even if face processing and processing of other objects of expertise both involve the fusiform
face area, the two forms of processing may use different neurons in different combinations (Grill-Spector et al., 2017).
We turn now to the third prediction. McKeeff et al. (2010) found that
car experts were slower than novices when searching for face targets among
cars but not among watches. Car and face expertise may have interfered
with each other because they depend on similar processes. Alternatively,
car experts may have been more likely than car novices to attend to distracting cars because they find such stimuli more interesting.
McGugin et al. (2015) also tested the third prediction. Overall, car
experts had greater activation than car novices in face-selective areas (e.g.,
fusiform face area) when processing cars. Of key importance, that difference was greatly reduced when faces were also presented. Thus, interference was created when participants processed objects belonging to two
different categories of expertise (i.e., cars and faces).
Evaluation
There is some support for the expertise hypothesis with respect to all three
predictions. However, the extent of that support remains controversial. One
reason is that it is hard to assess expertise level accurately or to control it. It
is certainly possible that many (but not all) processing differences between
faces and other objects are due to greater expertise with faces. This would
imply that faces are less special than often assumed.
According to the expertise hypothesis, we are face experts. This may
be true of familiar faces, but it is certainly not true of unfamiliar faces
(Young & Burton, 2018). Evidence of the problems we experience in recognising unfamiliar faces is contained in the Box on passport control.
IN REAL LIFE: PASSPORT CONTROL
Look at the 40 faces displayed below (see Figure 3.18). How many different individuals are shown?
Provide your answer before reading on.
Figure 3.18
An array of 40 face photographs to be sorted into piles for each of the individuals shown in the photographs.
From Jenkins et al. (2011). Reproduced with permission from the Royal Society.
In a study by Jenkins et al. (2011) using a similar stimulus array, participants on average decided
7.5 different individuals were shown. However, the actual number for the array used by Jenkins et
al. and the one shown in Figure 3.18 is only two! The two individuals (A and B) are arranged as
shown below:
A B A A A B A B
A B A A A A A B
B B A B B B B A
A A B B A A B A
B A A B B B B B
Perhaps we are poor at matching unfamiliar faces because we rarely perform this task in everyday life. White et al. (2014) addressed this issue in a study on passport officers averaging 8 years of
service. These passport officers indicated on each trial whether a photograph was that of a physically present person. Overall, 6% of valid photos were rejected and 14% of fraudulent photos were
wrongly accepted. Thus, individuals with specialist training and experience are not exempt from
problems in matching unfamiliar faces. The main problem is that there is considerable variability in
how an individual looks in different photos (discussed further on p. 127).
In another experiment, White et al. (2014) compared the performance of passport officers and
students on a matching task with unfamiliar faces. The two groups were comparable, with 71% correct performance on match trials and 89% on non-match trials. Thus, training and experience
were irrelevant.
In White et al.’s (2014) research, 50% of the photos were invalid (non-matching). This is (hopefully!) a massively higher percentage of invalid photos than typically found at passport control.
Papesh and Goldinger (2014) compared performance when actual mismatches occurred on 50% or
10% of trials. In the 50% condition, mismatches were missed on 24% of trials, whereas they were
missed on 49% of trials in the 10% condition. Participants had a low expectation of mismatches in the 10% condition and so were very cautious about deciding two photos were of different individuals (i.e., they adopted a very conservative response criterion, reflecting response bias).
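The notion of a cautious response criterion can be made precise using signal detection theory, in which sensitivity (d′) and criterion (c) are computed from the hit and false-alarm rates. The sketch below uses the miss rates reported by Papesh and Goldinger but pairs them with assumed false-alarm rates, so the resulting numbers are purely illustrative:

```python
from scipy.stats import norm

def sdt(hit_rate, false_alarm_rate):
    """Return sensitivity (d') and criterion (c) from hit and false-alarm rates."""
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(false_alarm_rate)
    d_prime = z_hit - z_fa
    c = -0.5 * (z_hit + z_fa)   # c > 0 = conservative (reluctant to respond "mismatch")
    return d_prime, c

# Treat "mismatch" as the signal, so hits = correctly detected mismatches.
# The miss rates (24% and 49%) are from Papesh and Goldinger (2014); the
# false-alarm rates (valid pairs wrongly called mismatches) are assumed.
for label, miss, fa in [("50% mismatch base rate", 0.24, 0.10),
                        ("10% mismatch base rate", 0.49, 0.05)]:
    d, c = sdt(1 - miss, fa)
    print(f"{label}: d' = {d:.2f}, c = {c:.2f}")
```

With these illustrative inputs, c is much larger in the 10% condition: the same logic the text describes, expressed as a shift in criterion rather than a loss of sensitivity.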
Papesh et al. (2018) replicated the above findings. They attempted to improve performance
in the condition where mismatches occurred on only 10% of trials by introducing blocks where
mismatches occurred on 90% of trials. However, this manipulation had little effect because participants were reluctant to abandon their very cautious response criterion.
How can we provide better security at passport control? Increased practice at matching unfamiliar faces is not the answer – White et al. (2014) found performance was unrelated to the number
of years passport officers had served. A promising approach is to find individuals having an exceptional ability to recognise faces (super-recognisers). Robertson et al. (2016) asked participants to
decide whether face pairs depicted the same person. Mean accuracy was 96% for previously identified police super-recognisers compared to only 81% for police trainees.
Why do some individuals have very superior face-recognition ability? Wilmer et al. (2010)
found the face-recognition performance of monozygotic (identical) twins was much closer than that
of dizygotic (fraternal) twins, indicating face-recognition ability is strongly influenced by genetic
factors. Face-recognition ability correlated very modestly with other forms of recognition (e.g.,
abstract art images), suggesting it is very specific. In similar fashion, Turano et al. (2016) found
good and poor face recognisers did not differ with respect to car-recognition ability.
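Twin comparisons of this kind permit a rough heritability estimate via Falconer's formula, h² ≈ 2(rMZ − rDZ). A minimal sketch with placeholder twin correlations (the values below are hypothetical rather than Wilmer et al.'s exact figures):

```python
# Falconer's formula: heritability from twin correlations.
r_mz = 0.70  # hypothetical monozygotic-twin correlation on a face-memory test
r_dz = 0.35  # hypothetical dizygotic-twin correlation

h2 = 2 * (r_mz - r_dz)   # additive genetic variance
c2 = r_mz - h2           # shared-environment variance
e2 = 1 - r_mz            # non-shared environment plus measurement error
print(f"h^2 = {h2:.2f}, c^2 = {c2:.2f}, e^2 = {e2:.2f}")
```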
Theoretical approaches
Bruce and Young’s (1986) model has been the most influential theoretical
approach to face processing and recognition and so we start with it. It is a
serial stage model consisting of eight components (see Figure 3.19):
(1) Structural encoding: this produces various descriptions or representations of faces.
(2) Expression analysis: people's emotional states are inferred from their facial expression.
(3) Facial speech analysis: speech perception is assisted by lip reading (see Chapter 9).
(4) Direct visual processing: specific facial information is processed selectively.
(5) Face recognition units: these contain structural information about known faces; this structural information emphasises the less changeable aspects of the face and is fairly abstract.
(6) Person identity nodes: these provide information about individuals (e.g., occupation; interests).
(7) Name generation: a person's name is stored separately.
(8) Cognitive system: this contains additional information (e.g., most actors have attractive faces); it influences which components receive attention.

KEY TERM
Super-recognisers
Individuals with an outstanding ability to recognise faces.
What predictions follow? First, there should
be major differences in the processing of
familiar and unfamiliar faces because various
components (face recognition units; person
identity nodes; name generation) are involved
only when processing familiar faces. Thus, it is
much easier to recognise familiar faces, especially when faces are seen from an unusual
angle.
Second, separate processing routes are
involved in working out facial identity (who
is it?) and facial expression (what is he/she
feeling?). The former processing route (including the occipital face area and the fusiform
face area) focuses on relatively unchanging
aspects of faces, whereas the latter (involving
the superior temporal sulcus) deals with more
changeable aspects. This separation between
processes responsible for recognising identity
and expression makes sense – if there were
no separation, we would have great problems recognising familiar faces with unusual
expressions (Young, 2018).
Figure 3.19
The model of face recognition put forward by Bruce and Young (1986).
Adapted from Bruce and Young (1986). Reprinted with permission of Elsevier.

Third, when we see a familiar face, familiarity information from the face recognition unit should be accessed first. This is followed by information about that person (e.g., occupation) from the person identity node and then that person's name from the name generation component. As a result, we can find a
face familiar while unable to recall anything else about that person, or we
can recall personal information about a person while being unable to recall
their name. However, a face should never lead to recall of the person’s
name in the absence of other information.
Fourth, the model assumes face processing involves several stages.
This implies the nature of face-processing impairments in brain-damaged
patients depends on which stages of processing are impaired. Davies-Thompson et al. (2014) developed the model to account for three forms of
face impairment (see Figure 3.20).
Findings
According to the model, it is easier to recognise familiar faces than unfamiliar ones for various reasons. Of special importance, we possess much more
structural information about familiar faces. This structural information (associated with face recognition units) relates to relatively unchanging aspects of faces and gradually accumulates with increasing familiarity with any given face.

Figure 3.20
The model comprises early perceptual encoding (of static and dynamic structure), expression analysis, facial memories, voice and gait analysis, biographic information and name input. Damage to regions of the inferior occipito-temporal cortex (including fusiform face area (FFA) and occipital face area (OFA)) is associated with apperceptive prosopagnosia (blue); damage to anterior inferior temporal cortex (aLT) is associated with associative prosopagnosia (red); and damage to the anterior temporal pole is associated with person-specific amnesia (green). Davies-Thompson et al. (2014) discuss evidence consistent with their model.
From Davies-Thompson et al. (2014).
However, the differences in ease of recognition between familiar and
unfamiliar faces are greater than envisaged by Bruce and Young (1986).
Jenkins et al. (2011) found 40 face photographs showing only two different
unfamiliar individuals were thought to show almost four times that number
(discussed on p. 124). The two individuals were actually two Dutch celebrities almost unknown in Britain. When Jenkins et al. (2011) repeated their
experiment with Dutch participants, nearly all performed the task perfectly
because the faces were so familiar.
Why is unfamiliar face recognition so difficult? There is considerable within-person variability in facial images, which is why different photographs of the same unfamiliar individual often look as if they come from different individuals (Young & Burton, 2017, 2018). Jenkins and Burton
(2011) argued we could improve identification of unfamiliar faces by averaging across several photographs of the same individual and so greatly
reducing image variability. Their findings supported this prediction.
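Computationally, the averaging procedure is simple: once photographs of the same person have been spatially aligned, pixelwise averaging washes out image-specific variation (lighting, expression, camera) while preserving the stable structure of the face. A minimal sketch, assuming a set of pre-aligned, equally sized greyscale photographs (the file names are hypothetical):

```python
import numpy as np
from PIL import Image

# Hypothetical file names: several aligned photos of the same person.
paths = ["person_a_01.png", "person_a_02.png", "person_a_03.png"]

# Stack the images and average pixelwise; idiosyncrasies of individual
# photos tend to cancel, leaving a stable "face average" for matching.
stack = np.stack([np.asarray(Image.open(p).convert("L"), dtype=float) for p in paths])
average = stack.mean(axis=0)
Image.fromarray(average.astype(np.uint8)).save("person_a_average.png")
```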
Burton et al. (2016) shed additional light on the complexities of recognising unfamiliar faces. In essence, how one person’s face varies across
images differs from how someone else’s face varies. Thus, the characteristics that vary or remain constant across images differ from one individual
to another.
The second prediction is that different routes are involved in the processing of facial identity and facial expression. There is some support
for this prediction. Fox et al. (2011) found patients with damage to the
face-recognition network had impaired identity perception but not expression perception. In contrast, a patient with damage to the superior temporal sulcus had impaired expression perception but reasonably intact
identity perception. Sliwinska and Pitcher (2018) confirmed the role played
by the superior temporal sulcus. Transcranial magnetic stimulation (TMS;
see Glossary) applied to this area impaired recognition of facial expression.
However, the two routes are not entirely independent. Judgements of
facial expression are strongly influenced by irrelevant identity information
(Schweinberger & Soukup, 1998). Redfern and Benton (2017) asked participants to sort cards of faces into piles, one for each perceived identity. One
pack contained expressive faces and the other neutral faces. With expressive faces, faces belonging to different individuals were more likely to be
placed in the same pile. Thus, expressive facial information can influence
(and impair) identity perception.
Fitousi and Wenger (2013) asked participants to respond positively to
a face that had a given identity and emotion (e.g., a happy face belonging to Keira Knightley). Facial identity and facial expression were not processed independently although they should have been according to the model.
Another issue is that the facial expression route is more complex than
assumed theoretically. For example, damage to the amygdala produces
greater deficits in recognising fear and anger than other emotions (Calder
& Young, 2005). Young and Bruce (2011) admitted they had not expected
deficits in emotion recognition in faces to vary across emotions.
The third prediction is that we always retrieve personal information
(e.g., occupation) about a person before recalling their name. Young et al.
(1985) asked people to record problems they experienced in face recognition. There were 1,008 such incidents but people never reported putting
a name to a face while knowing nothing else about that person. In contrast, there were 190 occasions on which someone remembered a reasonable amount of information about a person but not their name (also as
predicted by the model).
Several other findings support the third prediction (Hanley, 2011).
However, the notion that names are always recalled after personal information is too rigid. Calderwood and Burton (2006) asked fans of the
television series Friends to recall the name or occupation of the main characters when shown their faces. Names were recalled faster than occupations (against the model’s prediction).
Fourth, we relate face-processing impairments to Bruce and Young’s
(1986) serial stage model. We consider three such impairments (discussed by
Davies-Thompson et al., 2014; see Figure 3.20) with reference to Figure 3.19:
(1) Patients with impaired early stages of face processing: such patients (categorised as having apperceptive prosopagnosia) have "an inability to form a sufficiently accurate representation of the face's structure from visual data" (Davies-Thompson et al., 2014, p. 161). As a result, faces are often not recognised as familiar.
(2) Patients with impaired ability to access facial memories in face recognition units although early processing of facial structure is relatively intact: such patients have associative prosopagnosia; they have greater problems with memory than perception.
(3) Patients with impaired access to biographical information stored in person identity nodes: such patients have person-specific amnesia and differ from those with associative prosopagnosia because they often cannot recognise other people by any cues (including spoken names or voices).
So far we have applied Bruce and Young’s (1986) model to acquired prosopagnosia. However, we can also apply the model to developmental prosopagnosia (in which face-recognition mechanisms fail to develop normally).
Parketny et al. (2015) presented previously unfamiliar faces to developmental prosopagnosics and recorded event-related potentials (ERPs) while
they performed an easy face-recognition task. They focused on three ERP
components:
(1) N170: this early component (about 170 ms) reflects processes involved in perceptual structural face processing.
(2) N250: this component (about 250 ms) reflects a match between a presented face and a stored face representation.
(3) P600: this component (about 600 ms) reflects attentional processes associated with face recognition.
What did Parketny et al. (2015) find? Recognition times were 150 ms slower
in the developmental prosopagnosics than healthy controls. N170 was
broadly similar in both groups with respect to timing and magnitude. N250
occurred 40 ms later in the prosopagnosics than in controls but was of comparable
magnitude. Finally, P600 was significantly smaller in the prosopagnosics
than controls and was delayed by 80 ms. In sum, developmental prosopagnosics show relatively intact early face processing but are slower and less
efficient later in processing. ERPs provide an effective way of identifying
those aspects of face processing adversely affected in prosopagnosia.
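Component measures of this kind are obtained by scoring peak latency and amplitude within a predefined time window of the averaged waveform. A sketch of that scoring step for the N170 (a negative-going peak), using a simulated waveform rather than real ERP data:

```python
import numpy as np

# Simulated averaged waveform: 1 kHz sampling, -100 to +599 ms around face onset.
t = np.arange(-100, 600) / 1000.0                        # seconds
erp = np.random.randn(t.size) * 0.2                      # background noise (µV)
erp += -4 * np.exp(-((t - 0.17) ** 2) / (2 * 0.02**2))   # injected N170-like deflection

# The N170 is scored as the most negative point in a 130-210 ms search window.
window = (t >= 0.130) & (t <= 0.210)
peak = np.where(window)[0][np.argmin(erp[window])]
print(f"N170 latency = {t[peak] * 1000:.0f} ms, amplitude = {erp[peak]:.2f} µV")
```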
Evaluation
Bruce and Young (1986) provided a comprehensive framework emphasising the wide range of information that can be extracted from faces. It was
remarkably innovative in identifying the major processes and structures
involved in face processing and recognition and incorporating them within a
plausible serial stage approach. Finally, the model enhanced our understanding of why familiar faces are much easier to recognise than unfamiliar ones.
What are the model’s limitations? First, the complexities involved in
recognising unfamiliar faces (e.g., coping with the great variability in a
given individual’s facial images) were not fully acknowledged. As Young
and Burton (2017, p. 213) pointed out, it was several years after 1986
before researchers appreciated that “humans’ relatively poor performance
at ­unfamiliar-face recognition is as much a problem of perception as of
memory”.
Second, the model’s account of the processing of facial expression
is oversimplified. For example, the processing of facial expression is less
independent of the processing of facial identity than assumed theoretically.
According to the model, damage to the expression analysis component
should produce impaired ability to recognise all facial expressions. In fact,
many brain-damaged patients have much greater impairment in facial recognition of some emotions than others (Young, 2018).
Third, the model was somewhat vague about the precise information
stored in the face recognition units and the person identity nodes. Fourth,
it was wrong to exclude gaze perception from the model because gaze provides useful information about what the person we are looking at is attending to (Young &
Bruce, 2011). Fifth, Bruce and Young (1986) focused on general factors
influencing face recognition. However, as discussed earlier, there are substantial individual differences in face-recognition ability with a few individuals (super-recognisers) having outstanding ability. These individual
differences depend mostly on genetic factors (Wilmer, 2017) not considered
within the model.
KEY TERMS
Aphantasia
The inability to form mental images of objects when those objects are not present.
Hallucinations
Perceptual experiences that appear real even though the individuals or objects perceived are not present.
VISUAL IMAGERY
Close your eyes and imagine the face of someone you know very well.
What did you experience? Many people claim forming visual images is like
“seeing with the mind’s eye”, suggesting there are important similarities
between imagery and perception. Mental imagery is typically regarded as
involving conscious experience. However, we could also regard imagery as
a form of mental representation (an internal cognitive symbol representing
aspects of external reality) (e.g., Pylyshyn, 2002). We would not necessarily
be consciously aware of images as mental representations.
Galton (1883) supported the above viewpoint. He found many individuals reported no conscious imagery when imagining a definite object
(e.g., their breakfast table). Zeman et al. (2015) studied several individuals lacking visual imagery and coined the term aphantasia to refer to this
condition.
If visual imagery and perception are similar, why do we very rarely
confuse them? One reason is that we are generally aware of deliberately
constructing images (unlike with visual perception). Another reason is
that images contain much less detail. For example, people rate their visual
images of faces as similar to photographs lacking sharp edges and borders
(Harvey, 1986).
However, many people sometimes confuse visual imagery and
­perception. Consider hallucinations in which perception-like experiences
occur in the absence of the appropriate environmental stimulus. Visual
hallucinations occur in approximately 27% of schizophrenic patients
but also in 7% of the general population. Waters et al. (2014) discussed
research showing visual hallucinations in schizophrenics are often associated with activity in the primary visual cortex, suggesting hallucinations
involve many processes associated with visual perception. One reason
schizophrenics are susceptible to visual hallucinations is distorted top-down processing (e.g., forming strong expectations of what
they will see).
In Anton’s syndrome (“blindness denial”), blind people are unaware
that they are blind and sometimes confuse imagery for actual perception.
Goldenberg et al. (1995) described a patient whose primary visual cortex
had been nearly wholly destroyed. Nevertheless, she generated such vivid
visual images that she mistook them for genuine visual perception. The
brain damage in patients with Anton’s syndrome typically includes large
parts of the visual cortex (Gandhi et al., 2016).
There is also Charles Bonnet syndrome, defined as "consistent or periodic complex visual hallucinations that occur in visually impaired individuals" (Yacoub & Ferrucci, 2011, p. 421). However, patients are generally aware the hallucinations are not real and so they are actually pseudo-hallucinations. When patients hallucinate, they have increased activity in brain areas specialised for visual processing (e.g., hallucinations in colour are associated with activity in colour-processing areas) (ffytche et al., 1998).
Painter et al. (2018) identified a major reason for this elevated activity.
Stimuli presented to intact regions of the retina cause extreme excitability
(hyperexcitability) within early visual cortex. Visually impaired individuals
with hallucinations show greater hyperexcitability than those without.
KEY TERMS
Anton’s syndrome
A condition found in some
blind people in which
they misinterpret their
visual imagery as visual
perception.
Charles Bonnet
syndrome
A condition in which
individuals with eye
disease form vivid
and detailed visual
hallucinations sometimes
mistaken for visual
perception.
Depictive representation
A representation (e.g.,
visual image) resembling
a picture in that objects
within it are organised
spatially.
Why is visual imagery useful?
What functions are served by visual imagery? According to Moulton and
Kosslyn (2009, p. 1274), visual imagery “allows us to answer ‘what if’ questions by making explicit and accessible the likely consequences of being in
a specific situation or performing a specific action”. For example, professional golfers use mental imagery to predict what would happen if they hit
a certain shot.
Pearson and Kosslyn (2015) pointed out that many visual images
contain rich information that is accessible when required. For example,
what is the shape of a cat’s ears? You may be able to answer the question
by constructing a visual image.
More generally, visual imagery supports numerous cognitive functions.
These include creative insight, attentional search, guiding deliberate action,
short-term memory storage and long-term memory retrieval (Mitchell &
Cusack, 2016).
Imagery theories
Kosslyn (e.g., 1994; Pearson & Kosslyn, 2015) proposed an influential
theory based on the assumption that visual imagery resembles visual perception. It was originally called perceptual anticipation theory because
image generation involves processes used to anticipate perceiving visual
stimuli.
According to the theory, visual images are depictive representations.
What is a depictive representation? In such a depiction, “each part of the
representation corresponds to a part of the represented object such that the
distances among the parts in the representation correspond to the actual
distances among the parts” (Pearson & Kosslyn, 2015, p. 10089). Thus,
for example, a visual image of a desk with a computer on top and a cat
sleeping underneath would have the computer
at the top and the cat at the bottom.
Where are depictive representations
formed? Kosslyn argued they are created
in early visual cortex (BA17 and BA18; see
Figure 3.21) within a visual buffer. The visual
buffer is a short-term store for visual information only and is of major importance in
visual perception and imagery. There is also
an “attention window” selecting some visual
information in the visual buffer and passing
it on to other brain areas for further processing. This attention window is flexible – it can
be adjusted to include more, or less, visual
information.

Figure 3.21
The approximate locations of the visual buffer in BA17 and BA18, of long-term memories of shapes in the inferior temporal lobe, and of spatial representations in the posterior parietal cortex, according to Kosslyn and Thompson's (2003) anticipation theory.

Processing in the visual buffer depends primarily on external stimulation during perception. However, such processing involves non-pictorial information stored in long-term memory during imagery. Shape information
is stored in the inferior temporal lobe whereas spatial representations are
stored in posterior parietal cortex (see Figure 3.21). In sum, visual perception mostly involves bottom-up processing whereas visual imagery depends
on top-down processing.
Pylyshyn (e.g., 2002) argued visual imagery differs substantially from
visual perception. According to his propositional theory, performance on
mental imagery tasks does not involve depictive or pictorial representations. Instead, it involves tacit knowledge (knowledge inaccessible to conscious awareness). Tacit knowledge is “Knowledge of what things would
look like to subjects in situations like the ones in which they are to imagine
themselves” (Pylyshyn, 2002, p. 161). Thus, performance on an imagery
task relies on relevant stored knowledge rather than visual images. Within
this theoretical framework, it is improbable that early visual cortex would
be involved on an imagery task.
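The contrast between the two theories can be made concrete. In a depictive format, spatial relations are carried by the layout of the representation itself; in a propositional format, they are carried by explicit symbolic assertions. A toy illustration of the desk/computer/cat example (entirely schematic):

```python
# Depictive format: location in the array IS the spatial relation.
# Row 0 is "top"; distances between parts mirror distances in the scene.
depictive = [
    ["computer"],   # on top of the desk
    ["desk"],
    ["cat"],        # underneath the desk
]

# Propositional format: spatial relations are explicit symbolic claims;
# nothing about the data structure's layout encodes "above" or "below".
propositional = {("on", "computer", "desk"), ("under", "cat", "desk")}

# "What is at the top?" is a direct lookup in the depictive format...
print(depictive[0][0])                                   # -> computer
# ...but requires inference over predicates in the propositional format
# (on(x, desk) implies x is above the desk).
print(next(x for rel, x, y in propositional if rel == "on"))
```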
Imagery resembles perception
KEY TERMS
Visual buffer
Within Kosslyn’s theory, a
short-term visual memory
store involved in visual
imagery and perception.
Binocular rivalry
When two different visual
stimuli are presented one
to each eye, only one
stimulus is seen; the seen
stimulus alternates over
time.
If visual perception and imagery involve similar processes, they should
influence each other. There should be facilitation if the contents of perception and imagery are the same but interference if they differ. Pearson et al.
(2008) reported a facilitation effect with binocular rivalry – when a different
stimulus is presented to each eye, only one is consciously perceived at any
given moment. The act of imagining a specific pattern strongly influenced
which stimulus was subsequently perceived and this facilitation depended
on the similarity between the imagined and presented stimuli. The findings
were remarkably similar when the initial stimulus was perceived rather than
imagined.
Baddeley and Andrade (2000) reported an interference effect.
Participants rated the vividness of visual and auditory images while performing a second task involving visual/spatial processes. This task reduced the vividness of visual imagery more than that of auditory imagery because similar processes were involved on the visual/spatial and visual imagery tasks.

Figure 3.22
Dwell time for the four quadrants of a picture (plus white space) during perception and imagery.
Region         Perception   Imagery
Top left            4%         8%
Top right          77%        64%
Bottom left         1%         4%
Bottom right       10%        12%
White space         2%         5%
From Laeng et al. (2014). Reprinted with permission of Elsevier.
Laeng et al. (2014) asked participants to view pictures of animals and
to follow each one by forming a visual image of that animal. There was
a striking similarity in eye fixations devoted to the various areas of each
picture in both conditions (see Figure 3.22). Participants having the greatest
similarity in dwell time between perception and imagery showed the best
memory for the size of each animal.
According to Kosslyn’s theoretical position, much processing associated with visual imagery occurs in early visual cortex (BA17 and BA18)
plus several other areas. In a review, Kosslyn and Thompson (2003) found
50% of studies using visual-imagery tasks reported activation in early
visual cortex. Significant findings were most likely when the task involved
inspecting the fine details of images or focusing on an object’s shape. In a
meta-analysis (see Glossary), Winlove et al. (2018) found the early visual
cortex (V1) was typically activated during visual imagery. Consistent with
Kosslyn’s theory, activation in the early visual cortex is greater among
individuals reporting vivid visual imagery.
The neuroimaging evidence discussed above is limited – it is correlational and so the activation associated with visual imagery may not be
directly relevant to the images that are formed. Naselaris et al. (2015)
reported more convincing evidence. Participants formed images of five artworks. It was possible to some extent to identify the imagined artworks
from hundreds of other artworks through careful analysis of activity in the
early visual cortex. Some of this activity corresponded to the processing of
low-level visual features (e.g., space; orientation).
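Identification studies of this kind are decoding analyses: a classifier learns to map voxel activity patterns onto stimulus labels and is then tested on held-out trials, with above-chance accuracy indicating that the activity carries stimulus information. A schematic sketch with simulated data (this is not Naselaris et al.'s actual pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels, n_items = 100, 50, 5

# Simulated data: each imagined item evokes a noisy but distinctive voxel pattern.
labels = rng.integers(0, n_items, size=n_trials)
prototypes = rng.normal(size=(n_items, n_voxels))
activity = prototypes[labels] + rng.normal(scale=1.5, size=(n_trials, n_voxels))

# Above-chance cross-validated accuracy (chance = 1/5) indicates the voxel
# patterns carry information about which item was being imagined.
acc = cross_val_score(LogisticRegression(max_iter=1000), activity, labels, cv=5).mean()
print(f"decoding accuracy = {acc:.2f} (chance = {1 / n_items:.2f})")
```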
Further neuroimaging support for the notion that imagery closely
resembles perception was reported by Dijkstra et al. (2017a). They found
“the overlap in neural representations between imagery and perception . . .
extends beyond the visual cortex to include also parietal and premotor/
frontal areas” (p. 1372). Of most importance, the greater the neural overlap
between imagery and perception throughout the entire visual system, the
more vivid was the imagery experience.
Imagery does not resemble perception
Look at Figure 3.23. Start with the object on the left and form a clear image
of it. Then close your eyes, mentally rotate the image by 90° clockwise and decide what you see. Then repeat the exercise with the other objects. Finally, rotate the book through 90°. You probably found it very easy to
identify the objects when perceiving them but impossible when only imagining rotating them. Slezak (1991, 1995) used stimuli closely resembling those
in Figure 3.23 and found no observers reported seeing the objects. Thus,
the information within images is much less detailed and flexible than visual
information.
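The demonstration trades on an asymmetry: an external image survives transformation losslessly and can be re-inspected, whereas a mental image apparently cannot. For an external representation, rotation is a trivial information-preserving operation, as a short array manipulation shows (the pattern below is an arbitrary stand-in for Slezak's figures):

```python
import numpy as np

# Arbitrary stand-in pattern; 1s trace a shape, 0s are background.
shape = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0],
                  [0, 1, 1]])

# Rotating the external representation loses nothing: every pixel is
# preserved and can be re-inspected for a new interpretation.
rotated = np.rot90(shape, k=-1)   # k=-1 -> 90 degrees clockwise
print(rotated)
```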
Lee et al. (2012) identified important differences between imagery
and perception. Observers viewed or imagined common objects (e.g., car;
umbrella) while activity in the early visual cortex and areas associated with
later visual processing (object-selective regions) was assessed. Attempts
were made by the researchers to identify the objects being imagined or
perceived on the basis of activation in those areas.
What did Lee et al. (2012) find? First, activation in all brain areas
was considerably greater when participants perceived rather than
imagined objects. Second, objects being perceived or imagined were
identified with above-chance accuracy based on patterns of brain activation except for imagined objects in the primary visual cortex (V1; see
Figure 3.24).
Third, the success rate in identifying perceived objects was greater based on brain activation in areas associated with early visual
processing than those associated with later
processing. However, the opposite was the case
with imagined objects (see Figure 3.24). Thus,
object processing in the early visual cortex is
very limited during imagery but is extremely
important during perception. Imagery for
objects depends mostly on top-down processes based on object knowledge rather than
processing in the early visual cortex.
Figure 3.23
Slezak (1991, 1995) asked participants to memorise one of the above images. They then imagined rotating the image 90 degrees clockwise and reported what they saw. None of them reported seeing the figures that can be seen clearly if you rotate the page by 90 degrees clockwise.
Left image from Slezak (1995), centre image from Slezak (1991), right image reprinted from Pylyshyn (2002), with permission from Elsevier and the author.

Figure 3.24
The extent to which perceived (left side of figure) or imagined (right side of figure) objects could be classified accurately on the basis of brain activity in the early visual cortex and object-selective cortex. ES = extrastriate retinotopic cortex; LO = lateral occipital cortex; pFs = posterior fusiform sulcus.
From S.H. Lee et al. (2012). Reproduced with permission from Elsevier.

Figure 3.25
Connectivity during perception and imagery involving (a) bottom-up processing and (b) top-down processing. Posterior estimates indicate connectivity strength (the further from 0 the stronger). The meanings of OCC, FG, IPS and IFG are given in the text.
From Dijkstra et al. (2017b).

Most cognitive neuroscience research has focused on the brain areas activated during visual perception and imagery. It is also important to focus on connectivity between brain areas. Dijkstra et al. (2017b) considered connectivity among four brain areas of central importance in perception and imagery: early
visual cortex (OCC), fusiform gyrus (FG; late visual cortex), IPS (intraparietal sulcus) and IFG (inferior frontal gyrus). The first two are mostly
associated with bottom-up processing whereas the second two are mostly
associated with top-down processing.
Dijkstra et al.’s (2017b) key findings are shown in Figure 3.25. First,
perception was associated with reasonably strong bottom-up brain connectivity and weak top-down brain connectivity. Second, imagery was associated with non-significant bottom-up connectivity but very strong top-down
connectivity. Thus, top-down connectivity from frontal to early visual areas
is a common mechanism during perception and imagery. However, there
is much stronger top-down connectivity during imagery to compensate for
the absence of bottom-up connectivity. Individuals having the greatest top-down connectivity during imagery reported the most vivid images.
Dijkstra et al. (2018) studied the time course for the development
of visual representations in perception and in imagery using magnetoencephalography (MEG; see Glossary). With perception, they confirmed
that visual representations develop through a series of processing stages
(see Chapter 2). With imagery, in contrast, the entire visual representation
appeared to be activated simultaneously, presumably because all the relevant information was retrieved together from memory.
Brain damage
If visual perception and visual imagery involve the same mechanisms, we
might expect brain damage to have comparable effects on perception and
imagery. That is often the case. However, there are numerous exceptions
(Bartolomeo, 2002, 2008). Moro et al. (2008) studied two brain-damaged
patients with intact visual perception but impaired visual imagery. They
were both very poor at drawing objects from memory but could copy the
same objects when shown a drawing.
These patients (and others with impaired visual imagery but intact
visual perception) have damage to the left temporal lobe. Visual images are
probably generated from information about concepts (including objects)
stored in the temporal lobes (Patterson et al., 2007). However, this generation process is less important for visual perception.
Bridge et al. (2012) studied a young man, SBR, who had virtually no
primary visual cortex and nearly total blindness. However, he had vivid
visual imagery and his pattern of cortical activation when engaged in
visual imagery resembled that of healthy controls. Similar findings were
reported with a 70-year-old woman, SH, who became blind at the age of
27. She had intact visual imagery predominantly involving areas outside
the early visual cortex. Of relevance, she had greater connectivity between
some visual networks in the brain than most individuals.
How can we interpret the above findings? Visual perception mostly
involves bottom-up processes triggered by the stimulus whereas visual
imagery primarily involves top-down processes based on object knowledge.
Thus, it is unsurprising brain areas involved in early visual processing are
more important for perception than imagery whereas brain areas associated with storage of information about visual objects are more important
for imagery.
Evaluation
Much progress has been made in understanding the relationship between
visual imagery and visual perception. Similar processes are involved in
imagery and perception and they are both associated with somewhat
similar patterns of brain activity. In addition, the predicted facilitatory and
­interfering effects between imagery and perception tasks have been reported.
These findings are more consistent with Kosslyn’s theory than Pylyshyn’s.
On the negative side, visual perception and visual imagery are less
similar than assumed by Kosslyn. For example, there is the neuroimaging evidence reported by Lee et al. (2012) and the frequent dissociations
between perception and imagery found in brain-damaged patients. Of most
importance, visual perception involves strong bottom-up connectivity and
weak top-down connectivity, whereas visual imagery involves very strong
top-down connectivity but negligible bottom-up connectivity (Dijkstra
et al., 2017b).
CHAPTER SUMMARY
• Pattern recognition. Pattern recognition involves processing of specific features and global processing. Feature processing generally (but not always) precedes global processing. Several types of cells (e.g., simple cells; complex cells; end-stopped cells) are involved in feature processing. There are complexities in pattern recognition due to interactions among cells and the influence of top-down processes. Evidence from computer programs to solve CAPTCHAs suggests humans are very good at processing edge corners. Fingerprint identification is sometimes very accurate; however, even experts show confirmation bias (distorted performance caused by contextual information). Fingerprint experts are much better than novices at discriminating between matches and non-matches and also adopt a more conservative response bias.
• Perceptual organisation. The gestaltists proposed several principles of perceptual grouping and emphasised the importance of figure-ground segmentation. They argued that perceptual grouping and figure-ground segregation depend on innate factors. They also argued we perceive the simplest possible organisation of the visual field. The gestaltists provided descriptions rather than explanations. Their approach underestimated the complex interactions of factors underlying perceptual organisation. The gestaltists de-emphasised the role of experience and learning in perceptual organisation. However, recent theories based on Bayesian inference (e.g., the Bayesian hierarchical grouping model) fully acknowledge the importance of learning processes.
• Approaches to object recognition. Visual processing typically involves a coarse-to-fine processing sequence: low spatial frequencies in visual input (associated with coarse processing) are conveyed to higher visual areas faster than high spatial frequencies (associated with fine processing). Biederman assumed in his recognition-by-components theory that objects consist of geons (basic shapes). An object's geons are determined by edge-extraction processes and the resultant geon-based description is viewpoint-invariant. Biederman's theory de-emphasises the role of top-down processes. Object recognition is sometimes viewpoint-invariant (as predicted by Biederman) with easy categorical discriminations, but it is more typically viewer-centred when identification is required. Object representations often contain viewpoint-dependent and viewpoint-invariant information.
• Object recognition: top-down processes. Top-down processes are more important in object recognition when observers view degraded or briefly presented stimuli. Top-down processes sometimes influence attention, memory or response bias rather than perception itself. However, there are also direct effects of top-down processes on object recognition. According to the interactive-iterative framework (Baruch et al., 2018), top-down and bottom-up processes interact, with top-down processes (e.g., attention) influencing subsequent bottom-up processing.
• Face recognition. Face recognition involves more holistic processing than object recognition. Deficient holistic processing partly explains why prosopagnosic patients have much greater problems with face recognition than object recognition. Face processing involves a brain network including the fusiform face and occipital face areas. However, much of this network is also used in processing other objects (especially when recognising objects for which we have expertise).
Bruce and Young's model assumes several serial processing stages. Research on prosopagnosics supports this assumption because the precise nature of their face-recognition impairments depends on which stage(s) are most affected. The model also assumes there are major differences in the processing of familiar and unfamiliar faces. This assumption has received substantial support. However, Bruce and Young did not fully appreciate that unfamiliar faces are hard to recognise because of the great variability of any given individual's facial images. The model assumes there are two independent processing routes (for facial expression and facial identity), but they are not entirely independent. The model ignores the role played by genetic factors in accounting for individual differences in face-recognition ability.
• Visual imagery. Visual imagery allows us to predict the visual consequences of performing certain actions. According to Kosslyn's perceptual anticipation theory, visual imagery closely resembles visual perception. In contrast, Pylyshyn, in his propositional theory, argued visual imagery involves making use of tacit knowledge and does not resemble visual perception. Visual imagery and perception influence each other as predicted by Kosslyn's theory. Neuroimaging studies and studies on brain-damaged patients indicate similar areas are involved in imagery and perception. However, areas involved in top-down processing (e.g., left temporal lobe) are more important in imagery than perception, and areas involved in bottom-up processing (e.g., early visual cortex) are more important in perception. More generally, bottom-up brain connectivity is far more important in perception than imagery, whereas top-down brain connectivity is far more important in imagery than perception.
FURTHER READING
Baruch, O., Kimchi, R. & Goldsmith, M. (2018). Attention to distinguishing features in object recognition: An interactive-iterative framework. Cognition, 170,
228–244. Orit Baruch and colleagues provide a theoretical framework for understanding how bottom-up and top-down processes interact in object recognition.
Dijkstra, N., Zeidman, P., Ondobaka, S., van Gerven, M.A.J. & Friston, K.
(2017b). Distinct top-down and bottom-up brain connectivity during visual perception and imagery. Scientific Reports, 7 (Article 5677). In this article, Nadine
Dijkstra and her colleagues clarify the roles of top-down and bottom-up processes in visual perception and imagery.
Firestone, C. & Scholl, B.J. (2016). Cognition does not affect perception: Evaluating
the evidence for “top-down” effects. Behavioral and Brain Sciences, 39, 1–77. The
authors argue that top-down processes do not directly influence visual perception. Read the open peer commentary following the article, however, and you
will see most experts disagree.
Gauthier, I. & Tarr, M.J. (2016). Visual object recognition: Do we (finally) know
more now than we did? Annual Review of Vision Science, 2, 377–396. Isabel
Gauthier and Michael Tarr provide a comprehensive overview of theory and
research on object recognition.
Grill-Spector, K., Weiner, K.S., Kay, K. & Gomez, J. (2017). The functional
neuroanatomy of human face perception. Annual Review of Vision Science, 3,
167–196. This article by Kalanit Grill-Spector and colleagues contains a comprehensive account of brain mechanisms underlying face perception.
Wagemans, J. (2018). Perceptual organisation. In J.T. Serences (ed.), Stevens’
Handbook of Experimental Psychology and Cognitive Neuroscience, Vol. 2:
Sensation, Perception, and Attention (4th edn; pp. 803–822). New York: Wiley.
Johan Wagemans reviews various theoretical and empirical approaches to understanding perceptual organisation.
Young, A.W. (2018). Faces, people and the brain: The 45th Sir Frederic Bartlett
lecture. Quarterly Journal of Experimental Psychology, 71, 569–594. Andy Young
provides a very interesting account of theory and research on face perception.
Chapter 4
Motion perception and action
INTRODUCTION
Most research on perception discussed in previous chapters involved presenting a visual stimulus and assessing aspects of its meaning. What was
missing (but is an overarching theme of this chapter) is the time dimension.
In the real world, we move around and/or people or objects in the environment move. The resulting changes in the visual information available to us
are very useful in ensuring we perceive the environment accurately and also
respond appropriately. This emphasis on change and movement necessarily
leads to a consideration of the relationship between perception and action.
In sum, our focus in this chapter is on how we process (and respond to) a
constantly changing environment.
The first theme addressed in this chapter is the perception of movement. This includes our ability to move successfully within the visual environment and predict accurately when moving objects will reach us.
The second theme is concerned with more complex issues – how do we act appropriately on the environment and the objects within it? Of relevance are theories (e.g., the perception-action theory; the dual-process approach) distinguishing between processes and systems involved in vision-for-perception and those involved in vision-for-action (see Chapter 2). Here we consider theories providing more detailed accounts of vision-for-action and/or the workings of the dorsal pathways allegedly underlying vision-for-action.
The third theme focuses on the processes involved in making sense
of moving objects (especially other people). It thus differs from the first
theme in which moving stimuli are considered mostly in terms of predicting
when they will reach us. There is an emphasis on the perception of biological movement when the available visual information is impoverished. We
also consider the role of the mirror neuron system in interpreting human
movement.
Finally, we consider our ability (or failure!) to detect changes in
objects within the visual environment over time. Unsurprisingly, attention
importantly determines which aspects of the environment are consciously
detected. This issue provides a useful bridge between the areas of visual
perception and attention (the subject of the next chapter).
DIRECT PERCEPTION
James Gibson (1950, 1966, 1979) put forward a radical approach to visual
perception that was largely ignored at the time. Until approximately 40
years ago, it was assumed the main purpose of visual perception is to allow
us to identify or recognise objects. This typically involves relating information extracted from the visual environment to our stored knowledge
of objects (see Chapter 3). Gibson argued that this approach is limited –
in evolutionary terms, vision developed so our ancestors could respond
rapidly to the environment (e.g., hunting animals; escaping from danger).
Gibson (1979, p. 239) argued that perception involves “keeping in
touch with the environment”. This is sufficient for most purposes because
the information provided by environmental stimuli is much richer than
previously believed. We can relate Gibson’s views to Milner and Goodale’s
(1995, 2008) vision-for-action system (see Chapter 2). According to both
theoretical accounts, there is an intimate relationship between perception
and action.
Gibson regarded his theoretical approach as ecological. He emphasised
that perception facilitates interactions between the individual and their
environment. Here is the essence of his direct theory of perception:
When I assert that perception of the environment is direct, I mean that
it is not mediated by retinal pictures, neural pictures, or mental pictures. Direct perception is the activity of getting information from the
ambient array of light. I call this a process of information pickup that
involves . . . looking around, getting around, and looking at things.
(Gibson, 1979, p. 147)
We will briefly consider some of Gibson's theoretical assumptions:
●● The pattern of light reaching the eye is an optic array. It contains all the visual information from the environment striking the eye.
●● The optic array provides unambiguous or invariant information about the layout of objects. This information comes in many forms including optic flow patterns, affordances (see below) and texture gradients (discussed in Chapter 2).
Gibson produced training films during the Second World War describing how
pilots handle taking off and landing. Of crucial importance is optic flow –
the changes in the pattern of light reaching observers when they move or
parts of the visual environment move. When pilots approach a landing strip,
the point towards which they are moving (focus of expansion) appears
motionless with the rest of the visual environment apparently moving away
from that point (see Figure 4.1). The further away any part of the landing
strip is from that point, the greater is its apparent speed of movement.
KEY TERMS
Optic array: The structural pattern of light falling on the retina.
Optic flow: The changes in the pattern of light reaching an observer when there is movement of the observer and/or aspects of the environment.
Focus of expansion: The point towards which someone in motion is moving; it does not appear to move.

Figure 4.1 The optic-flow field as a pilot comes in to land, with the focus of expansion in the middle. From Gibson (1950). Reproduced with permission.

Wang et al. (2012) simulated the pattern of optic flow that would be experienced if individuals moved forwards in a stationary environment. Their attention was attracted towards the focus of expansion, thus showing its psychological importance. (More is said later about optic flow and the focus of expansion.)
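The geometry behind the focus of expansion is easy to demonstrate. The short Python sketch below is our own illustration (not part of Gibson's work or the studies cited here); it assumes a simple pinhole projection and pure forward self-motion, and the function name and values are invented. Differentiating the projection equation shows that flow is zero at the focus of expansion and increases with distance from it – the pattern pilots experience when landing.

import numpy as np

def optic_flow_forward(x, y, depth, v):
    """Image-plane flow under pure forward self-motion at speed v.

    With a pinhole projection x = X/Z and dZ/dt = -v, differentiating
    gives dx/dt = x*v/Z: flow is radial from the focus of expansion
    (the origin here) and proportional to distance from it.
    """
    return x * v / depth, y * v / depth

# Points at increasing eccentricity from the focus of expansion.
xs = np.array([0.0, 0.1, 0.2, 0.4])
u, _ = optic_flow_forward(xs, np.zeros_like(xs), depth=10.0, v=2.0)
print(u)  # [0. 0.02 0.04 0.08]: no flow at the focus, faster flow further out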
Gibson (1966, 1979) argued that certain higher-order characteristics of the visual array (invariants) remain unaltered as observers move around their environment. Invariants are important because they remain the same over different viewing angles; the focus of expansion is one such invariant feature of the optic array.
Affordances

KEY TERMS
Invariants: Properties of the optic array that remain constant even though other aspects vary; part of Gibson's theory.
Affordances: The potential uses of an object which Gibson claimed are perceived directly.
According to Gibson (1979), the potential uses of objects (their affordances)
are directly perceivable. For example, a ladder “affords” ascent or descent.
Gibson believed that “affordances are opportunities for action that exist
in the environment and do not depend on the animal’s mind . . . they do
not cause behaviour but simply make it possible” (Withagen et al., 2012,
p. 251). In Gibson (1979, p. 127), affordances are what the environment
“offers the animal, what it provides or furnishes”.
Evidence for the affordance of “climbability” of steps varying in height
was reported by Di Stasi and Guardini (2007). The step height judged the
most “climbable” was the one that would have involved the minimum
expenditure of energy.
Gibson argued an object’s affordances are perceived directly or automatically. In support, Pappas and Mack (2008) found images of objects
presented below the level of conscious awareness nevertheless produced
motor priming. For example, the image of a hammer caused activation
in brain areas involved in preparing to use a hammer. Wilf et al. (2013)
focused on the affordance of graspability with participants lifting their
arms to perform a reach-like movement with graspable and non-graspable
objects (see Figure 4.2). Muscle activity started earlier for graspable than non-graspable objects, suggesting that the affordance of graspability triggers rapid activity in the motor system.
Figure 4.2 Graspable and non-graspable objects having similar asymmetrical features. From Wilf et al. (2013). Reprinted with permission.
Gibson’s approach to affordances is substantially oversimplified. For
example, an apparently simple task such as cutting up a tomato involves
selecting an appropriate tool, deciding on how to grasp and manipulate
the tool, and monitoring movement execution (Osiurak & Badets, 2016).
In other words, “People reason about physical object properties to solve
everyday life activities” (Osiurak & Badets, 2016, p. 540). This is sharply
different to Gibson’s emphasis on the ease and immediacy of tool use.
When individuals observe a tool, Gibson assumed this provided them
with direct access to knowledge about how to manipulate it and this
manipulation knowledge gave access to the tool’s functions. This assumption exaggerates the importance of manipulation knowledge. For example,
Garcea and Mahon (2012) found function judgements about tools were
made faster than manipulation judgements, whereas Gibson’s approach
implies that manipulation judgements should have been faster.
Finally, Gibson argued stored knowledge is not required for individuals to make appropriate movements with respect to objects (e.g., tools). In fact, individuals often make extensive use of motor and function knowledge when dealing with objects (Osiurak & Badets, 2016).
For example, making tea involves filling the kettle with water, boiling the
water, finding some milk and so on. Foulsham (2015) discussed research
showing there are only small individual differences in the pattern of eye fixations when people make tea. Such findings strongly imply they use
stored information about the sequence of motor actions involved in
tea-making.
Evaluation
What are the strengths of Gibson’s ecological approach? First, “Gibson’s
realisation that natural scenes are the ecologically valid stimulus that should
be used for the study of vision was of fundamental importance” (Bruce &
Tadmor, 2015, p. 32).
Second, and related to the first point, Gibson disagreed with the previous emphasis on static observers looking at static visual displays. Foulsham
and Kingstone (2017) compared the eye fixations of participants walking
around a university campus with those of other participants viewing static
pictures of the same scene. The eye fixations were significantly different:
those engaged in walking focused more on features (e.g., the path) important for locomotion whereas those viewing static pictures focused centrally
within each picture.
Third, Gibson was far ahead of his time. There is support for two visual systems (Milner & Goodale, 1995, 2008; see Chapter 2): a vision-for-perception system and a vision-for-action system. Before Gibson, the major emphasis was on the former. In contrast, he argued our perceptual system allows us to respond rapidly and accurately to environmental stimuli without using memory, which is a feature of the latter system.
What are the limitations of Gibson’s approach? First, Gibson attempted
to specify the visual information used to guide action but ignored many
of the processes involved (see Chapters 2 and 3). For example, Gibson
assumed the perception of invariants occurred almost “automatically”, but
it actually requires several complex processes.
Second, Gibson’s argument that we do not need to assume the existence
of internal representations (e.g., object knowledge) is flawed. The logic of
Gibson’s position is that: “There are invariants specifying a friend’s face, a
performance of Hamlet, or the sinking of the Titanic, and no knowledge of
the friend, of the play, or of maritime history is required to perceive these
things” (Bruce et al., 2003, p. 410). Evidence refuting Gibson’s argument
was reviewed by Foulsham (2015; discussed above).
Third, and related to the second point, Gibson de-emphasised the role
of top-down processes (based on our knowledge and expectations) in visual
perception. Such processes are especially important when the visual input
is impoverished (see Chapter 3).
Fourth, Gibson’s views on the effects of motion on perception were
oversimplified. For example, when moving towards a goal, we use more
information sources than Gibson assumed (discussed below).
VISUALLY GUIDED MOVEMENT
From an ecological perspective, it is important to understand how we move
around the environment. For example, what information do we use when
walking towards our current goal or target? We must ensure we are not
hit by cars when crossing the road and when driving we must avoid hitting
other cars. Playing tennis well involves predicting exactly when and where
the ball will strike our racquet. The ways visual perception plays a crucial
role in facilitating our locomotion and ensuring our safety are discussed in
the next section.
Heading and steering

KEY TERMS
Retinal flow field: The changing patterns of light on the retina produced by movement of the observer relative to the environment as well as by eye and head movements.
Efference copy: An internal copy of a motor command (e.g., to the eyes); it can be used to identify movement within the retinal image that is not due to object movement in the environment.
When we want to reach some goal (e.g., a gate at the end of a field), we
use visual information to move directly towards it. Gibson (1950) emphasised the importance of optic flow (see Glossary; discussed on pp. 141–142).
When we move forwards in a straight line, the point towards which we are
moving (the focus of expansion) appears motionless. In contrast, the area
around that point seems to expand.
Gibson (1950) proposed that, if we are not moving directly towards our goal, we use the focus of expansion and optic flow to bring our heading (point of expansion) into alignment with our goal. This is known as the global radial outflow hypothesis.
Gibson’s approach works well in principle when applied to an individual trying to move straight from A to B. However, matters are more
complex when we cannot move directly to our goal (e.g., driving around a
bend in the road; avoiding obstacles). Another complexity is that observers
often make head and eye movements. In sum, the retinal flow field (changes
in the pattern of light on the retina) is influenced by rotation in the retinal
image produced by following a curved path and/or eye and head movements.
The above complexities mean it is often hard to use information from retinal flow to determine our direction of heading. It has often been claimed that observers use an internal copy of the motor commands sent to the eyes and head (efference copy) to compensate for the effects of eye and head movements on the retinal image. However, Feldman (2016) argued this approach is insufficient on its own because it de-emphasises the brain's active involvement in relating perception and action.
Findings: heading
Gibson emphasised the role of optic flow in allowing individuals to move
directly towards their goal. Relevant information includes the focus of
expansion (see Glossary) and the direction of radial motion (e.g., expansion within optic flow). Strong et al. (2017) obtained evidence indicating the
importance of both factors and also established they depend on separate
brain areas. More specifically, they used transcranial magnetic stimulation
(TMS; see Glossary) to disrupt key brain areas. TMS applied to area V3A
impaired perception of the focus of expansion but not direction of radial
motion, with the opposite pattern being obtained when TMS was applied to
the motion area V5/MT+ (see Chapter 2).
As indicated above, eye and/or head movements make it harder to use
optic flow effectively for heading. Bremmer et al. (2010) considered this
issue in macaque monkeys presented with distorted visual flow fields simulating the combined effects of self-motion and an eye movement. Their
key finding was that numerous cells in the medial superior temporal area
successfully compensated for this distortion.
According to Gibson, a walker moving straight ahead tries to keep the focus of expansion aligned with their goal. When walkers wore prisms producing a 9° error in their perceived visual direction, the focus of expansion was misaligned with where they expected it to be. As a result, there should be a correction process, a prediction confirmed by Herlihey and Rushton (2012). Also as predicted, walkers denied access to information about retinal motion failed to show any correction.
Factors additional to the optic flow information emphasised by Gibson
are also used when making heading judgements. This is unsurprising given
the typical richness of the available environmental information. van den
Berg and Brenner (1994) noted we only require one eye to use optic flow
information. However, they discovered heading judgements were more
accurate when observers used both eyes. Binocular disparity (see Glossary)
in the two-eye condition provided useful additional information about the
relative depths of objects. Cormack et al. (2017) introduced the notion of
a binoptic flow field to describe the 3-D information available to observers
(but de-emphasised by Gibson).
Gibson assumed optic-flow patterns generated by self-motion are of
fundamental importance when we head towards a goal. However, motion is
not essential for accurate perception of heading. The judgements of heading
direction made by observers viewing two static photographs of a real-world
scene in rapid succession were reasonably accurate in the absence of optic-flow information (Hahn et al., 2003). These findings can be explained in
terms of retinal displacement – objects closer to the direction of heading
show less retinal displacement as we move closer to the target.
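The retinal displacement account lends itself to a toy calculation (ours, with invented values, again assuming a pinhole projection). Moving the viewpoint forward shifts the image of every static point, and the shift is smallest for points near the direction of heading:

import numpy as np

def image_displacement(lateral_offset, depth, v, dt):
    """Image shift of a static point after the observer moves forward by v*dt.

    Under a pinhole projection x = X/Z, the image position changes from
    X/Z to X/(Z - v*dt); the shift grows with the point's lateral offset
    from the direction of heading.
    """
    return np.abs(lateral_offset / (depth - v * dt) - lateral_offset / depth)

offsets = np.array([0.2, 1.0, 2.0])  # metres to the side of the heading direction
print(image_displacement(offsets, depth=10.0, v=2.0, dt=0.5))
# approximately [0.0022 0.0111 0.0222]: least displacement nearest the heading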
Snyder and Bischof (2010) argued that information about the direction
of heading is provided by two systems. One system uses movement information (e.g., optic flow) rapidly and fairly automatically (as proposed by
Gibson). The other system uses displacement information more slowly and
requires greater processing resources. It follows that performing a second
task at the same time as making judgements about direction of heading
should have little effect on those judgements if movement information is
available. In contrast, a second task should impair heading judgements
when only displacement information is available. The evidence supported
both predictions.
Heading: future path
Wilkie and Wann (2006) argued judgements of heading (the direction in
which someone is moving) are of little relevance if they are moving along a
curved path. With curved paths, path judgements (identifying future points along one’s
path) were much more accurate than heading
judgements.
According to the above analysis, we
might expect individuals (e.g., drivers) to
fixate some point along their future path
when it is curved. This is the future path
strategy. In contrast, Land and Lee (1994)
argued (with supportive evidence) that drivers
approaching a bend focus on the tangent
point – the point on the inside edge of the
road at which its direction appears to reverse
(see Figure 4.3).
The tangent point has two potential
advantages. First, it is easy to identify and
track. Second, road curvature can easily be
worked out by considering the angle between
the direction of heading and the tangent point.
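The second advantage can be given a worked example (our own sketch, assuming an idealised circular bend; the angle and distance are invented). If the tangent point lies at visual angle theta from the heading direction and the car is a lateral distance d from the inside road edge, circle geometry gives the bend radius as r = d*cos(theta)/(1 - cos(theta)), approximately 2d/theta**2 for small angles:

import math

def bend_radius(theta, d):
    """Radius of an idealised circular bend from tangent-point geometry.

    theta: angle (radians) between the heading direction and the tangent
    point; d: lateral distance (metres) from the inside road edge.
    """
    return d * math.cos(theta) / (1 - math.cos(theta))

theta = math.radians(10)
print(round(bend_radius(theta, d=2.0), 1))  # ~129.6 m
print(round(2 * 2.0 / theta ** 2, 1))       # ~131.3 m (small-angle approximation)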
Kandil et al. (2009) found most drivers negotiating 270° bends at a motorway junction
fixated the tangent point much more often than
the future path (75% vs 14%, respectively).
Other research suggests the tangent point is less important. For example, Itkonen et al. (2015) instructed drivers to "drive as they normally would" or "look at the tangent point". Eye movements differed markedly in the two conditions – drivers were much more likely to fixate points along the future path in the former condition.

Figure 4.3 The visual features of a road viewed in perspective. The tangent point is marked by the filled circle on the inside edge of the road, and the desired future path is shown by the dotted line. According to the future-path theory, drivers should gaze along the line marked "active gaze". From Wilkie et al. (2010). Reprinted with permission from Springer-Verlag.

How can we interpret the above apparently inconsistent findings? Lappi et al. (2013) hypothesised drivers often fixate the tangent point when approaching and entering a bend but fixate the future path further into the bend. They argued the tangent point provides relatively precise information and so drivers use it when uncertainty about the precise nature of the curve or bend is maximal (i.e., when approaching and entering it).
Lappi et al. (2013) obtained supporting evidence for the above hypothesis. Drivers’ fixations while driving along a lengthy curve formed by the
slip road to a motorway were predominantly on the path ahead rather than
the tangent point after the first few seconds (see short clips of drivers’ eye
movements while performing this task at 10.1371/journal.pone.0068326).
KEY TERM
Tangent point: From a driver's perspective, the point on a road at which the direction of its inside edge appears to reverse.

The evidence discussed so far does not rule out optic flow as a factor influencing drivers' steering. Mole et al. (2016) manipulated optic-flow speed in a simulated driving situation. This produced steering errors (understeering or oversteering) when going around bends even when full information about road edges was available. Thus, optic flow influenced driving performance.
IN THE REAL WORLD: ON-ROAD DRIVING
Much research on drivers’ gaze patterns lacks ecological validity (see Glossary). Drivers are typically in a simulator and the environment through which they drive is somewhat oversimplified.
Accordingly, Lappi et al. (2017) studied the gaze patterns of a 43-year-old male driving school
instructor driving on a rural road in Finland. His eye movements revealed a more complex picture
than most previous research.
What did Lappi et al. (2017) discover? Here are four major findings:
(1) The driver’s gaze shifted very frequently from one feature of the visual environment to another
and he made many head movements.
(2) The driver’s gaze was predominantly on the far road (see Figure 4.4). This preview of the road
ahead allowed him to make use of anticipatory control.
(3) In bends, the driver’s gaze was mostly within the far road “triangle” formed by the tangent
point (TP), the lane edge opposite the TP and the occlusion point (OP; the point where the
road disappears from view). In general terms, the OP is used to anticipate the road ahead
whereas the TP is used for more immediate compensatory steering control.
(4) The driver fixated specific targets (e.g., traffic signs; other road users) very rapidly, suggesting
his peripheral vision was very efficient.
Figure 4.4 The far road "triangle" in (A) a left turn and (B) a right turn. From Lappi et al. (2017).
In sum, drivers’ gaze patterns are more complex than implied by previous research. Drivers do not
constantly fixate any given feature (e.g., tangent point) passively. Instead, they “sample visual information as needed, leading to input that is intermittent, and determined by the active observer . . .
rather than imposed by the environment” (Lappi et al., 2017, p. 11). Drivers’ eye movements are
determined in part by control mechanisms (e.g., path planning) (Lappi & Mole, 2018). These mechanisms are responsive to drivers’ goals. For example, professional racing drivers have the goal of
driving as fast as possible whereas many ordinary drivers have the goal of driving safely.
Evaluation
Gibson’s views concerning the importance of optic-flow information have
deservedly been very influential. Such information is especially useful when
individuals can move directly towards their goal rather than following a
curved or indirect path. Indeed, the evidence suggests optic flow is often
the dominant source of information determining judgements of heading
direction. Drivers going around bends use optic-flow information. They
also make some use of the tangent point. This is a relatively simple feature
of the visual environment and its use by drivers is in the spirit of Gibson’s
perspective.
What are the limitations of Gibson’s approach and other related
approaches?
(1) Individuals moving directly towards a target use several kinds of information (e.g., binocular disparity; retinal displacement) ignored by Gibson.
(2) The tangent point is used infrequently when individuals move along a curved path: they more often fixate points lying along the future path.
(3) Drivers going around bends use a greater variety of information sources than implied by Gibson's approach. Of most importance, drivers' eye movements are strongly influenced by active, top-down processes (e.g., motor control) not included within Gibson's theorising. More specifically, drivers' eye movements depend on their current driving goals as well as the environmental conditions.
(4) Research and theorising have de-emphasised meta-cognition (beliefs about one's own performance). Mole and Lappi (2018) found drivers often made inaccurate meta-cognitive judgements of their own driving performance (e.g., they tended to exaggerate the importance of driving speed in determining performance). Such inaccurate judgements probably often lead to impaired driving performance.
Time to contact
In everyday life, we often need to predict the moment of contact between
us and some object. These situations include ones where we are moving
towards an object (e.g., a wall) and those in which an object (e.g., a ball)
is approaching us. We might work out the time to contact by dividing our
estimate of the object’s distance by our estimate of its speed. However, this
would be complex and error-prone because information about speed and
distance is not directly available.
Lee (1976, 2009) argued that there is a simpler way to work out the time to contact or collision. If we approach an object (or it approaches us) at constant velocity, we can use tau. Tau is defined as the size of the object's retinal image divided by its rate of expansion. The faster the rate of expansion, the less time there is to contact.
When driving, the rate of decline of tau over time (tau-dot) indicates
whether there is sufficient braking time to stop before contact or collision.
Lee (1976) argued drivers brake to hold constant the rate of change of
tau. This tau-dot hypothesis is consistent with Gibson’s approach because
it assumes tau-dot is an invariant available to observers from optic flow.
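A small worked example (ours; the object size, distance and speed are invented) shows why tau is attractive: time to contact falls out of purely retinal quantities, with no need to know distance or speed separately.

def tau(image_size, expansion_rate):
    """Lee's tau: retinal image size divided by its rate of expansion."""
    return image_size / expansion_rate

# A 1 m object 20 m away approaching at 5 m/s: true time to contact is 4 s.
# Its angular size is roughly size/distance, so the image size is ~1/20 now
# and ~1/19.95 one time-step (0.01 s) later.
size_now, size_next, dt = 1 / 20.0, 1 / 19.95, 0.01
expansion_rate = (size_next - size_now) / dt
print(round(tau(size_now, expansion_rate), 2))  # ~3.99: tau recovers the time to contact

At constant velocity tau simply counts down at a rate of -1; braking alters that rate of change, which is the quantity (tau-dot) to which Lee's braking hypothesis appeals.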
Lee’s theoretical approach has been highly influential. However, his
emphasis on tau has limited applicability in various ways (Tresilian, 1999).
First, tau ignores acceleration in object velocity. Second, tau only provides
information about the time to contact or collision with the eyes. Thus,
drivers might find the front of their car smashed in if they relied solely on
tau! Third, tau is accurate only when applied to spherically symmetrical
objects: do not rely on it when catching a rugby ball!
Harrison et al. (2016) argued that people’s behaviour is often influenced by factors other than their estimate of the time to contact. For
example, consider someone deciding whether to cross a road when there
is an approaching car. Their decision is often influenced by judgements of
their physical mobility and their personality (e.g., cautious or impetuous)
(see p. 151).
Findings
According to Lee’s (1976) theory, observers can often judge time to contact
accurately based on using tau relatively “automatically”. If so, observers’ time-to-contact judgements might not be impaired if they performed
a cognitively demanding task while observing an object’s movement.
Baurès et al. (2018) obtained support for this prediction. Indeed, time-to-contact judgements were more accurate when observers performed a secondary task, perhaps because this made it less likely they would attend to
potentially misleading information (e.g., expectations about an object’s
movements).
According to Lee (1976), judgements of the time to contact when
catching a ball should depend crucially on the rate of expansion of the
ball’s retinal image. Savelsbergh et al. (1993) used a deflating ball having
a significantly slower rate of expansion than an ordinary ball. The prediction was that peak grasp closure should occur later to the deflating
ball. This prediction was confirmed. However, the actual slowing was
much less than predicted (30 ms vs 230 ms).
Participants minimised the distorting effects of manipulating the rate of expansion by using additional sources of information (e.g., depth cues).
Hosking and Crassini (2010) had participants judge time to contact for familiar
objects (tennis ball and football) presented in
their standard size or with their sizes reversed.
They also used unfamiliar black spheres.
Contrary to Lee’s hypothesis, time-to-contact
judgments were influenced by familiar size
(especially when the object was a very large
tennis ball) leading participants to overestimate time to contact (see Figure 4.5).
Tau is available in monocular vision. However, observers often make use of information available in binocular vision, especially binocular disparity (see Chapter 2). Fath et al. (2018) discussed research showing binocular information sometimes provides more accurate judgements of time to contact than tau (e.g., when viewing small objects or rotating non-spherical objects).

Figure 4.5 Errors in time-to-contact judgements for the smaller and the larger object as a function of whether they were presented in their standard size, the reverse size (off-size) or lacking texture (no-texture). Positive values indicate that responses were made too late and negative values that they were made too early. From Hosking and Crassini (2010). With kind permission from Springer Science+Business Media.
In their own research, Fath et al. (2018) assessed the accuracy of time-to-contact judgements when observers viewed fast- or slow-moving objects. They used three conditions varying in the amount of information available to observers: (1) monocular flow information only (permitting assessment of tau); (2) binocular disparity information only; (3) all sources of information available. Fath et al. predicted that binocular disparity information would be less likely to be used with fast-moving objects than slow-moving ones because it is relatively time-consuming to calculate changes in binocular disparity over time.
What did Fath et al. (2018) find? First, with fast objects, time-to-contact judgements were more accurate with monocular flow information only than with binocular disparity information only. Second, with slow objects, the opposite findings were obtained. Third, the accuracy of time-to-contact judgements when all sources of information were available was comparable to accuracy in the better of the single-source conditions with both fast and slow objects.
DeLucia (2013) found observers mistakenly predicted a large approaching object would reach them sooner than a closer small approaching object:
the size-arrival effect. This effect occurred because observers attached more
importance to relative size than tau.
We turn now to research on drivers’ braking decisions. Lee’s (1976)
notion that drivers brake to hold constant the rate of change of tau was
tested by Yilmaz and Warren (1995). They told participants to stop at a
stop sign in a simulated driving task. As predicted, there was generally a
linear reduction in tau during braking. However, some participants showed
large rather than gradual changes in tau shortly before stopping.
Tijtgat et al. (2008) found individual differences in stereo vision influenced drivers’ braking behaviour to avoid a collision. Drivers with weak
stereo vision started braking earlier than those with normal stereo vision
and their peak deceleration also occurred earlier. Those with weak stereo
vision found it harder to calculate distances causing them to underestimate
the time to contact. Thus, deciding when to brake does not depend only
on tau or tau-dot.
Harrison et al. (2016) argued that Lee’s (1976) theoretical approach is
limited in two important ways when applied to drivers’ braking behaviour.
First, it ignores physical limitations in the real world. For example, tau-dot
specifies to a driver the deceleration during braking required to avoid collision. However, this strategy will not work if the driver’s braking system
makes the required deceleration unachievable.
Second, individuals differ in the emphasis they place on minimisation
of costs (e.g., preferred safety margin). According to Harrison et al., these
limitations suggest drivers’ braking behaviour is influenced by their sensitivity to relevant affordances (possibilities for action) such as their knowledge of the dynamics of the braking system in their car.
Evaluation
The notion that tau is used to make time-to-contact judgements is simple
and elegant. There is much evidence that such judgements are often strongly
influenced by tau. Even when competing factors affect time-to-contact
judgements, tau often has the greatest influence on those judgements. Tau is
also often used when drivers make decisions about when to brake.
What are the limitations of theory and research in this area? First,
time-to-contact judgements are typically more influenced by tau or tau-dot
in relatively uncluttered laboratory environments than in naturalistic conditions (Land, 2009). Second, tau is not the only factor determining time-to-contact judgements. As Land (2009, p. 853) pointed out, "The brain
will accept all valid cues in the performance of an action, and weight them
according to their current reliability.” These cues can include object familiarity, binocular disparity and relative size. It clearly makes sense to use all
the available information in this way.
Third, the tau hypothesis ignores the emotional value of the approaching object. Time-to-contact judgements are shorter for threatening pictures
than neutral ones (Brendel et al., 2012). This makes evolutionary sense – it
could be fatal to overestimate how long a very threatening object (e.g., a
lion) will take to reach you!
Fourth, braking behaviour involves factors additional to tau and
tau-dot. For example, there are individual differences in preferred safety
margin. Rock et al. (2006) identified an alternative braking strategy in a
real-world driving task in which drivers directly estimated the constant
ideal deceleration required to stop at a given point.
VISUALLY GUIDED ACTION: CONTEMPORARY
APPROACHES
The previous section focused mainly on the issue of how we use visual
information when moving through the environment. Here we consider
similar issues but the emphasis shifts towards processes involved in successful goal-directed action towards objects. For example, how do we reach for
a cup of coffee? This issue was addressed by Milner and Goodale (1995,
2008) in their perception-action model (see Chapter 2). Contemporary
approaches that have developed and extended the perception-action model
are discussed below.
Role of planning: planning-control model
Glover (2004) proposed a planning-control model of goal-directed action
towards objects. According to this model, we initially use a planning system
followed by a control system but the two systems often overlap in time.
Here are the main features of the two systems:
(1) Planning system
●● It is used mostly before the initiation of movement.
●● It selects an appropriate target (e.g., cup of coffee), decides how it should be grasped and works out the timing of the movement.
●● It is influenced by factors such as the individual's goals, the nature of the target object, the visual context and various cognitive processes.
●● It is relatively slow because it uses much information and is influenced by conscious processes.
(2) Control system
●● It is used during the carrying out of a movement.
●● It ensures movements are accurate, making adjustments, if necessary, based on visual feedback. Efference copy (see Glossary) is used to compare actual with desired movement. Proprioception is also involved.
●● It is influenced by the target object's spatial characteristics (e.g., size; shape; orientation) but not by the surrounding context.
●● It is fairly fast because it uses little information and is not susceptible to conscious influence.

KEY TERM
Proprioception: An individual's awareness of the position and orientation of parts of their body.
According to the planning-control model, most errors in human action
stem from the planning system. In contrast, the control system typically
ensures actions are accurate and achieve their goal. Many visual illusions
occur because of the influence of visual context. Since information about
visual context is used only by the planning system, responses to visual illusions should typically be inaccurate if they depend on the planning system
but accurate if they depend on the control system.
There are similarities between the planning-control model and
Milner and Goodale’s perception-action model. However, Glover
(2004) focused more on the processing changes occurring during action
performance.
Findings
Glover et al. (2012) compared the brain areas involved in planning and
control using a planning condition (prepare to reach and grasp an object
but remain still) and a control condition (reach out immediately for the
object). There was practically no overlap in the brain areas associated with
planning and control. This finding supports the model’s assumption that
planning and control processes are separate.
According to the planning-control model, various factors (e.g., semantic properties of the visual scene) influence the planning process associated
with goal-directed movements but not the subsequent control process. This
prediction was tested by Namdar et al. (2014). Participants grasped an
object in front of them using their thumb and index finger. The object had
a task-irrelevant digit (1, 2, 8 or 9) on it. As predicted, numerically larger
digits led to larger grip apertures during the first half of the movement trajectory but not the second half (involving the control process).
According to Glover (2004), action planning involves conscious
processing followed by rapid non-conscious processing during action
control. These theoretical assumptions can be tested by requiring participants to carry out a second task while performing an action towards an
object. According to the model, this second task should disrupt planning
but not control. However, Hesse et al. (2012) found a second task disrupted planning and control when participants made grasping movements
towards objects. Thus, planning and control can both require attentional
resources.
According to the model, visual illusions occur because misleading
visual context influences the initial planning system rather than the later
control system. Roberts et al. (2013) required participants to make rapid
reaching movements to a Müller-Lyer figure. Vision was available only
during the first 200 ms of movement or the last 200 ms. The findings were
opposite to those predicted theoretically – performance was more accurate
with early vision than late vision.
Elliott et al. (2017) explained the above findings with their multiple
process model. According to this model, performance was good when early
vision was available because of a control system known as impulse control.
Impulse control “entails an early, and continuing, comparison of expected
sensory consequences to perceived sensory consequences to regulate limb
direction and velocity during the distance-covering phase of the movement” (p. 108).
Evaluation
Glover’s (2004) planning-control model has proved successful in various
ways. First, it successfully developed the common assumption that motor
movements towards an object involve successive planning and control processes. Second, the assumption that cognitive processes are important in action
planning is correct. Third, there is evidence (e.g., Glover et al., 2012) that
separate brain areas are involved in planning and control.
What are the model’s limitations? First, the planning system involves
several very different processes: “goal determination; target identification
and selection; analysis of object affordances [potential object uses]; timing;
and computation of the metrical properties of the target such as its size,
shape, orientation and position relative to the body” (Glover et al., 2012,
p. 909). This diversity casts doubt on the assumption that there is a single planning system.
Second, the model argues control occurs late during object-directed
movements and is influenced by visual feedback. However, there appears
to be a second control process (called impulse control by Elliott et al.,
2017) operating throughout the movement trajectory and not influenced
by visual feedback.
Third, and related to the second point, the model presents an oversimplified picture of the processes involved in goal-directed action. More
specifically, the processing involved in producing goal-directed movements
is far more complex than implied by the notion of a planning process followed by a control process. For example, planning and control processes
are often so intermixed that “the distinction between movement planning
and movement control is blurred” (Gallivan et al., 2018, p. 519).
Fourth, complex decision-making processes are often involved when individuals plan goal-directed actions in the real world. For example, when planning, tennis players must often decide between a simple shot minimising energy expenditure and risk of injury or a more ambitious shot that might immediately win the current point (Gallivan et al., 2018).
Fifth, the model is designed to account for planning and control processes when only one object is present or of interest. In contrast, visual
scenes in everyday life are often far more complex and contain several
objects of potential relevance (see below).
Role of planning: changing action plans
We all have considerable experience of changing, modifying and abandoning action plans with respect to objects in the environment. How do we
resolve competition among action plans? According to Song (2017, p. 1),
“Critical is the existence of parallel motor planning processes, which allow
efficient and timely changes.”
What evidence indicates we often process information about several
different potential actions simultaneously? Suppose participants are given
the task of reaching rapidly towards a target in the presence of distractors (Song & Nakayama, 2008). On some trials, their reach is initially
directed towards the target. On other trials, their initial reach is directed
towards a distractor but this is corrected in mid-flight producing a strongly
curved trajectory. Song and Nakayama’s key finding was that corrective
movements occurred very rapidly following the onset of the initial movement. This finding strongly implies that the corrective movement had been
planned prior to execution of the initial incorrect movement.
Song (2017) discussed several other studies where similar findings were
obtained. He concluded, “The sensori-motor system generates multiple
competing plans in parallel before actions are initiated . . . this concurrent processing enables us to efficiently resolve competition and select one
appropriate action rapidly” (p. 6).
Brain pathways
In their perception-action model, Milner and Goodale (1995, 2008) distinguished between a ventral stream or pathway and a dorsal stream
or pathway (see Chapter 2). In approximate terms, the ventral stream is
involved in object perception whereas the dorsal stream “is generally considered to mediate the visual guidance of action, primarily in real time”
(Milner, 2017, p. 1297).
Much recent research has indicated that the above theoretical account
is oversimplified (see Chapter 2). Of central importance is the accumulating evidence that there are actually two somewhat separate dorsal streams
(Osiurak et al., 2017; Sakreida et al., 2016):
(1) The dorso-dorsal stream: processing in this stream relates to the online control of action and is hand-centred; it has been described as the "grasp" system (Binkofski & Buxbaum, 2013).
(2) The ventro-dorsal stream: processing in this stream is offline and relies on memorised knowledge of objects and tools and is object-centred; it has been described as the "use" system (Binkofski & Buxbaum, 2013).
Sakreida et al. (2016) identified several other differences between these two streams (see Figure 4.6). In essence, object processing within the dorso-dorsal stream is variable because it is determined by the immediately accessible properties of an object (e.g., its size and shape). Such processing is fast and "automatic". In contrast, processing within the ventro-dorsal stream is stable because it is determined by memorised object knowledge. Such processing is slow and more cognitively demanding than processing within the dorso-dorsal stream.

Figure 4.6 The dorso-dorsal and ventro-dorsal streams, showing their brain locations and forms of processing. Processing in the dorso-dorsal stream is variable: fast and "automatic" online processing during actual object interaction, with variation of object properties (e.g., size, shape, weight or orientation) during task performance and low working memory load. Processing in the ventro-dorsal stream is stable: slow and "non-automatic" offline processing of memorised object knowledge, with constant object properties during active or observed object-related reaching, grasping or pointing and high working memory load. Related concepts shown include structure-based actions (the "grasp" system of Buxbaum and Kalénine), function-based actions (the "use" system of Buxbaum and Kalénine) and Jeannerod's grasping and reaching circuits. From Sakreida et al. (2016). Reprinted with permission of Elsevier.
Findings

KEY TERM
Limb apraxia: A condition caused by brain damage in which individuals have impaired ability to make skilled goal-directed movements towards objects even though they possess the physical ability to perform them.
Considerable neuroimaging evidence supports the proposed distinction
between two dorsal streams. Martin et al. (2018, p. 3755) reviewed research
indicating the dorso-dorsal stream “traverses from visual area V3a through
V6 toward the superior parietal lobule, and . . . reaches the dorsal premotor
cortex”. In contrast, the ventro-dorsal stream “encompasses higher-­order
visual areas like MT/V5+, the inferior parietal lobule . . . as well as the
ventral premotor cortex and inferior frontal gyru” (p. 3755). Sakreida et al.
(2016) conducted a meta-analytic review based on 71 neuroimaging studies
and obtained similar findings.
Evidence from brain-damaged patients is also supportive of the distinction between two dorsal streams. First, we consider patients with
damage to the ventro-dorsal stream. Much research has focused on limb
apraxia, a disorder where patients often fail to make precise goal-directed
actions in spite of possessing the physical ability to perform those actions
(Pellicano et al., 2017). More specifically, “Reaching and grasping actions
in LA [limb apraxia] are normal when vision of the limb and target is
available, but typically degrade when they must be performed ‘off-line’, as
when subjects are blindfolded prior to movement execution" (Binkofski & Buxbaum, 2013, p. 5). This pattern of findings is expected if the dorso-dorsal stream is intact in patients with limb apraxia.
Second, we consider patients with damage to the dorso-dorsal stream.
Much research here has focused on optic ataxia (see Glossary). As predicted, patients with optic ataxia have impaired online motor control and
so exhibit inaccurate reaching towards (and grasping of) objects.
Evaluation
Neuroimaging research has provided convincing evidence for the existence
of two dorsal processing streams. The distinction between dorso-dorsal and
ventro-dorsal streams has also been supported by studies on brain-damaged
patients. More specifically, there is some evidence for a double dissociation
(see Glossary) between the impairments exhibited by patients with limb
apraxia and optic ataxia.
What are the limitations of research in this area? First, the ventral
stream (strongly involved in object recognition) is also important in visually guided action (Osiurak et al., 2017). However, precisely how this
stream interacts with the dorso-dorsal and ventro-dorsal streams is unclear.
Second, there is some overlap in the brain between the dorso-dorsal and
ventro-dorsal streams and so it is important not to exaggerate their independence. Third, there is a lack of consensus concerning the precise functions of the two dorsal streams (see Osiurak et al., 2017, and Sakreida
et al., 2016).
PERCEPTION OF HUMAN MOTION
We are very good at interpreting other people’s movements. We can decide
very rapidly whether someone is walking, running or limping. Our initial
focus is on two key issues. First, how successfully can we interpret human
motion with very limited visual information?
Second, do the processes involved in perception of human motion differ from those
involved in perception of motion in general? If
the answer to this question is positive, we also
need to consider why the perception of human
motion is special.
As indicated already, our focus is
mostly on the perception of human motion.
However, there are many similarities between
the perception of human and animal motion,
and we will sometimes use the term “biological motion” to refer generally to the perception of animal motion.
Finally, we discuss an important theoretical approach based on the notion that the
same brain system or network is involved in
perceiving and understanding human actions
and in performing those same actions.
Perceiving human motion
Suppose you were presented with point-light
displays, as was done initially by Johansson
(1973). Actors were dressed entirely in black
with lights attached to their joints (e.g., wrists;
knees; ankles). They were filmed moving
around a darkened room so only the lights
were visible to observers watching the film (see
Figure 4.7 and “Johansson Motion Perception
Part 1” on YouTube).What do you think
you would perceive in those circumstances?
Figure 4.7 Point-light sequences (a) with the walker visible and (b) with the walker not visible. From Shiffrar and Thomas (2013). With permission of the authors.

Figure 4.8 Human detection and discrimination efficiency (%) for human walkers presented in contour, point lights, silhouette and skeleton. From Lu et al. (2017).
In fact, Johansson found observers perceived the moving person accurately
with only six lights and a short segment of film.
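To convey how sparse these stimuli are, here is a toy construction of a point-light frame (entirely our own illustration; real displays use motion-captured joint trajectories rather than the crude sinusoidal sway below). All body-surface information is discarded and only a handful of joint positions are rendered as moving dots:

import numpy as np

# Hypothetical 2-D joint positions (metres) for a walker figure.
JOINTS = np.array([
    [0.0, 1.7],    # head
    [-0.35, 0.9],  # left wrist
    [0.4, 1.1],    # right wrist
    [-0.1, 1.0],   # left hip
    [0.1, 1.0],    # right hip
    [-0.3, 0.0],   # left ankle
    [0.25, 0.05],  # right ankle
])

def point_light_frame(joints, phase):
    """One frame of a point-light display: joint dots only, no body surface."""
    frame = joints.copy()
    frame[:, 0] += 0.05 * np.sin(phase)  # toy lateral sway standing in for real gait
    return frame

frames = [point_light_frame(JOINTS, p) for p in np.linspace(0, 2 * np.pi, 60)]
print(len(frames), frames[0].shape)  # 60 frames of just 7 moving dots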
In subsequent research, Johansson et al. (1980) found observers perceived human motion with no apparent difficulty when viewing a point-light display for only one-fifth of a second! Ruffieux et al. (2016) studied a
patient, BC, who was cortically blind but had a residual ability to process
motion. When presented with two point-light displays (one of a human
and one of an animal) at the same time, he generally correctly identified
the human.
The above findings imply we are very efficient at processing impoverished point-light displays. However, Lu et al. (2017) reported some
contrary evidence. Observers were given two tasks: (1) detecting the presence of a human walker; (2) discriminating whether a human walker
was walking leftward or rightward. The walker was presented in point
lights, contour, silhouette or as a skeleton. Detection performance was
relatively good for the point-light display but discrimination performance
was not (see Figure 4.8). Performance was high with the skeleton display
because it provided detailed information about the connections between
joints.
Top-down or bottom-up processes?
Johansson (1975) argued the ability to perceive biological motion is innate,
describing the processes involved as “spontaneous” and “automatic”.
Support was reported by Simion et al. (2008) in a study on newborns (1–3
days). These newborns preferred to look at a display showing biological
motion more than one that did not. Remarkably, Simion et al. used pointlight displays of chickens of which the newborns had no previous experience. These findings suggest the perception of biological motion involves
basic bottom-up processes.
Evidence that learning plays a role was reported by Pinto (2006).
Three-month-olds were equally sensitive to motion in point-light humans,
cats and spiders. In contrast, 5-month-olds were more sensitive to displays
of human motion. Thus, the infant visual system becomes increasingly specialised for perceiving human motion.
If the detection of biological motion were “automatic”, it would be
relatively unaffected by attention. However, in a review Thompson and
Parasuraman (2012) concluded attention is required, especially when the
available visual information is ambiguous or competing information is
present.
Mayer et al. (2015) presented circular arrays of between two and
eight video clips. In one condition, observers decided rapidly whether
any clip showed human motion; in another condition, they decided
whether any clips showed machine motion. There were two key findings. First, detection times increased with array size for both human and machine motion, suggesting attention is required to detect both types of
motion. Second, the effects of array size on detection times were much
greater for machine motion. Thus, searching is more efficient for human
than machine motion suggesting human motion perception may be special
(see below).
Is human motion perception special?
Much evidence indicates we are better at detecting human motion than
motion in other species (Shiffrar & Thomas, 2013). Cohen (2002) assessed
observers’ sensitivity to human, dog and seal motion using point-light
displays. Performance was best with human motion and worst with seal
motion. Of importance, the same pattern of performance was found in seal
trainers and dog trainers. Thus, the key factor is not simply visual experience; instead, we are more sensitive to observed motions resembling our
own repertoire of actions.
We can also consider whether human motion perception is special by considering the brain. There has been an increasing recognition that many brain areas are involved in biological motion processing (see Figure 4.9). The pathway from the fusiform gyrus (FFG) to the superior temporal sulcus (STS) is of particular importance, as are top-down processes from the insula (INS), the STS and the inferior frontal gyrus (IFG).

Much research indicates the central importance of the superior temporal sulcus. Grossman et al. (2005) applied repetitive transcranial magnetic stimulation (rTMS; see Glossary) to that area to disrupt processing. This caused a substantial reduction in observers' sensitivity to biological motion. Gilaie-Dotan et al. (2013) found grey matter volume in the superior temporal sulcus correlated positively with the detection of biological (but not non-biological) motion.

Figure 4.9 Brain areas involved in biological motion processing (STS = superior temporal sulcus; IFG = inferior frontal gyrus; INS = insula; Crus 1 = left lateral cerebellar lobule; MTC = middle temporal cortex; OCC = early visual cortex; FFG = fusiform gyrus). From Sokolov et al. (2018).

Evidence from brain-damaged patients indicates that perceiving biological motion involves different processes from those involved in perceiving object motion generally. Vaina et al. (1990) studied a patient, AF, with damage to the posterior visual pathways. He performed poorly on basic motion
tasks but was reasonably good at detecting biological motion from point-light displays. In contrast, Saygin (2007) found that stroke patients with damage in the temporal and premotor frontal areas showed more impaired perception of biological than non-biological
motion.
Why is biological motion perception special?
We could explain the special nature of biological motion perception in three
ways (Shiffrar & Thomas, 2013). First, biological motion is the only type of
motion humans can produce as well as perceive. Second, most people spend
more time perceiving and trying to understand other people’s motion than
any other form of visual motion. Third, other people’s movements provide
a rich source of social and emotional information.
We start with the first reason (discussed further on pp. 161–162). The relevance of motor skills to the perception of biological motion was shown
by Kloeters et al. (2017). Patients with Parkinson’s disease (which impairs
movement execution) had significantly inferior perception of human
movement with point-light displays compared to healthy controls. More
dramatically, paraplegics with severe spinal injury were almost three times
less sensitive than healthy controls to human movement in point-light
displays.
We must not exaggerate the importance of motor involvement in biological motion perception. A man, DC, born without upper limbs, identified manual actions shown in videos and photographs as well as healthy
controls (Vannuscorps et al., 2013). Motor skills may be most important
in biological motion perception when the visual information presented is
sparse or ambiguous (e.g., as with point-light displays).
Jacobs et al. (2004) obtained support for the second reason listed
above. Observers’ ability to identify walkers from point-light displays was
much better when the walker was observed for 20 hours a week rather than
5 hours. In our everyday lives, we often recognise individuals in motion by
integrating information from biological motion with information from the
face and the voice within the superior temporal sulcus (Yovel & O’Toole,
2016). Successful integration of these different information sources clearly
depends on learning and experience.
We turn now to the third reason mentioned earlier. Charlie Chaplin
showed convincingly that bodily movements can convey social and emotional information. Atkinson et al. (2004) found observers performed
well at identifying emotions from point-light displays (especially for fear,
sadness and happiness). Part of the explanation for these findings is that
angry individuals walk especially fast whereas fearful or sad ones walk
very slowly (Barliya et al., 2013).
We can explore the role of social factors in biological motion detection by studying adults with autism spectrum disorder, who have severely impaired social interaction skills. The findings are somewhat inconsistent. However, adults with autism spectrum disorder generally have a reasonably intact ability to detect human motion in point-light displays but exhibit impaired emotion processing in such displays (see Bakroon & Lakshminarayanan, 2018, for a review).
Mirror neuron system
Research on monkeys in the 1990s transformed our understanding of biological motion. Gallese et al. (1996) assessed monkeys' brain activity while they performed a given action and while they observed another monkey perform the same action. They found 17% of neurons in area F5 of the premotor cortex were activated in both conditions. Such findings led theorists to propose a mirror neuron system consisting of neurons activated during both observation and performance of actions (see Keysers et al., 2018, for a review).

There have been numerous attempts to identify a mirror neuron system in humans. Our current understanding of brain areas associated with the mirror neuron system is shown in Figure 4.10. Note that the mirror neuron system consists of an integrated network rather than separate brain areas (Keysers et al., 2018).

Figure 4.10
The main brain areas associated with the mirror neuron system (MNS) plus their interconnections (red). Areas involved in visual input (blue; pMTG = posterior mid-temporal gyrus; STS = superior temporal gyrus) and motor output (green; M1 = primary motor cortex) are also shown. AIP = anterior intraparietal areas; PF = area within the parietal lobe; PMv and PMd = ventral and dorsal premotor cortex; SI = primary somato-sensory cortices.
From Keysers et al. (2018). Reprinted with permission of Elsevier.

Most research is limited because it shows only that the same brain areas are involved in action perception and production. Perry et al. (2018) used more precise methods to reveal a more complex picture within areas assumed to form part of the mirror neuron system. Some small areas were activated during both observing actions and imitating them, thus providing evidence for a human mirror neuron system. However, other adjacent areas were activated only during observing or action imitation.

More convincing evidence for a human mirror neuron system was reported by de la Rosa et al. (2016).
They focused on activation in parts of the inferior frontal gyrus (BA44/45)
corresponding to area F5 in monkeys. Their key finding was that 52 voxels
(see Glossary) within BA44/45 responded to both action perception and
action production.
Before proceeding, we should note the term “mirror neuron system”
is somewhat misleading because mirror neurons do not provide us with
an exact motoric coding of observed actions. As Williams (2013, p. 2962) wittily remarked, "If only this was the case! I could become an Olympic ice-skater or a concert pianist!"
Findings
We have seen that neuroimaging studies have indicated that the mirror
neuron system is activated during motor perception and action. Such evidence is correlational, and so does not demonstrate that the mirror neuron
system is necessary for motor perception and action understanding.
More direct evidence comes from research on brain-damaged patients.
Binder et al. (2017) studied left-hemisphere stroke patients with apraxia
(impaired ability to perform planned actions) having damage within the
mirror neuron system (e.g., inferior frontal gyrus). These patients had
comparable deficits in action imitation, action recognition and action comprehension. The co-existence of these deficits was precisely as predicted. Another predicted finding was that left-hemisphere stroke patients without apraxia had less brain damage in core regions of the mirror neuron system than those with apraxia.

KEY TERM

Mirror neuron system
Neurons that respond to actions whether performed by oneself or someone else; it is claimed these neurons assist in imitating (and understanding) the actions of others.
Another approach to demonstrating the causal role of the mirror
neuron system is to use experimental techniques such as transcranial direct
current stimulation (tDCS; see Glossary). Avenanti et al. (2018) assessed
observers’ ability to predict which object would be grasped after seeing
the start of a reaching movement. Task performance was enhanced when
anodal tDCS was used to facilitate neural activity within the mirror neuron
system, whereas it was impaired when cathodal tDCS was used to inhibit
such neural activity.
KEY TERM

Apraxia
A condition caused by brain damage in which there is greatly reduced ability to perform purposeful or planned bodily movements in spite of the absence of muscular damage.
Findings: functions of the mirror neuron system
What are the functions of the mirror neuron system? It has often been
assumed mirror neurons play a role in working out why someone else is
performing certain actions as well as deciding what those actions are. For
example, Eagle et al. (2007, p. 131) claimed the mirror neuron system is
involved in “the automatic, unconscious, and non-inferential simulation
in the observer of the actions, emotions, and sensations carried out and
expressed by the observed”.
Rizzolatti and Sinigaglia (2016) argued that full understanding of
another person’s actions requires a multi-level process. The first level
involves identifying the outcome of the observed action and the emotion
being displayed by the other person. This is followed by the observer representing the other person’s desires, beliefs and intentions. The mirror
neuron system is primarily involved at the first level but may provide an
input to subsequent processes.
Lingnau and Petris (2013) argued that understanding another person’s
actions often requires complex cognitive processes as well as simpler processes within the mirror neuron system. Observers saw point-light displays
of human actions and some were asked to identify the goal of each action.
Areas within the prefrontal cortex (associated with high-level cognitive processes) were more activated when goal identification was required. These
findings can be explained within the context of Rizzolatti and Sinigaglia’s
(2016) approach discussed above.
Wurm et al. (2016) distinguished between two forms of motion perception and understanding. They used the example of observers understanding that someone is opening a box. If they have a general or abstract
understanding of this action, their understanding should generalise to
other boxes and other ways of opening a box. In contrast, if they only
have a specific or concrete understanding of the action, their understanding will not generalise. Wurm et al. (2016) found specific or concrete action
understanding could occur within the mirror neuron system. However, more general or abstract understanding involved high-level perceptual
regions (e.g., the lateral parieto-temporal cortex) outside the mirror neuron
system.
In sum, the mirror neuron system is of central importance with respect
to some (but not all) aspects of action understanding. More specifically,
additional (more “cognitive”) brain areas are required if action understanding is complex (Lingnau & Petris, 2013) or involves generalising from
past experience (Wurm et al., 2016). It is also likely that imitating someone
else’s actions often involves processes (e.g., person-perception processes)
additional to those directly involving the mirror neuron system (Ramsey,
2018).
Overall evaluation
Several important research findings have been obtained. First, we have an
impressive ability to perceive human or biological motion even with very
limited visual input. Second, the brain areas involved in human motion
perception differ somewhat from those involved in perceiving motion in
general. Third, perception of human motion is special because it is the only
type of motion we can both perceive and produce. Fourth, a mirror neuron
system allows us to imitate and understand other people’s movements.
Fifth, the core brain network of the mirror neuron system has been identified. Its causal role has been established through studies on brain-damaged
patients and research using techniques such as transcranial direct current
stimulation.
What are the limitations of research in this area? First, much remains
unclear about interactions of bottom-up and top-down processes in the
perception of biological motion.
Second, the mirror neuron system does not account for all aspects of
action understanding. As Gallese and Sinigaglia (2014, p. 200) pointed out,
action understanding “involves representing to which . . . goals the action
is directed; identifying which beliefs, desires, and intentions specify reasons
explaining why the action happened; and realising how those reasons are
linked to the agent and to her action”.
Third, nearly all studies on the mirror neuron system have investigated
its properties with respect only to hand actions. However, somewhat different mirror neuron networks are probably associated with hand-and-mouth
actions (Ferrari et al., 2017a).
Fourth, it follows from theoretical approaches to the mirror neuron
system that an observer’s ability to understand another person’s actions
should be greater if they both execute any given action in a similar fashion.
This prediction has been confirmed (Macerollo et al., 2015). Such research
indicates the importance of studying individual differences in motor
actions, which have so far been relatively neglected.
CHANGE BLINDNESS
We have seen that a changing visual environment allows us to move in the
appropriate direction and to make coherent sense of our surroundings.
However, as we will see, our perceptual system does not always respond
appropriately to changes in the visual environment.
Have a look around you (go on!). You probably have a strong impression of seeing a vivid and detailed picture of the visual scene. As a result,
you are probably confident you could immediately detect any reasonably
large change in the visual environment. In fact, that is often not the case.
Change blindness, which is “the failure to detect changes in visual scenes” (Ball et al., 2015, p. 2253), is the main phenomenon we will discuss.
We also consider inattentional blindness, “the failure to consciously perceive otherwise salient events when they are not attended” (Ward & Scholl,
2015, p. 722). Research on change blindness focuses on dynamic processes
over time. It has produced striking and counterintuitive findings leading to
new theoretical thinking about the processes underlying conscious visual
awareness.
Change blindness and inattentional blindness both depend on a mixture
of perceptual and attentional processes. It is thus appropriate to discuss
these phenomena at the end of our coverage of perception and immediately
prior to the start of our coverage of attention.
You have undoubtedly experienced change blindness at the movies
caused by unintended continuity mistakes when a scene has been reshot.
For example, in the film Skyfall, James Bond is followed by a white
car. Mysteriously, this car suddenly becomes black and then returns to
being white! For more examples, type “Movie continuity mistakes” into
YouTube.
We greatly exaggerate our ability to detect visual changes. Levin et al. (2002) asked observers to watch videos involving two people in a restaurant. In one video, the plates changed from red to white, and in another a scarf worn by one person disappeared. Levin et al. found 46% of observers claimed they would have noticed the change in the colour of the plates without being forewarned, and the figure was 78% for the disappearing scarf. In a previous study, 0% of observers had detected either change! Levin et al. introduced the term change blindness blindness to describe our wildly optimistic beliefs about our ability to detect visual changes.
In the real world, we are often aware of visual changes because we
detect motion signals accompanying the change. Laboratory researchers
have used various ways to prevent observers from detecting motion signals.
One way is to make the change during a saccade
(rapid movement of the eyes). Another way is
to have a short gap between the original and
changed displays (the flicker paradigm).
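The flicker paradigm has a very simple trial structure, which the schematic sketch below makes explicit. It is our illustration, not a published implementation, and the display durations are typical values we have assumed: the original and changed displays alternate, separated by blank gaps that mask the motion signal the change would otherwise generate.

```python
# A schematic sketch of the flicker paradigm's trial structure (our
# illustration; the durations are assumed typical values, not from a study).
ORIGINAL_MS, BLANK_MS, CHANGED_MS = 240, 80, 240

def flicker_trial(show, detected=lambda: False, max_cycles=60):
    """Alternate original-blank-changed-blank until the observer responds.

    `show(display, ms)` renders a display for `ms` milliseconds; `detected()`
    polls for the observer's keypress. Returns the cycle on which the change
    was detected, or None if the trial times out (change blindness).
    """
    sequence = [("original", ORIGINAL_MS), ("blank", BLANK_MS),
                ("changed", CHANGED_MS), ("blank", BLANK_MS)]
    for cycle in range(max_cycles):
        for display, duration in sequence:
            show(display, duration)
            if detected():
                return cycle
    return None

# A dummy run with an observer who never responds: the trial times out.
print(flicker_trial(show=lambda display, ms: None))   # None
```

The blank gap is the crucial design choice: without it, the change itself would produce a transient motion signal that draws attention to the change location.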
Suppose you walked across a large square
close to a unicycling clown wearing a vivid
purple and yellow outfit, large shoes and a bright
red nose (see Figure 4.11). Would you spot
him? I imagine your answer is “Yes”. However,
Hyman et al. (2009) found only 51% of people
walking on their own spotted the clown. Those
failing to spot the clown exhibited inattentional
blindness.
KEY TERMS

Change blindness
Failure to detect various changes (e.g., in objects) in the visual environment.

Inattentional blindness
Failure to detect an unexpected object appearing in the visual environment.

Change blindness blindness
The tendency of observers to overestimate greatly the extent to which they can detect visual changes and so avoid change blindness.
Figure 4.11
The unicycling clown who cycled close to students walking
across a large square.
From Hyman et al. (2009).
Change blindness vs inattentional blindness

Change blindness and inattentional blindness both involve a failure to detect some visual event
occurring in plain sight. Unsurprisingly, failures of attention often play an
important role in causing both forms of blindness.
However, there are major differences between the two phenomena
(Jensen et al., 2011). First, consider the effects of instructing observers to
look for unexpected objects or visual changes. Target detection in change
blindness paradigms is often hard even with such instructions. In contrast,
target detection in inattentional blindness paradigms becomes trivially
easy. Second, change blindness involves the use of memory to compare pre-change and post-change stimuli, whereas inattentional blindness does not.
Third, inattentional blindness mostly occurs when the observer’s attention
is engaged in a demanding task (e.g., chatting on a mobile phone) unlike
change blindness.
In sum, more complex processing is typically required for successful
performance in change blindness tasks. More specifically, observers must
engage successfully in five separate processes for change detection to occur
(Jensen et al., 2011):
(1) Attention must be paid to the change location.
(2) The pre-change visual stimulus at the change location must be encoded into memory.
(3) The post-change visual stimulus at the change location must be encoded into memory.
(4) The pre- and post-change representations must be compared.
(5) The discrepancy between the pre- and post-change representations must be recognised at the conscious level.
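These five requirements can be read as a serial pipeline in which failure at any single stage produces change blindness. A minimal sketch follows; the per-stage success probabilities are purely illustrative assumptions and are not estimates from Jensen et al. (2011).

```python
import random

# Illustrative per-stage success probabilities (made up for this sketch;
# they are not estimates from Jensen et al., 2011).
STAGES = [
    ("attend to the change location",        0.6),
    ("encode the pre-change stimulus",       0.8),
    ("encode the post-change stimulus",      0.8),
    ("compare the two representations",      0.7),
    ("consciously register the discrepancy", 0.9),
]

def detect_change():
    """Serial pipeline: detection requires every stage to succeed;
    failure at any one of the five stages yields change blindness."""
    return all(random.random() < p for _, p in STAGES)

trials = 10_000
hits = sum(detect_change() for _ in range(trials))
print(f"detection rate ~ {hits / trials:.2f}")  # ~0.24: the product of the five probabilities
```

The point of the sketch is structural: because the stages multiply, even modest failure rates at each stage compound into frequent change blindness overall.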
IN THE REAL WORLD: IT’S MAGIC!
Magicians benefit from the phenomena of inattentional blindness and change blindness (Kuhn & Martinez, 2012). Most magic tricks involve misdirection, which is designed “to disguise the method and thus prevent the audience from detecting it” (Kuhn & Martinez, 2012, p. 2). Many people
believe misdirection involves the magician manipulating the audience’s attention away from some
action crucial to the trick’s success. That is often (but not always) the case.
Inattentional blindness
Kuhn and Findlay (2010) studied inattentional blindness using a disappearing lighter (see
Figure 4.12 for details). There were three main findings. First, of the observers who detected the
drop, 31% were fixating close to the magician’s left hand when the lighter was dropped from that
hand. However, 69% were fixating some distance away and so detected the drop in peripheral
vision (see Figure 4.13). Second, the average distance between fixation and the drop was the same
in those who detected the drop in peripheral vision and those who did not. Third, the time taken
after the drop to fixate the left hand was much less in observers using peripheral vision to detect
the drop than those failing to detect it (650 ms vs 1,712 ms).
What do the above findings mean? The lighter drop can be detected by overt attention (attention directed to the fixation point) or covert attention (attention directed away from the fixation
point). Covert attention was surprisingly effective because the human visual system can readily
detect movement in peripheral vision (see Chapter 2).
Figure 4.12
The sequence of events in the disappearing lighter trick: (a) the magician picks up a lighter with his left hand and (b)
lights it; (c) and (d) he pretends to take the flame with his right hand and (e) gradually moves it away from the hand
holding the lighter; (f) he reveals his right hand is empty while the lighter is dropped into his lap; (g) the magician
directs his gaze to his left hand and (h) reveals that his left hand is also empty and the lighter has disappeared.
From Kuhn and Findlay (2010). Reprinted with permission of Taylor & Francis.
Most people underestimate the importance of peripheral vision to trick detection. Across several
magic tricks (including the lighter trick and other tricks involving change blindness), Ortega et
al. (2018) found under 30% of individuals
thought they were likely to detect how a
trick worked using peripheral vision. In fact,
however, over 60% of the tricks where they
detected the method involved peripheral
vision! Thus, most people exaggerate the
role of central vision in understanding magic
tricks.
Figure 4.13
Participants’ fixation points at the time of dropping the lighter for those detecting the drop (triangles) and those missing the drop (circles).
From Kuhn and Findlay (2010). Reprinted with permission of Taylor & Francis.

Change blindness
Smith et al. (2012) used a magic trick in
which a coin was passed from one hand
to the other and then dropped. Observers
guessed whether the coin landed heads or
tails. On one trial, the coin was switched
(e.g., from a £1 coin to a 2p coin). All
observers fixated the coin throughout the
time it was visible but about 90% failed
to detect the coin had changed! Thus, an
object can be attended to without some
of the features irrelevant to the current
task being processed sufficiently to prevent
change blindness.
Kuhn et al. (2016) used a trick in which a magician made the colour of playing cards change.
Explicit instructions to observers to keep their eyes on the cards influenced overt attention but
failed to reduce change blindness.
Conclusions
The success of many magic tricks depends less on where observers are fixating (overt attention)
than we might think. Observers can be deceived even when their overt attention is directed to
the crucial location. In addition, they often avoid change blindness or inattentional blindness even
when their overt attention is directed some distance away from the crucial location. Such findings
are typically explained by assuming the focus of covert attention often differs from that of overt
attention. More generally, peripheral vision is often of more importance to the detection of magic
tricks than most people believe.
Change blindness underestimates visual processing
Ball and Busch (2015) distinguished between two types of change detection: (1) seeing the object that changed; (2) sensing there has been a change
without conscious awareness of which object has changed. Several coloured
objects were presented in pre- and post-change displays. If the post-change
display contained a colour not present in the pre-change display, observers often sensed change had occurred without being aware of what had
changed.
When observers show change blindness, it does not necessarily mean
there was no processing of the change. Ball et al. (2015) used object changes
where the two objects were semantically related (e.g., rail car changed to
rail) or unrelated (e.g., rail car changed to sausage). Use of event-related
potentials (ERPs; see Glossary) revealed a larger negative wave when the
objects were semantically unrelated even when observers exhibited change
blindness. Thus, there was much unconscious processing of the pre- and
post-change objects.
What causes change blindness?
There is no single (or simple) answer to the question “What causes change
blindness?”. However, two major competing theories both provide partial
answers. First, there is the attentional approach (e.g., Rensink et al., 1997).
According to this approach, change detection requires selective attention
to be focused on the object that changes. Attention is typically directed to
only a limited part of visual space, and changes in unattended objects are
unlikely to be detected.
Second, there is a theoretical approach emphasising the importance of
peripheral vision (Rosenholtz, 2017a,b; Sharan et al., 2016 unpublished).
It is based on the assumption that visual processing occurs in parallel
across the entire visual field (including peripheral vision). According to
this approach, “Peripheral vision is a limiting factor underlying standard
demonstrations of change blindness” (Sharan et al., 2016, p. 1).
Attentional approach
Change blindness often depends on attentional processes. We typically attend to regions of a visual scene likely to contain salient or important information. Spot the differences between the pictures in Figure 4.14. Observers took an average of 10.4 seconds with the first pair of pictures but only 2.6 seconds with the second pair (Rensink et al., 1997). The height of the railing is less important than the helicopter's position.
Hollingworth and Henderson (2002) recorded eye movements while observers viewed visual scenes (e.g., kitchen; living room). It was assumed the object fixated at any moment was the one being attended. There were two potential changes in each scene:
● Type change: an object was replaced by one from a different category (e.g., a plate replaced by a bowl).
● Token change: an object was replaced by an object from the same category (e.g., a plate replaced by a different plate).
Figure 4.14
(a) The object that is
changed (the railing)
undergoes a shift in
location comparable to
that of the object that is
changed (the helicopter) in
(b). However, the change
is much easier to see in
(b) because the changed
object is more important.
From Rensink et al. (1997).
Copyright 1997 by SAGE.
Reprinted by permission of
SAGE Publications.
What did Hollingworth and Henderson (2002) find? First, there was much
greater change detection when the changed object was fixated prior to
the change than when it was not fixated (see Figure 4.15a). Second, there
was change blindness for 60% of objects fixated prior to changing. Thus,
attention to the to-be-changed object was necessary (but not sufficient)
for change detection. Third, change detection was much greater when the
object type changed rather than simply token change because type changes
are more dramatic and obvious.
Evaluation
The attentional approach has various successes to its credit. First, change
detection is greater when target stimuli are salient or important and so
attract attention. Second, change detection is generally greater when the
to-be-changed object has been fixated (attended to) prior to the change.
Figure 4.15
(a) Percentage of correct change detection as a function of form of change (type vs token) and time of fixation (before vs after change); also false alarm rate when there was no change. (b) Mean percentage correct change detection as a function of the number of fixations between target fixation and change of target and form of change (type vs token).
Both from Hollingworth and Henderson (2002). Copyright 2002 American Psychological Association. Reproduced with permission.

What are the limitations with the attentional approach? First, the notion that narrow-focused attention determines our visual experience is
hard to reconcile with our strong belief that experience spans the entire
field of view (Cohen et al., 2016). Second, “A selective attention account is
hard to prove or disprove, as it relies on largely unknown attentional loci
as well as poorly understood effects of attention” (Sharan et al., 2016).
Third, change blindness is sometimes poorly predicted by the focus of
overt attention (indexed by eye fixations) (e.g., Smith et al., 2012; Kuhn
et al., 2016). Such findings are often explained by covert attention, but this
is typically not measured directly. Fourth, the attentional approach implies
incorrectly that very little useful information is extracted from visual areas
outside the focus of attention (see below).
KEY TERM

Visual crowding
The inability to recognise objects in peripheral vision due to the presence of neighbouring objects.
Peripheral vision approach
Visual acuity is greatest in the centre of the visual field (the fovea; see
Glossary). However, peripheral vision (all vision outside the fovea) typically
covers the great majority of the visual field (see Chapter 2). As Rosenholtz
(2016, p. 438) pointed out, it is often assumed “Peripheral vision is impoverished and all but useless”. This is a great exaggeration even though acuity
and colour perception are much worse in the periphery than the fovea. In
fact, peripheral vision is often most impaired by visual crowding: “identification of a peripheral object is impaired by nearby objects” (Pirkner &
Kimchi, 2017, p. 1) (see Chapter 5).
According to Sharan et al. (2016, p. 3), “The hypothesis that change
blindness may arise in part from limitations of peripheral vision is quite
different from usual explanations of the phenomenon [which attribute it to]
a mix of inattention and lack of details stored in memory.”
Sharan et al. (2016) tested the above hypothesis. Initially, they categorised change-detection tasks as easy, medium and hard on the basis of how
rapidly observers detected the change. Then they presented these tasks to
different observers who fixated at various degrees of visual angle (eccentricities) from the area that changed. There were two key findings:
(1) Change-detection performance was surprisingly good even when the change occurred well into peripheral vision.
(2) Peripheral vision plays a major role in determining change-detection performance – hard-to-detect changes require closer fixations than those that are easy to detect.

Figure 4.16
(a) Change-detection accuracy as a function of task difficulty (easy, medium, hard) and visual eccentricity in degrees. (b) The eccentricity at which change-detection accuracy was 85% correct as a function of task difficulty.
From Sharan et al. (2016).
Further evidence that much information is extracted from peripheral as well
as foveal vision was reported by Clarke and Mack (2015). On each trial,
two real-world scenes were presented with an interval of 1,500 ms between
them. When this interval was unfilled, only 11% of changes were detected.
However, when a cue indicating the location of a possible change was presented 0, 300 or 1,000 ms after the offset of the first scene, change-detection
rates were much higher. They were greatest in the 0 ms condition (29%) and
lowest in the 1,000 ms condition (18%). Thus, much information about the
first scene (including information from peripheral vision) was stored briefly
in iconic memory (see Glossary and Chapter 6).
If peripheral vision provides observers with general or gist information,
they might detect global scene changes without detecting precisely what
had changed. Howe and Webb (2014) obtained support for this prediction.
Observers were presented with an array of 30 discs (15 red, 15 green). On
24% of trials when three discs changed colour, observers detected the array
had changed but could not identify the discs involved.
Evaluation
The peripheral vision approach has proved successful in various ways.
First, visual information is often extracted from across the entire visual
field as predicted by this approach (but not the attentional approach). This
supports our strong belief that we perceive most of the immediate visual
environment. Second, this approach capitalises on established knowledge
concerning peripheral vision. Third, this approach has been applied successfully to explain visual-search performance (see Chapter 5).
What are the limitations of this approach? First, it de-emphasises attention’s role in determining change blindness, and does not provide a detailed
account of how attentional and perceptual processes are integrated. Second,
Sharan et al. (2016) discovered change detection was sometimes difficult
even though the change could be perceived easily in peripheral vision. This
indicates that other factors (as yet unidentified) are also involved. Third,
the approach does not consider failure to compare pre- and post-change
representations as a reason for change blindness (see below).
Comparison of pre- and post-change representations
Change blindness can occur because observers fail to compare their pre- and post-change representations. Angelone et al. (2003) presented a video
in which the identity of the central actor changed. On a subsequent line-up
task to identify the pre-change actor, observers showing change blindness
performed comparably to those showing change detection (53% vs 46%,
respectively).
Varakin et al. (2007) extended the above research in a real-world study
in which a coloured binder was switched for one of a different colour while
observers’ eyes were closed. Some observers exhibited change blindness even
though they remembered the colours of the pre- and post-change binders
and so had failed to compare the two colours. Other observers showing
change blindness had poor memory for the pre- and post-change colours
and so failed to represent these two pieces of information in memory.
KEY TERM

Serial dependence
Systematic bias of current visual perception towards recent visual input.

Is change blindness a defect?
Is change blindness an unfortunate defect? Fischer and Whitney (2014)
argued the answer is “No”. The visual world is typically relatively stable
over short time periods. As a result, it is worthwhile for us to sacrifice perceptual accuracy occasionally to ensure we have a continuous, stable
perception of our visual environment.
Fischer and Whitney (2014) supported their argument by finding the
perceived orientation of a grating was biased in the direction of a previously presented grating, an effect known as serial dependence. Manassi
et al. (2018) found serial dependence for an object’s location – when an
object that had been presented previously was re-presented, it was perceived as being closer to its original location than was actually the case.
Serial dependence probably involves several stages of visual perception and
may also involve memory processes (Bliss et al., 2017). In sum, the visual
system’s emphasis on perceptual stability inhibits our ability to detect
changes within the visual scene.
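Serial dependence can be captured by a very simple model in which the reported value is pulled towards the preceding input by a weight. The sketch below is our illustration (the weight value is an arbitrary assumption), not a published model.

```python
# A minimal model of serial dependence (our sketch; the weight w is an
# arbitrary assumption): each percept is pulled towards the previous input,
# trading momentary accuracy for a more stable percept over time.
def perceived(stimuli, w=0.25):
    """Return percepts biased towards the preceding stimulus (0 <= w < 1)."""
    out, prev = [], None
    for s in stimuli:
        p = s if prev is None else (1 - w) * s + w * prev
        out.append(p)
        prev = s            # the bias is towards the previous input
    return out

# Orientation jumps from 10 deg to 20 deg; the percept lags behind the change.
print(perceived([10, 10, 20, 20]))   # [10, 10.0, 17.5, 20.0]
```

On this reading, the lag is not a bug: smoothing over successive inputs buys perceptual stability at the cost of occasionally missing a genuine change.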
Inattentional blindness and its causes
The most famous study on inattentional blindness was reported by Simons
and Chabris (1999). In one condition, observers watched a video where students dressed in white (the white team) passed a ball to each other and the
observers counted the number of passes (see the video at www.simonslab.
com/videos.html). At some point, a woman in a black gorilla suit walks into
camera shot, looks at the camera, thumps her chest and then walks off (see
Figure 4.17). Altogether she is on screen for 9 seconds. Very surprisingly,
only 42% of observers noticed the gorilla! This is a striking example of inattentional blindness.
Why was performance so poor in the
above experiment? Simons and Chabris
(1999) obtained additional relevant evidence. In a second condition, observers
counted the number of passes made by students dressed in black. Here 83% of observers detected the gorilla’s presence. Thus,
observers were more likely to attend to
the gorilla when it resembled task-relevant
stimuli (i.e., in colour).
Figure 4.17
Frame showing a woman in a gorilla suit in the middle of a game of passing the ball.
From Simons & Chabris (1999). Figure provided by Daniel Simons, www.dansimons.com/www.theinvisiblegorilla.com.

It is generally assumed detection performance is good when observers count black team passes because of selective attention to black objects. Indeed, Rosenholtz et al. (2016) found that observers counting black team passes had eye fixations closer to the gorilla than those counting white team passes. However, Rosenholtz et al. also found that observers counting black team passes (but whose fixation patterns resembled those of observers counting white team passes) had unusually poor detection performance (54% compared to
a typical 80%). Thus, detection performance may depend on the strengths
and limitations of peripheral vision as well as failures of selective attention.
The presence of inattentional blindness can lead us to underestimate
the amount of processing of the undetected stimulus. Schnuerch et al.
(2016) found categorising attended stimuli was slower when the meaning
of an undetected stimulus conflicted with that of the attended stimulus.
Thus, the meaning of undetected stimuli was processed despite inattentional blindness. Other research using event-related potentials (reviewed by
Pitts, 2018) has shown that undetected stimuli typically receive moderate
processing.
How can we explain inattentional blindness? As we have seen, explanations often emphasise the role of selective attention or attentional set.
Simons and Chabris’ (1999) findings indicate the importance of similarity
in stimulus features (e.g., colour) between task stimuli and the unexpected
object. However, Most (2013) argued that similarity in semantic category
is also important. Participants tracked numbers or letters. On the critical
trial, an unexpected stimulus (the letter E or number 3) was visible for
7 seconds. The letter and number were visually identical except they were
mirror images of each other.
What did Most (2013) find? There was much less inattentional blindness when the unexpected stimulus belonged to the same category as the
tracked objects. Thus, inattentional blindness can depend on attentional
sets based on semantic categories (e.g., letters; numbers).
Légal et al. (2017) investigated the role of demanding top-down attentional processes in producing inattentional blindness using Simons and
Chabris’ (1999) gorilla video. Some observers counted the passes made by
the white team (standard task) whereas others had the more attentionally
demanding task of counting the number of aerial passes as well as total
passes. As predicted, there was much more evidence of inattentional blindness (i.e., failing to detect the gorilla) when the task was more demanding.
Légal et al. (2017) reduced inattentional blindness in other conditions
by presenting detection-relevant words subliminally (e.g., identify; notice)
to observers prior to watching the video. This increased detection rates for
the gorilla in the standard task condition from 50% to 83%. Overall, the
findings indicate that manipulating attentional processes can have powerful
effects on inattentional blindness.
Compelling evidence that inattentional blindness depends on top-down
processes that strongly influence what we expect to see was reported by
Persuh and Melara (2016). Observers fixated a central dot followed by the
presentation of two coloured squares and decided whether the colours were
the same. On the critical trial, the dot was replaced by Barack Obama’s
face (see Figure 4.18). Amazingly, 60% of observers failed to detect this
unexpected stimulus presented in foveal vision: Barack Obama blindness.
Of these observers, a below-chance 8% identified Barack Obama when
deciding whether the unexpected stimulus was Angelina Jolie, a lion’s
head, an alarm clock or Barack Obama (see Figure 4.18).
Persuh and Melara’s (2016) findings are dramatic because they indicate inattentional blindness can occur even when the novel stimulus is presented on its own with no competing stimuli. These findings suggest there
are important differences in the processes underlying inattentional blindness and change blindness: the latter often depends on visual crowding (see Glossary), which is totally absent in Persuh and Melara's study.

Figure 4.18
The sequence of events on the initial baseline trials and the critical trial.
From Persuh and Melara (2016).
Evaluation
Several factors influencing inattentional blindness have been identified.
These factors include the similarity (in terms of stimulus features and
semantic category) between task stimuli and the unexpected object; the
attentional demands of the task; and observers’ expectations concerning
what they will see. If there were no task requiring attentional resources and
creating expectations, there would undoubtedly be very little inattentional
blindness (Jensen et al., 2011).
What are the limitations of research in this area? First, it is typically
unclear whether inattentional blindness is due to perceptual failure or to
memory failure (i.e., the unexpected object is perceived but rapidly forgotten). However, Ward and Scholl (2015) found that observers showed inattentional blindness even when they were instructed to report immediately anything unexpected they saw. This finding strongly suggests that inattentional blindness reflects deficient perception rather than
memory failure.
Second, observers typically engage in some processing of undetected
stimuli even when they fail to report the presence of such stimuli (Pitts,
2018). More research is required to clarify the extent of non-conscious processing of undetected stimuli.
Third, it is likely that the various factors influencing inattentional
blindness interact in complex ways. However, most research has considered only a single factor and so the nature of such interactions has not
been established.
CHAPTER SUMMARY
• Introduction. The time dimension is very important in visual
perception. The changes in visual perception produced as we
move around the environment and/or environmental objects move
promote accurate perception and facilitate appropriate actions.
• Direct perception. Gibson argued perception and action are
closely intertwined and so research should not focus exclusively on
static observers perceiving static visual displays. According to his
direct theory, an observer’s movement creates optic flow providing
useful information about the direction of heading. Invariants,
which are unchanged as people move around their environment,
have particular importance. Gibson claimed the uses of objects (their affordances) are perceived directly. However, he underestimated the complexity of visual processing and minimised the role of object knowledge in visual perception; the effects of motion on perception are also more complex than he realised.
• Visually guided movement. The perception of heading depends
in part on optic-flow information. However, there are complexities
because the retinal flow field is determined by eye and head
movements as well as by optic flow. Heading judgements are also
influenced by binocular disparity and the retinal displacement of
objects as we approach them.
Accurate steering on curved paths (e.g., driving around a
bend) sometimes involves focusing on the tangent point (e.g.,
point on the inside edge of the road at which its direction seems
to reverse). However, drivers sometimes fixate a point along the
future path. More generally, drivers’ gaze patterns are flexibly
determined by control mechanisms that are responsive to their
goals.
Calculating time to contact with an object often involves
calculating tau (the size of the retinal image divided by the
object’s rate of expansion). Drivers often use tau-dot (rate of
decline of tau over time) to decide whether there is sufficient
braking time to stop before contact. Observers often make use of
additional sources of information (e.g., binocular disparity; familiar
size; relative size) when working out time to contact. Drivers’
braking decisions also depend on their preferred margin of safety
and the effectiveness of the car’s braking system.
• Visually guided action: contemporary approaches. The planning–control model distinguishes between a slow planning system used
mostly before the initiation of movement and a fast control system
used during movement execution. As predicted, separate brain
areas are involved in planning and control. However, the definition
of “planning” is very broad, and the notion that planning always
precedes control is oversimplified. Recent evidence indicates
that visually guided action depends on three processing streams (dorso-dorsal; ventro-dorsal; and ventral, which is discussed more fully in Chapter 2), each making a separate contribution. This
theoretical approach is supported by studies on brain-damaged
patients and by neuroimaging research.
• Perception of human motion. Human motion is perceived even
when only impoverished visual information is available. Perception
of human and biological motion involves bottom-up and top-down
processes with the latter most likely to be used with degraded
visual input. The perception of human motion is special because
we can produce as well as perceive human actions and because
we devote considerable time to making sense of it.
It has often been assumed that our ability to imitate and
understand human motion depends on a mirror neuron system
(an extensive brain network). This system’s causal involvement in
action perception and understanding has been shown in research
on brain-damaged patients and studies using techniques to alter
its neural activity. The mirror neuron system is especially important
in the understanding of relatively simple actions. However,
additional high-level cognitive processes are often required if
action understanding is complex or involves generalising from past
experience.
• Change blindness. There is convincing evidence for change
blindness and inattentional blindness. Change blindness depends
on attentional processes: it occurs more often when the changed
object does not receive attention. However, change blindness
can occur for objects that are fixated and it also depends on the
limitations of peripheral vision. The visual system’s emphasis on
continuous, stable perception probably plays a part in making us
susceptible to change blindness. Inattentional blindness depends
very strongly on top-down processes (e.g., selective attention) and
can be found even when only the novel stimulus is present in the
visual field.
FURTHER READING
Binder, E., Dovern, A., Hesse, M.D., Ebke, M., Karbe, H., Salinger, J. et al.
(2017). Lesion evidence for a human mirror neuron system. Cortex, 90, 125–137.
Ellen Binder and colleagues discuss the nature of the mirror neuron system based
on evidence from brain-damaged patients.
Keysers, C., Paracampo, R. & Gazzola, V. (2018). What neuromodulation and
lesion studies tell us about the function of the mirror neuron system and embodied cognition. Current Opinion in Psychology, 24, 35–40. This article provides a
succinct account of our current understanding of the mirror neuron system.
Lappi, O. & Mole, C. (2018). Visuo-motor control, eye movements, and steering: A
unified approach for incorporating feedback, feedforward, and internal models.
Psychological Bulletin, 144, 981–1001. Otto Lappi and Callum Mole provide
a comprehensive theoretical account of driving behaviour that emphasises the
importance of top-down control mechanisms in influencing drivers’ eye fixations.
Osiurak, F., Rossetti, Y. & Badets, A. (2017). What is an affordance? 40 years
later. Neuroscience and Biobehavioral Reviews, 77, 403–417. François Osiurak and colleagues discuss Gibson's notion of affordances in the context of contemporary research and theory.
Rosenholtz, R. (2017a). What modern vision science reveals about the awareness
puzzle: Summary-statistic encoding plus decision limits underlie the richness of
visual perception and its quirky failures. Vision Sciences Society Symposium on
Summary Statistics and Awareness, preprint arXiv:1706.02764. Ruth Rosenholtz
provides an excellent account of the role played by peripheral vision in change
blindness and other phenomena.
Sakreida, K., Effnert, I., Thill, S., Menz, M.M., Jirak, D., Eickhoff, C.R. et al.
(2016). Affordance processing in segregated parieto-frontal dorsal stream
sub-pathways. Neuroscience and Biobehavioral Reviews, 69, 80–112. The pathways within the brain involved in goal-directed interactions with objects are discussed in the context of a meta-analytic review.
Chapter 5

Attention and performance
INTRODUCTION
Attention is invaluable in everyday life. We use attention to avoid being
hit by cars when crossing the road, to search for missing objects and to
perform two tasks together. The word “attention” has various meanings
but typically refers to selectivity of processing as emphasised by William
James (1890, pp. 403–404):
Attention is . . . the taking into possession of the mind, in clear and
vivid form, of one out of what seem several simultaneously possible
objects or trains of thought. Focalisation, concentration, of consciousness are of its essence.
KEY TERMS

Focused attention
A situation in which individuals try to attend to only one source of information while ignoring other stimuli; also known as selective attention.

Divided attention
A situation in which two tasks are performed at the same time; also known as multi-tasking.
William James distinguished between “active” and “passive” modes of
attention. Attention is active when controlled in a top-down way by the
individual’s goals or expectations. In contrast, attention is passive when
controlled in a bottom-up way by external stimuli (e.g., a loud noise). This
distinction remains theoretically important (e.g., Corbetta & Shulman,
2002; see discussion, pp. 192–196).
Another important distinction is between focused and divided attention. Focused attention (or selective attention) is studied by presenting individuals with two or more stimulus inputs at the same time and
instructing them to respond to only one. Research on focused or selective
attention tells us how effectively we can select certain inputs and avoid
being distracted by non-task inputs. It also allows us to study the selection
process and the fate of unattended stimuli.
Divided attention is also studied by presenting at least two stimulus
inputs at the same time. However, individuals are instructed they must
attend (and respond) to all stimulus inputs. Divided attention is also
known as multi-tasking (see Glossary). Studies of divided attention provide
useful information about our processing limitations and the capacity of
our attentional mechanisms.
There is a final important distinction (the last one, I promise you!)
between external and internal attention. External attention is “the selection and modulation of sensory information” (Chun et al., 2011). In contrast, internal attention is “the selection, modulation, and maintenance of
internally generated information, such as task rules, responses, long-term
memory, or working memory” (Chun et al., 2011, p. 73). The connection to Baddeley’s working memory model is especially important (e.g., Baddeley, 2012; see Chapter 6). The central executive component of
working memory is involved in attentional control and is crucially involved
in internal and external attention.
Much attentional research has two limitations. First, the emphasis is
on attention to externally presented task stimuli rather than internally generated stimuli (e.g., worries; self-reflection). One reason is that it is easier to
assess and to control external attention. Second, what participants attend
to is determined by the experimenter’s instructions. In contrast, what we
attend to in the real world is mostly determined by our current goals and
emotional states.
Two important topics related to attention are discussed elsewhere.
Change blindness (see Glossary), which shows the close links between attention and perception, is considered in Chapter 4. Consciousness (including
its relationship to attention) is discussed in Chapter 16.
KEY TERMS

Cocktail party problem
The difficulties involved in attending to one voice when two or more people are speaking at the same time.

Dichotic listening task
A different auditory message is presented to each ear and attention has to be directed to one message.

Shadowing
Repeating one auditory message word for word as it is presented while a second auditory message is also presented; it is used on the dichotic listening task.
FOCUSED AUDITORY ATTENTION
Many years ago, British scientist Colin Cherry (1953) became fascinated
by the cocktail party problem – how can we follow just one conversation
when several people are talking at once? As we will see, there is no simple
answer.
McDermott (2009) identified two problems listeners face when attending to one voice among many. First, there is sound segregation: the listener
must decide which sounds belong together. This is complex: machine-based
speech recognition programs often perform poorly when attempting to
achieve sound segregation with several sound sources present together
(Shen et al., 2008). Second, after segregation has been achieved, the listener
must direct attention to the sound source of interest and ignore the others.
McDermott (2009) pointed out that auditory segmentation is often
harder than visual segmentation (deciding which visual features belong
to which objects; see Chapter 3). There is considerable overlap of signals
from different sound sources in the cochlea whereas visual objects typically
occupy different retinal regions.
There is another important issue – when listeners attend to one auditory input, how much processing is there of the unattended input(s)? As we
will see, various answers have been proposed.
Cherry (1953) addressed the issues discussed so far (see Eysenck, 2015,
for an evaluation of his research). He studied the cocktail party problem
using a dichotic listening task in which a different auditory message was
presented to each ear and the listener attended to only one. Listeners
engaged in shadowing (repeating the attended message aloud as it was presented) to ensure their attention was directed to that message. However,
the shadowing task has two potential disadvantages: (1) listeners do not
normally engage in shadowing and so the task is artificial; and (2) it
increases listeners’ processing demands.
Listeners solved the cocktail party problem by using differences between
the auditory inputs in physical features (e.g., sex of speaker; voice intensity;
speaker location). When these physical differences were eliminated by presenting two messages in the same voice to both ears at once, listeners found
it very hard to separate out the messages based on differences in meaning.
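Cherry's observation that selection rides on physical features rather than meaning can be expressed as a filter over feature-tagged streams. The toy sketch below is our illustration; the feature names and values are assumptions made for the example, not part of Cherry's procedure.

```python
# Toy illustration (ours) of selection by physical features, in the spirit of
# Cherry (1953): streams are tagged with speaker sex, location and intensity,
# and selection keys on those features rather than on meaning.
streams = [
    {"speaker": "female", "location": "left",  "intensity_db": 62, "words": "the market fell"},
    {"speaker": "male",   "location": "right", "intensity_db": 60, "words": "my holiday plans"},
]

def select(streams, **wanted):
    """Keep only streams whose physical features match the wanted values."""
    return [s for s in streams
            if all(s.get(k) == v for k, v in wanted.items())]

print(select(streams, speaker="male"))   # easy: a physical cue separates the streams
# With identical voices presented to both ears there is no physical cue to key
# on, and selection would have to rely on meaning, which listeners find very hard.
```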
Cherry (1953) found very little information seemed to be extracted
from the unattended message. Listeners seldom noticed when it was spoken
backwards or in a foreign language. However, physical changes (e.g., a
pure tone) were nearly always detected. The conclusion that unattended
information receives minimal processing was supported by Moray (1959),
who found listeners remembered very few words presented 35 times each.
Where is the bottleneck? Early vs late selection
Many psychologists have argued we have a processing bottleneck (discussed below). A bottleneck in the road (e.g., where it is especially narrow) can cause traffic congestion, and a bottleneck in the processing system seriously limits our ability to process two (or more) simultaneous inputs. However, a bottleneck would sometimes help to solve the cocktail party problem by permitting listeners to process only the desired voice.
Where is the bottleneck? Broadbent (1958) argued a filter (bottleneck)
early in processing allows information from one input or message through
it based on the message’s physical characteristics. The other input remains
briefly in a sensory buffer and is rejected unless attended to rapidly (see
Figure 5.1). Thus, Broadbent argued there is early selection.
Treisman (1964) argued the bottleneck’s location is more flexible than
Broadbent suggested (see Figure 5.1). She claimed listeners start with processing based on physical cues, syllable pattern and specific words and then
process grammatical structure and meaning. Later processes are omitted or
attenuated if there is insufficient processing capacity to permit full stimulus
analysis.
Treisman (1964) also argued top-down processes (e.g., expectations)
are important. Listeners performing the shadowing task sometimes say a
word from the unattended input. Such breakthroughs mostly occur when
the word on the unattended channel is highly probable in the context of
the attended message.
Deutsch and Deutsch (1963) argued all stimuli are fully analysed, with
the most important or relevant stimulus determining the response. Thus,
they placed the bottleneck much later in processing than did Broadbent
(see Figure 5.1).
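The three theories differ mainly in where selection acts and in whether the unattended input is blocked, attenuated or fully analysed. The toy sketch below (our illustration, not drawn from any of the original papers) expresses that contrast in a single pipeline.

```python
# Toy comparison of the three classic bottleneck architectures (our sketch):
# every channel undergoes physical analysis; the models differ in whether the
# unattended channel's semantic analysis is blocked, attenuated or complete.
def process(channels, attended, model="broadbent"):
    """Return per-channel outcomes for a dict {channel_name: message}."""
    results = {}
    for name, message in channels.items():
        physical = f"physical features of '{message}'"   # always extracted
        if model == "broadbent":      # early selection: filter blocks the rest
            semantic = message if name == attended else None
        elif model == "treisman":     # attenuation: unattended analysed weakly
            strength = 1.0 if name == attended else 0.2
            semantic = (message, strength)
        else:                         # deutsch: full analysis, late selection
            semantic = message        # selection happens only at response
        results[name] = (physical, semantic)
    return results

inputs = {"left": "own name", "right": "lecture"}
for m in ("broadbent", "treisman", "deutsch"):
    print(m, process(inputs, attended="right", model=m))
```

Run on the same inputs, the Broadbent pipeline discards the unattended message's meaning entirely, the Treisman pipeline retains it in weakened form (which is why a highly significant word such as one's own name can still break through), and the Deutsch and Deutsch pipeline analyses it fully.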
Findings: unattended input
Broadbent’s approach predicts little or no processing of unattended auditory messages. In contrast, Treisman’s approach suggests flexibility in
the processing of unattended messages, whereas Deutsch and Deutsch’s
approach implies reasonably thorough processing of such messages.
Relevant findings are discussed below.
Treisman and Riley (1969) asked listeners to shadow one of two auditory messages. Listeners stopped shadowing and tapped when they detected
a target in either message. Many more target words were detected on the
shadowed message.
Aydelott et al. (2015) asked listeners to perform a task on attended
target words. When unattended words related in meaning to the targets were presented shortly beforehand, performance on the target words was enhanced, provided the unattended words were presented as loudly as the attended ones. Thus, the meaning of unattended words was processed.
There is often more processing of unattended words that have a special
significance for the listener. For example, Li et al. (2011) obtained evidence
that unattended weight-related words (e.g., fat; chunky) were processed
more thoroughly by women dissatisfied with their weight. Conway et al.
(2001) found listeners often detected their own name on the unattended
message. This was especially the case if they had low working memory
capacity (see Glossary) indicative of poor attentional control.
Coch et al. (2005) asked listeners to attend to one of two auditory
inputs and to detect targets presented on either input. Event-related potentials (ERPs; see Glossary) provided a measure of processing activity. ERPs
100 ms after target presentation were greater when the target was presented
on the attended rather than the unattended message. This suggests there
was more processing of the attended than unattended targets.
Greater brain activation for attended than unattended auditory stimuli
may reflect enhanced processing for attended stimuli and/or suppressed
processing for unattended stimuli. Horton et al. (2013) addressed this issue. Listeners heard separate speech messages presented to each ear with instructions to attend to the left or right ear. There was greater brain activation associated with the attended message (especially around 90 ms after stimulus presentation). This difference depended on enhancement of the attended message combined with suppression of the unattended message.

Figure 5.1
A comparison of Broadbent’s theory (top), Treisman’s theory (middle), and Deutsch and Deutsch’s theory (bottom).
Classic theories of selective auditory attention (those of Broadbent,
Treisman, and Deutsch and Deutsch) de-emphasised the importance of
suppression or inhibition of the unattended message shown by Horton
et al. (2013). For example, Schwartz and David (2018) reported suppression of neuronal responses in the primary auditory cortex to distractor
sounds. More generally, all the classic theories de-emphasise the flexibility
of selective auditory attention and the role of top-down processes in selection (see below).
Findings: cocktail party problem
Humans are generally very good at separating out and understanding one
voice from several speaking at the same time (i.e., solving the cocktail party
problem). The extent of this achievement is indicated by the finding that
automatic speech recognition systems are considerably inferior to human
speech recognition (Spille & Meyer, 2014).
Mesgarani and Chang (2012) studied listeners with implanted multi-electrode arrays permitting the direct recording of activity within the auditory cortex. Listeners heard two different messages (one in a male voice; one
in a female voice) presented to the same ear with instructions to attend to
only one. The responses within the auditory cortex revealed “The salient
spectral [based on sound frequencies] and temporal features of the attended
speaker, as if subjects were listening to that speaker alone” (Mesgarani &
Chang, 2012, p. 233).
Listeners found it easy to distinguish between the two messages in the
study by Mesgarani and Chang (2012) because they differed in physical
characteristics (i.e., male vs female voice). Olguin et al. (2018) presented
native English speakers with two messages in different female voices. The
attended message was always in English whereas the unattended message
was in English or an unknown language. Comprehension of the attended
message was comparable in both conditions. However, there was stronger neural encoding of both messages when the unattended message was in English. As Olguin et al.
concluded, “The results offer strong support to flexible accounts of selective [auditory] attention” (p. 1618).
In everyday life, we are often confronted by several different speech
streams. Accordingly, Puvvada and Simon (2017) presented three speech
streams and assessed brain activity as listeners attended to only one. Early
in processing, “the auditory cortex maintains an acoustic representation of
the auditory scene with no significant preference to attended over ignored
sources” (p. 9195). Later in processing, “Higher-order auditory cortical
areas represent an attended speech stream separately from, and with significantly higher fidelity [accuracy] than, unattended speech streams” (p. 9189).
This latter finding results from top-down processes (e.g., attention).
How do we solve the cocktail party problem? The importance of
top-down processes is suggested by the existence of extensive descending
pathways from the auditory cortex to brain areas involved in early auditory
processing (Robinson & McAlpine, 2009). Various top-down factors based
on listeners’ knowledge and/or expectations are involved. For example, listeners are more accurate at identifying what one speaker is saying in the
context of several other voices if they have previously heard that speaker’s
voice in isolation (McDermott, 2009).
Woods and McDermott (2018) investigated top-down processes in
selective auditory attention in more detail. They argued, “Sounds produced by a given source often exhibit consistencies in structure that might
be useful in separating sources” (p. E3313). They used the term “schemas”
to refer to such structural consistencies.
Listeners showed clear evidence of schema learning leading to rapid
improvements in their listening performance. An important aspect of such
learning is temporal coherence – a given source’s sound features are typically all present when it is active and absent when it is silent. Shamma
et al. (2011) discussed research showing that if listeners can identify one
distinctive feature of the target voice, they can then distinguish its other
sound features via temporal coherence.
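The logic of temporal coherence can be illustrated with a brief sketch (ours, not Shamma et al.'s model): features whose activation envelopes rise and fall together are grouped as belonging to one source. The feature names, the noise level and the correlation threshold are assumptions chosen purely for illustration.

import numpy as np

def coherence_groups(envelopes, threshold=0.8):
    # envelopes: dict mapping feature name -> 1-D activation time series.
    # Features whose envelopes correlate strongly are grouped together.
    groups = []
    for name in envelopes:
        for group in groups:
            r = np.corrcoef(envelopes[name], envelopes[group[0]])[0, 1]
            if r >= threshold:  # waxes and wanes with the group's anchor
                group.append(name)
                break
        else:
            groups.append([name])  # start a new putative source
    return groups

t = np.linspace(0, 1, 200)
talker_on = (np.sin(2 * np.pi * 3 * t) > 0).astype(float)  # active/silent
envelopes = {
    "pitch_A": talker_on + 0.05 * np.random.rand(200),
    "timbre_A": talker_on + 0.05 * np.random.rand(200),
    "pitch_B": (1 - talker_on) + 0.05 * np.random.rand(200),
}
print(coherence_groups(envelopes))  # [['pitch_A', 'timbre_A'], ['pitch_B']]

Once one distinctive feature of the target voice is identified (here, pitch_A), every feature that shares its on–off pattern is captured with it, which is the essence of Shamma et al.'s proposal.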
Evans et al. (2016) compared patterns of brain activity when attended
speech was presented on its own or together with competing unattended
speech. Brain areas associated with attentional and control processes (e.g.,
frontal and parietal regions) were more activated in the latter condition.
Thus, top-down processes relating to attention and control are important
in selective auditory processing.
Finally, Golumbic et al. (2013) suggested individuals at actual cocktail
parties can potentially use visual information to assist them in understanding what a given speaker is saying. Listeners heard two simultaneous messages (one in a male voice and the other in a female voice). Processing of
the attended message was enhanced when they saw a video of the speaker
talking.
In sum, listeners generally achieve the complex task of selecting one
speech message from among several such messages. There has been progress in identifying the top-down processes involved. For example, if listeners can identify at least one consistently distinctive feature of the target
voice, this makes it easier for them to attend only to that voice. Top-down
processes often produce a “winner-takes-all” situation where the processing
of one auditory input (the winner) suppresses the brain activity ­associated
with all other inputs (Kurt et al., 2008).
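A winner-takes-all rule of this kind is easily sketched (our illustration; the suppression factor is arbitrary): the strongest input keeps its activation while all competitors are scaled down.

def winner_takes_all(activations, suppression=0.1):
    # The strongest input keeps its activation; competitors are scaled down.
    winner = max(activations, key=activations.get)
    return {name: act if name == winner else act * suppression
            for name, act in activations.items()}

print(winner_takes_all({"voice_A": 0.9, "voice_B": 0.7, "voice_C": 0.4}))
# voice_A keeps 0.9; the others fall to roughly 0.07 and 0.04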
FOCUSED VISUAL ATTENTION
There has been much more research on visual attention than auditory attention. The main reason is that vision is our most important sense modality
with more of the cortex devoted to it than any other sense. Here we consider four key issues. First, what is focused visual attention like? Second,
what is selected in focused visual attention? Third, what happens to unattended visual stimuli? Fourth, what are the major systems involved in visual
attention? In the next section (see pp. 196–200), we discuss what the study
of visual disorders has taught us about visual attention.
Spotlight, zoom lens or multiple spotlights?

KEY TERM
Split attention: Allocation of attention to two (or more) non-adjacent regions of visual space.
Look around you and attend to any interesting objects. Was your visual
attention like a spotlight? A spotlight illuminates a fairly small area, little
can be seen outside its beam and it can be redirected to focus on any given
object. Posner (1980) argued the same is true of visual attention.
Other psychologists (e.g., Eriksen & St. James, 1986) claim visual
attention is more flexible than suggested by the spotlight analogy and
argue visual attention resembles a zoom lens. We can increase or decrease
the area of focal attention just as a zoom lens can be adjusted to alter the
visual area it covers. This makes sense. For example, car drivers often need
to narrow their attention after spotting a potential hazard.
A third theoretical approach is even more flexible. According to the
multiple spotlights theory (Awh & Pashler, 2000), we sometimes exhibit
split attention (attention directed to two or more non-adjacent regions
in space). The notion of split attention is controversial. Jans et al. (2010)
argued attention is often strongly linked to motor action and so attending
to two separate objects might disrupt effective action. However, there is no
strong evidence for such disruption.
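The three metaphors can be expressed as different attentional-gain profiles over space, as in the sketch below. This is our illustration only: the assumption that a wider focus dilutes peak gain follows zoom-lens theory's fixed-capacity idea, but all the numbers are arbitrary.

import numpy as np

def gaussian(x, centre, width):
    return np.exp(-((x - centre) ** 2) / (2 * width ** 2))

x = np.linspace(-10, 10, 401)
spotlight = gaussian(x, 0, 1.0)                      # small fixed beam
zoom_wide = gaussian(x, 0, 4.0) / 4.0                # wider beam, diluted gain
split = gaussian(x, -5, 1.0) + gaussian(x, 5, 1.0)   # two non-adjacent foci

mid = len(x) // 2   # the location midway between the two cued positions
print(round(spotlight[mid], 2), round(zoom_wide[mid], 2))  # 1.0 vs 0.25
print(round(split[mid], 3))  # ~0.0: little gain between the two foci

Note that the split profile assigns almost no gain to the midpoint between the two foci, which is exactly the pattern Awh and Pashler (2000) reported (discussed below).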
Findings
Support for the zoom-lens model was reported by Müller et al. (2003). On
each trial, observers saw four squares in a semi-circle and were cued to
attend to one, two or all four. Four objects were then presented (one in
each square) and observers decided whether a target (e.g., a white circle)
was among them. Brain activation in early visual areas was most widespread when the attended region was large (i.e., attend to all four squares)
and was most limited when it was small (i.e., attend to one square). As predicted by the zoom-lens theory, performance (reaction times and errors)
was best with the smallest attended region and worst with the largest one.
Chen and Cave (2016, p. 1822) argued the optimal attentional zoom
setting “includes all possible target locations and excludes possible distractor locations”. Most findings indicated people’s attentional zoom setting
is close to optimal. However, Collegio et al. (2019) obtained contrary
findings. Drawings of large objects (e.g., jukebox) and small objects (e.g.,
watch) were presented so their retinal size was the same. The observer’s
area of focal attention was greater with large objects because they made
top-down inferences concerning their real-world sizes. As a result, the area
of focal attention was larger than optimal for large objects.
Goodhew et al. (2016) pointed out that nearly all research has focused
only on spatial perception (e.g., identification of a specific object). They
focused on temporal perception (was a disc presented continuously or
were there two presentations separated by a brief interval?). Spotlight size
had no effect on temporal acuity, which is inconsistent with zoom-lens theory.
How can we explain these findings? Spatial resolution is poor in peripheral
vision but temporal resolution is good. As a consequence, a small attentional spotlight is more beneficial for spatial than temporal acuity.
We turn now to split attention. Suppose you had to identify two digits
that would probably be presented to two cued locations a little way apart
(see Figure 5.2a). Suppose also that on some trials a digit was presented between the two cued locations. According to zoom-lens theory, the area of maximal attention should include the two cued locations and the space in between. As a result, the detection of digits presented in the middle should have been very good. In fact, Awh and Pashler (2000) found it was poor (see Figure 5.2b). Thus, attention can resemble multiple spotlights, as predicted by the split-attention approach.

Figure 5.2
(a) Shaded areas indicate the cued locations; the near and far locations are not cued. (b) Probability of target detection at valid (left or right) and invalid (near or far) locations.
Based on information in Awh and Pashler (2000).
Morawetz et al. (2007) presented letters and digits at five locations
simultaneously (one in each quadrant of the visual field and one in the
centre). In one condition, observers attended to the visual stimuli at the
upper left and bottom right locations and ignored the other stimuli. There
were two peaks of brain activation corresponding to the attended areas but
less activation corresponding to the region in between. Overall, the pattern
of activation strongly suggested split attention.
Niebergall et al. (2011) recorded the neuronal responses of monkeys
attending to two moving stimuli while ignoring a distractor. In the key
condition, there was a distractor between (and close to) the two attended
stimuli. In this condition, neuronal responses to the distractor decreased
compared to other conditions. Thus, split attention involves a mechanism reducing attention to (and processing of) distractors located between
attended stimuli.
In most research demonstrating split attention, the two non-adjacent
stimuli being attended simultaneously were each presented to a different
hemifield (one half of the visual field). Note that the right hemisphere
receives visual signals from the left hemifield and the left hemisphere
receives signals from the right hemifield. Walter et al. (2016) found performance was better when non-adjacent stimuli were presented to different
hemifields rather than the same hemifield. Of most importance, the assessment of brain activity indicated effective filtering or inhibition of stimuli
presented between the two attended stimuli only when presented to different hemifields.
In sum, we can use visual attention very flexibly. Visual selective attention can resemble a spotlight, a zoom lens or multiple spotlights, depending
on the current situation and the observer’s goals. However, split attention
may require that two stimuli are presented to different hemifields rather
than the same one. A limitation with all these theories is that metaphors
(e.g., attention is a zoom lens) are used to describe experimental findings but
these metaphors fail to specify the underlying mechanisms (Di Lollo, 2018).
KEY TERM
Hemifield: One half of the visual field. Information from the left hemifield of each eye proceeds to the right hemisphere and information from the right hemifield proceeds to the left hemisphere.
What is selected?
Why might selective attention resemble a spotlight or zoom lens? Perhaps
we selectively attend to an area or region of space: space-based attention.
Alternatively, we may attend to a given object or objects: object-based
attention. Object-based attention is prevalent in everyday life because visual
attention is mainly concerned with objects of interest to us (see Chapters 2
and 3). As expected, observers’ eye movements as they view natural scenes
are directed almost exclusively to objects (Henderson & Hollingworth,
1999). However, even though we typically focus on objects of potential
importance, our attentional system is so flexible we can attend to an area of
space or a given object.
There is also feature-based attention. For example, suppose you are
looking for a friend in a crowd. Since she nearly always wears red clothes,
you might attend to the feature of colour rather than specific objects or
locations. Leonard et al. (2015) asked observers to identify a red letter
within a series of rapidly presented letters. Performance was impaired when
a # symbol also coloured red was presented very shortly before the target.
Thus, there was evidence for feature-based attention (e.g., colour; motion).
Findings
Visual attention is often object-based. For example, O’Craven et al. (1999)
presented observers with two stimuli (a face and a house), transparently
overlapping at the same location, with instructions to attend to one of
them. Brain areas associated with face processing were more activated
when the face was attended to than when the house was. Similarly, brain
areas associated with house processing were activated when the house was
the focus of attention.
Egly et al. (1994) devised a much-used method for comparing object-based and space-based attention (see Figure 5.3). The task was to select
a target stimulus as rapidly as possible. A cue presented before the target
was valid (same location as the target) or
invalid (different location from the target).
Of key importance, invalid cues were in the
same object as the target (within-object cues)
or in a different object (between-object cues).
The key finding was that target detection was
faster on invalid trials when the cue was in the
same object rather than a different one. Thus,
attention was at least partly object-based.
Does object-based attention in the Egly
et al. (1994) task occur fairly “automatically”
or does it involve strategic processes? Object-based attention should always be found if it is automatic. Drummond and Shomstein (2010) found no evidence for object-based attention when the cue indicated with 100% certainty where the target would appear. Thus, any preference for object-based attention can be overridden when appropriate.

Figure 5.3
Stimuli adapted from Egly et al. (1994). Participants saw two rectangles and a cue indicated the most likely location of a subsequent target. The target appeared at the cued location (V), at the uncued end of the cued rectangle (IS) or at the uncued, equidistant end of the uncued rectangle (ID).
From Chen (2012). © Psychonomic Society, Inc. Reprinted with permission from Springer.
Hollingworth et al. (2012) found evidence object-based and space-based
attention can occur at the same time using a task resembling that of Egly
et al. (1994). There were three types of within-object cues varying in the
distance between the cue and subsequent target (see Figure 5.4). There was
evidence for object-based attention: when the target was far from the cue,
performance was worse when the cue was in a different object rather than
the same one. There was also evidence for space-based attention: when the
target was in the same object as the cue, performance declined the greater
the distance between target and cue. Thus, object-based and space-based
attention are not mutually exclusive.
Similar findings were reported by Kimchi et al. (2016). Observers
responded faster to a target presented within rather than outside an object.
This indicates object-based attention. There was also evidence for space-based attention: when targets were presented outside the object, observers responded faster when they were close to it. Kimchi et al. concluded that “object-related and space-related attentional processing can operate simultaneously” (p. 48).
Pilz et al. (2012) compared object-based and space-based attention
using various tasks. Overall, there was much more evidence of space-based
than object-based attention, with only a small fraction of participants
showing clear-cut evidence of object-based attention.
Donovan et al. (2017) noted that most studies indicating visual attention is object-based have used spatial cues, which may bias the allocation
of attention. Donovan et al. avoided the use of spatial cues and found
“Object-based representations do not guide attentional selection in the
absence of spatial cues” (p. 762). This finding suggests previous research has exaggerated the extent of object-based visual attention.

KEY TERM
Inhibition of return: A reduced probability of visual attention returning to a recently attended location or object.

When we search the visual environment, it would be inefficient if we repeatedly attended to any given location. In fact, we exhibit inhibition of return (a reduced probability of returning to a region recently the focus of attention). Of theoretical importance is whether inhibition of return applies more to locations or objects. The evidence is mixed (see Chen, 2012). List and Robertson (2007) used Egly et al.’s (1994) task (see Figure 5.3) and found location- or space-based inhibition of return was much stronger than object-based inhibition of return.

Figure 5.4
(a) Possible target locations (same object far, same object near, valid, different object far) for a given cue. (b) Performance accuracy at the various target locations.
From Hollingworth et al. (2012). © 2011 American Psychological Association.
Theeuwes et al. (2014) found location- and object-based inhibition of
return were both present at the same time. According to Theeuwes et al.
(p. 2254), “If you direct your attention to a location in space, you will
automatically direct attention to any object . . . present at that location,
and vice versa.”
There is considerable evidence of feature-based attention (see Bartsch
et al., 2018, for a review). In their own research, Bartsch et al. addressed
the issue of whether feature-based attention to colour-defined targets is
confined to the spatially attended region or whether it occurs across the
entire visual field. They discovered the latter was the case.
Finally, Chen and Zelinsky (2019) argued it is important to study the
allocation of attention under more naturalistic conditions than those typically used in research. In their study, observers engaged in free (unconstrained) viewing of natural scenes. Eye-fixation data suggested that
attention initially selects regions of space. These regions may provide “the
perceptual fragments from which objects are built” (p. 148).
Evaluation
Research on whether visual attention is object- or location-based has produced variable findings and so few definitive conclusions are possible. However, the relative importance of object-based and space- or location-based attention is flexible. For example, individual differences are important (Pilz et al., 2012). Note that visual attention can be both object-based and space-based at the same time.
What are the limitations of research in this area? First, most research
apparently demonstrating that object-based attention is more important
than space- or location-based attention has involved the use of spatial
cues. Recent evidence (Donovan et al., 2017) suggests such cues may bias
visual attention and that visual attention is not initially object-based in
their absence.
Second, space-, object- and feature-based forms of attention often interact with each other to enhance object processing (Kravitz & Behrmann,
2011). However, we have as yet limited theoretical understanding of the
mechanisms involved in such interactions.
Third, there is a need for more research assessing patterns of attention
under naturalistic conditions. In typical research, observers view artificial stimuli while performing a specific task, and it is unclear whether the attentional processes involved resemble those engaged during free viewing of natural scenes.
What happens to unattended or distracting stimuli?
Unsurprisingly, unattended visual stimuli receive less processing than
attended ones. Martinez et al. (1999) compared event-related potentials
(ERPs) to attended and unattended visual stimuli. The ERPs to unattended
visual stimuli were comparable to those to attended ones 50–55 ms after
stimulus onset. After that, however, the ERPs to attended stimuli were
greater than those to unattended stimuli. Thus, selective attention influences all but the very early stages of processing.
As we have all discovered to our cost, it is often hard (or impossible)
to ignore task-irrelevant stimuli. Below we consider factors determining
whether task performance is adversely affected by distracting stimuli.
Load theory
Lavie’s (2005, 2010) load theory has been an influential approach to
understanding distraction effects. It distinguishes between perceptual
and cognitive load. Perceptual load refers to the perceptual demands of a
current task. Cognitive load refers to the burden placed on the cognitive
system by a current task (e.g., demands on working memory).
Tasks involving high perceptual load require nearly all our perceptual
capacity whereas low-load tasks do not. With low-load tasks there are
spare attentional resources, and so task-irrelevant stimuli are more likely
to be processed. In contrast, tasks involving high cognitive load reduce our
ability to use cognitive control to discriminate between target and distractor stimuli. Thus, high perceptual load is associated with low distractibility,
whereas high cognitive load is associated with high distractibility.
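The sketch below (our toy formalisation, not Lavie's own model) captures the two predictions: spare perceptual capacity "spills over" to distractors, while high cognitive load weakens the control needed to reject them. Both parameters and the combination rule are illustrative assumptions.

TOTAL_PERCEPTUAL_CAPACITY = 1.0

def distractor_processing(perceptual_load, cognitive_load):
    # Both loads lie in [0, 1]; the returned value is a toy index of how
    # much the distractor is processed (higher = more interference).
    spare = TOTAL_PERCEPTUAL_CAPACITY - perceptual_load  # spills to distractor
    control = 1.0 - cognitive_load                       # capacity to reject it
    return spare * (1.0 - 0.5 * control)

print(distractor_processing(perceptual_load=0.9, cognitive_load=0.2))  # low
print(distractor_processing(perceptual_load=0.2, cognitive_load=0.2))  # higher
print(distractor_processing(perceptual_load=0.2, cognitive_load=0.9))  # highest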
Findings
There is much support for the hypothesis that high perceptual load reduces
distraction effects. Forster and Lavie (2008) presented six letters in a circle
and participants decided which target letter (X or N) was present. The five
non-target letters resembled the target letter more closely in the high-load
condition. On some trials a picture of a cartoon character (e.g., Spongebob
Squarepants) was presented as a distractor outside the circle. Distractors
interfered with task performance only under low-load conditions.
According to the theory, brain activation associated with distractors
should be less when individuals are performing a task involving high perceptual load. This finding has been obtained with visual tasks and distractors (e.g., Schwartz et al., 2005) and also with auditory tasks and distractors
(e.g., Sabri et al., 2013).
Why is low perceptual load associated with high distractibility? Biggs
and Gibson (2018) argued this happens because observers generally adopt
a broad attentional focus when perceptual load is low. They tested this
hypothesis using three low-load conditions in which participants decided
whether a target X or N was presented and a distractor letter was sometimes presented (see Figure 5.5). They argued that observers would adopt
the smallest attentional focus in the circle condition and the largest attentional focus in the solo condition. As predicted, distractor interference was
greatest in the solo condition and least in the circle condition. Thus, distraction effects depend strongly on size of attentional focus as well as perceptual load.
The hypothesis that distraction effects should be greater when cognitive or working memory load is high rather than low was tested by Burnham et al. (2014). As predicted, distraction effects on a visual search task were greater when participants performed another task placing high demands on the cognitive system.

Figure 5.5
Sample displays for the three low perceptual load conditions (standard, solo and circle) in which the task required deciding whether a target X or N was presented. See text for further details.
From Biggs and Gibson (2018).
Sörqvist et al. (2016) argued high cognitive load can reduce rather than
increase distraction. They pointed out that cognitive load is typically associated with high levels of concentration and our everyday experience indicates high concentration generally reduces distractibility. As predicted, they
found neural activation associated with auditory distractors was reduced
when cognitive load on a visual task was high rather than low.
The effects of cognitive load on distraction are very variable. How can
we explain this variability? Sörqvist et al. (2016) argued that an important factor is how easily distracting stimuli can be distinguished from task
stimuli. When it is easy (e.g., task and distracting stimuli are in different modalities as in the Sörqvist et al., 2016, study), high cognitive load
reduces distraction. In contrast, when it is hard to distinguish between task
and distracting stimuli (e.g., they are similar and/or in the same modality),
then high cognitive load increases distraction.
Load theory assumes the effects of perceptual and cognitive load are
independent. However, Linnell and Caparos (2011) found perceptual and
cognitive processes interacted: perceptual load only influenced attention as
predicted when cognitive load was low. Thus, the effects of perceptual load
are not “automatic” as assumed theoretically but instead depend on cognitive resources being available.
Evaluation
The distinction between perceptual and cognitive load has proved useful
in predicting when distraction effects will be small or large. More specifically, the prediction that high perceptual load is associated with reduced
distraction effects has received much empirical support. In applied research,
load theory successfully predicts several aspects of drivers’ attention and
behaviour (Murphy & Greene, 2017). For example, drivers exposed to high
perceptual load responded more slowly to hazards and drove less safely.
What are the theory’s limitations? First, the terms “perceptual load”
and “cognitive load” are vague, making it hard to test the theory (Murphy
et al., 2016). Second, the assumption that perceptual and cognitive load
have separate effects on attention is incorrect (Linnell & Caparos, 2011).
Third, perceptual load and attentional breadth are often confounded.
Fourth, the prediction that high cognitive load is associated with high
distractibility has been disproved when task and distracting stimuli are
easily distinguishable. Fifth, the theory de-emphasises several relevant
factors including the salience or conspicuousness of distracting stimuli and
the spatial distance between distracting and task stimuli (Murphy et al.,
2016).
Major attention networks
As we saw in Chapter 1, many cognitive processes are associated with networks spread across relatively large areas of cortex rather than small, specific regions. With respect to attention, several theorists (e.g., Posner, 1980;
Corbetta & Shulman, 2002) have argued there are two major networks.
One attention network is goal-directed or endogenous whereas the other is stimulus-driven or exogenous.

KEY TERM
Covert attention: Attention to an object in the absence of an eye movement towards it.
Posner’s (1980) approach
Posner (1980) studied covert attention in which attention shifts to a given
spatial location without an accompanying eye movement. In his research,
people responded rapidly to a light. The light was preceded by a central
cue (arrow pointing to the left or right) or a peripheral cue (brief illumination of a box outline). Most cues were valid (i.e., indicating where the target
light would appear) but some were invalid (i.e., providing inaccurate information about the light’s location).
Responses to the light were fastest to valid cues, intermediate to neutral
cues (a central cross) and slowest to invalid cues. The findings were comparable for central and peripheral cues. When the cues were valid on only
a small fraction of trials, they were ignored when they were central cues.
However, they influenced performance when they were peripheral cues.
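Performance in this paradigm is typically summarised as cueing benefits and costs relative to the neutral condition, as the brief sketch below illustrates (the reaction times are invented for illustration, not Posner's data).

mean_rt_ms = {"valid": 250, "neutral": 275, "invalid": 305}  # invented values

benefit = mean_rt_ms["neutral"] - mean_rt_ms["valid"]    # attention already there
cost = mean_rt_ms["invalid"] - mean_rt_ms["neutral"]     # attention must shift

print(f"cueing benefit: {benefit} ms; cueing cost: {cost} ms")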
The above findings led Posner (1980) to distinguish between two attention systems:
(1) An endogenous system: it is controlled by the individual’s intentions and is used when central cues are presented.
(2) An exogenous system: it automatically shifts attention and is involved when uninformative peripheral cues are presented. Stimuli that are salient or different from other stimuli (e.g., in colour) are most likely to be attended to using this system.
Corbetta and Shulman’s (2002) approach
Corbetta and Shulman (2002) identified two attention systems that are
involved in basic aspects of visual processing. First, there is a goal-directed
or top-down system resembling Posner’s endogenous system. This dorsal
attention network consists of a fronto-parietal network including the intraparietal sulcus. It is influenced by expectations, knowledge and current
goals. It is used when a cue predicts the location or other feature of a forthcoming visual stimulus.
Second, Corbetta and Shulman (2002) identified a stimulus-driven or
bottom-up attention system resembling Posner’s exogenous system. This is
the ventral attention network and consists primarily of a right-hemisphere ventral fronto-parietal network. This system is used when an unexpected and potentially important stimulus (e.g., flames appearing under the door) occurs. Thus, it has a “circuit-breaking” function, meaning visual attention is redirected from its current focus. What stimuli trigger this circuit-breaking? According to Corbetta et al. (2008), non-task stimuli (i.e.,
distractors) closely resembling task stimuli are especially likely to activate
the ventral attention network although salient or conspicuous stimuli also
activate the same network.
Corbetta and Shulman (2011; see Figure 5.6) identified the brain
areas associated with each network. Key areas within the dorsal attention
network are as follows: superior parietal lobule (SPL), intraparietal sulcus
(IPS), inferior frontal junction (IFJ), frontal eye field (FEF), middle temporal area (MT) and V3A (a visual area). Key areas within the ventral attention network are as follows: inferior frontal junction (IFJ), inferior frontal gyrus (IFG), supramarginal gyrus (SMG), superior temporal gyrus (STG) and insula (Ins). The temporo-parietal junction also forms part of the ventral attention network.

Figure 5.6
The brain areas associated with the dorsal or goal-directed attention network and the ventral or stimulus-driven network. The full names of the areas involved are indicated in the text.
From Corbetta and Shulman (2011). © Annual Reviews. With permission of Annual Reviews.
The existence of two attention networks makes much sense. The
goal-directed system (dorsal attention network) allows us to attend to
stimuli directly relevant to our current goals. If we only had this system,
however, our attentional processes would be dangerously inflexible. It is
also important to have a stimulus-driven attentional system (ventral attention network) leading us to switch attention away from goal-relevant stimuli
to unexpected threatening stimuli (e.g., a ferocious animal). More generally,
the two attention networks typically interact effectively with each other.
Findings
Corbetta and Shulman (2002) supported their two-network model by carrying out meta-analyses of brain-imaging studies. In essence, they argued,
brain areas most often activated when participants expect a stimulus that
has not yet been presented form the dorsal attention network. In contrast,
brain areas most often activated when individuals detect low-frequency
targets form the ventral attention network.
Hahn et al. (2006) tested Corbetta and Shulman’s (2002) theory by
comparing patterns of brain activation when top-down and bottom-up
processes were required. As predicted, there was little overlap between the
brain areas associated with top-down and bottom-up processing. In addition, the brain areas involved in each type of processing corresponded reasonably well to those identified by Corbetta and Shulman.
Chica et al. (2013) reviewed research on the two attention systems and
identified 15 differences between them. For example, stimulus-driven attention is faster than top-down attention and is more object-based. In addition, it is more resistant to interference from other peripheral cues once
activated. The existence of so many differences strengthens the argument
the two attentional systems are separate.
Considerable research evidence (mostly involving neuroimaging) indicates the dorsal and ventral attention systems are associated with distinct
neural circuits even during the resting state (Vossel et al., 2014). However,
neuroimaging studies cannot establish that any given brain area is necessarily involved in stimulus-driven or goal-directed attention processes. Chica
et al. (2011) provided relevant evidence by using transcranial magnetic
stimulation (TMS; see Glossary) to interfere with processing in a given
brain area. TMS applied to the right temporo-parietal junction impaired
the functioning of the stimulus-driven system but not the top-down one.
In the same study, Chica et al. (2011) found TMS applied to the right
intraparietal sulcus impaired the functioning of both attention systems.
This provides evidence of the two attention systems working together.
Evidence from brain-damaged patients (discussed below, see pp. 196–200)
is also relevant to establishing the brain areas necessarily involved in goal-directed or stimulus-driven attentional processes. Shomstein et al. (2010) had brain-damaged patients complete two tasks, one requiring stimulus-driven attentional processes whereas the other required top-down processes.
Patients having greater problems with top-down attentional processing typically had brain damage to the superior parietal lobule (part of the dorsal
attention network). In contrast, patients having greater problems with
stimulus-driven attentional processing typically had brain damage to the
temporo-parietal junction (part of the ventral attention network).
Wen et al. (2012) investigated interactions between the two visual attention systems. They assessed brain activation while participants responded to
target stimuli in one visual field while ignoring all stimuli in the unattended
visual field. There were two main findings. First, stronger causal influences
of the top-down system on the stimulus-driven system led to superior performance on the task. This finding suggests the appearance of an object
at the attended location caused the top-down attention system to suppress
activity within the stimulus-driven system.
Second, stronger causal influences of the stimulus-driven system
on the top-down system were associated with impaired task performance.
This finding suggests activation within the stimulus-driven system produced by stimuli not in attentional focus disrupted the attentional set
maintained by the top-down system.
Recent developments
Corbetta and Shulman’s (2002) theoretical approach has been developed
in recent years. Here we briefly consider three such developments. First, we
now have a greater understanding of interactions between their two attention networks. Meyer et al. (2018) found stimulus-driven and goal-directed
attention both activated frontal and parietal regions within the dorsal
attention network, suggesting it has a pivotal role in integrating bottom-up
and top-down processing.
Second, previous research reviewed by Corbetta and Shulman (2002)
indicated the dorsal attention network is active immediately prior to the
presentation of an anticipated visual stimulus. However, this research did
not indicate how long this attention network remained active. Meehan
et al. (2017) addressed this issue and discovered that top-down influences
associated with the dorsal attention network persisted over a relatively
long time period.
Third, brain networks relevant to attention additional to those within Corbetta and Shulman’s (2002) theory have been identified (Sylvester et al., 2012). One such network is the cingulo-opercular network including the anterior insula/operculum and dorsal anterior cingulate cortex (dACC; see Figure 5.7). This network is associated with non-selective attention or alertness (Coste & Kleinschmidt, 2016).

Another additional network is the default mode network including the posterior cingulate cortex (PCC), the lateral parietal cortex (LP), the inferior temporal cortex (IT), the medial prefrontal cortex (MPF) and the subgenual anterior cingulate cortex (sgACC). The default mode network is activated during internally focused cognitive processes (e.g., mind-wandering; imagining the future). What is the relevance of this network to attention? In essence, performance on tasks requiring externally focused attention is often enhanced if the default mode network is deactivated (Amer et al., 2016a).

KEY TERM
Default mode network: A network of brain regions that is active “by default” when an individual is not involved in a current task; it is associated with internal processes including mind-wandering, remembering the past and imagining the future.
(Amer et al., 2016a).
Finally, there is the fronto-parietal network (Dosenbach et al., 2008),
which includes the anterior dorsolateral prefrontal cortex (aDLPFC), the
middle cingulate cortex (MCC) and the intraparietal sulcus (IPS). It is associated with top-down attentional and cognitive control.

Figure 5.7
This is part of a theoretical approach based on several functional networks of relevance to attention: the four networks shown (fronto-parietal; default mode; cingulo-opercular; and ventral attention) are all discussed fully in the text.
Sylvester et al., 2012, p. 528. Reprinted with permission of Elsevier.
KEY TERMS
Neglect: A disorder involving right-hemisphere damage (typically) in which the left side of objects and/or objects presented to the left visual field are undetected; the condition resembles extinction but is more severe.
Pseudo-neglect: A slight tendency in healthy individuals to favour the left side of visual space.
Evaluation
The theoretical approach proposed by Corbetta and Shulman (2002) has
several successes to its credit. First, there is convincing evidence for somewhat separate stimulus-driven and top-down attention systems, each with
its own brain network. Second, research using transcranial magnetic stimulation suggests major brain areas within each attention system play a causal
role in attentional processes. Third, some interactions between the two networks have been identified. Fourth, research on brain-damaged patients
supports the theoretical approach (see next section, pp. 196–200).
What are the limitations of this theoretical approach? First, the precise
brain areas associated with each attentional system have not been clearly
identified. Second, there is more commonality (especially within the parietal lobe) in the brain areas associated with the two attention networks
than assumed theoretically by Corbetta and Shulman (2002). Third, there
are additional attention-related brain networks not included within the
original theory. Fourth, much remains to be discovered about how different attention systems interact.
DISORDERS OF VISUAL ATTENTION
Here we consider two important attentional disorders in brain-damaged
individuals: neglect and extinction. Neglect (or spatial neglect) involves a
lack of awareness of stimuli presented to the side of space on the opposite side to the brain damage (the contralesional side). This occurs because each side of the visual field projects to the opposite hemisphere (e.g., information from the left side of the visual field proceeds to the right hemisphere).
Most neglect patients have damage in the right hemisphere and so
lack awareness of stimuli on the left side of the visual field: space-based or
egocentric neglect. For example, patients crossing out targets presented to
their left or right side (cancellation task) cross out more of those presented
to the right. When instructed to mark the centre of a horizontal line (line
bisection task), patients put it to the right of the centre. Note that the right
hemisphere is dominant in spatial attention in healthy individuals – they
exhibit pseudo-neglect, in which the left side of visual space is favoured
(Friedrich et al., 2018).
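Line-bisection performance is commonly scored as the deviation of the patient's mark from the true centre, expressed as a percentage of the half-line length (positive values indicating a rightward error). A minimal sketch, with invented values:

def bisection_bias(line_length_mm, mark_position_mm):
    # Deviation of the mark from true centre as a percentage of the
    # half-line length (positive = rightward error).
    centre = line_length_mm / 2
    return 100 * (mark_position_mm - centre) / centre

print(bisection_bias(200, 130))  # neglect patient: +30 (marked well rightward)
print(bisection_bias(200, 98))   # pseudo-neglect: -2 (slightly leftward)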
There is also object-centred or allocentric neglect involving a lack
of awareness of the left side of objects (see Figure 5.8). Patients with
right-hemisphere damage typically draw the right side of all figures in a
multi-object scene but neglect their left side in the left and right visual
fields (Gainotti & Ciaraffa, 2013).
Do allocentric and egocentric neglect reflect a single disorder or separate disorders? Rorden et al. (2012) obtained two findings supporting the
single disorder explanation. First, the correlation between the extent of
each form of neglect across 33 patients was +.80. Second, similar brain
regions were associated with each type of neglect. However, Pedrazzini
et al. (2017) found damage to the intraparietal sulcus was more associated with allocentric than egocentric neglect, whereas the opposite was the case with damage to the temporo-parietal junction.

Figure 5.8
On the left is a copying task in which a patient with unilateral neglect distorted or ignored the left side of the figures to be copied (shown on the left). On the right is a clock drawing task in which the patient was given a clock face and told to insert the numbers into it.
Reprinted from Danckert and Ferber (2006). Reprinted with permission from Elsevier.
Extinction is often found in neglect patients. Extinction involves a
failure to detect a stimulus presented to the side opposite the brain damage
when a second stimulus is presented to the same side as the brain damage.
Extinction and neglect are closely related but separate deficits (de Haan
et al., 2012). We will focus mostly on neglect because it has attracted much
more research.
Which brain areas are damaged in neglect patients? Neglect is a heterogeneous condition and the brain areas damaged vary considerably
across patients. In a meta-analysis, Molenberghs et al. (2012) found the
main areas damaged in neglect patients are in the right hemisphere and
include the superior temporal gyrus, the inferior frontal gyrus, the insula,
the supramarginal gyrus and the angular gyrus (gyrus means ridge). Nearly
all these areas are within the stimulus-driven or ventral attention network
(see Figure 5.6) suggesting brain networks are damaged rather than simply
specific brain areas (Corbetta & Shulman, 2011).
We also need to consider functional connectivity (correlated brain
activity between brain regions). Baldassarre et al. (2014, 2016) discovered
widespread disruption of functional connectivity between the hemispheres
in neglect patients. This disruption did not involve the bottom-up and top-down attention networks. Of importance, recovery from attention deficits
in neglect patients was associated with improvements in functional connectivity in bottom-up and top-down attention networks (Ramsey et al.,
2016).
The right-hemisphere temporo-parietal junction and intraparietal sulcus
are typically damaged in extinction patients (de Haan et al., 2012). When
transcranial magnetic stimulation is applied to these areas to interfere with
processing, extinction-like behaviour results (de Haan et al., 2012). Dugué
et al. (2018) confirmed the importance of the temporo-parietal junction
(part of the ventral attention network) in control of spatial attention in a neuroimaging study on healthy individuals. However, its subregions varied in terms of their involvement in voluntary and involuntary attention shifts.

KEY TERM
Extinction: A disorder of visual attention in which a stimulus presented to the side opposite the brain damage is not detected when another stimulus is presented at the same time to the side of the brain damage.
Conscious awareness and processing
Neglect patients generally report no conscious awareness of stimuli presented to the left visual field. However, that does not necessarily mean those
stimuli are not processed. Vuilleumier et al. (2002b) presented extinction
patients with two pictures at the same time, one to each visual field. The
patients showed very little memory for left-field stimuli. Then the patients
identified degraded pictures. There was a facilitation effect for left-field pictures indicating they had been processed.
Vuilleumier et al. (2002a) presented GK, a male patient with neglect
and extinction, with fearful faces. He showed increased activation in the
amygdala (associated with emotional responses) whether or not these faces
were consciously perceived. This is explicable given there is a processing
route from the retina to the amygdala bypassing the cortex (Diano et al.,
2017).
Sarri et al. (2010) found extinction patients had no awareness of left-field stimuli. However, these stimuli were associated with activation in
early visual processing areas, indicating they received some processing.
Processing in neglect and extinction has been investigated using
event-related potentials. Di Russo et al. (2008) focused on the processing
of left-field stimuli not consciously perceived by neglect patients. Early
processing of these stimuli was comparable to that of healthy controls with
only later processing being disrupted. Lasaponara et al. (2018) obtained
similar findings in neglect patients. In healthy individuals, the presentation
of left-field targets inhibits processing of right-field space. This was less
the case in neglect patients, which helps to explain their lack of conscious
perception of left-field stimuli.
Theoretical considerations
Corbetta and Shulman (2011) discussed neglect in the context of their
two-system theory (discussed earlier, see pp. 192–196). In essence, the
bottom-up ventral attention network is damaged. Strong support for this
assumption was reported by Toba et al. (2018a) who found in 25 neglect
patients that impaired performance on tests of neglect was associated with
damage to parts of the ventral attention network (e.g., angular gyrus;
supramarginal gyrus). Since the right hemisphere is dominant in the ventral
attention network, neglect patients typically have damage in that hemisphere. Of importance, Corbetta and Shulman (2011) also assumed that
damage to the ventral network impairs the functioning of the goal-directed
dorsal attention network (even though not itself damaged).
How does the damaged ventral attention network impair the dorsal
attention network’s functioning? The two attention networks interact and
so damage to the ventral network inevitably affects the functioning of the
dorsal network. More specifically, damage to the ventral attention network
“impairs non-spatial [across the entire visual field] functions, hypoactivates
[reduces activation in] the right hemisphere, and unbalances the activity of
the dorsal attention network” (Corbetta & Shulman, 2011, p. 592).
de Haan et al. (2012) proposed a theory of extinction based on two
major assumptions:
(1) “Extinction is a consequence of biased competition for attention between the ipsilesional [right-field] and contralesional [left-field] target stimuli” (p. 1048);
(2) Extinction patients have much reduced attentional capacity so often only one target [the right-field one] can be detected.
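These two assumptions can be combined in a toy simulation (our formalisation, not de Haan et al.'s implementation; the bias weight and capacity value are illustrative): with capacity for only one stimulus and competition biased toward the intact right field, the left-field target is usually extinguished.

import random

def extinction_trial(bias_right=4.0, capacity=1):
    # One left-field and one right-field target compete for a severely
    # limited attentional capacity; allocation is proportional to weight.
    weights = {"left": 1.0, "right": bias_right}
    detected = set()
    for _ in range(capacity):
        pool = {k: w for k, w in weights.items() if k not in detected}
        r = random.uniform(0, sum(pool.values()))
        for stimulus, w in pool.items():
            r -= w
            if r <= 0:
                detected.add(stimulus)
                break
    return detected

trials = [extinction_trial() for _ in range(10000)]
print(sum("left" in t for t in trials) / len(trials))  # ~0.2: left usually missed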
Findings
According to Corbetta and Shulman (2011), the dorsal attention network in
neglect patients functions poorly because of reduced activation in the right
hemisphere and associated reduced alertness and attentional resources.
Thus, increasing patients’ general alertness should enhance their detection
of left-field visual targets. Robertson et al. (1998) found the slower detection of left visual field stimuli compared to those in the right visual field was
no longer present when warning sounds were used to increase alertness.
Bonato and Cutini (2016) compared neglect patients’ ability to detect
visual targets with (or without) a second, attentionally demanding task.
Detection rates were high for targets presented to the right visual field in
both conditions. In contrast, patients detected only approximately 50% as
many targets in the left visual field as the right when performing another
task. Thus, neglect patients have limited attentional resources.
Corbetta and Shulman (2011) assumed neglect patients have an essentially intact dorsal attention network. Accordingly, neglect patients might
use that network effectively if steps were taken to facilitate its use. Duncan
et al. (1999) presented arrays of letters and neglect patients recalled only
those in a pre-specified colour (the dorsal attention network could be used
to select the appropriate letters). Neglect patients resembled healthy controls in showing equal recall of letters presented to each side of visual space.
The two attention networks typically work closely together. Bays et al.
(2010) studied neglect patients. They used eye movements during a visual
search to assess patients’ problems with top-down and stimulus-driven
attentional processes. Both types of attentional processes were equally
impaired (as predicted by Corbetta and Shulman, 2011). Of most importance, there was a remarkably high correlation of +.98 between these two
types of attentional deficit.
Toba et al. (2018b) identified two reasons for the failure of neglect
patients to detect left-field stimuli:
(1) a “magnetic” attraction of attention (i.e., right-field stimuli immediately capture attention);
(2) impaired spatial working memory making it hard for patients to keep track of the locations of stimuli.
Both reasons were equally applicable to most patients. However, the
first reason was dominant in 12% of patients and the second reason in
24% of patients. Accordingly, Toba et al. argued we should develop
multi-component models of visual neglect to account for such individual
differences.
We turn now to extinction patients. According to de Haan et al.
(2012), extinction occurs because of biased competition between stimuli.
If two stimuli could be integrated, that might minimise competition and so
reduce extinction. Riddoch et al. (2006) tested this prediction by presenting objects used together often (e.g., wine bottle and wine glass) or never
used together (e.g., wine bottle and ball). Extinction patients identified both
objects more often in the former condition than the latter (65% vs 40%,
respectively).
The biased competition hypothesis has been tested in other ways. We
can impair attentional processes in the intact left hemisphere by applying
transcranial magnetic stimulation to it. This should reduce competition
from the left hemisphere in extinction patients and thus reduce extinction.
Some findings are consistent with this prediction (Oliveri & Caltagirone,
2006).
de Haan et al. (2012) also identified reduced attentional capacity as a
factor causing extinction. Bonato et al. (2010) studied extinction with or
without the addition of a second, attentionally demanding task. As predicted, extinction patients showed a substantial increase in the extinction
rate (from 18% to over 80%) with this additional task.
KEY TERM
Visual search: A task involving the rapid detection of a specified target stimulus within a visual display.
Overall evaluation
Research has produced several important findings. First, neglect and
extinction patients can process unattended visual stimuli in the absence of
conscious awareness of those stimuli. Second, most neglect patients have
damage to the ventral attention network leading to impaired functioning of
the undamaged dorsal attention network. Third, extinction occurs because
of biased competition for attention and reduced attentional capacity.
What are the limitations of research in this area? First, it is hard
to produce theoretical accounts applicable to all neglect or extinction
patients because the precise symptoms and regions of brain damage
vary considerably across patients. Second, neglect patients vary in their
precise processing deficits (e.g., Toba et al., 2018b), but this has been
de-emphasised in most theories. Third, the precise relationship between neglect and extinction remains unclear. Fourth, the dorsal and ventral
networks generally interact but the extent of their interactions remains to
be determined.
VISUAL SEARCH
We spend much time searching for various objects (e.g., a friend in a
crowd). The processes involved have been studied in research on visual
search where a specified target is detected as rapidly as possible. Initially,
we consider an important real-world situation where visual search can be
literally a matter of life-or-death: airport security checks. After that, we
consider an early, highly influential theory of visual search before discussing
more recent theoretical and empirical developments.
IN THE REAL WORLD: AIRPORT SECURITY CHECKS
Airport security checks have become more thorough since 9/11. When your luggage is x-rayed, an
airport security screener searches for illegal and dangerous items (see Figure 5.9). Screeners are
well trained but mistakes sometimes occur.
Figure 5.9
Each bag contains one illegal item. From left to right: a large bottle; a dynamite stick; and a gun part.
From Mitroff & Biggs (2014).
There are two major reasons it is often hard for airport security screeners to detect dangerous
items. First, illegal and dangerous items are (thankfully!) present in only a minute fraction of passengers’ luggage. This rarity of targets makes it hard for airport security screeners to detect them.
Mitroff and Biggs (2014) asked observers to detect illegal items in bags (see Figure 5.9). The
detection rate was only 27% when targets appeared under 0.15% of the time: they termed this
the “ultra rare item effect”. In contrast, the detection rate was 92% when targets appeared more
than 1% of the time.
Peltier and Becker (2016) tested two explanations for the reduced detection rate with rare targets:
(1) a reduced probability that the target is fixated (selection error); and (2) increased caution about
reporting targets because they are so unexpected (identification error). There was evidence for
both explanations. However, most detection failures were selection errors (see Figure 5.10).
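Peltier and Becker’s two-stage account can be expressed as a toy simulation. The sketch below is our illustration rather than their model: detection requires both fixating the target (selection) and then reporting it (identification), and the probabilities are invented values chosen to mimic the reported detection rates.

```python
import random

def detect_target(p_select, p_identify):
    """One simulated trial: detection requires fixating (selecting) the
    target and then correctly identifying (reporting) it."""
    return random.random() < p_select and random.random() < p_identify

def hit_rate(p_select, p_identify, n_trials=100_000):
    hits = sum(detect_target(p_select, p_identify) for _ in range(n_trials))
    return hits / n_trials

# Illustrative parameters: rare targets mainly reduce the chance the target
# is ever fixated (selection errors), and also make observers more cautious
# about reporting it (identification errors).
print(hit_rate(p_select=0.95, p_identify=0.97))  # frequent targets: ~0.92
print(hit_rate(p_select=0.35, p_identify=0.80))  # ultra-rare targets: ~0.28
```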
[Graph: detection accuracy (0–1) on target-present and target-absent trials as a function of target prevalence (10%, 50%, 90%).]
Figure 5.10
Frequency of selection and identification errors when targets were present on 10%, 50% or 90% of trials.
From Peltier and Becker (2016).
Second, security screeners search for numerous different objects. This increases search difficulty.
Menneer et al. (2009) found target detection was worse when screeners searched for two categories of objects (metal threats and improvised explosive devices) rather than one.
How can we increase the efficiency of security screening? First, we can exploit individual differences in the ability to detect targets. Rusconi et al. (2015) found individuals scoring high on a
questionnaire measure of attention to detail showed better target-detection performance than low scorers.
Second, airport security screeners can find it hard to distinguish between targets (i.e., dangerous items) and similar-looking non-targets. Geng et al. (2017) found that observers whose training
included non-targets resembling targets learned to develop increasingly precise internal target
representations. Such representations can improve the speed and accuracy of security screening.
Third, the low detection rate when targets are very rare can be addressed. Threat image projection (TIP) can be used to project fictional threat items into x-ray images of luggage to increase
the apparent frequency of targets. When screeners are presented with TIPs plus feedback when
they miss them, screening performance improves considerably (Hofer & Schwaninger, 2005). In
similar fashion, Schwark et al. (2012) found providing false feedback to screeners to indicate they
had missed rare targets reduced their cautiousness about reporting targets and improved their
performance.
Feature integration theory
Feature integration theory was proposed by Treisman and Gelade (1980)
and subsequently updated and modified (e.g., Treisman, 1998). According
to the theory, we need to distinguish between object features (e.g., colour;
size; line orientation) and the objects themselves. There are two processing
stages:
(1) Basic visual features are processed rapidly and pre-attentively in parallel across the visual scene.
(2) Stage (1) is followed by a slower serial process with focused attention providing the “glue” to form objects from the available features (e.g., an object that is round and has an orange colour is perceived as an orange). In the absence of focused attention, features from different objects may be combined randomly, producing an illusory conjunction.

KEY TERM

Illusory conjunction
Mistakenly combining features from two different stimuli to perceive an object that is not present.
It follows from the above assumptions that targets defined by a single feature
(e.g., a blue letter or an S) should be detected rapidly and in parallel. In
contrast, targets defined by a conjunction or combination of features (e.g., a
green letter T) should require focused attention and so should be slower to
detect. Treisman and Gelade (1980) tested these predictions using both types
of targets; the display size was 1–30 items and a target was present or absent.
As predicted, response was rapid and there was very little effect of
display size when the target was defined by a single feature: these findings
suggest parallel processing (see Figure 5.11). Response was slower and was
strongly influenced by display size when the target was defined by a conjunction of features: these findings suggest there was serial processing.
Figure 5.11
Performance speed on a detection task as a function of target definition (conjunctive vs single feature) and display size.
Adapted from Treisman and Gelade (1980).

According to the theory, lack of focused attention can produce illusory conjunctions based on random combinations of features. Friedman-Hill et al. (1995) studied a brain-damaged patient (RM) who had problems with the accurate location of visual stimuli. This patient produced many illusory conjunctions combining the shape of one stimulus with the colour of another.
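The theory’s contrasting predictions for feature and conjunction search can be captured in a few lines of code. This is a purely illustrative sketch: the 60 ms per-item estimate comes from Treisman and Gelade, the 450 ms base time is an assumed constant, and halving the items checked on target-present trials reflects the standard assumption that serial search stops once the target is found.

```python
def conjunction_rt(display_size, target_present, base_rt=450, item_ms=60):
    """Serial, self-terminating search: on average half the items are
    checked when the target is present, all of them when it is absent."""
    items_checked = display_size / 2 if target_present else display_size
    return base_rt + item_ms * items_checked

def feature_rt(display_size, base_rt=450):
    """Parallel, pre-attentive 'pop-out': RT is independent of display size."""
    return base_rt

for n in (1, 5, 15, 30):
    print(n, feature_rt(n), conjunction_rt(n, True), conjunction_rt(n, False))
```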
Limitations
What are the theory’s limitations? First, Duncan and Humphreys (1989,
1992) identified two factors not included within feature integration theory:
(1) When distractors are very similar to each other, visual search is faster because it is easier to identify them as distractors.
(2) The number of distractors has a strong effect on search time to detect even targets defined by a single feature when targets resemble distractors.
Second, Treisman and Gelade (1980) estimated the search time with conjunctive targets was approximately 60 ms per item and argued this represented the time taken for focal attention to process each item. However,
research with other paradigms indicates it takes approximately 250 ms for
attention indexed by eye movements to move from one location to another.
Thus, it is improbable focal attention plays the key role assumed within the
theory.
Third, the theory assumes visual search is often item-by-item. However,
the information contained within most visual scenes cannot be divided up
into “items” and so the theory is of limited applicability. Such considerations led Hulleman and Olivers (2017) to produce an article entitled “The
impending demise of the item in visual search”.
Fourth, visual search involves parallel processing much more than
implied by the theory. For example, Thornton and Gilden (2007) used
29 different visual tasks and found 72% apparently involved parallel
processing. We can explain such findings by assuming that each eye fixation permits considerable parallel processing using information available in
peripheral vision (discussed below, see pp. 206–208).
Fifth, the theory assumes that the early stages of visual search are
entirely feature-based. However, recent research using event-related potentials indicates that object-based processing can occur much faster than predicted by feature integration theory (e.g., Berggren & Eimer, 2018).
Sixth, the theory assumes visual search is essentially random. This
assumption is wrong with respect to the real world – we typically use our
knowledge of where a target object is likely to be located when searching
for it (see below).
Dual-path model
In most of the research discussed so far, the target appeared at a random
location within the visual display. This is radically different from the real
world. Suppose you are outside looking for your missing cat. Your visual
search would be very selective – you would ignore the sky and focus mostly
on the ground (and perhaps the trees). Thus, your search would involve
top-down processes based on your knowledge of where cats are most likely
to be found.
Ehinger et al. (2009) studied top-down processes in visual search by
recording eye fixations of observers searching for a person in 900 real-world outdoor scenes. Observers typically fixated plausible locations (e.g.,
pavements) and ignored implausible ones (e.g., sky; trees; see Figure 5.12).
Observers also fixated locations differing considerably from neighbouring
locations and areas containing visual features resembling those of a human
figure.
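Ehinger et al. (2009) modelled this kind of guidance by combining separate maps for scene context, target features and saliency. The sketch below conveys the general idea only: the maps are randomly generated, the pavement band is invented, and the multiplicative combination rule is one simple possibility rather than their actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 32, 48  # a coarse grid over the scene (illustrative)

# Three sources of guidance, each a map of values in [0, 1]:
saliency = rng.random((H, W))          # how much a region differs from its neighbours
target_features = rng.random((H, W))   # resemblance to a pedestrian
scene_context = np.zeros((H, W))
scene_context[20:28, :] = 1.0          # plausible band (e.g., pavement); sky scores 0

# Multiplying the maps means a region must score on all three cues to
# attract fixations; implausible regions (context = 0) are ruled out.
priority = saliency * target_features * scene_context
row, col = np.unravel_index(priority.argmax(), priority.shape)
print(f"first predicted fixation at row {row}, column {col}")
```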
How can we reconcile Ehinger et al.’s (2009) findings with those
discussed earlier? Wolfe et al. (2011) proposed a dual-path model (see
Figure 5.13). There is a selective pathway of limited capacity (indicated
by the bottleneck) with objects being selected individually for recognition.
Figure 5.12
The first three eye fixations made by observers searching for pedestrians. As can be
seen, the great majority of their fixations were on regions in which pedestrians would
most likely be found. Observers’ fixations were much more like each other in the left-hand photo than in the right-hand one, because there were fewer likely regions in the
left-hand one.
From Ehinger et al. (2009). Reprinted with permission from Taylor & Francis.
[Figure: early vision extracts features (colour, orientation, size, depth, motion, etc.) that feed both a capacity-limited selective pathway (binding and recognition) and a non-selective pathway; see caption below.]
Figure 5.13
A two-pathway model of
visual search. The selective
pathway is capacity limited
and can bind stimulus
features and recognise
objects. The non-selective
pathway processes the gist
of scenes. Selective and
non-selective processing
occur in parallel to produce
effective visual search.
From Wolfe et al. (2011).
Reprinted with permission from
Elsevier.
This pathway has been the focus of most research until recently. There is
also a non-selective pathway in which the “gist” of a scene is processed.
Such processing can then guide processing within the selective pathway
(represented by the arrow labelled “guidance”). This pathway allows us to
utilise our stored environmental knowledge and so is of great value in the
real world.
Findings
Wolfe et al. (2011) compared visual searches for objects presented within a
scene setting or at random locations. As predicted, search rate per item was
much faster in the scene setting (10 ms vs 40 ms, respectively). Võ and Wolfe
(2012) explained that finding in terms of “functional set size” – searching in
scenes is efficient because most regions can be ignored. As predicted, Võ
and Wolfe found 80% of each scene was rarely fixated.
Kaiser and Cichy (2018) presented observers with objects typically
located in the upper (e.g., aeroplane; hat) or lower (e.g., carpet; shoe) visual
field. These objects were presented in their typical or atypical location
(e.g., hat in the lower visual field). Observers had to indicate whether an
object presented very briefly was located in the upper or lower visual field.
Observers’ performance was better when objects appeared in their typical
location because of their extensive knowledge of where objects are generally located.
Chukoskie et al. (2013) found observers can easily learn where
targets are located. An invisible target was presented at random locations on a blank screen and observers were provided with feedback. There
was a strong learning effect – fixations rapidly shifted from being fairly
random to being focused on the area within which the target might be
present.
Ehinger et al.’s (2009) findings (discussed earlier, see p. 204) suggested
that scene gist or context can be used to enhance the efficiency of visual
search. Katti et al. (2017) presented scenes very briefly (83 ms) followed by
a mask. Observers were given the task of detecting a person or a car and
performed very accurately (over 90%) and rapidly. Katti et al. confirmed
that scene gist or context influenced performance. However, performance
was influenced more strongly by features of the target object – the more
key features of an object were visible, the faster it was detected.
What is the take-home message from the above study? The efficiency
of visual search with real-world scenes is more complex than implied by
Ehinger et al. (2009). More specifically, observers may rapidly fixate on the
area close to a target person because they are using scene gist or because
they rapidly process features of the person (e.g., wearing clothes).
KEY TERM

Fovea
A small area within the retina at the centre of the field of vision where visual acuity is greatest.
Evaluation
Our knowledge of likely (and unlikely) locations for any given object in a
scene influences visual search in the real world. This is fully acknowledged
in the dual-path model. There is also support for the notion that scene
knowledge facilitates visual search by reducing functional set size.
What are the model’s limitations? First, how we use gist knowledge
of a scene very rapidly to reduce the search area remains unclear. Second,
there is insufficient focus on the learning processes that can greatly facilitate visual search – the effects of such processes can be seen in the very
rapid and accurate detection of target information by experts in several
domains (see Chapter 11).
Third, it is important not to exaggerate the importance of scene gist or
context in influencing the efficiency of visual search. Features of the target
object can influence visual search more than scene gist (Katti et al., 2017).
Fourth, the assumption that items are processed individually within
the selective pathway is typically mistaken. As we will see shortly, visual
search often depends on parallel processes within peripheral vision and
such processes are not considered within the model.
Attention vs perception: texture tiling model
Several theories (e.g., Treisman & Gelade, 1980) have assumed that individual items are the crucial units in visual search. Such theories have also
often assumed that slow visual search depends mostly on the limitations of
focused attention. A plausible implication of these assumptions is that slow
visual search depends mostly on foveal vision (the fovea is a small area of
maximal visual acuity in the retina).
Both the above assumptions have been challenged recently. At the
risk of oversimplification, full understanding of visual search requires less
emphasis on attention and more on perception. According to Rosenholtz
(2016), peripheral (non-foveal) vision is of crucial importance. Acuity
decreases as we move away from the fovea to the periphery of vision, but
much less than often assumed. You can demonstrate this by holding out
your thumb and fixating the nail. Foveal vision only covers the nail so the
great majority of what you can see is in peripheral vision.
We can also compare the value of foveal and peripheral vision by considering individuals with impaired eyesight. Those with severely impaired
peripheral vision (e.g., due to glaucoma) had greater problems with mobility (e.g., number of falls; ability to drive) than those lacking foveal vision
(due to macular degeneration) (Rosenholtz, 2016). Individuals with severely
impaired central or foveal vision performed almost as well as healthy controls at detecting target objects in coloured scenes (75% vs 79%, respectively) (Thibaut et al., 2018).
If visual search depends heavily on peripheral vision, what predictions
can we make? First, if each fixation provides observers with a considerable
amount of information about several objects, visual search will typically
involve parallel rather than serial processing. Second, we need to consider
limitations of peripheral vision (e.g., visual acuity is less in peripheral
than foveal vision). However, a more important limitation concerns visual
crowding – a reduced ability to recognise objects or other stimuli because
of irrelevant neighbouring objects or stimuli (clutter). Visual crowding
impairs peripheral vision to a much greater extent than foveal vision.
Rosenholtz et al. (2012) proposed the texture tiling model based on
the assumption peripheral vision is of crucial importance in visual search.
More specifically, processing in peripheral vision can cause adjacent stimuli
to tile (join together) to form an apparent target, thus increasing the difficulty of visual search. Below we consider findings relevant to this model.
KEY TERM
Visual crowding
The inability to recognise
objects in peripheral
vision due to the presence
of neighbouring objects.
Findings
As mentioned earlier (p. 203), Thornton and Gilden (2007) found almost
three-quarters of the visual tasks they studied involved parallel processing.
This is entirely consistent with the emphasis on parallel processing in the
model.
Direct evidence for the importance of peripheral vision to visual search
was reported by Young and Hulleman (2013). They manipulated the visible
area around the fixation point making it small, medium or large. As predicted by the model, visual search performance was worst when the visible
area was small (so only one item could be processed per fixation). Overall,
visual search was almost parallel when the visible area was large but serial
when it was small.
Chang and Rosenholtz (2016) used various search tasks. According to
feature integration theory, both tasks shown in Figure 5.14 should be comparably hard because the target and distractors share features. In contrast,
the texture tiling model predicts the task on the right should be harder
because adjacent distractors seen in peripheral vision can more easily tile
(join together) to form an apparent T. The findings from these tasks (and
several others) supported the texture tiling model but were inconsistent
with feature integration theory.
Finally, Hulleman and Olivers (2017) produced a model of visual
search consistent with the texture tiling model. According to this model,
each eye fixation lasts 250 ms, during which information from foveal and
peripheral vision is extracted in parallel. They also assumed that the area around the fixation point within which a target can generally be detected is smaller when the visual search task is difficult (e.g., because target discriminability is low).

Figure 5.14
“Find the T”: (a) an easier search; (b) a harder search. The target (T) is easier to find in the display on the left than the one on the right.
From Chang and Rosenholtz (2016).
A key prediction from Hulleman and Olivers’ (2017) model is that the
main reason why search times are longer with more difficult search tasks is
because more eye fixations are required than with easier tasks. A computer
simulation based on these assumptions produced search times very similar
to those obtained in experimental studies.
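The logic of that simulation is easy to convey in code. The toy version below (not Hulleman and Olivers’ own program) assumes a self-terminating search in which each 250 ms fixation processes, in parallel, all items within the functional viewing area; shrinking that area for difficult tasks forces more fixations and hence longer search times.

```python
import random

def search_time_ms(n_items, fva_items, fixation_ms=250):
    """Fixation-based search: each fixation takes in `fva_items` items at
    once; search ends when the (randomly placed) target has been covered."""
    target_pos = random.randint(1, n_items)
    inspected = 0
    fixations = 0
    while inspected < target_pos:  # self-terminating search
        inspected += fva_items
        fixations += 1
    return fixations * fixation_ms

random.seed(1)
for n in (8, 16, 32):
    easy = search_time_ms(n, fva_items=n)  # large viewing area: ~1 fixation
    hard = search_time_ms(n, fva_items=1)  # shrunken area: item-by-item
    print(f"{n} items: easy {easy} ms, hard {hard} ms")
```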
Evaluation
What are the strengths of the texture tiling model? First, the information
available in peripheral vision is much more important in visual search than
assumed previously. The model explains how observers make use of the
information available in peripheral vision.
Second, the model explains why parallel processing is so prevalent
in visual search – it directly reflects parallel processing within peripheral
vision. Third, there is accumulating evidence that search times are generally directly related to the number of eye fixations.
Fourth, an approach based on eye fixations and peripheral vision can
potentially explain findings from all visual search paradigms, including
complex visual scenes and item displays. Such an approach thus has more
general applicability than feature integration theory.
What are the model’s limitations? First, as Chang and Rosenholtz
(2016) admitted, it needs further development to account fully for visual
search performance. For example, it does not predict search times with
precision. In addition, it does not specify the criteria used by observers to
decide no target is present.
Second, visual search is typically much faster for experts than non-experts in their domain of expertise (e.g., medical experts examining mammograms) (see Chapter 11). The texture tiling model does not identify
clearly the processes allowing experts to make very efficient use of peripheral information.
CROSS-MODAL EFFECTS
Nearly all the research discussed so far is limited in that the visual (or auditory) modality was studied on its own. We might try to justify this approach
by assuming attentional processes in each sensory modality operate
independently from those in other modalities. However, that assumption is
incorrect. In the real world, we often coordinate information from two or
more sense modalities at the same time (cross-modal attention). An example
is lip reading, where we use visual information about a speaker’s lip movements to facilitate our understanding of what they are saying (see Chapter 9).
Suppose we present participants with two streams of lights (as was
done by Eimer and Schröger, 1998), with one stream being presented to the
left and the other to the right. At the same time, we present participants
with two streams of sounds (one to each side). In one condition, participants detect deviant visual events (e.g., longer than usual stimuli) presented
to one side only. In the other condition, participants detect deviant auditory events in only one stream.
Event-related potentials (ERPs) were recorded to assess the allocation of attention. Unsurprisingly, Eimer and Schröger (1998) found ERPs to deviant
stimuli in the relevant modality were greater to stimuli presented on the
to-be-attended side than the to-be-ignored side. Thus, participants allocated attention as instructed.
Of more interest is what happened to the allocation of attention in
the irrelevant modality. Suppose participants detected visual targets on the
left side. In that case, ERPs to deviant auditory stimuli were greater on
the left side than the right. This is a cross-modal effect: the voluntary or
endogenous allocation of visual attention also affected the allocation of
auditory attention. Similarly, when participants detected auditory targets
on one side, ERPs to deviant visual stimuli on the same side were greater
than ERPs to those on the opposite side. Thus, the allocation of auditory
attention also influenced the allocation of visual attention.
KEY TERMS
Cross-modal attention
The coordination of
attention across two or
more modalities (e.g.,
vision and audition).
Ventriloquism effect
The mistaken perception that sounds are coming from their apparent visual source (as in ventriloquism).
Ventriloquism effect
What happens when there is a conflict between simultaneous visual and
auditory stimuli? We will focus on the ventriloquism effect in which
sounds are misperceived as coming from their apparent visual source.
Ventriloquists (at least good ones!) speak without moving their lips while
manipulating a dummy’s mouth movements. It seems as if the dummy is
speaking. Something similar happens at the movies. The actors’ lips move
on the screen but their voices come from loudspeakers beside the screen.
Nevertheless, we hear those voices coming from their mouths.
Certain conditions must be satisfied for the ventriloquism effect to
occur (Recanzone & Sutter, 2008). First, the visual and auditory stimuli
must occur close together in time. Second, the sound must match expectations created by the visual stimulus (e.g., high-pitched sound coming from
a small object). Third, the sources of the visual and auditory stimuli should
be close together spatially. More generally, the ventriloquism effect reflects
the unity assumption (the assumption that two or more sensory cues come
from the same object: Chen & Spence, 2017).
The ventriloquism effect exemplifies visual dominance (visual information dominating perception). Further evidence comes from the Colavita
effect (Colavita, 1974): participants instructed to respond to all stimuli
respond more often to visual than simultaneous auditory stimuli (Spence
et al., 2011).
When during processing is visual spatial information integrated with
auditory information? Shrem et al. (2017) found that misleading visual
information about the location of an auditory stimulus influenced the processing of the auditory stimulus approximately 200 ms after stimulus onset.
The finding that this effect is still present even when participants are aware
of the spatial discrepancy between the visual and auditory input suggests it
occurs relatively “automatically”.
However, the ventriloquism effect is smaller when participants have
previously heard syllables spoken in a fearful voice (Maiworm et al., 2012).
This suggests the effect is not entirely “automatic” but is reduced when the
relevance of the auditory channel is increased.
Why does vision capture sound in the ventriloquism effect? The visual
modality typically provides more precise information about spatial location. However, when visual stimuli are severely blurred and poorly localised, sound captures vision (Alais & Burr, 2004). Thus, we combine visual
and auditory information effectively by attaching more weight to the more
informative sense modality.
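This weighting principle has a simple quantitative form. In the reliability-weighted scheme below (the kind of maximum-likelihood combination Alais and Burr applied, with invented numbers), each cue is weighted by its inverse variance, so blurring the visual input shifts the combined location estimate towards the sound.

```python
def integrate_cues(visual_loc, visual_sd, auditory_loc, auditory_sd):
    """Reliability-weighted cue combination: each cue's weight is
    proportional to its inverse variance (its precision)."""
    w_visual = (1 / visual_sd**2) / (1 / visual_sd**2 + 1 / auditory_sd**2)
    return w_visual * visual_loc + (1 - w_visual) * auditory_loc

# Sharp visual input (illustrative SDs in degrees): vision dominates,
# so the sound is "captured" by the visual location (ventriloquism).
print(integrate_cues(visual_loc=0.0, visual_sd=1.0,
                     auditory_loc=10.0, auditory_sd=8.0))   # ~0.15
# Severely blurred visual input: sound captures vision instead.
print(integrate_cues(visual_loc=0.0, visual_sd=12.0,
                     auditory_loc=10.0, auditory_sd=8.0))   # ~6.9
```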
KEY TERM

Temporal ventriloquism effect
Misperception of the timing of a visual stimulus when an auditory stimulus is presented close to it in time.
Temporal ventriloquism
The above explanation for the ventriloquist illusion is a development of
the modality appropriateness and precision hypothesis (Welch & Warren,
1980). According to this hypothesis, when conflicting information is presented in two or more modalities, the modality having the greatest acuity
generally dominates. This hypothesis predicts the existence of another illusion. The auditory modality is typically more precise than the visual modality at discriminating temporal relations. As a result, judgements about
the temporal onset of visual stimuli might be biased by auditory stimuli
presented very shortly beforehand or afterwards. This is the temporal
ventriloquism effect.
Research on temporal ventriloquism
was reviewed by Chen and Spence (2017).
A simple example is when the apparent
onset of a flash is shifted towards an abrupt
sound presented slightly asynchronously (see
Figure 5.15). Other research has found that
the apparent duration of visual stimuli can be
distorted by asynchronous auditory stimuli.
We need to consider the temporal ventriloquism effect in the context of the unity
assumption. This is the assumption that “two
or more uni-sensory cues belong together
(i.e., that they come from the same object or
event)” (Chen & Spence, 2017, p. 1). Chen
and Spence discussed findings showing that
the unity assumption generally (but not always) enhances the temporal ventriloquism effect.

Figure 5.15
An example of temporal ventriloquism in which the apparent time of onset of a flash is shifted towards that of a sound presented at a slightly different timing from the flash.
From Chen and Vroomen (2013). Reprinted with permission from Springer.

Orchard-Mills et al. (2016) extended this research by using two visual stimuli (one above and the other below fixation) and two auditory stimuli (low- and high-pitch). When the visual and auditory stimuli were congruent (e.g., visual stimulus above fixation and auditory stimulus high-pitch), the temporal ventriloquism effect was found. However, this effect was eliminated when the visual and auditory stimuli were incongruent, which prevented binding of information across the two senses.
IN THE REAL WORLD: WARNING SIGNALS PROMOTE SAFE DRIVING
Front-to-rear-end collisions cause 25% of road accidents, with driver inattention the most common
cause (Spence, 2012). Thus, it is important to devise effective warning signals to enhance driver
attention and reduce collisions. Warning signals might be especially useful if they were informative
(i.e., indicating the nature of the danger). However, informative warning signals requiring time-consuming cognitive processing might be counterproductive.
Ho and Spence (2005) considered drivers’ reaction times when braking to avoid a car in front or
accelerating to avoid a speeding car behind. An auditory warning signal (car horn) came from the
same direction as the critical visual event on 80% or 50% of trials. Braking times were faster when
the sound and critical visual event were from the same direction. The greater beneficial effect of auditory signals when predictive rather than non-predictive suggests the involvement of endogenous spatial attention (controlled by the individual’s intentions). Auditory stimuli also influenced
visual attention even when non-predictive: this probably involved exogenous spatial attention
(“automatic” allocation of attention).
Gray (2011) studied braking times to avoid a collision with the car in front when drivers heard
auditory warning signals increasing in intensity as the time to collision reduced. These signals are
known as looming sounds. The most effective condition was the one where the rate of increase
in the intensity of the auditory signal was the fastest because it implied the time to collision was
the least. Lahmer et al. (2018) found evidence that looming sounds are effective because they are
consistent with the visual experience of an approaching collision.
Vibrotactile signals produce the perception of vibration through touch. Gray et al. (2014) studied
the effects of such signals on speed of braking to avoid a collision. Signals were presented at three
sites on the abdomen arranged vertically. In the most effective condition, successive signals moved
towards the driver’s head at an increasing rate reflecting the speed at which the driver was approaching the
car in front. Braking time was 250 ms faster in this condition than a no-warning control condition,
probably because it was highly informative.
Ahtamad et al. (2016) compared the effectiveness of three vibrotactile warning signals delivered
to the back on braking times to avoid a collision with the car in front: (1) expanding (centre of back
followed by areas to left and right); (2) contracting (areas to left and right followed by the centre
of the back); (3) static (centre of the back + areas to left and right at the same time). The dynamic
vibrotactile conditions (1 and 2) produced comparable braking reaction times that were faster than
those in the static condition (3).
In a second experiment, Ahtamad et al. (2016) compared the expanding vibrotactile condition
against a linear motion condition (vibrotactile stimulation to the hands followed by the shoulders).
Emergency braking reaction times were faster in the linear motion condition (approximately 585 ms
vs 640 ms) because drivers found it easier to interpret the warning signals in that condition.
In sum, the various auditory and vibrotactile warning signals discussed above typically reduce
braking reaction times by approximately 40 ms. That sounds modest. However, it can easily be the
difference between colliding with the car in front or avoiding it and so could potentially save many
lives. At present, however, we lack a theoretical framework within which to understand precisely
why some warning signals are more effective than others.
KEY TERMS

Endogenous spatial attention
Attention to a stimulus controlled by intentions or goal-directed mechanisms.

Exogenous spatial attention
Attention to a given spatial location determined by “automatic” processes.

Multi-tasking
Performing two or more tasks at the same time by switching rapidly between them.

Case study: Multi-tasking efficiency

Overall evaluation
What are the limitations of research on cross-modal effects? First, as just
mentioned, our theoretical understanding lags behind the accumulation of
empirical findings. Second, much research has involved complex artificial
tasks far removed from naturalistic conditions. Third, individual differences have generally been ignored. However, individual differences (e.g.,
preference for auditory or visual stimuli) influence cross-modal effects (van
Atteveldt et al., 2014).
DIVIDED ATTENTION: DUAL-TASK PERFORMANCE
In this section, we consider factors influencing how well we can perform
two tasks at the same time. In our hectic 24/7 lives, we increasingly try to
do two things at once (multi-tasking) (e.g., sending text messages while
walking down the street). More specifically, multi-tasking “refers to the
ability to co-ordinate the completion of several tasks to achieve an overall
goal” (MacPherson, 2018, p. 314). It can involve performing two tasks at
the same time or switching between two tasks. There is controversy as to
whether massive amounts of multi-tasking have beneficial or detrimental
effects on attention and cognitive control (see Box).
What determines how well we can perform two tasks at once?
Similarity (e.g., in terms of modality) is one important factor. Treisman
and Davies (1973) found two monitoring tasks interfered with each other
much more when the stimuli on both tasks were in the same modality
(visual or auditory).
Two tasks can also be similar in response modality. McLeod (1977)
had participants perform a continuous tracking task with manual responding together with a tone-identification task. Some participants responded
vocally to the tones whereas others responded with the hand not involved
in tracking. Tracking performance was worse with high response similarity
(manual responses on both tasks) than with low response similarity.
Practice is the most important factor determining how well two tasks
can be performed together. The saying “Practice makes perfect” was
apparently supported by Spelke et al. (1976). Two students (Diane and
John) received 5 hours of training a week for 4 months on various tasks.
Their first task involved reading short stories for comprehension while
writing down words from dictation, which they initially found very hard.
After 6 weeks of training, however, they could read as rapidly and with
as much comprehension when writing to dictation as when only reading.
With further training, Diane and John learned to write down the names
of the categories to which the dictated words belonged while maintaining
normal reading speed and comprehension.
Spelke et al.’s (1976) findings are hard to interpret for various reasons.
First, Spelke et al. focused on accuracy measures, which are typically less
sensitive to dual-task interference than speed measures. Second, Diane and
John’s attentional focus was relatively uncontrolled, and so they may have
alternated attention between tasks rather than attending to both at the
same time. More controlled research on the effects of practice on dual-task
performance is discussed later.
IN THE REAL WORLD: MULTI-TASKING
What are the effects of frequent multi-tasking in our everyday lives? Two main answers have been
proposed. First, heavy multi-tasking may impair cognitive control because it leads individuals to
allocate their attentional resources too widely. This is the scattered attention hypothesis (van der
Schuur et al., 2015).
Second, heavy multi-tasking may enhance some control processes (e.g., task switching) because
of prolonged practice in processing multiple streams of information. This is the trained attention
hypothesis (van der Schuur et al., 2015). The relevant evidence is very inconsistent – “positive,
negative, and null effects have all been reported” (Uncapher & Wagner, 2018, p. 9894).
Ophir et al. (2009) used a questionnaire (the Media Multitasking Index) to identify levels of
multi-tasking. Heavy multi-taskers were more distractible. In a review, van der Schuur et al. (2015)
found findings supported the scattered attention hypothesis (e.g., heavy multi-taskers had impaired
sustained attention).
Moisala et al. (2016) found heavy multi-taskers were more adversely affected than light
­multi-taskers by distracting stimuli while performing speech–listening and reading tasks. During
distraction, the heavy multi-taskers had greater activity than the light multi-taskers in the right prefrontal cortex (associated with attentional control). This suggests heavy multi-taskers have greater
problems than previously believed – their performance is impaired even though they try harder to
exert top-down attentional control.
Uncapher and Wagner (2018) found in a review that most research indicated negative effects
of heavy multi-tasking on tasks involving working memory, long-term memory, sustained attention and relational reasoning. These negative effects are likely to be due to attentional lapses.
Of relevance, there are several studies where media multi-tasking was positively associated with
self-reported everyday attentional failures. In addition, heavy multi-taskers often report high impulsivity – such individuals often make rapid decisions based on very limited evidence.
Most studies have only found an association between media multi-tasking and measures of
attention and performance. This makes it hard to establish causality – it is possible individuals with
certain patterns of attention choose to engage in extensive multi-tasking. Evidence suggesting
that media multi-tasking can cause attention problems was reported by Baumgartner et al. (2018).
They found that high media multi-tasking at one point in time predicted attention problems several
months later.
Serial vs parallel processing
When individuals perform two tasks together, they might use serial or parallel processing. Serial processing involves switching attention backwards
and forwards between two tasks with only one task being processed at any
given moment. In contrast, parallel processing involves processing both
tasks at the same time.
There has been much theoretical controversy on the issue of serial vs
parallel processing in dual-task conditions (Koch et al., 2018). Of importance, processing can be mostly parallel or mostly serial. Lehle et al. (2009)
trained participants to use serial or parallel processing when performing two tasks together. Those using serial processing performed better.
However, they found the tasks more effortful because they had to inhibit
processing of one task while performing the other one.
Lehle and Hübner (2009) also instructed participants to perform two
tasks together in a serial or parallel fashion. Those using parallel processing
performed much worse. Fischer and Plessow (2015) reviewed dual-task
research and concluded: “While serial task processing appears to be the
most efficient [dual-task] processing strategy, participants are able to adopt
parallel processing. Moreover, parallel processing can even outperform
serial processing under certain conditions” (p. 8).
Brüning and Manzey (2018) confirmed serial processing is not always
more efficient than parallel processing. Participants performed many alternate trials on two different tasks but could see the stimulus for the next
trial ahead of time. Participants engaging in parallel processing (processing the stimulus for trial n+1 during trial n) performed better than those
using only serial processing (not processing the trial n+1 stimulus ahead of
time). Parallel processing reduced the costs incurred when task switching.
Individuals high in working memory capacity (see Glossary) were more
likely to use parallel processing, perhaps because of their superior attentional control.
IN THE REAL WORLD: CAN WE THINK AND DRIVE?
Car driving is the riskiest activity engaged in by tens of millions of adults. Over 50 countries have
laws restricting the use of mobile or cell phones by drivers to increase car safety. Are such restrictions necessary? The short answer is “Yes” – drivers using a mobile phone are several times more
likely to be involved in a car accident (Nurullah, 2015). This is so even though drivers try to reduce
the risks by driving slightly more slowly (reducing speed by 5–6 mph) than usual shortly after initiating a mobile-phone call (Farmer et al., 2015).
Caird et al. (2008) in a review of studies using simulated driving tasks reported that reaction
times to events (e.g., onset of brake lights on the car in front) increased by 250 ms with mobile-phone use and were greater when drivers were talking rather than listening. This 250 ms increase
in reaction time translates into travelling an extra 18 feet (5.5 metres) before stopping for a motorist doing 50 mph (80 kph). This could be the difference between stopping just short of a child or
killing that child.
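The arithmetic behind that figure is simple, as the snippet below shows (the 0.25 s delay and 50 mph speed come from the studies just described; the unit conversions are standard).

```python
def extra_stopping_distance_ft(delay_s, speed_mph):
    """Distance travelled during an added reaction-time delay."""
    feet_per_second = speed_mph * 5280 / 3600  # mph to ft/s
    return delay_s * feet_per_second

extra_ft = extra_stopping_distance_ft(delay_s=0.25, speed_mph=50)
print(f"{extra_ft:.1f} ft ({extra_ft * 0.3048:.1f} m)")  # ~18.3 ft (~5.6 m)
```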
Strayer and Drews (2007) studied the above slowing effect using event-related potentials while
drivers responded rapidly to the onset of brake lights on the car in front. The magnitude of the
P300 (a positive wave associated with attention) was reduced by 50% in mobile-phone users.
Strayer et al. (2011) considered a real-life driving situation. Drivers were observed to see whether
they obeyed a law requiring them to stop at a road junction. Of drivers not using a mobile phone,
79% obeyed the law compared to only 25% of mobile-phone users.
Theoretical considerations
Why do so many drivers endanger people’s lives by using mobile phones? Most believe they
can drive safely while using a mobile phone whereas other drivers cannot (Sanbonmatsu et al.,
2016b). Their misplaced confidence depends on limited monitoring of their driving performance:
drivers using a mobile phone make more driving errors but do not remember making more errors
(Sanbonmatsu et al., 2016a).
Why does mobile-phone use impair driving performance? Strayer and Fisher (2016) in their
SPIDER model identified five cognitive processes that are adversely affected when drivers’ attention
is diverted from driving (e.g., by mobile-phone use):
(1) There is less effective visual scanning of the environment for potential threats. Distracted drivers
are more inclined to focus attention on the centre of the road and less inclined to scan objects
in the periphery and their side mirrors (Strayer & Fisher, 2016).
(2) The ability to predict where threats might occur is impaired. Distracted drivers are much less
likely to make anticipatory glances towards the location of a potential hazard (e.g., obstructed
view of a pedestrian crossing) (Taylor et al., 2015).
(3) There is reduced ability to identify visible threats, a phenomenon known as inattentional blindness (see Glossary and Chapter 4). In a study by Strayer and Drews (2007), 30 objects (e.g.,
pedestrians; advertising hoardings) were clearly visible to drivers. However, those using a
mobile phone subsequently recognised far fewer objects they had fixated than those not using
a mobile phone (under 25% vs 50%, respectively).
(4) It is harder to decide what action is necessary in a threatening situation. Cooper et al. (2009)
found drivers were 11% more likely to make unsafe lane changes when using a mobile
phone.
(5) It becomes harder to execute the appropriate action. Reaction times are slowed (Caird et al.,
2008, discussed above, p. 214).
The SPIDER model is oversimplified in several ways. First, various different activities are associated
with mobile-phone use. Simmons et al. (2016) found in a meta-analytic review that the risk of safety-critical events was increased by activities requiring drivers to take their eyes off the road (e.g., locating a phone; dialling; texting). However, talking on a mobile phone did not increase risk.
Second, driving-irrelevant cognitive activities do not always impair all aspects of driving performance. Engstrom et al. (2017, p. 734) proposed their cognitive control hypothesis: “Cognitive
load selectively impairs driving sub-tasks that rely on cognitive control but leaves automatic performance unaffected.” For example, driving-irrelevant activities involving cognitive load (e.g., mobile-phone use) typically have no adverse effect on well-practised driving skills, such as lane keeping
and braking when getting close to the vehicle in front (Engstrom et al., 2017).
Third, individuals using mobile phones while driving are unrepresentative of drivers in general
(e.g., they tend to be relatively young and to engage in more risk-taking activities: Precht et al.,
2017). Thus, we must consider individual differences in personality and risk taking when interpreting accidents associated with mobile-phone use.
Fourth, the SPIDER model implies that performance cannot be improved by adding a secondary
task. However, driving performance in monotonous conditions is sometimes better when drivers
listen to the radio at the same time (see Engstrom et al., 2017). Listening to the radio can reduce
the mind-wandering that occurs when someone drives in monotonous conditions. Drivers indicating their immediate thoughts during their daily commute reported mind-wandering 63% of the
time and active focus on driving only 15%–20% of the time (Burdett et al., 2018).
Multiple resource theory
Wickens (1984, 2008) argued in his multiple resource model that the processing system consists of several independent processing resources or
mechanisms. The model includes four major dimensions (see Figure 5.16):
(1) Processing stages: there are successive stages of perception, cognition (e.g., working memory) and responding.
(2) Processing codes: perception, cognition and responding can use spatial and/or verbal codes; action can involve speech (vocal verbal) or manual/spatial responses.
(3) Modalities: perception can involve visual and/or auditory resources.
(4) Visual channels: visual processing can be focal (high acuity) or ambient (peripheral).

Figure 5.16
Wickens’s four-dimensional multiple resource model. The details are described in the text.
From Wickens (2008). © 2008. Reprinted by permission of SAGE Publications.
Here is the model’s crucial prediction: “To the extent that two tasks use different levels along each of the three dimensions [excluding (4) above], time-sharing [dual-task performance] will be better” (Wickens, 2008, p. 450). Thus,
tasks requiring different resources can be performed together more successfully than those requiring the same resources. Wickens’s approach bears
some resemblance to Baddeley’s (e.g., 2012) working memory model (see
Chapter 6). According to that model, two tasks can be performed together
successfully provided they use different components or processing resources.
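The prediction can be caricatured in a few lines of code. The sketch below is our simplification rather than Wickens’s formal model: it simply counts how many dimensions two tasks share, on the assumption that more shared levels mean worse time-sharing.

```python
def shared_dimensions(task_a, task_b):
    """Count the dimensions on which two tasks demand the same level;
    higher counts predict greater dual-task interference."""
    return sum(task_a[d] == task_b[d] for d in ("stage", "code", "modality"))

# Rough characterisations of McLeod's (1977) tasks (our coding, not his):
tracking = {"stage": "responding", "code": "spatial", "modality": "visual"}
tones_vocal = {"stage": "responding", "code": "verbal", "modality": "auditory"}
tones_manual = {"stage": "responding", "code": "spatial", "modality": "auditory"}

print(shared_dimensions(tracking, tones_vocal))   # 1: better time-sharing
print(shared_dimensions(tracking, tones_manual))  # 2: worse time-sharing
```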
Findings
Research discussed earlier (Treisman & Davies, 1973; McLeod, 1977)
showing the negative effects of stimulus and response similarity on performance is entirely consistent with the theory. Lu et al. (2013) reviewed
research where an ongoing visual-motor task (e.g., car driving) was
performed together with an interrupting task in the visual, auditory or
tactile (touch) modality. As predicted, non-visual interrupting tasks (especially those in the tactile modality) were processed more effectively than
visual ones and there were no adverse effects on the visual-motor task.
According to the model, there should be only limited dual-task interference between two visual tasks if one requires focal or foveal vision and the other requires ambient or peripheral vision. Tsang and Chan
(2018) obtained support for this prediction in a study in which participants
tracked a moving target in focal vision while responding to a spatial task
in ambient or peripheral vision.
Dual-task performance is often more impaired than predicted by
the theory. For example, consider a study by Robbins et al. (1996; see
Chapter 6). The main task was selecting chess moves and we will focus on
the condition where the task performed at the same time was generating
random letters. These two tasks involve different processing codes (spatial
vs verbal, respectively) and they also involve different response types
(manual vs vocal, respectively). Nevertheless, generating random letters
caused substantial interference on the chess task.
Evaluation
The main assumptions of the theory have largely been supported by the
experimental evidence. In other words, dual-task performance is generally
less impaired when two tasks differ with respect to modalities, processing
codes or visual channels than when they do not.
What are the model’s limitations?
(1) Successful dual-task performance often requires higher-level processes of coordinating and organising the demands of the two tasks (see later section on cognitive neuroscience, pp. 220–222). However, these processes are de-emphasised within the theory.
(2) The theory’s assumption there is a sequence of processing stages (perception; cognition; responding) is too rigid given the flexible nature of much dual-task processing (Koch et al., 2018). The numerous forms of cognitive processing intervening between perception and responding are not discussed in detail.
(3) It is implied within the theory that negative or interfering effects of performing two tasks together would be constantly present. However, Steinborn and Huestegge (2017) found dual-task conditions led only to occasional performance breakdown due to attention failures.
Threaded cognition
Salvucci and Taatgen (2008, 2011) proposed a model of threaded cognition in which streams of thought are represented as threads of processing.
For example, processing two tasks might involve two separate threads. The
central theoretical assumptions are as follows:
Multiple threads or goals can be active at the same time, and as long as
there is no overlap in the cognitive resources needed by these threads,
there is no multi-tasking interference. When threads require the same
resource at the same time, one thread must wait and its performance
will be adversely affected.
(Salvucci & Taatgen, 2011, p. 228)
This is because all resources have limited capacity.
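These assumptions translate naturally into a small scheduling sketch. The code below is a minimal illustration only: the resource names loosely follow Figure 5.17, the durations and the two "tasks" are invented, and booking one thread before the other is a crude stand-in for full interleaving. A thread claims each resource as soon as it is free ("greedily") and releases it when its step ends ("politely"); a step must wait whenever another thread holds the resource it needs.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    free_at: float = 0.0  # time at which this resource next becomes free

def run_thread(steps, resources, t=0.0):
    """Book one thread's steps onto the shared timeline. Each step is
    (resource_name, duration): the thread claims the resource as soon as
    it is free, waits if another thread holds it, then releases it."""
    for name, duration in steps:
        resource = resources[name]
        start = max(t, resource.free_at)  # wait if the resource is busy
        t = start + duration
        resource.free_at = t
    return t  # completion time of this thread

# Invented resources and step timings (seconds), loosely after Figure 5.17.
resources = {name: Resource(name) for name in ("visual", "manual", "declarative")}
driving = [("visual", 0.3), ("manual", 0.2), ("visual", 0.3)]
dialling = [("declarative", 0.2), ("manual", 0.2)]

# Driving is booked first, so it has priority; dialling's declarative step
# runs in parallel, but its manual step waits until driving releases "manual".
print(run_thread(driving, resources))   # 0.8
print(run_thread(dialling, resources))  # 0.7 (0.2 declarative + wait + 0.2)
```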
Taatgen (2011) discussed the threaded cognition model (see
Figure 5.17). Several cognitive resources can be the source of competition
between two tasks. These include visual perception, declarative memory,
task control and focal working memory or problem state. Nijboer et al.
(2016a) discussed similarities between this model and Baddeley’s working
memory model (see Chapter 6). Three components of the model relate to
working memory: (1) problem state (attentional focus); (2) declarative memory (activated short-term memory); and (3) subvocal rehearsal (resembling the phonological loop; see Chapter 6).

Figure 5.17
Threaded cognition theory. We possess several cognitive resources (e.g., declarative memory, task control, visual perception). These resources can be used in parallel but each resource can only work on one task at a time. Our ability to perform two tasks at the same time (e.g., driving and dialling, subtraction and typing) depends on the precise ways in which cognitive resources need to be used. The theory also identifies some of the brain areas associated with cognitive resources.
From Taatgen (2011). With permission of the author.
Each thread or task controls resources
in a greedy, polite way – threads claim
resources greedily when required but release
them politely when no longer needed. These
aspects of the model lead to one of its most
original assumptions – several goals (each
associated with a given thread) can be active
simultaneously.
The model resembles Wickens’s multiple
resource model: both models assume there
are several independent processing resources.
However, only the threaded cognition model
led to a computational model making specific predictions. In addition, the threaded
cognition model identifies the brain areas
associated with each processing resource (see
Figure 5.17).
Findings
According to the model, any given cognitive resource (e.g., visual perception; focal working memory) can be used by only one process at any given time. Nijboer et al. (2013)
tested this assumption using multi-column subtraction as the primary task
with participants responding using a keypad. Easy and hard conditions differed in whether digits were carried over (“borrowed”) from one column to
the next:
(1: easy)   336789495 – 224578381
(2: hard)   3649772514 – 1852983463
The model predicts focal working memory is required only in the hard condition. Subtraction was combined with a secondary task: a tracking task
involving visual and manual resources or a tone-counting task involving
working memory.
Nijboer et al. (2013) predicted performance on the easy subtraction
task should be worse when combined with the tracking task because both
compete for visual and manual resources. In contrast, performance on
the hard subtraction task should be worse when combined with the tone-counting task because there are large disruptive effects when two tasks
compete for working memory resources. The findings were as predicted.
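On this account, whether a problem loads focal working memory reduces to whether any column of the subtraction requires borrowing, which is easy to check mechanically, as this sketch shows for the two example problems.

```python
def needs_borrowing(minuend: int, subtrahend: int) -> bool:
    """True if column-by-column subtraction requires 'borrowing', the
    feature that (on the model) loads focal working memory. Any borrow
    chain must start in a column where the top digit is strictly smaller."""
    top = str(minuend)
    bottom = str(subtrahend).rjust(len(top), "0")
    return any(int(t) < int(b) for t, b in zip(top, bottom))

print(needs_borrowing(336789495, 224578381))    # False: the easy condition
print(needs_borrowing(3649772514, 1852983463))  # True: the hard condition
```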
Borst et al. (2013) found there was far less impairment of hard subtraction performance by a secondary task requiring working memory when
participants saw a visual sign explicitly indicating that “borrowing” was
needed. This supports the model’s assumption that dual-task performance
can be enhanced by appropriate environmental support.
According to the threaded cognition model, we often cope with the
demands of combining two tasks by switching flexibly between them to
maximise performance. Support was reported by Farmer et al. (2018).
Participants performed a typing task and a tracking task at the same time.
The relative value of the two tasks was varied by manipulating the number
of points lost for poor tracking performance. Participants rapidly learned
to adjust their strategies over time to increase the overall number of points
they gained.
Katidioti and Taatgen (2014) found task switching is not always
optimal. Participants performed two tasks together: (1) an email task in
which information needed to be looked up; (2) chat messages containing
questions to be answered. When there was a delay on the email task, most
participants switched to the chat task. This happened even when this was
suboptimal because it caused participants to forget information in the
email task.
How can we explain the above findings? According to Katidioti and
Taatgen (2014, p. 734), “The results . . . agree with threaded cognition’s
‘greedy’ theory . . . which states that people will switch to a task that is
waiting as soon as the resources for it are available.” Huijser et al. (2018)
obtained further evidence of “greediness”. When there were brief blank
periods during the performance of a cognitively demanding task, participants often had task-irrelevant thoughts (e.g., mind-wandering) even
though these thoughts impaired task performance.
Katidioti and Taatgen (2014) also discovered substantial individual
differences in task switching – some participants never switched to the chat
task when delays occurred on the email task. Such individual differences
cannot be explained by the theory.
As mentioned earlier, a recent version of threaded cognition theory
discussed by Nijboer et al. (2016a) identifies three components of working memory
(i.e., problem state or focus of attention; declarative memory or activated
short-term memory; and subvocal rehearsal). Nijboer et al. had participants perform two working memory tasks at the same time; these tasks
varied in the extent to which they required the same working memory components. They obtained measures of performance and also used neuroimaging under dual-task and single-task conditions.
What did Nijboer et al. (2016a) find? First, dual-task interference
could be predicted from the extent to which the two tasks involved the
same working memory components. Second, dual-task interference could
also be predicted from the extent of overlap in brain activation of the two
tasks in single-task conditions. In sum, dual-task interference depended
on competition for specific resources (i.e., working memory components)
rather than general resources (e.g., central executive).
Evaluation
The model has proved successful in various ways. First, several important
cognitive resources have been identified. Second, the model identifies brain
areas associated with various cognitive resources. This has led to computational modelling testing the model’s predictions using neuroimaging and
behavioural findings. Third, the model accounts for dual-task performance
without assuming the existence of a central executive or other executive
process (often vaguely defined in other theories). Fourth, the theory predicts
factors determining switching between two tasks being performed together.
Fifth, individuals often have fewer problems performing two simultaneous
tasks than generally assumed.
What are the model’s limitations? First, it predicts that “Practising two
tasks concurrently [together] results in the same performance as performing the two tasks independently” (Salvucci & Taatgen, 2008, p. 127). This
de-emphasises the importance of processes coordinating and managing two
tasks performed together (see next section). Second, excluding processes
resembling Baddeley’s central executive is controversial and may well
prove inadvisable. Third, most tests of the model have involved the simultaneous performance of two relatively simple tasks and its applicability to
more complex tasks remains unclear. Fourth, the theory does not provide
a full explanation for individual differences in the extent of task switching
(e.g., Katidioti & Taatgen, 2014).
Cognitive neuroscience
The cognitive neuroscience approach is increasingly used to test theoretical
models and enhance our understanding of processes underlying dual-task
performance. Its value is that neuroimaging provides “an additional data
source for contrasting between alternative models” (Palmeri et al., 2017,
p. 61). More generally, behavioural findings indicate the extent to which
dual-task conditions impair task performance but are often relatively uninformative about the precise reasons for such impairment.
Suppose we compare patterns of brain activation while participants
perform tasks x and y singly or together. Three basic patterns are shown
in Figure 5.18:
(1) Underadditive activation: reduced activation in one or more brain areas in the dual-task condition occurs because of resource competition between the tasks.
(2) Additive activation: brain activation in the dual-task condition is simply the sum of the two single-task activations because access to resources is integrated efficiently between the two tasks.
(3) Overadditive activation: brain activation in one or more brain areas is present in the dual-task condition but not the single-task conditions. This occurs when dual-task conditions require executive processes that are absent (or less important) with single tasks. These executive processes include the coordination of task demands, attentional control and dual-task management generally. We would expect such executive processes to be associated mostly with activation in prefrontal cortex.

Figure 5.18
(a) Underadditive activation; (b) additive activation; (c) overadditive activation. White indicates task 1 activation; grey indicates task 2 activation; and black indicates activation only present in dual-task conditions. From Nijboer et al., 2014. Reprinted with permission of Elsevier.
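As a toy numerical illustration of the three patterns (the activation values below are invented, and the spatial pattern of activation is collapsed into a single magnitude per condition, whereas real analyses compare activation maps voxel by voxel):

    def classify(act_task1, act_task2, act_dual, tol=0.05):
        """Compare dual-task activation with the sum of single-task activations."""
        expected = act_task1 + act_task2
        if act_dual < expected * (1 - tol):
            return "underadditive: resource competition"
        if act_dual > expected * (1 + tol):
            return "overadditive: extra executive demands"
        return "additive: resources integrated efficiently"

    print(classify(10.0, 8.0, 13.0))  # underadditive
    print(classify(10.0, 8.0, 18.0))  # additive
    print(classify(10.0, 8.0, 24.0))  # overadditive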
KEY TERM
Underadditivity
The finding that brain activation when tasks A and B are performed at the same time is less than the sum of the brain activation when tasks A and B are performed separately.
Findings
We start with an example of underadditive activation. Just et al. (2001)
used two very different tasks (auditory sentence comprehension and mental
rotation of 3-D figures) performed together or singly. Performance on both
tasks was impaired under dual-task conditions compared to single-task
conditions. Under dual-task conditions, brain activation decreased by 53% in language-processing areas and by 29% in areas associated with mental rotation. These findings suggest fewer task-relevant processing
resources were available when both tasks were performed together.
Schweizer et al. (2013) also reported underadditivity. Participants
performed a driving task on its own or with a distracting secondary task
(answering spoken questions). Driving performance was unaffected by
the secondary task. However, driving with distraction reduced activation in posterior brain areas associated with spatial and visual processing
(underadditivity). It also produced increased activation in the prefrontal
cortex (overadditivity; see Figure 5.19) probably because driving with distraction requires increased attentional or cognitive control within the prefrontal cortex.
Dual-task performance is often associated with overadditivity due to
increased activity within the prefrontal cortex (especially the lateral prefrontal cortex) during dual-task performance (see Strobach et al., 2018, for
a review). However, most such findings do not show that this increased
prefrontal activation is actually required for dual-task performance.
More direct evidence that prefrontal areas associated with attentional
or cognitive control are causally involved in enhancing dual-task performance was reported by Filmer et al. (2017) and Strobach et al. (2018).
Filmer et al. (2017) studied the effects of transcranial direct current stimulation (tDCS; see Glossary) applied to areas of the prefrontal cortex
associated with cognitive control. Anodal tDCS during training enhanced
cognitive control and subsequent dual-task performance.
Figure 5.19
Effects of an audio distraction task on brain activity associated with a straight driving task. There were significant increases in activation within the ventrolateral prefrontal cortex and the auditory cortex (in orange). There was decreased activation in occipital-visual areas (in blue). From Schweizer et al. (2013).
Strobach et al. (2018) reported similar findings. Anodal tDCS applied
to the lateral prefrontal cortex led to enhanced dual-task performance.
In another condition, cathodal tDCS to the same area of the prefrontal
cortex impaired dual-task performance. These findings were as predicted
given that anodal and cathodal tDCS often have opposite effects on performance. These findings indicate that the lateral prefrontal cortex causally
influences dual-task performance.
Additional evidence of the importance of the lateral prefrontal cortex
was reported by Wen et al. (2018). Individuals with high connectivity (connectedness) within that brain area showed superior dual-task performance
to those with low connectivity.
Finally, patterns of brain activation can help to explain practice effects
on dual-task performance. Garner and Dux (2015) found much fronto-parietal activation (associated with cognitive control) when two tasks were
performed singly or together. Extensive training greatly reduced dual-task
interference and also produced increasing differentiation in the pattern of
fronto-parietal activation associated with the two tasks. Participants showing
the greatest reduction in dual-task interference tended to have the greatest
increase in differentiation. Thus, using practice to increase differences in processing and associated brain activation between tasks can be very effective.
Evaluation
Brain activity in dual-task conditions often differs from the sum of brain
activity of the same two tasks performed singly. Dual-task activity can
exhibit underadditivity or overadditivity. The findings are theoretically
important because they indicate performance of dual tasks can involve
much more cognitive control and other processes than single tasks. Garner
and Dux’s (2015) findings demonstrate that enhanced dual-task performance with practice can depend on increased differentiation between the
two tasks with respect to processing and brain activation.
What are the limitations of the cognitive neuroscience approach?
First, increased (or decreased) activity in a given brain area in dual-task
conditions is not necessarily very informative. For example, Dux et al.
(2009) found dual-task performance improved over time because practice increased the speed of information processing in the prefrontal cortex
rather than because it changed activation within that region. Second, it is
often unclear whether patterns of brain activation are directly relevant to
task processing rather than reflecting non-task processing.
Third, findings in this area are rather inconsistent (Strobach et al., 2018)
and we lack a comprehensive theory to account for these inconsistencies.
Plausible reasons for these apparent inconsistencies are the great variety of
task combinations used in dual-task studies and individual differences in
task proficiency among participants (Watanabe & Funahashi, 2018).
Psychological refractory period: cognitive bottleneck?
Much of the research discussed so far was limited because the task combinations used made it hard to assess in detail the processes used by participants. For example, the data collected were often insufficient to indicate
the frequency with which participants switched their attentional focus from
one task to the other. This led researchers to use much simpler tasks so that
they had “better experimental control over the timing of the component
task processes” (Koch et al., 2018, p. 575).
The dominant paradigm in recent research is as follows. There are two
stimuli (e.g., two lights) and two responses (e.g., button presses), one associated with each stimulus. Participants respond to each stimulus as rapidly
as possible. When the two stimuli are presented at the same time (dual-task
condition), performance is typically worse on both tasks than when each
task is presented separately (single-task conditions).
When the second stimulus is presented shortly after the first, there
is typically a marked slowing of the response to the second stimulus: the psychological refractory period (PRP) effect. This effect is robust –
Ruthruff et al. (2009) obtained a large PRP effect even when participants
were given strong incentives to eliminate it.
The PRP effect has direct real-world relevance. Hibberd et al. (2013)
studied the effects of a simple in-vehicle task on braking performance when
the vehicle in front braked and slowed down. There was a classic PRP
effect – braking time was slowest when the in-vehicle task was presented
immediately before the vehicle in front braked.
How can we explain the PRP effect? It is often argued task performance involves three successive stages: (1) perceptual; (2) central response
selection; and (3) response execution. According to the bottleneck model
(e.g., Pashler, 1994),
The response selection stage of the second task cannot begin until the response selection stage of the first task has finished, although the other stages . . . can proceed in parallel . . . according to this model, the PRP effect is a consequence of the waiting time of the second task because of a bottleneck at the response selection stage.
(Mittelstädt & Miller, 2017, p. 89)

KEY TERMS
Psychological refractory period (PRP) effect
The slowing of the response to the second of two stimuli when presented close together in time.
Stimulus onset asynchrony (SOA)
Time interval between the start of two stimuli.
The bottleneck model explains several findings. For example, consider the
effects of varying the time between the start of the first and second stimuli
(stimulus onset asynchrony (SOA)). According to the model, processing on
the first task should slow down second-task processing much more when
the SOA is small than when it is larger. The predicted finding is generally
obtained (Mittelstädt & Miller, 2017).
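The SOA prediction follows directly from the bottleneck's timing, as the sketch below shows (the stage durations in milliseconds are invented for illustration; only response selection is assumed to be serial):

    # Sketch of central-bottleneck timing: Task 2's response selection cannot
    # start until Task 1's has finished. Stage durations (ms) are assumptions.
    def rt2(soa, perceive=100, select=150, execute=100):
        bottleneck_free = perceive + select    # Task 1 releases the bottleneck
        t2_perceived = soa + perceive          # Task 2 perception runs in parallel
        select2_start = max(t2_perceived, bottleneck_free)
        return select2_start + select + execute - soa  # RT from stimulus 2 onset

    for soa in (0, 50, 100, 200, 400):
        print(soa, rt2(soa))  # RT2 is greatly slowed at short SOAs, then flattens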
The bottleneck model remains the most influential explanation of the
PRP effect (and other dual-task costs). However, resource models (e.g.,
Navon & Miller, 2002) are also influential. According to these models,
limited processing capacity can be shared between two tasks so both are
processed simultaneously. Of crucial importance, sharing is possible even
during the response selection process. A consequence of sharing processing
capacity across tasks is that each task is processed more slowly than if performed on its own.
Many findings can be explained by both models. However, resource
models are more flexible than bottleneck models. Why is this? Resource
models assume the division of processing resources between two tasks
varies freely to promote efficient performance.
Another factor influencing the PRP effect is crosstalk (the two tasks
interfere directly with each other). This mostly occurs when the stimuli
and/or responses on the two tasks are similar. A classic example of crosstalk is when you try to rub your stomach in circles with one hand while
patting your head with the other hand (try it!).
Finally, note that participants in most studies receive only modest
amounts of practice in performing two tasks at the same time. As a consequence, the PRP effect may occur at least in part because participants
receive insufficient practice to eliminate it.
KEY TERM
Crosstalk
In dual-task conditions, the direct interference between the tasks that is sometimes found.
Findings
According to the bottleneck model, we would expect to find a PRP effect
even when easy tasks are used and/or participants receive prolonged practice. Contrary evidence was reported by Schumacher et al. (2001). They
used two tasks: (1) say “one”, “two” or “three” to low-, medium- and
high-pitched tones, respectively; (2) press response keys corresponding to
the position of a disc on a computer screen. These tasks were performed
together for over 2,000 trials, by which time some participants performed
them as well together as singly.
Strobach et al. (2013) conducted a study very similar to that of
Schumacher et al. (2001). Participants took part in over 5,000 trials involving single-task or dual-task conditions. However, dual-task costs were not
eliminated after extensive practice: dual-task costs on the auditory task
reduced from 185 to 60 ms and those on the visual task from 83 to 20 ms
(see Figure 5.20). How did dual-task practice benefit performance? Practice
speeded up the central response selection stage in both tasks.
Why did the findings differ in the two studies discussed above? In both
studies, participants were rewarded for fast responding on single-task and
dual-task trials. However, the way the reward system was set up in the
Schumacher et al. study may have led participants to exert more effort in
dual-task than single-task trials. This potential bias was absent from the
Strobach et al. study. This difference in reward structure could explain the greater dual-task costs in the Strobach et al. study.

Figure 5.20
Reaction times for correct responses only over eight experimental sessions under dual-task (auditory and visual tasks) and single-task (auditory or visual task) conditions. From Strobach et al. (2013). Reprinted with permission of Springer.
Hesselmann et al. (2011) studied the PRP effect using event-related
potentials. The slowing of responses on the second task was closely
matched by slowing in the onset of the P300 (an ERP component reflecting
response selection). However, there was no slowing of earlier ERP components reflecting perceptual processing. Thus, as predicted by the bottleneck
model, the PRP effect depended on response selection rather than perceptual processes.
According to the resource model approach, individuals choose whether
to use serial or parallel processing on PRP tasks. Miller et al. (2009) argued
that serial processing generally leads to superior performance compared
with parallel processing. However, parallel processing should theoretically
be superior when the stimuli associated with the two tasks are mostly presented close in time. As predicted, there was a shift from predominantly
serial processing towards parallel processing when that was the case.
Miller et al. (2009) used very simple tasks, and parallel processing is most likely to be used with such tasks. Han and Marois (2013)
used two tasks, one of which was relatively difficult. Participants used
serial processing even when parallel processing was encouraged by financial rewards.
Finally, we consider the theoretically important backward crosstalk
effect: “characteristics of Task 2 of 2 subsequently performed tasks influence Task 1 performance” (Janczyk et al., 2018, p. 261). Hommel (1998)
obtained this effect. Participants responded to Task 1 by making a left or
right key-press and to Task 2 by saying “left” or “right”. Task 1 responses
were faster when the two responses were compatible (e.g., press right key
+ say “right”) than when they were incompatible (e.g., press right key +
say “left”). Evidence for the backward crosstalk effect was also reported by
Janczyk et al. (2018).
Why is the backward crosstalk effect theoretically important? It indicates that aspects of response selection processing on Task 2 occur before
response selection processing on Task 1 has finished. This effect is incompatible with the bottleneck model, which assumes response selection on
Task 1 is completed prior to any response selection on Task 2. In other
words, this model assumes there is serial processing at the response selection stage. In contrast, the backward crosstalk effect is compatible with the
resource model approach.
KEY TERM
Backward crosstalk effect
Aspects of Task 2 influence response selection and performance speed on Task 1 in studies on the psychological refractory period (PRP) effect.
Summary and conclusions
The findings from most research on the psychological refractory period
effect are consistent with the bottleneck model. As predicted, this effect is
typically larger when the second task follows very soon after the first task.
In addition, even prolonged practice rarely eliminates the psychological
refractory period effect suggesting that central response selection processes
typically occur serially.
The bottleneck model assumes processing is less flexible than is often
the case. For example, the existence of the backward crosstalk effect is
inconsistent with the bottleneck model but consistent with the resource
model approach. Fischer et al. (2018) also found evidence for much flexibility. There was less interference between the two tasks when financial rewards were offered because participants devoted more processing
resources to protecting the first task from interference. However, the
resource model approach has the disadvantage compared to the bottleneck
model that its predictions are less precise, making them harder to subject to detailed empirical testing.
Finally, as Koch et al. (2018, p. 575) pointed out, the bottleneck
model “can be applied (with huge success) mainly for conditions in which
two tasks are performed strictly sequentially”. This is often the case with
research on the psychological refractory period effect but is much less
applicable to more complex dual-task situations.
“AUTOMATIC” PROCESSING
We have seen in studies of divided attention that practice often causes a dramatic improvement in performance. This improvement has been explained
by assuming some processes become automatic through prolonged practice. For example, the huge amount of practice we have had with reading
words has led to the assumption that familiar words are read “automatically”. Below we consider various definitions of “automaticity”. We also
consider different approaches to explaining the development of automatic
processing.
Traditional approach: Shiffrin and Schneider (1977)
Shiffrin and Schneider (1977) and Schneider and Shiffrin (1977) distinguished between controlled and automatic processes:
● Controlled processes are of limited capacity, require attention and can be used flexibly in changing circumstances.
● Automatic processes suffer no capacity limitations, do not require attention and are very hard to modify once learned.
In Schneider and Shiffrin’s (1977) research, participants memorised letters
(the memory set) followed by a visual display containing letters. They
then decided rapidly whether any item in the visual display was the same
as any item in the memory set. The crucial manipulation was the type of
mapping. With consistent mapping, only consonants were used as members
of the memory set and only numbers were used as distractors in the visual
display (or vice versa). Thus, a participant given only consonants to memorise would know any consonant detected in the visual display was in the
memory set. With varied mapping, numbers and consonants were both used
to form the memory set and to provide distractors in the visual display.
The mapping manipulation had dramatic effects (see Figure 5.21). The
numbers of items in the memory set and visual display greatly affected
decision speed only with varied mapping. According to Schneider and
Shiffrin (1977), varied mapping involved serial comparisons between each
item in the memory set and each item in the visual display until a match
was achieved or every comparison had been made.
Figure 5.21
Response times on a decision task as a function of memory-set size, display-set size and consistent vs varied mapping. Data from Shiffrin and Schneider (1977). American Psychological Association.
In contrast, consistent mapping involved automatic processes operating independently and in parallel. These automatic processes have evolved through prolonged practice
in distinguishing between letters and numbers.
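The two mappings thus predict very different response-time functions, which can be sketched as follows (the intercept and slope values are invented for illustration):

    # Varied mapping: exhaustive serial comparison of every memory-set item
    # against every display item. Consistent mapping: parallel, "automatic"
    # processing that is largely independent of set sizes.
    def serial_rt(memory_set, display_set, base=400, per_comparison=40):
        return base + per_comparison * memory_set * display_set

    def parallel_rt(memory_set, display_set, base=450):
        return base  # roughly flat, whatever the set sizes

    for m, d in [(1, 1), (4, 1), (4, 4)]:
        print((m, d), serial_rt(m, d), parallel_rt(m, d))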
In a second experiment, Shiffrin and Schneider (1977) used consistent
mapping with the consonants B to L forming one set and Q to Z the other
set. As before, items from only one set always formed the memory set
with all the distractors in the visual display being selected from the other
set. Performance improved greatly over 2,100 trials, reflecting increased
automaticity. After that, there were 2,100 trials with the reverse consistent mapping (swapping over the memory and visual display sets). With
this reversal, it took nearly 1,000 trials before performance recovered to its
level at the start of the experiment!
Evidence that there may be limited (or no) conscious awareness in the consistent mapping condition was reported by Jansma et al. (2001).
Increasing automaticity (indexed by increased performance speed) was
accompanied by reduced activation in areas associated with conscious
awareness (e.g., dorsolateral prefrontal cortex).
In sum, automatic processes function rapidly and in parallel but are
inflexible (second part of the second experiment). Controlled processes are
flexible and versatile but operate relatively slowly and in a serial fashion.
Limitations
What are the limitations with this approach? First, the distinction between
automatic and controlled processes is oversimplified (discussed below).
Second, Shiffrin and Schneider (1977) argued automatic processes operate
in parallel and place no demands on attentional capacity and so decision
speed should be unrelated to the number of items. However, decision
speed was slower when the memory set and visual display both contained
several items (see Figure 5.21). Third, the theory is descriptive rather than explanatory – it does not explain how serial controlled processing turns into
parallel automatic processing.
Definitions of automaticity
Shiffrin and Schneider (1977) assumed there is a clear-cut distinction
between automatic and controlled processes. More specifically, automatic
processes possess several features (e.g., inflexibility; very efficient because
they have no capacity limitations; occurring in the absence of attention). In
essence, it is assumed there is perfect coherence or consistency among the
features (i.e., they are all found together).
Moors and De Houwer (2006) and Moors (2016) identified four key
features associated with automaticity:
(1) unconscious: lack of conscious awareness of at least one of the following: “the input, the output, and the transition from one to the other” (Moors, 2016, p. 265);
(2) efficient: using very little attentional capacity;
(3) fast;
(4) goal-unrelated or goal-uncontrolled: at least one of the following is missing: “the goal is absent, the desired state does not occur, or the causal relation [between the goal and the occurrence of the desired state] is absent” (Moors, 2016, p. 265).
Why might these four features (or the similar ones identified by Shiffrin
and Schneider (1977)) often be found together? Instance theory (Logan,
1988; Logan et al., 1999) provides an influential answer. It is assumed
task practice leads to storage of information in long-term memory which
facilitates subsequent performance on that task. In essence, “Automaticity
is memory retrieval: performance is automatic when it is based on a single-step direct-access retrieval of past solutions from memory” (Logan,
1988, p. 493). For example, if you were given the problem “24 × 7 = ???”
numerous times, you would retrieve the answer (168) “automatically”
without performing any mathematical calculations.
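A toy computational analogy (ours; Logan's actual theory is a race between the algorithmic and retrieval routes) is memoisation, where a stored instance lets a one-step lookup replace calculation:

    # Toy analogy for instance theory: practice stores instances in long-term
    # memory; later encounters retrieve the answer in a single step.
    instances = {}  # past solutions ("instances") in long-term memory

    def multiply(a, b):
        if (a, b) in instances:
            return instances[(a, b)]       # automatic: direct-access retrieval
        answer = sum(a for _ in range(b))  # slow algorithmic route
        instances[(a, b)] = answer         # practice lays down an instance
        return answer

    multiply(24, 7)         # first encounter: computed step by step
    print(multiply(24, 7))  # later encounters: retrieved directly -> 168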
Instance theory makes coherent sense of several characteristics of
automaticity. Automatic processes are fast because they require only the
retrieval of past solutions from long-term memory. They make few demands
on attentional resources because the retrieval of heavily over-learned information is relatively effortless. Finally, there is no conscious awareness
of automatic processes because no significant processes intervene between
stimulus presentation and retrieval of the correct response.
In spite of its strengths, instance theory is limited (see Moors, 2016).
First, the theory implies the key features of automaticity will typically all
be found together. However, this is not the case (see below). Second, it
is assumed practice leads to automatic retrieval of solutions with learners
having no control over such retrieval. However, Wilkins and Rawson (2011)
found evidence learners can exercise top-down control over retrieval: when
the instructions emphasised accuracy, there was less evidence of retrieval
than when they emphasised speed. Thus, the use of retrieval after practice
is not fully automatic.
Melnikoff and Bargh (2018) argued that the central problem with the
traditional approach is that no research has shown the four features associated with “automaticity” occurring together. As they pointed out, “No
attempt has been made to estimate the probability of a process being intentional given that it is conscious versus unconscious, or the probability of a
process being controllable given that it is efficient versus inefficient, and so
forth” (p. 282).
Decompositional approach: Moors (2016)
Moors and De Houwer (2006) and Moors (2016) argued that previous
theoretical approaches are greatly oversimplified. Instead, they favoured
a decompositional approach. According to this approach, the features
of automaticity are clearly separable and are by no means always found
together: “It is dangerous to draw inferences about the presence or absence
of one feature on the basis of the presence or absence of another” (Moors &
De Houwer, 2006, p. 320).
Moors and De Houwer (2006) also argued there is no firm dividing line
between automaticity and non-automaticity. The features are continuous
rather than all-or-none (e.g., a process can be fairly fast or slow; it can be
partially conscious). As a result, most processes involve a blend of automaticity and non-automaticity. This approach is rather imprecise because few
processes are 100% automatic or non-automatic. However, we can make
relative statements (e.g., process X is more/less automatic than process Y).
Moors (2016) claimed the relationships between factors such as goals,
attention and consciousness are much more complex than claimed within
traditional approaches to “automaticity”. This led her to develop a new theoretical account (see Figure 5.22).

Figure 5.22
Factors that are hypothesised to influence representational quality within Moors’ (2016) theoretical approach. From Moors (2016).

A key assumption is that all information
processes require an input of sufficient representational quality (defined by
the “intensity, duration, and distinctiveness of a representation”, Moors,
2016, p. 273).
What factors determine representational quality?
(1) current stimulus factors, including the extent to which a stimulus is expected or unexpected, familiar or novel, and goal congruent or incongruent;
(2) prior stimulus factors (e.g., the frequency and recency with which the current stimulus has been encountered);
(3) prior stimulus representation factors based on relevant information stored within long-term memory;
(4) attention, which enhances or amplifies the impact of current stimulus factors and prior stimulus representation factors on the current stimulus representation.
According to this theoretical account, the above factors influence representational quality additively so that a high level of one factor can compensate for a low level of another factor. For example, selective attention or
relevant information in long-term memory can compensate for brief stimulus presentations. The main impact of consciousness occurs later than
for other factors (e.g., attention and goal congruence). More specifically,
representational quality must reach the first threshold to permit unconscious processing but a more stringent second threshold to permit conscious
processing.
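Read as a sketch, the account sums factor contributions and compares the total against two thresholds. The weights and threshold values below are invented purely for illustration; the theory itself specifies no numbers:

    FIRST_THRESHOLD = 1.0    # minimum quality for unconscious processing
    SECOND_THRESHOLD = 2.0   # more stringent threshold for conscious processing

    def processing_mode(current_stimulus, prior_representation, attention):
        quality = current_stimulus + prior_representation + attention  # additive
        if quality >= SECOND_THRESHOLD:
            return "conscious processing"
        if quality >= FIRST_THRESHOLD:
            return "unconscious processing"
        return "no processing"

    # Strong attention compensates for a brief, weak stimulus:
    print(processing_mode(current_stimulus=0.4, prior_representation=0.3,
                          attention=1.5))   # -> conscious processing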
Findings
According to Moors’ (2016) theoretical framework, there is a flexible relationship between controlled and conscious processing. This contrasts with
Schneider and Shiffrin’s (1977) assumption that executive control is always
associated with conscious processing.
Diao et al. (2016) reported findings consistent with Moors’ prediction.
They used a Go/No-Go task where participants made a simple response
(Go trials) or withheld it (No-Go trials). High-value or low-value financial
rewards were available for successful performance. Task stimuli were presented above or below the level of conscious awareness.
What did Diao et al. (2016) find? Performance was better on high-reward than low-reward trials even when task processing was unconscious. In addition, participants showed superior unconscious inhibitory
control (assessed by event-related potentials) on high-reward trials. Thus,
one feature of automaticity (unconscious processing) was present whereas
another feature (goal-uncontrolled) was not.
Huber-Huber and Ansorge (2018) also reported problems for the traditional approach. Participants received target words indicating an upward
or downward direction (e.g., above; below). Prior to the target word, a
prime word also indicating an upward or downward direction was presented below the level of conscious awareness. Response times to the target
words were slower when there was a conflict between the meanings of the
prime and target words than when they were congruent in meaning. As in
the study by Diao et al. (2016), unconscious processing was combined with
control, a combination that is inconsistent with the traditional approach.
Evaluation
The theoretical approach to automaticity proposed by Moors (2016) has
several strengths. First, the assumption that various features associated with
automaticity often correlate poorly with each other is clearly superior to
the earlier notion that these features exhibit perfect coherence. Second, her
assumption that processes vary in the extent to which they are “automatic”
is much more realistic than the simplistic division of processes into automatic and non-automatic. Third, the approach is more comprehensive than
previous ones because it considers more factors relevant to “automaticity”.
What are the limitations with Moors’ (2016) approach? First, numerous factors are assumed to influence representational quality (and thus the
extent to which processes are automatic) (see Figure 5.22). It would thus
require large-scale experimental research to assess the ways all these factors
interact. Second, the approach provides only a partial explanation of the
underlying mechanisms causing the various factors to influence representational quality.
Interactive exercise: Definitions of attention
CHAPTER SUMMARY
• Focused auditory attention. When two auditory messages
are presented at the same time, there is less processing of the
unattended than the attended message. Nevertheless, unattended
messages often receive some semantic processing. The restricted
processing of unattended messages may reflect a bottleneck at
various stages of processing. However, theories assuming the
existence of a bottleneck de-emphasise the flexibility of selective
auditory attention. Attending to one voice among several (the
cocktail party problem) is a challenging task. Human listeners use
top-down and bottom-up processes to select one voice. Top-down processes include the use of various control processes
(e.g., focused attention; inhibitory processes) and learning about
structural consistencies present in the to-be-attended voice.
• Focused visual attention. Visual attention can resemble a
spotlight or zoom lens. In addition, the phenomenon of split
attention suggests visual attention can also resemble multiple
spotlights. However, accounts based on spotlights or a zoom
lens typically fail to specify the underlying mechanisms. Visual
attention can be object-based, space-based or feature-based, and
it is often object-based and space-based at the same time. Visual
attention is flexible and is influenced by factors such as individual
differences.
According to Lavie’s load theory, we are more susceptible
to distraction when our current task involves low perceptual load
and/or high cognitive load. There is much support for this theory.
However, the effects of perceptual and cognitive load are often
not independent as predicted. In addition, it is hard to test the
theory because the terms “perceptual load” and “cognitive load”
are vague. There are stimulus-driven ventral attention and goal-directed dorsal attention networks involving different (but partially
overlapping) brain networks. More research is required to establish
how these two attentional systems interact. Additional brain
networks (e.g., cingulo-opercular network; default mode network)
relevant to attention have also been identified.
• Disorders of visual attention. Neglect occurs when damage
to the ventral attention network in the right hemisphere impairs
the functioning of the undamaged dorsal attention network. This
impaired functioning of the dorsal attention network involves
reduced activation and alertness within the left hemisphere.
Extinction is due to biased competition for attention between the
two hemispheres combined with reduced attentional capacity.
More research is required to clarify differences among neglect
patients in their specific processing deficits (e.g., the extent to
which failures to detect left-field stimuli are due to impaired spatial
working memory).
• Visual search. One problem with airport security checks is that
there are numerous possible target objects. Another problem is
the rarity of targets, which leads to excessive caution in reporting
targets. According to feature integration theory, object features
are processed in parallel and then combined by focused attention
in visual search. This theory ignores our use of general scene
knowledge in everyday life to focus visual search on areas of the
scene most likely to contain the target object. It also exaggerates
the prevalence of serial processing. Contemporary approaches
emphasise the role of perception in visual search. Parallel
processing is very common because much information is typically
extracted from the peripheral visual field as well as from central or
foveal vision. Problems in visual search occur when there is visual
crowding in peripheral vision.
• Cross-modal effects. In the real world, we often coordinate
information across sense modalities. In the ventriloquist effect,
vision dominates sound because an object’s location is typically
indicated more precisely by vision. In the temporal ventriloquism
effect, sound dominates vision because the auditory modality is
typically more precise at discriminating temporal relations. Both
effects depend on the assumption that visual and auditory stimuli
come from the same object. Auditory or vibrotactile warning
signals that are informative about the direction of danger and/or
imminence of collision speed up drivers’ braking times. We lack
a theoretical framework within which to understand why some
warning signals are more effective than others.
• Divided attention: dual-task performance. Individuals engaging
in heavy multi-tasking show evidence of increased distractibility
and impaired attentional control. A demanding secondary task
(e.g., mobile-phone use) impairs aspects of driving performance
requiring cognitive control but not well-practised driving skills (e.g.,
lane keeping). Multiple resource theory and threaded cognition
theory both assume dual-task performance depends on several
limited-capacity processing resources. This permits two tasks to
be performed together successfully provided they use different
processing resources. This general approach de-emphasises high-level executive processes (e.g., monitoring and coordinating two
tasks).
Some neuroimaging studies have found underadditivity
in dual-task conditions (less activation than for the two tasks
performed separately). This may indicate people have limited
general processing resources. Other neuroimaging studies
have found dual-task conditions can introduce new processing
demands of task coordination associated with activation within the
dorsolateral prefrontal cortex and cerebellum. It is often unclear
whether patterns of brain activation are directly relevant to task
processing.
The psychological refractory period (PRP) effect can be
explained by a processing bottleneck during response selection.
This remains the most influential explanation. However,
some evidence supports resource models claiming parallel
processing of two tasks is often possible. Such models are more
flexible than bottleneck models and they provide an explanation
for interference effects from the second of two tasks on the first
one.
• “Automatic” processing. Shiffrin and Schneider distinguished
between slow, flexible controlled processes and fast, automatic
ones. This distinction is greatly oversimplified. Other theorists have
claimed automatic processes are unconscious, efficient, fast and
goal-unrelated. However, these four processing features are not
all-or-none and they often correlate poorly with each other. Thus,
there is no sharp distinction between automatic and non-automatic
processes. Moors’ (2016) decompositional approach plausibly
assumes that there is considerable flexibility in terms of the extent
to which any given process is “automatic”.
FURTHER READING
Chen, Y.-C. & Spence, C. (2017). Assessing the role of the “unity assumption”
on multi-sensory integration: A review. Frontiers in Psychology, 8 (Article 445).
Factors determining the extent to which stimuli from different sensory modalities
are integrated are discussed.
Engstrom, J., Markkula, G., Victor, T. & Merat, N. (2017). Effects of cognitive
load on driving performance: The cognitive control hypothesis. Human Factors,
59, 734–764. Johan Engstrom and his colleagues review research on factors influencing driving performance and provide a new theoretical approach.
Hulleman, J. & Olivers, C.N.L. (2017). The impending demise of the item in visual
search. Behavioral and Brain Sciences, 40, 1–20. This review article indicates very
clearly why theoretical accounts of visual search increasingly emphasise the role
of fixations and visual perception. Several problems with previous attention-based
theories of visual search are also discussed.
Karnath, H.-O. (2015). Spatial attention systems in spatial neglect. Neuropsychologia,
75, 61–73. Hans-Otto Karnath discusses theoretical accounts of neglect emphasising the role of attentional systems.
Koch, I., Poljac, E., Müller, H. & Kiesel, A. (2018). Cognitive structure, flexibility, and plasticity in human multitasking – An integrative review of dual-task
and task-switching research. Psychological Bulletin, 144, 557–583. Iring Koch
and colleagues review dual-task and task-switching research with an emphasis on
major theoretical perspectives.
McDermott, J.H. (2018). Audition. In J.T. Serences (ed.), Stevens’ Handbook
of Experimental Psychology and Cognitive Neuroscience, Vol. 2: Sensation,
Perception, and Attention (4th edn; pp. 63–120). New York: Wiley. Josh
McDermott discusses theory and research focused on selective auditory attention
in this comprehensive chapter.
Melnikoff, D.E. & Bargh, J.A. (2018). The mythical number two. Trends in
Cognitive Sciences, 22, 280–293. Research revealing limitations with traditional
theoretical approaches to “automaticity” is discussed.
Moors, A. (2016). Automaticity: Componential, causal, and mechanistic explanations. Annual Review of Psychology, 67, 263–287. Agnes Moors provides an
excellent critique of traditional views on “automaticity” and develops her own
comprehensive theoretical account.
Nobre, A.C. (2018). Attention. In J.T. Serences (ed.), Stevens’ Handbook of
Experimental Psychology and Cognitive Neuroscience, Vol. 2: Sensation,
Perception, and Attention (4th edn; pp. 241–316). New York: Wiley. Anna (Kia)
Nobre discusses the key role played by attention in numerous aspects of cognitive processing.
PART II

Memory

How important is memory? Imagine if we were without it. We would not
recognise anyone or anything as familiar. We would be unable to talk, read
or write because we would remember nothing about language. We would
have extremely limited personalities because we would have no recollection
of the events of our own lives and therefore no sense of self. In sum, we
would have the same lack of knowledge as a newborn baby.
Nairne et al. (2007) argued there were close links between memory and survival in our evolutionary history. Our ancestors prioritised information relevant to their survival (e.g., remembering the location of food or water; ways
of securing a mate). Nairne et al. found memory for word lists was especially high when participants rated the words for their relevance to survival
in a dangerous environment: the survival-processing effect. This effect has
been replicated several times (Kazanas & Altarriba, 2015) and is stronger
when participants imagine themselves alone in a dangerous environment
rather than with a group of friends (Leding & Toglia, 2018). In sum, human
memory may have evolved in part to promote survival.
We use memory for numerous purposes throughout every day of our lives. It
allows us to keep track of conversations, to remember how to use a mobile
phone, to write essays in examinations, to recognise other people’s faces, to
take part in conversations, to ride a bicycle, to carry out intentions and, perhaps,
to play various sports. More generally, our interactions with others and with
the environment depend crucially on having an effective memory system.
The wonders of human memory are discussed at length in Chapters 6–8.
Chapter 6 deals mainly with key issues regarded as important from the early
days of memory research. For example, we consider the distinction between
short-term and long-term memory. The notion of short-term memory has
been largely superseded by that of a working-memory system combining
the functions of processing and short-term information storage. There is
extensive coverage of working memory in Chapter 6.
Another topic discussed at length in Chapter 6 is learning. Long-term
memory is generally enhanced when meaning is processed at the time of
learning. Long-term memory is also better if much of the learning period
is spent practising retrieval. Evidence suggesting some learning is implicit
(i.e., does not depend on conscious processes) is also discussed. Finally, we
discuss forgetting. Why do we tend to forget information over time?
Chapter 7 is devoted to long-term memory. Our long-term memories
include personal information, knowledge about language, much knowledge
about psychology (hopefully!), knowledge about thousands of objects in the
world around us, and information about how to perform various skills (e.g.,
riding a bicycle; playing the piano). The central issue addressed in Chapter 7
is how to account for this incredible richness. Several theorists have claimed
there are several long-term memory systems. Others argue that there are
numerous processes that are combined and recombined depending on the
specific demands of any given memory task.
Memory is important in everyday life in ways de-emphasised historically.
For example, autobiographical memory (discussed in Chapter 8) is of great
significance to us. It gives us a coherent sense of ourselves and our personalities. The other topics considered in Chapter 8 are eyewitness testimony
and prospective memory (memory for future intentions). Research into eyewitness testimony has revealed that eyewitness testimony is often much less
accurate than generally assumed. This has implications for the legal system
because hundreds of innocent individuals have been imprisoned solely on
the basis of eyewitness testimony.
When we think about memory, we naturally focus on memory of the past.
However, we also need to remember numerous future commitments (e.g.,
meeting a friend as arranged), and such remembering involves prospective
memory. We will consider how we try to ensure we carry out our future
intentions.
The study of human memory is fascinating, and substantial progress has
been made. However, it is complex and depends on several factors. Four
kinds of factors are especially important: events, participants, encoding and
retrieval (Roediger, 2008). Events range from words and pictures to texts
and life events. Participants vary in age, expertise, memory-specific disorders and so on. What happens at encoding varies as a function of task
instructions, the immediate context and participants’ strategies. Finally,
memory performance at retrieval often varies considerably depending on
the nature of the memory task (e.g., free recall; cued recall; recognition).
The take-home message is that memory findings are context-sensitive –
they depend on interactions between the four factors. Thus, the effects of
manipulating, say, what happens at encoding depend on the participants
used, the events to be remembered and the conditions of retrieval. That
explains why Roediger (2008) entitled his article, “Why the laws of memory
vanished”. How, then, do we make progress? As Baddeley (1978, p. 150)
argued, “The most fruitful way to extend our understanding of human
memory is not to search for broader generalisations and ‘principles’, but
is rather to develop ways of separating out and analysing more deeply the
complex underlying processes.”
Chapter 6

Learning, memory and forgetting
INTRODUCTION
This chapter (and the next two) focus on human memory. All three chapters deal with intact human memory, but Chapter 7 also considers amnesic
patients in detail. Traditional laboratory-based research is the focus of this
chapter and Chapter 7, with more naturalistic research being discussed
in Chapter 8. There are important links among these different types of
research. For example, many theoretical issues relevant to brain-damaged
and healthy individuals can be tested in the laboratory or in the field.
Learning and memory involve several stages of processing. Encoding
occurs during learning: it involves transforming presented information
into a representation that can subsequently be stored. This is the first
stage. As a result of encoding, information is stored within the memory
system. Thus, storage is the second stage. The third stage is retrieval, which
involves recovering information from the memory system. Forgetting (discussed later, see pp. 278–293) occurs when our attempts at retrieval are
unsuccessful.
Several topics are discussed in this chapter. The basic structure of the
chapter consists of three sections:
(1) The first section focuses mostly on short-term memory (a form of memory in which information is held for a brief period of time). This section has three topics (short-term vs long-term memory; working memory; and working memory: executive functions and individual differences). The emphasis here is on the early stages of processing (especially encoding).
(2) The second section focuses on learning and the processes occurring during the acquisition of information (i.e., encoding processes) leading to long-term memory. Learning can be explicit (occurring with conscious awareness of what has been learned) or implicit (occurring without conscious awareness of what has been learned). The first two topics in this section (levels of processing; learning through retrieval) focus on explicit learning whereas the third topic focuses on implicit learning.
(3) The third section consists of a single topic: forgetting from long-term memory. The emphasis differs from the other two sections in that it is on retrieval processes rather than encoding processes. More specifically, the focus is on the reasons responsible for failures of retrieval.

KEY TERM
Encoding
The process by which information contained in external stimuli is transformed into a representation that can be stored within the memory system.

KEY TERM
Iconic memory
A sensory store that holds visual information for between 250 and 1,000 milliseconds following the offset of a visual stimulus.
SHORT-TERM VS LONG-TERM MEMORY
Many theorists distinguish between short-term and long-term memory. For
example, there are enormous differences in capacity: only a few items can
be held in short-term memory compared with essentially unlimited capacity in long-term memory. There are also massive differences in duration: a
few seconds for short-term memory compared with up to several decades
for long-term memory. The distinction between short-term and long-term
memory stores was central to multi-store models. More recently, however,
some theorists have proposed unitary-store models in which this distinction
is much less clear-cut. Both types of models are discussed below.
Multi-store model
Atkinson and Shiffrin (1968) proposed an extremely influential multi-store
model (see Figure 6.1):
● sensory stores, each modality-specific (i.e., limited to one sensory modality) and holding information very briefly;
● short-term store of very limited capacity;
● long-term store of essentially unlimited capacity holding information over very long periods of time.
According to the multi-store model, environmental stimulation is initially
processed by the sensory stores. These stores are modality-specific (e.g.,
vision; hearing). Information is held very briefly in the sensory stores, with
some being attended to and processed further within the short-term store.
Sensory stores
The visual store (iconic memory) holds visual information briefly.
According to a recent estimate (Clarke & Mack, 2015), iconic memory for
a natural scene lasts for at least 1,000 ms after stimulus offset.

Figure 6.1
The multi-store model of memory as proposed by Atkinson and Shiffrin (1968).

If you twirl
a lighted object in a circle in the dark, you will see a circle of light because
of the persistence of visual information in iconic memory. More generally,
iconic memory increases the time for which visual information is accessible
(e.g., when reading).
Atkinson and Shiffrin (1968) and many other theorists have assumed
iconic memory is pre-attentive (not dependent on attention). However,
Mack et al. (2016) obtained findings strongly suggesting that iconic
memory does depend on attention. Participants had to report the letters
in the centre of a visual array (iconic memory) or whether four circles presented close to the fixation point were the same colour. Performance on
the iconic memory task was much worse when the probability of having
to perform the iconic memory task was only 10% rather than 90%. This
happened because there was much less attention to the letters in the former
condition.
Echoic memory, the auditory equivalent of iconic memory, holds auditory information for a few seconds. Suppose someone asked you a question while your mind was elsewhere. Perhaps you replied “What did you
say?”, just before realising you did know what had been said. This “playback” facility depends on echoic memory. Ioannides et al. (2003) found
the duration of echoic memory was longer in the left hemisphere than the
right, probably because of the dominance of the left hemisphere in language processing.
There are sensory stores associated with all other senses (e.g., touch;
taste). However, they are less important than iconic and echoic memory
and have attracted much less research.
KEY TERMS
Echoic memory
A sensory store that holds
auditory information
for approximately 2–3
seconds.
Chunks
Stored units formed from
integrating smaller pieces
of information.
Short-term memory
Short-term memory has very limited capacity. Consider digit span: participants listen to a random digit series and then repeat back the digits
immediately in the correct order. There are also letter and word spans. The
maximum number of items recalled without error is typically about seven.
There are two reasons for rejecting seven items as the capacity of
short-term memory. First, we must distinguish between items and chunks –
“groups of items . . . collected together and treated as a single unit” (Mathy
& Feldman, 2012, p. 346). For example, most individuals presented with
the letter string IBMCIAFBI would treat it as three chunks rather than
nine letters. Here is another example: you might find it hard to recall the
following five words: is thing many-splendoured a love but easier to recall
the same words presented as follows: love is a many-splendoured thing.
Simon (1974) showed the importance of chunking. Immediate serial
recall was 22 words with 8-word sentences but only 7 with unrelated
words. In contrast, the number of chunks recalled varied less: it was 3 with
the sentences compared to 7 with the unrelated words. Second, estimates of
short-term memory capacity are often inflated because participants’ performance is influenced by rehearsal and long-term memory.
What influences chunking? As we have seen, it is strongly determined by
information stored in long-term memory (e.g., IBM stands for International
Business Machines). However, chunking also depends on people’s abilities
to identify patterns or regularities in the material presented for learning.
Interactive exercise:
Capacity of short-term memory

Interactive exercise:
Duration of short-term memory
For example, compare the digit sequences 2 3 4 5 6 and 2 4 6 3 5. It is
much easier to chunk the former sequence as “all digits between 2 and 6”.
Chekaf et al. (2016) found participants’ short-term memory was greatly
enhanced by spontaneous detection of such patterns. When there were no
patterns in the learning material, short-term memory was only three items.
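To make pattern-based chunking concrete, here is a minimal sketch (in Python; the single "ascending run" rule and the function name are our illustrative assumptions, not Chekaf et al.'s actual procedure):

```python
# Minimal sketch: chunking a digit sequence by spotting one simple
# regularity (ascending runs). Real chunking exploits many kinds of
# patterns plus knowledge stored in long-term memory.
def chunk_runs(digits: list[int]) -> list[list[int]]:
    """Group consecutive ascending digits into a single chunk."""
    chunks = [[digits[0]]]
    for d in digits[1:]:
        if d == chunks[-1][-1] + 1:   # continues an ascending run
            chunks[-1].append(d)
        else:                         # run broken: start a new chunk
            chunks.append([d])
    return chunks

print(len(chunk_runs([2, 3, 4, 5, 6])))  # 1 chunk: "all digits from 2 to 6"
print(len(chunk_runs([2, 4, 6, 3, 5])))  # 5 chunks: no pattern detected
```

With a detectable pattern, five digits compress into a single chunk; without one, each digit must be stored separately, which fits the three-item limit observed for patternless material.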
A similar capacity limit was reported by Chen and Cowan (2009).
When rehearsal was prevented by articulatory suppression (saying “the”
repeatedly), only three chunks were recalled.
Within the multi-store model, it is assumed all items within short-term
memory have equal importance. However, this is an oversimplification.
Vergauwe and Langerock (2017) assessed speed of performance when participants were presented with four letters followed by a probe letter and
decided whether the probe was the same as any of the original letters.
Response to the probe was fastest when it corresponded to the letter currently being attended to (cues were used to manipulate which letter was the
focus of attention at any given moment).
How is information lost from short-term memory? Several answers
have been provided (Endress & Szabó, 2017). Atkinson and Shiffrin (1968)
emphasised the importance of displacement – the capacity of short-term
memory is very limited, and so new items often displace items currently in
short-term memory. Another possibility is that information in short-term
memory decays over time in the absence of rehearsal. A further possibility
is interference which could come from items on previous trials and/or from
information presented during the retention interval.
The experimental findings are variable. Berman et al. (2009) claimed
interference is more important than decay. Short-term memory performance on any given trial was disrupted by words presented on the previous trial. Suppose this disruption effect occurred because words from the
previous trial had not decayed sufficiently. If so, disruption would have
been greatly reduced by increasing the inter-trial interval. In fact, increasing that interval had no effect. However, the disruption effect was largely
eliminated when interference from previous trials was reduced.
Campoy (2012) pointed out Berman et al.’s (2009) research was limited
because their experimental design did not allow them to observe any decay
occurring within 3.3 seconds of item presentation. Campoy obtained strong
decay effects at time intervals shorter than 3.3 seconds. Overall, the findings suggest decay occurs mostly at short retention intervals and interference at longer ones.
Strong evidence interference is important was reported by Endress and
Potter (2014). They rapidly presented 5, 11 or 21 pictures of familiar objects.
In their unique condition, no pictures were repeated over trials, whereas in
their repeated condition, the same pictures were seen frequently over trials.
Short-term memory was greater in the unique condition in which there was
much less interference than in the repeated condition (see Figure 6.2).
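The capacity estimates plotted in Figure 6.2 can be illustrated with Cowan's K, a standard guessing-corrected formula for recognition-based short-term memory tasks (an illustrative sketch only; not necessarily the exact computation Endress and Potter used):

```python
# Illustrative sketch: Cowan's K, a guessing-corrected capacity estimate
# for recognition tasks: K = set size * (hit rate - false-alarm rate).
def cowans_k(set_size: int, hit_rate: float, false_alarm_rate: float) -> float:
    """Estimate the number of items actually held in memory."""
    return set_size * (hit_rate - false_alarm_rate)

# Hypothetical numbers: with 21 pictures, 60% hits and 15% false alarms
# imply roughly nine items held in memory.
print(cowans_k(21, 0.60, 0.15))  # 9.45
```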
In sum, most of the evidence indicates that interference is the most
important factor causing forgetting from short-term memory, although
decay may also play a part. There is little direct evidence that displacement (emphasised by Atkinson & Shiffrin, 1968) is the main factor causing
forgetting. However, it is possible that interference causes items to be displaced from short-term memory (Endress & Szabó, 2017).
KEY TERM

Articulatory suppression
Rapid repetition of a
simple sound (e.g.,
“the the the”), which
uses the articulatory
control process of the
phonological loop.
Figure 6.2
Short-term memory performance in conditions designed to create interference (repeated condition) or minimise interference (unique condition) for set sizes 5, 11 and 21 pictures.
From Endress and Potter, 2014.
Short-term vs long-term memory
Is short-term memory distinct from long-term memory, as assumed
by Atkinson and Shiffrin (1968)? If they are separate, we would expect
some patients to have impaired long-term memory but intact short-term
memory with others showing the opposite pattern. This would produce
a double dissociation (see Glossary). The findings are generally supportive. Patients with amnesia (discussed in Chapter 7) have severe long-term
memory ­impairments but nearly all have intact short-term memory (Spiers
et al., 2001).
A few brain-damaged patients have severely impaired short-term
memory but intact long-term memory. For example, KF had no problems with long-term learning and recall but had a very small digit span
(Shallice & Warrington, 1970). Subsequent research indicated his short-term memory problems focused mainly on recall of verbal material (letters;
words; digits) rather than meaningful sounds or visual stimuli (Shallice &
Warrington, 1974).
Evaluation
The multi-store model has been enormously influential. It is widely accepted
(but see below) that there are three separate kinds of memory stores. Several
sources of experimental evidence support the crucial distinction between
short-term and long-term memory. However, the strongest evidence probably comes from brain-damaged patients having impairments only to short-term or long-term memory.
What are the model’s limitations? First, it is very oversimplified (e.g.,
the assumption that the short-term and long-term stores are both unitary, operating in a single, uniform way). Below we discuss an approach where
the single short-term store is replaced by a working memory system having
four components. In similar fashion, there are several long-term memory
systems (see Chapter 7).
Second, the assumption that the short-term store is a gateway between
the sensory stores and long-term memory (see Figure 6.1) is incorrect.
The information processed in short-term memory has typically already
made contact with information in long-term memory (Logie, 1999). For
example, you can only process IBM as a single chunk in short-term
memory after you have accessed long-term memory to obtain the meaning
of IBM.
Third, Atkinson and Shiffrin (1968) assumed information in short-term memory represents the "contents of consciousness". This implies
only information processed consciously is stored in long-term memory.
However, there is much evidence for implicit learning (learning without
conscious awareness of what has been learned) (discussed later, see
pp. 269–278).
Fourth, the assumption all items within short-term memory have
equal status is incorrect. The item currently being attended to is accessed
more rapidly than other items within short-term memory (Vergauwe &
Langerock, 2017).
Fifth, the notion that most information is transferred to long-term
memory via rehearsal greatly exaggerates its role in learning. In fact,
only a small fraction of the information stored in long-term memory was
rehearsed during learning.
Sixth, the notion that forgetting from short-term memory is caused by
displacement minimises the role of interference.
Unitary-store model
Several theorists have argued the multi-store approach should be replaced
by a unitary-store model. According to such a model, “STM [short-term
memory] consists of temporary activations of LTM [long-term memory]
representations or of representations of items that were recently perceived” (Jonides et al., 2008, p. 198). In essence, Atkinson and Shiffrin
(1968) emphasised the differences between short-term and long-term
memory whereas advocates of the unitary-store approach focus on the
similarities.
How can unitary-store models explain amnesic patients having essentially intact short-term memory but severely impaired long-term memory?
Jonides et al. (2008) argued they have special problems in forming novel
relations (e.g., between items and their context) in both short-term and
long-term memory. Amnesic patients perform well on short-term memory
tasks because such tasks typically do not require storing relational information. Thus, amnesic patients should have impaired short-term memory
performance on tasks requiring relational memory.
According to Jonides et al. (2008), the hippocampus and surrounding medial temporal lobes (damaged in amnesic patients) are crucial for
forming novel relations. Multi-store theorists assume these structures are
much more involved in long-term than short-term memory. However,
unitary-store models predict the hippocampus and medial temporal lobes
would be involved if a short-term memory task required forming novel
relations.
Findings
Several studies have assessed the performance of amnesic patients on
short-term memory tasks. In some studies (e.g., Hannula et al., 2006) the
performance of amnesic patients was impaired. However, Jeneson and
Squire (2012) in a review found these allegedly short-term memory studies
also involved long-term memory. More specifically, the information to be
learned exceeded the capacity of short-term memory and so necessarily
involved long-term memory as well as short-term memory (Norris, 2017).
As a result, such studies do not demonstrate deficient short-term memory
in amnesic patients.
Several neuroimaging studies have reported hippocampal involvement
(thought to be crucial for long-term memory) during short-term memory
tasks. However, it has generally been unclear whether hippocampal activation was due in part to encoding for long-term memory. An exception was
a study by Bergmann et al. (2012). They assessed short-term memory for
face–house pairs followed by an unexpected test of long-term memory for
the pairs.
What did Bergmann et al. (2012) find? Encoding of pairs remembered
in both short- and long-term memory involved the hippocampus. However,
there was no hippocampal activation at encoding when short-term memory
for the pairs was successful but subsequent long-term memory was not.
Thus, the hippocampus was only involved on a short-term memory task
when long-term memories were being formed.
Evaluation
As predicted by the unitary-store approach, activation of part of long-term
memory often plays an important role in short-term memory. More specifically, relevant information from long-term memory frequently influences
the contents of short-term memory.
What are the limitations of the unitary-store approach? First, the
claim that short-term memory consists only of activated long-term memory
is oversimplified. As Norris (2017, p. 992) pointed out, “The central
problem . . . is that STM has to be able to store arbitrary configurations
of novel information. For example, we can remember novel sequences of
words or dots in random positions on a screen. These cannot possibly have
pre-existing representations in LTM that could be activated.” Short-term
memory is also more flexible than expected on the unitary-store approach
(e.g., backward digit recall: recalling digits in the opposite order to the one
presented).
Second, we must distinguish between the assumption that short-term
memory is only activated long-term memory and the assumption that
short-term and long-term memory are separate but often interact. Most
evidence supports the latter assumption rather than the former.
Third, the theory fails to provide a precise definition of the crucial
explanatory concept of “activation”. It is thus unclear how activation
might maintain representations in short-term memory (Norris, 2017).
Fourth, the medial temporal lobes (including the hippocampus) are
of crucial importance for many forms of long-term memory (especially
declarative memory – see Glossary). Amnesic patients
with damage to these brain areas have severely
impaired declarative memory. In contrast, amnesic
patients typically have intact short-term memory
(Spiers et al., 2001).
WORKING MEMORY: BADDELEY
AND HITCH
Research activity:
Phonemic similarity
Is short-term memory useful in everyday life?
Textbook writers used to argue it allows us to remember a telephone
number for the few seconds required to dial it. Of course, that is now
irrelevant – our mobile phones store all the phone numbers we need regularly.
Baddeley and Hitch (1974) provided a convincing answer to the above
question. They argued we typically use short-term memory when performing
complex tasks. Such tasks involve storing information about the outcome
of early processes in short-term memory while moving on to later processes.
Baddeley and Hitch’s key insight was that short-term memory is essential to the performance of numerous tasks that are not explicitly memory
tasks.
The above line of thinking led Baddeley and Hitch (1974) to replace
the concept of short-term memory with that of working memory. Working
memory “refers to a system, or a set of processes, holding mental representations temporarily available for use in thought and action” (Oberauer
et al., 2018, p. 886). Since 1974, there have been several developments of
the working memory system (Baddeley, 2012, 2017; see Figure 6.3):
● a modality-free central executive, which "is an attentional system" (Baddeley, 2012, p. 22);
● a phonological loop processing and storing information briefly in a phonological (speech-based) form;
● a visuo-spatial sketchpad specialised for spatial and visual processing and temporary storage;
● an episodic buffer providing temporary storage for integrated information coming from the visuo-spatial sketchpad and phonological loop; this component (added by Baddeley, 2000) is discussed later (see pp. 252–253).

Figure 6.3
The working memory model showing the connections between its four components and their relationship to long-term memory. Artic = articulatory rehearsal.
From Darling et al., 2017.
The most important component is the central executive. The phonological
loop and the visuo-spatial sketchpad are slave systems used by the central
executive for specific purposes. The phonological loop preserves word
order, whereas the visuo-spatial sketchpad stores and manipulates spatial
and visual information.
All three components discussed above have limited capacity and can
function fairly independently of the others. Two key assumptions follow:
(1) If two tasks use the same component, they cannot be performed successfully together.
(2) If two tasks use different components, they can be performed as well together as separately.
Robbins et al. (1996) investigated these assumptions in a study on the
selection of chess moves. Chess players selected continuation moves
from various chess positions while also performing one of the following
tasks:
● repetitive tapping: the control condition;
● random letter generation: this involves the central executive;
● pressing keys on a keypad in a clockwise fashion: this uses the visuo-spatial sketchpad;
● rapid repetition of the word "see-saw": this is articulatory suppression and uses the phonological loop.

KEY TERMS

Working memory
A limited-capacity system used in the processing and brief holding of information.

Central executive
A modality-free, limited-capacity component of working memory.

Phonological loop
A component of working memory in which speech-based information is processed and stored briefly and subvocal articulation occurs.

Visuo-spatial sketchpad
A component of working memory used to process visual and spatial information and to store this information briefly.

Episodic buffer
A component of working memory; it is essentially passive and stores integrated information briefly.
The quality of chess moves was impaired when the additional task involved
the central executive or visuo-spatial sketchpad but not when it involved
the articulatory loop. Thus, calculating successful chess moves requires use
of the central executive and the visuo-spatial sketchpad but not the articulatory loop.
Phonological loop
According to the working memory model, the phonological loop has two
components (see Figure 6.4):
● a passive phonological store directly concerned with speech perception;
● an articulatory process linked to speech production (i.e., rehearsal) giving access to the phonological store.
Figure 6.4
Phonological loop system
as envisaged by Baddeley
(1990).
Interactive exercise:
Encoding in STM
KEY TERMS
Phonological similarity
effect
The finding that
immediate serial recall of
verbal material is reduced
when the items sound
similar.
Word-length effect
The finding that verbal
memory span decreases
when longer words are
presented.
Orthographic
neighbours
With reference to a target
word, the number of
words that can be formed
by changing one of its
letters.
Suppose we test individuals’ memory span by presenting a word list visually and requiring immediate recall in the correct order. Would they use
the phonological loop to engage in verbal rehearsal (i.e., saying the words
repeatedly to themselves)? Two kinds of evidence (discussed below) indicate
the answer is “Yes”.
First, there is the phonological similarity effect – reduced immediate
serial recall when words are phonologically similar (i.e., have similar sounds).
For example, Baddeley et al. (2018) found that short-term memory was
much worse with phonologically similar words (e.g., pan, cat, bat, ban, pad, man)
than phonologically dissimilar words (e.g., man, pen, rim, cod, bud, peel).
The working memory model does not make it clear whether the phonological similarity effect depends more on acoustic similarity (similar sounds)
or articulatory similarity (similar articulatory movements). Schweppe et al.
(2011) found the effect depends more on acoustic than articulatory similarity. However, there was an influence of articulatory similarity when recall
was spoken.
Second, there is the word-length effect: word span (words recalled
immediately in the correct order) is greater for short than long words.
Baddeley et al. (1975) obtained this effect with visually presented words. As
predicted, the effect disappeared when participants engaged in articulatory
suppression (repeating the digits 1 to 8) to prevent rehearsal within the
phonological loop during list presentation. In similar fashion, Jacquemot
et al. (2011) found a brain-damaged patient with greatly impaired ability to
engage in verbal rehearsal had no word-length effect.
Jalbert et al. (2011) pointed out a short word generally has more
orthographic neighbours (words of the same length differing from it in
only one letter) than a long word. When short (one-syllable) and long
(three-syllable) words were equated for neighbourhood size, the word-length effect disappeared. Thus, the word-length effect may be misnamed.
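Counting orthographic neighbours is a purely mechanical operation, as the following minimal sketch shows (the toy lexicon is our invention; real studies use large word databases):

```python
# Minimal sketch: orthographic neighbours are same-length words that
# differ from the target in exactly one letter position.
WORDS = {"cat", "bat", "hat", "cot", "car", "can", "cab", "dog"}

def orthographic_neighbours(target: str, lexicon: set[str]) -> list[str]:
    """Return every word in the lexicon one letter away from the target."""
    return sorted(w for w in lexicon
                  if len(w) == len(target) and w != target
                  and sum(a != b for a, b in zip(w, target)) == 1)

print(orthographic_neighbours("cat", WORDS))
# ['bat', 'cab', 'can', 'car', 'cot', 'hat'] -> "cat" has six neighbours here
```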
Which brain areas are associated with the phonological loop? Areas in
the parietal lobe, especially the supramarginal gyrus (BA40) and angular
gyrus (BA39), are associated with the phonological store, whereas Broca’s
area (approximately BA44 and BA45) within the frontal lobe is associated
with the articulatory control process.
Evidence indicating these areas differ in their functioning was reported
by Papagno et al. (2017). Patients undergoing brain surgery received direct
electrical stimulation while performing a digit-span
task. Stimulation within the parietal lobe increased
item errors in the task because it disrupted the
storage of information. In contrast, stimulation
within Broca’s area increased order errors because
it disrupted rehearsal of items in the correct order
(see Figure 6.5).
How is the phonological loop useful in everyday life? The answer is not immediately obvious. Baddeley et al. (1988) found a female patient, PV, with a very small digit span (only two items) coped very well (e.g., running a shop and raising a family). In subsequent research, however, Baddeley et al. (1998) argued the phonological loop is useful when learning a language. PV (a native Italian speaker) had generally good learning ability but was totally unable to associate Russian words with their Italian translations. Indeed, she showed no learning at all over ten trials!

Figure 6.5
Sites where direct electrical stimulation disrupted digit-span performance. Item-error sites are in blue, order-error sites are in yellow and sites where both types of errors occurred are in green.

The phonological loop ("inner voice") is also used to resist temptation. Tullett and Inzlicht (2010) found articulatory suppression (saying "computer" repeatedly) reduced participants' ability to control their actions (they were more likely to respond on trials where they should have inhibited a response).
Visuo-spatial sketchpad
The visuo-spatial sketchpad is used for the temporary storage and manipulation of visual patterns and spatial movement. In essence, visual processing
involves remembering what and spatial processing involves remembering
where. In everyday life, we use the sketchpad to find the route when moving
from one place to another or when watching television. The distinction
between visual and spatial processing is very clear with respect to blind
individuals. Schmidt et al. (2013) found blind individuals could construct
spatial representations of the environment almost as accurately as those of
sighted individuals despite their lack of visual processing.
Is there a single system combining visual and spatial processing or are there partially separate systems? Logie (1995) identified two separate components:

(1) visual cache: this stores information about visual form and colour;
(2) inner scribe: this processes spatial and movement information; it is involved in the rehearsal of information in the visual cache and transfers information from the visual cache to the central executive.
Smith and Jonides (1997) obtained findings supporting the notion of separate visual and spatial systems. Two visual stimuli presented together were
followed by a probe stimulus. Participants decided whether the probe was
in the same location as one of the initial stimuli (spatial task) or had the
same form (visual task). Even though the stimuli presented were identical in
the two tasks, there was more activity in the right hemisphere during the spatial task than the visual task, but the opposite was the case for activity in the left hemisphere.

KEY TERMS

Visual cache
According to Logie, the part of the visuo-spatial sketchpad that stores information about visual form and colour.

Inner scribe
According to Logie, the part of the visuo-spatial sketchpad dealing with spatial and movement information.

Figure 6.6
Amount of interference on a spatial task (dots) and a visual task (ideographs) as a function of a secondary task (spatial: movement vs visual: colour discrimination).
From Klauer and Zhao (2004). © 2000 American Psychological Association. Reproduced with permission.
Zimmer (2008) found in a research review
that areas within the occipital and temporal
lobes were activated during visual processing.
In contrast, areas within the parietal cortex
(especially the intraparietal sulcus) were activated during spatial processing.
Klauer and Zhao (2004) used two main
tasks: (1) a spatial task (memory for dot locations); (2) a visual task (memory for Chinese
characters). The main task was performed at
the same time as a visual (colour discrimination) or spatial (movement discrimination)
interference task. If the visuo-spatial sketchpad has separate spatial and visual components, the spatial interference task should disrupt performance more on the spatial main task, and the visual interference task should disrupt performance more on the visual main task. Both predictions were supported
(see Figure 6.6).
Vergauwe et al. (2009) argued that visual and spatial tasks often
require the central executive’s attentional resources. They used more
demanding versions of Klauer and Zhao’s (2004) main tasks and obtained
different findings: each type of interference (visual and spatial) had comparable effects on the spatial and visual main tasks. Thus, there are general,
attentionally demanding interference effects when tasks are demanding but
also interference effects specific to the type of interference when tasks are
relatively undemanding.
Morey (2018) discussed the theoretical assumption that the visuo-spatial sketchpad is a specialised system separate from other cognitive systems and components of working memory. She identified two predictions following from that assumption:
(1) Some brain-damaged patients should have selective impairments of visual and/or spatial short-term memory with other cognitive processes and systems essentially intact.
(2) Short-term visual or spatial memory in healthy individuals should be largely or wholly unaffected by the requirement to perform a secondary task at the same time (especially when that task does not require visual or spatial processing).
Morey (2018) reviewed evidence inconsistent with both the above predictions. First, the great majority of brain-damaged patients with impaired
visual and/or spatial short-term memory also have various more general
cognitive impairments. Second, Morey carried out a meta-analytic review
and found that short-term visual and spatial memory was strongly impaired
by cognitively demanding secondary tasks. This was the case even when the
secondary task did not require visual or spatial processing.
In sum, there is some support for the notion that the visuo-spatial
sketchpad has somewhat separate visual and spatial components. However,
the visuo-spatial sketchpad seems to interact extensively with other cognitive and memory systems, which casts doubt on the theoretical assumption
that it often operates independently from other systems.
Central executive
The central executive (which resembles an attentional system) is the most
important and versatile component of the working memory system. It is
heavily involved in almost all complex cognitive activities (e.g., solving
a problem; carrying out two tasks at the same time) but does not store
information.
There is much controversy concerning the brain regions most associated with the central executive and its various functions (see below,
pp. 257–262). However, it is generally assumed the prefrontal cortex is
heavily involved. Mottaghy (2006) reviewed studies using repetitive transcranial magnetic stimulation (rTMS; see Glossary) to disrupt the dorsolateral prefrontal cortex (BA9/46). Performance on many complex cognitive
tasks was impaired by this manipulation. However, executive processes
do not depend solely on the prefrontal cortex. Many brain-damaged
patients (e.g., those with diffuse trauma) have poor executive functioning
despite having little or no frontal damage (Stuss, 2011).
Baddeley has always recognised that the central executive is associated
with several executive functions (see Glossary). For example, Baddeley
(1996) speculatively identified four such processes: (1) focusing attention
or concentration; (2) dividing attention between two stimulus streams;
(3) switching attention between tasks; and (4) interfacing with long-term memory. It has proved difficult to obtain consensus on the number
and nature of executive processes. However, two influential theoretical
approaches are discussed below.
Brain-damaged individuals whose central executive functioning is
impaired suffer from dysexecutive syndrome. Symptoms include impaired
response inhibition, rule deduction and generation, maintenance and shifting
of sets, and information generation (Godefroy et al., 2010). Unsurprisingly,
patients with this syndrome have great problems in holding a job and functioning adequately in everyday life (Chamberlain, 2003).
KEY TERMS
Executive processes
Processes that organise
and coordinate the
functioning of the
cognitive system to
achieve current goals.
Dysexecutive syndrome
A condition in which
damage to the frontal
lobes causes impairments
to the central executive
component of working
memory.
Evaluation
The notion of a unitary central executive is greatly oversimplified (see
below). As Logie (2016, p. 2093) argued, “Executive control [may] arise
from the interaction among multiple differing functions in cognition that
use different, but overlapping, brain networks . . . the central executive
might now be offered a dignified retirement.”
Similar criticisms can be directed against the notion of a dysexecutive
syndrome. Patients with widespread damage to the frontal lobes may have
a global dysexecutive syndrome. However, as discussed below, patients
with limited frontal damage display various patterns of impairment to
executive processes (Stuss & Alexander, 2007).
Episodic buffer
Case study:
The episodic buffer
Why was the episodic buffer added to the model? There are various reasons.
First, the original version of the model was limited because its components
were too separate in their functioning. For example, it was unclear how
verbal information from the phonological loop and visual and spatial information from the visuo-spatial sketchpad was integrated to form multidimensional representations.
Second, it was hard to explain within the original model the finding
that people can provide immediate recall of up to 16 words presented in
sentences (Baddeley et al., 1987). This high level of immediate sentence
recall is substantially beyond the capacity of the phonological loop.
The function of the episodic buffer is suggested by its name. It is episodic because it holds integrated information (or chunks) about episodes or events in a multidimensional code combining visual, auditory and
other information sources. It acts as a buffer between the other working
memory components and also links to perception and long-term memory.
Baddeley (2012) suggested the capacity of the episodic buffer is approximately four chunks (integrated units of information). This potentially
explains why people can recall up to 16 words in immediate recall from
sentences.
Baddeley (2000) argued the episodic buffer could be accessed only via
the central executive. However, it is now assumed the episodic buffer can
be accessed by the visuo-spatial sketchpad and the phonological loop as
well as by the central executive (see Figure 6.3).
In sum, the episodic buffer
differed from the existing subsystems representations [i.e., phonological loop and visuo-spatial sketchpad] in being able to hold a limited
number of multi-dimensional representations or episodes, and it differed from the central executive in having storage capacity . . . The
episodic buffer is a passive storage system, the screen on which bound
information from other sources could be made available to conscious
awareness and used for planning future action.
(Baddeley, 2017, pp. 305–306)
Findings
Why did Baddeley abandon his original assumption that the central executive controls access to and from the episodic buffer? Consider a study by
Allen et al. (2012). Participants were presented with visual stimuli and had
to remember briefly a single feature (colour; shape) or colour–shape combinations. It was assumed combining visual features would require the central
executive prior to storage in the episodic buffer. On that assumption, the
requirement to perform a task requiring the central executive (counting
backwards) at the same time should have reduced memory to a greater
extent for colour–shape combinations than single features.
Allen et al. (2012) found that counting backwards had comparable
effects on memory performance regardless of whether or not feature combinations needed to be remembered. These findings suggest combining
visual features does not require the central executive but instead occurs "automatically" prior to information entering the episodic buffer.

Figure 6.7
Screen displays for the digit 6. Clockwise from top left: (1) single item display; (2) keypad display; and (3) linear display.
From Darling and Havelka (2010).
Grot et al. (2018) clarified the relationship between the central executive and the episodic buffer. Participants learned to link or bind together
words and spatial locations within the episodic buffer for a memory
test. It was either relatively easy to bind words and spatial locations
together (passive binding) or relatively difficult (active binding). The
central executive was involved only in the more difficult active binding
condition.
Darling et al. (2017) discussed several studies showing how memory
can be enhanced by the episodic buffer. Much of this research focused on
visuo-spatial bootstrapping (verbal memory being bootstrapped (supported)
by visuo-spatial memory). Consider a study by Darling and Havelka (2010).
Immediate serial recall of random digits was best when they were presented
on a keypad display rather than on a single item or linear display (see Figure 6.7).
Why was memory performance best with the keypad display? This was the
only condition which allowed visual information, spatial information and
knowledge about keyboard displays accessed from long-term memory to be
integrated within the episodic buffer using bootstrapping.
Evaluation
The episodic buffer provides a brief storage facility for information from
the phonological loop, the visuo-spatial sketchpad and long-term memory.
Bootstrapping data (e.g., Darling & Havelka, 2010) suggest that processing
in the episodic buffer “interacts with long-term knowledge to enable integration across multiple independent stimulus modalities” (Darling et al.,
2017, p. 7). The central executive is most involved when it is hard to bind
together different kinds of information within the episodic buffer.
What are the limitations of research on the episodic buffer? First, it
remains unclear precisely how information from the phonological loop and
the visuo-spatial sketchpad is combined to form unified representations
within the episodic buffer. Second, as shown in Figure 6.3, it is assumed
information from sensory modalities other than vision and hearing can be
stored in the episodic buffer. However, relevant research on smell and taste
is lacking.
KEY TERM

Working memory capacity
An assessment of how much information can be processed and stored at the same time; individuals with high capacity have higher intelligence and more attentional control.

Interactive exercise:
Working memory

Overall evaluation
The working memory model remains highly influential over 45 years since
it was first proposed. There is convincing empirical evidence for all components of the model. As Logie (2015, p. 100) noted, it explains findings
“from a very wide range of research topics, for example, aspects of children’s language development, aspects of counting and mental arithmetic,
reasoning and problem solving, dividing and switching attention, navigating unfamiliar environments”.
What are the model’s limitations? First, it is oversimplified. Several
kinds of information are not considered within the model (e.g., those
relating to smell, touch and taste). In addition, we can subdivide spatial
working memory into somewhat separate eye-centred, hand-centred and
foot-centred spatial working memory (Postle, 2006). This could lead to an
unwieldy model with numerous components each responsible for a different kind of information.
Second, the notion of a central executive should be replaced with a
theoretical approach identifying the major executive processes (see below,
pp. 257–262).
Third, the notion that the visuo-spatial sketchpad is a specialised and
relatively independent processing system is doubtful. There is much evidence (Morey, 2018) that it typically interacts with other working memory
components (especially the central executive).
Fourth, we need more research on the interactions among the four
components of working memory (e.g., how the episodic buffer integrates
information from the other components and from long-term memory).
Fifth, the common assumption that conscious awareness is necessarily associated with processing in all working memory components requires
further consideration. For example, executive processes associated with
the functioning of the central executive can perhaps occur outside conscious awareness (Soto & Silvanto, 2014). As discussed in Chapter 16,
many complex processes can apparently occur in the absence of conscious
awareness.
WORKING MEMORY: INDIVIDUAL DIFFERENCES
AND EXECUTIVE FUNCTIONS
There have been numerous recent attempts to enhance our understanding of
working memory. Here we will focus on two major theoretical approaches.
First, some theorists (e.g., Engle & Kane, 2004) have focused on working
memory capacity. In essence, they claim performance across numerous
tasks (including memory ones) is strongly influenced by individual differences in working memory capacity. Second, many theorists have replaced
a unitary central executive with several more specific executive functions.
Working memory capacity
Several theorists (e.g., Engle & Kane, 2004) have considered working
memory from the perspective of individual differences in working memory
capacity, “the ability to hold and manipulate information in a temporary
active state” (DeCaro et al., 2016, p. 39). Daneman and Carpenter (1980)
used reading span to assess this capacity. Individuals read sentences for
comprehension (processing task) and then recalled the final word of each
sentence (storage task). The reading span was defined as the largest number
of sentences from which individuals could recall the final words over 50%
of the time.
Operation span is another measure of working memory capacity. Items
(e.g., IS (4 × 2) – 3 = 5? TABLE) are presented. Individuals answer each
arithmetical question and try to remember all the last words. Operation
span is the maximum number of items for which individuals can remember all the last words over half the time. It correlates highly with reading
span.
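Scoring such span measures is mechanical once per-trial outcomes are recorded. Here is a minimal sketch (the data format, threshold convention and function name are ours; labs differ in their exact scoring rules):

```python
# Minimal sketch: scoring operation span. For each set size the participant
# attempts several trials; span is the largest set size at which all words
# are recalled in order on more than half the trials.
def operation_span(results: dict[int, list[bool]]) -> int:
    """results maps set size -> outcomes (True = perfect recall on a trial)."""
    span = 0
    for set_size in sorted(results):
        if sum(results[set_size]) / len(results[set_size]) > 0.5:
            span = set_size
    return span

# Hypothetical participant: reliable up to 4 items, unreliable at 5.
data = {2: [True, True, True], 3: [True, True, False],
        4: [True, False, True], 5: [False, False, True]}
print(operation_span(data))  # 4
```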
Working memory capacity correlates positively with intelligence.
We can clarify this relationship by distinguishing between crystallised intelligence (which depends on knowledge, skills and experience) and fluid
intelligence (which involves a rapid understanding of novel relationships;
see Glossary). Working memory capacity correlates more strongly with
fluid intelligence (sometimes as high as +.7 or +.8; Kovacs & Conway,
2016). The correlation with crystallised intelligence is relatively low because
it involves acquired knowledge whereas working memory capacity depends
on cognitive processes and temporary information storage.
Engle and Kane (2004) argued individuals who are high and low in
working memory capacity differ in attentional control. In their influential
two-factor theory, they emphasised two key aspects of attentional control:
(1) the maintenance of task goals; (2) the resolution of response competition or conflict. Thus, high-capacity individuals are better at maintaining
task goals and resolving conflict.
How does working memory capacity relate to Baddeley’s working
memory model? The two approaches differ in emphasis. Researchers investigating working memory capacity focus on individual differences in processing and storage capacity whereas Baddeley focuses on the underlying
structure of working memory. However, there has been some convergence
between the two theoretical approaches. For example, Kovacs and Conway
(2016, p. 157) concluded that working memory capacity “reflects individual
differences in the executive component of working memory, particularly
executive attention and cognitive control”.
In view of the association between working memory capacity and
intelligence, we would expect high-capacity individuals to outperform
low-capacity ones on complex tasks. That is, indeed, the case (see Chapter
10). However, Engle and Kane’s (2004) theory also predicts high-capacity
individuals might perform better than low-capacity ones even on relatively
simple tasks if it were hard to maintain task goals.
KEY TERMS
Reading span
The largest number
of sentences read for
comprehension from
which an individual can
recall all the final words
over 50% of the time.
Operation span
The maximum number
of items (arithmetical
questions + words) for
which an individual can
recall all the words more
than 50% of the time.
Crystallised intelligence
A form of intelligence
that involves the ability to
use one’s knowledge and
experience effectively.
Findings
There are close links between working memory capacity and the executive
functions of the central executive. For example, McCabe et al. (2010) found
measures of working memory capacity correlated highly with measures of
executive functioning. Both types of measures reflect executive attention
(which maintains task goals).
The hypothesis that high-capacity individuals have greater attentional
control than low-capacity ones has received experimental support. Sörqvist
(2010) studied distraction effects caused by the sounds of planes flying
past. Recall of a prose passage was adversely affected by distraction only
in low-capacity individuals. Yurgil and Golob (2013), using event-related
potentials (ERPs; see Glossary), found that high-capacity individuals
attended less than low-capacity ones to distracting auditory stimuli.
We have seen goal maintenance or attentional control in low-capacity
individuals is disrupted by external distraction. It is also disrupted by internal task-unrelated thoughts (mind-wandering). McVay and Kane (2012)
used a sustained-attention task in which participants responded to frequent
target words but withheld responses to rare non-targets. Low-capacity
individuals performed worse than high-capacity ones on this task because
they engaged in more mind-wandering.
Robison and Unsworth (2018) identified two main reasons why this might
be the case. First, low-capacity individuals’ inferior attentional control may
lead to increased amounts of spontaneous or unplanned mind-wandering.
Second, low-capacity individuals may be less motivated to perform cognitive
tasks well and so engage in increased deliberate mind-wandering. Robison
and Unsworth’s findings provided support only for the first reason.
Individuals having low working memory capacity may have worse
task performance than high-capacity ones because they consistently have
poorer attentional control and ability to maintain the current task goal.
Alternatively, their failures of attentional control may only occur relatively
infrequently. Unsworth et al. (2012) compared these two explanations.
They used the anti-saccade task: a flashing cue is presented to the left (or
right) of fixation followed by a target presented in the opposite location.
Reaction times to identify the target were recorded.
Unsworth et al. (2012) divided each participant’s reaction times into
quintiles (five bins representing the fastest 20%, the next fastest 20% and
so on). Low-capacity individuals were significantly slower than the high-capacity ones only in the slowest quintile (see Figure 6.8). Thus, they experienced failures of goal maintenance or attentional control on only a small fraction of trials.
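The quintile analysis itself is straightforward to reproduce. A minimal sketch (invented reaction times; statistics.quantiles is in the standard library from Python 3.8):

```python
# Minimal sketch of a quintile analysis: split one participant's reaction
# times into five bins (fastest 20%, next 20%, ...) and average each bin.
from statistics import mean, quantiles

def quintile_means(rts: list[float]) -> list[float]:
    """Mean reaction time within each of the five quintile bins."""
    cuts = quantiles(rts, n=5)           # four cut points define five bins
    bins = [[] for _ in range(5)]
    for rt in rts:
        bins[sum(rt > c for c in cuts)].append(rt)
    return [mean(b) for b in bins]

rts = [310, 325, 340, 360, 365, 380, 400, 430, 520, 760]  # ms, hypothetical
print(quintile_means(rts))
# Group differences emerged only in the slowest bin (here 520 and 760 ms).
```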
Figure 6.8
Mean reaction times (RTs)
quintile-by-quintile on the
anti-saccade task by groups
high and low in working
memory capacity.
From Unsworth et al. (2012).
Evaluation
Theory and research on working memory capacity indicate the value of
focusing on individual differences. There is convincing evidence high- and
low-capacity individuals differ in attentional control. More specifically,
high-capacity individuals are better at controlling external and internal
distracting information. In addition, they are less likely than low-capacity
individuals to experience failures of goal maintenance. Of importance, individual differences in working memory capacity are relevant to performance
on numerous different tasks (see Chapter 10).
What are the limitations of research in this area? First, the finding that
working memory capacity correlates highly with fluid intelligence means
many findings ascribed to individual differences in working memory capacity may actually reflect fluid intelligence. However, it can be argued that
general executive functions relevant to working memory capacity partially
explain individual differences in fluid intelligence (Kovacs & Conway, 2016).
Second, research on working memory capacity is somewhat narrowly
based on behavioural research with healthy participants. In contrast, the
unity/diversity framework (discussed next) has been strongly influenced
by neuroimaging and genetic research and by research on brain-damaged
patients.
Third, there is a lack of conceptual clarity. For example, theorists
differ as to whether the most important factor differentiating individuals
with high- or low-capacity is “maintenance of task goals”, “resolution of
conflict”, “executive attention” or “cognitive control”. We do not know
how closely related these terms are.
Fourth, the inferior attentional or cognitive control of low-capacity
individuals might manifest itself consistently throughout task performance
or only sporadically. Relatively little research (e.g., Unsworth et al., 2012)
has investigated this issue.
Fifth, the emphasis in theory and research has been on the benefits for
task performance associated with having high working memory capacity.
However, some costs are associated with high capacity. These costs are
manifest when the current task requires a broad focus of attention but
high-capacity individuals adopt a narrow and inflexible focus (e.g., DeCaro
et al., 2016, 2017; see Chapter 12).
KEY TERMS
Executive functions
Processes that organise
and coordinate the
workings of the cognitive
system to achieve current
goals; key executive
functions include
inhibiting dominant
responses, shifting
attention and updating
information in working
memory.
Executive functions: unity/diversity framework
Executive functions are “high-level processes that, through their influence on lower-level processes, enable individuals to regulate their thoughts
and actions during goal-directed behaviour” (Friedman & Miyake, 2017,
p. 186). The crucial issue is to identify the number and nature of these executive functions or processes. Various approaches can address this issue:
(1) Psychometric approach: several tasks requiring the use of executive functions are administered and the pattern of inter-correlations among the tasks is assessed. Consider the following hypothetical example. There are four executive tasks (A, B, C and D). There is a moderate positive correlation between tasks A and B and between C and D but the remaining correlations are small. Such a pattern suggests tasks A and B involve the same executive function whereas tasks C and D involve a different executive function (a minimal sketch of this logic appears after this list).
(2) Neuropsychological approach: the focus is on individuals with brain damage causing impaired executive functioning. Patterns of impaired functioning are related to the areas of brain damage to identify executive functions and their locations within the brain. Shallice and Cipolotti (2018) provide a thorough discussion of the applicability of this approach to understanding executive functioning.
(3) Neuroimaging approach: the focus is on assessing similarities and differences in the patterns of brain activation associated with various executive tasks. For example, the existence of two executive functions (A and B) would be supported if they were associated with different patterns of brain activation.
(4) Genetic approach: twin studies are conducted with an emphasis on showing different sets of genes are associated with each executive function (assessed by using appropriate cognitive tasks).
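Here is that sketch of the psychometric logic (invented scores for six participants; statistics.correlation requires Python 3.10 or later):

```python
# Minimal sketch: inspect the pattern of correlations across four
# hypothetical executive tasks (A-D).
from statistics import correlation  # Pearson's r; Python 3.10+

A = [12, 15, 11, 18, 14, 16]   # invented task scores, six participants
B = [10, 14, 10, 17, 13, 15]   # tracks A closely
C = [16, 12, 13, 16, 20, 13]
D = [15, 11, 14, 17, 19, 12]   # tracks C closely

print(round(correlation(A, B), 2))  # high: A and B tap one function
print(round(correlation(C, D), 2))  # high: C and D tap another function
print(round(correlation(A, C), 2))  # near zero: different functions
```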
Several theories have been proposed on the basis of evidence using the
above approaches (see Friedman and Miyake, 2017, for a review). Here
we will focus on the very influential theory originally proposed by Miyake
et al. (2000) and developed subsequently (e.g., Friedman & Miyake, 2017).
Unity/diversity framework
Interactive exercise:
Stroop
In their initial study, Miyake et al. (2000) used the psychometric approach:
they administered several executive tasks and then focused on the pattern of
inter-correlations among the tasks. They identified three related (but separable) executive functions:
(1) Inhibition function: used to deliberately override dominant responses and to resist distraction. For example, it is used on the Stroop task (see Figure 1.3 on p. 5), which involves naming the colours in which words are printed. When the words are conflicting colour words (e.g., the word BLUE printed in red), it is necessary to inhibit saying the word (a minimal trial-generation sketch appears below).
(2) Shifting function: used to switch flexibly between tasks or mental sets. Suppose you are presented with two numbers on each trial. Your task is to switch between multiplying the two numbers and dividing one by the other on alternate trials. Such task switching requires the shifting function.
(3) Updating function: used to monitor and engage in rapid addition or deletion of working memory contents. For example, this function is used if you must keep track of the most recent member of each of several categories.

KEY TERM

Stroop task
A task in which participants have to name the ink colours in which colour words are printed; performance is slowed when the to-be-named colour (e.g., green) conflicts with the colour word (e.g., red).

Case study:
Automatic processes, attention and the emotional Stroop effect
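To illustrate the inhibition demand of the Stroop task, here is that minimal trial-generation sketch (all names are ours; real experiments use stimulus-presentation software):

```python
# Minimal sketch: generating Stroop trials. On incongruent trials the ink
# colour conflicts with the word, so naming the ink requires inhibiting
# the dominant word-reading response.
import random

COLOURS = ["red", "green", "blue", "yellow"]

def make_trial(congruent: bool) -> dict:
    word = random.choice(COLOURS)
    ink = word if congruent else random.choice(
        [c for c in COLOURS if c != word])
    return {"word": word.upper(), "ink": ink, "correct_response": ink}

random.seed(1)
print(make_trial(True))   # e.g., the word GREEN in green ink: no conflict
print(make_trial(False))  # e.g., the word BLUE in red ink: inhibit "blue"
```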
Subsequent research (e.g., Friedman et al., 2008; Miyake & Friedman,
2012) led to the development of the unity/diversity framework. The basic
idea is that each executive function consists of what is common to all three
executive functions (unity) plus what is unique to that function (diversity)
(see Figure 6.9). After accounting for what was common to all executive
functions, Friedman et al. found there was no unique variance left for the inhibition function. Of importance, separable shifting and updating factors have consistently been identified in subsequent research (Friedman & Miyake, 2017).

Figure 6.9
Schematic representation of the unity and diversity of three executive functions (EFs). Each executive function is a combination of what is common to all three and what is specific to that executive function. The inhibition-specific component is absent because the inhibition function correlates very highly with the common executive function.
From Miyake and Friedman (2012). Reprinted with permission of SAGE Publications.
What is the nature of the common factor? According to Friedman
and Miyake (2017, p. 194), “It reflects individual differences in the ability
to maintain and manage goals, and use those goals to bias ongoing processing.” Goal maintenance (resembling concentration) may be especially
important on inhibition tasks where it is essential to focus on task requirements to avoid distraction or incorrect competing responses. This could
explain why such tasks load only on the common factor.
Support for the notion that the common factor reflects goal maintenance was reported by Gustavson et al. (2015). Everyday goal-management
failures (assessed by questionnaire) correlated negatively with the common
factor.
Findings
So far we have focused on the psychometric approach. The unity/diversity
framework is also supported by research using the genetic approach.
Friedman et al. (2008) had monozygotic (identical) and dizygotic (fraternal)
twins perform several executive function tasks. One key finding was that
individual differences in all three executive functions (common; updating;
shifting) were strongly influenced by genetic factors. Another key finding
was that different sets of genes were associated with each function.
We turn now to neuroimaging research. Such research partly supports
the unity/diversity framework. Collette et al. (2005) found all three of
Miyake et al.’s (2000) functions (i.e., inhibition; shifting; updating) were
associated with activation in different prefrontal areas. However, all tasks produced activation in other areas (e.g., the left lateral prefrontal cortex), which is consistent with Miyake and Friedman's (2012) unity notion.

Figure 6.10
Activated brain regions across all executive functions in a meta-analysis of 193 studies (shown in red).
From Niendam et al. (2012).
Niendam et al. (2012) carried out a meta-analysis (see Glossary) of
findings from 193 studies where participants performed many tasks involving executive functions. Of most importance, several brain areas were
activated across all executive functions (see Figure 6.10). These areas
included the dorsolateral prefrontal cortex (BA9/46), fronto-polar cortex
(BA10), orbitofrontal cortex (BA11) and anterior cingulate (BA32). This
brain network corresponds closely to the common factor identified by
Miyake and Friedman (2012). In addition, Niendam et al. found some differences in activated brain areas between shifting and inhibition function
tasks.
Stuss and Alexander (2007) argued the notion of a dysexecutive syndrome (see Glossary; discussed earlier, p. 251) erroneously implies brain
damage to the frontal lobes damages all central executive functions. While
there may be a global dysexecutive syndrome in patients having widespread damage to the frontal lobes, this is not so in patients having limited
prefrontal damage. Among such patients, Stuss and Alexander identified
three executive processes, each associated with a different region within the
frontal cortex (approximate brain locations are in brackets):
(1) Task setting (left lateral): this involves planning; it is "the ability to set a stimulus-response relationship . . . necessary in the early stages of learning to drive a car or planning a wedding" (p. 906).
(2) Monitoring (right lateral): this involves checking the adequacy of one's task performance; deficient monitoring leads to increased variability of performance and increased errors.
(3) Energisation (superior medial): this involves sustained attention or concentration; deficient energisation leads to slow performance on all tasks requiring fast responding.
The above three executive processes are often used in combination when
someone performs a complex task. Note that these three processes differ
from those identified by Miyake et al. (2000). However, there is some
overlap: task setting and monitoring both involve aspects of cognitive
control as do the processes of inhibition and shifting.
Stuss (2011) confirmed the importance of the above three executive
functions. In addition, he identified a fourth executive process he called
metacognition/integration (located in BA10: fronto-polar prefrontal cortex).
According to Stuss (p. 761), “This function is integrative and coordinating-orchestrating . . . [it includes] recognising the differences between what
one knows from what one believes.” Evidence for this process has come
from research on patients with damage to BA10 (Burgess et al., 2007).
Evaluation
The unity/diversity framework provides a coherent account of the major
executive functions and is deservedly highly influential. One of its greatest strengths is that it is supported by research using several different
approaches (e.g., psychometric; genetic; neuroimaging; neuropsychological). The notion of a hierarchical system with one very general function
(common executive function) plus more specific functions (e.g., shifting;
updating) is consistent with most findings.
What are the limitations of the unity/diversity framework? First, as
Friedman and Miyake (2017, p. 199) admitted, “The results of lesion
studies are in partial agreement with the unity/diversity framework . . .
the processes [identified] in these studies are not clearly the same as those
[identified] in studies of normal individual differences.” For example, Stuss
(2011) obtained evidence for task setting, monitoring, energisation and
metacognition/integration functions in research on brain-damaged patients.
Second, many neuroimaging findings appear inconsistent with the
framework. For example, Nee et al. (2013) carried out a meta-analysis
of 36 neuroimaging studies on executive processes. There was little evidence that functions such as shifting, updating and inhibition differed in
their patterns of brain activation. Instead, one frontal region was mostly
involved in processing spatial content (where-based processing) and a
second frontal region was involved in processing non-spatial content
(what-based processing).
Third, Waris et al. (2017) also found evidence for content-based factors
differing from the executive factors emphasised within the unity/diversity
framework. They factor-analysed performance on ten working memory
tasks and identified two specific content-based factors: (1) a visuo-spatial
factor; and (2) a numerical-verbal factor. There is some overlap between
these factors and those identified by Nee et al. (2013).
Fourth, an important assumption within the unity/diversity framework is that all individuals have the same executive processes (Friedman
& Miyake, 2017). The complexities and inconsistencies of the research evidence suggest this assumption may be only partially correct.
LEVELS OF PROCESSING (AND BEYOND)
Interactive exercise:
Levels of processing
What determines long-term memory? According to Craik and Lockhart
(1972), how information is processed during learning is crucial. In their
levels-of-processing approach, they argued that attentional and perceptual
processes of learning determine what information is stored in long-term
memory. Levels of processing range from shallow or physical analysis of a
stimulus (e.g., detecting specific letters in words) to deep or semantic analysis. The greater the extent to which meaning is processed, the deeper the
level of processing.
Here are Craik and Lockhart’s (1972) main theoretical assumptions:
● The level or depth of stimulus processing has a large effect on its memorability: the levels-of-processing effect.
● Deeper levels of analysis produce more elaborate, longer-lasting and stronger memory traces than shallow levels.
Craik (2002) subsequently moved away from the notion that there is a
series of processing levels going from perceptual to semantic. Instead, he
argued that the richness or elaboration of encoding is crucial for long-term
memory.
Hundreds of studies support the levels-of-processing approach. For
example, Craik and Tulving (1975) compared deep processing (decide
whether each word fits the blank in a sentence) and shallow processing
(decide whether each word is in uppercase or lowercase letters). Recognition
memory was more than three times higher with deep than with shallow
processing. Elaboration of processing (amount of processing of a given
kind) was also important. Cued recall following the deep task was twice
as high for words accompanying complex sentences (e.g., “The great bird
swooped down and carried off the struggling ____”) as those accompanying simple sentences (e.g., “She cooked the ____”).
Rose et al. (2015) reported a levels-of-processing effect even with an
apparently easy memory task: only a single word had to be recalled and
the retention interval was only 10 seconds. More specifically, words associated with deep processing were better recalled than those associated
with shallow processing when the retention interval was filled with a task
involving adding or subtracting.
Baddeley and Hitch (2017) pointed out the great majority of studies
had used verbal materials (e.g., words). Accordingly, they decided to see
whether a levels-of-processing effect would be obtained with different learning materials. In one study, they found the effect with recognition memory
was much smaller with doors and clocks than with food names (see
Figure 6.11). The most plausible explanation is that it is harder to produce
an elaborate semantic encoding with doors or clocks than with most
words.
Morris et al. (1977) disproved the levels-of-processing theory. Participants answered semantic or shallow (rhyme) questions for words. Memory was tested by a standard recognition test (select list words and reject non-list words) or a rhyming recognition test (select words rhyming with list words – the list words themselves were not presented). There was the usual superiority of deep processing on the standard recognition test. However, the opposite was the case on the rhyme test, a finding inconsistent with the theory. According to Morris et al.'s transfer-appropriate processing theory, retrieval requires that the processing during learning is relevant to the demands of the memory test. With the rhyming test, rhyme information is relevant but semantic information is not.

Figure 6.11
Recognition memory performance as a function of processing depth (shallow vs deep) for three types of stimuli: doors, clocks and menus.
From Baddeley and Hitch (2017). Reprinted with permission of Elsevier.
Challis et al. (1996) compared the levels-of-processing effect on explicit
memory tests (e.g., recall; recognition) involving conscious recollection
and on implicit memory tests not involving conscious recollection (see
Chapter 7). The effect was generally greater in explicit than implicit memory.
Parks (2013) explained this difference in terms of transfer-appropriate processing. Shallow processing involves more perceptual but less conceptual
processing than deep processing. Accordingly, the levels-of-processing effect
should generally be smaller when the memory task requires demanding perceptual processing (as is the case with most implicit memory tasks).
Distinctiveness
Another important factor influencing long-term memory is distinctiveness.
Distinctiveness means a memory trace differs from other memory traces
because it was processed differently during learning. According to Hunt
and Smith (2014, p. 45), distinctive processing is “the processing of difference in the context of similarity”.
Eysenck and Eysenck (1980) studied distinctiveness using nouns having
irregular pronunciations (e.g., comb has a silent “b”). In one condition, participants said these nouns in a distinctive way (e.g., pronouncing the “b” in
comb). Thus, the processing was shallow (i.e., phonemic) but the memory
traces were distinctive. Recognition memory was much higher than in a
phonemic condition involving non-distinctive processing (i.e., pronouncing
nouns as normal). Indeed, memory was as good with distinctive phonemic
processing as with deep or semantic processing.
How can we explain the beneficial effects of distinctiveness on long-term memory? Chee and Goh (2018) identified two potential explanations. First, distinctive items may attract additional attention and processing at the time of study. Second, distinctive items may be well remembered because of effects occurring at the time of retrieval, an explanation originally proposed by Eysenck (1979). For example, suppose the distinctive item is printed in red whereas all the other items are printed in black. The retrieval cue (recall the red item) uniquely specifies one item and so facilitates retrieval.

KEY TERMS
Explicit memory
Memory that involves conscious recollection of information.
Implicit memory
Memory that does not depend on conscious recollection.
Distinctiveness
This characterises memory traces that are distinct or different from other memory traces stored in long-term memory.

Figure 6.12
Percentage recall of the critical item (e.g., kiwi) in encoding, retrieval and control conditions; also shown is the percentage recall of preceding and following items in the three conditions.
From Chee and Goh (2018). Reprinted with permission of Elsevier.
Chee and Goh (2018) contrasted the two above explanations. They
presented a list of words referring to species of birds including the word
kiwi. Of importance, kiwi is a homograph (two words having the same
spelling but two different meanings): it can mean a species of bird or a type
of fruit. Participants were instructed before study (encoding condition) or
after study (retrieval condition) that one of the words would be a type of
fruit. The findings are shown in Figure 6.12. A distinctiveness effect was
found in the retrieval condition in the absence of distinctive processing at
study. These findings strongly support a retrieval-based explanation of the
distinctiveness effect.
Evaluation
There is compelling evidence that processes at learning have a major impact
on subsequent long-term memory (Roediger, 2008). Another strength of the
theory is the central assumption that learning and remembering are byproducts of perception, attention and comprehension. The levels-of-processing
approach led to the identification of elaboration and distinctiveness of
processing as important factors in learning and memory. Finally, “The
levels-of-processing approach has been fruitful and generative, providing
a powerful set of experimental techniques for exploring the phenomena of
memory” (Roediger & Gallo, 2001, p. 44).
The levels-of-processing approach has several limitations. First, Craik
and Lockhart (1972) underestimated the importance of the retrieval environment in determining memory performance (e.g., Morris et al., 1977).
Second, the relative importance of processing depth, elaboration of processing and distinctiveness of processing to long-term memory remains unclear.
Third, the terms “depth”, “elaboration” and “distinctiveness” are vague
and hard to measure (Roediger & Gallo, 2001). Fourth, we do not know
precisely why deep processing is so effective or why the levels-of-processing
effect is small in implicit memory. Fifth, the levels-of-processing effect is
typically smaller with non-verbal stimuli than with words (Baddeley &
Hitch, 2017).
LEARNING THROUGH RETRIEVAL
How can we maximise our learning (e.g., of some topic in cognitive psychology)? Many people (including you?) think what is required is to study
and re-study the to-be-learned material with testing serving only to establish what has been learned. In fact, this is not the case. As we will see, there
is typically a testing effect: “the finding that intermediate retrieval practice
between study and a final memory test can dramatically enhance final-test
performance when compared with restudy trials” (Kliegl & Bäuml, 2016).
The testing effect is generally surprisingly strong. Dunlosky et al.
(2013) discussed ten learning techniques including writing summaries,
forming images of texts and generating explanations for stated facts, and
found repeated testing was the most effective technique. Rowland (2014)
carried out a meta-analysis: 81% of the findings were positive. Most of
these studies were laboratory-based. Reassuringly, Schwieren et al. (2017)
found the magnitude of the testing effect was comparable in real-life
­contexts (teaching psychology) and laboratory conditions.
KEY TERM
Testing effect
The finding that long-term memory is enhanced when some of the learning period is devoted to retrieving to-be-learned information rather than simply studying it.
Explanations of the testing effect
We start by identifying two important theoretical approaches to explaining
the testing effect. First, several theorists have emphasised the importance
of retrieval effort (Rowland, 2014). The core notion here is that the testing
effect will be greater when the difficulty or effort involved in retrieval during
the learning period is high rather than low.
Why does increased retrieval effort have this beneficial effect? Several
answers have been suggested. For example, there is the elaborative retrieval
hypothesis, which is applicable to paired-associate learning (e.g., learning to associate the cue Chalk with the target Crayon). According to this
hypothesis, "the act of retrieving a target from a cue activates cue-relevant
information that becomes incorporated with the successfully retrieved
target, providing a more elaborate representation” (Carpenter & Yeung,
2017, p. 129). According to a more specific version of this hypothesis (the
mediator effectiveness hypothesis), retrieval practice promotes the use of
more effective mediators. In the above example, Board might be a mediator triggered by the cue Chalk.
Rickard and Pan (2018) proposed a related (but more general)
dual-memory theory. In essence, restudy causes the memory trace formed
at initial study to be strengthened. Testing with feedback (which involves
effort) also strengthens the memory trace formed at initial study. More
importantly, it leads to the formation of a second memory trace (see
Figure 6.13). The strength of this second memory trace probably depends
on the amount of retrieval effort during testing. Thus, testing generally promotes superior memory to restudy because it promotes the acquisition of
two memory traces for each item rather than one.
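The dual-memory logic can be illustrated with a toy simulation: an item is recalled on the final test if any of its traces exceeds the threshold t, so two moderately strong traces beat one slightly stronger trace. Only the qualitative assumptions come from Rickard and Pan (2018); the distributions and threshold below are invented:

```python
import random

random.seed(1)
THRESHOLD = 1.0      # response threshold t (as in Figure 6.13)
N_ITEMS = 10_000

def proportion_recalled(items):
    # An item is retrieved if ANY of its traces exceeds the threshold.
    return sum(any(s > THRESHOLD for s in traces) for traces in items) / len(items)

# Restudy: a single (strengthened) study trace per item.
restudy = [[random.gauss(0.8, 0.5)] for _ in range(N_ITEMS)]

# Testing with feedback: the study trace plus a second, test-derived trace.
testing = [[random.gauss(0.7, 0.5), random.gauss(0.7, 0.5)]
           for _ in range(N_ITEMS)]

print(f"restudy {proportion_recalled(restudy):.2f} "
      f"vs testing {proportion_recalled(testing):.2f}")   # ~0.34 vs ~0.47
```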
Figure 6.13
(a) Restudy causes
strengthening of the
memory trace formed
after initial study; (b)
testing with feedback
causes strengthening of
the original memory trace;
and (c) the formation of
a second memory trace.
t = the response threshold
that must be exceeded
for any given item to be
retrieved on the final test.
From Rickard and Pan (2018).
Second, there is the bifurcation model (bifurcation means division
into two) proposed by Kornell et al. (2011). According to this model,
items successfully retrieved during testing practice are strengthened more
than restudied items. However, the crucial assumption is that items not
retrieved during testing practice are strengthened less than restudied
items; indeed, their memory strength does not change. Thus, there should
be circumstances in which the testing effect is reversed.
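A back-of-the-envelope calculation shows how the model predicts a reversal. Suppose (all numbers invented for illustration) that retrieved items gain a lot, unretrieved items without feedback gain nothing, and restudied items gain a moderate amount:

```python
# Toy illustration of the bifurcation assumption (all values invented).
retrieval_success = 0.4     # proportion of items retrieved during practice
gain_retrieved = 2.0        # retrieved items are strengthened substantially
gain_failed    = 0.0        # unretrieved items (no feedback) do not change
gain_restudy   = 1.0        # restudied items are strengthened moderately

testing_mean = (retrieval_success * gain_retrieved
                + (1 - retrieval_success) * gain_failed)   # 0.8
restudy_mean = gain_restudy                                # 1.0

# With low retrieval success and no feedback, restudy wins on average:
# a reversed testing effect.
print(testing_mean < restudy_mean)   # True
```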
Findings
Several findings indicate that the size of the testing effect depends on
retrieval effort (probably because it leads to the formation of a strong
second memory trace). Endres and Renkl (2015) asked participants to rate
the mental effort they used during retrieval practice and restudying. They
obtained a testing effect that disappeared when mental effort was controlled
for statistically. As predicted, more effortful or difficult retrieval tests (e.g.,
free recall) typically led to a greater testing effect than easy retrieval tests
(e.g., recognition memory) (Rowland, 2014). All these findings provide
indirect support for the dual-memory theory.
It seems reasonable to assume retrieval practice is more effortful and
demanding when initial memory performance is low rather than high. As
predicted, the testing effect is greater when initial memory performance is low in studies providing feedback (re-presentation of the learning materials) (Rowland, 2014).
Suppose you are trying to learn the word pair wingu–cloud. You might
try to link the words by using the mediator plane. When subsequently
given the cue (wingu) and told to recall the target word (cloud), you might
generate the sequence wingu–wing–cloud according to the mediator effectiveness hypothesis. Pyc and Rawson (2010) instructed participants to
learn Swahili-English pairs (e.g., wingu–cloud). In one condition, each
trial after the initial study trial involved only restudy. In the other condition (test-restudy), each trial after the initial study trial involved a cued
recall test followed by restudy. Participants generated and reported mediators on the study and restudy trials. There were three recall conditions
on the final memory test 1 week later: (1) cue only; (2) cue + the mediator generated during learning; (3) cue + prompt to try to generate the
mediator.
The findings were straightforward (see Figure 6.14(a)):
(1) Memory performance in the cue only condition replicated the basic testing effect.
(2) Performance in the cue + mediator condition shows test-restudy participants generated more effective mediators than restudy-only participants.
(3) Test-restudy participants performed much better than restudy-only ones in the cue + prompt condition. Test-restudy participants remembered the mediators much better. Retrieving mediators was important for the test-restudy participants – their performance was poor when they failed to recall mediators.
Pyc and Rawson (2012) developed the mediator effectiveness hypothesis. Participants
were more likely to change their mediators
during test-restudy practice than restudy-only
practice. Of most importance, participants
engaged in test-restudy practice were more
likely to change their mediators following
retrieval failure than retrieval success. Thus,
retrieval practice allows people to evaluate the
effectiveness of their mediators and to replace
ineffective ones with effective ones.
We turn now to the bifurcation model, the main theoretical approach predicting reversals of the testing effect.

Figure 6.14
(a) Final recall for restudy-only and test-restudy group participants provided at test with cues (C), cues + the mediators generated during learning (CM) or cues plus prompts to recall their mediators (CMR). (b) Recall performance in the CMR group as a function of whether the mediators were or were not retrieved.
From Pyc and Rawson (2010). © American Association for Advancement of Science. Reprinted with permission of AAAS.

Support for the bifurcation model was reported by Pastötter and Bäuml (2016). Participants had retrieval/testing or restudy practice for paired associates during Session 1. In Session 2 (48 hours later), Test 1 was immediately followed by feedback (re-presentation of the word pairs) and 10 minutes later by Test 2.

There was a testing effect on Test 1 but a reversed testing effect on Test 2 (see Figure 6.15). According to the bifurcation model, non-recalled items on Test 1 should be weaker if previously subject to retrieval practice rather than restudy. Thus, they should benefit less from feedback. That is precisely what happened (see Figure 6.15).

Figure 6.15
Mean recall percentage in Session 2 on Test 1 (followed by feedback) and Test 2 10 minutes later as a function of retrieval practice (in blue) or restudy practice (in green) in Session 1.
From Pastötter and Bäuml (2016).

Most research on the testing effect has involved the use of identical materials during both initial and final retrieval tests. For many purposes, however, we want retrieval to produce more general and flexible learning that transfers to related (but non-tested) information. Pan and Rickard (2018) found in a meta-analysis that retrieval practice on average has a moderately beneficial effect on transfer of learning. This was especially the case when retrieval practice involved elaborative feedback (e.g., extended and detailed feedback) rather than only basic feedback (i.e., the correct answer).
Evaluation
The testing effect is strong and has been obtained with many different
types of learning materials. Testing during learning has the advantage it
can be used almost regardless of the nature of the to-be-learned material.
Of importance, retrieval practice often produces learning that generalises
or transfers to related (but non-tested) information. Testing has beneficial effects because it produces a more elaborate memory trace (elaborative retrieval hypothesis) or a second memory trace (dual-memory theory).
However, testing can be ineffective if the studied material is not retrieved
and there is no feedback (the bifurcation model).
What are the limitations of theory and research in this area?
(1) There are several ways retrieval practice might produce more elaborate memory traces (e.g., additional processing of external context; the production of more effective internal mediators). The precise form of such elaborate memory traces is hard to predict.
(2) The dual-memory theory provides a powerful explanation of the testing effect. However, more research is required to demonstrate the conditions in which testing leads to the formation of a second memory trace differing from the memory trace formed during initial study.
(3) The bifurcation model has received empirical support. However, it does not specify the underlying processes or mechanisms responsible for the reversed testing effect.
(4) The fact that the testing effect has been found with numerous types of learning material and testing conditions suggests that many different processes can produce that effect. Thus, currently prominent theories are probably applicable to only some findings.
IMPLICIT LEARNING
KEY TERM
Implicit learning
Learning complex
information without
conscious awareness of
what has been learned.
Earlier in the chapter we discussed learning through retrieval and learning from the levels-of-processing perspective. In both cases, the emphasis was on explicit learning: it generally makes substantial demands on
attention and working memory and learners are aware of what they are
learning.
Can we learn something without an awareness of what we have learned?
It sounds improbable. Even if we learned something without realising, it
seems unlikely we would make much use of it. In fact, there is much evidence for implicit learning: “learning that occurs without full conscious
awareness of the regularities contained in the learning material itself and/
or that learning has occurred” (Sævland & Norman, 2016, p. 1). As we will
see, it is often assumed implicit learning differs from explicit learning in
being less reliant on attention and working memory.
We can also distinguish between implicit learning and implicit memory
(memory not involving conscious recollection; discussed in Chapter 7). There
can be implicit memory for information acquired through explicit learning if learners lose awareness of that information over time. There can
also be explicit memory for information acquired through implicit learning if learners are provided with informative contextual cues when trying
to remember that information. However, implicit learning is typically followed by implicit memory whereas explicit learning is followed by explicit
memory.
There is an important difference between research on implicit learning
and implicit memory. Research on implicit learning mostly involves focusing on performance changes occurring over a lengthy sequence of learning
trials. In contrast, research on implicit memory mostly involves one or a
few learning trials and the emphasis is on the effects of various factors
(e.g., retention interval; retrieval cues) on memory performance. In addition, research on implicit learning often uses fairly complex, novel tasks
whereas much research on implicit memory uses simple, familiar stimulus
materials.
Reber (1993) made five assumptions concerning major differences
between implicit and explicit learning (none established definitively):
(1) Age independence: implicit learning is little influenced by age or developmental level.
(2) IQ independence: performance on implicit tasks is relatively unaffected by IQ.
(3) Robustness: implicit systems are relatively unaffected by disorders (e.g., amnesia) affecting explicit systems.
(4) Low variability: there are smaller individual differences in implicit learning than explicit learning.
(5) Commonality of process: implicit systems are common to most species.
Here we will briefly consider the first two assumptions (the third
assumption is discussed later, p. 277). With respect to the first assumption, some studies have reported comparable implicit learning in older and
young adults. However, implicit learning is mostly significantly impaired in
older adults. How can we explain this deficit? Older adults generally have
reduced volume of frontal cortex and the striatum, an area strongly associated with implicit learning (King et al., 2013a).
With respect to the second assumption, Christou et al. (2016) found
on a visuo-motor task that the positive effects of high working memory
capacity on task performance were due to explicit but not implicit learning. When the visuo-motor task was changed to reduce the possibility of
explicit learning, high working memory capacity was unrelated to performance. Overall, intelligence is associated more strongly with explicit learning. However, the association between intelligence and implicit learning
appears greater than predicted by Reber (1993).
IN THE REAL WORLD: SKILLED TYPISTS AND IMPLICIT LEARNING
Millions of individuals have highly developed typing skills (e.g., the typical American student who
touch types produces 70 words a minute) (Logan & Crump, 2009). Nevertheless, many expert
typists find it hard to think exactly where the letters are on the keyboard. For example, the first
author of this book has typed 8 million words for publication but has limited conscious awareness
of the locations of most letters! This suggests expert typing relies heavily on implicit learning and
memory. However, typing initially involves mostly explicit learning as typists learn to associate
finger movements with specific letter keys.
Snyder et al. (2014) studied college students averaging 11.4 years of typing practice. In the first
experiment, typists saw a blank keyboard and were instructed to write the letters in their correct
locations (see Figure 6.16). They located only 14.9 (57.3%) of the letters accurately. If you are a
skilled typist, try this task before checking your answers (shown in Figure 6.22).
Accurate identification of letters’ keyboard locations could occur because typists engage in
simulated typing. In their second experiment, Snyder et al. (2014) found the ability to identify the
keyboard locations of letters was reduced when simulated typing was prevented. Thus, explicit
memory for letter locations is lower than 57%.
Figure 6.16
Schematic representation of a traditional keyboard.
From Snyder et al. (2014). © 2011 Psychonomic Society. Reprinted with permission from Springer.
In a final experiment, Snyder et al. (2014) gave typists two hours’ training on the Dvorak keyboard,
on which the letter locations differ from the traditional QWERTY keyboard. The ability to locate
letters on the Dvorak and QWERTY keyboards was comparable. Thus, typists have no more explicit
knowledge of letter locations on a keyboard after 11 years than after 2 hours!
What is the nature of experienced typists’ implicit learning? Logan (2018) addressed this
issue. Much of this learning involves forming associations between individual letters and finger
movements. In addition, however, typists learn to treat each word as a single chunk or unit. As a
result, they type words much faster than non-words containing the same number of letters. Thus,
implicit learning occurs at both the word and letter levels (Logan, 2018).
If experts rely on implicit learning and memory, we might predict performance impairments if
they focused consciously on their actions. There is much support for this prediction. For example,
Flegal and Anderson (2008) gave skilled golfers a putting task before and after they described
their actions in detail. Their putting performance was markedly worse after describing their actions
because conscious processes disrupted implicit ones.
Assessing implicit learning
You might think it is easy to decide whether implicit learning has occurred –
we simply ask participants after performing a task to indicate their conscious awareness of their learning. Implicit learning is shown if there is no
such conscious awareness.
Alas, individuals sometimes fail to report fully their conscious awareness of their learning (Shanks, 2010). For example, there is the “retrospective problem” (Shanks & St. John, 1994) – participants may be consciously
aware of what they are learning at the time but have forgotten it when
questioned subsequently.
Shanks and St. John (1994) proposed two criteria (incompletely implemented in most research) for implicit learning to be demonstrated:
(1) Information criterion: The information participants are asked to provide on the awareness test must be the information responsible for the improved performance.
(2) Sensitivity criterion: "We must . . . show our test of awareness is sensitive to all of the relevant knowledge" (Shanks & St. John, 1994, p. 374). We may underestimate participants' consciously accessible knowledge if we use an insensitive awareness test.
When implicit learning studies fail to obtain significant evidence of explicit
learning, researchers often (mistakenly) conclude there was no explicit
learning. Consider research on contextual cueing: participants search for
targets in visual displays and targets are detected increasingly rapidly (especially with repeated rather than random displays). Subsequently, participants see the repeating patterns and new random ones and indicate whether
they have previously seen each one. Typically, participants fail to identify
the repeating patterns significantly more often than the random ones. Such
non-significant findings have been taken to imply that all task learning is implicit.
Vadillo et al. (2016) argued many of the above non-significant findings occurred because insufficiently large samples were used. In their review
of 73 studies, 78.5% of awareness tests produced non-significant findings.
Nevertheless, participants in 67% of the studies performed above chance
(a highly significant finding). Thus, some explicit learning is involved in
contextual cueing even though the opposite is often claimed.
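Why is 67% of studies above chance "highly significant" even though most individual tests were non-significant? If there were really nothing to detect, each study should be equally likely to fall above or below chance, so a simple sign test across studies is revealing. The sketch below illustrates the logic only; Vadillo et al.'s actual analysis was more elaborate:

```python
from math import comb

n_studies = 73
n_above = round(0.67 * n_studies)    # about 49 of 73 studies above chance

# One-tailed sign test: P(X >= 49) where X ~ Binomial(73, 0.5).
p = sum(comb(n_studies, k) for k in range(n_above, n_studies + 1)) / 2**n_studies
print(f"p = {p:.4f}")                # well below .01: not what chance predicts
```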
Finally, we consider the process-dissociation procedure. Suppose
participants perform a task involving a repeating sequence of stimuli. They
either guess the next stimulus (inclusion condition) or try to avoid guessing the next stimulus accurately (exclusion condition). If learning is wholly
implicit, performance should be comparable in both conditions because
participants would have no conscious access to relevant information. If it
is partly or wholly explicit, performance should be better in the inclusion
condition.
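In its simplest form, this logic yields an estimate of explicit knowledge from the gap between the two conditions. Here is a minimal sketch (the second pair of proportions is invented for contrast; real analyses use more elaborate equations):

```python
def explicit_estimate(p_inclusion: float, p_exclusion: float) -> float:
    """Crude index of explicit knowledge: how much better people reproduce
    the sequence when trying to (inclusion) than when avoiding it
    (exclusion). A value near zero suggests purely implicit learning."""
    return p_inclusion - p_exclusion

# RT-drop group in Haider et al. (2011), discussed below: 0.80 vs 0.18.
print(explicit_estimate(0.80, 0.18))   # 0.62 -> considerable explicit learning
# Comparable performance in the two conditions (values invented):
print(explicit_estimate(0.30, 0.30))   # 0.0 -> no sign of explicit knowledge
```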
The process-dissociation procedure is based on the assumption that
the influence of implicit and explicit processes is unaffected by instructions (inclusion vs exclusion). However, Barth et al. (2019) found explicit
knowledge was less likely to influence performance in the exclusion than
the inclusion condition. Such findings make it hard to interpret findings
obtained using the process-dissociation procedure.
KEY TERMS
Process-dissociation procedure
On learning tasks, participants try to guess the next stimulus (inclusion condition) or avoid guessing the next stimulus accurately (exclusion condition); the difference between the two conditions indicates the amount of explicit learning.
Serial reaction time task
Participants on this task respond as rapidly as possible to stimuli typically presented in a repeating sequence; it is used to assess implicit learning.
Findings
The serial reaction time task has often been used to study implicit learning. On each trial, a stimulus appears at one of several locations on a computer screen and participants respond using the response key corresponding
to its location. There is typically a complex, repeating sequence over trials
but participants are not told this. Towards the end of the experiment, there
is often a block of trials conforming to a novel sequence but the participants are not informed.
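A toy simulation conveys the logic of the task: responses speed up with practice, but only predictable locations benefit from (implicit) sequence knowledge, so reaction times jump when the unannounced novel block begins. Everything below (timings, learning benefit, noise) is invented for illustration:

```python
import random

random.seed(0)

def simulated_rt(trial: int, predictable: bool) -> float:
    """Toy reaction time in ms for one trial."""
    base = 500 - 0.2 * min(trial, 600)       # general speed-up with practice
    benefit = 120 if predictable else 0      # implicit sequence knowledge
    return base - benefit + random.gauss(0, 20)

rts_repeating = [simulated_rt(t, True) for t in range(1200)]
rts_novel     = [simulated_rt(1200 + t, False) for t in range(60)]

mean = lambda xs: sum(xs) / len(xs)
# Slowing on the novel block is the signature of sequence learning.
print(mean(rts_repeating[-60:]), mean(rts_novel))   # ~260 ms vs ~380 ms
```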
Participants speed up over trials on the serial reaction time task but
respond much more slowly during the novel sequence (Shanks, 2010).
When questioned at the end of the experiment, participants usually claim
no conscious awareness of a repeating sequence or pattern. However, participants sometimes have partial awareness of what they have learned.
Wilkinson and Shanks (2004) gave participants 1,500 trials (15 blocks) or
4,500 trials (45 blocks) on the task and obtained strong sequence learning. This was followed by a test of explicit learning based on the process-dissociation procedure.
Participants’ predictions were significantly better in the inclusion than
exclusion condition (see Figure 6.17) indicating some conscious or explicit
knowledge was acquired. In a similar study, Gaillard et al. (2009) obtained
comparable findings and discovered conscious knowledge increased with
practice.

Figure 6.17
Mean number of completions (guessed locations) corresponding to the trained sequence (own) or the untrained sequence (other) in inclusion and exclusion conditions as a function of number of trials (15 vs 45 blocks).
From Wilkinson and Shanks (2004). © 2004 American Psychological Association. Reproduced with permission.
Haider et al. (2011) argued the best way to assess whether learning is
explicit or implicit is to use several measures of conscious awareness. They
used a version of the serial reaction time task in which a colour word (the
target) was written in ink of the same colour (congruent trials) or a different colour (incongruent trials). Participants responded to the colour word
rather than the ink. There were six different coloured squares below the
target word and participants pressed the coloured square corresponding to
the colour word. The correct coloured square followed a regular sequence
(1-6-4-2-3-5) but participants were not told this.
Haider et al. (2011) found 34% of participants showed a sudden drop in reaction times at some point. They hypothesised these RT-drop participants were consciously aware of the regular sequence (explicit learning). The remaining 66% failed to show a sudden drop (the no-RT-drop participants) and were hypothesised to have engaged only in implicit learning (see Figure 6.18).

Haider et al. (2011) used the process-dissociation procedure to test the above hypotheses. The RT-drop participants performed well: 80% correct on inclusion trials vs 18% correct on exclusion trials, suggesting considerable explicit learning. In contrast, the no-RT-drop participants had comparably low performance on inclusion and exclusion trials, indicating an absence of explicit learning. Finally, all participants described the training sequence (explicit task). Almost all (91%) of the RT-drop participants did this perfectly compared to 0% of the no-RT-drop participants. Thus, all the various findings supported Haider et al.'s hypotheses.

Figure 6.18
Response times for participants showing a sudden drop in RTs (right-hand side) or not showing such a drop (left-hand side). The former group showed much greater learning than the latter group (especially on incongruent trials on which the colour word was in a different coloured ink).
From Haider et al. (2011). Reprinted with permission from Elsevier.
If implicit learning does not require cognitively demanding processes
(e.g., attention), people should be able to perform two implicit learning
tasks simultaneously without interference. As predicted, Jiménez and
Vázquez (2011) reported no interference when participants performed the
serial reaction time task and a second implicit learning task.
Many tasks involve a combination of implicit and explicit learning.
Taylor et al. (2014) used a visuo-motor adaptation task on which participants learned to point at a target that rotated 45 degrees counterclockwise. Participants initially indicated their aiming direction and then made
a rapid reaching movement. The former provided a measure of explicit
learning whereas the latter provided a measure of implicit learning. Thus,
an advantage of this experimental approach is that it provides separate
measures of explicit and implicit learning.
Huberdeau et al. (2015) reviewed findings using the above visuo-motor
adaptation task and drew two main conclusions. First, improved performance over trials depended on both implicit and explicit learning. Second,
there was a progressive increase in implicit learning with practice, whereas
most explicit learning occurred early in practice.
Cognitive neuroscience
If implicit and explicit learning are genuinely different, they should be associated with different brain areas. Implicit learning has been linked to the
striatum, which is part of the basal ganglia (see Figure 6.19). For example,
Reiss et al. (2005) found on the serial reaction time task that participants
showing implicit learning had greater activation
in the striatum than those not exhibiting implicit
learning.
In contrast, explicit learning and memory
are typically associated with activation in the
medial temporal lobes including the hippocampus (see Chapter 7). Since conscious awareness
is most consistently associated with activation
of the dorsolateral prefrontal cortex and the
anterior cingulate (see Chapter 16), these areas
should be more active during explicit than
implicit learning.
Relevant evidence was reported by Wessel
et al. (2012) using the serial reaction time
task. Some participants showed clear evidence of explicit learning during training. A
brain area centred on the right prefrontal
cortex became much more active around the
onset of explicit learning. In similar fashion,
Lawson et al. (2017) compared participants
showing (or not showing) conscious awareness of a repeating pattern on the serial reaction time task. The fronto-parietal network was
more activated for those showing conscious awareness.

Figure 6.19
The striatum (which includes the caudate nucleus and the putamen) is of central importance in implicit learning.
It is often hard to establish the brain regions associated with implicit
and explicit learning because learners often use both kinds of learning.
Destrebecqz et al. (2005) used the process-dissociation procedure (see
Glossary) with the serial reaction time task to distinguish more clearly
between the explicit and implicit components of learning. Striatum activation was associated with the implicit component whereas the prefrontal
cortex and anterior cingulate were associated with the explicit component.
Penhune and Steele (2012) proposed a model of motor sequence
learning (see Figure 6.20). The striatum is involved in learning stimulus–
response associations and motor chunking or organisation. The cerebellum
is involved in producing an internal model to aid sequence performance
and error correction. Finally, the motor cortex is involved in storing the
learned motor sequence. Of importance, the involvement of each brain area varies across stages of learning.

KEY TERM
Striatum
It forms part of the basal ganglia and is located in the upper part of the brainstem and the inferior part of the cerebral hemispheres.

Interactive feature:
Primal Pictures' 3D atlas of the brain

Figure 6.20
A model of motor sequence learning. The top panel shows the brain areas (PMC or M1 = primary motor cortex) and associated mechanisms involved in motor sequence learning. The bottom panel shows the changing involvement of different processing components (chunking, synchronisation, sequence ordering, error correction) in overall performance. Each component is colour-coded to its associated brain region.
From Penhune and Steele (2012). Reprinted with permission of Elsevier.

Evidence for the importance of the cerebellum in motor sequence learning was reported by Shimizu et al. (2017) using transcranial direct current stimulation (tDCS; see Glossary) applied to the cerebellum. This stimulation influenced implicit learning (enhancing or impairing performance) as predicted theoretically.
In spite of the above findings, there are many inconsistencies and complexities in the research literature (Reber, 2013). For example, Gheysen
et al. (2011) found the striatum contributed to explicit learning of motor
sequences as well as implicit learning, and the hippocampus is sometimes
involved in implicit learning (Henke, 2010).
Why are the findings inconsistent? First, there are numerous forms of
implicit learning. As Reber (2013, p. 2029) argued, “We should expect to
find implicit learning . . . whenever perception and/or actions are repeated
so that processing comes to reflect the statistical structure of experience.”
As a consequence, it is probable that implicit learning can involve several
different brain networks.
Second, we can regard "the cerebellum, basal ganglia, and cortex as
an integrated system” (Caligiore et al., 2017, p. 204). This system plays an
important role in implicit and explicit learning.
Third, as we have seen, there are large individual differences in learning strategies and the balance between implicit and explicit learning. These
individual differences introduce complexity into the overall findings.
Fourth, there are often changes in the involvement of implicit and explicit
processes during learning. For example, Beukema and Verstynen (2018)
focused on changes in the involvement of different brain regions during
the acquisition of sequential motor skills (e.g., the skills acquired by
typists). Explicit processes dependent on the medial temporal lobe (shown
in magenta) were especially important early in learning whereas implicit
processes dependent on the basal ganglia (shown in blue) became increasingly important later in learning (see Figure 6.21).
Figure 6.21
Sequential motor skill
learning initially depends
on the medial temporal
lobe (MTL) including the
hippocampus (shown in
magenta) but subsequently
depends more on the basal
ganglia (BG) including the
striatum (shown in blue).
From Beukema and Verstynen (2018).
Brain-damaged patients
Amnesic patients with damage to the medial temporal lobes often have
intact performance on implicit-memory tests but are severely impaired on
explicit-memory tests (see Chapter 7). If separate learning systems underlie
implicit and explicit learning, we might expect amnesic patients to have intact
implicit learning but impaired explicit learning. That pattern of findings has
been reported several times. However, amnesic patients are often slower
than healthy controls on implicit-learning tasks (Oudman et al., 2015).
Earlier we discussed the hypothesis that the basal ganglia (especially
the striatum) are of major importance in implicit learning. Patients with
Parkinson’s disease (a progressive neurological disorder) have damage
to this region. As predicted, Clark et al. (2014) found in a meta-analytic
review that patients with Parkinson’s disease typically exhibit impaired
implicit learning on the serial reaction time task (see Chapter 7). However,
Wilkinson et al. (2009) found Parkinson’s patients also showed impaired
explicit learning on that task. In a review, Marinelli et al. (2017) found
that Parkinson’s patients showed the greatest impairment in motor learning when the task required conscious processing resources (e.g., attention;
cognitive strategies).
Much additional research indicates Parkinson’s patients have impaired
conscious processing (see Chapter 7). Siegert et al. (2008) found in a meta-analytic review that such patients exhibited consistently poorer performance
than healthy controls on working memory tasks. Roussel et al. (2017)
found 80% of Parkinson’s patients have dysexecutive syndrome which
involves general impairments in cognitive processing. In sum, findings from
Parkinson’s patients provide only limited information concerning the distinction between implicit and explicit learning.
KEY TERM
Parkinson’s disease
A progressive disorder
involving damage to the
basal ganglia (including
the striatum); the
symptoms include muscle
rigidity, limb tremor
and mask-like facial
expression.
Evaluation
Research on implicit learning has several strengths (see also Chapter 7).
First, the distinction between implicit and explicit learning has received considerable support from behavioural and neuroimaging studies on healthy individuals and from research on brain-damaged patients.

Second, the basal ganglia (including the striatum) tend to be associated with implicit learning whereas the prefrontal cortex, anterior cingulate and medial temporal lobes are associated with explicit learning. There is accumulating evidence that complex brain networks are involved in implicit learning (e.g., Penhune & Steele, 2012).

Third, given the deficiencies in assessing conscious awareness with any single measure, researchers are increasingly using several measures. Thankfully, different measures often provide comparable estimates of the extent of conscious awareness (e.g., Haider et al., 2011).

Fourth, researchers increasingly reject the erroneous assumption that finding some evidence of explicit learning implies no implicit learning occurred. In fact, learning typically involves implicit and explicit aspects and the extent to which learners are consciously aware of what they are learning depends on individual differences and the stage of learning (e.g., Wessel et al., 2012).

Figure 6.22
Percentages of experienced typists given an unfilled schematic keyboard (see Figure 6.16) who correctly located (top number), omitted (middle number) or misplaced (bottom number) each letter with respect to the standard keyboard.
From Snyder et al. (2014). © 2011 Psychonomic Society. Reprinted with permission from Springer.

What are the limitations of research on implicit learning?
(1) There is often a complex mixture of implicit and explicit learning, making it hard to determine the extent of implicit learning.
(2) The processes underlying implicit and explicit learning interact in ways that remain unclear.
(3) In order to show the existence of implicit learning we need to demonstrate that learning has occurred in the absence of conscious awareness. This is hard to do – we may fail to assess fully participants' conscious awareness (Shanks, 2017).
(4) The definition of implicit learning as learning occurring without conscious awareness is vague and underspecified, and so is applicable to numerous forms of learning having little in common with each other. It is probable that no current theory can account for the diverse forms of implicit learning.
FORGETTING FROM LONG-TERM MEMORY
Hermann Ebbinghaus (1885/1913) studied forgetting from long-term
memory in detail, using himself as the only participant (not recommended!).
He initially learned lists of nonsense syllables lacking meaning and then
relearned each list between 21 minutes and 31 days later. His basic measure
of forgetting was the savings method – the reduction in the number of trials during relearning compared to original learning.

KEY TERM
Savings method
A measure of forgetting introduced by Ebbinghaus in which the number of trials for relearning is compared against the number for original learning.
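As a worked example of the savings score (the trial counts below are invented, not Ebbinghaus's data):

```python
def savings(original_trials: int, relearning_trials: int) -> float:
    """Proportional reduction in trials needed to relearn a list."""
    return (original_trials - relearning_trials) / original_trials

# A list first mastered in 20 trials but relearned in only 8 trials
# shows 60% savings; no savings would mean complete forgetting.
print(savings(20, 8))   # 0.6
```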
Ebbinghaus found forgetting was very rapid over the first hour
after learning but then slowed considerably (see Figure 6.23). Rubin and
Wenzel (1996) found the same pattern when analysing numerous forgetting functions and argued a logarithmic function describes forgetting
over time. In contrast, Averell and Heathcote (2011) argued for a power
function.
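The two candidate descriptions of the forgetting curve can be compared directly. The sketch below uses invented parameter values; only the functional forms correspond to the logarithmic and power accounts just mentioned:

```python
import numpy as np

# Delays from about 21 minutes to 31 days, in hours.
t = np.array([0.35, 1, 9, 24, 48, 144, 744])

def logarithmic(t, a=0.6, b=0.08):
    return a - b * np.log(t + 1)        # Rubin and Wenzel (1996) style

def power_law(t, a=0.6, b=0.15):
    return a * (t + 1) ** -b            # Averell and Heathcote (2011) style

# Both forms capture rapid early forgetting that then flattens out;
# telling them apart requires fitting many datasets.
print(logarithmic(t).round(2))
print(power_law(t).round(2))
```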
Figure 6.23
Forgetting over time as indexed by reduced savings.
Data from Ebbinghaus (1885/1913).

It is often assumed (mistakenly) that forgetting should always be avoided. Nørby (2015) identified three major functions served by forgetting:
(1) It can enhance psychological well-being by reducing access to painful memories.
(2) It is useful to forget outdated information (e.g., where your friends used to live) so it does not interfere with current information (e.g., where your friends live now). Richards and Frankland (2017) developed this argument. They argued a major purpose of memory is to enhance decision-making and this purpose is facilitated when we forget outdated information.
(3) When trying to remember what we have read or heard, it is typically most useful to forget specific details and focus on the overall gist or message (see Box and Chapter 10).
IN THE REAL WORLD: IS PERFECT MEMORY
USEFUL?
What would it be like to have a perfect memory? Jorge Luis Borges
(1964) answered this question in a story called “Funes the memorious”.
After falling from a horse, Funes remembers everything that happens
to him in full detail. This had several negative consequences. When he
recalled the events of any given day, it took him an entire day to do so!
He found it very hard to think because his mind was full of incredibly
detailed information. Here is an example:
Not only was it difficult for him to comprehend that the generic symbol
dog embraces so many unlike individuals of diverse size and form; it
bothered him that the dog at three fourteen (seen from the side) should
have the same name as the dog at three fifteen (seen from the front).
(p. 153)
The closest real-life equivalent of Funes was a Russian called Solomon
Shereshevskii. When he worked as a journalist, his editor noticed he could
repeat everything said to him verbatim. The editor sent Shereshevskii
(S) to see the psychologist Luria. He found S rapidly learned complex
material (e.g., lists of over 100 digits) which he remembered perfectly
(even in reverse order) several years later. According to Luria (1968),
“There was no limit either to the capacity of S’s memory or to the
durability of the traces he retained.”
What was S’s secret? He had exceptional imagery and an amazing
capacity for synaesthesia (the tendency for processing in one modality
to evoke other sense modalities). For example, when hearing a tone, he
said: “It looks like fireworks tinged with a pink-red hue.”
Do you envy S’s memory powers? Ironically, his memory was so good
it disrupted his everyday life. For example, this was his experience when
hearing a prose passage: “Each word calls up images, they collide with
one another, and the result is chaos.” His mind came to resemble “a
junk heap of impressions”. His acute awareness of details meant he
sometimes failed to recognise someone he knew if, for example, their
facial colouring had altered because they had been on holiday. These
memory limitations made it hard for him to live a normal life and he
eventually ended up in an asylum.
KEY TERM
Synaesthesia
The tendency for one
sense modality to evoke
another.
Most forgetting studies focus on declarative or explicit memory involving conscious recollection (see Chapter 7). Forgetting is often slower in
implicit than explicit memory.
For example, Mitchell (2006) asked participants to identify pictures
from fragments having seen some of them in an experiment 17 years previously. Performance was better with the previously seen pictures, providing evidence for very-long-term implicit memory. However, there was little
explicit memory for the previous experiment. A 36-year-old male participant confessed, “I’m sorry – I don’t really remember this experiment at all.”
Below we discuss major theories of forgetting. These theories are
not mutually exclusive – they all identify factors jointly responsible for
forgetting.
Decay
Perhaps the simplest explanation for forgetting of long-term memories is
decay, which involves “forgetting due to a gradual loss of the substrate of
memory” (Hardt et al., 2013, p. 111). More specifically, forgetting often
occurs because of decay processes occurring within memory traces. In spite
of its plausibility, decay has largely been ignored as an explanation of forgetting. Hardt et al. argued a decay process (operating mostly during sleep)
removes numerous trivial memories we form every day. This decay process
is especially active in the hippocampus (part of the medial temporal lobe
involved in acquiring new memories; see Chapter 7).
Forgetting can be due to decay or interference (discussed shortly).
Sadeh et al. (2016) assumed detailed memories (i.e., containing contextual
information) are sufficiently complex to be relatively immune to interference
from other memories. As a result, most forgetting of such memories should
be due to decay. In contrast, weak memories (i.e., lacking contextual information) are very susceptible to interference and so forgetting of such memories should be primarily due to interference rather than decay. Sadeh et al.’s
findings supported these assumptions. Thus, the role played by decay in
forgetting depends on the nature of the underlying memory traces.
Interference theory
Interference theory was the dominant approach to forgetting during much
of the twentieth century. According to this theory, long-term memory
is impaired by two forms of interference: (1) proactive interference –
disruption of memory by previous learning; (2) retroactive interference –
disruption of memory for previously learned information by other learning
or processing during the retention interval. Research using methods such
as those shown in Figure 6.24 indicates proactive and retroactive
interference are both maximal when two different responses are associated
with the same stimulus.
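A concrete sketch of the standard design (using the Cat–Dirt and Cat–Tree word pairs that reappear later in this section) may help:

List 1 (learn): Cat–Dirt
List 2 (learn): Cat–Tree
Test on List 2 (Cat–???): intrusions of Dirt reflect proactive interference.
Test on List 1 (Cat–???): intrusions of Tree reflect retroactive interference.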
KEY TERMS
Proactive interference
Disruption of memory by
previous learning (often of
similar material).
Retroactive interference
Disruption of memory
for previously learned
information by other
learning or processing
occurring during the
retention interval.
Proactive interference
Proactive interference typically involves competition between the correct
response and an incorrect one. There is greater competition (and thus more
interference) when the incorrect response is associated with the same stimulus as the correct response. Jacoby et al. (2001) found proactive interference
was due much more to the strength of the incorrect response than the weakness of the correct response. Thus, it is hard to exclude incorrect responses
from the retrieval process.
More evidence for the importance of retrieval processes was reported
by Bäuml and Kliegl (2013). They tested the hypothesis that proactive
interference is often found because rememberers’ memory search is too
Figure 6.24
Methods of testing for proactive and retroactive interference.
broad, including material previously learned
but currently irrelevant. In the remember
(proactive interference) condition, three word
lists were presented followed by free recall of
the last one. In the forget condition, the same
lists were presented but participants were told
after the first two lists to forget them. Finally,
there was a control (no proactive interference)
condition where only one list was learned and
tested.
Participants in the control condition
recalled 68% of the words compared to only
41% in the proactive interference condition.
Crucially, participants in the forget condition recalled 68% of the words despite having
learned two previous lists. The instruction to forget the first two lists
allowed participants to limit their retrieval efforts to the third list.
This interpretation was strengthened by the finding that retrieval speed
was comparable in the forget and control conditions (see Figure 6.25).
Figure 6.25
Percentage of items recalled over time for the conditions: no proactive
interference (PI), remember (proactive interference) and forget (forget
previous lists).
From Bäuml & Kliegl (2013). Reprinted with permission of Elsevier.
Kliegl et al. (2015) found in a similar
study that impaired encoding (see Glossary) contributes to proactive interference. Encoding was assessed using electroencephalography (EEG; see
Glossary). The EEG indicated there was reduced attention during encoding
of a word list preceded by other word lists (proactive interference condition). As in the study by Bäuml and Kliegl (2013), there was also evidence
that proactive interference impaired retrieval.
Suppose participants learn word pairs on the first list (e.g., Cat–Dirt)
and more word pairs on the second list (e.g., Cat–Tree). They are then
given the first words (e.g., Cat) and must recall the paired word from the
second list (see Figure 6.24).
Jacoby et al. (2015) argued proactive interference (e.g., recalling
Dirt instead of Tree) often occurs because participants fail to recognise changes in the word pairings between lists. As predicted, when they
instructed some participants to detect changed pairs, there was proactive
facilitation rather than interference. Thus, proactive interference can be
reduced (or even reversed) if we recollect the changes between information
learned originally and subsequently.
Retroactive interference
Anecdotal evidence that retroactive interference can be important in everyday life comes from travellers claiming exposure to a foreign language
reduces their ability to recall words in their own language. Misra et al.
(2012) studied bilinguals whose native language was Chinese and second
language was English. They named pictures in Chinese more slowly after
previously naming the same pictures in English. The evidence from
event-related potentials suggested participants were inhibiting
second-language names when naming pictures in Chinese.
As discussed earlier, Jacoby et al. (2015) found evidence for proactive
facilitation rather than interference when participants explicitly focused on
changes between the first and second lists (e.g., Cat–Dirt and Cat–Tree).
Jacoby et al. also found that instructing participants to focus on changes
between lists produced retroactive facilitation rather than interference.
Focusing on changes made it easier for participants to discriminate accurately between list 1 responses (e.g., Dirt) and list 2 responses (e.g., Tree).
Retroactive interference is generally greatest when the new learning
resembles previous learning. However, Dewar et al. (2007) obtained evidence of retroactive interference for a word list when participants performed an unrelated task (e.g., detecting tones) between learning and
memory test. Fatania and Mercer (2017) found children were more susceptible than adults to non-specific retroactive interference, perhaps
because they used fewer effective strategies (e.g., rehearsal) to minimise
such interference.
In sum, retroactive interference can occur in two ways:
(1) learning material similar to the original learning material;
(2) distraction involving expenditure of mental effort during the retention interval (non-specific retroactive interference); this cause of retroactive interference is probably most common in everyday life.
Retrieval problems play a major role in producing retroactive interference.
Lustig et al. (2004) found that much retroactive interference occurs because
people find it hard to avoid retrieving information from the wrong list. How
can we reduce retrieval problems? Unsworth et al. (2013) obtained substantial retroactive interference when two word lists were presented prior to
recall of the first list. When focused retrieval was made easier (the words
in each list belonged to two separate categories such as animals and trees),
there was no retroactive interference.
Ecker et al. (2015) also tested recall of the first list following presentation of two word lists. When the time interval between lists was long rather
than short, recall performance was better. Focusing retrieval on first-list
words was easier when the two lists were more separated in time and thus
more discriminable.
Evaluation
There is convincing evidence for both proactive and retroactive interference, and progress has been made in identifying the underlying processes.
Proactive and retroactive interference depend in part on problems with
focusing retrieval exclusively on to-be-remembered information. Proactive
interference also depends on impaired encoding of information. Both types
of interference can be reduced by active strategies (e.g., focusing on changes
between the two lists).
What are the limitations of theory and research in this area? First,
interference theory explains why forgetting occurs but does not explain
why forgetting rate decreases over time. Second, we need clarification of
the roles of impaired encoding and impaired retrieval in producing interference effects. For example, there may be interaction effects with impaired
encoding reducing the efficiency of retrieval. Third, the precise mechanisms
responsible for the reduced interference effects with various strategies have
not been identified.
KEY TERMS
Repression
Motivated forgetting of traumatic or other threatening events (especially from childhood).
Recovered memories
Childhood traumatic memories forgotten for several years and then remembered in adult life.
Motivated forgetting
Interest in motivated forgetting was triggered by the bearded Austrian psychologist Sigmund Freud (1856–1939). His approach was narrowly focused
on repressed traumatic and other distressing memories. More recently, a
broader approach to motivated forgetting has been adopted. Much information in long-term memory is outdated and useless for present purposes
(e.g., where you have previously parked your car). Thus, motivated or
intentional forgetting can be adaptive.
Repression
Freud claimed threatening or traumatic memories often cannot gain access
to conscious awareness: this serves to reduce anxiety. He used the term
repression to refer to this phenomenon. He claimed childhood traumatic
memories forgotten for many years are sometimes remembered in adult life.
Freud found these recovered memories were often recalled during therapy.
However, many experts (e.g., Loftus & Davis, 2006) argue most recovered
memories are false memories referring to imaginary events.
Relevant evidence concerning the truth of recovered memories was
reported by Lief and Fetkewicz (1995). Of adult patients who admitted
reporting false recovered memories, 80% had therapists who had made
direct suggestions they had been subjected to childhood sexual abuse. These
findings suggest recovered memories recalled inside therapy are more likely
to be false than those recalled outside.
Geraerts et al. (2007) obtained support for the above suggestion in a
study on three adult groups who had suffered childhood sexual abuse:
(1) Suggestive therapy group: their recovered memories were recalled initially inside therapy.
(2) Spontaneous recovery group: their recovered memories were recalled initially outside therapy.
(3) Continuous memory group: they had continuous memories of abuse from childhood onwards.
Geraerts et al. (2007) argued the genuineness of the memories produced
could be assessed approximately by using corroborating evidence (e.g., the
abuser had confessed). Such evidence was available for 45% of the continuous memory group and 37% of the outside therapy group but for 0% of the
inside therapy group. These findings suggest recovered memories recalled
outside therapy are much more likely to be genuine than those recalled
inside therapy.
Geraerts (2012) reviewed research comparing women whose recovered memories were recalled spontaneously or in therapy. Of importance,
those with spontaneous recovered memories showed more ability to suppress unwanted memories and were more likely to forget they remembered
something previously. Spontaneous recovery memories are often triggered
by relevant retrieval cues (e.g., returning to the scene of the abuse).
It seems surprising that women recovering memories outside therapy
failed for many years to remember childhood sexual abuse. However, this is
surprising only if the memories are traumatic (as Freud assumed). In fact, only 8%
of women with recovered memories regarded the relevant events as traumatic or sexual when they occurred (Clancy & McNally, 2005/2006). The
great majority described their memories as confusing or uncomfortable –
it seems reasonable that confusing or uncomfortable memories could be
suppressed or simply ignored or forgotten.
In sum, many assumptions about recovered memories are false. As
McNally and Geraerts (2009, p. 132) concluded, “A genuine recovered
CSA [childhood sexual abuse] memory does not require repression, trauma,
or even complete forgetting.”
KEY TERM
Directed forgetting
Reduced long-term
memory caused by
instructions to forget
information that had been
presented for learning.
Directed forgetting
Directed forgetting is a phenomenon involving impaired long-term
memory triggered by instructions to forget information previously presented for learning. It is often studied using the item method: several words
are presented, each followed immediately by an instruction to remember or forget it. After the words have been presented, participants are
tested for recall or recognition memory of all the words. Memory performance is worse for the to-be-forgotten words than the to-be-remembered
ones.
What causes directed forgetting? The instructions cause learners to
direct their rehearsal processes to to-be-remembered items at the expense
of to-be-forgotten ones. Inhibitory processes are also involved. Successful
forgetting is associated with activation in areas within the right frontal
cortex involved in inhibition (Rizio & Dennis, 2013).
Directed forgetting is often unsuccessful. Rizio and Dennis (2017)
found 60% of items associated with forget instructions (Forget items) were
successfully recognised compared to 73% for items associated with remember instructions (Remember items). They then considered brain activation
for successfully recognised items associated with a feeling of remembering.
There was greater activation in prefrontal areas associated with effortful
processing for recognised Forget items than recognised Remember items.
This enhanced effort was required because participants engaged in inhibitory processing of Forget items at encoding even if they were subsequently
recognised.
Think/No-Think paradigm: suppression
Anderson and Green (2001) developed the Think/No-Think paradigm to
assess whether individuals can actively suppress memories. Participants
learn a list of cue–target word pairs (e.g., Ordeal–Roach; Steam–Train).
Then they receive the cues studied earlier (e.g., Ordeal; Steam) and try
to recall the associated words (e.g., Roach; Train) (respond condition) or
prevent them coming to mind (suppress condition). Some cues are not
presented at this stage (baseline condition).
Finally, there are two testing conditions.
In the same-probe test condition, the original
cues are presented (e.g., Ordeal) and participants recall the corresponding target words
(e.g., Roach). In the independent-probe test
condition, participants are presented with a
novel category cue (e.g., Roach might be cued
with Insect–r).
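Laid out schematically (using the example pairs above), the full procedure is:

Learning phase: study Ordeal–Roach, Steam–Train, and so on.
Think/No-Think phase: Ordeal → try to recall Roach (respond condition); Steam → keep Train out of mind (suppress condition); some cues withheld (baseline condition).
Test phase: same-probe (Ordeal–???) or independent-probe (Insect–r → ???).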
If people can suppress unwanted memories, recall should be lower in the suppress
than the respond condition. Recall should also
be lower in the suppress condition than the
baseline condition. Anderson and Huddleston
(2012) carried out a meta-analysis of 47
experiments and found strong support for both predictions (see Figure 6.26).
However, suppression attempts were often unsuccessful: in the suppress
condition (same-probe test), 82% of items were recalled.
Figure 6.26
Percentage of words correctly recalled across 32 articles in the respond,
baseline and suppress conditions (in that order, reading from left to right)
with same-probe and independent-probe testing conditions.
From Anderson and Huddleston (2012). Reproduced with permission of Springer Science+Business Media.
What strategies do individuals use to
produce successful suppression of unwanted
memories? Direct suppression (focusing on the cue word and blocking out
the associated target word) is an important strategy. Thought substitution
(associating a different non-target word with each cue word) is also very
common. Bergström et al. (2009) found these strategies were comparably
effective in reducing recall in the suppress condition.
Anderson et al. (2016b) pointed out the Think/No-Think paradigm is
unrealistic in that we rarely make deliberate efforts to retrieve suppressed
memories in everyday life. They argued it would be more realistic to assess
the involuntary or spontaneous retrieval of suppressed memories. They
found suppression was even more effective at reducing involuntary
retrieval of such memories than at reducing their voluntary retrieval.
How do suppress instructions cause forgetting? Anderson (e.g.,
Anderson & Huddleston, 2012) argues inhibitory control is important –
the learned response to the cue word is inhibited. More specifically, he
assumes inhibitory control involves the dorsolateral prefrontal cortex and
other frontal areas. Prefrontal activation leads to reduced activation in the
hippocampus (of central importance in learning and memory).
There is much support for the above hypothesis. First, there is typically greater dorsolateral prefrontal activation during suppression
attempts than retrieval but reduced hippocampal activation (Anderson
et al., 2016b). Second, studies focusing on connectivity between the dorsolateral prefrontal cortex and hippocampus indicated the former influences the latter (Anderson et al., 2016b). Third, individuals whose left
and right hemisphere frontal areas involved in inhibitory control are
most closely coordinated exhibit superior memory suppression (Smith
et al., 2018).
Evaluation
Most individuals can actively suppress unwanted memories, making them
less likely to be recalled on purpose or involuntarily. Progress has been
made in identifying the underlying mechanisms. Of most importance, inhibitory control mechanisms associated with the prefrontal cortex (especially
the dorsolateral prefrontal cortex) often reduce hippocampal activation
(Anderson et al., 2016b).
What are the limitations of theory and research in this area? First,
more research is required to clarify the reasons why suppression attempts
are often unsuccessful. Second, the reduced recall typically obtained in the
suppress condition is not always due exclusively to inhibitory processes.
Some individuals use thought substitution, a strategy which reduces recall
by producing interference or competition with the correct words (Bergström
et al., 2009). However, del Prete et al. (2015) argued (with supporting evidence) that inhibitory processes play a part in explaining the successful use
of thought substitution.
KEY TERM
Encoding specificity
principle
The notion that retrieval
depends on the overlap
between the information
available at retrieval and
the information in the
memory trace.
Cue-dependent forgetting
We often attribute forgetting to the weakness of relevant memory traces.
However, forgetting often occurs because we lack the appropriate retrieval
cues (cue-dependent forgetting). For example, suppose you have forgotten
the name of an acquaintance. If presented with four names, however, you
might well recognise the correct one.
Tulving (1979) argued that forgetting typically occurs when there is
a poor match or fit between memory-trace information and information
available at retrieval. This notion was expressed in his encoding specificity
principle: “The probability of successful retrieval of the target item is a
monotonically increasing function of informational overlap between the
information present at retrieval and the information stored in memory”
(p. 478). (If you are bewildered, note that a “monotonically increasing
function” is one that generally rises and does not
decrease at any point.)
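In symbols (our paraphrase rather than Tulving’s own notation), the principle amounts to:

P(retrieval) = f(o)

where o is the informational overlap between the retrieval cues and the memory trace, and f is monotonically increasing: if o1 ≤ o2, then f(o1) ≤ f(o2). The principle says nothing about the exact shape of f – only that more overlap never produces poorer retrieval.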
The encoding specificity principle resembles the
notion of transfer-appropriate processing (Morris
et al., 1977; discussed earlier, see p. 263). The main
difference is that the latter focuses more directly on
the processes involved in memory.
Tulving (1979) assumed that when we store information about an event, we also store information
about its context. According to the encoding specificity
principle, memory is better when the retrieval context
is the same as that at learning. Note that context can
be external (the environment in which learning and
retrieval occur) or internal (e.g., mood state).
Endel Tulving. Courtesy of Anders Gade.
Eysenck (1979) argued that long-term memory does not depend only on the
match between information available at retrieval and stored information.
The extent to which the retrieval information allows us
to discriminate between the correct memory trace and incorrect ones also
matters (discussed further below, see p. 298).
Findings
Recognition memory is typically much better than recall (e.g., we can recognise names we cannot recall). However, it follows from the encoding
specificity principle that recall will be better than recognition memory when
information in the recall cue overlaps more with memory-trace information
than does the information in the recognition cue. This surprising finding has been reported
many times. For example, Muter (1978) found people were better at recalling famous names (e.g., author of the Sherlock Holmes stories: Sir Arthur
Conan ___) than selecting the same names on a recognition test (e.g.,
DOYLE).
Much research indicates the importance of context in determining forgetting. On the assumption that information about mood state (internal
context) is often stored in the memory trace, there should be less forgetting when the mood state at learning and retrieval is the same rather than
different. This phenomenon (mood-state-dependent memory) has often
been reported (see Chapter 15). Godden and Baddeley (1975) manipulated
external context. Divers learned words on a beach or 10 feet underwater
and then recalled the words in the same or the other environment. Recall
was much better in the same environment.
However, Godden and Baddeley (1980) found no effect of context in a
very similar experiment testing recognition memory rather than recall. This
probably happened because the presence of the learned items on the recognition test provided powerful cues outweighing any impact of context.
Bramão and Johansson (2017) found that having the same picture
context at learning and retrieval enhanced memory for word pairs provided that each word pair was associated with a different picture context.
However, having the same picture context at learning and retrieval impaired
memory when each word pair was associated with the same picture context.
In this condition, the picture context did not provide useful information
specific to each of the word pairs being tested.
The encoding specificity principle can be expressed in terms of brain
activity: “Memory success varies as a function of neural encoding patterns
being reinstated at retrieval” (Staudigl et al., 2015, p. 5373). Several studies
have supported the notion that neural reinstatement is important for
memory success. For example, Wing et al. (2015) presented scenes paired
with matching verbal labels at encoding and asked participants to recall the
scenes in detail when presented with the labels at retrieval. Recall performance was better when brain activity at encoding and retrieval was similar
in the occipito-temporal cortex, which is involved in visual processing.
Limitations on the predictive power of neural reinstatement were shown
by Mallow et al. (2015) in a study on trained memory experts learning
the locations of 40 digits presented in a matrix. They turned the numbers
into concrete objects, which were then mentally inserted into a memorised
route. On average, they recalled 86% of the digits in the correct order.
However, none of the main brain areas active during encoding was activated during recall: thus, there was remarkably little neural reinstatement.
This happened because the processes occurring during encoding were very
different from (and much more complex than) those occurring at retrieval.
Suppose you learn paired associates including park–grove and are later
given the cue word park and asked to supply the target or response word
(i.e., grove). The response words to the other paired associates are either
associated with park (e.g., tree; bench; playground) or not associated. In the
latter case, the cue is uniquely associated with the target word and so your
task should be easier. There is high overload when a cue is associated with
several response words and low overload when it is only associated with
one response word. The target word is more distinctive when there is low
overload (distinctiveness was discussed earlier in the chapter).
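One informal way to combine encoding-retrieval overlap with cue overload (our illustration, not a model proposed by the researchers cited here) is a ratio rule of the kind used in several formal memory models:

P(recall of target, given cue) ≈ s(cue, target) / [s(cue, target) + Σ s(cue, competitor)]

where s denotes the associative strength between a cue and a stored item. Greater encoding-retrieval overlap raises the numerator, but cue overload adds competing terms to the denominator, so recall can suffer even when overlap is maximal.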
Goh and Lu (2012) tested the above predictions. Encoding-retrieval
overlap was manipulated by using three item types. There was maximal
overlap when the same cue was presented at retrieval and learning (e.g.,
park–grove followed by park–???); this was an intra-list cue. There was
moderate overlap when the cue was a strong associate of the target word
(e.g., airplane–bird followed by feather–???). Finally, there was little overlap
when the cue was a weak associate of the target word (e.g., roof–tin followed by armour–???).
As predicted from the encoding specificity principle, encoding-retrieval
overlap was important (see Figure 6.27). However, cue overload was
also important – memory performance was much better when each cue
was uniquely associated with a single response word. According to the
encoding specificity principle, memory performance should be best when
encoding-retrieval overlap is highest (i.e., with intra-list cues). However,
that was not the case with high overload.
Evaluation
Tulving’s approach based on the encoding
specificity principle has several strengths.
The overlap between memory-trace information and that available in retrieval cues often
determines retrieval success. The principle
has also received some support from neuroimaging studies and research on mood-state-dependent memory (see Chapter 15). The
notion that contextual information (external
and internal) strongly influences memory performance has proved correct.
What are the limitations with Tulving’s
approach? First, he exaggerated the importance of encoding-retrieval overlap as the
major factor determining remembering and
forgetting. Remembering typically involves
rejecting incorrect items as well as selecting correct ones. For this purpose, a cue’s
ability to discriminate among memory traces
is important (Bramão & Johansson, 2017;
Eysenck, 1979; Goh & Lu, 2012).
Figure 6.27
Proportion of words recalled in high- and low-overload
conditions with intra-list cues, strong extra-list cues and weak
extra-list cues.
From Goh and Lu (2012). © 2011 Psychonomic Society, Inc. Reprinted
with the permission of Springer.
Second, neural reinstatement of encoding brain activity at retrieval is
sometimes far less important than implied by the encoding specificity principle. This is especially the case when the processes at retrieval are very
different from those used at encoding (e.g., Mallow et al., 2015).
Third, Tulving’s assumption that retrieval-cue information is compared
directly with memory-trace information is oversimplified. For example,
you would probably use complex problem-solving strategies to answer
the question, “What did you do six days ago?”. Remembering is a more
dynamic, reconstructive process than implied by Tulving (Nairne, 2015a).
Fourth, as Nairne (2015a, p. 128) pointed out, “Each of us regularly
encounters events that ‘match’ prior episodes in our lives . . . but few of
these events yield instances of conscious recollection.” Thus, we experience
less conscious recollection than implied by the encoding specificity principle.
Fifth, it is not very clear from the encoding specificity principle why
context effects are often greater on recall than recognition memory (e.g.,
Godden & Baddeley, 1975, 1980).
Sixth, memory allegedly depends on “informational overlap” between
memory trace and retrieval environment, but this is rarely assessed.
Inferring the amount of informational overlap from memory performance
is circular reasoning.
KEY TERMS
Consolidation
A basic process within the brain involved in establishing long-term memories; this process lasts several hours or more and newly formed memories are fragile.
Retrograde amnesia
Impaired ability of amnesic patients to remember information and events from the time period prior to the onset of amnesia.
Consolidation and reconsolidation
The theories discussed so far identify factors that cause forgetting, but
do not indicate clearly why the rate of forgetting decreases over time. The
answer may lie in consolidation. According to this theory, consolidation
“refers to the process by which a temporary, labile memory is transformed
into a more stable, long-lasting form” (Squire et al., 2015, p. 1).
According to the standard theory, episodic memories are initially
dependent on the hippocampus. However, during the process of consolidation, these memories are stored within cortical networks. This theory is
oversimplified: the process of consolidation involves bidirectional interactions between the hippocampus and the cortex (Albo & Gräff, 2018).
The key assumption of consolidation theory is that recently formed
memories are still being consolidated and so are especially vulnerable to
interference and forgetting. Thus, “New memories are clear but fragile and
old ones are faded but robust” (Wixted, 2004, p. 265).
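A worked numerical example (ours, purely illustrative) shows what a decreasing rate of forgetting looks like. Suppose the proportion of material retained after t days follows a power function of the general kind often fitted to forgetting data:

R(t) = 0.8(1 + t)^-0.5

Between day 0 and day 1, retention falls from .80 to about .57 (a loss of 23 percentage points); between day 7 and day 8, it falls only from about .28 to .27. On consolidation theory, the young, still-consolidating memory is fragile and easily disrupted, whereas the older, consolidated memory changes little.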
Findings
Much research supports consolidation theory. First, the decreased rate of
forgetting typically found over time can be explained by assuming recent
memories are more vulnerable than older ones due to an ongoing consolidation process.
Second, there is research on retrograde amnesia, which involves
impaired memory for events occurring before amnesia onset. As predicted
by consolidation theory, patients with damage to the hippocampus often
show greatest forgetting for memories formed shortly before amnesia onset
and least for more remote memories (e.g., Manns et al., 2003). However,
the findings are somewhat mixed (see Chapter 7).
Squire et al. (1975) assessed recognition memory before and after
patients were given electroconvulsive therapy. Electroconvulsive therapy
reduced their memory for television programmes broadcast up to 3 years beforehand from 65%
to 42% but had no effect on memories acquired 4 to 17 years earlier.
Third, individuals who drink excessively sometimes experience
“blackouts” (an almost total loss of memory for events occurring while
drunk). These blackouts probably indicate a failure to consolidate memories formed while intoxicated. As predicted, Moulton et al. (2005) found
long-term memory was impaired in individuals who drank alcohol shortly
before learning. However, alcohol consumption shortly after learning led
to improved memory. Alcohol may inhibit the subsequent formation of
new memories that would interfere with the consolidation process of memories formed just before alcohol consumption.
Fourth, consolidation theory predicts newly formed memories are
more susceptible to retroactive interference than older ones. There is some
support for this prediction when the interfering material is dissimilar to
that in the first learning task (Wixted, 2004).
Fifth, consolidation processes during sleep can enhance long-term
memory (Paller, 2017). Consider a technique known as targeted memory
reactivation: sleeping participants are exposed to auditory or olfactory cues
(the latter relate to the sense of smell) present in the context where learning
took place. This enhances memory consolidation by reactivating brain networks (including the hippocampus) involved in encoding new information
and increases long-term memory (Schouten et al., 2017).
KEY TERM
Reconsolidation
This is a new process of
consolidation occurring
when a previously
formed memory trace
is reactivated; it allows
that memory trace to be
updated.
Reconsolidation
Consolidation theory assumes memory traces are “fixated” because of a
consolidation process. However, accumulating evidence indicates that is
oversimplified. The current view is that consolidation involves progressive transformation of memory traces rather than simply fixation (Elsey
et al., 2018). Of most importance, reactivation of previously consolidated
memory traces puts them back into a fragile state that can lead to those
memory traces being modified (Elsey et al., 2018). Reactivation can lead to
reconsolidation (a new consolidation process).
Findings
Reconsolidation is very useful for updating our knowledge when previous learning has become outdated or irrelevant. However, it can impair memory for the
information learned originally. This is how it happens. We learn some
information at Time 1. At Time 2, we learn additional information. If the
memory traces based on the information learned at Time 1 are activated
at Time 2, they immediately become fragile. As a result, some information
learned at Time 2 will mistakenly become incorporated into the memory
traces of Time 1 information and thus cause misremembering.
Here is a concrete example. Chan and LaPaglia (2013) had participants watch a movie about a fictional terrorist attack (original learning).
Subsequently, some recalled 24 specific details from the movie (e.g., a terrorist using a hypodermic syringe) to produce reconsolidation (reactivation)
whereas others performed an irrelevant distractor task (no reactivation).
After that, the participants encountered misinformation (e.g., the terrorist
used a stun gun) or neutral information (relearning). Finally, there was a
recognition-memory test for the information in the movie.
What did Chan and LaPaglia (2013) find? Misinformation during the
relearning phase led to substantial forgetting of information from the movie
in the reactivation/reconsolidation condition but not the no-reactivation
condition. Reactivating memory traces from the movie triggered reconsolidation, making those memory traces vulnerable to disruption from misinformation. In contrast, memory traces not subjected to reconsolidation
were not disrupted.
Scully et al. (2017) reported a meta-analytic review based on 34 experiments. As predicted, memory reactivation made memories susceptible to
behavioural interference leading to impaired memory performance for the
original learning event. These findings presumably reflect a reconsolidation
process. However, the mean effect size was small and some studies (e.g.,
Hardwicke et al., 2016) failed to obtain significant effects.
Evaluation
Consolidation theory explains why the rate of forgetting decreases over
time. It also successfully predicts that retrograde amnesia is greater for
recently formed memories and that retroactive interference effects are greatest when the interfering information is presented shortly after learning.
Consolidation processes during sleep are important in promoting long-term
memory and progress has been made in understanding the underlying processes (e.g., Vahdat et al., 2017).
Reconsolidation theory helps to explain how memories are updated
and no other theory can explain the range of phenomena associated
with reconsolidation (Elsey et al., 2018). It is a useful corrective to the
excessive emphasis of consolidation theory on the permanent storage of
memory traces. Reconsolidation may prove very useful in clinical contexts.
For example, patients with post-traumatic stress disorder (PTSD) typically
experience flashbacks (vivid re-experiencing of trauma-related events).
There is preliminary evidence that reconsolidation can be used successfully
in the treatment of PTSD (Elsey et al., 2018).
What are the limitations of this theoretical approach?
(1) Forgetting does not depend solely on consolidation but also depends on factors (e.g., encoding-retrieval overlap) not considered within the theory.
(2) Consolidation theory does not explain why proactive and retroactive interference are greatest when two different responses are associated with the same stimulus.
(3) Much remains to be done to bridge the gap between consolidation theory (with its focus on physical processes within the brain) and approaches to forgetting that emphasise cognitive processes.
(4) Consolidation processes are very complex and only partially understood. For example, it has often been assumed that cortical networks become increasingly important during consolidation. In addition,
however, consolidation is associated with a reorganisation within the hippocampus (Dandolo & Schwabe, 2018).
(5) How memory retrieval makes consolidated memories vulnerable and susceptible to reconsolidation remains unclear (Bermúdez-Rattoni & McGaugh, 2017).
(6) It has not always been possible to replicate reconsolidation effects. For example, Hardwicke et al. (2016) conducted seven studies but found no evidence of reconsolidation.
(7) Impaired memory performance for reactivated memory traces is typically explained as indicating that reconsolidation has disrupted storage of the original memory traces. However, it may also reflect problems with memory retrieval (Hardwicke et al., 2016).
CHAPTER SUMMARY
• Short-term vs long-term memory. The multi-store model assumes
there are separate sensory, short-term and long-term stores.
Much evidence (e.g., from amnesic patients) provides general
support for the model, but it is greatly oversimplified. According
to the unitary-store model, short-term memory is the temporarily
activated part of long-term memory. That is partially correct.
However, the crucial term “activation” is not precisely defined. In
addition, research on amnesic patients and neuroimaging studies
suggest the differences between short-term and long-term memory
are greater than assumed by the unitary-store model.
• Working memory. Baddeley’s original working memory model
consisted of three components: an attention-like central executive,
a phonological loop holding speech-based information, and
a visuo-spatial sketchpad specialised for visual and spatial
processing. However, there are doubts as to whether the visuo-spatial sketchpad is as separate from other cognitive processes
and systems as assumed theoretically. The importance of the
central executive can be seen in brain-damaged patients whose
central executive functioning is impaired (dysexecutive syndrome).
The notions of a central executive and dysexecutive syndrome are
oversimplified because they do not distinguish different executive
functions. More recently, Baddeley added an episodic buffer that
stores integrated information in multidimensional representations.
• Working memory: executive functions and individual
differences. Individuals high in working memory capacity have
greater attentional control than low-capacity individuals, and so
are more resistant to external and internal distracting information.
There is a lack of conceptual clarity concerning the crucial
differences between high- and low-capacity individuals, and
potential costs associated with high capacity have rarely been
investigated. According to the unity/diversity framework, research
on executive functions indicates the existence of a common factor
resembling concentration and two specific factors (shifting and
updating). Support for this framework has been obtained from the
psychometric, neuroimaging and genetic approaches. However,
research on brain-damaged patients provides only partial support
for the theoretical framework.
• Levels of processing. Craik and Lockhart (1972) focused on
learning processes in their levels-of-processing theory. They
identified depth of processing, elaboration of processing and
distinctiveness of processing as key determinants of long-term
memory. Insufficient attention was paid to the relationship
between learning processes and those at retrieval and to the
role of distinctive processing in enhancing long-term memory.
The theory is not explanatory, and the reasons why depth of
processing influences explicit memory much more than implicit
memory remain unclear.
• Learning through retrieval. Long-term memory is typically much
better when much of the learning period is devoted to retrieval
practice rather than study and the beneficial effects of retrieval
practice extend to relevant but non-tested information. The
testing effect is greater when it is hard to retrieve the to-be-remembered information. Difficult retrieval probably enhances the
generation and retrieval of effective mediators. There is a reversal
of the testing effect when numerous items are not retrieved
during testing practice; this reversal is explained by the bifurcation
model.
• Implicit learning. Behavioural findings support the distinction
between implicit and explicit learning even though most measures
of implicit learning are relatively insensitive. The brain areas
activated during implicit learning (e.g., striatum) often differ from
those activated during explicit learning (e.g., prefrontal cortex).
However, complexities arise because there are numerous forms
of implicit learning, and learning is often a mixture of implicit and
explicit. Amnesic patients provide some support for the notion of
implicit learning because they generally have less impairment of
implicit than explicit learning. Parkinson’s patients with damage
to the basal ganglia show the predicted impairment of implicit
learning. However, they generally also show impaired explicit
learning and so provide only limited information concerning the
distinction between implicit and explicit learning.
• Forgetting from long-term memory. Some forgetting from long-term memory is due to a decay process operating mostly during
sleep. Strong proactive and retroactive interference effects have
been found inside and outside the laboratory. People use active
control processes to minimise proactive interference. Recovered
memories of childhood abuse are more likely to be genuine when
recalled outside (rather than inside) therapy. Memories can be
deliberately suppressed with inhibitory control processes within
the prefrontal cortex producing reduced hippocampal activation.
Forgetting depends in part on encoding-retrieval overlap
(encoding specificity principle). However, retrieval is often a more
complex and dynamic process than implied by this principle.
Consolidation theory explains the form of the forgetting curve but
de-emphasises the role of cognitive processes. Reconsolidation
theory explains how memories are updated and provides a
useful corrective to consolidation theory’s excessive emphasis on
permanent storage. However, the complex processes involved in
consolidation and reconsolidation are poorly understood.
FURTHER READING
Baddeley, A.D., Eysenck, M.W. & Anderson, M.C. (2020). Memory (3rd edn).
Abingdon, Oxon.: Psychology Press. The main topics covered in this chapter
are discussed in this textbook (for example, Chapters 8–10 are on theories of
forgetting).
Eysenck, M.W. & Groome, D. (eds) (2020). Forgetting: Explaining Memory
Failure. London: Sage. This edited book focuses on causes of forgetting in
numerous laboratory and real-life situations. Chapter 1 by David Groome
and Michael Eysenck provides an overview of factors causing forgetting and a
discussion of the potential benefits of forgetting.
Friedman, N.P. & Miyake, A. (2017). Unity and diversity of executive functions:
Individual differences as a window on cognitive structure. Cortex, 86, 186–204.
Naomi Friedman and Akira Miyake provide an excellent review of our current
understanding of the major executive functions.
Karpicke, J.D. (2017). Retrieval-based learning: A decade of progress. In J. Wixted
(ed.), Learning and Memory: A Comprehensive Reference (2nd edn; pp. 487–514).
Amsterdam: Elsevier. Jeffrey Karpicke provides an up-to-date account of the
testing effect and other forms of retrieval-based learning.
Morey, C.C. (2018). The case against specialised visual-spatial short-term memory.
Psychological Bulletin, 144, 849–883. Candice Morey discusses a considerable
range of evidence apparently inconsistent with Baddeley’s working memory
model (especially the visuo-spatial sketchpad).
Norris, D. (2017). Short-term memory and long-term memory are still different.
Psychological Bulletin, 143, 992–1009. Dennis Norris discusses much evidence
supporting a clear-cut separation between short-term and long-term memory.
Oberauer, K., Lewandowsky, S., Awh, E., Brown, G.D.A., Conway, A., Cowan, N., et al. (2018). Benchmarks for models of short-term and working memory.
Psychological Bulletin, 144, 885–958. This article provides an excellent account of
the key findings relating to short-term and working memory that would need to
be explained by any comprehensive theory.
Shanks, D.R. (2017). Regressive research: The pitfalls of post hoc data selection
in the study of unconscious mental processes. Psychonomic Bulletin & Review,
24, 752–775. David Shanks discusses problems involved in attempting to
demonstrate the existence of implicit learning.
Chapter 7
Long-term memory systems
INTRODUCTION
We have an amazing variety of information stored in long-term memory
(e.g., details of our last summer holiday; Paris is the capital of France; how
to ride a bicycle). Much of this information is stored in schemas or organised packets of knowledge used extensively during language comprehension
(see Chapter 10).
This remarkable variety is inconsistent with Atkinson and Shiffrin’s
(1968) notion of a single long-term memory store (see Chapter 6). More
recently, there has been an emphasis on memory systems (note the plural!).
Each memory system is distinct, having its own specialised brain areas
and being involved in certain forms of learning and memory. Schacter and
Tulving (1994) identified four memory systems: episodic memory; semantic memory; the perceptual representation system; and procedural memory.
Since then, there has been a lively debate concerning the number and
nature of long-term memory systems.
Amnesia
KEY TERMS
Amnesia
A condition caused by brain damage in which there is severe impairment of long-term memory (mostly declarative memory).
Korsakoff’s syndrome
Amnesia (impaired long-term memory) caused by chronic alcoholism.
Suggestive evidence for several long-term memory systems comes from
brain-damaged patients with amnesia. If you are a movie fan you may
have mistaken ideas about the nature of amnesia (Baxendale, 2004). In the
movies, serious head injuries typically cause characters to forget the past
while still being fully able to engage in new learning. In the real world,
however, new learning is typically greatly impaired as well.
Bizarrely, many movies suggest the best cure for amnesia caused by
severe head injury is to suffer another blow to the head! Approximately
40% of Americans believe a second blow to the head can restore memory
in patients whose amnesia was caused by a previous blow (Spiers, 2016).
Patients become amnesic for various reasons. Closed head injury is
the most common cause. However, patients with closed head injury often
have several other cognitive impairments making it hard to interpret their
memory deficits. As a result, much research has focused on patients whose
amnesia is due to chronic alcohol abuse (Korsakoff’s syndrome).
IN THE REAL WORLD: THE FAMOUS CASE OF HM
HM (Henry Gustav Molaison) was the most-studied
amnesic patient of all time. When he was 27, his epileptic
condition was treated by surgery involving removal of
his medial temporal lobes including the hippocampus.
This affected his memory more dramatically than his
general cognitive functioning (e.g., IQ). Corkin (1984,
p. 255) reported many years later that HM “does not
know where he lives, who cares for him, or where he ate
his last meal . . . in 1982 he did not recognise a picture
of himself”.
Research on HM (starting with Scoville and Milner,
1957) transformed our understanding of long-term
memory in several ways (see Eichenbaum, 2015):
(1) Scoville and Milner’s article was “the origin of modern neuroscience research on memory” (Eichenbaum, 2015, p. 71).
(2) HM showed reasonable learning and long-term retention on a mirror-tracing task (drawing objects seen only in reflection) (Corkin, 1968). He also showed learning on the pursuit rotor (manual tracking of a moving target), suggesting there is more than one long-term memory system.
(3) HM had essentially intact short-term memory, supporting the important distinction between short-term and long-term memory (see Chapter 6).
(4) HM had generally good memory for events occurring a long time before his operation. This suggests memories are not stored permanently in the hippocampus.
Henry Molaison, the most famous amnesic patient of all time. Research on him transformed our knowledge of the workings of long-term memory.
Research on HM led to an exaggerated emphasis on the role of the hippocampus in memory
(Aggleton, 2013). His memory problems were greater than those experienced by the great majority
of amnesic patients with hippocampal damage. This probably occurred mainly because surgery
removed other areas (e.g., the parahippocampal region) and possibly because the anti-epileptic
drugs used by HM damaged brain cells relevant to memory (Aggleton, 2013).
The notion that HM’s brain damage exclusively affected his long-term memory for memories
formed after his operation is oversimplified (Eichenbaum, 2015). Evidence suggests HM had various
deficits in his perceptual and cognitive capacities. It also indicates he had impaired memory for
public and personal events occurring prior to his operation. Thus, HM’s impairments were more
widespread than generally assumed.
In sum, we need to beware of “the myth of HM” (Aly & Ranganath, 2018, p. 1), which consists of
two mistaken assumptions. First, while the hippocampus and medial temporal lobe are important
in episodic memory (memory for personal events), episodic memory depends on a network that
includes several other brain regions. For example, Vidal-Piñeiro et al. (2018) found that long-lasting
episodic memories were associated with greater activation at encoding in inferior lateral parietal
regions as well as the hippocampus.
Second, the role of the hippocampus is not limited to memory. It also includes “other
functions, such as perception, working memory, and implicit memory [memory not involving
conscious recollection]” (Aly & Ranganath, 2018, p. 1). This issue is discussed later (see pp. 332–336).
Korsakoff patients are said to suffer from the “amnesic syndrome”:
●● anterograde amnesia: a marked impairment in the ability to learn and remember information encountered after the onset of amnesia;
●● retrograde amnesia: problems in remembering events prior to amnesia onset (see Chapter 6);
●● only slightly impaired short-term memory on measures such as digit span (repeating back a random string of digits);
●● some remaining learning ability (e.g., motor skills).
KEY TERM
Anterograde amnesia
Reduced capacity for new learning (and subsequent remembering) after the onset of amnesia.
The relationship between anterograde and retrograde amnesia is typically
strong. Smith et al. (2013) obtained a correlation of +.81 between the two
forms of amnesia in patients with damage to the medial temporal lobes.
However, new learning is more easily disrupted by limited brain damage
within the medial temporal lobes than is memory for previously acquired
information. This probably occurs because there has typically been consolidation (see Glossary) of previously acquired information prior to amnesia
onset.
Further evidence the brain areas (and processes) underlying the two
forms of amnesia differ was provided by Buckley and Mitchell (2016).
Damage to the retrosplenial cortex (connected to the hippocampus) caused
retrograde amnesia but not anterograde amnesia.
There are problems with using Korsakoff patients. First, amnesia typically has a gradual onset caused by an increasing deficiency of the vitamin
thiamine. Thus, it is often unclear whether certain past events occurred
before or after amnesia onset.
Second, brain damage in Korsakoff patients typically involves the
medial temporal lobes (especially the hippocampus; see Figure 7.1).
However, there is often damage to the frontal lobes as well producing
various cognitive deficits (e.g., impaired cognitive control). This complicates interpreting findings from these patients.
Third, the precise area of brain damage (and thus the pattern of
memory impairment) varies across patients. For example, some Korsakoff
patients exhibit confusion, lethargy and inattention.
Figure 7.1
Damage to brain areas
within and close to the
medial temporal lobes
(indicated by asterisks)
producing amnesia.
Republished with permission of
Routledge Publishing Inc.
Fourth, research on Korsakoff patients does not provide direct evidence concerning the impact of brain damage on long-term memory. Brain
plasticity and learning of compensatory strategies mean patients can gradually alleviate some memory problems (Fama et al., 2012).
In sum, the study of amnesic patients has triggered several theoretical developments. For example, the distinction between declarative and
non-declarative memory (see below) was originally proposed in part
because of findings from amnesic patients.
Declarative vs non-declarative memory
Historically, the most important distinction between different types of
long-term memory was between declarative memory and non-declarative
memory (Squire & Dede, 2015). Declarative memory involves conscious
recollection of events and facts – it often refers to memories that can be
“declared” or described but also includes memories that cannot be described
verbally. Declarative memory is sometimes referred to as explicit memory
and involves knowing that something is the case.
The two main forms of declarative memory are episodic and semantic memory. Episodic memory is concerned with personal experiences of
events that occurred in a given place at a specific time. Semantic memory
consists of general knowledge about the world, concepts, language and
so on.
In contrast, non-declarative memory does not involve conscious recollection. We typically obtain evidence of non-declarative memory by observing changes in behaviour. For example, consider someone learning to ride a
bicycle. Their cycling ability improves over time even though they cannot
consciously recollect what they have learned. Non-declarative memory is
sometimes known as implicit memory.
There are various forms of non-declarative or implicit memory. One
is memory for skills (e.g., piano playing; bicycle riding). Such memory
involves knowing how to perform certain actions and is known as procedural memory. Another form of non-declarative memory is priming (also
known as repetition priming): it involves facilitated processing of a stimulus presented recently (Squire & Dede, 2015, p. 7). For example, it is easier
to identify a picture as a cat if a similar picture of a cat has been presented
previously. The earlier picture is a prime facilitating processing when the
second cat picture is presented.
Amnesic patients find it much harder to form and remember declarative than non-declarative memories. For example, HM (discussed above)
had extremely poor declarative memory for personal events occurring
after his operation and for faces of those who became famous in recent
decades. In stark contrast, he had reasonable learning ability and memory
for non-declarative tasks (e.g., mirror tracing; the pursuit rotor; perceptual
identification aided by priming).
This chapter contains detailed discussion of declarative and non-declarative memory. Figure 7.2 presents the hugely influential traditional theoretical account, which strongly influenced most of the research discussed in this chapter. However, it is oversimplified. At the end of this chapter, we discuss its limitations and possible new theoretical developments in the section entitled “Beyond memory systems and declarative vs non-declarative memory” (pp. 332–340).
KEY TERMS
Declarative memory: A form of long-term memory that involves knowing something is the case; it involves conscious recollection and includes memory for facts (semantic memory) and events (episodic memory); sometimes known as explicit memory.
Non-declarative memory: Forms of long-term memory that influence behaviour but do not involve conscious recollection (e.g., priming; procedural memory); also known as implicit memory.
Procedural memory: Memory concerned with knowing how; it includes the knowledge required to perform skilled actions.
Priming: Facilitating the processing of (and response to) a target stimulus by presenting a stimulus related to it shortly beforehand.
Repetition priming: The finding that processing of a stimulus is facilitated if it has been processed previously.
Figure 7.2
The traditional theoretical account based on dividing long-term memory into two broad classes: declarative and non-declarative. Declarative memory is divided into episodic and semantic memory, whereas non-declarative memory is divided into procedural memory, priming, simple classical conditioning, and habituation and sensitisation. The assumption that there are several forms of long-term memory is accompanied by the further assumption that different brain regions are associated with each one.
From Henke (2010). Reprinted with permission from Nature Publishing Group.
DECLARATIVE MEMORY
KEY TERMS
Episodic memory: A form of long-term memory concerned with personal experiences or episodes occurring in a given place at a specific time.
Semantic memory: A form of long-term memory consisting of general knowledge about the world, concepts, language and so on.
Declarative or explicit memory encompasses numerous different kinds
of memories. For example, we remember what we had for breakfast this
morning and that “le petit déjeuner” is French for “breakfast”. Tulving
(1972) argued the crucial distinction within declarative memory was
between what he termed “episodic memory” and “semantic memory” (see
Eysenck & Groome, 2015b).
What is episodic memory? According to Tulving (2002, p. 5), “It makes
possible mental time travel through subjective time from the present to the
past, thus allowing one to re-experience . . . one’s own previous experiences.”
Nairne (2015b) identified the three “Ws” of episodic memory: remembering
a specific event (what) at a given time (when) in a particular place (where).
What is semantic memory? It is “an individual’s store of knowledge
about the world. The content of semantic memory is abstracted from actual
experience and is therefore said to be conceptual, that is, generalised and
without reference to any specific experience” (Binder & Desai, 2011, p. 527).
What is the relationship between episodic memory and autobiographical memory (discussed in Chapter 8)? Both are concerned with personal
past experiences. However, much information in episodic memory is trivial
and is remembered only briefly. In contrast, autobiographical memory typically stores information for long periods of time about personally significant events and experiences.
What is the relationship between episodic and semantic memory?
According to Tulving (2002), episodic memory developed out of semantic
memory during the course of evolution. It also develops later in childhood
than semantic memory.
Episodic vs semantic memory
If episodic and semantic memory form separate memory systems, they
should differ in several ways. Consider the ability of amnesic patients to
acquire new episodic and semantic memories. Spiers et al. (2001) reviewed
147 cases of amnesia involving damage to the hippocampus or fornix.
Episodic memory was impaired in all cases, whereas many patients had relatively small impairment of semantic memory.
The above difference in the impact of hippocampal brain damage suggests episodic and semantic memory are distinctly different. However, the
greater vulnerability of episodic memories than semantic ones may occur
mainly because episodic memories are formed from a single experience
whereas semantic memories often combine several learning opportunities.
We would have stronger evidence if we discovered brain-damaged
patients with very poor episodic memory but essentially intact semantic
memory. Elward and Vargha-Khadem (2018) reviewed research on patients
with developmental amnesia (amnesia due to hippocampal damage at a
young age). These patients, “typically show relatively preserved semantic
memory and factual knowledge about the natural world despite severe
impairments in episodic memory” (p. 23).
Vargha-Khadem et al. (1997) studied two patients (Beth and Jon)
with developmental amnesia. Both had very poor episodic memory for
the day’s activities and television programmes, but their semantic memory
(language development; literacy; and factual knowledge) was within the normal range. However, Jon had various problems with semantic memory
(Gardiner et al., 2008). His rate of learning was slower than that of healthy
controls when provided with facts concerning geographical, historical and
other kinds of knowledge. Similarly slow learning in semantic memory
has been found in most patients with developmental amnesia (Elward &
Vargha-Khadem, 2018).
The findings from patients with developmental amnesia are surprising given the typical finding that individuals with an intact hippocampus
depend on it for semantic memory acquisition (Baddeley et al., 2020).
Why, then, is their semantic memory reasonably intact? Two answers have
been proposed. First, developmental amnesics typically devote more time
than healthy individuals to repeated study of factual information. This
may produce durable long-term semantic memories via a process of consolidation (see Glossary and Chapter 6).
Second, episodic memory may depend on the hippocampus whereas
semantic memory depends on the underlying entorhinal, perirhinal and
parahippocampal cortices. Note the brain damage suffered by Jon and Beth
centred on the hippocampus. Bindschaedler et al. (2011) studied a boy (VI)
with severe hippocampal damage but relatively preserved perirhinal and
entorhinal cortex. His performance on semantic memory tasks (e.g., vocabulary) improved at the normal rate even though his performance was very poor
on episodic memory tasks. Many amnesics may have severe problems with
episodic and semantic memory because the hippocampus and underlying cortices are both damaged. This is very likely given the two areas are adjacent.
Curot et al. (2017) applied electrical brain stimulation to memory-related brain areas to elicit reminiscences. Semantic memories were mostly
elicited by stimulation of the rhinal cortex (including the entorhinal and
perirhinal cortices). In contrast, episodic memories were only elicited by
stimulation of the hippocampal region.
Blumenthal et al. (2017) studied a female amnesic (HC) with severe
hippocampal damage but intact perirhinal and entorhinal cortices. She was
given the semantic memory task of generating intrinsic features of objects
(e.g., shape; colour) and extrinsic features (e.g., how the object is used).
HC performed comparably to controls with intrinsic features but significantly worse than controls with extrinsic features. Thus, the hippocampus
is important for learning some aspects of semantic memory.
How can we explain Blumenthal et al.’s (2017) findings? The hippocampus is involved in learning associations between objects and contexts
in episodic memory (see final section of the chapter, pp. 332–340). In a
similar fashion, generating extrinsic features of objects requires learning
associations between objects and their uses.
Retrograde amnesia
We turn now to amnesic patients’ problems with remembering information
learned prior to the onset of amnesia: retrograde amnesia (see Glossary and
Chapter 6). Many amnesic patients have much greater retrograde amnesia
for episodic than semantic memories. Consider the amnesic patient KC.
According to Tulving (2002, p. 13), “He cannot recollect any personally
experienced events . . ., whereas his semantic knowledge [e.g. general world
knowledge] acquired before the critical accident is still reasonably intact.”
There is much support for the notion that remote semantic memories
formed prior to the onset of amnesia are essentially intact (see Chapter 6).
For example, amnesic patients often perform comparably to healthy
controls on semantic memory tasks (e.g., vocabulary knowledge; object
naming). However, Klooster and Duff (2015) argued such findings may
reflect the use of insensitive measures. In their study, Klooster and Duff
gave amnesic patients the semantic memory task of listing features of
common objects. On average, amnesic patients listed only 50% as many
features as healthy controls.
Retrograde amnesia for episodic memories in amnesic patients often
spans several years and has a temporal gradient, i.e., with older memories showing less impairment (Bayley et al., 2006). In contrast, retrograde
amnesia for semantic memories is generally small except for knowledge
acquired shortly before amnesia onset (Manns et al., 2003).
In sum, retrograde amnesia is typically greater for episodic than semantic memories. However, semantic memories can be subject to retrograde amnesia when assessed using sensitive measures.
Semantic dementia
Patients with semantic dementia have severe loss of concept knowledge from semantic memory. However, their episodic memory and most cognitive functions (e.g., attention; non-verbal problem solving) are reasonably intact initially. Semantic dementia always involves degeneration of the anterior temporal lobes. Areas such as the perirhinal and entorhinal cortex are probably involved in the formation of semantic memories. In contrast, the anterior temporal lobes are where such memories are stored semi-permanently.
Patients with semantic dementia have great problems accessing information about concepts stored in semantic memory (Lambon Ralph et al.,
2017). However, their performance on many episodic memory tasks is
good (e.g., they have intact ability to reproduce complex visual designs:
Irish et al., 2016). They also have comparable performance to healthy controls in remembering what tasks they performed 24 hours earlier and where
those tasks were performed (Adlam et al., 2009).
Landin-Romero et al. (2016) reviewed relevant research. The good episodic memory of semantic dementia patients probably occurs because they
make effective use of the frontal and parietal regions within the brain.
In sum, we have an apparent double dissociation (see Glossary).
Amnesic patients have very poor episodic memory but often reasonably
intact semantic memory. In contrast, patients with semantic dementia
have very poor semantic memory but reasonably intact episodic memory.
However, the double dissociation is only approximate and it is hard to
interpret the somewhat complex findings.
KEY TERMS
Semantic dementia: A condition involving damage to the anterior temporal lobes and widespread loss of information about the meanings of words and concepts; however, episodic memory and executive functioning are reasonably intact initially.
Personal semantics: Aspects of one’s personal or autobiographical memory that combine elements of episodic memory and semantic memory.
Interdependence of episodic and semantic memory
We have seen the assumption that there are separate episodic and semantic
memory systems is oversimplified. Here we focus on the interdependence of
episodic and semantic memory. In a study by Renoult et al. (2016), participants answered questions belonging to four categories: (1) unique events
(e.g., “Did you drink coffee this morning?”); (2) general factual knowledge
(e.g., “Do many people drink coffee?”); (3) autobiographical facts (e.g., “Do
you drink coffee every day?”); and (4) repeated personal events (e.g., “Have
you drunk coffee while shopping?”). Category 1 involves episodic memory
and category 2 involves semantic memory. Categories 3 and 4 involve personal semantic memory (a combination of episodic and semantic memory).
Renoult et al. (2016) used event-related potentials (ERPs; see Glossary)
during retrieval for all four question categories. There were clear-cut ERP
differences between categories 1 and 2. Of most importance, ERP patterns
for category 3 and 4 questions were intermediate between those for categories 1 and 2, suggesting they required retrieval from both episodic and
semantic memory.
Tanguay et al. (2018) reported similar findings. They interpreted the
various findings with reference to personal semantics: aspects of autobiographical memory resembling semantic memory in being factual but also
resembling episodic memory in being “idiosyncratically personal” (p. 65).
Greenberg et al. (2009) showed episodic and semantic memory can be
interdependent. Amnesic patients and healthy controls generated as many
members as possible from various categories. Some categories (e.g., kitchen
utensils) were selected so that performance would benefit from using episodic memory, whereas other categories (e.g., things typically red) seemed
less likely to involve episodic memory. Amnesic patients performed worse
than controls, especially with categories potentially benefitting from episodic memory. With those categories, controls were much more likely than
patients to use episodic memory as an efficient organisational strategy to
generate category members.
KEY TERM
Semanticisation: The phenomenon of episodic memories changing into semantic memories over time.
Semanticisation of episodic memory
Robin and Moscovitch (2017) argued that initially episodic memories are transformed into semantic memories over time. For example, the first time you
went to a seaside resort, you formed episodic memories of your experiences
there. As an adult, while you still remember visiting that seaside resort as
a child, you have probably forgotten the personal and contextual information originally associated with your childhood memories. Thus, what was
an episodic memory has become a semantic memory. This change involves
semanticisation of episodic memory and suggests episodic and semantic
memories are related.
Robin and Moscovitch (2017) argued the process of semanticisation
often involves a memory transformation from an initially detail-rich episodic representation to a gist-like or schematic representation involving
semantic memory. They provided a theoretical framework within which to
understand these processes (see Figure 7.3).
There is much support for this theoretical approach (discussed later).
For example, Gilboa and Marlatte (2017) found in a meta-analytic review
that the ventromedial prefrontal cortex is typically involved in schema processing within semantic memory.
Sekeres et al. (2016) tested memory for movie clips. There was much
more forgetting of peripheral detail over time (episodic memory) than of
the gist (semantic memory). St-Laurent et al. (2016) found amnesic patients
with hippocampal damage had reduced processing of episodic perceptual
details.
Robin and Moscovitch (2017) discussed research focusing on changes
in brain activation during recall as time since learning increased. As predicted, there was reduced anterior hippocampal activation but increased
activation in the ventromedial prefrontal cortex. These findings reflected
increased use of gist or schematic information compensating for reduced
availability of details.
Overall evaluation
There is some support for separate episodic and semantic memory systems
in the double dissociation involving amnesia and semantic dementia: the
former is associated with greater impairment of episodic than semantic memory whereas the latter is associated with the opposite pattern.
However, there are complications in interpreting these findings and the
[Figure 7.3 diagram: particular detailed cues (e.g., cake at 10th birthday party), generic cues (e.g., house; party) and particular coarse cues (e.g., Mom’s house) feed perceptual representations (posterior neocortex), details (pHPC), gist (aHPC) and schemas/monitoring (vmPFC), linked by construction and elaboration processes.]
Figure 7.3
Episodic memories (involving perceptual representations and specific details) depend
on the posterior hippocampus (pHPC); semantic memories (involving schemas)
depend on the ventromedial prefrontal cortex (vmPFC); and gist memories (combining
episodic and semantic memory) depend on the anterior hippocampus (aHPC). There are
interactions between these forms of memory caused by processes such as construction
and elaboration.
From Robin and Moscovitch (2017). Reprinted with permission of Elsevier.
double dissociation is only approximate. In addition, episodic and semantic
memory are often interdependent at learning and during retrieval, making
it hard to disentangle their respective contributions.
EPISODIC MEMORY
How can we assess someone’s episodic memory following learning (e.g., a
list of to-be-remembered items)? Recognition and recall are the two main
types of episodic memory test. Recognition-memory tests generally involve
presenting various items with participants deciding whether each one was
presented previously (often 50% were presented previously and 50% were
not). As we will see, more complex forms of recognition-memory test have
also been used.
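To make this concrete, here is a minimal sketch (in Python, with invented items and responses) of how an old/new recognition test might be scored: the hit rate (old items correctly judged “old”) is compared with the false-alarm rate (new items wrongly judged “old”).

# Hypothetical responses on an old/new recognition test (50% old, 50% new).
study_list = {"cat", "table", "river", "lamp"}     # items presented at learning
test_items = ["cat", "spoon", "river", "cliff", "lamp", "horse", "table", "coin"]
responses  = ["old", "old", "old", "new", "new", "new", "old", "new"]

hits = false_alarms = 0
for item, response in zip(test_items, responses):
    if item in study_list:
        hits += response == "old"           # correctly recognising an old item
    else:
        false_alarms += response == "old"   # wrongly calling a new item "old"

n_old = sum(item in study_list for item in test_items)
n_new = len(test_items) - n_old
print(f"Hit rate: {hits / n_old:.2f}; false-alarm rate: {false_alarms / n_new:.2f}")
# Hit rate: 0.75; false-alarm rate: 0.25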
There are three types of recall test: free recall, serial recall and cued recall.
Free recall involves producing previously presented items in any order in
the absence of specific cues. Serial recall involves producing previously presented items in the order they were presented. Cued recall involves producing previously presented items in response to relevant cues. For example, “cat–table” might be presented at learning and the cue “cat–???” at test.
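The three tests differ only in what counts as a correct response, as this minimal sketch (Python; the study list and responses are invented) illustrates:

# Scoring the three recall tests (hypothetical study list and responses).
studied = ["cat", "river", "lamp", "spoon"]          # presentation order
free_output = ["lamp", "cat", "spoon"]               # free recall: any order counts
free_score = sum(word in studied for word in free_output)            # 3

serial_output = ["cat", "lamp", "river", "spoon"]    # serial recall: order must match
serial_score = sum(r == s for r, s in zip(serial_output, studied))   # 2 (positions 1 and 4)

pairs = {"cat": "table"}                             # cued recall: "cat-table" studied
cued_correct = pairs["cat"] == "table"               # response to cue "cat-???" is correct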
KEY TERMS
Free recall: A test of episodic memory in which previously presented to-be-remembered items are recalled in any order.
Serial recall: A test of episodic memory in which previously presented to-be-remembered items must be recalled in the order of presentation.
Cued recall: A test of episodic memory in which previously presented to-be-remembered items are recalled in response to relevant cues.
Recognition memory: familiarity and recollection
Recognition memory can involve recollection or familiarity. Recollection
involves recognition based on conscious retrieval of contextual information
whereas such conscious retrieval is lacking in familiarity-based recognition
(Brandt et al., 2016). Here is a concrete example. Several years ago, the first
author walked past a man in Wimbledon, and was immediately confident
he recognised him. However, he simply could not think where he had previously seen the man. After some thought (this is the kind of thing academic
psychologists think about!), he realised the man was a ticket-office clerk at
Wimbledon railway station. Initial recognition based on familiarity was
replaced by recognition based on recollection.
The remember/know procedure (Migo et al., 2012) has often been
used to assess familiarity and recollection. List learning is followed by
a test where participants indicate whether each item is “Old” or “New”.
Items identified as “Old” are followed by a know or remember judgement.
Typical instructions require participants to respond know if they recognise
the list words, “but these words fail to evoke any specific conscious recollection from the study list” (Rajaram, 1993, p. 102). They should respond
remember if “the ‘remembered’ word brings back to mind a particular
association, image, or something more personal from the time of study”
(Rajaram, 1993, p. 102).
Dunn (2008) proposed a single-process account: strong memory traces
give rise to recollection judgements whereas weak memory traces give rise
to familiarity judgements. As we will see, however, most evidence supports
a dual- or two-process account, namely, that recollection and familiarity
involve different processes.
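The difference between the accounts can be made concrete with a toy simulation of the single-process view (the Gaussian strengths and criterion values below are invented for illustration, not fitted to data): one strength dimension with two criteria generates all three responses.

import random
random.seed(1)

KNOW_CRITERION, REMEMBER_CRITERION = 0.5, 1.5   # two criteria on one strength axis

def judge(strength):
    # A single memory-strength value determines the judgement.
    if strength > REMEMBER_CRITERION:
        return "remember"                       # strong trace
    if strength > KNOW_CRITERION:
        return "know"                           # weaker trace
    return "new"

# Studied items: illustrative Gaussian strengths (mean 1.2, SD 0.5).
judgements = [judge(random.gauss(1.2, 0.5)) for _ in range(1000)]
for response in ("remember", "know", "new"):
    print(response, judgements.count(response))
# On this account no separate recollection process is needed:
# strong traces yield "remember" and weak traces yield "know".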
Brain mechanisms
Diana et al. (2007) provided an influential theoretical account of the key brain areas involved in recognition memory in their binding-of-item-and-context model (see Figure 7.4):
(1) The perirhinal cortex receives information about specific items (what information needed for familiarity judgements).
(2) The parahippocampal cortex receives information about context (where information useful for recollection judgements).
(3) The hippocampus receives what and where information (both of great importance to episodic memory) and binds them to form item–context associations permitting recollection.
Findings
Functional neuroimaging studies support the above model. In a meta-­
analytic review, Diana et al. (2007) found recollection was associated
with more activation in parahippocampal cortex and the hippocampus
than perirhinal cortex. In contrast, familiarity was associated with more
activation in perirhinal cortex than the parahippocampal cortex or
hippocampus.
Figure 7.4
(a) Locations of the
hippocampus (red), the
perirhinal cortex (blue) and
the parahippocampal cortex
(green); (b) the binding-ofitem-and-context model.
From Diana et al. (2007).
Reprinted with permission of
Oxford University Press.
Neuroimaging evidence is correlational and so cannot show the hippocampus is more essential to recollection than familiarity. In principle,
more direct evidence could be obtained from brain-damaged patients.
Bowles et al. (2010) studied amnesic patients with severe hippocampal
damage. As predicted, these patients had significantly impaired recollection
but not familiarity. However, other research has typically found amnesic
patients with medial temporal lobe damage have a minor impairment in
familiarity but a much larger one in recollection (Skinner & Fernandes,
2007).
According to the model, patients with damage to the perirhinal cortex
should have largely intact recollection but impaired familiarity. Bowles
et al. (2011) tested this prediction with a female patient, NB. As predicted,
her recollection performance was consistently intact. However, she had
impaired familiarity for verbal materials. Brandt et al. (2016) studied a
female patient, MR, with selective damage to entorhinal cortex (adjacent
to perirhinal cortex and previously linked to familiarity). As predicted,
MR had impaired familiarity for words but intact recollection.
According to the original model, the parahippocampal cortex is limited
to processing spatial context (i.e., where information). This is too limited.
Diana (2017) used a non-spatial context – words were accompanied by
contextual questions (e.g., “Is this word common or uncommon?”). There
was greater parahippocampal activation for words associated with correct
(rather than incorrect) context memory. Since the context (i.e., contextual
questions) was non-spatial, the role of the parahippocampal cortex in episodic memory extends beyond spatial information.
Dual-process models assume the hippocampus is required to process
relationships between items and to bind items to contexts but is not
required to process items in isolation. There are two potential problems
with these assumptions (Bird, 2017). First, the term “item” is often imprecisely defined. Second, these models often de-emphasise the importance of
the learning material (e.g., faces; names; pictures).
Smith et al. (2014) compared immediate memory performance in
healthy controls and amnesic patients with hippocampal damage. Fifty
famous faces were presented followed by a recognition-memory test.
The amnesic patients performed comparably to controls for famous
faces not identified as famous but were significantly impaired for famous
faces identified as famous. A plausible interpretation is that unfamiliar
faces (i.e., unknown famous faces) are processed as isolated items and
so do not require hippocampal processing. In contrast, known famous
faces benefit from additional contextual processing dependent on the
hippocampus.
Bird (2017, p. 161) concluded his research review as follows: “There
are no clear-cut examples of materials other than [unfamiliar] faces
that can be recognised using extrahippocampal [outside the hippocampus] familiarity processes.” This is because most “items” are not processed in isolation but require the integrative processing provided by the
hippocampus.
Scalici et al. (2017) reviewed research on the involvement of the
prefrontal cortex in familiarity and recollection. There was greater familiarity-based than recollection-based activity in the ventromedial and
dorsomedial prefrontal cortex and lateral BA10 (at the front of the prefrontal cortex) whereas the opposite was the case in medial BA10 (see
Figure 7.5). These findings suggest familiarity and recollection involve different processes.
Evaluation
Recognition memory depends on rather separate processes of familiarity and recollection, as indicated by neuroimaging studies. However, the
most convincing findings come from studying brain-damaged patients. A
double dissociation has been obtained – some patients have reasonably
intact familiarity but impaired recollection whereas a few patients exhibit
the opposite pattern.
What are the limitations of theory and research in this area?
(1) The typical emphasis on recollection based on conscious awareness of contextual details is oversimplified because we can also have conscious awareness of having previously seen the target items themselves. Brainerd et al. (2014) found a model assuming two types of recollection predicted behavioural data better than models assuming only one type of recollection.
Figure 7.5
Left lateral (A), medial (B)
and anterior (C) views of
prefrontal areas having
greater activation to
familiarity-based than
recollection-based
processes (in red) and
areas showing the opposite
pattern (in blue).
From Scalici et al. (2017).
Reprinted with permission of
Elsevier.
Figure 7.6
Sample pictures on the
recognition-memory test.
The one on the left is
high-contrast and easy to
process whereas the one on
the right is low-contrast and
hard to process.
From Geurten & Willems (2017).
Reprinted with permission of
Elsevier.
(2) Diana et al.’s (2007) model does not identify the processes underlying familiarity judgements. However, it is often assumed that items on a recognition-memory test that are easy to process are judged to be familiar. Geurten and Willems (2017) tested this assumption using unfamiliar pictures. On the recognition-memory test, some pictures were presented with reduced contrast to reduce processing fluency (see Figure 7.6). As predicted, recognition-memory performance was better with high-contrast than with low-contrast test pictures (70% vs 59%, respectively).
(3) More brain mechanisms are involved in recognition memory than assumed by Diana et al. (2007).
(4) The notion of an “item” requires more precise definition (Bird, 2017).
Recall memory
Here we will consider briefly similarities and differences between recall
(especially free recall: see Glossary) and recognition memory. Mickes et al.
(2013) reported important similarities using the remember/know procedure with free recall. Participants received a word list and for each word
answered one question (e.g., “Is this item animate?”; “Is this item bigger
than a shoebox?”). They then recalled the words, made a remember or know
judgement for each recalled word and indicated which question had been
associated with each word (contextual information).
Participants were more accurate at remembering which question was
associated with recalled words when the words received remember (rather
than know) judgements. This is very similar to recognition memory where
participants access more contextual information for remember words than
know ones.
Kragel and Polyn (2016) compared patterns of brain activation during
recognition-memory and free-recall tasks. Brain areas activated during
familiarity processes in recognition memory were also activated during free
recall. There was also weaker evidence that brain areas activated during
recollective processes in recognition were activated in free recall.
As we have seen, amnesic patients exhibit very poor recognition memory
(especially recognition associated with recollection). Amnesic patients also
typically have very poor free recall (e.g., Brooks & Baddeley, 1976).
Some aspects of recognition memory depend on structures other than
the hippocampus itself (Diana et al., 2007). In contrast, it has typically
been assumed the hippocampus is crucial for recall memory. Patai et al.
(2015) supported these assumptions in patients with relatively selective hippocampal damage. The extent of hippocampal damage in these patients
was negatively correlated with their recall performance but uncorrelated
with their recognition-memory performance.
There are several similarities between the processes involved in recall
and recognition. However, the to-be-remembered information is physically
present on recognition tests but not recall tests. As a result, processing
demands should generally be less with recognition. Chan et al. (2017)
obtained findings consistent with this analysis in patients with damage to
the frontal lobes impairing higher-level cognitive processes. Individual differences in intelligence were strongly related to performance on recall tests
but not recognition-memory tests. Thus, recall performance depends much
more on higher-level cognitive processes.
Is episodic memory constructive?
We use episodic memory to remember experienced past events. Most
people believe the episodic memory system resembles a video recorder providing us with accurate and detailed information about past events (Simons
& Chabris, 2011). In fact, “Episodic memory is . . . a fundamentally
constructive, rather than reproductive process that is prone to various kinds
of errors and illusions” (Schacter & Addis, 2007, p. 773). For example, the
constructive nature of episodic memory leads to distorted remembering of
stories (Chapter 10) and to eyewitnesses producing inaccurate memories of
crimes (Chapter 8).
Why is episodic memory so error-prone? First, it would require
massive processing to produce a semi-permanent record of all our experiences. Second, we typically want to access the gist or essence of our past
experiences, omitting trivial details. Third, we often enrich our episodic
memories when discussing our experiences with friends even when this produces memory errors (Dudai & Edelson, 2016; see Chapter 8).
What are the functions of episodic memory (other than to remember
past events)? First, we use episodic memory to imagine possible future scenarios and to plan the future (Madore et al., 2016). Imagining the future
(episodic simulation) is greatly facilitated by episodic memory’s flexible and
constructive nature. According to Addis (2018), remembered and imagined
events are both very similar, “simulations of experience from the same
pool of experiential details” (p. 69). However, Schacter and Addis (2007)
assumed in their constructive episodic simulation hypothesis that episodic
simulation is more demanding than episodic memory retrieval because
control processes are required to combine details from multiple episodes.
Second, Madore et al. (2019) found episodic memory influences divergent creative thinking (thinking of unusual and creative uses for common
objects). Creative thinking was associated with enhanced connectivity
between brain areas linked to episodic processing and brain areas associated with cognitive control.
Findings
The tendency to recall the gist of our previous experiences increases
throughout childhood (Brainerd et al., 2008). More surprisingly, children’s
greater focus on remembering gist as they become older often increases
memory errors. Brainerd and Mojardin (1998) asked children to listen to
sentences such as “The tea is hotter than the cocoa”. Subsequently, they
decided whether test sentences had been presented previously in precisely
that form. Sentences having the same meaning as an original sentence but
different wording (e.g., “The cocoa is cooler than the tea”) were more likely
to be falsely recognised by older children.
We turn now to the hypothesis (Schacter & Addis, 2007; Addis,
2018) that imagining future events involves very similar processes to those
involved in remembering past episodic events. On that hypothesis, brain
areas important for episodic memory (e.g., the hippocampus) should also
be activated when imagining future events. Benoit and Schacter (2015)
reported supportive evidence. There were two key findings:
(1) Several brain regions were activated both while imagining future events (episodic simulation) and during episodic-memory recollection (see Figure 7.7A). The overlapping areas included “the hippocampus and parahippocampal cortex within the medial temporal lobes” (Benoit & Schacter, 2015, p. 450).
(2) As predicted, several brain areas were more strongly activated during episodic simulation than episodic memory retrieval (see Figure 7.7B). These included clusters in the dorsolateral prefrontal cortex and posterior inferior parietal lobes and clusters in the right medial temporal lobe (including the hippocampus) (Benoit & Schacter, 2015, p. 453). Some of these areas are involved in cognitive control – the borders of the fronto-parietal control network (see Chapter 6) are indicated by white dashed lines.
Imagining future events is generally associated with hippocampal activation. We would have more direct evidence the hippocampus is necessarily involved if amnesic patients with hippocampal damage had impaired ability to imagine future events. Hassabis et al. (2007) found amnesics’ imaginary experiences consisted of isolated fragments lacking the richness and spatial coherence of healthy controls’ experiences. The amnesic patient KC with extensive brain damage (including to the hippocampus) could not recall a single episodic memory from the past or imagine a possible future event (Schacter & Madore, 2016).
Robin (2018) argued that spatial context is of major importance for both episodic memory and imagining future events. For example, Robin et al. (2016) asked participants to read brief narratives and imagine them in detail. Even when no spatial context was specified in the narrative, participants nevertheless generated an appropriate spatial context while imagining on 78% of trials.
Figure 7.7
(A) Areas activated for both episodic simulation and episodic memory; (B) areas activated more for episodic simulation than episodic memory.
From Benoit and Schacter (2015). Reprinted with permission of Elsevier.
The similarities between recall of past personal events and imagining future personal events have typically been attributed to episodic processes common to both tasks. However, some similarities may also reflect non-episodic processes. For example, amnesics’ impaired past recall and future imagining may reflect an impaired ability to construct detailed narratives. Schacter and Madore (2016) provided convincing evidence that episodic processes are involved in recalling past events and imagining future ones. Participants received training in recollecting details of a recent experience. If recall of past events and imagining of future events both rely on episodic memory, this induction should benefit performance by increasing participants’ production of episodic details in recall and imagination. That is what was found.
Evaluation
It is assumed episodic memory relies heavily on constructive processes.
This assumption is supported by research on eyewitness memory (Chapter
8) and language comprehension (Chapter 10). The additional assumption
that constructive processes used in episodic memory retrieval of past events
are also involved in imagining future events is an exciting development supported by much relevant evidence. Episodic memory is also involved in
divergent creative thinking.
What are the main limitations of research in this area? First, several
brain areas associated with recalling past personal events and imagining
future events have been identified, but their specific contributions remain
somewhat unclear. Second, finding a given area is involved in recalling the
past and imagining the future does not necessarily mean it is associated
with the same cognitive processes in both cases. Third, there is greater
uncertainty about future events than past ones. This may explain why
imagined future events are less vivid than recalled past events but more
abstract and dependent on semantic memory (MacLeod, 2016).
KEY TERM
Concepts: Mental representations of categories of objects or items.
SEMANTIC MEMORY
Our organised general knowledge about the world is stored in semantic
memory. Such knowledge is extremely varied (e.g., information about the
French language; the rules of hockey; the names of capital cities). Much
of this information consists of concepts: mental representations relating to
objects, people, facts and words (Lambon Ralph et al., 2017). These representations are multimodal (i.e., they incorporate information from several
sense modalities).
How is conceptual information in semantic memory organised? We
start this section by addressing this issue. First, we consider the notion that
concepts are organised into hierarchies. Second, we discuss an alternative
view, according to which semantic memory is organised on the basis of
the semantic distance or semantic relatedness between concepts. After that,
we focus on the nature of concepts and on how concepts are used. Finally,
we consider larger information structures known as schemas.
Organisation: hierarchies of concepts
Suppose you are shown a photograph of a chair and asked what it is. You
might say it is an item of furniture, a chair or an easy chair. This suggests
concepts are organised into hierarchies. Rosch et al. (1976) identified three
levels within such hierarchies: superordinate categories (e.g., items of furniture) at the top, basic level categories (e.g., chair) in the middle and subordinate categories (e.g., easy chair) at the bottom.
Which level do we use most often? Sometimes we talk about superordinate categories (e.g., “That furniture is expensive”) or subordinate categories (e.g., “I love my iPhone”). However, we typically deal with objects
at the intermediate or basic level.
Rosch et al. (1976) asked people to list concept attributes at each
level in the hierarchy. Very few attributes were listed for superordinate
categories because they are relatively abstract. Many attributes were listed
for categories at the other two levels. However, very similar attributes were
listed for different categories at the lowest level. Thus, basic level categories
generally have the best balance between informativeness and distinctiveness:
informativeness is low at the highest level of the hierarchy and distinctiveness is low at the lowest level. In similar fashion, Rigoli et al. (2017) argued
(with supporting evidence) that categorising objects at the basic level generally allows us to select the most appropriate action with respect to that
object while minimising processing costs.
Bauer and Just (2017) found the processing of basic level concepts
involved many more brain regions than the processing of subordinate
concepts. More specifically, brain areas associated with sensorimotor and
language processing were activated with basic level concepts, whereas processing was focused on perceptual areas with subordinate concepts.
Basic level categories have other special properties. First, they represent the most general level at which individuals use similar motor movements when interacting with category members (e.g., we sit on most chairs
in similar ways). Second, basic level categories were used 99% of the time
when people named pictures of objects (Rosch et al., 1976).
However, we do not always prefer basic level categories. For example,
we expect experts to use subordinate categories. We would be surprised
if a botanist simply described all the different kinds of plants in a garden
as plants! We also often use subordinate categories with atypical category
members. For example, people categorise penguins faster as penguins than
as birds (Jolicoeur et al., 1984).
Findings
Tanaka and Taylor (1991) studied category naming in bird-watchers and
dog experts who were shown pictures of birds and dogs. Both groups used
subordinate names much more often in their expert domain than their
novice domain.
Even though people generally prefer basic level categories, this does
not necessarily mean they categorise fastest at that level. Prass et al. (2013)
presented photographs of objects very briefly and asked participants to categorise them at the superordinate level (animal or vehicle), the basic level
(e.g., cat or dog) or the subordinate level (e.g., Siamese cat vs Persian cat).
Performance was most accurate and fastest at the superordinate level (see
Figure 7.8). In similar fashion, Besson et al. (2017) found categorisation of
faces was fastest at the superordinate level.
Why does categorisation often occur faster at the superordinate level
than the basic level? Close and Pothos (2012) argued that categorisation at
the basic level is generally more informative and so requires more detailed
processing. Rogers and Patterson (2007) supported this viewpoint. They
studied patients with semantic dementia, a condition involving impairment of semantic memory (discussed earlier in this chapter, p. 303; see
Glossary). Patients with severe semantic dementia performed better at
the superordinate level than the basic level because less processing was
required.
Figure 7.8
(a) Accuracy and (b) speed of object categorisation at the superordinate, basic and subordinate levels.
From Prass et al. (2013). Reprinted with permission.
Organisation: semantic distance
The assumption that concepts in semantic memory are organised hierarchically is too inflexible and exaggerates how neatly information in semantic memory is organised. Collins and Loftus (1975) proposed an approach
based on the more flexible assumption that semantic memory is organised
in terms of the semantic distance between concepts. Semantic distance
between concepts has been measured in many ways (Kenett et al., 2017).
Kenett et al. used data from 60 individuals instructed to produce as many
associations as possible in 60 seconds to 800 Hebrew cue words in order to
assess semantic distance in terms of path length: “the shortest number of
steps connecting any two cue words” (p. 1473).
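Path length in this sense is simply the shortest route through a network of associations. The following sketch (Python; the miniature English network is invented, standing in for Kenett et al.’s Hebrew association norms) computes it with a breadth-first search:

from collections import deque

# Toy undirected association network (invented for illustration).
associations = {
    "cat": ["dog", "mouse"], "dog": ["cat", "bone"], "mouse": ["cat", "cheese"],
    "bone": ["dog"], "cheese": ["mouse", "milk"], "milk": ["cheese"],
}

def path_length(start, goal):
    # Shortest number of associative steps between two cue words.
    queue, visited = deque([(start, 0)]), {start}
    while queue:
        word, steps = queue.popleft()
        if word == goal:
            return steps
        for neighbour in associations[word]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, steps + 1))
    return None  # no associative route

print(path_length("cat", "dog"))   # 1 step: directly linked
print(path_length("dog", "milk"))  # 4 steps: dog-cat-mouse-cheese-milk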
Kenett et al. (2017) asked participants to judge whether word pairs
were semantically related. These judgements were well predicted by path
distance: 91% of directly linked words (one-step) were judged to be semantically related, compared to 69% of two-step word pairs and 64% of three-step word pairs.
Of importance, Kenett et al. (2017) found semantic distance predicted
performance on various episodic memory tasks (e.g., free recall). In an
experiment on cued recall, participants were presented with word pairs.
This was followed by presenting the first word of each pair and instructing them to recall the associated word. Performance was much higher on
directly linked word pairs (one-step) than three-step word pairs: 30% vs 11%,
respectively.
Semantic distance also predicts aspects of language production. For
example, Rose et al. (2019) had participants name target pictures (e.g.,
eagle) in the presence of distractor pictures that were semantically close
(e.g., owl) or semantically distant (e.g., gorilla). There was an interference
effect: naming times were longer when distractors were semantically close.
What is the underlying mechanism responsible for the above findings?
According to Collins and Loftus’s (1975) influential spreading-activation
theory, the appropriate node in semantic memory is activated when we
see, hear or think about a concept. Activation then spreads rapidly to
other concepts, with greater activation for concepts closely related semantically than those weakly related. Such an account can readily explain Rose
et al.’s (2019) findings.
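A toy version of spreading activation (the network and decay value are invented for illustration) shows the mechanism: activation passes outwards from the accessed concept, weakening with each associative step, so close concepts end up more active than distant ones.

# Toy spreading-activation sketch over an invented association network.
associations = {
    "cat": ["dog", "mouse"], "dog": ["cat", "bone"], "mouse": ["cat", "cheese"],
    "bone": ["dog"], "cheese": ["mouse", "milk"], "milk": ["cheese"],
}
DECAY = 0.5  # illustrative: activation halves with each associative step

def spread(source, steps=3):
    # Activation starts at the accessed concept and spreads outwards.
    activation = {source: 1.0}
    frontier = {source}
    for _ in range(steps):
        next_frontier = set()
        for word in frontier:
            for neighbour in associations[word]:
                passed = activation[word] * DECAY
                if passed > activation.get(neighbour, 0.0):
                    activation[neighbour] = passed
                    next_frontier.add(neighbour)
        frontier = next_frontier
    return activation

print(spread("cat"))
# Close neighbours ("dog", "mouse") end up more active than distant
# ones ("milk"), mirroring semantic-distance effects on priming.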
Spreading-activation theory is also applicable to semantic priming (see
Glossary and Chapter 9). For example, dog is recognised as a word faster
when the preceding prime is cat than when it is car (Heyman et al., 2018).
This can be explained by assuming that presentation of cat activates the
dog concept and so facilitates recognising it as a word.
In sum, the semantic distance of concepts within semantic memory is
important in explaining findings in episodic memory research (e.g., free
recall; cued recall) as well as findings relating to language processing.
However, this approach is based on the incorrect assumption that each
concept has a single fixed representation in semantic memory. Our processing of any given concept is influenced by context (see next section). For
example, think about the meaning of piano. You probably did not focus
on the fact that pianos are heavy. However, you would do so if you read
the sentence “Fred struggled to lift the piano”. Thus, the meaning of any
concept (and its relation to other concepts) varies as a function of the circumstances in which it is encountered.
Using concepts: Barsalou’s approach
What do the mental representations of concepts look like? The “traditional”
view involved the following assumptions about concept representations:
● They are abstract and so detached from input (sensory) and output (motor) processes.
● They are stable: the same concept representation is used on different occasions.
● Different individuals have similar representations of any given concept.
In sum, it was assumed concept representations “have the flavour of
detached encyclopaedia descriptions in a database of categorical knowledge
about the world” (Barsalou, 2012, p. 247). This approach forms part of the
sandwich model (Barsalou, 2016b): cognition (including concept processing)
is “sandwiched” between perception and action and can be studied without
considering them. How, then, could we use such concept representations
to perceive the visual world or decide how to behave in a given situation
(Barsalou, 2016a)?
Barsalou (2012) argued all the above theoretical assumptions are
incorrect. We process concepts in numerous different settings and that processing is influenced by the current setting or context. More generally, any
concept’s representation varies flexibly across situations depending on the
individual’s current goals and the precise situation.
Consider the concept of a bicycle. A traditional abstract representation
would resemble the Chambers Dictionary definition, a “vehicle with two
wheels one directly in front of the other, driven by pedals”. According to
Barsalou (2009), the individual’s current goals determine which features
are activated. For example, the saddle’s height is important if you want to
ride a bicycle, whereas information about the tyres is activated if you have
a puncture.
According to Barsalou’s theoretical approach (e.g., 2012, 2016a,b),
conceptual processing is anchored in a given context or situation and
involves the perceptual and motor or action systems. His approach is
described as grounded cognition: cognition (including concept processing)
is largely grounded (or based) on the perceptual and motor systems.
Findings
Evidence that conceptual processing can involve the perceptual system was
reported by Wu and Barsalou (2009). Participants wrote down properties
for nouns or noun phrases. Those given the word lawn focused on external
properties (e.g., plant; blades) whereas those given rolled-up lawn focused
more on internal properties (e.g., dirt; soil). Thus, object qualities not visible
if you were actually looking at the object itself are harder to think of than
visible ones.
We might expect Barsalou’s grounded cognition approach to be less
applicable to abstract concepts (e.g., truth; freedom) than concrete ones
(objects we can see or hear). However, Barsalou et al. (2018) argued
that abstract concepts are typically processed within a relatively concrete
context. In fact, abstract-concept processing sometimes involves perceptual
information but much less often than concrete-concept processing (Borghi
et al., 2018).
Hauk et al. (2004) reported suggestive evidence that the motor system
is often involved when we access concept information. When participants
read words such as “lick”, “pick” and “kick”, these verbs activated parts
of the motor strip overlapping with areas activated when people make
the relevant tongue, finger and foot movements. These findings do not
show the motor system is necessary for concept processing – perhaps activation in areas within the motor strip occurs only after concept activation.
Miller et al. (2018) asked participants to make hand or foot
responses after reading hand-associated words (e.g., knead; wipe) or foot-associated words (e.g., kick; sprint). Responses were faster when
the word was compatible with the limb making the response (e.g., hand
response to a hand-associated word) than when word and limb were
incompatible. These findings apparently support Barsalou’s approach,
according to which “The understanding of action verbs requires activation of the motor areas used to carry out the named action” (Miller et al.,
2018, p. 335).
Miller et al. (2018) tested the above prediction using event-related
potentials (see Glossary) to assess limb-relevant brain activity. However,
presentation of hand- and foot-associated words was not followed rapidly
by limb-relevant brain activity. Thus, the reaction time findings discussed
above were based on processing verb meanings and did not directly involve
motor processing.
How can we explain the differences in the findings obtained by Hauk
et al. (2004) and by Miller et al. (2018)? Miller et al. used a speeded task
that allowed insufficient time for motor imagery (and activation of relevant
motor areas) to occur, whereas this was not the case with the study by
Hauk et al.
According to Barsalou, patients with severe motor system damage
should have difficulty in processing action-related words (e.g., names of
tools). Dreyer et al. (2015) studied HS, a patient with damage to sensorimotor brain systems close to the hand area. He had specific problems in
recognising nouns relating to tools rather than those referring to food or
animals.
In a review, Vannuscorps et al. (2016) found some studies reported
findings consistent with Dreyer et al.’s (2015) research. In other studies,
however, patients with damage to sensorimotor systems had no deficit
in conceptual processing of actions or manipulable objects. Vannuscorps
et al. concluded many patients with deficits in processing concepts relating
to actions and tools have extensive damage to brain areas additional to sensorimotor areas. The findings from such patients have limited relevance to
Barsalou’s (2016b) theory.
Vannuscorps et al. (2016) studied a patient, JR, with brain damage
primarily affecting the action production system. JR’s picture-naming
ability was assessed repeatedly over a 3-year period. Even though JR’s
disease was progressive, his naming performance with action-related concepts (e.g., hammer; shovel) remained intact. Thus, processing of action-related concepts does not necessarily require the involvement of the motor
system.
Evaluation
Barsalou’s general theoretical approach has several strengths. First, our
everyday use of concept knowledge often involves the perceptual and motor
systems. Second, concept processing is generally flexible: it is influenced by
the present context and the individual’s goals. Third, it is easier to see how
concept representations facilitate perception and action within Barsalou’s
approach than the “traditional” approach.
What are the limitations of Barsalou’s approach? First, Barsalou
argues it is generally necessary to use perceptual and/or motor processes to
understand concept meanings fully. However, motor processes may often
not be necessary (Miller et al., 2018; Vannuscorps et al., 2016).
Second, Barsalou exaggerates variations in concept processing across
time and contexts. The traditional view that concepts possess a stable,
abstract core has not been disproved (Borghesani & Piazza, 2017). In
fact, concepts have a stable core and concept processing is often context-dependent (discussed below).
Third, much concept knowledge does not consist simply of perceptual
and motor features. Borghesani and Piazza (2017, p. 8) provide the following example: “Tomatoes are native to South and Central America.”
Fourth, we recognise the similarities between concepts not sharing
perceptual or motor features. For example, we categorise watermelon and
blackberry as fruit even though they are very different visually and we eat
them using different motor actions.
Using concepts: hub-and-spoke model
We have seen concept processing often involves the perceptual and motor
systems. However, it is improbable nothing else is involved. First, we would
not have coherent concepts if concept processing varied considerably across
situations. Second, as mentioned above, we can detect similarities in concepts differing greatly in perceptual terms.

Figure 7.9
The hub-and-spoke model: (a) the hub within the anterior temporal lobe (ATL) has bidirectional connections to the spokes (praxis refers to object manipulability; it is action-related); (b) the locations of the hub and spokes are shown, with the same colour coding as in (a).
From Lambon Ralph et al. (2017).
Such considerations led Patterson et al. (2007) to propose their hub-and-spoke model (see Figure 7.9). The “spokes” consist of several modality-specific regions involving sensory and motor processing. Each concept also has a “hub” – a modality-independent unified representation efficiently integrating our conceptual knowledge.
It is assumed hubs are located within the anterior temporal lobes. As
discussed earlier, patients with semantic dementia invariably have damage
to the anterior temporal lobes and extensive loss of conceptual knowledge
is their main problem.
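The proposed division of labour can be sketched as a toy data structure (Python; the modality labels and features are invented for illustration): modality-specific spoke information for a concept plus a single integrated hub representation.

# Toy hub-and-spoke representation of one concept.
scissors = {
    "spokes": {                                        # modality-specific information
        "vision": ["two blades", "handles"],
        "praxis": ["grip handles", "open and close"],  # action-related
        "sound":  ["snipping noise"],
        "verbal": ["'scissors'", "used for cutting"],
    },
    "hub": "SCISSORS",  # modality-independent representation (anterior temporal lobes)
}
# Spoke damage impairs modality-specific knowledge (e.g., praxis -> manipulation);
# hub damage, as in semantic dementia, degrades the whole concept.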
In the original model, it was assumed the two anterior temporal lobes
(left and right hemisphere) formed a unified system. This is approximately
correct – there is substantial activation in both anterior temporal lobes
whether concepts are presented visually or verbally. However, the left anterior temporal lobe was more involved than the right in processing verbal
information whereas the opposite was the case in processing visual information (Rice et al., 2015). Lambon Ralph et al. (2017) discussed research
where patients with damage to the left anterior temporal lobe had particular problems with anomia (object naming). In contrast, patients with
damage to the right anterior temporal lobe had particular problems in face
recognition.
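To make the hub/spoke division of labour concrete, here is a minimal illustrative sketch in Python. It is our toy illustration, not Lambon Ralph et al.’s implementation: the feature values and the use of cosine similarity are invented for the example.

```python
# Illustrative sketch only (not the model authors' code): concepts carry
# modality-specific "spoke" features and modality-independent "hub"
# features; the hub supports similarity that the spokes cannot.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hypothetical spoke features: [round, small, cut-with-knife, pick-by-hand]
spokes = {
    "watermelon": [1.0, 0.0, 1.0, 0.0],
    "blackberry": [0.0, 1.0, 0.0, 1.0],
    "football":   [1.0, 0.0, 0.0, 0.0],
}
# Hypothetical hub features: [fruit, edible, sports-equipment]
hub = {
    "watermelon": [1.0, 1.0, 0.0],
    "blackberry": [1.0, 1.0, 0.0],
    "football":   [0.0, 0.0, 1.0],
}

# In spoke (perceptual/motor) space a watermelon resembles a football
# more than a blackberry; in hub space the two fruits coincide.
print(cosine(spokes["watermelon"], spokes["blackberry"]))  # 0.0
print(cosine(spokes["watermelon"], spokes["football"]))    # ~0.71
print(cosine(hub["watermelon"], hub["blackberry"]))        # 1.0
```

This captures why watermelon and blackberry can be categorised together as fruit despite differing visually and in their associated motor actions.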
Findings
We start with research on the “hub”. Mayberry et al. (2011) argued semantic dementia involves a progressive loss of “hub” information producing
a blurring of the boundary between category members and non-members.
Accordingly, they predicted semantic dementia patients would have particular problems making accurate category-membership decisions with (1)
atypical category members (e.g., emu is an atypical bird); and (2) pseudotypical items: non-category members resembling category members (e.g.,
butterfly is like a bird). Both predictions were supported with pictures and
words, suggesting processing within the anterior temporal lobes is general
and “hub-like” rather than modality-specific (e.g., confined to the visual
modality).
Findings from patients with semantic dementia suggest the anterior
temporal lobes are the main brain areas associated with “hubs”. Binder
et al. (2009) reviewed 120 neuroimaging studies involving semantic memory
in healthy individuals and found the anterior temporal lobes were consistently activated. Pobric et al. (2010a) applied transcranial magnetic stimulation (TMS; see Glossary) to interfere with processing in the left or right
anterior temporal lobe while participants processed concepts presented as verbal or pictorial stimuli. TMS disrupted concept processing comparably
in both anterior temporal lobes.
However, Murphy et al. (2017) discovered important differences
between ventral (bottom) and anterior (front) regions of the anterior temporal lobe. Ventral regions responded to meaning and acted as a hub.
However, anterior regions were responsive to differences in input modality
(visual vs auditory) and thus are not “hub-like”.
We turn now to research on the “spokes”. Pobric et al. (2010b) applied
transcranial magnetic stimulation (TMS) to interfere briefly with processing within the inferior parietal lobule (involved in processing actions we
can make towards objects; the praxis spoke in Figure 7.9). TMS slowed
naming times for manipulable objects but not non-manipulable ones, indicating this brain area (unlike the anterior temporal lobes) is involved in
relatively specific processing.
Findings consistent with those of Pobric et al. (2010b) were reported
by Ishibashi et al. (2018). They applied transcranial direct current stimulation (tDCS; see Glossary) to the inferior parietal lobule and the anterior
temporal lobe. Since they used anodal tDCS, it was expected this stimulation would enhance performance on tasks requiring rapid access to semantic information concerning tool function (e.g., scissors are used for cutting)
or tool manipulation (e.g., pliers are gripped by the handles).
As predicted, anodal tDCS applied to the anterior temporal lobe facilitated performance on both tasks because this brain area contains much
general object knowledge (see Figure 7.10). The effects of anodal tDCS
applied to the inferior parietal lobule were limited to the manipulation
task as predicted because this area processes action-related information.
Suppose we studied patients whose brain damage primarily affected one or more of the “spokes”. According to the model, we should find category-specific deficits (problems with specific categories of objects). There is convincing evidence for the existence of various category-specific deficits and these deficits are mostly associated with the model’s spokes (Chen et al., 2017).
KEY TERM
Category-specific deficits: disorders caused by brain damage in which semantic memory is disrupted for certain semantic categories.

Figure 7.10
Performance accuracy on tool function and tool manipulation tasks with anodal transcranial direct current stimulation to the anterior temporal lobe (ATL-A) or to the inferior parietal lobule (IPL-A) and in a control condition (Sham).
From Ishibashi et al. (2018).

However, it is often hard to interpret the findings from patients with category-specific deficits. For example, many patients find it much harder
to identify pictures of living than non-living things. Several factors are
involved: living things have greater contour overlap than non-living things,
they are more complex structurally and they activate less motor information (Marques et al., 2013). It is difficult to disentangle the relative importance of these factors.
Finally, we consider a study by Borghesani et al. (2019). Participants
read words (e.g., elephant) having conceptual features (e.g., mammal)
and perceptual features (e.g., big; trumpeting). There were two main findings. First, conceptual and perceptual features were processed in different
brain areas. Second, initial processing of both types of features occurred
approximately 200 ms after word onset. These findings support the model’s
assumption that there is somewhat independent processing of “hub” information (i.e., conceptual features) and “spoke” information (i.e., perceptual
features). However, the findings are inconsistent with Barsalou’s approach,
according to which perceptual processing should precede (and influence)
conceptual processing.
Evaluation
The hub-and-spoke model provides a comprehensive approach combining aspects of the traditional view of concept processing and Barsalou’s
approach. The notion within the model that concepts are represented by
abstract core information and modality-specific information has strong
support. Brain areas associated with different aspects of concept processing
have been identified.
What are the model’s limitations? First, it emphasises mostly the
storage and processing of single concepts. However, we also need to consider relations between concepts. For example, we can distinguish between
taxonomic relations based on similarity (e.g., dog–bear) and thematic relations based on proximity (e.g., dog–leash). The anterior temporal lobes are
important for taxonomic semantic processing whereas the temporo-parietal
cortex is important for thematic semantic processing (Mirman et al., 2017).
The model has problems with the latter finding given its focus on the anterior temporal lobes.
Second, the role of the anterior temporal lobes in semantic memory
is more complex than assumed theoretically. For example, Mesulam et al.
(2013) found semantic dementia patients with damage primarily to the left
anterior temporal lobe had much greater problems with verbal concepts
than visually triggered object concepts. Thus, regions of the left anterior
temporal lobe form part of a language network rather than a very general
modality-independent hub.
Third, we have only a limited understanding of the division of labour
between the hub and the spokes during concept processing (Lambon
Ralph, 2014). For example, we do not know how the relative importance
of hub-and-spoke processing depends on task demands. It is also unclear
how information from hubs and spokes is integrated during concept
processing.
KEY TERMS
Schema: an organised packet of information about the world, events or people stored in long-term memory.
Script: a form of schema containing information about a sequence of events (e.g., events during a typical restaurant meal).
Schemas vs concepts
We may have implied semantic memory consists exclusively of concepts. In
fact, there are also larger information structures called schemas. Schemas
are “superordinate knowledge structures that reflect abstracted commonalities across multiple experiences” (Gilboa & Marlatte, 2017, p. 618).
Scripts are schemas containing information about sequences of events.
For example, your restaurant script probably includes the following: being
given a menu, ordering food and drink, eating and drinking and paying the
bill (Bower et al., 1979).
Scripts (and schemas more generally) are discussed in Chapter 10 (in
relation to language comprehension and memory) and Chapter 8 (relating to failures of eyewitness memory). Here we first consider brain areas
associated with schema-related information. We then explore implications
of the theoretical assumption that semantic memory contains abstract concepts corresponding to words and broader organisational structures based
on schemas. On that assumption, we might expect some brain-damaged
patients would have greater problems accessing concept-based information
than schema-based information, whereas others would exhibit the opposite
pattern. This would be a double dissociation (see Glossary).
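The logic of a double dissociation can be made concrete with invented numbers: each patient group is selectively impaired on a different task, which rules out the alternative explanation that one task is simply harder than the other. The group labels and accuracy values below are hypothetical.

```python
# Invented accuracies illustrating a double dissociation: each patient
# group is impaired on a different task, so neither task is simply
# "harder" overall. All names and values are hypothetical.
accuracy = {
    "concept-impaired patients (e.g., semantic dementia)": {
        "concept task": 0.45, "schema task": 0.80,
    },
    "schema-impaired patients (e.g., vmPFC damage)": {
        "concept task": 0.80, "schema task": 0.45,
    },
}

for group, scores in accuracy.items():
    worst = min(scores, key=scores.get)  # task with the lowest accuracy
    print(f"{group}: selectively impaired on the {worst}")
```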
Brain networks
Schema information and processing involve several brain areas. However, the ventromedial prefrontal cortex (vmPFC) is especially important. It includes several Brodmann areas: BA10, BA11, BA12, BA14
and BA25 (see Figure 1.5). Gilboa and Marlatte (2017) reviewed 12 fMRI
experiments where participants engaged in schema processing. Much of the
ventromedial prefrontal cortex was consistently activated, plus other areas
including the hippocampus.
Research on brain-damaged patients also indicates the important role
of the ventromedial prefrontal cortex in schema processing. Ghosh et al.
(2014) gave participants a schema (“going to bed at night”) and asked
them to decide rapidly whether each of a series of words was closely
related to it. Patients with damage to the ventromedial prefrontal cortex
performed worse than healthy controls on this task, indicating impaired
schema-related processing.
Warren et al. (2014) presented participants with words belonging to
a single schema (e.g., winter; blizzard; cold) followed by recall. Healthy
individuals often falsely recall a schema-relevant non-presented word (e.g.,
snow) because their processing and recall involve extensive schema processing. If patients with damage to the ventromedial prefrontal cortex engage
in minimal schema processing, they should show reduced false recall. That
is what Warren et al. found.
Double dissociation
As discussed earlier, brain-damaged patients with early-stage semantic
dementia (see Glossary) have severe problems accessing word and object
meanings. Bier et al. (2013) assessed the ability of three semantic dementia patients to use schema-relevant information by asking them what they
would do if they had unknowingly invited two guests to lunch. The required
script actions included dressing to go outdoors, going to the grocery store,
shopping for food, preparing the meal and clearing up afterwards.
One patient described all the above script actions accurately despite severe problems with accessing concept information from
semantic memory. The other patients had particular problems with planning and preparing the meal. However, they remembered script actions
relating to dressing and shopping. Note we might expect semantic dementia patients to experience problems with using script knowledge because
they would need access to relevant concept knowledge (e.g., knowledge
about food ingredients) when using script knowledge (e.g., preparing a
meal).
Other patients have greater problems with accessing script information
than concept meanings. Scripts typically have a goal-directed quality (e.g.,
using a script to achieve the goal of enjoying a restaurant meal). Since
the prefrontal cortex is of major importance in goal-directed activity, we
might expect patients with prefrontal damage (e.g., ventromedial prefrontal cortex) to have particular problems with script memory.
Cosentino et al. (2006) studied patients having semantic dementia or
fronto-temporal dementia (involving extensive damage to the prefrontal cortex and the temporal lobes) with scripts containing sequencing or
script errors (e.g., dropping fish in a bucket before casting the fishing
line). Patients with extensive prefrontal damage failed to detect far more
sequencing or script errors than those with semantic dementia.
Farag et al. (2010) confirmed that patients with fronto-temporal
dementia are generally less sensitive than those with semantic dementia to
the appropriate order of script events. They identified the areas of brain
damage in their participants (see Figure 7.11). Patients (including fronto-temporal ones) insensitive to script sequencing had damage in inferior
and dorsolateral prefrontal cortex. In contrast, patients (including those
with semantic dementia) sensitive to script sequencing showed little evidence of prefrontal damage.
Figure 7.11
(a) Brain areas damaged in patients with fronto-temporal degeneration or progressive non-fluent aphasia. (b) Brain areas
damaged in patients with semantic dementia or mild Alzheimer’s disease.
From Farag et al. (2010). By permission of Oxford University Press.
Zahn et al. (2017) also studied patients with fronto-temporal dementia
with damage to the fronto-polar cortex (BA10, part of the ventromedial
prefrontal cortex) and the anterior temporal lobe. They assessed patients’
knowledge of social concepts (e.g., adventurous) and script knowledge
(e.g., the likely long-term consequences of ignoring their employer’s
requests). Patients with greater damage to the fronto-polar cortex than
the anterior temporal lobe showed relatively poorer script knowledge
than knowledge of social concepts. In contrast, patients with the opposite pattern of brain damage had relatively poorer knowledge of social
concepts.
In sum, semantic memory for concepts centres on the anterior temporal lobe. Patients with semantic dementia have damage to this area causing
severely impaired concept memory. In contrast, semantic memory for scripts
or schemas involves the prefrontal cortex (especially ventromedial prefrontal cortex). However, when we use our script knowledge (e.g., preparing a
meal), it is important to access relevant concept knowledge (e.g., knowledge about food ingredients). As a consequence, semantic dementia patients whose primary impairment is to concept knowledge also have great difficulties in accessing and using script knowledge.
NON-DECLARATIVE MEMORY
Non-declarative memory does not involve conscious recollection but
instead reveals itself through behaviour. As mentioned earlier, priming (the
facilitated processing of repeated stimuli) and procedural memory (mainly
skill learning) are two major forms of non-declarative memory. Note that
procedural memory is typically involved in implicit learning (discussed in
Chapter 6).
There are two major differences between priming (also known as repetition priming) and procedural memory:

(1) Priming often occurs rapidly whereas procedural memory or skill learning is typically slow and gradual (Knowlton & Foerde, 2008).
(2) Priming is tied fairly closely to specific stimuli whereas skill learning typically generalises to numerous stimuli. For example, it would be useless if you could hit a good backhand at tennis only when the ball approached you from a given direction at a given speed!

KEY TERMS
Perceptual priming: a form of priming in which repeated presentation of a stimulus facilitates its perceptual processing.
Conceptual priming: a form of priming in which there is facilitated processing of stimulus meaning.
The strongest evidence for distinguishing between declarative and non-declarative memory comes from amnesic patients. Such patients mostly have severely impaired declarative memory but almost intact non-declarative memory (but see next section for a more complex account). Oudman et al. (2015) reviewed research on priming and procedural memory or skill learning in amnesic patients with Korsakoff’s syndrome (see Glossary). Their performance was nearly intact on tasks such as the pursuit rotor (a stylus must be kept in contact with a target on a rotating turntable) and the serial reaction time task (see Glossary).
Amnesic patients performed poorly on some non-declarative tasks
reviewed by Oudman et al. (2015) for various reasons. First, some tasks
require declarative as well as non-declarative memory. Second, some Korsakoff’s patients have widespread brain damage (including areas involved in non-declarative memory). Third, the distinction between declarative and non-declarative memory is less clear-cut and important than traditionally assumed (see later discussion).
Repetition priming
We can distinguish between perceptual and conceptual priming.
Perceptual priming occurs when repeated presentation of a stimulus
leads to facilitated processing of its perceptual features. For example,
it is easier to identify a degraded stimulus if it was presented shortly
beforehand. Conceptual priming occurs when repeated presentation of a
stimulus leads to facilitated processing of its meaning. For example, we
can decide faster whether an object is living or non-living if we saw it
recently.
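To illustrate how such priming effects are typically quantified, here is a minimal sketch with invented response times: the priming effect is simply the mean response-time benefit for repeated items over new items.

```python
# Hypothetical illustration of how repetition priming is commonly
# quantified: the response-time (RT) advantage for repeated ("old")
# items over new items. All data are invented.

old_rts = [512, 498, 530, 505, 520]   # ms, previously presented items
new_rts = [570, 585, 560, 575, 590]   # ms, items not seen before

def mean(xs):
    return sum(xs) / len(xs)

# Positive values indicate facilitated processing of repeated stimuli.
priming_effect_ms = mean(new_rts) - mean(old_rts)
print(f"Priming effect: {priming_effect_ms:.0f} ms")  # 63 ms here
```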
There are important differences between perceptual priming and
conceptual priming. Gong et al. (2016) found patients with frontal lobe
damage performed poorly on conceptual priming but had intact perceptual
priming. In contrast, patients with occipital lobe damage (an area associated with visual processing) had intact conceptual priming but impaired
perceptual priming.
If repetition priming involves non-declarative memory, amnesic
patients should show intact repetition priming. This prediction has much
support. For example, Cermak et al. (1985) found amnesic patients had
comparable perceptual priming to controls. However, patients sometimes
exhibit a modest priming impairment.
Levy et al. (2004) studied conceptual priming: deciding whether words
previously studied (vs not studied) belonged to given categories. Two
male amnesic patients (EP and GP) with large lesions in the medial temporal lobes had conceptual priming comparable to that of healthy controls, but they
performed much worse than controls on recognition memory (involving
declarative memory).
Much additional research was carried out on EP, who had extensive
damage to the perirhinal cortex (BA35 and BA36) plus other regions within
the medial temporal lobe (Insausti et al., 2013). His long-term declarative
memory was massively impaired. For example, he had very poor ability to
identify names, words and faces that became familiar only after amnesia
onset. However, EP’s performance was intact on non-declarative tasks
(e.g., perceptual priming; visuo-motor skill learning; see Figure 7.12). His
performance was at chance level on recognition memory but as good as
that of healthy controls on perceptual priming.
Schacter and Church (1995) reported further evidence amnesic
patients have intact perceptual priming. Participants initially heard
words all spoken in the same voice and then identified the same words
passed through an auditory filter. There was priming because identification performance was better when the
words were spoken in the same voice as
initially.
The notion that priming depends on memory systems different from those involved in declarative memory would be strengthened if we found patients having intact declarative memory but impaired priming. This would provide a double dissociation when considered together with amnesics having intact priming but impaired declarative memory. Gabrieli et al. (1995) studied a patient, MS, with damage to the right occipital lobe. MS had intact performance on recognition and cued recall (declarative memory) but impaired performance on perceptual priming. This latter finding is consistent with findings reported by Gong et al. (2016) in patients with occipital lobe damage (discussed earlier).

Figure 7.12
Percentages of priming effect (left-hand side) and recognition-memory performance of healthy controls (CON) and patients (EP).
From Insausti et al. (2013). © National Academy of Sciences. Reproduced with permission.
The above picture is too neat-and-tidy. Like Schacter and Church
(1995), Schacter et al. (1995) studied perceptual priming based on auditory word identification. However, the words were initially presented in six
different voices. On the word-identification test, half were presented in the
same voice as initially and the other half were spoken by one of the other
voices (re-paired condition). Healthy controls (but not amnesic patients)
had more priming for words presented in the same voice.
How can we explain these findings? In both conditions, participants
were exposed to words and voices previously heard. The only advantage in
the same voice condition was that the pairing of word and voice was the
same as before. However, only those participants who had linked or associated words and voices at the original presentation would have benefited
from the repeated pairings. Thus, amnesics are poor at binding together
different kinds of information even on priming tasks apparently involving
non-declarative memory (see later discussion pp. 333–336).
Related findings were obtained by Race et al. (2019). Amnesic patients
had intact repetition priming when the task involved relatively simple associative learning. However, their repetition priming was impaired when the
task involved more complex and abstract associative learning. Race et al.
concluded “These results highlight the multiple, distinct cognitive and
neural mechanisms that support repetition priming” (p. 102).
KEY TERMS
Repetition suppression: the finding that stimulus repetition often leads to reduced brain activity (typically with enhanced performance via priming).
Repetition enhancement: the finding that stimulus repetition sometimes leads to increased brain activity.
Priming processes
What processes are involved in priming? A popular view is based on perceptual fluency: repeated presentation of a stimulus means it can be processed more efficiently using fewer resources. This view is supported by the
frequent finding that brain activity decreases with stimulus repetition: this is
repetition suppression. However, this finding on its own does not demonstrate a causal link between repetition suppression and priming.
Wig et al. (2005) reported more direct evidence using transcranial magnetic stimulation to disrupt processing. TMS abolished repetition suppression and conceptual priming, suggesting that repetition suppression was
necessary for conceptual priming.
Stimulus repetition is sometimes associated with repetition enhancement involving increased brain activity with stimulus repetition. de Gardelle
et al. (2013) presented repeated faces and found evidence of both repetition suppression and repetition enhancement.
What determines whether there is repetition suppression or enhancement? Ferrari et al. (2017b) presented participants with repeated neutral
and emotional scenes. Repetition suppression was found when scenes were
repeated many times in rapid succession, probably reflecting increased perceptual fluency. In contrast, repetition enhancement was found when repetitions were spaced out in time. This was probably due to spontaneous
retrieval of previously presented stimuli.
Kim (2017a) reported a meta-analysis of studies on repetition suppression and enhancement in repetition priming (see Figure 7.13). There were
two main findings. First, repetition suppression was associated with reduced
activation in the ventromedial prefrontal cortex and related areas, suggesting it reflected reduced encoding of repeated stimuli.
Figure 7.13
Brain regions showing repetition suppression (RS; orange colour) or response
enhancement (RE; blue colour) in a meta-analysis.
From Kim (2017a).
Second, repetition enhancement was associated with increased activation in dorsolateral prefrontal cortex and related areas. According to
Kim (2017a, p. 1894), “The mechanism for repetition enhancement is . . .
explicit retrieval during an implicit memory task.” Thus, explicit or declarative memory is sometimes involved in allegedly non-declarative priming
tasks.
In sum, progress has been made in understanding the processes underlying priming. Of importance is suggestive evidence that priming sometimes involves declarative as well as non-declarative memory (Kim, 2017a).
The mechanisms involved in repetition suppression and priming are still
not fully understood. However, these effects depend on complex interactions among the time interval between successive stimuli, the task and the
allocation of attention (Kovacs & Schweinberger, 2016).
Procedural memory or skill learning
Motor skills are important in everyday life – examples include word processing, writing, playing netball and playing a musical instrument. Skill
learning or procedural memory includes sequence learning, mirror tracing
(tracing a figure seen in a mirror), perceptual skill learning, mirror reading
(reading a text seen in a mirror) and artificial grammar learning (Foerde
& Poldrack, 2009; see Chapter 6). However, although these tasks are all
categorised as skill learning, they differ in terms of the precise cognitive processes involved.
Here we consider whether the above tasks involve non-declarative or
procedural memory and thus involve different memory systems from those
underlying episodic and semantic memory. We will consider skill learning in amnesic patients. If they have essentially intact skill learning but
severely impaired declarative memory, that would provide evidence that
different memory systems are involved.
Before considering the relevant evidence, we address an important
general issue. It is sometimes incorrectly assumed any given task is always
performed using non-declarative or declarative memory. Consider the
weather-prediction task where participants use various cues to predict
whether the weather will be sunny or rainy. Reber et al. (1996) found
amnesics learned this task as rapidly as healthy controls, suggesting it
involves procedural (non-declarative) memory. However, Rustemeier et al.
(2013) found 61% of participants used a non-declarative strategy throughout learning but 12% used a declarative strategy throughout. In addition,
27% shifted from an early declarative to a later non-declarative strategy.
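For readers unfamiliar with the task, the sketch below simulates one trial of a toy weather-prediction task. The cue names and probabilities are invented; the real task pairs cue-card combinations with outcomes probabilistically in a broadly similar way, which is why learning is gradual and hard to verbalise.

```python
# Toy version of the weather-prediction task (probabilities invented):
# each cue card carries a probabilistic association with "rain", and
# the outcome on a trial is sampled from the cues shown.
import random

random.seed(42)  # reproducible illustration

cue_rain_prob = {"squares": 0.8, "diamonds": 0.6,
                 "circles": 0.4, "triangles": 0.2}

def run_trial(cards):
    """Sample a weather outcome from the average rain probability of the cues."""
    p_rain = sum(cue_rain_prob[c] for c in cards) / len(cards)
    return "rainy" if random.random() < p_rain else "sunny"

# Participants predict the outcome from the cards shown; no single
# trial settles the answer, so learning is gradual and probabilistic.
print(run_trial(["squares", "circles"]))
```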
Findings
Amnesics often show essentially intact learning on numerous
skill-learning tasks. For example, using the pursuit rotor (manual tracking of a moving target), Tranel et al. (1994) found that 28 amnesic patients
had intact learning. Even a patient (Boswell) with unusually extensive brain
damage to brain areas strongly associated with declarative memory had
intact learning.
Much research has used the serial reaction time task (see Glossary). As
discussed in Chapter 6, amnesics’ performance on this task is typically reasonably intact. It is somewhat hard to interpret the findings because performance on this task by healthy controls often involves some consciously
accessible knowledge (Gaillard et al., 2009).
Spiers et al. (2001) considered the non-declarative memory performance
of 147 amnesic patients. All showed intact performance on tasks involving
priming and learning skills or habits. However, as mentioned earlier, some
studies have shown modest impairment in amnesic patients (Oudman et al.,
2015). In addition, amnesics’ procedural memory has important limitations:
“[Amnesic patients] typically do not remember how or where information was obtained, nor can they flexibly use the acquired information. The
knowledge therefore lacks a . . . context” (Clark & Maguire, 2016, p. 68).
Most tasks assessing skill learning in amnesics require learning far
removed from everyday life. However, Cavaco et al. (2004) used five
skill-learning tasks (e.g., a weaving task) involving real-world skills.
Amnesic patients showed comparable learning to healthy controls despite
significantly impaired declarative memory for the same tasks. Anderson
et al. (2007) studied the motor skill of car driving in two severely amnesic
patients. Their steering, speed control, safety errors and driving with distraction were intact.
Finally, we discuss patients with Parkinson’s disease (see Glossary).
These patients have damage to the striatum (see Glossary), which is of
greater importance to non-declarative learning than declarative learning.
As predicted, Parkinson’s patients typically have severely impaired non-
declarative learning and memory (see Chapter 6). For example, Kemeny
et al. (2018) found on the serial reaction time task that Parkinson’s patients
showed practically no evidence of learning (see Figure 7.14).
However, Parkinson’s patients sometimes have relatively intact episodic memory. For example, Pirogovsky-Turk et al. (2015) found normal
performance by Parkinson’s patients on measures of free recall, cued
recall and recognition memory. These findings strengthen the case for a distinction between declarative and non-declarative memory.
Figure 7.14
Mean reaction times (in ms) across blocks on the serial reaction time task by Parkinson’s disease patients (PD) and healthy controls (HC).
From Kemeny et al. (2018).
Other research complicates the picture. First, Parkinson’s patients
(especially as the disease progresses) often have damage to brain areas
associated with episodic memory. Das et al. (2019) found impairments
in recognition memory (a form of episodic memory) among Parkinson’s
patients were related to damage within the hippocampus (of central importance in episodic memory). Many Parkinson’s patients also have problems
with attention and executive functions (Roussel et al., 2017). Bezdicek
et al. (2019) found impaired episodic memory in Parkinson’s patients was
related to reduced functioning of brain areas associated with attention and
executive functions as well as reduced hippocampal functioning.
Second, there are individual differences in the strategies used on many
tasks (e.g., weather-prediction task discussed earlier). Kemeny et al. (2018)
found Parkinson’s patients and healthy controls had comparable performance on the weather-prediction task. However, most Parkinson’s patients
used a much simpler strategy than healthy controls. Thus, the patients’
processing was affected by the disease although this was not apparent from
their overall performance.
Interacting systems
A central theme of this chapter is that traditional theoretical views are oversimplified (see next section pp. 332–340). For example, skill learning often
involves brain circuitry including the hippocampus (traditionally associated
exclusively with episodic memory). Döhring et al. (2017) studied patients
with transient global amnesia who had dysfunction of the hippocampus lasting for several hours. This caused profound deficits in declarative
memory but also reduced learning on a motor learning task involving finger
sequence tapping. Thus, optimal motor learning can require interactions of
the procedural and declarative memory systems.
Albouy et al. (2013) discussed research on motor sequence learning
(skill learning). The hippocampus (centrally involved in the formation of
declarative memories) played a major role in the acquisition and storage
of procedural memories and there were numerous interactions between
hippocampal-cortical and striato-cortical systems. Doyon et al. (2018)
reviewed changes during motor sequence learning. Early learning mainly
involved striatal regions in conjunction with prefrontal and premotor cortical regions. The contribution of the striatum and motor cortical regions
increased progressively during later learning. These findings suggest procedural learning is dominant later in learning but that declarative memory
plays a part early in learning. Similar findings are discussed by Beukema
and Verstynen (2018) (see p. 276).
How different are priming and skill learning?
Priming and skill learning are both forms of non-declarative memory.
However, as Squire and Dede (2015, p. 2) pointed out, “Non-declarative
memory is an umbrella term referring to multiple forms of memory.” Thus,
we might expect to find differences between priming and skill learning. As
mentioned earlier, priming generally occurs more rapidly and the learning
associated with priming is typically less flexible.
If priming and skill learning involve different processes, we would
not necessarily expect individuals good at skill learning to also be good at
priming. Schwartz and Hashtroudi (1991) found no correlation between
performance on a priming task (word identification) and a skill-learning
task (inverted text reading).
Findings based on neuroimaging or on brain-damaged patients might
clarify the relationship between priming and skill learning. Squire and
Dede (2015) argued the striatum is especially important in skill learning
whereas the neocortex (including the prefrontal cortex) is of major importance in priming.
Some evidence (including research discussed above) is supportive of
Squire and Dede’s (2015) viewpoint. However, other research is less supportive. Osman et al. (2008) found Parkinson’s patients had intact procedural learning when learning about and controlling a complex system (e.g.,
water-tank system). This suggests the striatum is not needed for all forms
of skill learning. Gong et al. (2016; discussed earlier, p. 326) found patients
with frontal damage nevertheless had intact perceptual priming.
The wide range of tasks used to assess priming and skill learning means
numerous brain regions are sometimes activated on both kinds of tasks.
We start with skill learning. Penhune and Steele (2012; see Chapter 6) proposed a theory assuming skill learning involves several brain areas including the primary motor cortex, cerebellum and striatum. So far as priming
is concerned, Segaert et al. (2013) reviewed 29 neuroimaging studies and
concluded that “Repetition enhancement effects have been found all over
the brain” (p. 60).
Evaluation
Much evidence suggests priming and skill learning are forms of non-
declarative memory involving different processes and brain areas from
those involved in declarative memory. There is limited evidence of a double
dissociation: amnesic patients often exhibit reasonably intact priming
and skill learning but severely impaired declarative memory. In contrast,
Parkinson’s patients (especially in the early stages of the disease) sometimes
have intact declarative memory but impaired procedural memory.
What are the main limitations of research in this area?
(1) There is considerable flexibility in the processes used on many memory tasks. As a result, it is often an oversimplification to describe a task as involving only “non-declarative memory”.
(2) Numerous tasks have been used to assess priming and skill learning. More attention needs to be paid to differences among tasks in the precise cognitive processes involved.
(3) There should be more emphasis on brain networks rather than specific brain areas. For example, motor sequence learning involves a striato-cortical system rather than simply the striatum. In addition, this system interacts with a hippocampal-cortical system (Albouy et al., 2013).
(4) The findings from Parkinson’s patients are mixed and inconsistent. Why is this? As the disease progresses, brain damage in such patients typically moves beyond brain areas involved in non-declarative memory (e.g., the striatum) to areas involved in declarative memory (e.g., the hippocampus and prefrontal areas).
BEYOND MEMORY SYSTEMS AND DECLARATIVE
VS NON-DECLARATIVE MEMORY
Until relatively recently, most memory researchers argued the distinction
between declarative/explicit and non-declarative/implicit memory was of
major theoretical importance. According to this traditional approach, a
crucial difference between memory systems is whether they support conscious
access to stored information (see Figure 7.2). It was also often assumed that
only memory systems involving conscious access depend heavily on the
medial temporal lobe (especially the hippocampus). The traditional approach
has proved extremely successful – consider all the accurate predictions
it made with respect to the research discussed earlier. However, its major
assumptions are oversimplified and more complex theories are required.
Explicit vs implicit memory
If the major dividing line in long-term memory is between declarative
(explicit) and non-declarative (implicit) memory, it is important to devise
tasks involving only one type of memory. This sounds easy: declarative memory is involved when participants are instructed to remember previously presented information but not otherwise.
Reality is more complex. Consider the word-completion task.
Participants are presented with a word list. Subsequently, they perform an
apparently unrelated task: word fragments (e.g., STR _____ ) are presented
and they produce a word starting with those letters. Implicit memory is
revealed by the extent to which their word completions match list words.
Since the instructions make no reference to recall, this task is apparently
an implicit/non-declarative task. However, participants who become aware
of the connection between the word list and the word-completion task
perform better than those who do not (Mace, 2003).
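A hypothetical scoring sketch makes the logic of the task explicit: implicit memory is indexed by how often a participant’s completions match studied words, relative to the baseline rate from participants who never saw the list. All words and the baseline value below are invented.

```python
# Hypothetical scoring sketch for the word-completion task: priming is
# shown when completions match studied words more often than baseline.
studied_words = {"strap", "string", "garden", "candle"}

# Invented completions of stems such as STR___, GAR___, CAN___.
completions = ["street", "string", "strap", "garage", "candle"]

matches = sum(word in studied_words for word in completions)
completion_rate = matches / len(completions)

baseline_rate = 0.25  # assumed rate for participants who saw no list

print(f"Completion rate {completion_rate:.2f} vs baseline {baseline_rate:.2f}")
```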
Hippocampal activation is generally associated with declarative
memory whereas activity of the striatum is associated with non-declarative memory. However, Sadeh et al. (2011) obtained more complex findings. Effective learning on an episodic memory task was associated with interactive activity between the hippocampus and striatum. Following a
familiar route also often involves complex interactions between the hippocampus and striatum with declarative memory assisting in the guidance of ongoing actions retrieved from non-declarative memory (Goodroe
et al., 2018).
The involvement of declarative/explicit memory and non-declarative/
implicit memory on any given task sometimes changes during the course
of learning and/or there are individual differences in use of the two forms
of memory. Consider the acquisition of sequential motor skills. There
is often a shift from an early reliance on explicit processes to a later reliance on implicit processes (Beukema & Verstynen, 2018; see Chapter 6).
Lawson et al. (2017) reported individual differences during learning on the
serial reaction time task (see Chapter 6). Some learners appeared to rely
solely on implicit processes whereas others also used explicit processes.
Research activity: Word-stem completion task
Henke’s processing-based theoretical account
Several theories differing substantially from the traditional theoretical
approach have been proposed. For example, compare Henke’s (2010)
processing-based model (see Figure 7.15) against the traditional model
(see Figure 7.2). Henke’s model differs crucially in that “Consciousness of
encoding and retrieval does not select for memory systems and hence does
not feature in this model” (p. 528).
Another striking difference relates to declarative memory. In the traditional model, all declarative memory (episodic plus semantic memory)
depends on the medial temporal lobes (especially the hippocampus) and
the diencephalon. In Henke’s model, in contrast, episodic memory depends
on the hippocampus and neocortex, semantic memory can involve brain
areas outside the hippocampus, and familiarity in recognition memory
depends on the parahippocampal gyrus and neocortex (and also the perirhinal cortex).
Figure 7.15 is oversimplified. Henke (2010) argued semantic knowledge can be learned in two different ways: one way is indicated in the figure
but the other way “uses the hippocampus and involves episodic memory
formation” (p. 528). The assumption that semantic memory need not
depend on the hippocampus helps to explain why amnesic patients’ semantic memory is generally less impaired than their episodic memory (Spiers
et al., 2001).
There are three basic processing modes in Henke’s (2010) model:
(1) Rapid encoding of flexible associations: this involves episodic memory and depends on the hippocampus. It is also assumed semantic memory often involves the hippocampus.
(2) Slow encoding of rigid associations: this involves procedural memory, semantic memory and classical conditioning, and depends on the basal ganglia (e.g., the striatum) and cerebellum.
(3) Rapid encoding of single or unitised items (formed into a single unit): this involves priming and familiarity in recognition memory and depends on the parahippocampal gyrus.

Figure 7.15
A processing-based memory model. There are three basic processing modes: (1) rapid encoding of flexible associations; (2) slow encoding of rigid associations; and (3) rapid encoding of single or unitised items formed into a single unit. The brain areas associated with each of these processing modes are indicated towards the bottom of the figure.
From Henke (2010). Reproduced with permission from Nature Publishing Group.
Many predictions are common to Henke’s (2010) model and the traditional
model. For example, amnesic patients with hippocampal damage should
have generally poor episodic memory but intact procedural memory and
priming. However, the two models make different predictions:
(1) Henke’s (2010) model predicts that amnesic patients with hippocampal damage should have severe impairments of episodic memory (and semantic memory) for flexible relational associations but not for single or unitised items. In contrast, according to the traditional model, amnesic patients should have impaired episodic and semantic memory for single or unitised items as well as for flexible relational associations.
(2) Henke’s (2010) model predicts the hippocampus is involved in the encoding of flexible associations with unconscious and conscious learning. In contrast, the traditional model assumes the hippocampus is involved only in conscious learning.
(3) Henke’s model predicts the hippocampus is not directly involved in familiarity judgements in recognition memory. In contrast, the traditional model assumes all forms of episodic memory depend on the hippocampus.
Findings
We start with the first prediction above as it applies to episodic memory.
Quamme et al. (2007) studied recognition memory for word pairs
(e.g., CLOUD–LAWN). In the key condition, each word pair was unitised (e.g., CLOUD-LAWN was interpreted as a lawn used for viewing
clouds). Amnesic patients with hippocampal damage had a much smaller recognition-memory deficit when the word pairs were unitised than when
they were not. Olson et al. (2015) presented faces with a fixed or variable
viewpoint followed by a recognition-memory test. It was assumed flexible
associations would be formed only in the variable-viewpoint condition.
As predicted, a female amnesic (HC) had intact performance only in the
fixed-viewpoint condition (see Figure 7.16).
Research by Blumenthal et al. (2017; discussed earlier, p. 302) on
semantic memory is also relevant to the first prediction. An amnesic patient
with hippocampal damage had impaired semantic memory performance
when it depended on having formed relational associations. However, her
semantic memory performance was intact when relational associations
were not required.
Support for the second prediction was reported by Duss et al. (2014).
Unrelated word pairs (e.g., violin–lemon) were presented subliminally to
amnesic patients and healthy controls. The amnesic patients had significantly poorer relational or associative encoding and retrieval than the controls. However, their encoding (and retrieval) of information about single
words (e.g., angler) was comparable to controls. Only the relational task involved hippocampal activation.

Figure 7.16
Recognition memory (corrected recognition) for faces presented in a fixed or variable viewpoint and tested in a fixed or variable viewpoint; HC is a female amnesic patient.
From Olson et al. (2015).
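The measure plotted in Figure 7.16 is corrected recognition. The chapter does not define it here, but it is conventionally computed as the hit rate minus the false-alarm rate, which prevents a general bias towards responding “old” from inflating scores. The sketch below uses invented numbers.

```python
# Corrected recognition is conventionally hit rate minus false-alarm
# rate, so a bias to say "old" does not inflate the score.
# All numbers below are invented for illustration.

hits = 42          # "old" responses to studied faces (out of 50)
false_alarms = 9   # "old" responses to new faces (out of 50)
n_old, n_new = 50, 50

corrected = hits / n_old - false_alarms / n_new
print(f"Corrected recognition: {corrected:.2f}")  # 0.84 - 0.18 = 0.66
```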
Hannula and Greene (2012) discussed several studies showing associative or relational learning can occur without conscious awareness. Of most
relevance here, however, is whether the hippocampus is activated during
non-conscious encoding and retrieval. Henke et al. (2003) presented participants with face–occupation pairs below the level of conscious awareness. There was hippocampal activation during non-conscious encoding of
the face–occupation pairs. There was also hippocampal activation during
non-conscious retrieval of occupations associated with faces.
Finally, we turn to Henke’s third prediction, namely, that the hippocampus is not required for familiarity judgements in recognition
memory. If so, we might predict amnesic patients should have intact familiarity judgements. As predicted, amnesics have intact recognition memory
(including familiarity judgements) for unfamiliar faces (Bird, 2017; discussed earlier, p. 308).
However, the findings with unfamiliar faces are unusual because
patients generally have only reasonably (but not totally) intact familiarity
judgements for other types of material (Bird, 2017; Bowles et al., 2010;
Skinner & Fernandes, 2007; discussed earlier, pp. 307–308). However,
these findings may not be inconsistent with Henke’s (2010) model because
amnesics’ brain damage often extends beyond the hippocampus to areas
associated with familiarity (perirhinal cortex). A male amnesic patient
(KN) with hippocampal damage but no perirhinal damage had intact
familiarity performance (Aggleton et al., 2005).
As shown in Figure 7.15, Henke (2010) assumed that familiarity
judgements depend on activation in brain areas also involved in priming.
As predicted, Thakral et al. (2016) found similar brain areas were associated with familiarity and priming, suggesting they both involve similar
processes.
Evaluation
Henke’s (2010) model, with its emphasis on memory processes rather than memory systems, is an advance. We have considered several examples where
predictions from her model have proved superior to predictions from the
traditional approach.
What are the model’s limitations? First, more research and theorising are needed to clarify the role of consciousness in memory. Conscious
awareness is associated with integrated processing across several brain areas
(Chapter 16) and so is likely to enhance learning and memory. However,
how this happens is not specified.
Second, the model resembles a framework rather than a model. For
example, it is assumed the acquisition of semantic memories is sometimes
closely related to episodic memory. However, we cannot make precise predictions unless we know the precise conditions determining when this is
the case and how processes associated with semantic and episodic memory
interact.
Third, the model does not consider the brain networks associated with
different types of memory (see below).
Does each memory system depend on a few brain areas?
According to the traditional theoretical approach (see Figure
7.2), each memory system depends on only a few key brain areas
(a similar assumption was made by Henke, 2010). Nowadays,
however, it is generally assumed each type of memory involves
several brain areas forming one or more networks.
How can we explain the above theoretical shift? Early
memory research relied heavily on findings from brain-damaged
patients. Such findings (while valuable) are limited. They can
indicate a given brain area is of major importance. However,
neuroimaging research allows us to identify all brain areas associated with a given type of memory. Examples of the traditional
approach’s limitations are discussed below.
First, it was assumed that episodic memory depends primarily on the medial temporal lobe (especially the hippocampus).
Neuroimaging research indicates that several other brain areas
interconnected with the medial temporal lobe are also involved.
In a review, Bastin et al. (2019) concluded there is a general
recollection network specific to episodic memory including the
inferior parietal cortex, the medial prefrontal cortex and the posterior cingulate cortex.
Kim and Voss (2019) assessed brain activity during the formation of episodic memories. They discovered that activation within large brain networks predicted subsequent recognition-memory performance (see Figure 7.17). Why did activation in certain areas predict lower recognition-memory performance? The most important reason is that such activation often reflects various kinds of task-irrelevant processing.

Figure 7.17
Brain areas whose activity during episodic learning predicted increased recognition-memory performance (task-positive; in red) or decreased performance (task-negative; in blue).
From Kim & Voss (2019).

Second, in the traditional approach (and Henke’s, 2010, model), autobiographical memories were regarded simply as a form of episodic memory. However, the retrieval of autobiographical memories often involves more brain networks than the retrieval of simple episodic memories. As is shown in Figure 8.7, retrieval of autobiographical memories involves the fronto-parietal network, the cingulo-operculum network, the medial prefrontal cortex network and the medial temporal lobe network. Only the last of these networks is emphasised within the traditional approach (and Henke’s model).
Third, more brain areas are associated with semantic memory than the medial temporal lobes emphasised in the traditional model. In a meta-analysis, Binder et al. (2009) identified a left-hemisphere network consisting of seven regions including the middle temporal gyrus, dorsomedial prefrontal cortex and ventromedial prefrontal cortex.
Fourth, it was assumed within the traditional approach that priming
involves the neocortex. In fact, what is involved is more complex. Kim
(2017a; discussed earlier, pp. 327–328) found in a meta-analysis that
priming is associated with reduced activation in the fronto-parietal control
network and the dorsal attention network but increased activation in the
dorsolateral prefrontal cortex and related areas.
Are memory systems independent?
A key feature of the traditional theoretical approach (see Figure 7.2) was
the assumption that each memory system operates independently. As a consequence, any given memory task should typically involve only a single
memory system. This assumption is an oversimplification. As Ferbinteanu
(2019, p. 74) pointed out, “The lab conditions, where experiments are carefully designed to target specific types of memories, most likely do not universally apply in natural settings where different types of memories combine
in fluid and complex manners to guide behaviour.”
First, consider episodic and semantic memory. Earlier we considered cases where episodic and semantic memory were both involved. For
example, people answering questions about repeated personal events (e.g.,
“Have you drunk coffee while shopping?”) rely on both episodic and
semantic memory (Renoult et al., 2016).
Second, consider skill learning and memory. Traditionally, it was
assumed that skill learning depends primarily on implicit processes.
However, as we saw earlier, explicit processes are often involved early in
learning processes (Beukema & Verstynen, 2018; see Chapter 6).
Component-process models
The traditional theoretical model is too neat and tidy: it assumes the
nature of any given memory task rigidly determines the processes used.
We need a theoretical approach assuming that memory processes are
much more flexible than assumed within the traditional model (or
Henke’s model). Dew and Cabeza (2011) proposed such an approach
(see Figure 7.18). Five brain areas were identified varying along three
dimensions:
(1) cognitive process: perceptually or conceptually driven;
(2) stimulus representation: item or relational;
(3) level of intention: controlled vs. automatic.
This approach is based on two major assumptions, which differ from those
of previous approaches. First, there is considerable flexibility in the combination of processes (and associated brain areas) involved in the performance of any memory task. Second, “The brain regions operative during
explicit or implicit memory do not divide on consciousness per se” (Dew &
Cabeza, 2011, p. 185).
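One way to grasp the three dimensions is to describe tasks as points in the space they define. The sketch below does this for two tasks; the classifications are our illustrative reading of the model, not code or data from Dew and Cabeza.

```python
# Illustrative sketch: memory tasks located along Dew and Cabeza's
# three dimensions. Task classifications are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class MemoryTask:
    name: str
    process: str         # "perceptual" or "conceptual"
    representation: str  # "item" or "relational"
    intention: str       # "automatic" or "controlled"

tasks = [
    MemoryTask("perceptual priming", "perceptual", "item", "automatic"),
    MemoryTask("episodic recollection", "conceptual", "relational", "controlled"),
]

# The model's point: any combination of coordinates can occur, so
# explicit and implicit tasks need not divide along consciousness.
for task in tasks:
    print(task)
```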
Cabeza et al. (2018) proposed a component-process model resembling
that of Dew and Cabeza (2011). This model assumes that processing is very
flexible and depends heavily on process-specific alliances (PSAs) or mini-
networks. According to Cabeza et al., “A PSA is a small team of brain
regions that rapidly assemble to mediate a cognitive process in response
to task demands but quickly disassemble when the process is no longer
needed . . . PSAs are flexible, temporary, and opportunistic” (p. 996).
Ferbinteanu (2019) proposed a dynamic
network model based on very similar
assumptions.
A major motivation for this theoretical
approach was neuroimaging evidence. Here
is an example involving the left angular
gyrus in the parietal lobe. This region is
involved in both the recollection of episodic
memories and numerous tasks requiring
semantic processing (see Figure 7.19).
Moscovitch et al. (2016) pointed
out that the hippocampus’s connections
to several other brain areas (e.g., those
involved in visual perception) suggest it
is not only involved in episodic memory.
Consider research on boundary extension: “the . . . tendency to reconstruct a
scene with a larger background than actually was presented” (Moscovitch et al.,
2016, p. 121). Boundary extension is
accompanied by hippocampal activation
and is greatly reduced in amnesic patients
with hippocampal damage.
McCormick et al. (2018) reviewed
research on patients with damage to
the hippocampus. Such patients mostly
showed decreased future thinking and
impaired scene construction, navigation
and moral decision-making as well as
impaired episodic memory. McCormick
et al. also reviewed research on patients
with damage to the ventromedial prefrontal cortex (centrally involved in schema
processing in semantic memory), which
is also connected to several other brain
areas. Such patients had decreased future
thinking and impaired scene construction,
navigation and emotion regulation.
Figure 7.18
A three-dimensional model of memory: (1) conceptually or perceptually driven; (2) relational or item stimulus representation; (3) controlled or automatic/involuntary intention. The brain areas are the visual cortex (Vis Ctx), parahippocampal cortex (PHC), hippocampus (Hipp), rhinal cortex (RhC) and left ventrolateral prefrontal cortex (L VL PFC).
From Dew and Cabeza (2011). © 2011 New York Academy of Sciences. Reprinted with permission of Wiley & Sons.

Figure 7.19
Process-specific alliances (PSAs) including the left angular gyrus (L-AG) are involved in recollection of episodic memories (left-hand side) and semantic processing (right-hand side).
From Cabeza et al. (2018).

Evaluation
The component-process approach has several strengths. First, there is compelling evidence that processes associated with different memory systems combine very
flexibly on numerous memory tasks. This flexibility depends on the precise
task demands (e.g., processes necessary early in learning may be less so
subsequently) and on individual differences in learning/memory skills and
previous knowledge. In other words, we use whatever processes (and associated brain areas) are most useful for the current learning or memory task.
Second, this approach is more consistent with the neuroimaging evidence than previous approaches. It can account for the fact that many
more brain areas are typically active during most memory tasks than
expected from the traditional approach.
Third, the component-process approach has encouraged researchers to
abandon the traditional approach of studying memory as an isolated mental
function. For example, processes associated with episodic memory are also
involved in scene construction, aspects of decision-making, navigation,
imagining the future and empathy (McCormick et al., 2018; Moscovitch
et al., 2016). More generally, “The border between memory and perception/action has become more blurred” (Ferbinteanu, 2019, p. 74).
What are the limitations of the component-process approach? First, it
does not provide a detailed model. This makes it hard to make specific predictions concerning the precise combination of processes individuals will
use on any given memory task.
Second, our ability to create process-specific alliances rapidly and
efficiently undoubtedly depends on our previous experiences and various
forms of learning (Ferbinteanu, 2019). However, the nature of such learning remains unclear.
Third, as Moscovitch et al. (2016, p. 125) pointed out, “Given that
PSAs are rapidly assembled and disassembled, they require a mechanism
that can quickly control communication between distant brain regions.”
Moscovitch et al. argued the prefrontal cortex is centrally involved, but we
have very limited evidence concerning its functioning.
Fourth, process-specific alliances are typically mini-networks involving
two or three brain regions. However, as we have seen, some research has
suggested the involvement of larger brain networks consisting of numerous brain regions (e.g., Kim & Voss, 2019). The optimal network size for
explaining learning and memory remains unclear.
KEY TERM
Boundary extension
Misremembering a scene as having a larger surround area than was actually the case.
CHAPTER SUMMARY
• Introduction. The notion there are several memory systems is very
influential. Within that approach, the crucial distinction is between
declarative memory (involving conscious recollection) and non-declarative memory (not involving conscious recollection). This
distinction has received strong support from amnesic patients
with severely impaired declarative memory but almost intact non-declarative memory. Declarative memory is divided into semantic
and episodic/autobiographical memory, whereas non-declarative
memory is divided into priming and skill learning or procedural
memory.
• Declarative memory. Evidence from patients supports the
distinction between episodic and semantic memory. Amnesic
patients with damage to the medial temporal lobes including
the hippocampus typically have more extensive impairment of
episodic than semantic memory. In contrast, patients with semantic
dementia (involving damage to the anterior temporal lobes) have
more extensive impairment of semantic than episodic memory.
However, a complicating factor is that many memory tasks involve
combining episodic and semantic memory processes. Another
complicating factor is semanticisation (transformation of episodic
memories into semantic ones over time): perceptual details within
episodic memory are lost over time and there is increased reliance
on gist and schematic information within semantic memory.
• Episodic memory. Episodic memory is often assessed by
recognition tests. Recognition memory can involve familiarity or
recollection. Evidence supports the binding-of-item-and-context
model: familiarity judgements depend on perirhinal cortex whereas
recollection judgements depend on binding what and where
information in the hippocampus. In similar fashion, free recall can
involve familiarity or recollection with the latter being associated
with better recall of contextual information. Episodic memory
is basically constructive rather than reproductive, and so we
remember mostly the gist of our past experiences. Constructive
processes associated with episodic memory are used to imagine
future events. However, imagining future events relies more heavily
on semantic memory than does recalling past events. Episodic
memory is also used in divergent creative thinking.
• Semantic memory. Most objects can be described at the
superordinate, basic and subordinate levels. Basic level categories
are typically used in everyday life. However, categorisation is
often faster at the superordinate level than the basic level because
less information processing is required. According to Barsalou’s
situated simulation theory, concept processing involves perceptual
and motor information. However, it is unclear whether perceptual
and motor information are both necessary and sufficient for
concept understanding (e.g., patients with damage to the motor
system can understand action-related words). Concepts have an
abstract central core of meaning de-emphasised by Barsalou.
According to the hub-and-spoke model, concepts consist of
hubs (unified abstract representations) and spokes (modality-specific information). The existence of patients with category-specific deficits supports the notion of spokes. Evidence from
patients with semantic dementia indicates hubs are stored in the
anterior temporal lobes. It is unclear how information from hubs
and spokes is combined and integrated.
Schemas are stored in semantic memory with the ventromedial
prefrontal cortex being especially involved in schema processing.
Patients with damage to that brain area often have greater
impairments in schema knowledge than concept knowledge. In
contrast, patients with semantic dementia (damage to the anterior
temporal lobes) have greater impairments in concept knowledge
than schema knowledge. Thus, there is some evidence for a
double dissociation.
• Non-declarative memory. Priming is tied to specific stimuli
and occurs rapidly. Priming often depends on enhanced neural
efficiency shown by repetition suppression of brain activity. Skill
learning occurs slowly and generalises to stimuli not presented
during learning. Amnesic patients (with hippocampal damage)
typically have fairly intact performance on priming and skill learning
but severely impaired declarative memory. In contrast, Parkinson’s
patients (with striatal damage) exhibit the opposite pattern.
Amnesic and Parkinson’s patients provide only an approximate
double dissociation. Complications arise because some tasks can
be performed using either declarative or non-declarative memory,
because different memory systems sometimes interact during
learning, and because non-declarative learning often involves
networks consisting of several brain areas.
• Beyond memory systems and declarative vs non-declarative
memory. The traditional emphasis on the distinction between
declarative and non-declarative memory is oversimplified. It does
not fully explain amnesics’ memory deficits and exaggerates the
relevance of whether processing is conscious or not. Henke’s
model (with its emphasis on processes rather than memory
systems) provides an account that is superior to the traditional
approach. According to the component-process model, memory
involves numerous brain areas and processes used in flexible
combinations rather than a much smaller number of rigid memory
systems. This model has great potential. However, it is hard to
make specific predictions about the combinations of processes
individuals will use on any given memory task.
FURTHER READING
Baddeley, A.D., Eysenck, M.W. & Anderson, M.C. (2020). Memory (3rd edn).
Abingdon, Oxon.: Psychology Press. Several chapters are of direct relevance to
the topics covered in this chapter.
Bastin, C., Besson, G., Simon, J., Delhaye, E., Geurten, M. & Willems, S. (2019). An
integrative memory model of recollection and familiarity to understand memory
deficits. Behavioral and Brain Sciences, 1–66 (epub: 5 February 2019). Christine
Bastin and colleagues provide a comprehensive theoretical account of episodic
memory.
Cabeza, R., Stanley, M.L. & Moscovitch, M. (2018). Process-specific alliances
(PSAs) in cognitive neuroscience. Trends in Cognitive Sciences, 22, 996–1010.
Roberto Cabeza and colleagues discuss how cognitive processes (including memory)
depend on flexible interactions among brain regions.
Ferbinteanu, J. (2019). Memory systems 2018 – Towards a new paradigm.
Neurobiology of Learning and Memory, 157, 61–78. Janina Ferbinteanu discusses
recent theoretical developments in our understanding of memory systems.
Kim, H. (2017). Brain regions that show repetition suppression and enhancement:
A meta-analysis of 137 neuroimaging experiments. Human Brain Mapping, 38,
1894–1913. Hongkeun Kim discusses the processes underlying repetition priming
with reference to a meta-analysis of the relevant brain areas.
Lambon Ralph, M.A., Jefferies, E., Patterson, K. & Rogers, T.T. (2017). The neural
and computational bases of semantic cognition. Nature Reviews Neuroscience, 18,
42–55. Our current knowledge and understanding of semantic memory are discussed in the context of the hub-and-spoke model.
Verfaellie, M. & Keane, M.M. (2017). Neuropsychological investigations of human
amnesia: Insights into the role of the medial temporal lobes in cognition. Journal
of the International Neuropsychological Society, 23, 732–740. Research on amnesia
and memory is discussed in detail in this article.
Yee, E., Jones, M.N. & McRae, K. (2018). Semantic memory. In S.L. Thompson-Schill & J.T. Wixted (eds), Stevens’ Handbook of Experimental Psychology and Cognitive Neuroscience, Vol. 3: Language and Thought (4th edn; pp. 319–356). New York: Wiley. This chapter provides a
comprehensive account of theory and research on semantic memory.
Chapter 8
Everyday memory
INTRODUCTION
Most memory research discussed in Chapters 6 and 7 was laboratory-based
but nevertheless of reasonably direct relevance to how we use memory in
our everyday lives. In this chapter, we focus on topics rarely researched
until approximately 50 years ago but arguably even more directly relevant
to our everyday lives. Two such topics are autobiographical memory and
prospective memory, which are both strongly influenced by our everyday
goals and motives. This is very clear with prospective memory (remembering to carry out intended actions). Our intended actions assist us to achieve
our current goals. For example, if you have agreed to meet a friend at
10 am, you need to remember to set off at the appropriate time to achieve
that goal.
The other main topic discussed in this chapter is eyewitness testimony.
Such research has obvious applied value with respect to the judicial system.
However, most research on eyewitness testimony has been conducted in
laboratory settings. Thus, it would be wrong to distinguish sharply between
laboratory research and everyday memory or applied research.
In spite of what has been said so far, everyday memory research sometimes differs from more traditional memory research in various ways. First, social
factors are often important in everyday memory (e.g., a group of friends
discuss some event or holiday they have shared together). In contrast,
participants in traditional memory research typically learn and remember
information on their own.
Second, participants in traditional memory experiments are generally
motivated to be as accurate as possible. In contrast, everyday memory
research is typically based on the notion that “Remembering is a form of
purposeful action” (Neisser, 1996, p. 204). This approach involves three
assumptions about everyday memory:
(1) It is purposeful (i.e., motivated).
(2) It has a personal quality about it, meaning it is influenced by the individual’s personality and other characteristics.
(3) It is influenced by situational demands (e.g., the wish to impress one’s
audience).
The essence of Neisser’s (1996) argument is this: what we remember in
everyday life is determined by our personal goals, whereas what we remember in traditional memory research is mostly determined by the experimenter’s demands for accuracy. Sometimes we strive for maximal memory
accuracy in our everyday life (e.g., during an examination), but that is
­typically not our main goal.
KEY TERM
Saying-is-believing effect
Tailoring a message
about an event to suit a
given audience causes
subsequent inaccuracies
in memory for that event.
Findings
Evidence that the memories we report in everyday life are sometimes deliberately distorted was reported by Brown et al. (2015). They found 58% of
students admitted to having “borrowed” other people’s personal memories
when describing experiences that had allegedly happened to them. This was
often done to entertain or impress an audience.
If what you say about an event is deliberately distorted, does this
change the memory itself? It often does. Dudukovic et al. (2004) asked
people to recall a story accurately (as in traditional memory research) or
entertainingly (as in the real world). Unsurprisingly, entertaining retellings
were more emotional but contained fewer details.
The participants were then instructed to recall the story accurately.
Those who had previously recalled it entertainingly recalled fewer details
and were less accurate than those who previously recalled it accurately.
This exemplifies the saying-is-believing effect – tailoring what one says
about an event to suit a given audience causes inaccuracies in memory for
that event.
Further evidence of the saying-is-believing effect was reported by
Hellmann et al. (2011). Participants saw a video of a pub brawl involving two men. They then described the brawl to a student, having previously
been told this student believed person A was (or was not) the culprit. The
participants’ retelling of the event reflected the student’s biased views. On a
subsequent unexpected test of free recall for the crime event, participants’
recall was systematically influenced by their earlier retelling. Free recall was
most distorted in those participants whose retelling of the event had been
most biased.
What should be done?
Research on human memory should ideally possess ecological validity (i.e.,
applicability to real life; see Glossary). Ecological validity has two aspects:
(1) representativeness (the naturalness of the experimental situation and
task); and (2) generalisability (the extent to which a study’s findings apply
to the real world).
It is often assumed that everyday memory research has greater ecological validity than traditional laboratory research, but this assumption is mistaken: generalisability matters more than representativeness (Kvavilashvili & Ellis, 2004). Laboratory research is generally carried
out under well-controlled conditions and very often produces findings that
apply to the real world. Indeed, the fact that the level of experimental
control is generally higher in laboratory research than in more naturalistic
research means that the findings obtained often have greater generalisability. Laboratory research also often satisfies the criterion of representativeness because the experimental situation captures key features of the real
world.
In sum, the distinction between traditional laboratory research and
everyday memory research is blurred and indistinct. In practice, there
is much cross-fertilisation, with the insights from both kinds of memory
research enhancing our understanding of human memory.
KEY TERMS
Autobiographical memory
Long-term memory for the events of one’s own life.
Mentalising
The ability to perceive and interpret behaviour in terms of mental states (e.g., goals; needs).
AUTOBIOGRAPHICAL MEMORY: INTRODUCTION
We have hundreds of thousands of memories relating to an endless variety
of things. However, those relating to the experiences we have had and those
of other people important to us have special significance and form our
autobiographical memory (memory for the events of one’s own life).
What is the relationship between autobiographical memory and episodic memory (concerned with events at a given time in a specific place;
see Chapter 7)? One important similarity is that both types of memory
relate to personally experienced events. In addition, both are susceptible to
proactive and retroactive interference, and unusual or distinctive events are
especially well remembered.
There are also several differences between them. First, autobiographical memory typically relates to events of personal significance whereas
episodic memory (sometimes called “laboratory memory”) often relates to
trivial events (e.g., was the word chair presented in the first list?). As a consequence, autobiographical memories are thought about more often than episodic ones. They also tend to be more organised than episodic
memories because they relate to the self.
Second, neuroimaging evidence suggests autobiographical memory is
more complex and involves more brain regions than episodic memory.
Andrews-Hanna et al. (2014) carried out a meta-analysis (see Glossary)
of studies on autobiographical memory, episodic memory and mentalising (understanding the mental states of oneself and others) (see Figure
8.1). Episodic memory retrieval involved medial temporal regions (including the hippocampus) whereas mentalising involved the dorsal medial
regions (including the dorsal medial prefrontal cortex). Of most importance, the brain regions associated with autobiographical memory overlapped with those associated with episodic memory and mentalising. Thus,
autobiographical memory seems to involve both episodic memory and mentalising.
Third, some people have large discrepancies between their autobiographical and episodic memory (Roediger & McDermott, 2013). For
example, Patihis et al. (2013) found individuals with exceptionally good
autobiographical memory had only average episodic memory performance
when recalling information learned under laboratory conditions (see below).
Fourth, the role of motivation differs between autobiographical and
episodic memory (Marsh & Roediger, 2012). We are much more interested in our own personal history than episodic memories formed in the
laboratory. In addition, as mentioned earlier,
we are motivated to recall autobiographical
memories reflecting well on ourselves. In
contrast, we are motivated to recall laboratory episodic memories accurately.
Fifth, some aspects of autobiographical
memory involve semantic memory (general
knowledge; see Glossary) rather than episodic memory (Prebble et al., 2013). For
example, we know where and when we
were born but this is not based on episodic
memory! Further evidence for the involvement of semantic memory in autobiographical memory comes from research on amnesic
patients (Juskenaite et al., 2016). They have little or no episodic memory but can nevertheless recall much information about themselves (e.g., aspects of their own personality).

Figure 8.1
Brain regions activated by autobiographical, episodic retrieval and mentalising tasks, including regions of episodic (green); mentalising (blue); autobiographical (red-brown); episodic + mentalising (blue/green); episodic + autobiographical (yellow); mentalising + autobiographical.

Eustache et al. (2016) distinguished