FMRI Studies of Effects of Hearing Status on
Audio-Visual Speech Perception
by
Julie J. Yoo
B.A.Sc. Computer Engineering, University of Waterloo, 1998
M.A.Sc. Electrical Engineering, University of Waterloo, 2002
Submitted to the Division of Health Sciences and Technology
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy in Speech and Hearing Bioscience and Technology
at the
Massachusetts Institute of Technology
February 2007
©2007 Massachusetts Institute of Technology
All rights reserved.
The author hereby grants to MIT permission to reproduce and distribute publicly paper and
electronic copies of this thesis document in whole or in part in any medium now known or
hereafter created.
Signature of Author:
Department of Health Sciences and Technology
February 12, 2007
Certified by:
Frank H. Guenther, Ph.D.
Associate Professor of Cognitive Neural Systems, Boston University
Thesis Supervisor
Accepted by:
Martha L. Gray, Ph.D.
Edward Hood Taplin Professor of Medical and Electrical Engineering
Director, Harvard-MIT Division of Health Sciences and Technology
FMRI Studies of Effects of Hearing Status on
Audio-Visual Speech Perception
by
Julie J. Yoo
Submitted to the Division of Health Sciences and Technology on February 12, 2007 in
Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in
Speech and Hearing Bioscience and Technology
Abstract
The overall goal of this research is to acquire a more complete picture of the neurological
processes of visual influences on speech perception and to investigate effects of hearing status on
AV speech perception. More specifically, functional magnetic resonance imaging (fMRI) was
used to investigate the brain activity underlying audio-visual speech perception in three groups of
subjects: (1) normally hearing, (2) congenitally deafened signers (American Sign Language) who
do not use hearing aids, and (3) congenitally hearing impaired individuals with hearing aids.
FMRI data were collected while subjects experienced three different types of speech stimuli:
video of a speaking face with audio input, audio speech without visual input, and video of a
speaking face without audio input.
The cortical areas found to be active for speechreading included: visual cortex, auditory cortex
(but not primary auditory cortex), speech motor network areas, supramarginal gyrus, thalamus,
superior parietal cortex and fusiform gyrus. For hearing impaired subjects, in addition to the
areas listed above, Heschl's gyrus, right angular gyrus (AG), cerebellum and regions around right
inferior frontal sulcus (IFS) in the frontal lobe were also found to be active. Results from our
study added to existing evidence of the engagement of motor-articulatory strategies in visual
speech perception. We also found that an individual's speechreading ability is related to the
amount of activity in superior temporal cortical areas (including primary auditory cortex), pre-SMA, IFS and right AG during visual speech perception.
Results from effective connectivity analyses suggest that posterior superior temporal sulcus may
be a potential AV speech integration site; and that AG serves a critical role in visual speech
perception when auditory information is absent for hearing subjects, and when auditory
information is available for hearing impaired subjects. Also, strong excitatory projections from
STS to inferior frontal gyrus (IFG) and premotor/motor areas, and a strong inhibitory
projection from IFG to STS seem to play an important role in visual speech perception in all
subject groups. Finally, correlation analyses revealed that in hearing aid users, the amounts of
acoustic and speech signal gained by using hearing aids were significantly correlated with activity
in IFG.
Thesis Supervisor: Frank H. Guenther
Title: Associate Professor
Acknowledgements
First and foremost, my deepest gratitude goes to my research advisors Frank Guenther and
Joseph Perkell. Together they provided just the right balance of engaged and hands-off
guidance, while being the continual source of abundant insight and inspiration. They were
always available whenever I needed direction and their doors were always open. Without
their constant support, encouragement and patience, I would not have been able to complete
the dissertation work and I am greatly indebted to them. I am especially grateful to Frank for
all his input, which was instrumental throughout every stage of this project, from its
inception to completion; his profound wisdom and in-depth knowledge are deeply
appreciated, and as a student in his lab I feel that I have learned a great deal, gained
invaluable experience and grown as a scientist.
I would also like to thank my other thesis committee members, Ken Stevens and John
Gabrieli, for their time and energy, and for the helpful criticisms and questions they provided
along the way. I appreciated Ken occasionally dropping by my office to see how I was
doing, and John attending my oral defense which actually happened to be on the day his wife
delivered a baby girl.
Everyone at the CNS Speech Lab and at the Speech Communication Group has been a great
moral support and help during my time as a graduate student. I would like to acknowledge
Satra Ghosh for helping me in so many different ways that I cannot even list them all here;
Alfonso Nieto-Castanon for helping me with data analysis and interpretation when I first
started out in Frank's lab; Jason Tourville for creating the figure generating scripts that I
needed just hours before my defense; Jay Bohland, Jonathan Brumberg and Seth Hall for
solving numerous miscellaneous problems I had encountered with our servers; Carrie
Niziolek, Elisa Golfinopoulos, and Steven Shannon (at the Martinos Center for Brain
Imaging) for second chairing many long MRI scan sessions while keeping me entertained
with interesting conversations; Majid Zandipour for listening to my occasional rants and
taking care of various issues around the lab; Arlene Wint for managing the lab in perfect
ways; and Maxine Samuels (at the RLE headquarters) for cheerfully processing stacks of
paperwork I piled on her desk every few days.
Harlan Lane has been a great help in recruiting hearing impaired research
subjects for our study. Melanie Matthies and our audiologist Nicole Marrone have also been
a tremendous help during the experimental sessions with our hearing aid research
participants. Melanie allowed me to use the sound booth and audiological testing equipment
at Sargent College and also guided me in interpreting the data collected from the
audiological tests; and Nicole took time out of her completely packed schedule to
conduct hours and hours of audiological testing on our research subjects. Additionally, I
would like to thank Lou Braida and Robert Hoffmeister for graciously providing me with
useful video recordings for our study, effectively saving me from doing weeks of video
recording and editing work.
Leigh Haverty, Matt Gilbertson and Jenn Flynn have been extremely generous with their
time, eagerly taking on sign language interpreting jobs for many of our experimental sessions
(even at some odd hours like at midnight on Fridays). Without their cooperation, scheduling
our experiments would have been much more difficult, and it would have taken me a lot
longer to collect all the data. I know that they had many other jobs to choose from, but
prioritizing our research project as one of the most important jobs to them is something I am
very thankful for. Additional thanks go to all of our anonymous study participants for their
willingness and patience.
Most importantly, I thank God for blessing me with wonderful family members and friends:
the greatest parents one could ever ask for, Jung and Soon Yoo, my sisters who are also my
best friends, Kathy, Sarah and Kari (you all are real troopers!), my brother-in-law Paul who
also was a great host at the cornfield cottage, honorary family members Sunny and Younghee
Kwak, and my very understanding and forgiving friends whose phone calls I have many
times ignored with an excuse that I am busy. Their loving support, prayers and quiet
encouragements are what kept me going through many ups and downs and I am forever
grateful. I am also very delighted to be able to complete my dissertation just in time for my
mom's 60th birthday, and would like to dedicate this work to my parents.
Finally, this research was supported by National Institute on Deafness and Other
Communication Disorders (NIDCD) grants R01 DC03007 (J. Perkell, PI) and R01 DC02852
(F. Guenther, PI). I would also like to acknowledge Domingo Altarejos for helping me
secure the financial support for my last semester at MIT.
TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION
   1.1 AUDIO-VISUAL SPEECH PERCEPTION (AVSP)
   1.2 NEUROIMAGING STUDIES OF AVSP AND HEARING STATUS
   1.3 NEURAL CONNECTIVITY
   1.4 GOALS
2 EXPERIMENTAL METHODS AND DATA ANALYSIS
   2.1 SUBJECTS
   2.2 TESTS CONDUCTED
   2.3 FUNCTIONAL MAGNETIC RESONANCE IMAGING EXPERIMENT
   2.4 SPEECHREADING TEST
   2.5 AUDIOLOGICAL AND SPEECH TESTS FOR HEARING AID USERS
   2.6 CORRELATION ANALYSES
   2.7 EFFECTIVE CONNECTIVITY ANALYSES
3 STUDY RESULTS
   3.1 RESULTS FROM STANDARD FMRI ANALYSES
      3.1.1 Normal Hearing (NH)
      3.1.2 Congenitally Deaf (CD)
      3.1.3 Hearing Aid Users (HA)
      3.1.4 Discussion of Results
         3.1.4.1 Auditory-Visual Speech Perception Network in NH Individuals
         3.1.4.2 Speech Motor Network and Visual Speech Perception
         3.1.4.3 Hearing Status and Auditory-Visual Speech Perception Network
   3.2 SPEECHREADING TEST AND FMRI ANALYSES
      3.2.1.1 Auditory Cortex
      3.2.1.2 Lateral Prefrontal Cortex
      3.2.1.3 Pre-SMA
      3.2.1.4 Angular Gyrus
      3.2.1.5 Conclusion
   3.3 RESULTS FROM CORRELATION ANALYSES
      3.3.1 Normally Hearing and Congenitally Deaf (NH and CD)
      3.3.2 Hearing Aid Users (HA)
4 EFFECTIVE CONNECTIVITY ANALYSES
   4.1 FUNCTIONAL CONNECTIVITY
      4.1.1 Partial Least Squares
      4.1.2 Eigenimage Analysis
      4.1.3 Multidimensional Scaling
   4.2 STRUCTURAL EQUATION MODELING
      4.2.1 Theory
      4.2.2 Methods
      4.2.3 Results
         4.2.3.1 Superior Temporal Sulcus as AV Speech Integration Site
         4.2.3.2 Visual-Temporal-Parietal Interactions
         4.2.3.3 Fronto-Temporal Interactions
         4.2.3.4 Network Differences between NH and CD Groups
   4.3 DYNAMIC CAUSAL MODELING
      4.3.1 Theory
      4.3.2 Results
5 SUMMARY OF RESULTS AND DISCUSSION
   5.1 NORMALLY HEARING
   5.2 HEARING-IMPAIRED
   5.3 CONCLUDING REMARKS AND FUTURE WORK
APPENDIX A
REFERENCES
LIST OF FIGURES

Figure 1-1 Hypothesized projections from visual to auditory cortical areas.
Figure 2-1 A still image of the video clip stimulus.
Figure 2-2 Block-design paradigm: a typical run.
Figure 3-1 NH group: Averaged cortical activation produced by the contrast of the Audio-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 4.23 (CVCV), T > 5.27 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-2 NH group: Averaged cortical activation produced by the contrast of the Visual-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.39 (CVCV), T > 3.96 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-3 NH group: Averaged cortical activation produced by the contrast of the Audio-Visual condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.32 (CVCV), T > 3.95 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-4 CD group: Averaged cortical activation produced by the contrast of the Audio-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 5.00 (CVCV), T > 4.02 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-5 CD group: Averaged cortical activation produced by the contrast of the Visual-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 2.82 (CVCV), T > 4.17 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-6 CD group: Averaged cortical activation produced by the contrast of the Audio-Visual condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.28 (CVCV), T > 4.28 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-7 HA group: Averaged cortical activation produced by the contrast of the Audio-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 4.02 (CVCV), T > 4.02 (Vowel), mixed-effects analyses with P < 0.001, uncorrected].
Figure 3-8 HA group: Averaged cortical activation produced by the contrast of the Visual-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 4.02 (CVCV), T > 4.02 (Vowel), mixed-effects analyses with P < 0.001, uncorrected].
Figure 3-9 HA group: Averaged cortical activation produced by the contrast of the Audio-Visual condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.11 (CVCV), T > 3.11 (Vowel), mixed-effects analyses with P < 0.001, uncorrected].
Figure 3-10 NH group: Speechreading test scores.
Figure 3-11 NH group: Averaged cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [T > 3.00 (Poor), T > 3.07 (Good), fixed-effects analyses with P < 0.01, FDR corrected].
Figure 3-12 CD group: Speechreading test scores.
Figure 3-13 CD group: Averaged cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [T > 3.00 (Poor), T > 3.07 (Good), fixed-effects analyses with P < 0.01, FDR corrected].
Figure 3-14 HA group: Speechreading test scores.
Figure 3-15 HA group: Averaged cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [T > 3.00 (Poor), T > 3.07 (Good), fixed-effects analyses with P < 0.01, FDR corrected].
Figure 3-16 NH group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Figure 3-17 CD group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Figure 3-18 HA group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Figure 3-19 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected].
Figure 3-20 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected].
Figure 4-1 Distribution of eigenvalues.
Figure 4-2 The first eigenimage.
Figure 4-3 Example of a structural model.
Figure 4-4 Anatomical model for SEM analyses (V = Higher-order Visual Cortex, AG = Angular Gyrus, IPT = Inferoposterior Temporal Lobe, STS = Superior Temporal Sulcus, IFG = Inferior Frontal Gyrus, M = Lateral Premotor Cortex & Lip area on primary motor cortex).
Figure 4-5 NH (left): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the CVCV Audio-Visual condition].
Figure 4-6 NH (right): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the CVCV Audio-Visual condition].
Figure 4-7 CD (right): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the CVCV Audio-Visual condition].
Figure 4-8 HA (right): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the CVCV Audio-Visual condition].
Figure 4-9 VO (right): estimated path coefficients [black text: NH, blue text: CD; thicker black arrows: connections with significant increase in strength for the NH group, thicker blue arrows: connections with significant increase in strength for the CD group].
Figure 4-10 AV (right): estimated path coefficients [black text: NH, blue text: CD; thicker black arrows: connections with significant increase in strength for the NH group, thicker blue arrows: connections with significant increase in strength for the CD group].
Figure 4-11 Example DCM model [adopted from Friston et al. (2003)].
Figure 4-12 The hemodynamic model [adopted from Friston et al. (2003)].
Figure 4-13 Example DCM model with its state variables [adopted from Friston et al. (2003)].
Figure 4-14 The anatomical model for DCM analyses.
Figure 4-15 NH (left): Results from DCM analysis [black: intrinsic connection estimates for both conditions combined; blue: modulatory effect estimates when auditory speech is present].
Figure 4-16 HA (right): Results from DCM analysis [black: intrinsic connection estimates for both conditions combined; blue: modulatory effect estimates when auditory speech is present].
Figure 5-1 NH subjects for the CVCV Visual-Only condition [black arrow: positive connection, blue arrow: negative connection; thin arrow: weak connection; thick arrow: strong connection].
Figure 5-2 NH subjects for the CVCV Audio-Visual condition [black arrow: positive connection, blue arrow: negative connection; thin arrow: weak connection; thick arrow: strong connection].
Figure 5-3 Hearing impaired subjects for the CVCV Visual-Only condition [black arrow: positive connection, blue arrow: negative connection; thin arrow: weak connection; thick arrow: strong connection].
Figure 5-4 Hearing impaired subjects for the CVCV Audio-Visual condition [black arrow: positive connection, blue arrow: negative connection; thin arrow: weak connection; thick arrow: strong connection].
LIST OF TABLES

Table 3-1 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Only condition versus the baseline condition.
Table 3-2 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Only condition versus the baseline condition.
Table 3-3 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Visual-Only condition versus the baseline condition.
Table 3-4 NH group: Summary of peak cortical activation produced by the contrast of the Vowel Visual-Only condition versus the baseline condition.
Table 3-5 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Visual condition versus the baseline condition.
Table 3-6 NH group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Visual condition versus the baseline condition.
Table 3-7 CD group: Summary of peak cortical activation produced by the contrast of the CVCV Visual-Only condition versus the baseline condition.
Table 3-8 CD group: Summary of peak cortical activation produced by the contrast of the Vowel Visual-Only condition versus the baseline condition.
Table 3-9 CD group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Visual condition versus the baseline condition.
Table 3-10 CD group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Visual condition versus the baseline condition.
Table 3-11 HA group: Summary of peak cortical activation produced by the contrast of the CVCV Visual-Only condition versus the baseline condition.
Table 3-12 HA group: Summary of peak cortical activation produced by the contrast of the Vowel Visual-Only condition versus the baseline condition.
Table 3-13 HA group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Visual condition versus the baseline condition.
Table 3-14 HA group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Visual condition versus the baseline condition.
Table 3-15 NH group: cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [x, y, z in MNI coordinates].
Table 3-16 CD group: cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [x, y, z in MNI coordinates].
Table 3-17 HA group: cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [x, y, z in MNI coordinates].
Table 3-18 NH group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Table 3-19 CD group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Table 3-20 HA group: subjects' hearing impairment levels, speech detection (reception) thresholds and word recognition test results.
Table 3-21 HA group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Table 3-22 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected].
Table 3-23 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected].
Table 3-24 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-25 HA group: active regions identified using the regression analysis for the CVCV Visual-Only condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-26 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-27 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (unaided - aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-28 HA group: active regions identified using the regression analysis for the CVCV Visual-Only condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-29 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (unaided - aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-30 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and speech detection/reception threshold (unaided - aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-31 HA group: active regions identified using the regression analysis for the CVCV Visual-Only condition and speech detection/reception threshold (unaided - aided) [F > 21.04, P < 0.001, uncorrected].
Table 3-32 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and speech detection/reception threshold (unaided - aided) [F > 21.04, P < 0.001, uncorrected].
Table 4-1 Goodness-of-fit and stability indices of SEM models for the NH, CD and HA groups: both null (constrained) and free (unconstrained) models for each hemisphere [P < 0.05 for model comparison (last column) represents a significant difference between the constrained and unconstrained models].
Table 4-2 Goodness-of-fit and stability indices of SEM models for the CVCV Visual-Only and CVCV Audio-Visual conditions: both null (constrained: CD = NH) and free (unconstrained) models for each hemisphere [P < 0.05 for model comparison (last column) represents a significant difference between the constrained and unconstrained models].
Table A-1 SEM Results for the NH group models (left and right hemispheres). Estimated path coefficients are shown for the CVCV Visual-Only and CVCV Audio-Visual conditions in the unconstrained model [*** = P < 0.001; ** = P < 0.01; * = P < 0.05].
Table A-2 SEM results for the CD (right hemisphere) model. Estimated path coefficients are shown for the CVCV Visual-Only and CVCV Audio-Visual conditions in the unconstrained model [*** = P < 0.001; ** = P < 0.01; * = P < 0.05].
Table A-3 SEM results for the CVCV Visual-Only condition models (right and left): estimated path coefficients for the NH and CD groups in the unconstrained model [*** = P < 0.001; ** = P < 0.01; * = P < 0.05].
Table A-4 SEM results for the CVCV Audio-Visual condition models (right and left hemispheres). Estimated path coefficients are shown for the NH and CD groups in the unconstrained model [*** = P < 0.001; ** = P < 0.01; * = P < 0.05].
Table A-5 DCM results for the NH (left) and HA (right) models: estimated path coefficients for intrinsic connections (A) and their posterior probabilities (pA), estimated modulatory effect values (B) and their posterior probabilities (pB), estimated coefficient for direct input connection (C) and its posterior probability (pC) [* = posterior probability >= 0.900].
1 Introduction
1.1 Audio-Visual Speech Perception (AVSP)
Human speech perception is surprisingly robust and adaptive to speaker and
environmental variables. Humans can adapt immediately to variations in speaking manner
and rate, as well as to accents and dysfluencies in speech. Even in environments with a very low
signal-to-noise ratio (SNR), as in crowded restaurants, humans are able to overcome
the background noise and attend to the speech of the speaker.
One of the fundamental underlying principles of speech communication that contributes to
the robustness of speech perception is the fact that it involves two sensory input systems. It
is widely known that during face-to-face conversations, humans not only perceive an acoustic
signal, but also use visual cues such as lip movements and hand gestures to decode the
linguistic message in the visual modality. In a landmark study, Sumby & Pollack (1954)
estimated that in noisy environments, making visual speech available to the perceiver can
increase the SNR by up to approximately 15 dB. There is also ample evidence that visual
cues increase speech intelligibility in noise-free environments (Arnold and Hill, 2001;
Reisberg et al., 1987).
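As a point of reference for the size of this benefit (a back-of-the-envelope illustration, not part of Sumby and Pollack's analysis), the signal-to-noise ratio in decibels is

    SNR (dB) = 10 log10( P_signal / P_noise ),

so a visual benefit equivalent to a 15 dB improvement in SNR corresponds to roughly a 10^(15/10), or about 32-fold, change in the signal-to-noise power ratio that would otherwise be needed to reach the same level of intelligibility.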
Another landmark study regarding audio-visual integration in speech perception was
performed by McGurk & MacDonald (1976). In this experiment, audio recordings of
consonant-vowel (CV) syllables were dubbed onto videos of a speaker producing a different
CV. This often resulted in the percept of a syllable that was not presented in either modality.
For example, when the subjects were exposed to an acoustic /ba/ and visual /ga/, most of
them classified the perceived CV pair to be /da/. When the two modalities were reversed, the
perceived CV pair was often reported as /bga/.
These perceived syllables are known as
fusion and combination percepts, respectively, and these phenomena are generally referred to
as the McGurk effect. This compelling demonstration of cross-modal interactions in speech
perception is quite robust. The McGurk effect was found to occur at different SNR levels,
although the size of the effect did increase with decreasing SNR of the acoustic speech (Sekiyama
and Tohkura, 1991). Even when observers were fully aware of the dubbing procedures, the
effect was observed. For example, Green et al. (1991) reported that when subjects attended
to stimuli with the face of a male speaker dubbed with the voice of a female speaker the
McGurk effect was still observed. In other McGurk studies, the effect was also shown to be
relatively insensitive to temporal (Green and Gerdeman, 1995; Munhall and Tohkura, 1998)
and spatial discrepancies (Jones and Munhall, 1997), where the auditory and visual signals
were either temporally mismatched (by up to approximately 240 ms) or spatially separated.
These experimental results make it evident that speech perception is inherently bimodal in
nature, and also suggest that humans are not only capable of processing bimodal speech
information, but almost always use both auditory and visual speech information whenever
they are available (even when the auditory signal is clear and intact). How the brain actually
integrates visual information with acoustic speech to create a percept in the listener's mind is
still poorly understood. Studying the details of the neurological processes that are involved
in audio-visual speech perception comprises the main part of this thesis research.
Additionally, the neural mechanisms associated with visual speech perception most likely
differ for individuals with congenital profound hearing impairment, that is, the absence of
auditory input from birth. Hence, the effects of hearing status on auditory-visual speech
perception are investigated in the current research as well.
As for how visual information might be integrated with acoustic speech, a number of models
have been proposed to account for the integration strategies of auditory-visual speech
perception and have been contrasted in various schemes.
Some of the commonly used
contrasting schemes are: early vs. late; auditory vs. articulatory; common format vs.
modality-specific; or language-general phonetic-level vs. language-specific phonemic-level.
These model classifications have some overlaps in terms of properties used for model
categorizations; most generally an audiovisual speech integration model is classified based
on "where" the integration takes place and "what" form the information takes at the point of
integration.
Early integration models combine acoustic and visual speech information before any
classification on the input is performed, whereas late integration models treat acoustic and
visual speech streams as two separate channels which are categorized separately and then
fused using some mechanism to give the final result. The format of information used in these
models determines whether the model uses a common format or modality-specific input for
speech classification. Many earlier models adopted the late integration strategy. For instance,
one of the earliest models known as VPAM (Vision:Place, Acoustic:Manner) (Summerfield,
1987) postulates that the visual signal is used to determine the place of articulation and the
acoustic signal is used to identify the manner of articulation of a phoneme. This hypothesis
was disproved, and Summerfield et al. (1989) subsequently formulated four metrics of AV
speech integration: (1) the filter function of the vocal tract is estimated by integrating two
separate estimates obtained from auditory and visual signals, (2) the acoustical parameters of
the speech waveform and the visible shape of the mouth are simply concatenated to form a
vector, (3) a 3D vocal tract configuration is computed where the visual speech is used to
estimate the configuration of the front of the vocal tract and the acoustic signal is used to
estimate the back configuration, (4) modality-free information, in particular the articulatory
dynamics are derived from each modality.
Subsequently, Robert-Ribes et al. (1995) reformulated these metrics into: (1) direct integration, (2) separate integration, (3) dominant recoding, and (4) motor space recoding (for a review, see Robert-Ribes et al., 1995). They further argued that the above four metrics or models are inconsistent with the experimental data,
and proposed a more complex model called the Timing Target model.
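To make the early versus late distinction introduced above concrete, the following schematic sketch contrasts the two strategies. It is only an illustration of the taxonomy, not of any particular model discussed in this chapter; the feature vectors, prototype dictionaries and inverse-distance similarity score are all hypothetical placeholders.

    import numpy as np

    def modality_scores(features, prototypes):
        # Similarity of an input feature vector to each phoneme prototype
        # (a simple inverse-distance score; illustrative only).
        return {ph: 1.0 / (1.0 + np.linalg.norm(features - proto))
                for ph, proto in prototypes.items()}

    def early_integration(audio_feat, visual_feat, av_prototypes):
        # Early integration: fuse the raw audio and visual features first,
        # then perform a single classification on the combined representation.
        fused = np.concatenate([audio_feat, visual_feat])
        scores = modality_scores(fused, av_prototypes)
        return max(scores, key=scores.get)

    def late_integration(audio_feat, visual_feat, a_prototypes, v_prototypes):
        # Late integration: evaluate each modality against its own prototypes,
        # then fuse the two sets of per-category evidence (here, by multiplication).
        a_scores = modality_scores(audio_feat, a_prototypes)
        v_scores = modality_scores(visual_feat, v_prototypes)
        fused = {ph: a_scores[ph] * v_scores[ph] for ph in a_scores}
        return max(fused, key=fused.get)

In the early scheme a single set of audio-visual prototypes operates on the fused features; in the late scheme each modality is classified against its own prototypes and only the resulting evidence is combined.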
Another notable model is the Fuzzy Logical Model of Perception (FLMP), which was
developed by Massaro (Massaro et al., 1986).
The FLMP is based on a late-integration
strategy in which speech inputs are matched separately against unimodal phonetic prototypes
and a truth value is calculated for each modality, where the truth values represent the
likelihood of a hypothesis given some observed data.
The separate classifications (the
computed truth values) are subsequently combined using the fuzzy-logic multiplicative AND
rule. Here, the integration is assumed to occur at a post-phonetic level.
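For concreteness, the multiplicative combination at the heart of the FLMP can be written as follows (stated in its commonly cited form; the notation here is ours): if a_i denotes the degree of auditory support and v_i the degree of visual support for response alternative i, with both values in [0, 1], the model predicts the probability of responding i as

    P(i | A, V) = (a_i * v_i) / sum_j (a_j * v_j),

that is, the two truth values are multiplied (the fuzzy AND) and then normalized over all candidate prototypes.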
One of the aforementioned contrasting schemes for model classification is language-general
phonetic-level integration and language-specific phonemic-level integration. This distinction
is one of the most thoroughly studied issues in audiovisual speech integration, since it is
mainly concerned with identifying the stage at which audiovisual speech is integrated, and
determining whether audiovisual speech integration is independent of speech learning or
some working knowledge of speech and specific language experience.
In other words,
establishing "where" in the process integration occurs, and deciding whether multimodal
speech integration is a learned or innate skill are central issues in research on audiovisual
speech perception.
Since these issues are concerned primarily with the developmental processes of multimodal
speech perception, results from infant studies and cross-language studies play critical roles in
attempting to resolve them. As a corollary to the motor theory of speech perception
(Liberman and Mattingly, 1985), some researchers propose that there exists an innate link
between speech production and perception for facilitating audiovisual speech perception, and
that audiovisual speech perception is a specifically built-in functional module in humans; this is sometimes known as the "speech is special" theory (Liberman and Mattingly, 1989;
Mattingly and Studdert-Kennedy, 1991). On the other hand, there is an alternative argument
for a role of experience in audiovisual speech perception: that the McGurk effect merely
reflects learned integration. McGurk and MacDonald (1976) generally found that children
showed smaller McGurk effects than their adult subjects, leading to the inference that
the McGurk effect is a result of linguistic experience. In either case, the existence of the
McGurk effect in children motivated further investigation of infants to identify how and when audiovisual speech perception develops in humans. If there is a specialized speech-processing module, one would predict that formation of an integrated speech percept
precedes the independent perception of the auditory and visual speech information
(Summerfield et al., 1989), and the integrated percept should be observable in early infancy,
whereas if the audiovisual speech integration is a learned skill, integrated speech percepts
will not be evident in infants.
In terms of audiovisual speech integration models, language-general, phonetic-level speech
perception models would predict that auditory-visual integration such as the McGurk effect
will occur in early infancy whereas language-specific, phonemic-level models would predict
that integrated speech percepts will not be evident early in life and that speech integration becomes possible only once the child has learned some language-specific phonemic prototypes. Infants are known to learn mouth movements very early (Meltzoff, 1990), and even show interest in matching properties of auditory and visual speech as early as 10 weeks after birth (Dodd, 1979). Rosenblum et al. (1997) tested for the McGurk effect in pre-linguistic infants (5-month-old, English-exposed) and found that infant subjects were visually influenced in a way
similar to English-speaking adults. In a similar study, Burnham and Dodd (2004) further
investigated the McGurk effect in 4.5-month-old prelinguistic infants in a habituation-test
paradigm and found that they demonstrated the McGurk effect and therefore were integrating
auditory and visual speech. These results lead to the inference that the McGurk effect does
not reflect learned integration, and that infants integrate heard and seen speech as early as 5
months after birth. While these studies support the idea that infants do integrate heard and
seen speech, they do not demonstrate that this integration is strong or that it is necessary for
speech perception. To determine whether auditory-visual speech integration is mandatory for
infants, Desjardins and Werker (2004) tested 4- to 5-month-old infants in three habituation
experiments. They reported that the interpretation of integration was partially supported and
concluded that an initial mechanism for speech perception supports audiovisual integration,
but that integration is not mandatory for young infants. As mentioned, there also is evidence
of a developmental component in auditory-visual speech perception as well as in integration.
The original McGurk study actually tested preschoolers, school children, and adults on the
McGurk effect.
The amount of visual influence on the responses was found to be
significantly greater in adults than children.
Massaro, Thompson, and Laren (1986) and
Hockley and Polka (1994) also reported that the degree of visual influence in auditory-visual
speech perception was stronger in adults than children.
Additionally, differences between adults across languages can be observed to study the
influence of specific language experience on auditory-visual speech integration. Sekiyama &
Tohkura (1991) showed that the McGurk effect was weaker in native Japanese speakers (with
Japanese talker stimuli) than in English speakers (with English talker stimuli). The subjects
also showed a stronger McGurk effect for lower SNR levels.
In a subsequent study,
Sekiyama & Tohkura (1993) tested Japanese and English speakers with both Japanese and
English talker stimulus sets. In this study, both groups showed a stronger McGurk effect for
non-native talker stimuli. However, Japanese participants showed less effect overall than did
English participants. Similar results were obtained in a McGurk study with Chinese speakers
(Sekiyama, 1997), where the effect size was also shown to be smaller in Chinese speakers
than English speakers.
These results from cross-language studies support the view that
audiovisual speech perception is influenced by experience with a particular language, in
addition to having a language-independent component. Although the McGurk effect seems
to be affected by language proficiency, it has been shown to exist across different languages,
demonstrating the automaticity of audiovisual speech. Rosenblum (2005) further proposed
that multimodal speech perception is the primary mode of speech perception in humans and
that it is not a function that is piggybacked on top of auditory speech. Rosenblum speculates
that if the primary mode of speech perception is indeed multimodal in nature, then there
should be evidence for multimodal speech in the evolution of language.
A recent study by
Ghazanfar and Logothetis (2003) provides evidence for an influence of audiovisual
correspondences in vocal calls. They found that rhesus monkeys showed sensitivity to cross-modal correspondence; this result, however, does not necessarily mean that there is cross-modal integration. In an attempt to examine audiovisual integration in primates, some research groups are currently conducting McGurk-effect studies in rhesus monkeys.
Results from such studies should shed light on influence of the visual modality on the
evolution of spoken language.
To summarize, although no one particular view discussed in this section has been completely
proved or disproved, most researchers agree that speech sensory integration is carried out at
an early stage of processing.
Many theories have been developed to account for where
integration occurs, and these theories propose various stages of audiovisual speech
processing for integration. The complete scope of the literature on this topic is too vast to be
covered in this dissertation; however, for the current purposes most evidence is consistent
with the idea that integration occurs at a stage prior to phonetic classification.
Turning to "what" - the domains of the audiovisual speech information at the stage of
integration, one of the most debated discussions in speech perception concerns whether the
perceptual primitives of speech are auditory or articulatory in nature. Despite the lack of
concrete evidence suggesting that a particular fusion theory best models auditory-visual
speech perception, observations from audiovisual studies and modeling have had important
influences on theories of speech perception. Particularly, the McGurk effect studies, along
with other visual speech research, have contributed to this discussion.
For example,
Liberman and Mattingly (1985) proposed the motor theory of speech perception, in which the listener is thought to recover the speaker's intended phonetic gesture, and the primitives of the speech perception function are articulatory in nature. According to this theory, the auditory and visual speech inputs are both mapped to the articulatory space to provide the
observer with information about the motor act of speaking and these transformed signals are
integrated to form a final speech percept. They argue that findings from audiovisual speech
perception are consistent with their motor theory and cited observations about the
automaticity of audiovisual speech as evidence supporting the concept of gestural primitives.
At the other end of the spectrum, auditory theories assume that visual speech influences the
auditory percept formed by processing acoustic speech.
This influence on the auditory
percept by visual input occurs at a different stage, depending on whether the model
incorporates early or late integration.
In general, visual speech signals can play three vital roles in speech perception. These roles
are: 1) attention, 2) redundancy, and 3) complementarity.
Visual signals can help the
perceiver to identify who the speaker is (attention) and provide additional information
through speechreading (complementarity). Even when the auditory signal is clear and intact,
studies show that visual information is still used (redundancy). While the psychophysical
aspects of audio-visual speech integration have been studied widely, relatively little was
known about the neural processes involved until recently. A number of investigators have
hypothesized that the McGurk effect is a result of visual information (a face that is mouthing
syllables or words) biasing the subject's auditory percept toward the auditory signal that
would normally accompany the viewed face; consequently, it is expected that there are neural
pathways between visual and auditory cortical areas in which the visual signals somehow
alter the auditory maps. Supporting this view of auditory-visual speech perception, fMRI
studies have shown that viewing a speaking face in the absence of acoustic stimuli activates
the auditory cortex in normal hearing individuals (Calvert et al., 1997).
1.2 Neuroimaging Studies of AVSP and Hearing Status
Following Calvert's study in 1997, a handful of neuroimaging studies have investigated the
neural circuitry involved in auditory-visual speech perception (Burton et al., 2005;
MacSweeney et al., 2002b; Pekkola et al., 2006; Skipper et al., 2005; Surguladze et al., 2001;
Wright et al., 2003). However, only a small portion of these studies involved subjects with
hearing impairment. MacSweeney et al. (2001) specifically addressed the neural circuitry of
speechreading in deaf and normally hearing people by observing brain activation during
silent speechreading of numbers for both groups of volunteers. The deaf participants in that
study were congenitally profoundly deaf, but had hearing parents and had attended
mainstream schools or 'oral' schools for the deaf, where training on speechreading and
written English were emphasized. Other neuroimaging findings regarding neural aspects of
auditory-visual modality interactions in deaf people come from studies of the effects of
simple visual motion processing of moving dots (Fine et al., 2005), grammatical and
emotional facial expressions related to sign language processing (Gizewski et al., 2005;
MacSweeney et al., 2004), verbal working memory (Buchsbaum et al., 2005), auditory
deprivation in general (Finney et al., 2001), and sign language comprehension in deaf native
signers (MacSweeney et al., 2006; Sakai et al., 2005).
To our knowledge, no functional neuroimaging study has directly compared the neural activity underlying audio-visual speech perception in congenitally deaf native signers with that of normally hearing individuals.
The primary goal of this dissertation research is to investigate the effects of hearing status on
cortical activation in relation to audio-visual integration in speech perception. FMRI was
used to study the brain activity underlying auditory-visual speech perception in hearing
impaired and normally hearing individuals. Unlike MacSweeney et al. (2002a; 2001), we
did not restrict our deaf group to those who had hearing parents and received speech-based
training in their school years. Instead, our study included two separate groups of hearing
impaired subjects: (1) congenitally deafened signers who do not use hearing aids and who
have ASL as their native language, and (2) hard-of-hearing individuals with hearing aids.
It is likely that the amount of exposure to acoustic speech plays a significant role in
development of the neural mechanisms that underlie audio-visual speech perception.
However, an individual who is congenitally deaf and regularly wears hearing aids may or
may not rely on acoustic cues during speech perception. The extent to which one uses
acoustic information in speech processing depends on various factors such as hearing
threshold, aided threshold, primary mode of communication, benefit gained from using
hearing aids, and so on. To get a measure of how much of the acoustic information is
available and is utilized by hearing impaired participants, a number of audiological tests were
also performed. We hypothesized that the measured benefit values of hearing aids and the
speechreading abilities of this group of subjects would have significant correlations with the
activation patterns obtained during audio-visual speech perception tasks.
1.3 Neural Connectivity
The primary objective of functional neuroimaging is to characterize brain areas in terms of their functional specialization.
However, this approach reveals no information
concerning how different brain regions are connected and exchange information with one
another during the experimental tasks. Thus one of the aims of the current study was to
examine the connectivity of the network of cortical areas involved in audio-visual speech
perception.
The neural mechanisms involved in the convergence of auditory and visual
speech information are still largely unknown. In particular, the primary neural pathways
involved in transforming auditory and visual information into a unified phonological percept
are still poorly understood. Since the end product is a single speech percept, it seems very
likely that two separate projections from auditory and visual input modalities converge at
some point in the brain.
In the present study, we supplemented voxel-based activity analyses (described in Chapter 2
and 3) with effective connectivity analyses (Chapter 4) designed to identify important
cortico-cortical pathways involved in audio-visual speech integration.
1.4 Goals
The overall goal of this research was to acquire a more complete picture of the neurological
processes of visual influences on speech perception and to investigate effects of hearing
status on audio-visual speech perception. More specifically, functional magnetic resonance
imaging (fMRI) was used to investigate the brain activity underlying audio-visual speech
perception in three groups of subjects: (1) normally hearing, (2) congenitally deafened
signers (American Sign Language) who do not use hearing aids, and (3) congenitally hearing
impaired individuals with hearing aids.
The influence of visual cues on auditory cortical areas was investigated by characterizing the
modularity and the network of cortical areas underlying visual-auditory associations. This
part of research involved fMRI studies on all subject groups using two different types of
visual speech stimuli (mouthing words with rapid transitions, specifically CVCV bisyllable
utterances, vs. mouthing single vowels). For this part of the study, the following questions
were addressed and associated hypotheses were tested.
1. What are the brain pathways and areas underlying visual-auditory associations? One of
our initial hypotheses was that there is a projection (labeled "c" in Figure 1-1) from
visual cortical areas to the inferoposterior temporal lobe (Brodmann's area (BA) 37; refer to
Figure 1-1) and/or the angular gyrus (BA 39; pathway "a"), as well as pathways from the
inferoposterior temporal lobe and angular gyrus to auditory cortical areas (BA 22, 41, 42;
pathways "b" and "d"). To test this hypothesis, we identified the network of cortical
areas responsible for auditory-visual speech perception. The neural sites responsible for
processing auditory and visual sensory inputs are known; however, no site has yet been
confirmed for processing and integrating multisensory inputs.
Figure 1-1 Hypothesized projections from visual to auditory cortical areas
2. How modular are visual influences on auditory cortical areas? It is widely known that
different auditory cortical areas are sensitive to different kinds of auditory input. Our
hypothesis was that auditory cortical areas would be sensitive to different kinds of visual
speech input depending on the type of associated auditory input.
In addition, the second major objective of this research was to focus on exploring differences
in speech perceptual processes between normal-hearing and hearing impaired individuals.
Specifically, the effects of hearing status on cortical activation in relation to audio-visual
integration in speech perception were studied. Hence, the third question is:
3. How does hearing status affect the auditory-visual interactions of speech perception?
Here, cortical activation levels are compared across the three groups of subjects to study
the differences in activation and neural pathways, and to test the hypothesized brain
mechanisms that underlie auditory expectations of visible speech. We hypothesized that
the congenitally deaf individuals would show stronger visually induced activation in
auditory cortical areas than both hearing aid users and normal-hearing individuals, since
they rely more on vision for speech communication and auditory deprivation over their
lifetimes would have allowed parts of temporal cortex to become specialized for visual speech.
We also investigated whether different pathways in the speechreading cortical network
would be recruited to process visual speech in the three groups of subjects, and we examined
questions 1 and 2 for the two hearing impaired subject groups as well.
2 Experimental Methods and Data Analysis
2.1 Subjects
There were three groups of subjects: normal-hearing native speakers of American English
(NH), congenitally deafened signers of ASL (CD) and individuals with congenital hearing
loss who regularly wear hearing aids (HA). Each group consisted of twelve right-handed
adults (6 males, 6 females) between the ages of 18 and 60 years old with no history of
language or other neurological disorders (other than hearing impairment for the CD and HA
groups). The CD group consisted of congenitally profoundly deaf signers who had binaural
hearing loss of greater than 90 dB. They had acquired ASL as their primary language and
were exposed to ASL between birth and 4 years, either because they had deaf signers in the
family or because they had learned it in a school. In addition, they had learned American
English as their second language.
The hearing aid group consisted of individuals with
congenital hearing loss (ranging from moderate-severe to profound, i.e., greater than 60 dB
hearing loss) who had worn hearing aids (either monaural or binaural) on a daily basis for
20 years or more. The HA subjects were not required to be proficient in ASL.
The NH and HA groups were age and gender matched to the Deaf group.
The three subject groups were formed such that we could sample different points on the
"exposure to acoustic speech" spectrum shown below. Since the NH group has no hearing
impairment, they have the most exposure to acoustic speech in their lifetime, whereas the CD
subjects have the least amount of exposure (since they have the greatest hearing loss, and
have not used any aids).
The HA group was formed to include individuals who lie
somewhere in between the two extreme ends on this spectrum. Hence, our subject pool for
the HA group has the greatest amount of variability in terms of the amount of exposure to
acoustic speech in their lifetime.
(Increasing amount of exposure to acoustic speech: CD < HA < NH)
Subjects were paid for their participation and gave written informed consent for experimental
procedures, which were approved by the Boston University, Massachusetts Institute of
Technology and Massachusetts General Hospital committees on human subjects.
2.2 Tests Conducted
For each subject group, the following two tests were conducted:
(1) FMRI experiment for audio-visual speech perception tasks (Section 2.3),
(2) Test of speechreading English sentences (Section 2.4).
For the HA group only, we also conducted a battery of audiological and speech tests (refer to
Section 2.5) in addition to the two tests listed above.
2.3 Functional Magnetic Resonance Imaging Experiment
Stimuli and Tasks
The speech stimuli were vowels and CVCV syllables that were presented in separate
acoustic-only, visual-only or audio-visual blocks. There were therefore six experimental
conditions (Vowel Audio-Only, Vowel Visual-Only, Vowel Audio-Visual, CVCV Audio-Only,
CVCV Visual-Only, CVCV Audio-Visual) and one control condition consisting of
viewing a blank screen without any audio input. The stimuli were spoken by one female
native English speaker and were digitally recorded using a camcorder. The visual stimuli
were edited to only include the lower half of the speaker's face (Figure 2-1). The stimulus
duration was 1.6777 s, and 13 stimuli were randomly presented in 30-second blocks.
Figure 2-1 A still image of the video clip stimulus.
The task for the subjects was to silently identify the speech sounds they were hearing and/or seeing; they were not required to report their responses.
Data Acquisition and Analyses
The protocol used for all subject groups was exactly the same. The magnet parameters for the fMRI scans were:
(1) Protocol (e.g., EPI): Gradient Echo EPI
(2) TR: 3 s
(3) Orientation: Axial
(4) Number of slices: 32
(5) Slice thickness: 5 mm
(6) Gap: 0 mm (dist factor: 0)
The stimuli were presented in a block design (Figure 2-2). Each subject's data set consisted
of images collected during ten separate 4-minute-long runs. A run included eight 30-second
blocks. In Figure 2-2, each color represents a different block type. Here, the white block
represents the control block and it is the only type of block that was presented twice in a run.
Figure 2-2 Block-design paradigm: a typical run (time axis: 0-240 s in 30-s blocks).
Within each block, excluding the control block, each of the 7 different stimulus sets was presented
13 times. For a given subject in a given run, the run sequence was one of the following
(where the numbers represent the block types 1-7 listed above):
A. 2-3-4-6-1-7-5-7
B. 6-5-7-2-7-4-3-1
C. 4-7-3-5-6-2-7-1
D. 3-7-4-6-5-7-1-2
E. 6-4-7-2-3-1-7-5
F. 4-7-3-1-6-5-2-7
G. 2-5-1-3-7-6-7-4
H. 3-2-6-7-1-5-7-4
I. 2-1-4-7-3-5-6-7
J. 2-4-5-7-6-7-1-3
Each subject performed seven to ten 4-minute functional runs (each using one of the run
sequences A to J listed above), each consisting of eight 30-second blocks: one for each
experimental condition and two control blocks. Blocks were pseudo-randomly permuted in
different runs, with the requirement that the two control blocks never occurred consecutively
in a run.
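The constraint on block ordering can be made concrete with a short sketch. The following Python snippet is illustrative only (the actual randomization procedure used to generate sequences A-J is not described here); it draws an 8-block run order in which each experimental block type appears once and the control block appears twice, rejecting any order that places the two control blocks back to back. It assumes, as the listed sequences suggest, that block type 7 denotes the control block.

```python
import random

def make_run_sequence(n_block_types=7, control_block=7, seed=None):
    """Draw one 8-block run order: each block type once, plus the control
    block a second time, with the two control blocks never adjacent."""
    rng = random.Random(seed)
    blocks = list(range(1, n_block_types + 1)) + [control_block]
    while True:
        rng.shuffle(blocks)
        # reject orders in which the two control blocks occur back to back
        adjacent = any(a == control_block and b == control_block
                       for a, b in zip(blocks, blocks[1:]))
        if not adjacent:
            return blocks

if __name__ == "__main__":
    for run in range(3):
        print(make_run_sequence(seed=run))
```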
In summary, the stimulus presentation protocol is characterized as follows:
* Presentation protocol: block design
* Total number of runs per subject: 10
* Total run length in seconds: 240 s
* Total number of blocks per run: 8
* Total block length in seconds: 30 s
* Number of different block types: 7
* Total number of stimuli per block: 13 (except control blocks, with no stimuli)
* Total number of stimulus presentations per run: 80
* Stimulus duration: 1677 ms
* Control trial type: silence with a blank screen (i.e., no stimulation)
Data were obtained using a 3 Tesla Siemens Trio whole-body scanner with a Bruker head
coil. T2*-weighted functional images of the entire cortex were collected. Thirty-two axial
slices (5 mm thickness, 0 mm inter-slice gap, 64 x 64 matrix, 3.125 mm² in-plane resolution)
aligned parallel to the anterior-posterior commissure line were acquired using a gradient echo
echo-planar imaging sequence with a repetition time of 3 s, flip angle of 90° and echo time of
40 ms. In a single run, 80 volumes were obtained following three dummy images. Individual
functional runs were realigned using rigid-body transformations to the first image in each
scan, then co-registered with a high-resolution anatomical T1-weighted volume for each
subject (128 sagittal images, 1.33 mm slice thickness, 256 x 256 matrix, 1 mm² in-plane
resolution, TR = 2530 ms, TE = 3.3 ms, flip angle 90°).
Image volumes were pre-processed and analyzed with the SPM2 software package
(http://www.fil.ion.ucl.ac.uk/spm/; Wellcome Department of Imaging Neuroscience, London,
UK).
In the pre-processing stage, functional series were realigned using a rigid-body
transformation, then co-registered to the high-resolution structural scans, normalized into the
Montreal Neurological Institute (MNI) space (Evans et al., 1993), and finally smoothed with
a Gaussian filter (full width at half maximum of 12 mm). Both fixed-effects and random-effects
(i.e., mixed-effects) analyses were employed for voxel-based analyses of the pre-processed
image volumes. For the group analyses, the mixed-effects model was applied with
False Discovery Rate (FDR) error correction for multiple comparisons and a p-value threshold
of 0.05. The Automated Anatomical Labeling (AAL) toolbox (Tzourio-Mazoyer et al., 2002)
was used to identify labels for active clusters in averaged activation maps. The resulting
statistical maps were projected onto the pial cortical surfaces created by FreeSurfer (Dale et
al., 1999; Fischl et al., 1999) and the canonical SPM brain.
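As an illustration of the FDR step, the sketch below implements the Benjamini-Hochberg procedure for choosing a voxel-wise p-value cutoff at q = 0.05. This is a minimal stand-alone version for intuition only; SPM2's own FDR correction is what was applied to the actual statistical maps and may differ in detail. The p-value map in the example is random placeholder data.

```python
import numpy as np

def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg cutoff: largest p_(k) with p_(k) <= (k / n) * q.
    Returns None if no voxel survives."""
    p = np.sort(np.asarray(p_values).ravel())
    n = p.size
    below = p <= (np.arange(1, n + 1) / n) * q
    if not below.any():
        return None
    return p[below.nonzero()[0].max()]

# Example: threshold a map of voxel-wise p-values at q = 0.05
p_map = np.random.uniform(size=(64, 64, 32))        # placeholder p-values
cutoff = fdr_threshold(p_map, q=0.05)
significant = (p_map <= cutoff) if cutoff is not None \
    else np.zeros_like(p_map, dtype=bool)
print("FDR cutoff:", cutoff, "suprathreshold voxels:", int(significant.sum()))
```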
2.4 Speechreading Test
Although the majority of deaf people are born to hearing parents, their lifetime exposure to
visible speech differs widely. Consequently, some deaf people become quite efficient at
speechreading, while others speechread at a level below the average of normally hearing
people. It is generally assumed that the speechreading skill of a deaf individual depends
heavily on the education system and the communication training received from infancy
through childhood. To evaluate how well the participants could use visual cues alone to
understand speech, a test of speechreading English sentences was administered to all
participants.
The speechreading test involved subjects watching video clips (without any auditory cues) of
a female speaker uttering common English sentences. The stimuli used for the speechreading
test were video clips of 100 spoken English sentences. These sentences were selected from
the Central Institute for the Deaf (CID) Everyday Sentences Test (Erber, 1979) and were
spoken by one female native English speaker. The participants were presented with one
sentence at a time on a computer screen and were asked to type or write down the word(s)
they were able to speechread using only the information available in the speaker's visual
speech.
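A minimal sketch of how such responses could be scored is given below. The thesis does not spell out the scoring rule at this point, so the word-level percent-correct measure and the function name are assumptions for illustration (keyword scoring of CID sentences, for instance, would differ).

```python
def speechreading_score(responses, targets):
    """Percent of target words correctly reported across all sentences.
    Assumes a simple case-insensitive word match; the actual scoring rule
    may differ (e.g., keyword scoring)."""
    correct = total = 0
    for resp, target in zip(responses, targets):
        target_words = target.lower().split()
        resp_words = set(resp.lower().split())
        total += len(target_words)
        correct += sum(word in resp_words for word in target_words)
    return 100.0 * correct / total if total else 0.0

# Example with two hypothetical sentences
print(speechreading_score(["walking is good", ""],
                          ["walking is good exercise", "watch out for the car"]))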
2.5 Audiological and Speech Tests for Hearing Aid Users
In addition to the speechreading test, a battery of audiological and speech tests was
conducted for the HA group to quantify how much acoustic information was available and
utilized by each HA participant. Some of the audiological tests were conducted twice for HA
subjects - with and without their hearing aids - to measure the benefit gained from hearing
aids. The test results were used along with data collected from the fMRI experiment and
speechreading test for regression analyses.
The following data were collected from the HA group:
1. Speechreading test: See section 2.4 for details.
2. Otoscopy: This procedure involves examining the ear canal with an otoscope, which
is part of standard clinical testing in audiology. The otoscopic check was done before
placing an insert earphone, to ensure that no foreign body, active infection, or other
contraindication was present in the ear canal.
3. Audiometry: A number of hearing tests were conducted, including pure-tone air
and bone conduction audiometry under headphones or using insert earphones
(without any hearing aids), and "aided" audiometry with warble tones (between 500 Hz
and 4 kHz) in the sound field.
4. Speech Reception Threshold (SRT): The SRT test was done using spondaic words
presented monaurally under headphones, and aided in the sound field. In cases where
the unaided sound-field SRT exceeded the limits of the audiometer, for example when
the subject's hearing loss was too severe or English proficiency was limited, the
Speech Awareness Threshold test was used instead.
5. Word recognition test: We presented a list of the Northwestern University Auditory
Test No.6 (NU-6) words in two conditions: (1) monaurally under headphones and (2)
aided in the sound field, and asked the participants to identify the words and write
down their responses so that whole-word and/or phonemic scoring could be
performed. In some cases, the audiologist wrote down spoken responses from either
the participant or interpreter.
6. Audio-visual speech recognition test: To assess the synergistic effect of combined
auditory plus visual information, we used another videotaped test, the City University
of New York (CUNY) Sentence Test. In this task, the participants were asked to take
the test twice - 15 sentences per condition, with and without hearing aids. It was
expected that the participant would derive information from both the speaker's face
and also from the auditory signal as amplified by the hearing aid. This would be
reflected by the percent correct scores which might be much higher than would be
expected from the simple addition of the results from the visual-alone plus auditoryalone tasks.
7. Abbreviated Profile of Hearing Aid Benefit (APHAB) Questionnaire: This is a
widely used questionnaire consisting of 24 items. Through their responses,
participants with hearing loss report the amount of difficulty they have with
communication or with noises in various everyday situations, both in the unaided
condition and when using amplification with aids. Hearing aid benefit can be
computed by comparing the reported amount of difficulty in these two conditions
(see the sketch following this list). The APHAB produces scores for four scales:
general ease of communication, difficulty with reverberation, problems in the
presence of background noise, and aversiveness to sound.
8. ASL Reception Test: The American Sign Language Assessment Instrument (ASLAI)
(http://www.signlang-assessment.info/eng/ASLAI-eng/aslai-eng.html) was developed at
the Center for the Study of Communication and the Deaf at Boston University
(Hoffmeister, 1994; 1999). This test was used to quantify the language skills of the
HA subjects who use ASL. There are eight subtests in the ASLAI, but we used only
the Synonyms and Antonyms subtests, which were sufficient to verify our subjects'
basic ASL competence. These tests were performed on the computer and were
presented in multiple-choice format, which allowed a quick assessment of ASL
vocabulary.
9. English Proficiency Test: The Test of Syntactic Abilities (TSA) was administered to
evaluate our HA subjects' English proficiency with basic grammatical forms such as
question formation, negation and relative clauses. People with congenital hearing
loss who use ASL as their primary mode of communication may have limited
proficiency with complex grammatical forms of English, because ASL is a completely
different language from English. The results from this test therefore help to estimate
the extent to which a participant is proficient in English.
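The benefit computation referred to in item 7 can be sketched as follows. The subscale abbreviations and the example scores are illustrative only; the sketch simply assumes that per-subscale percent-of-problems scores have already been derived from the questionnaire in the unaided and aided conditions.

```python
# Minimal sketch of APHAB-style benefit (unaided minus aided problem scores),
# assuming per-subscale percent-of-problems scores are already available.
APHAB_SCALES = ("EC",   # ease of communication
                "RV",   # reverberation
                "BN",   # background noise
                "AV")   # aversiveness of sounds

def aphab_benefit(unaided, aided):
    """Benefit per subscale: reported difficulty without aids minus with aids.
    `unaided` and `aided` map subscale name -> percent of problems (0-100)."""
    return {scale: unaided[scale] - aided[scale] for scale in APHAB_SCALES}

# Hypothetical subject: large benefit in quiet, little change in noise
print(aphab_benefit(unaided={"EC": 70, "RV": 65, "BN": 80, "AV": 30},
                    aided={"EC": 25, "RV": 40, "BN": 70, "AV": 45}))
```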
2.6 Correlation Analyses
To distinguish areas that were more specifically associated with the extent of speechreading
ability or the amount of acoustic speech exposure, regions of cortical activity (obtained from
the fMRI analyses) were identified that correlated with the data collected from the
psychophysical tests. A single-subject analysis was performed on each individual subject's
fMRI data to generate T-contrast activation maps for that subject. These T-contrast
activation maps were then used in simple regression analyses with the psychophysical
measures, such as the speechreading test scores, as covariates. The F-contrast maps
showing the regions with statistically significant correlations with the psychophysical
measures were obtained at p < 0.001, uncorrected (in some cases, p < 0.005, uncorrected,
was used to better locate the regions with significant correlations).
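For intuition, the voxel-wise relationship with a behavioural covariate can be sketched as below. This is not the SPM implementation (which fits a general linear model per voxel and reports T- and F-contrasts); it is a simplified correlation-based stand-in with hypothetical data and an assumed helper function name.

```python
import numpy as np
from scipy import stats

def covariate_regression(contrast_maps, covariate):
    """Voxel-wise simple regression of per-subject contrast values on a
    behavioural covariate (e.g., speechreading score). `contrast_maps` has
    shape (n_subjects, n_voxels); returns correlation r and p per voxel."""
    n_voxels = contrast_maps.shape[1]
    r = np.empty(n_voxels)
    p = np.empty(n_voxels)
    for v in range(n_voxels):
        r[v], p[v] = stats.pearsonr(contrast_maps[:, v], covariate)
    return r, p

# Example: 12 subjects, 1000 voxels, random data for illustration only
maps = np.random.randn(12, 1000)
scores = np.random.rand(12) * 100          # hypothetical speechreading scores
r, p = covariate_regression(maps, scores)
print((p < 0.001).sum(), "voxels below p < 0.001 (uncorrected)")
```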
2.7 Effective Connectivity Analyses
To further investigate the cortical interactions involved in auditory-visual speech perception,
we performed effective connectivity analyses on data collected from the fMRI experiment.
The details of the methods implemented and the results obtained are described in Chapter 4.
3 Study Results
Results obtained from standard fMRI analyses and correlation analyses are presented in this
section. Each of the six experimental conditions (Vowel Audio-Only, CVCV Audio-Only,
Vowel Visual-Only, CVCV Visual-Only, Vowel Audio-Visual, and CVCV Audio-Visual)
was compared with the control (baseline) condition to obtain averaged activation maps. For
all three subject groups, figures displaying activation patterns for six experimental conditions
are shown along with tables listing the labels of active regions (Section 3.1). All tables in
this section list clusters and peaks sorted by normalized effect size; cluster peaks were
separated by a minimum of 4 mm, and no more than five peaks are reported for each
cluster.
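The peak-reporting rule can be expressed compactly as in the following sketch, which is illustrative rather than the exact routine used to generate the tables; the coordinates and effect sizes in the example are hypothetical.

```python
import numpy as np

def select_peaks(coords_mm, effects, min_dist=4.0, max_peaks=5):
    """Pick up to `max_peaks` peaks within one cluster, in descending order of
    effect size, keeping only peaks at least `min_dist` mm from those already
    chosen (mirroring the reporting rule used for the tables)."""
    order = np.argsort(effects)[::-1]          # strongest peaks first
    chosen = []
    for i in order:
        if all(np.linalg.norm(coords_mm[i] - coords_mm[j]) >= min_dist
               for j in chosen):
            chosen.append(i)
        if len(chosen) == max_peaks:
            break
    return chosen

# Example: candidate local maxima in one cluster (MNI coordinates in mm)
coords = np.array([[62., -14., 4.], [64., -20., 4.], [62., -16., 6.]])
effects = np.array([8.7, 8.6, 8.5])
print(select_peaks(coords, effects))       # keeps the first two, drops the third
```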
Standard fMRI analysis results are followed by plots of speechreading test scores for each
subject group (Section 3.2). The six subjects with the highest scores on the speechreading
test were classified as Good speechreaders, and the six subjects with the lowest scores were
assigned to the Poor speechreader subgroup. Activation patterns comparing Poor and Good
speechreaders were also acquired and are shown in Section 3.2.
Finally, correlations between study participants' speechreading skills (and other covariate
measures for the HA group) and the amplitude of cortical activation during the CVCV
Visual-Only condition were examined for each group, and regions with significant
correlation were identified. These analyses are presented in Section 3.3.
3.1 Results from Standard FMRI Analyses
3.1.1 Normal Hearing (NH)
Figures 3-1, 3-2, and 3-3 show the averaged activation maps for the NH group during Audio-Only, Visual-Only, and Audio-Visual conditions, respectively, for both CVCV and Vowel.
Activations for these six experimental conditions contrasted with baseline are summarized in
Tables 3-1 to 3-6. Brief summaries of active areas for these contrasts are listed below.
* Audio-Only conditions: auditory cortical areas were active for both the CVCV and Vowel Audio-Only conditions, but the CVCV Audio-Only condition also included activity in the right cerebellum (crus I and lobule VI).
* Visual-Only conditions: visual cortex, right posterior superior temporal gyrus, inferior frontal gyrus, middle temporal gyrus, fusiform gyrus, premotor and motor cortices, right inferior frontal sulcus, right supramarginal gyrus, left insula, left thalamus, and left rolandic operculum.
* Audio-Visual conditions: auditory and visual cortices were active along with left fusiform gyrus, left thalamus, right cerebellum (lobule VIII), premotor and motor cortices, and left supplementary motor area.
Figure 3-1 NH group: Averaged cortical activation produced by the contrast of the Audio-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 4.23 (CVCV), T > 5.27 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
CvcvA-Silence, MFX, Correction: FDR, T > 4.23
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_L, Temporal_Sup_R, Cerebelum_Crus1_R, Cerebelum_6_R, plus unlabeled clusters.
Table 3-1 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Only condition versus the baseline condition.
VowelA-Silence, MFX, Correction: FDR, T > 5.27
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Temporal_Sup_L.
Table 3-2 NH group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Only condition versus the baseline condition.
Figure 3-2 NH group: Averaged cortical activation produced by the contrast of the Visual-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.39 (CVCV), T > 3.96 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
CvcvV-Silence, MFX, Correction: FDR, T > 3.39
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Occipital_Inf_R, Occipital_Mid_L, Temporal_Mid_R, Temporal_Sup_R, Temporal_Sup_L, Fusiform_R, Fusiform_L, Precentral_L, Precentral_R, Supp_Motor_Area_R, Parietal_Sup_L, Parietal_Sup_R, Rolandic_Oper_L, Frontal_Mid_R, Frontal_Inf_Tri_L, Frontal_Inf_Tri_R, Frontal_Inf_Oper_R, SupraMarginal_R, Thalamus_L.
Table 3-3 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Visual-Only condition versus the baseline condition.
VowelV-Silence, MFX, Correction: FDR, T > 3.96
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Occipital_Inf_R, Temporal_Inf_R, Occipital_Inf_L, Occipital_Mid_L, Supp_Motor_Area_L, Temporal_Sup_R, Frontal_Mid_R, Precentral_R, Frontal_Inf_Tri_R, Frontal_Inf_Tri_L, Insula_L, Frontal_Inf_Orb_R, Postcentral_L, plus unlabeled clusters.
Table 3-4 NH group: Summary of peak cortical activation produced by the contrast of the Vowel Visual-Only condition versus the baseline condition.
Figure 3-3 NH group: Averaged cortical activation produced by the contrast of the Audio-Visual condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.32 (CVCV), T > 3.95 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
CvcvAV-Silence, MFX, Correction: FDR, T > 3.32
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Occipital_Inf_R, Occipital_Mid_L, Temporal_Mid_R, Temporal_Sup_L, Temporal_Sup_R, Fusiform_L, Precentral_R, Supp_Motor_Area_L, Thalamus_L, Cerebelum_8_R, plus unlabeled clusters.
Table 3-5 NH group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Visual condition versus the baseline condition.
VowelAV-Silence, MFX, Correction: FDR, T > 3.95
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Occipital_Inf_R, Occipital_Mid_R, Temporal_Mid_R, Temporal_Sup_L, Occipital_Mid_L, Occipital_Inf_L, Precentral_R, Frontal_Mid_R, Postcentral_L, plus unlabeled clusters.
Table 3-6 NH group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Visual condition versus the baseline condition.
3.1.2 Congenitally Deaf (CD)
Figures 3-4, 3-5, and 3-6 show the averaged activation maps for the CD group during Audio-Only, Visual-Only, and Audio-Visual conditions, respectively, for both CVCV and Vowel conditions. For the CD group, there were no regions with significant activity in the Audio-Only conditions. Labels for regions of active cortical areas for the Audio-Visual and Visual-Only experimental conditions contrasted with baseline are summarized in Tables 3-7 to 3-10. Brief summaries of active areas for these contrasts are listed below.
* Audio-Only conditions: no active regions.
* The CVCV Visual-Only condition: visual and auditory cortical areas, fusiform gyrus, inferior frontal gyrus, premotor and motor cortices, inferior frontal sulcus, right angular gyrus, cerebellum (lobule VIII), and supplementary motor association areas.
* The Vowel Visual-Only condition: visual and auditory cortical areas, premotor and motor cortices, right middle and inferior temporal gyri, left cerebellum (lobule VIII), right insula, inferior frontal gyrus, supplementary motor area, putamen, left caudate, and left thalamus.
* The CVCV Audio-Visual condition: visual and auditory cortical areas, premotor and motor cortices, inferior parietal cortex, left middle and inferior temporal gyri, left fusiform gyrus, inferior frontal gyrus, supplementary motor association areas, areas around the inferior frontal sulcus, left cerebellum (lobule VIII), and right hippocampus.
* The Vowel Audio-Visual condition: visual and auditory cortical areas, premotor and motor cortices, left SMA, right MTG, left SMG, left cerebellum (lobule VIII), right insula, and right IFG.
Figure 3-4 CD group: Averaged cortical activation produced by the contrast of the Audio-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 5.00 (CVCV), T > 4.02 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
Figure 3-5 CD group: Averaged cortical activation produced by the contrast of the Visual-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 2.82 (CVCV), T > 4.17 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
CvcvV-Silence, MFX, Correction: FDR, T > 2.82
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Temporal_Inf_R, Fusiform_R, Occipital_Mid_R, Temporal_Mid_L, Occipital_Mid_L, Temporal_Sup_L, Occipital_Inf_L, Fusiform_L, Precentral_R, Precentral_L, Supp_Motor_Area_R, Frontal_Inf_Oper_R, Frontal_Sup_R, Postcentral_L, Postcentral_R, Cerebelum_8_L, Cerebelum_8_R, Angular_R, Cingulum_Ant_L.
Table 3-7 CD group: Summary of peak cortical activation produced by the contrast of the CVCV Visual-Only condition versus the baseline condition.
VowelV-Silence, MFX, Correction: FDR, T > 4.17
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Temporal_Mid_R, Temporal_Inf_R, Temporal_Sup_L, Occipital_Inf_L, Occipital_Mid_L, Frontal_Mid_R, Frontal_Inf_Tri_R, Frontal_Inf_Tri_L, Frontal_Inf_Oper_R, Frontal_Inf_Oper_L, Insula_R, Supp_Motor_Area_R, Supp_Motor_Area_L, Precentral_L, Postcentral_L, Putamen_R, Putamen_L, Caudate_L, Thalamus_L, Cingulum_Mid_R, Cerebelum_8_L, plus unlabeled clusters.
Table 3-8 CD group: Summary of peak cortical activation produced by the contrast of the Vowel Visual-Only condition versus the baseline condition.
Figure 3-6 CD group: Averaged cortical activation produced by the contrast of the Audio-Visual condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.28 (CVCV), T > 4.28 (Vowel), mixed-effects analyses with P < 0.05, FDR corrected].
CvcvAV-Silence, MFX, Correction: FDR, T > 3.28
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Occipital_Mid_R, Occipital_Inf_R, Cerebelum_Crus1_R, Temporal_Sup_L, Temporal_Mid_L, Temporal_Inf_L, Supp_Motor_Area_L, Supp_Motor_Area_R, Occipital_Mid_L, Occipital_Inf_L, Precentral_R, Precentral_L, Postcentral_L, Fusiform_L, Frontal_Mid_R, Frontal_Mid_L, Frontal_Inf_Oper_R, Frontal_Inf_Oper_L, Frontal_Inf_Tri_R, Frontal_Inf_Tri_L, Frontal_Inf_Orb_R, Frontal_Sup_Medial_L, Frontal_Sup_Medial_R, Frontal_Sup_L, Parietal_Inf_L, Parietal_Inf_R, Paracentral_Lobule_L, Paracentral_Lobule_R, Cerebelum_8_L, Hippocampus_R.
Table 3-9 CD group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Visual condition versus the baseline condition.
VowelAV-Silence, MFX, Correction: FDR, T > 4.28
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Temporal_Mid_R, Occipital_Mid_R, Supp_Motor_Area_L, Precentral_R, Occipital_Mid_L, Temporal_Sup_L, SupraMarginal_L, Frontal_Inf_Tri_R, Frontal_Inf_Oper_R, Frontal_Mid_R, Paracentral_Lobule_R, Postcentral_L, Precentral_L, Cingulum_Mid_L, plus unlabeled clusters.
Table 3-10 CD group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Visual condition versus the baseline condition.
3.1.3 Hearing Aid Users (HA)
The HA group comprised individuals with varied amounts of hearing loss and a wide
range of benefit from hearing aid usage. Clearly this group, with diverse hearing states, was
the least homogeneous subject group. Due primarily to this large variability in our subject
pool, most voxels did not survive the threshold when error correction was applied in the
mixed-effects analyses. Hence, the results presented in this section are activation patterns
obtained without any error correction. Figures 3-7, 3-8, and 3-9 show the averaged activation
maps for the HA group during Audio-Only, Visual-Only, and Audio-Visual conditions,
respectively, for both CVCV and Vowel. As in the CD group, there were no regions with
significant activity in the Audio-Only conditions for the HA group. Labels for regions of
active cortical areas for the Audio-Visual and Visual-Only experimental conditions
contrasted with baseline are summarized in Tables 3-11 to 3-14. Brief summaries of active
areas for these contrasts are listed below.
" Audio-Only conditions: no active regions.
" The CVCV Visual-Only condition: visual and auditory cortical areas, right inferior
frontal gyrus, right fusiform gyrus, premotor and motor cortices, left supplementary
motor association areas.
" The Vowel Visual-Only condition: visual cortex, right posterior superior temporal
gyrus, left premotor and motor cortices, right middle and inferior temporal gyri,
Broca's area, and right fusiform gyrus.
*
The CVCV Audio-Visual condition: visual and auditory cortices, premotor and motor
cortices, right fusiform gyrus, inferior frontal gyrus, left supplementary motor
association area, and cerebellum (lobule VIII).
*
The Vowel Audio-Visual condition: visual cortex, posterior superior temporal gyrus,
lateral premotor cortex, and right middle temporal gyrus.
Figure 3-7 HA group: Averaged cortical activation produced by the contrast of the Audio-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 4.02 (CVCV), T > 4.02 (Vowel), mixed-effects analyses with P < 0.001, uncorrected].
Figure 3-8 HA group: Averaged cortical activation produced by the contrast of the Visual-Only condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 4.02 (CVCV), T > 4.02 (Vowel), mixed-effects analyses with P < 0.001, uncorrected].
CvcvV-Silence, MFX, Correction: none, T > 4.02
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Occipital_Inf_R, Temporal_Sup_R, Occipital_Mid_L, Temporal_Sup_L, Precentral_L, Precentral_R, Supp_Motor_Area_L, Frontal_Inf_Tri_R, Fusiform_R, Occipital_Inf_L.
Table 3-11 HA group: Summary of peak cortical activation produced by the contrast of the CVCV Visual-Only condition versus the baseline condition.
VowelV-Silence, MFX, Correction: none, T > 4.02
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Mid_R, Temporal_Inf_R, Occipital_Inf_R, Occipital_Mid_L, Frontal_Inf_Oper_L, Precentral_L, Fusiform_R, Frontal_Inf_Tri_L, Temporal_Sup_R, Frontal_Mid_Orb_L, plus unlabeled clusters.
Table 3-12 HA group: Summary of peak cortical activation produced by the contrast of the Vowel Visual-Only condition versus the baseline condition.
Figure 3-9 HA group: Averaged cortical activation produced by the contrast of the Audio-Visual condition with the baseline condition for CVCV (left panel) and Vowel (right panel) [T > 3.11 (CVCV), T > 3.11 (Vowel), mixed-effects analyses with P < 0.001, uncorrected].
CvcvAV-Silence, MFX, Correction: none, T > 3.11
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Occipital_Inf_R, Temporal_Mid_R, Temporal_Sup_L, Occipital_Inf_L, Temporal_Pole_Sup_L, Temporal_Inf_L, Precentral_L, Precentral_R, Postcentral_R, Frontal_Inf_Oper_L, Frontal_Inf_Oper_R, Frontal_Inf_Tri_L, Supp_Motor_Area_L, Fusiform_R, Cerebelum_8_R, Cerebelum_8_L, plus unlabeled clusters.
Table 3-13 HA group: Summary of peak cortical activation produced by the contrast of the CVCV Audio-Visual condition versus the baseline condition.
VowelAV-Silence, MFX, Correction: none, T > 3.11
Columns: AAL label, normalized effect size, T, p, and MNI peak location (x, y, z, mm).
Labeled peaks: Temporal_Sup_R, Temporal_Sup_L, Occipital_Inf_R, Temporal_Mid_R, Precentral_R, Occipital_Mid_L, Precentral_L.
Table 3-14 HA group: Summary of peak cortical activation produced by the contrast of the Vowel Audio-Visual condition versus the baseline condition.
3.1.4 Discussion of Results
In the fMRI experiment, subjects viewed and/or listened to various unimodal or bimodal
speech stimuli. As expected for the NH group, there was significant activity in the auditory
cortex for the Audio-Only conditions (Figure 3-1). Moreover, since CVCV stimuli contain
more acoustic fluctuation and therefore convey more information than simple steady-state
vowels, the extent of activity for the CVCV Audio-Only condition was noticeably greater
than for the Vowel Audio-Only condition.
3.1.4.1 Auditory-Visual Speech Perception Network in NH Individuals
The experimental conditions of most interest in this study were the Visual-Only and Audio-Visual conditions. Calvert et al. (1997) identified five main areas of activation while normal
hearing subjects watched a video of a speaking face without sound: visual cortex, primary
and secondary auditory cortex, higher-order auditory cortex, the angular gyrus, and the
inferoposterior temporal lobe. Other than the primary auditory cortex and angular gyrus, all
areas reported in Calvert et al. (1997) were also activated bilaterally in our study for normal
hearing subjects during Visual-Only conditions (Figure 3-2). Calvert et al. (1997) reported
that the activation in the auditory cortex also included the lateral tip of Heschl's gyrus. In the
present study, Heschl's gyrus was not included in the speechreading network for the NH
group; only the posterior portion of the superior temporal gyrus/sulcus was included (Figure
3-2).
In addition to these areas, the speechreading network of the NH group comprised
precentral gyrus including lateral premotor cortex and nearby primary motor cortex, the
supplementary motor area, inferior frontal gyrus and inferior frontal sulcus in the frontal
lobe, superior and inferior parietal cortex (including the angular gyrus and the supramarginal
gyrus), inferior cerebellar cortex (lobule VIII), and middle temporal gyrus.
The areas of activity seen during the Audio-Visual conditions included most of the areas that
were active in the Visual-Only conditions; however, there was considerably more activity in
the auditory cortex. This is not surprising given that the Audio-Visual stimuli included
auditory input, whereas in the Visual-Only conditions subjects were attending to visual
speech only. There was also considerably less spread in the activity patterns for Audio-Visual
conditions than for Visual-Only conditions; that is, the active regions appear to be more
focused spatially. The left inferior cerebellar cortex (lobule VIII) was also found to be active
during the CVCV Audio-Visual condition.
3.1.4.2 Speech Motor Network and Visual Speech Perception
In the Visual-Only and Audio-Visual conditions, activations were observed in Broca's area
(triangular and opercular parts of the left inferior frontal gyrus) and its right hemisphere
homolog (right IFG), the bilateral premotor/motor cortex, the supplementary motor area, and
the cerebellum.
These brain regions are components of the speech motor network,
suggesting that during visual speech perception the speech motor network is also engaged in
addition to auditory and visual cortical areas. This supports the idea that motor-articulatory
strategies are employed in visual speech perception, as suggested in previous studies (Ojanen
et al., 2005; Paulesu et al., 2003).
More recent studies have reported activity in areas thought to be involved with planning and
execution of speech production during visual speech perception. These brain regions include
Broca's area, anterior insula and premotor cortex (Callan et al., 2000; Kent and Tjaden, 1997).
A number of studies have shown activations in speech motor areas during speech perception
tasks (Bernstein et al., 2002; Callan et al., 2003; Callan et al., 2004; Calvert and Campbell,
2003; Campbell et al., 2001; Olson et al., 2002; Paulesu et al., 2003). However, there are
other studies that did not find activations in speech motor areas during speech perception
(Calvert et al., 1999; Calvert et al., 1997; Calvert et al., 2000; MacSweeney et al., 2001). In
Calvert and Campbell's (2003) study, it was shown that even implied visual motion of speech
gesture (not an actual motion, but a still picture containing speech gesture information) can
elicit a response in speech motor areas. Additionally, Broca's area has been found to be
active when subjects are observing visual speech motion without any auditory signal
(Campbell et al., 2001).
These results tie in nicely with the recent discovery of the mirror neuron system, in which
brain regions associated with producing a gesture are also engaged during perception of the
same or similar gestures (Rizzolatti and Arbib, 1998; Rizzolatti and Fadiga, 1998; Rizzolatti
et al., 2002; Rizzolatti et al., 1998). In other words, the brain regions involved in observing a
certain gesture are the same as those used during execution of that gesture. A listener's
speech mirror neuron system would thus function by engaging speech motor regions to
simulate the articulatory movements of the speaker during visual speech perception, and
could be used to facilitate perception when auditory information is degraded and gestural
information is available.
In Callan et al. (2003),
activity was found in Broca's area and lateral premotor cortex - thought to form part of a
"mirror neuron" system for speech perception - for various conditions including
degraded/intact auditory/visual speech information. However, in a subsequent study, Callan
et al. (2004) did not find Broca's area to be active during visual speech processing. This
discrepancy in activation observed in the two studies may be explained by the view that there
are multiple parallel pathways by which visual speech information can be processed. One
pathway may be comprised of those regions involved in internal simulation of planning and
execution of speech production, and another pathway may include multisensory integration
sites (e.g., superior temporal sulcus).
In another study by Watkins and Paus (2004), the investigators used PET and TMS during
auditory speech perception and found that there was increased excitability of the motor
system underlying speech production and that this increase was significantly correlated with
activity in the posterior part of the left inferior frontal gyrus (Broca's area). They proposed
that Broca's area may "prime" the motor system in response to heard speech even when no
speech output is required, operating at the interface of perception and action. Interestingly, in
the current study, inferior frontal gyrus was found to be active in both Visual-Only and
Audio-Visual speech conditions (Figures 3-2, 3-3, 3-5, 3-6, 3-8, and 3-9) in all subject groups,
suggesting that Broca's area may "prime" the motor system in response not only to heard, but
seen speech as well.
In addition to these reported activations in inferior frontal gyrus, significant correlations were found between activity in this region and the benefit hearing aid users gained from their aids (see Section 3.3.2).
Furthermore, the precentral gyrus activation observed in the current study in both Visual-Only and Audio-Visual conditions included both premotor and nearby primary motor cortex. This
activation encompassed the mouth area and even extended onto the lip, jaw and tongue areas
of primary motor cortex according to probabilistic maps (Fox et al., 2001) and estimated
anatomical locations of the components of the speech motor system (Guenther et al., 2006).
So simply viewing someone mouthing words was sufficient to elicit activation in motor
cortex responsible for controlling visible speech articulators, supporting the hypothesized
role of this area in human speech perception and production (Wilson et al., 2004).
The left insula was also found to be active for the Vowel Visual-Only condition in the
hearing group (Figure 3-2) whereas right insula was shown to be active for the same
condition in the congenitally deaf groups (Figure 3-5).
The anterior insula is generally
thought to be associated with processes related to speech production planning (Dronkers,
1996). We also found that the [CVCV Visual-Only - CVCV Audio-Visual] contrast showed
significant responses in supplementary motor area, Broca's area and cerebellum.
In particular, supplementary motor area was shown to be active in most Visual-Only and Audio-Visual conditions for all subjects. Although a clear functional delineation between SMA and pre-SMA has not been established, many studies have shown that medial BA 6 comprises at least two subregions based on cytoarchitecture and patterns of anatomical connectivity. The roles of SMA and pre-SMA in speech production are thought to involve representing movement sequences, with the pre-SMA more involved in planning and the SMA in motor performance. In terms of anatomical connectivity patterns, the pre-SMA is well connected with the prefrontal cortices while the SMA is more strongly connected with the motor cortex (Johansen-Berg et al., 2004; Jürgens, 1984; Lehéricy et al., 2004; Luppino et al., 1993).
Several portions of the cerebellar cortex (lobules VI, VIII, and crus 1 of lobule VII) were
active in the current study for some Visual-Only conditions in all subject groups. These
portions of the cerebellum are active in most speech production experiments; lobule VI and
crus 1 have been associated with feedforward motor commands during speech (Guenther et
al., 2006) while lobule VIII has been associated with the sequencing of speech sounds
(Bohland and Guenther, 2006). However their role in auditory-visual speech processing is
still unclear. Gizewski et al. (2005) addressed the influence of language presentation and
cerebellar activation and found that crus 1 was active in deaf individuals when subjects were
perceiving sign language, while in normally hearing volunteers, crus 1 was less active for
sign language comprehension, but more significantly active when reading texts. Results from
this study suggest that activity in crus 1 may correspond to language perception regardless of
the mode of language presentation. Callan et al. (2003) suggested that it may reflect the
instantiation of internal models for motor control that map between visual, auditory, and
articulatory speech events to facilitate perception, particularly in noisy environments (see
also Doya, 1999).
3.1.4.3
Hearing Status and Auditory-Visual Speech Perception Network
In the CVCV Visual-Only condition, the right hemisphere auditory areas for deaf subjects
(Figure 3-5) were heavily active, unlike normal subjects (Figure 3-2), who showed no right
hemisphere auditory activation. Belin et al. (1998) showed that rapid acoustic transitions (as
in consonants) primarily activate left hemisphere locations while slow acoustic transitions (as
in vowels) cause bilateral activation. Since deaf individuals cannot use precise temporal
auditory information, such as voice onset time (VOT) to distinguish consonants, this might
explain the increased right hemisphere activation in the deaf subjects. Also, earlier studies
report that the right auditory cortex in hearing individuals has a tendency to process time-varying auditory signals and that there exists a right hemisphere bias for such auditory
motion. This may explain the hemispheric selectivity shown in congenitally deaf subjects. In
the Vowel Visual-Only condition, the left hemisphere for deaf subjects (Figure 3-5) showed
no significant activity in parietal and visual cortex. Instead only premotor cortex and primary
auditory cortex were activated, whereas in normal hearing subjects, there was no primary
auditory cortex activation.
The CD subjects' activity in Visual-Only conditions looked more like normal hearing
activation in Audio-Visual conditions rather than Visual-Only conditions. Furthermore, deaf
subjects' visual cortical activation levels were significantly lower than normal hearing
subjects' in all experimental conditions. MacSweeney et al. (2002a; 2001) also investigated
the neural circuitry of speechreading in hearing impaired people who were congenitally
profoundly deaf, but not native signers. Their deaf subjects represent the majority of the deaf
population and usually have considerably better speechreading ability than deaf native
signers born to deaf parents. They reported that in congenitally deaf people, significant
activations were found in posterior cingulate cortex and hippocampal/lingual gyri, but not in
the temporal lobes during silent speechreading of numbers. Moreover, they commented that
the activation in the left temporal regions seemed to be more dispersed across a number of
sites, and that activation in posterior cerebral areas seemed to be increased in the deaf group.
However, the pattern of activation found for deaf subjects in the present study while viewing
visual-only stimuli did not include posterior cingulate cortex or hippocampal/lingual gyri. As
can be seen in Figure 3-5 for the visual-only condition, the congenitally deaf group also
showed significant activations in all of the areas identified in Calvert et al.'s study;
additionally active regions included Heschl's gyrus, premotor cortex, insula, supramarginal
area (BA 40), inferior frontal sulcus and inferior frontal gyrus (BA 44/45). The differences
in activation patterns between our study and MacSweeney et al. (2002a; 2001) are most
likely due to the fact that our deaf subjects used ASL as their primary mode of
communication. Also, the number of subjects in MacSweeney et al.'s (2002a; 2001) study was six, in contrast to twelve in the present study.
As can be clearly seen in Figure 3-4, the auditory signal alone was not sufficient to activate
auditory cortical areas or any other areas in the congenitally deaf subjects; however with
added visual speech, a large area of auditory cortex was found to be active even though these
subjects were all profoundly deaf. MacSweeney et al. (2001) and (2002) reported that in
congenitally deaf people, significant activations were found in posterior cingulate cortex and
hippocampal/lingual gyri, but not in the temporal lobes during silent speechreading of
numbers.
Similarly to visual-only condition as described in the previous paragraph, the
congenitally deaf group in the auditory-visual condition showed significant activations in all
of the five areas identified in Calvert et al.'s study (Figure 3-5), including Heschl's gyrus and
some sites in motor areas including premotor cortex and Broca's area.
The congenitally deaf group showed similar activation patterns to normally hearing subjects
in the CVCV Visual-Only condition, but their activation included additional areas that were not active in the
NH group. These areas included: right angular gyrus, insula, supramarginal gyrus (BA 40),
thalamus, caudate, middle cingulum in both hemispheres; left-lateralized Heschl's gyrus and
putamen; and right-lateralized rolandic operculum.
Lobule VI cerebellar activity was also
shown to be left-lateralized in the CD group. Activity was observed in auditory cortex,
visual cortex, Broca's area (triangular and opercular parts of the left inferior frontal gyrus),
its right-hemisphere homolog (right IFG), the bilateral premotor/motor cortex (lip area),
regions near the inferior frontal sulcus, and the supplementary motor area. As stated in the
previous section, in the present study, Heschl's gyrus was not included in the speechreading
network for the NH group, but a significant response was seen in the left Heschl's gyrus in
deaf participants. It should also be noted that there was far less activity in visual cortex and
far more activity in auditory cortex in the CD group compared to the NH group.
Since our congenitally deafened subjects still have some residual hearing, it was expected
that there would be some detectable differences in activation patterns between Audio-Visual
and Visual-Only conditions. Such differences are apparent in Figures 3-5 and 3-6, where there appears to be more auditory cortex involvement in Audio-Visual than in Visual-Only conditions; however, the contrast image obtained by subtracting Audio-Visual from Visual-Only did not yield any statistically significant results.
As for the HA group, since no error correction was used to correct for multiple testing, a
direct comparison of cortical activations across the groups must be made with caution.
However, based on the maps we obtained, prominent regions found to be active in the CD
group for Visual-Only conditions were found to be active in hearing aid users as well. The
HA group was more suited for correlation analyses, for which its heterogeneous population
can be exploited. Results obtained from correlation analyses are discussed in Section 3.3.
3.2
Speechreading Test and fMRI Analyses
We sought to test the hypothesis that the response level in auditory cortex to visual speech
stimuli corresponds to an individual's ability to process visual speech. To evaluate our
subjects' visual speech processing skills, we administered a test involving speechreading of
English sentences.
We identified subjects with good speechreading skills and those with
poor speechreading skills and applied standard fMRI analyses separately for the two
subgroups within each subject group. Results obtained from the speechreading test scores
and the corresponding standard fMRI analyses are presented in this section.
The speechreading test consisted of video clips without sound of a speaker producing 500
English keywords presented in common English sentences. Each subject's speechreading
score consisted of the number of whole keywords that the subject was able to speechread.
The resulting scores for the NH group are shown in Figure 3-10 (CD: Figure 3-12; HA: Figure 3-14). The scores for the NH group ranged from 16 to 305 words correct (out of 500; mean = 135.1, standard deviation = 109.1).
As evident from the plot in Figure 3-10,
there was a significant amount of variability in scores across the NH group. Out of 12 subjects, we assigned the top 6 scorers (subjects 2, 4, 6, 8, 10, and 11; marked with red circles in Figure 3-10) to a "good" speechreader subgroup, and the lowest 6 scorers to a "poor" speechreader
subgroup. The good speechreaders' scores ranged from 81 words to 305 words correct,
whereas poor speechreaders scored from 16 to 72 words correct.
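For concreteness, the subgrouping step can be sketched in a few lines of Python; the keyword scores below are hypothetical placeholders (not the actual subject data), and the split simply takes the top half of the ranked scores as the "good" subgroup.

    # Hypothetical keyword scores (out of 500) keyed by subject number.
    scores = {1: 40, 2: 305, 3: 25, 4: 150, 5: 72, 6: 81,
              7: 16, 8: 120, 9: 60, 10: 200, 11: 95, 12: 55}

    # Rank subjects from best to worst and split at the median rank.
    ranked = sorted(scores, key=scores.get, reverse=True)
    half = len(ranked) // 2
    good = sorted(ranked[:half])   # "good" speechreader subgroup
    poor = sorted(ranked[half:])   # "poor" speechreader subgroup
    print("good:", good)
    print("poor:", poor)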
We then performed standard fMRI analyses for each subgroup separately, and compared the
activation patterns for the CVCV Visual-Only condition. Since there were only six subjects
per subgroup, we did not perform the mixed-effects analyses. Only fixed-effects analyses
were conducted, with FDR error correction and p < 0.01 as the threshold. The cortical maps
obtained are shown in Figure 3-11 and labels of brain regions are listed in Table 3-5 for the
NH group (CD: Figure 3-13, Table 3-16; HA: Figure 3-15, Table 3-17).
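For readers unfamiliar with FDR correction, the sketch below illustrates the Benjamini-Hochberg procedure that underlies this kind of thresholding. It is a generic illustration with simulated p-values, not the software implementation actually used to produce these maps.

    import numpy as np

    def fdr_threshold(p_values, q=0.01):
        """Largest p-value surviving Benjamini-Hochberg FDR control at level q
        (returns None if no voxel survives)."""
        p = np.sort(np.asarray(p_values).ravel())
        n = p.size
        below = p <= (np.arange(1, n + 1) / n) * q   # BH critical values i/n * q
        if not below.any():
            return None
        return p[below.nonzero()[0].max()]

    rng = np.random.default_rng(0)
    p_map = rng.uniform(size=50000)      # stand-in for voxelwise p-values
    thr = fdr_threshold(p_map, q=0.01)   # voxels with p <= thr would be reported
    print("FDR threshold at q = 0.01:", thr)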
In the NH group, the pattern of activation in auditory cortex for "good" speechreaders differs
significantly from that of their "poor" speechreader counterparts. One prominent difference is
easy to detect visually: there is a significantly greater amount of activity in left superior
temporal gyrus for "good" speechreaders; this cluster of activation also included Heschl's
gyrus as seen in some previous studies (Calvert et al., 1997; Pekkola et al., 2005). Another
notable observation is that there is greater activation in the frontal cortex (near IFG and IFS)
for "poor" speechreaders than "good" speechreaders.
The congenitally deaf participants' speechreading test scores ranged from 8 to 333 keywords
correct (mean = 160.0, standard deviation = 144.4). The good speechreaders' scores ranged
from 200 words to 333 words correct, whereas poor speechreaders scored from 8 to 128
words correct.
As seen in the NH group, the CD group's "good" speechreaders also
displayed much more activation in superior temporal cortex than "poor" speechreaders
(Figure 3-13; Table 3-16). The right hemisphere bias still existed for both subgroups and
there was a swath of activity between the visual and auditory cortex, including the
inferoposterior temporal junction areas as well as middle temporal gyrus for "good"
speechreaders, whereas for "poor" speechreaders the activation in auditory and visual cortices tended to be segregated into distinct clusters. Contrary to what was seen in the hearing subjects, the prefrontal cortical
region was more highly active for "good" speechreaders than "poor" speechreaders.
The hearing aid users' speechreading test scores ranged from 212 to 472 words correct
(average = 342.5, standard deviation = 87.3). The HA group outperformed both the CD and
the NH group by a considerable margin: all hearing aid users scored within the "good" speechreader range of the other two subject groups. Although we divided the HA group into two subgroups as for the other two groups, the "good" and "poor" categories therefore carry a different meaning for the HA subject group. Hence, the HA results should be interpreted with caution when compared to the other groups. As seen in Figure 3-14 and Table 3-17, the differences in cortical activation patterns were similar to those of the CD subjects, in that better speechreaders showed more auditory and frontal cortex activation in the CVCV Visual-Only condition.
3.2.1.1
Auditory Cortex
In agreement with previously reported studies (Calvert et al., 1997; Calvert and Campbell,
2003; Campbell et al., 2001; MacSweeney et al., 2000; MacSweeney et al., 2002a;
MacSweeney et al., 2001; Olson et al., 2002; Skipper et al., 2005), the cortical activation
pattern for Visual-Only conditions in the current study showed that visual speech alone can
elicit activity in the auditory cortex. However, this activation failed to extend to the primary
auditory cortex, and only included the very posterior tip of superior temporal gyrus. Whether
the primary auditory cortex is activated by visual speech alone in neurologically normal
individuals is still not fully established, and there have been some inconsistencies in findings
from previous studies. Several studies of visual speech perception have reported activity in
primary auditory cortex (Heschl's gyrus) (Calvert et al., 1997; Pekkola et al., 2006), whereas
some other studies (including the current study) reported very little or no activity in superior
temporal gyrus (or auditory cortical areas in general) (Skipper et al., 2005) when participants
were processing visual speech alone. The discrepancy in activity patterns may possibly be
due to differences in experimental stimuli, tasks, and/or paradigms; or it may have been
brought about by the differences in individuals' abilities to process visual speech.
However, in the present study, good speechreaders in all subject groups showed increased activity in the anterior portion of superior temporal gyrus. Based on these results, the discrepancies reported in previous studies regarding recruitment of primary auditory areas during visual speech perception may be reconciled by the observation that speechreading ability varies widely from person to person and is significantly correlated with activity in auditory cortical areas during visual speech perception.
3.2.1.2
Lateral Prefrontal Cortex
Good speechreaders who are congenitally hearing impaired also showed increased activity in
lateral prefrontal cortical area (near IFS) and in medial premotor area (pre-SMA), whereas
activity levels in these areas were lower in good speechreaders who have normal hearing.
The lateral prefrontal cortex has been implicated in language and working memory tasks,
(D'Esposito et al., 1998; Fiez et al., 1996; Gabrieli et al., 1998; Kerns et al., 2004), the use of
semantic knowledge in word generation (Crosson et al., 2003), non-semantic representations
of speech plans (Bohland and Guenther, 2006), and in serial-order processing (Averbeck et
al., 2002; Averbeck et al., 2003a; Averbeck et al., 2003b; Petrides, 1991). Prefrontal cortex
activity has also been associated with the ability to select and coordinate actions or thoughts
in relation to internal goals, a process that is often referred to as executive control (Koechlin
et al., 2003; Miller and Cohen, 2001) and was also shown to be involved in the development
of cross-modal associations (Petrides, 1985), including visual-auditory associations.
Additionally, much of the prefrontal cortex has long been considered to be polysensory,
suggesting that the prefrontal cortex may also serve as an auditory-visual speech integration
site. Although there are some uncertainties with anatomical and functional subdivisions of
the cortex, it is known that the prefrontal cortex receives reciprocal projections from anterior
and posterior STG, posterior STS (Petrides and Pandya, 2002) as well as from secondary
visual cortex and the intraparietal sulcus in parietal cortex (Miller and Cohen, 2001). Many of the
inputs to the medial prefrontal cortical area are from STS, including STP and parabelt
auditory cortex (Barbas et al., 1999; Hackett et al., 1999; Romanski et al., 1999). The dorsal
prefrontal cortex is connected with premotor cortex (Lu et al., 1994) and the orbital region of
PFC has auditory and multisensory inputs projected from the rostral part of the auditory
parabelt and the STS/G (Carmichael and Price, 1995; Hackett et al., 1999; Romanski et al.,
1999). Single unit recordings in monkey premotor area F5 (the homologue of Broca's area)
identified this region to respond to sound and vision (Kohler et al., 2002).
Its
interconnectivity includes visual input from the IPS area (Graziano et al., 1999), auditory
input from STS and posterior STG regions (Arbib and Bota, 2003), and massive connections
to and from primary motor cortex (Miller and Cohen, 2001).
In humans, the arcuate
fasciculus is known to provide a direct connection between Broca's area and posterior
temporal areas including STG, STS and middle temporal gyrus.
The arcuate fasciculus
provides a pathway by which speech production areas in frontal lobe can influence auditory
and speech perception areas. These speech production areas and superior temporal cortical
areas are also indirectly connected through inferior parietal cortex (Catani et al., 2005).
Although most prefrontal cortical areas exhibit polysensory properties, Calvert et al. (1997)
reported no activation in the prefrontal areas when viewing talking faces. The lack of any
activity in PFC may make sense if PFC is more involved in the development of auditory-visual associations and, after extensive training, is relieved of the role of mediating these
associations. The connections between auditory and visual cortices may take on the role of
mediating auditory-visual associations once the training is complete. This may explain why
good speechreaders in the NH group showed very little to no activity in prefrontal cortex,
whereas bad speechreaders had significantly higher activity in this area. Good speechreaders
supposedly already have developed auditory-visual associations and relieved this region from
its mediation role, whereas bad speechreaders are still in the process of training and learning
auditory-visual associations. Since good speechreaders with congenital hearing impairment
also showed increased activity in PFC, this is most likely because the development of auditory-visual associations in this area is still ongoing and the region was never relieved of its role, owing to the constant lack of acoustic input. Hence, greater activity in the PFC for hearing impaired individuals may reflect more extensive ongoing development of auditory-visual associations, which may in turn support better speechreading ability.
3.2.1.3
Pre-SMA
Another prominent difference in patterns of activation for good and poor speechreaders in the
CD and HA groups was that the good speechreaders' cluster of activation in medial aspect of
Brodmann's Area 6 extended more anteriorly, encompassing not only SMA, but pre-SMA as
well. Interestingly, the opposite was true for the NH group - poor speechreaders showed
activity in anterior pre-SMA and SMA while good speechreaders only had significant activity
in SMA.
The pre-SMA is thought to play a crucial role in the procedural learning of new sequential
movements (Hikosaka et al., 1996), and based on monkey data, it is also assumed to be
engaged in the representation of movement sequences, but in a higher-order role than the
SMA (Matsuzaka et al., 1992; Shima et al., 1996; Shima and Tanji, 1998; Shima and Tanji,
2000; Tanji and Shima, 1994; Tanji et al., 2001). In a recent speech sequencing study by
Bohland and Guenther (2006), it was found that pre-SMA was more active when the
structure of individual syllables in the speech plan was complex, suggesting that the anterior
pre-SMA is possibly used to represent syllable or word-sized units.
Regardless of the exact roles of pre-SMA in visual speech perception, our findings suggest that congenitally deaf individuals who are better speechreaders make more effective use of pre-SMA (and PFC) in visual speech perception than congenitally deaf poor speechreaders. It is unclear why good speechreaders amongst the NH participants did not demonstrate patterns in these areas similar to those of good speechreaders in the CD or HA groups; however, it appears that the specific functional roles of these areas differ depending on hearing status.
3.2.1.4
Angular Gyrus
In hearing subjects, no activity was observed in primary auditory cortex and angular gyrus
for VO conditions (refer to Section 3.1.1, Figure 3-2; Tables 3-3 and 3-4), as opposed to these areas being reported as the main regions of activity in Calvert et al.'s (1997) study. In Section 3.2.1.1, the discrepancy in reported primary auditory cortex activity was explained by the different levels of activity observed in good versus poor speechreaders. Similarly, the right angular gyrus was also found to be more active in good
speechreaders for the NH and CD groups than in bad speechreaders (Figure 3-11 and Table
3-15 for the NH group; Figure 3-13 and Table 3-16 for the CD group), possibly explaining
why we did not have significant activity in angular gyrus in hearing subjects during the
speechreading task.
Angular gyrus is a region of the inferior parietal lobule with proposed functional roles that
run the gamut from sound perception, touch, memory, speech processing, visual processing and
language comprehension to out-of-body experiences.
In a PET study by Horwitz et al.
(1995), it was discovered that the left angular gyrus activity shows strong correlations with
occipital and temporal lobe activities during single word reading.
However these
relationships were absent in subjects with dyslexia, indicating that angular gyrus might play a
role in relating letters to speech. Deficits in accessing visual word forms for reading have
been associated with damage to the left angular gyrus (Krauss et al., 1996), but it is still
unclear what role angular gyrus may serve in processing visual speech information. However,
angular gyrus has been shown to be involved in motion perception (Lui et al., 1999), and
visual speech processing does involve deciphering facial and lip movements. Although it is
unclear what the exact functional role of angular gyrus might be, it seems to serve a critical
role in speechreading.
3.2.1.5
Conclusion
Effective speechreading strategies are notoriously difficult to teach and learn (Binnie, 1977;
Summerfield, 1991), and some even argue that good speechreading skills are an innate trait
rather than something that is learned (Summerfield, 1991).
If this argument is true, then
whether hearing impaired individuals can recruit pre-SMA and lateral PFC, or whether
individuals can engage anterior part of STG and angular gyrus during visual speech
perception may be a reflection of how proficiently one can learn to speechread. On the other
hand, if speechreading is a skill that is taught and learned over a long period of time, then our
results can be interpreted otherwise - that the differences in activation patterns between good
and poor speechreaders are a result of whether or not one has acquired effective speechreading
skills.
Figure 3-10 NH group: Speechreading test scores.
Figure 3-11 NH group: Averaged cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [T > 3.00 (Poor), T > 3.07 (Good), fixed-effects analyses with P < 0.01, FDR corrected].
Table 3-15 NH group: cortical activation produced by the contrast of the CVCV Visual-Only
condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders
(Right panel) [x, y, z in MNI coordinates].
Figure 3-12 CD group: Speechreading test scores.
Figure 3-13 CD group: Averaged cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [T > 3.00 (Poor), T > 3.07 (Good), fixed-effects analyses with P < 0.01, FDR corrected].
Table 3-16 CD group: cortical activation produced by the contrast of the CVCV Visual-Only
condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders
(Right panel) [x,y,z in MNI coordinates].
Figure 3-14 HA group: Speechreading test scores.
Figure 3-15 HA group: Averaged cortical activation produced by the contrast of the CVCV Visual-Only condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders (Right panel) [T > 3.00 (Poor), T > 3.07 (Good), fixed-effects analyses with P < 0.01, FDR corrected].
Table 3-17 HA group: cortical activation produced by the contrast of the CVCV Visual-Only
condition with the baseline condition for Poor Speechreaders (Left panel) and Good Speechreaders
(Right panel) [x,y,z in MNI coordinates].
3.3
Results from Correlation Analyses
To expand on the standard fMRI analyses (Sections 3.1 and 3.2), simple regression analyses were also carried out. The goal of the correlation analyses was to identify, with a different analytical approach, regions that may play a significant role in visual speech perception. Along with the activation maps obtained from good and poor speechreaders, the correlation analyses can be used to corroborate the results of the other analyses and refine the locations of cortical sites that may be functionally specialized or specifically recruited for visual speech perception tasks.
For each subject group, we examined the correlation between their speechreading skills and
the magnitude of effect sizes for all voxels during the CVCV Visual-Only condition. Figures
3-16, 3-17, and 3-18 (Tables 3-18, 3-19 and 3-21) display active voxels in the CVCV Visual-Only condition that were significantly correlated with the speechreading test scores for the
NH, CD and HA group, respectively.
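As a schematic of this voxelwise correlation analysis, the sketch below regresses per-subject effect sizes at each voxel on the speechreading scores and keeps voxels whose regression F statistic exceeds the threshold of 12.83 used in the figures (with 12 subjects, F with (1, 10) degrees of freedom above 12.83 corresponds to p < 0.005). The array names and simulated data are hypothetical; this is not the analysis pipeline actually used to produce the maps.

    import numpy as np

    def voxelwise_regression_F(effects, covariate):
        """Simple regression of each voxel's effect size on one covariate.
        effects: (n_subjects, n_voxels); covariate: (n_subjects,).
        Returns the per-voxel F statistic with (1, n_subjects - 2) dof."""
        x = covariate - covariate.mean()
        Y = effects - effects.mean(axis=0)
        sxx = x @ x
        beta = (x @ Y) / sxx                      # slope per voxel
        resid = Y - np.outer(x, beta)
        mse = (resid ** 2).sum(axis=0) / (len(x) - 2)
        return beta ** 2 * sxx / mse              # F = t**2 for a single covariate

    # Hypothetical data: 12 subjects, 5000 voxels.
    rng = np.random.default_rng(1)
    effects = rng.normal(size=(12, 5000))         # effect sizes, CVCV Visual-Only
    scores = rng.integers(0, 500, size=12).astype(float)   # speechreading scores

    F = voxelwise_regression_F(effects, scores)
    print((F > 12.83).sum(), "voxels exceed F = 12.83 (p < 0.005, uncorrected)")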
3.3.1
Normally Hearing and Congenitally Deaf (NH and CD)
For normal hearing individuals (Figure 3-16, Table 3-18), activities in lingual gyri, fusiform
gyri, and middle temporal gyri in both hemispheres were significantly correlated with
participants' speechreading skills. In the left hemisphere, activities in the middle and superior areas of the frontal lobe and in superior temporal cortex were also found to be
correlated with speechreading test scores. The left superior temporal gyrus correlation with
speechreading skills is in agreement with the results described in the previous section, where
we observed a clear distinction in auditory cortex activity between good and poor
speechreaders.
We also identified cortical regions with a significant difference in activation patterns for the
CVCV Visual-Only condition contrasted with the CVCV Audio-Visual condition (instead of
the baseline condition) for normal hearing subjects. This contrast identifies regions that are
more active when visual speech is not accompanied by an acoustic signal and thus may
reflect areas that attempt to compensate for the missing sound using visual information.
Small regions of significant activity were found in the left hemisphere in the superior parietal
lobule (BA 7) and the inferior frontal triangular region; and bilaterally in the inferior frontal
opercular region, the inferior cerebellum (lobule VIII) and the supplementary motor areas.
With the exception of the superior parietal lobe, these are regions involved in speech
production and may reflect increased use of motor areas for speech perception when the
acoustic signal is degraded (Callan et al., 2003) or, as in this case, missing.
The same
contrast was also applied to the CD group but, as expected, no significant activity was
found since the auditory signal provides little or no neural stimulation in this group.
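The logic of this contrast can be illustrated with a minimal sketch (hypothetical arrays, and a simplified paired t-test rather than the fixed-effects model actually used): voxels in which the Visual-Only response reliably exceeds the Audio-Visual response are flagged as candidate compensation regions.

    import numpy as np
    from scipy import stats

    # Hypothetical per-subject, per-voxel effect sizes for the two conditions.
    rng = np.random.default_rng(2)
    visual_only = rng.normal(size=(12, 5000))
    audio_visual = rng.normal(size=(12, 5000))

    # [CVCV Visual-Only - CVCV Audio-Visual]: positive differences mark voxels
    # that respond more when the acoustic signal is absent.
    diff = visual_only - audio_visual
    t, p = stats.ttest_1samp(diff, popmean=0.0, axis=0)
    compensatory = (t > 0) & (p / 2 < 0.005)    # one-sided test on the contrast
    print(compensatory.sum(), "voxels more active without the acoustic signal")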
For the CD group (Figure 3-17, Table 3-19), activity levels of clusters of voxels in the left
occipital cortex, the lateral portion of premotor cortex, the inferior frontal opercular region,
and the middle temporal gyrus were correlated with speechreading scores. In the right
hemisphere, areas of the inferior parietal cortex, including the angular gyrus and the anterior
superior temporal area (auditory cortex) were significantly correlated.
Figure 3-16 NH group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Table 3-18 NH group: Significantly correlated regions identified using the regression analysis for the
CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Figure 3-17 CD group: Significantly correlated regions identified using the regression analysis for the
CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Table 3-19 CD group: Significantly correlated regions identified using the regression analysis for the CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
3.3.2
Hearing Aid Users (HA)
As for the HA group, a number of separate simple regression analyses were completed. They were:
a) Speechreading score and the CVCV Visual-Only condition
b) Speechreading score and the CVCV Audio-Visual condition
c) % of hearing impairment (unaided) and the CVCV Audio-Only condition
d) % of hearing impairment (unaided) and the CVCV Visual-Only condition
e) % of hearing impairment (unaided) and the CVCV Audio-Visual condition
f) % of hearing impairment (aided) and the CVCV Audio-Only condition
g) % of hearing impairment (aided) and the CVCV Visual-Only condition
h) % of hearing impairment (aided) and the CVCV Audio-Visual condition
i) (unaided - aided) percentage of hearing impairment and the CVCV Audio-Only condition
j) (unaided - aided) percentage of hearing impairment and the CVCV Visual-Only condition
k) (unaided - aided) percentage of hearing impairment and the CVCV Audio-Visual condition
l) (unaided - aided) speech detection/reception threshold and the CVCV Audio-Only condition
m) (unaided - aided) speech detection/reception threshold and the CVCV Visual-Only condition
n) (unaided - aided) speech detection/reception threshold and the CVCV Audio-Visual condition
o) (unaided - aided) word recognition test phonemic score and the CVCV Audio-Only condition
p) (unaided - aided) word recognition test phonemic score and the CVCV Visual-Only condition
q) (unaided - aided) word recognition test phonemic score and the CVCV Audio-Visual condition
Hearing loss is usually measured as threshold shift in dB units, where the 0 dB threshold shift
represents the average hearing threshold level of an average young adult with disease-free
ears. There are several methods of calculating the amount of hearing impairment. One of the
most widely used is the percentage of hearing impairment (based on American Medical
Association's Guides to the Evaluation of Permanent Impairment, Fourth edition, 2003)
which is determined as follows: (1) calculate the average hearing threshold level at 500, 1000,
2000 and 3000 Hz for each ear, (2) multiply the amount by which the average threshold level
exceeds 25 dB by 1.5 for each ear, and (3) multiply the percentage of the better ear by 5, add
it to the poorer ear percentage and divide the total by 6. According to this formula, hearing impairment is 100% for a 92 dB average hearing threshold level. Using this method, we
calculated the percentage of hearing impairment for the HA subjects, for both unaided and
aided conditions (Table 3-20).
Difference in percentage of hearing impairment was
calculated by subtracting the aided value from the unaided value.
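The calculation can be made concrete with a short sketch; the threshold values below are hypothetical, and the steps follow the guideline procedure described above. For a 92 dB average threshold the per-ear step gives (92 - 25) * 1.5 = 100.5, i.e. roughly 100%, consistent with the statement above; note that the values reported in Table 3-20 are not capped at 100%.

    def percent_hearing_impairment(better_ear_db, worse_ear_db):
        """Percentage of hearing impairment from pure-tone thresholds (dB HL)
        at 500, 1000, 2000 and 3000 Hz for the better and worse ear."""
        def ear_percent(thresholds_db):
            avg = sum(thresholds_db) / len(thresholds_db)
            return max(0.0, (avg - 25.0) * 1.5)      # 1.5% per dB above 25 dB HL
        better = ear_percent(better_ear_db)
        worse = ear_percent(worse_ear_db)
        return (5.0 * better + worse) / 6.0          # better ear weighted 5:1

    # Hypothetical unaided and aided audiograms for one subject.
    unaided = percent_hearing_impairment([85, 90, 95, 100], [90, 95, 100, 105])
    aided = percent_hearing_impairment([40, 45, 50, 55], [45, 50, 55, 60])
    print(round(unaided, 1), round(aided, 1), round(unaided - aided, 1))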
As described in Section 2.5, word recognition tests were also conducted for the HA group.
Before the word recognition test was conducted, each subject's speech reception threshold
was found. If the subject's hearing loss was too severe, then instead of speech reception
threshold, we measured speech detection threshold. In the unaided condition, subjects were
presented with a list of spondaic words monaurally at a level 20 dB greater than their unaided
speech detection (or reception) threshold. If this value was greater than 110 dB, the word
stimuli were presented at 110 dB. The task was to repeat the words presented (verbally, by
writing responses, or by using sign language). Words identified correctly and phonemes
identified correctly were both scored. Here, we present only the phonemic scores since they
were found to be more useful than the word scores. In the aided condition for the word
recognition test, the speech stimuli were presented in the sound field at a sound level that was
20 dB greater than the subject's aided SDT (or SRT). By computing the differences in SDT
(or SRT) values between unaided and aided (and in word recognition test results as well), an
estimate can be made of how much hearing aid use benefits an individual's speech detection
(or reception). This estimate may be used to provide a rough approximation of how much
acoustic speech the individual might have been exposed to and was able to utilize. We
correlated these measures with active voxels in our experimental conditions; the results are
presented below.
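A small sketch of the presentation-level rule and the difference measures derived from these tests is given below (hypothetical values; the rule and the differencing mirror the procedure just described).

    def presentation_level(threshold_db, ceiling_db=110):
        """Word-recognition stimuli were presented 20 dB above the subject's
        speech detection (or reception) threshold, capped at 110 dB."""
        return min(threshold_db + 20, ceiling_db)

    # Hypothetical thresholds (dB) and phonemic scores (%) for one hearing aid user.
    unaided_srt, aided_srt = 85, 35
    unaided_phonemic, aided_phonemic = 6, 48

    print(presentation_level(unaided_srt))        # unaided presentation level: 105 dB
    print(presentation_level(aided_srt))          # aided presentation level: 55 dB
    print(unaided_srt - aided_srt)                # (unaided - aided) SD(R)T, used as a covariate
    print(unaided_phonemic - aided_phonemic)      # (unaided - aided) phonemic score covariate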
Subject   Unaided % HI   Aided % HI   Diff % HI   Unaided SD(R)T (dB)   Aided SD(R)T (dB)   Diff SD(R)T (dB)
1         54.7           18.8         35.9        55                    30                  25
2         107.4          43.1         64.3        80                    30                  50
3         120.3          63.8         56.6        80                    70                  10
4         124.1          61.9         62.2        85                    55                  30
5         118.1          45           73.1        85                    30                  55
6         115.6          41.3         74.4        85                    35                  50
7         66.6           31.9         34.7        65                    35                  30
8         107.2          63.8         43.4        80                    55                  25
9         128.8          101.3        27.5        90                    70                  20
10        124.4          60           64.4        85                    55                  30
11        123.8          71.3         52.5        85                    50                  35
12        131.3          52.5         78.8        90                    45                  45
Table 3-20 HA group: subjects' hearing impairment levels, speech detection (reception) thresholds
and word recognition test results
As with the other two groups, T-maps from the simple regression analysis of speechreading test scores against the CVCV Visual-Only condition (analysis "a" from the list of correlation analyses above) are depicted for the HA group in Figure 3-18 (Table 3-21). The pattern obtained for this regression was mostly left lateralized. Correlated activity was seen in IFG pars triangularis and opercularis, SMA, supramarginal gyrus, and inferior/middle temporal gyrus; the insula in both hemispheres was correlated as well.
The regression analysis for the unaided hearing impairment measure and the CVCV Audio-Only condition (analysis "c", Figure 3-19, Table 3-22) revealed a significant correlation in superior temporal cortical areas bilaterally; however, the correlated activity in the right hemisphere superior temporal region was considerably larger in both cluster size and effect size than in the left hemisphere.
This was also true for the correlation between the measure of unaided
hearing impairment and the CVCV Audio-Visual condition (analysis "e", Figure 3-20, Table
3-23). Besides the left and right superior temporal regions, activities in the right rolandic
opercular region were also found to be highly correlated with the unaided hearing threshold.
Surprisingly, there were no significant correlations between the unaided hearing impairment
measure and the CVCV Visual-Only condition (analysis "d").
On the other hand, the aided hearing threshold measure was found to be associated with
activity in the right middle temporal gyrus in the CVCV Audio-Only condition (analysis "f",
Table 3-24), reinforcing the finding of right hemisphere bias seen in previous analyses.
Other regions that were correlated in the CVCV Audio-Only condition with hearing
impairment measures include left SMA, right cerebellum (lobule VIII, X), right precuneus
and bilateral parahippocampal region.
When the aided hearing threshold measure was
correlated with the CVCV Visual-Only condition (analysis "g", Table 3-25), voxels in middle temporal gyrus, Broca's area and right cerebellum (crus 2, lobule VI) were also found to be significant.
Similar regions were found to be correlated in regressions
between the difference between unaided and aided hearing impairment and cortical activities
in the CVCV Visual-Only condition (analysis "j", Table 3-28). For the CVCV Audio-Visual
condition, activities in Broca's area, right superior temporal gyrus, right supramarginal gyrus, right cerebellum (lobule VIII) and right parahippocampal region were found to be correlated with aided hearing impairment measures (Table 3-26). The aided hearing threshold is a good approximation of how much acoustic information is available to our hearing aid user subjects on a daily basis. Based on these findings, the amount of acoustic signal an individual is exposed to seems to be directly related to the extent of engagement of left IFG (Broca's area) and right cerebellum in both visual-only and audio-visual speech perception.
The difference between unaided and aided hearing impairment was correlated with activity in
rolandic operculum and right IFG in the CVCV Audio-Only condition (analysis "i", Table 3-27). In a similar correlation analysis, in which the difference between unaided and aided SDT (or
SRT) was correlated with the cortical maps from the CVCV Audio-Only condition (analysis "l",
Table 3-30), the left IFG region was identified (i.e., Broca's area). Obtaining the SDT or
SRT requires that the subjects are able to detect or perceive speech, whereas the hearing
threshold only requires subjects to be able to detect the presence of simple sounds (tones).
This result concurs with a widely accepted notion of specialized role of left IFG in speech
processing.
Wearing hearing aids involves learning and adapting to the new sound
information, and the results obtained here lead to the inference that the more an
individual can learn to use a speech processing mechanism that involves activity in IFG, the
larger the benefit one gets from using hearing aids.
However, the difference in word
recognition test results for aided versus unaided conditions failed to show any correlation
with any brain region in the CVCV Audio-Only condition (analysis "o").
The word
recognition test requires more language-specific knowledge about English phonetics and
syllable structure, and our experimental condition (non-word CVCV) was probably not
well suited for identifying, through this correlation analysis, brain regions related to the
increase in word recognition rate.
Finally, the activity in right angular gyrus was correlated with the difference between unaided
and aided hearing impairment in the CVCV Audio-Visual condition (analysis "k", Table 3-29). This result also coincides with our effective connectivity analyses (presented in Chapter
4), and will be further discussed in later sections.
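For readers who want to replicate this type of analysis outside of SPM, the sketch below shows how a behavioral covariate (for example, a speechreading score or an unaided hearing-impairment percentage) can be regressed voxel-wise against subject-level contrast values, producing an uncorrected F map. The array shapes, names, and threshold are hypothetical, and the code mirrors a plain simple regression with an F test rather than the exact SPM pipeline used in this chapter.

import numpy as np
from scipy import stats

def voxelwise_simple_regression(contrasts, covariate):
    """Regress subject-level contrast values on one behavioral covariate.

    contrasts : (n_subjects, n_voxels) array of condition contrast values
    covariate : (n_subjects,) behavioral measure (e.g., % hearing impairment)
    Returns per-voxel F statistics and uncorrected p-values.
    """
    n, _ = contrasts.shape
    X = np.column_stack([np.ones(n), covariate])           # intercept + covariate
    beta, _, _, _ = np.linalg.lstsq(X, contrasts, rcond=None)
    fitted = X @ beta
    rss = ((contrasts - fitted) ** 2).sum(axis=0)           # residual sum of squares
    tss = ((contrasts - contrasts.mean(axis=0)) ** 2).sum(axis=0)
    df_model, df_resid = 1, n - 2
    F = ((tss - rss) / df_model) / (rss / df_resid)
    p = stats.f.sf(F, df_model, df_resid)
    return F, p

# Hypothetical usage: 12 HA subjects, a small voxel grid, uncorrected threshold
rng = np.random.default_rng(0)
contrasts = rng.normal(size=(12, 1000))
unaided_hi = rng.uniform(50, 135, size=12)
F, p = voxelwise_simple_regression(contrasts, unaided_hi)
significant = p < 0.001                                      # cf. the uncorrected thresholds used here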
Figure 3-18 HA group: Significantly correlated regions identified using the regression analysis for the
CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected]
effects of interest, F Lipreading CvcvV, Correction: none, F > 12.83
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-21 HA group: Significantly correlated regions identified using the regression analysis for the
CVCV Visual-Only condition and speechreading scores [F > 12.83, P < 0.005, uncorrected].
Figure 3-19 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FUnHICvcvA, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-22 HA group: active regions identified using the regression analysis for the CVCV Audio-
Only condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected]
Figure 3-20 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FUnHICvcvAV, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-23 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (unaided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FAiHICvcvA, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-24 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FAiHICvcvV, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-25 HA group: active regions identified using the regression analysis for the CVCV Visual-Only condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FAiHICvcvAV, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-26 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (aided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FDHICvcvA, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-27 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and percentage of hearing impairment (unaided - aided) [F > 21.04, P < 0.001,
uncorrected]
effects of interest, FDHICvcvV, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-28 HA group: active regions identified using the regression analysis for the CVCV Visual-Only condition and percentage of hearing impairment (unaided - aided) [F > 21.04, P < 0.001, uncorrected]
effects of interest, FDHICvcvAV, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-29 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and percentage of hearing impairment (unaided - aided) [F > 21.04, P < 0.001,
uncorrected]
effects of interest, FDiffSDTCvcvA, Correction: none, F > 21.04
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-30 HA group: active regions identified using the regression analysis for the CVCV Audio-Only condition and speech detection/reception threshold (unaided - aided) [F > 21.04, P < 0.001,
uncorrected]
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); the contrast header and individual cluster values are not recoverable from this layout.]
Table 3-31 HA group: active regions identified using the regression analysis for the CVCV Visual-Only condition and speech detection/reception threshold (unaided - aided) [F > 21.04, P < 0.001,
uncorrected]
effects of interest, FDiffSDT CvcvAV, Correction: none, F > 12.83
[Columns: AAL label | normalized effect | T | (p) | MNI location x, y, z (mm); individual cluster values not recoverable from this layout.]
Table 3-32 HA group: active regions identified using the regression analysis for the CVCV Audio-Visual condition and speech detection/reception threshold (unaided - aided) [F > 21.04, P < 0.001,
uncorrected]
4 Effective Connectivity Analyses
Several methods have been introduced for investigating interactions among brain regions.
The term "connectivity" can refer to one or more of the following: anatomical connectivity,
functional connectivity, and effective connectivity. Anatomical connectivity refers to direct
axonal projections from one cortical region to another, typically identified in
neuroanatomical studies involving non-human animals. Functional connectivity is defined as
"temporal correlations between spatially remote neurophysiological events" (Friston et al.,
1997), which is simply a statement about observed correlations and does not imply any
information about how these correlations are mediated.
An overview of the common methods used to investigate functional connectivity is briefly
discussed in the subsequent section 4.1. Although functional connectivity alone does not
provide any evidence of neural connectivity, it is often used as a first step in establishing the
possibility of neural connectivity between certain cortical regions. The results obtained from
one or more of these methods can then be used in combination with effective
connectivity to make better inferences about neural connectivity maps for auditory-visual
interactions for speech perception.
In contrast, effective connectivity is more directly associated with the notion of neural
connectivity and addresses the issue of "one neuronal system exerting influences over
another" (Friston et al., 1997). There are two main methods that are commonly used to
investigate effective connectivity.
They are: structural equation modeling (SEM) and
dynamic causal modeling (DCM). In SEM, the parameters of the model are estimated by
minimizing the difference between the observed covariances and the covariances implied by
the anatomical structural model. The DCM method takes on a new approach to assessing
effective connectivity. As with SEM, it attempts to measure how brain areas are coupled to
each other and how these couplings are changed for varying conditions of the experiment.
However, this procedure employs much more sophisticated mechanisms to incorporate
hemodynamic response modeling to reflect the neuronal activity in different regions, and the
transformation of these neuronal activities into a measured response.
To further investigate the cortical interactions involved in auditory-visual speech perception,
we performed SEM and DCM analyses on our fMRI data.
The effective connectivity
analyses were conducted on all subject groups to identify network patterns that underlie the
processing of visual speech.
Based on previous findings on functional specializations of
brain regions known to be associated with visual and auditory stimulus processing, along
with known anatomical connections in primates, a number of cortical regions were identified
and used to construct a plausible, yet simple anatomical model for our analyses.
Overviews of SEM and DCM are presented in Sections 4.2 and 4.3, respectively, along with
the results obtained from both analyses.
4.1 Functional Connectivity
Functional connectivity is generally tested by using one of the following methods:
"
Single Value Decomposition (Eigenimage Analysis),
e
Partial Least Squares,
" Multidimensional Scaling,
*
Cross-correlation Analysis,
*
Principal Component Analysis, (nonlinear PCA),
e
Independent Component Analysis,
*
Canonical Correlation Analysis.
As stated previously, functional connectivity is essentially a statement of observed
correlations. Here, a simple way of measuring the amount a particular pattern of activity
contributes to the measure of functional connectivity is introduced.
Let a row vector p represent a particular pattern of activity (over the entire brain) where each
element represents the value of each voxel, and let matrix M represent a collection of
scanned image data over some period of time. Each row in M is a collection of voxel values
of the brain at a specific time; hence successive rows represent increases in time and each
column represents the temporal changes of one voxel. Matrix M can be assumed to be
mean-corrected. Here, the contribution of pattern p to the covariance structure can be measured by
the squared 2-norm of M·p:

‖M·p‖² = p'·M'·M·p
In other words, the 2-norm above is the amount of functional connectivity that can be
accounted for by pattern p. If most of the temporal changes occur in pattern p, then the
correlation between the overall pattern of activity and p will have significant variance over
time. The 2-norm is a measurement of this temporal variance in spatial correlation between
the pattern of activity and the pattern defined by p.
This simple method of quantifying the functional connectivity can be used only when the
pattern p is specified.
One would have to specify the region of interest to employ this
method to measure the functional connectivity. In order to simply find the most prevalent
patterns of coherent activity, other methods need to be sought.
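To make the measure above concrete, here is a minimal numerical sketch, assuming a hypothetical mean-corrected data matrix M (time points × voxels) and a chosen pattern vector p; normalizing p to unit length is an added convention, not something specified in the text.

import numpy as np

def pattern_connectivity(M, p):
    """Squared 2-norm of M·p: the variance-covariance structure accounted for by pattern p.

    M : (n_timepoints, n_voxels) mean-corrected data matrix
    p : (n_voxels,) spatial pattern (normalized to unit length here)
    """
    p = p / np.linalg.norm(p)
    t = M @ p                       # time course of the pattern
    return float(t @ t)             # equals p' M' M p

# Hypothetical example: a random data matrix and a seed pattern over the first 10 voxels
rng = np.random.default_rng(1)
M = rng.normal(size=(200, 500))
M -= M.mean(axis=0)                 # mean-correct each voxel
p = np.zeros(500)
p[:10] = 1.0
print(pattern_connectivity(M, p))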
4.1.1 Partial Least Squares
The functional connectivity between two voxels can be extended to the functional
connectivity between two systems. The latter can be defined as the correlation or covariance
between their time-dependent activities. As shown above, the temporal activity of a pattern p
(or a system) is found by
t_p = M·p,
and the temporal activity of another pattern q is
t_q = M·q.
Therefore the correlation between the systems described by vectors p and q is:

ρ_pq = t_q'·t_p = q'·M'·M·p
The correlation measured above represents the functional connectivity between the systems
described by p and q. To determine the functional connectivity between two systems in
separate parts of the brain, for example, left and right hemispheres, the data matrix M will
not be the same for p and q. In this case, the above equation will become:
ρ_pq = t_q'·t_p = q'·M_q'·M_p·p

To find the patterns p and q which maximize ρ_pq, SVD can be used as before:

[U S V] = SVD{M_q'·M_p}
M_q'·M_p = U·S·V'
U'·M_q'·M_p·V = S
The first columns of matrices U and V are the eigenimages that depict the two systems with
the greatest amount of functional connectivity. This method is not appropriate for identifying
more than two regions that may have strong connectivity since it only identifies systems in
pairs.
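The SVD step above can be sketched as follows for two hypothetical mean-corrected data matrices M_p and M_q (for example, voxels from two separate parts of the brain); the first left and right singular vectors are the paired patterns with the greatest functional connectivity. The function and variable names are illustrative only.

import numpy as np

def paired_patterns(Mp, Mq):
    """Return the pair of spatial patterns (p, q) maximizing q' Mq' Mp p.

    Mp : (n_timepoints, n_voxels_p) mean-corrected data for one system
    Mq : (n_timepoints, n_voxels_q) mean-corrected data for the other system
    """
    C = Mq.T @ Mp                           # cross-covariance between the two systems
    U, S, Vt = np.linalg.svd(C, full_matrices=False)
    q, p = U[:, 0], Vt[0, :]                # first singular vectors = paired patterns
    return p, q, S[0]

# Hypothetical usage
rng = np.random.default_rng(2)
Mp = rng.normal(size=(200, 300)); Mp -= Mp.mean(axis=0)
Mq = rng.normal(size=(200, 250)); Mq -= Mq.mean(axis=0)
p, q, strength = paired_patterns(Mp, Mq)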
4.1.2 Eigenimage Analysis
To overcome the shortcoming described above, the concept of eigenimages can be used.
Eigenimages are most commonly obtained using singular value decomposition (SVD),
where SVD is a matrix operation that decomposes a matrix into two sets of orthogonal
vectors (i.e., two unitary orthogonal matrices) and a diagonal matrix with a leading diagonal of
decreasing singular values. So, using the same definition of M as above,

[U S V] = SVD{M}
M = U·S·V'
where U and V are unitary orthogonal matrices and S is a diagonal matrix with singular
values. Here, the singular value of each eigenimage is equivalent to the 2-norm of each
eigenimage. The columns of matrix V are the eigenimages, ordered by how much each
contributes to functional connectivity, whereas the column vectors of U are the time-dependent
profiles of each eigenimage. Since the SVD operation maximizes the largest
eigenvalue, the first eigenimage is the pattern that contributes most to the variance-covariance
structure. It should be noted that eigenimages associated with the functional
connectivity are simply the principal components of the time-series. Therefore, the SVD
operation is analogous to principal component analysis, or the Karhunen-Loeve expansion
used in identifying spatial modes.
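A short sketch of the eigenimage computation follows, assuming the same hypothetical mean-corrected matrix M as before; the share of the variance-covariance structure carried by the first eigenimage (compare the figure of more than 70% quoted just below) is the first squared singular value divided by the sum of squared singular values.

import numpy as np

def eigenimages(M, k=5):
    """SVD of the mean-corrected data matrix M = U S V'.

    Columns of V are the eigenimages (spatial modes); columns of U are their
    time-dependent profiles; singular values give each mode's contribution.
    """
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)      # fraction of variance-covariance per eigenimage
    return Vt[:k].T, U[:, :k], explained[:k]

rng = np.random.default_rng(3)
M = rng.normal(size=(200, 500)); M -= M.mean(axis=0)
V, U, explained = eigenimages(M)
print(explained[0])                          # e.g., a value above 0.7 would mirror the result below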
For example, the distribution of eigenvalues for the CVCV Audio-Visual condition for the
hearing subjects is shown in Figure 4-1 below. Here, the first eigenimage of the activation
pattern (Figure 4-2) accounts for more than 70% of the variance-covariance structure.
Figure 4-1 Distribution of eigenvalues (x-axis: eigenimage number; y-axis: eigenvalue).
Figure 4-2 The first eigenimage.
The first eigenimage (Figure 4-2) shows the temporally correlated regions that were activated
throughout the CVCV Audio-Visual condition.
Qualitatively speaking, the functional
connectivity map does show some connectivity from visual to auditory cortical areas
(primary and secondary cortices are NOT shown in the map).
In eigenimage analysis, the covariance matrix from a time-series of images is subjected to
singular value decomposition. The resulting eigenimages comprise regions that are
functionally connected to each other. While this is a useful and simple way of characterizing
distributed activations, it cannot be used to generate statistical confidence measures.
Furthermore, functional connectivity in fMRI data does not necessarily imply neural
connectivity. It may be best if used in conjunction with an effective connectivity analysis
such as structural equation modeling or dynamic causal modeling. With these methods,
confidence measures can be obtained.
4.1.3 Multidimensional Scaling
Multidimensional scaling (MDS) was first developed for analyzing perceptual studies. It is
often used to represent the overall structure of a system based on pairwise measures of
similarities, confusability or perceptual distances. Essentially, in MDS, the brain anatomy is
represented in a space where the distance between brain regions corresponds to the level of
functional connectivity; however this method lacks applicability to neuroimaging data since
it only measures similarities among a set of time series.
4.2 Structural Equation Modeling
4.2.1 Theory
Roughly speaking, SEM involves creation of possible connectivity models involving brain
regions that are active for a given task, then testing the goodness of fit of these models to see
if they can account for a significant amount of the experimental data. Here we use this
technique to investigate possible connections between cortical regions that are active during
processing of visual and audio-visual speech stimuli in both normal hearing and congenitally
deaf individuals.
SEM is a multivariate technique used to analyze the covariance of observations (McIntosh et
al., 1996). When applying SEM techniques, one also has to find a compromise between
model complexity, anatomical accuracy and interpretability since there are mathematical
constraints that impose limits on how complex the model can be. The first step in the analysis
is to define an anatomical model (constraining model), and the next step is to use the inter-regional covariances of activity to estimate the parameters of the model.
Figure 4-3 Example of a structural model.
Consider a simple example above in Figure 4-3 (from McIntosh and Gonzalez-Lima, 1994).
Here, A, B, C and D represent the brain areas and the arrows labeled v, w, x, y, and z represent
the anatomical connections. These together comprise the anatomical model for structural
equation modeling analyses. In most cases, the time-series for each region A, B, C and D are
extracted from the imaging data (fMRI data), and are normalized to zero mean and unit
variance.
Then the covariance matrices are computed on the basis of this time-series or
observations obtained from these regions.
The values for v, w, x, y, and z are calculated through a series of algebraic manipulations and
are known as the path coefficients. These path coefficients (or connection strengths) are the
parameters of the model, and these represent the estimates of effective connectivity.
Essentially, the parameters of the model are estimated by minimizing the difference between
the observed covariances and the covariances implied by the anatomical structural model.
Mathematically, the above model can be written as a set of structural equations as:
B = vA + ψ_B
C = xA + wB + ψ_C
D = yB + zC + ψ_D
For these equations, A, B, C and D are the known variables (measured covariances); v, w, x, y,
and z are the unknown variables. For each region, a separate ψ variable is included, and
these represent the residual influences. Simply stated, this variable can be interpreted as the
combined influences of areas outside the model and the influence of a brain region upon
itself (McIntosh and Gonzalez-Lima, 1992). The path coefficients are normally computed
using software packages such as AMOS, LISREL, and MX32. The starting values of the
estimates are obtained initially using two-stage least squares, and they are iteratively
modified using a maximum likelihood fit function (Joreskog and Sorbom, 1989).
Minimizing the differences between observed and implied covariances is usually done with
steepest-descent iterations.
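To make the fitting idea concrete, the sketch below estimates the path coefficients of the Figure 4-3 example (v, w, x, y, z plus residual variances) by minimizing the squared difference between the observed and model-implied covariance matrices. It uses a plain unweighted least-squares discrepancy and simulated data rather than the maximum-likelihood fit function employed by AMOS, so it illustrates the principle rather than the exact procedure used in this thesis.

import numpy as np
from scipy.optimize import minimize

# Variables ordered as [A, B, C, D]; arrows of the Figure 4-3 example:
# A->B (v), A->C (x), B->C (w), B->D (y), C->D (z).  The psi terms below are
# the residual variances that stand in for influences outside the model.

def implied_cov(params):
    v, x, w, y, z = params[:5]
    psi = np.exp(params[5:])                  # keep residual variances positive
    P = np.zeros((4, 4))                      # P[i, j] = path coefficient from variable j to i
    P[1, 0] = v
    P[2, 0], P[2, 1] = x, w
    P[3, 1], P[3, 2] = y, z
    inv = np.linalg.inv(np.eye(4) - P)
    return inv @ np.diag(psi) @ inv.T         # Sigma = (I - P)^-1 Psi (I - P)^-T

def fit_sem(observed_cov):
    """Estimate path coefficients by (unweighted) least squares on the covariances."""
    def loss(params):
        diff = implied_cov(params) - observed_cov
        return np.sum(diff ** 2)
    result = minimize(loss, np.zeros(9), method="BFGS")
    return result.x[:5], np.exp(result.x[5:])

# Hypothetical usage with a covariance matrix of normalized time-series from regions A-D
rng = np.random.default_rng(4)
data = rng.normal(size=(300, 4))
paths, residual_vars = fit_sem(np.cov(data, rowvar=False))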
The structural equation modeling technique differs from other statistical approaches such as
multiple regression or ANOVA where the regression coefficients are obtained from
minimizing the sum squared differences between the predicted and observed dependent
variables. In structural equation modeling, instead of considering individual observations (or
variables) as with other usual statistical approaches, the covariance structure is emphasized.
In the context of neural systems, the covariance measure corresponds to how much the neural
activities of two or more brain regions are related. Applying structural equation modeling
analysis to neuroimaging data has a particular advantage compared to applying it to
economics, social sciences or psychology datasets, since the connections (or pathways)
between the dependent variables (activity of brain areas) can be determined based on
anatomical knowledge and the activity can also be measured directly. With applications in
other fields, this is not always true: the models are sometimes hypothetical and cannot be
measured directly.
Goodness-of-fit Criteria
Typically in SEM, statistical inference is used to measure: (1) the goodness of the overall fit
of the model, i.e. how significantly different are the observed covariance structure and the
covariance structure implied by the anatomical model, and (2) the difference between
alternative models for modeling modulatory influence or experimental context by using the
nested or stacked model approach. For the purpose of assessing the overall fit of the model,
the χ² values relative to the degrees of freedom are most widely calculated. This is often
referred to as the chi-square test and is an absolute test of model fit. If the p-value associated
with the χ² value is below 0.05, the model is rejected in the absolute fit sense. Because the χ²
goodness-of-fit criterion is very sensitive to sample size and non-normality of the data, often
other descriptive measures of fit are used in addition to the absolute χ² test. When the
number of samples is greater than a few hundred, the χ² test has a high tendency to always
show statistically significant results, resulting in a rejected model. However, other descriptive
fit statistics can be used in conjunction with the absolute test to assess the overall fit and are
used to claim a model to be accepted even though the χ² fit index may argue otherwise. Since
the sample sizes in our analyses are too large for the χ² test, the χ² measure is not used as the
sole indicator of fit. A number of goodness-of-fit criteria have been formulated for SEM
analyses; commonly used criteria include goodness-of-fit index (GFI), and adjusted GFI
(AGFI).
The GFI is essentially 1 minus the ratio of the sum of the squared differences
between the observed and implied covariance matrices to the observed variances. The AGFI
is the adjusted version of GFI where the degrees of freedom of a model and the number of
unknown variables are taken into consideration for adjustment. Both GFI and AGFI values
fall between 0 and 1, where 0 represents no fit and 1 is a perfect fit (Hu and Bentler, 1999).
Usually a value above 0.90 is considered acceptable, and a good fit. Other measures of fit
used in this study are the root mean square residual (RMR) and the root mean square error of
approximation (RMSEA). RMR is the square root of the mean squared differences between
sample variances and covariances, and estimated variances and covariances, so a smaller
RMR value represents a better fit, and 0 represents a perfect fit. RMSEA incorporates the
parsimony criterion and is relatively independent of sample size and number of parameters.
A suggested rule of thumb for an RMSEA fit is that a value less than or equal to 0.06
indicates an adequate fit (Hu and Bentler, 1999).
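As a rough numerical companion to these definitions, the sketch below computes GFI, RMR, RMSEA, and the χ² statistic from an observed and a model-implied covariance matrix. It follows the verbal definitions given here (GFI as one minus a residual-to-observed ratio, RMR as the root of the mean squared residual, RMSEA from the χ² statistic and degrees of freedom), so it should be read as an approximation rather than the exact formulas implemented in AMOS.

import numpy as np

def fit_indices(S, Sigma, n_samples, df):
    """Descriptive fit measures from observed (S) and model-implied (Sigma) covariances."""
    p = S.shape[0]
    resid = S - Sigma
    gfi = 1.0 - np.sum(resid ** 2) / np.sum(S ** 2)
    tri = np.tril_indices(p)                        # unique (lower-triangular) elements
    rmr = np.sqrt(np.mean(resid[tri] ** 2))
    # Maximum-likelihood discrepancy and the associated chi-square statistic
    f_ml = (np.log(np.linalg.det(Sigma)) - np.log(np.linalg.det(S))
            + np.trace(S @ np.linalg.inv(Sigma)) - p)
    chi2 = (n_samples - 1) * f_ml
    rmsea = np.sqrt(max((chi2 - df) / (df * (n_samples - 1)), 0.0))
    return {"chi2": chi2, "GFI": gfi, "RMR": rmr, "RMSEA": rmsea}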
Model Interpretation - Path Coefficients
The connection strength (path coefficient) represents the response of the dependent variable
to a unit change in an explanatory variable when other variables in the model are held
constant (Bollen, 1989). The path coefficients of a structural equation model are similar to
correlation or regression coefficients and are interpreted as follows (McIntosh and Gonzalez-Lima, 1994):
" A positive coefficient means that a unit increase in the activity measure of one
structure leads to a direct increase in the activity measure of the structure it projects
to, proportional to the size of the coefficient.
*
A negative coefficient means that an increase in the activity measure in one structure
leads to a direct, proportional decrease in the activity measure of the structure it
projects to.
Differences in path coefficients for two different models (i.e., for two different conditions)
can be of two types.
112
e
A difference in the sign (without marked difference in absolute magnitude) reflects a
reversal in the interactions within that pathway, or a qualitative change across
conditions. In other words, the nature of the interaction between regions has changed.
" A difference in the absolute magnitude of the path coefficients (without sign change)
is interpreted as a change in the strength of the influences conveyed through that
pathway. The influence of a pathway or structure on the system is either increased or
decreased according to the difference of the magnitude.
" A difference in the sign with marked difference in absolute magnitude suggests that
there are discontinuities along these pathways; such path coefficients are more
difficult to interpret (McIntosh and Gonzalez-Lima, 1992).
Nested Model Comparison
Since SEM is inherently linear, it cannot directly model non-linear changes in connection
strength. However, to overcome this problem, two models can be constructed and these two
models can be compared to test for non-linear changes. This is known as the nested (or
stacked) model approach (Della-Maggiore et al., 2000; McIntosh, 1998). The first model
defined in this approach is the restricted null model, in which the path coefficients are forced
to be equal between all conditions and the second model is the corresponding alternate free
model, in which the path coefficients are allowed to change between different conditions or
subject groups. The χ² values are computed for both the null model and the alternate free
model with corresponding degrees of freedom.
If the χ² value for the null model is
statistically significantly larger than the alternate model, the null model is refuted and the
alternative model is assumed to provide a better fit. In other words, different conditions
within the free model are deemed to be significantly distinguishable in terms of their path
connectivity, and one can infer that there is a statistically significant global difference in path
coefficients between the conditions. The χ²diff statistic is evaluated with degrees of freedom equal
to the difference in the degrees of freedom of the null and free models.
If the χ²diff test for the null and free models is found to be statistically significant, one can also
use pair-wise parameter comparisons (Arbuckle and Wothke, 1999) to determine which pairs
of parameters are significantly different between the experimental conditions in the free
model. For the pair-wise parameter comparison test, critical ratios for differences between
two parameters in question are calculated by dividing the difference between the parameter
estimates by an estimate of the standard error of the difference.
Under appropriate
assumptions and with a correct model, the critical ratios follow a standard normal distribution.
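The nested-model comparison and the pair-wise parameter test reduce to two small computations, sketched below. The degrees of freedom in the usage line (6 for the unconstrained model and 21 for the constrained one) are inferred from the df = 15 comparison reported later and are therefore an assumption.

from scipy import stats

def chi_square_difference(chi2_null, df_null, chi2_free, df_free):
    """Nested-model test: is the constrained (null) model significantly worse?"""
    chi2_diff = chi2_null - chi2_free
    df_diff = df_null - df_free
    p = stats.chi2.sf(chi2_diff, df_diff)
    return chi2_diff, df_diff, p

def critical_ratio(est_a, est_b, se_diff):
    """Pair-wise parameter comparison: (difference of estimates) / (SE of the difference).

    Under the usual assumptions the ratio is approximately standard normal.
    """
    z = (est_a - est_b) / se_diff
    return z, 2 * stats.norm.sf(abs(z))

# Hypothetical usage mirroring the NH (Left) comparison reported in Section 4.2.3
print(chi_square_difference(27.232, 21, 3.237, 6))   # chi2_diff = 23.995, df = 15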
4.2.2 Methods
Anatomical Model
Based on previous findings on functional specializations of brain regions known to be
associated with visual and auditory stimulus processing, along with known anatomical
connections in primates, a number of cortical regions were identified and used to construct a
plausible, yet relatively simple anatomical model for our SEM analyses. Six cortical regions
and their hypothesized connections comprised the structural model constructed for the
effective connectivity analyses. Here the CoCoMac database of connectivity data on the macaque
brain (http://www.cocomac.org) was used extensively to search for
interconnectivity patterns reported in the literature.
We hypothesized that there are
projections from higher-order visual cortex (V) to the angular gyrus (AG), the inferoposterior
temporal area (IPT), and to higher-order auditory cortex, more specifically the posterior
superior temporal sulcus (STS). We further assumed projections from the IPT and the AG to
the STS.
The opercular region of the inferior frontal gyrus (IFG, BA 44) and the lateral
region of premotor cortex and the lip area of primary motor cortex (M) - brain areas that are
generally believed to play a role in auditory-visual speech perception and production - were
also included in our anatomical model and were assumed to have connectivity with AG, IPT,
and STS. To define an anatomical model that would best account for the underlying neural
circuitry during auditory-visual speech perception in all of the hearing and hearing impaired
groups, we searched through a set of permissible functional connection patterns which
included our conjectured connectivity mentioned above.
After sorting through global fit
measures for a set of connectivity patterns, we identified the following structural model
(depicted as a path diagram in Figure 4-4) to provide the best fit across all subjects and two
conditions.
Figure 4-4 Anatomical model for SEM analyses (V - Higher-order Visual Cortex, AG - Angular
Gyrus, IPT - Inferoposterior Temporal Lobe, STS - Superior Temporal Sulcus, IFG - Inferior
Frontal Gyrus, M - Lateral Premotor Cortex & Lip area on primary motor cortex)
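Because the path diagram itself is a figure, the sketch below records only the connections spelled out in the text above as a simple edge list; the direction of the IFG and M connections comes from the original figure and is left undirected here, and any additional arrows in Figure 4-4 are not represented.

# Regions of the SEM anatomical model (Figure 4-4), as named in the text.
REGIONS = ["V", "AG", "IPT", "STS", "IFG", "M"]

# Directed projections explicitly described in the text.
DIRECTED_EDGES = [
    ("V", "AG"), ("V", "IPT"), ("V", "STS"),
    ("IPT", "STS"), ("AG", "STS"),
]

# IFG and M are described only as having connectivity with AG, IPT, and STS; the
# direction of those paths is given by the original figure, so they are kept
# undirected in this sketch.
UNDIRECTED_EDGES = [
    ("IFG", "AG"), ("IFG", "IPT"), ("IFG", "STS"),
    ("M", "AG"), ("M", "IPT"), ("M", "STS"),
]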
Data Extraction and Model Fitting
Activities in the cortical regions of interest (ROIs) of the SEM path models were extracted
for all subjects for the CVCV Visual-Only and CVCV Audio-Visual conditions. For each
subject, local maxima were identified within each region based on functional maps for the
CVCV Visual-Only condition. The mni2tal algorithm (http://www.mrc-cbu.cam.ac.uk/Imaging)
was used to transform the MNI coordinates into Talairach
coordinates, and the Talairach Daemon client (http://ric.uthscsa.edu/TDinfo/) was used to
identify the corresponding atlas labels and Brodmann's areas.
BOLD signals in each ROI were extracted separately from the right and the left hemispheres
using the Marsbar toolbox (http://marsbar.sourceforge.net) for the CVCV Visual-Only and
CVCV Audio-Visual conditions. For each ROI of a single subject, the average signal was
extracted using the SPM scaling design and the mean value option from a spherical region
(r=5mm) centered at the peak activation coordinate. The extracted series from each temporal
block were normalized for each subject and signal outliers were removed, and the first scan
in each block (TR = 3s) was discarded to account for the delay in hemodynamic response.
These values were then concatenated across all subjects to create a single time-series for each
ROI and experimental condition, and lastly the covariance matrix was calculated by treating
these time-series as the measurements of the observed variables. The SEM analyses were
conducted in the AMOS 5 software (http://www.spss.com/amos/index.htm). Maximum
likelihood estimation was performed on path coefficients between observed variables,
thereby giving a measure of causal influence. The statistical significance of these parameter
estimates was also computed.
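The time-series preparation described above can be sketched as follows. The function operates on block-wise ROI averages that have already been extracted (for example, with the Marsbar toolbox); the ±3 SD outlier rule and the per-block normalization are assumptions, since the exact criteria are not given here.

import numpy as np

def prepare_roi_series(blocks, drop_first=1, outlier_sd=3.0):
    """Concatenate ROI time-series blocks for the SEM covariance step.

    blocks : list of 1-D arrays, one per temporal block (already averaged over the
             r = 5 mm spherical ROI for one subject and condition)
    """
    cleaned = []
    for b in blocks:
        b = np.asarray(b, dtype=float)[drop_first:]     # discard scan(s) at block onset
        b = (b - b.mean()) / b.std()                    # normalize (assumed per-block rule)
        b = b[np.abs(b) <= outlier_sd]                  # remove outlying samples (assumed rule)
        cleaned.append(b)
    return np.concatenate(cleaned)

# Hypothetical usage: one ROI, several blocks from one subject and condition
rng = np.random.default_rng(5)
series_sts = prepare_roi_series([rng.normal(size=20) for _ in range(6)])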
4.2.3 Results
Although our structural model did not, for the sake of tractability, include all of the cortical
regions reported to be active in our fMRI analyses, results from our SEM analyses provide
some insights into connectivity patterns between the constituent nodes of the speechreading circuitry.
The structural model in Figure 4-4 was analyzed separately for all subject groups, and also
separately for the left and the right hemispheres, resulting in six independent models - NH
(left hemisphere), NH (right hemisphere), CD (left hemisphere), CD (right hemisphere), HA
(left hemisphere), and HA (right hemisphere). In order to identify any path model connection
strength changes between the CVCV Visual-only and the CVCV Audio-Visual conditions
(i.e. to test for changes in the connection strengths between when the auditory speech
information is absent vs. available), multi-group analyses were conducted with the nested
models approach.
The null (constrained) model's parameters were restricted to be equal
between the two conditions, whereas the free (unconstrained) model's parameters were
allowed to be different for the two separate conditions.
Several indices of goodness-of-fit, as discussed in Section 4.2.1, for the six nested models
are listed in Table 4-1 along with their χ² statistics for model comparisons. The goodness-of-fit
indices indicate that the anatomical model (Figure 4-4) adequately fits the experimental
data for all subject groups and for both hemispheres, especially when the models were
unconstrained. This implies that our anatomical model suitably represents a network of
cortical regions that may underlie audio-visual speech processing for all subject groups,
while being sensitive to the changes in the availability of auditory speech. The χ² fit index for
the NH (Right) model suggested that the absolute fit may not be acceptable (χ²(6) = 12.771, P
= 0.047), as its p-value is near the borderline cut-off point of P = 0.05, but, as stated previously,
the other descriptive fit statistics (RMR = 0.013, GFI = 0.998, AGFI = 0.986, RMSEA = 0.024)
reflect a good overall fit, hence this model was not rejected in our analyses.
The stability index (Bentler and Freeman, 1983; Fox, 1980) was also calculated for each
model since our path model includes a nonrecursive subset of regions: AG, IPT, STS, IFG,
and M. As listed in Table 4-1, all NH, CD and HA right hemisphere models' estimates were
found to be well below one and thus stable. However, the CD (Left) and HA (Left) models'
stability indices were greater than one (CD (Left) STI = 2.387, 2.387, 1.131; HA (Left) STI =
2.109, 2.109, 1.540). If the stability index value is greater than or equal to one for any of the
nonrecursive subsets of a path model, the parameter estimates are known to yield an unstable
system, producing results that are particularly difficult to interpret. Therefore, we decided
not to present the parameter estimates from the CD (Left) and HA (Left) models. All nested
models except for the NH (Left) model (χ²diff = 23.995, df = 15, P = 0.065) showed
statistically significant differences across unconstrained and constrained models. Since the
NH (Left) model did not satisfy the conventional level of significance P < 0.05, its path
coefficients should be interpreted with some caution.
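For reference, here is a hedged sketch of one way a stability index of this kind can be computed: take the matrix of estimated path coefficients among the regions in the nonrecursive subset and report the largest eigenvalue modulus, with values of one or more indicating an unstable system. The exact computation used by AMOS (following Bentler and Freeman, 1983) may differ in detail, and the coefficient values below are purely illustrative.

import numpy as np

def stability_index(B):
    """Largest eigenvalue modulus of the coefficient matrix among the nonrecursive regions.

    B[i, j] is the estimated path coefficient from region j to region i, restricted to the
    nonrecursive subset (here AG, IPT, STS, IFG, M).  Values >= 1 indicate instability.
    """
    return float(np.max(np.abs(np.linalg.eigvals(B))))

# Hypothetical 5x5 coefficient matrix over [AG, IPT, STS, IFG, M] with one feedback loop
B = np.zeros((5, 5))
B[2, 0], B[2, 1] = 0.45, 0.11     # AG->STS, IPT->STS (illustrative values)
B[3, 2], B[2, 3] = 0.30, 0.20     # STS->IFG and IFG->STS form a nonrecursive loop
B[4, 3] = 0.25                    # IFG->M
print(stability_index(B))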
Model | Form | χ² | P | RMR | GFI | AGFI | RMSEA | Stability index | χ²diff (df = 15) | P
NH (Left) | Unconstrained | 3.237 | .779 | .006 | .999 | .996 | .000 | .982, .982, .977 | 23.995 | .065
NH (Left) | Constrained | 27.232 | .163 | .024 | .996 | .991 | .012 | | |
NH (Right) | Unconstrained | 12.771 | .047 | .013 | .998 | .986 | .024 | .157, .157, .183 | 28.039 | .021
NH (Right) | Constrained | 40.810 | .006 | .023 | .993 | .987 | .022 | | |
CD (Left) | Unconstrained | 8.638 | .195 | .011 | .999 | .990 | .015 | 2.387, 2.387, 1.141 | 28.343 | .020
CD (Left) | Constrained | 36.981 | .017 | .020 | .994 | .988 | .019 | | |
CD (Right) | Unconstrained | 5.982 | .425 | .016 | .999 | .993 | .000 | .328, .328, 1.042 | 34.658 | .003
CD (Right) | Constrained | 40.641 | .006 | .020 | .994 | .987 | .021 | | |
HA (Left) | Unconstrained | 29.615 | .100 | .055 | .984 | .890 | .088 | 2.109, 2.109, 1.540 | 22.307 | .010
HA (Left) | Constrained | 38.932 | .010 | .084 | .992 | .988 | .019 | | |
HA (Right) | Unconstrained | 7.757 | .256 | .011 | .999 | .991 | .002 | .122, .122, .642 | 36.952 | .001
HA (Right) | Constrained | 30.242 | .103 | .022 | .993 | .981 | .011 | | |
Table 4-1 Goodness-of-fit and stability indices of SEM models for the NH, CD and HA groups: both
null (constrained) and free (unconstrained) models for each hemisphere [P < 0.05 for model
comparison (last column) represents a significant difference between the constrained and
unconstrained models].
Tables A-1, A-2 and A-3 (in Appendix A) list the estimated path coefficients that minimize
the difference between observed and model coefficients for the NH (Left and Right), CD
(Right) and HA (Right) models along with their corresponding standard errors, and p-values
for both the CVCV Visual-Only and CVCV Audio-Visual conditions. Here, the estimated
path coefficients represent the strength of connections or the strength of the influence
conveyed through that pathway. The last two columns of the table list critical ratios for pairwise parameter differences between the two experimental tasks and their levels of
significance. The estimated pathway connection strengths are also summarized graphically
in Figures 4-5, 4-6, 4-7 and 4-8 for the NH (Left), NH (Right), CD (Right) and HA (Right)
models respectively.
In these figures, the black text color represents the estimated path
coefficients for the CVCV Visual-Only condition, whereas the blue text color represents the
estimates for the CVCV Audio-Visual condition. The thicker arrows are used to represent
pathways with statistically significant pair-wise differences in connection strengths across the
two experimental tasks. The thicker black arrows are connections that increased in strength
for the CVCV Visual-Only condition (also can be interpreted as connections that decreased
in strength for the CVCV Audio-Visual condition), and the thicker blue arrows are
connections that increased in strength for the CVCV Audio-Visual condition (or decreased in
strength for the CVCV Visual-Only condition).
Figure 4-5 NH (left): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV
Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV
Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the
CVCV Audio-Visual condition].
Figure 4-6 NH (right): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV
Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV
Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the
CVCV Audio-Visual condition].
Figure 4-7 CD (right): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV
Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV
Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the
CVCV Audio-Visual condition].
Figure 4-8 HA (right): estimated path coefficients [black text: CVCV Visual-Only, blue text: CVCV
Audio-Visual; thicker black arrows: connections with significant increase in strength for the CVCV
Visual-Only condition, thicker blue arrows: connections with significant increase in strength for the
CVCV Audio-Visual condition].
4.2.3.1 Superior Temporal Sulcus as AV Speech Integration Site
Overall, the anatomical model (Figure 4-4) employed in effective connectivity analyses was
found sufficient to model the network underlying visual speech perception for all subject
groups (see Table 4-1 for goodness-of-fit indices). Results obtained from SEM and DCM
analyses (see Section 4.3 for DCM results) applied to our anatomical model provide strong
evidence for the possibility that the posterior part of the superior temporal sulcus (STS) serves as
the multisensory integration site for AV speech.
This claim is also supported by the
activation pattern obtained from the CVCV Visual-Only condition, where visual speech alone
was found to activate only the posterior tip of STS/G in the NH group.
The auditory cortical areas reside on the superior temporal cortex, comprising the core areas and
the surrounding belt and parabelt areas. The auditory signal is known to begin its first stage of
processing in the core areas, located in the transverse gyrus of Heschl on the dorsal surface of the
temporal lobe, in the planum temporale. Studies in humans have also shown that belt and
parabelt areas on superior temporal gyrus are specialized for processing more complex
aspects of auditory stimuli (Belin et al., 1998; Kaas and Hackett, 2000; Rauschecker, 1997;
Rauschecker et al., 1995; Wessinger et al., 2001; Zatorre and Belin, 2001; Zatorre et al.,
2002a; Zatorre et al., 2002b). Evidence from fMRI studies of visual object processing have
shown that different categories of visual objects activate different regions of visual
association cortex in occipital and temporal lobes (Beauchamp et al., 2002; Beauchamp et al.,
2003; Kanwisher et al., 1997; Levy et al., 2001; Puce et al., 1996). The ventral part of temporal
cortex is known to respond to the form, color, and texture of objects, while lateral temporal
cortex responds to the motion of objects (Beauchamp et al., 2002; Puce et al., 1998).
Since auditory-visual speech is considered more complex and also entails motion processing,
an anatomically well situated site for auditory-visual integration of speech would be
somewhere between auditory association cortex in the superior temporal gyrus and visual
association cortex in posterior lateral temporal cortex. Such a site lies near the superior
temporal sulcus, suggesting this region as a plausible AV speech integration site. Desimone
& Gross (1979) and Bruce et al. (1981) recorded from single neurons in the upper bank and
fundus of the anterior portion of the STS in macaque monkeys, an area known as the superior
temporal polysensory area (STP), and found that some units responded to all of visual,
auditory, and somatosensory stimuli.
The STP was also found to have cytoarchitecture
distinguishable from its surrounding cortex, and to receive thalamic inputs unlike the surrounding
cortex. Following these studies, multisensory neurons in STP have been repeatedly identified
and demonstrated to respond to visual, auditory, and somatosensory stimuli by a
number of other investigators (Baylis et al., 1987; Mistlin and Perrett, 1990). In fMRI and
PET studies, the STS/G were found to respond to: auditory stimuli (Binder et al., 2000; Scott
et al., 2000; Wise et al., 2001), visual stimuli consisting of lipreading information (Calvert et
al., 1997; Calvert and Campbell, 2003; Campbell et al., 2001; MacSweeney et al., 2000;
MacSweeney et al., 2001; Olson et al., 2002), and audiovisual speech (Callan et al., 2001;
Calvert et al., 2000; Mottonen et al., 2002; Sams et al., 1991).
Moreover as mentioned above, the location of STS is anatomically well suited to be the
convergence zone for auditory and visual speech. The STP region in monkeys is shown to
have direct anatomical connections to a number of cortical areas; it receives inputs from
higher-order visual areas, including posterior parietal (Seltzer and Pandya, 1994) and
inferotemporal cortices (Saleem et al., 2000).
Connections from other areas to the STP
region also include reciprocal connections to secondary auditory areas of STG (Seltzer and
Pandya, 1991), premotor cortex (Deacon, 1992), Broca's area in human (Catani et al., 2005),
dorsolateral and ventrolateral prefrontal cortex (Petrides and Pandya, 2002), and somewhat
less direct connections from intraparietal sulcus (Kaas and Collins, 2004).
Inputs from
auditory cortical areas to the STP regions include input from the auditory belt (Morel et al.,
1993) and from the auditory parabelt (Hackett et al., 1998; Seltzer and Pandya, 1994). There
is also a possible reciprocal connection from STS to primary visual area as well.
Although many previous studies have proposed the STS region as the multisensory integration
site for audiovisual speech perception, some studies failed to support this claim. Olson et al.
(2002) demonstrated that the STS/G region did not show any difference in activation between
synchronous and asynchronous audiovisual speech, suggesting that STS may not be the
multisensory site where audio and visual components of speech are integrated. Instead of the
STS, they suggested the claustrum as a possible multisensory integration site. In an fMRI
study of the McGurk effect (Jones and Callan, 2003), the STS/G region did not discriminate
between congruent audiovisual stimuli (/aba/) and incongruent audiovisual stimuli (audio
/aba/ + visual /ava/), as one might expect to see in a multisensory integration site. Calvert et
al. (1999) also failed to show a greater activation in the STS for audiovisual speech over
auditory-only speech, although the auditory cortex showed a greater activation. However,
Callan et al. (2004) argued that the differences between studies that support the STS as a
polysensory integration site and those that do not may lie in the nature of the stimuli used or in the
presence of background noise. In their study, a spatial wavelet filter was applied to visual
speech stimuli to isolate activity enhancements due to visual speech information for place of
articulation, filtering out the activity resulting from processing gross visual motion of the
head, lip, and the jaw. This was done as an attempt to ensure that superadditive activation of
the STS is not due to greater attention (Jancke et al., 1999), but actually reflects multisensory
integration of visual speech information.
In any case, the results from this study also
supported the claim that STS is the place of AV speech convergence.
The authors also
suggested that different crossmodal networks may be involved according to the nature
(speech vs. non-speech) and modality of the sensory inputs.
A stimulus involving two
different modalities presented simultaneously at the same location has been shown to be able
to modulate the activations to the corresponding separately presented unimodal stimuli in
sensory specific cortical regions (Calvert et al., 1999; Calvert et al., 2000; Foxe et al., 2000;
Giard and Peronnet, 1999; Sams et al., 1991). However, the nature of the sensory inputs is
thought to determine what functional networks may be involved. For example, Calvert et al.
(2000) reported enhanced activity in the left STS when subjects saw and heard a speaker
reading a passage compared to audio-only, visual-only, and mismatched audiovisual
conditions and identified a cluster of voxels in the left STS as a multisensory integration site.
On the other hand, when non-speech stimuli were used, the superior colliculus was found to
display properties of overadditive and underadditive responses to congruent and incongruent
audiovisual non-speech stimuli, respectively (Calvert et al., 2001).
In keeping with the view that the STS region is the auditory-visual speech
convergence site, our fMRI data collected while subjects were attending to the Visual-Only
and the Auditory-Visual speech conditions displayed significant activation in the superior
temporal cortex for both conditions, although only the posterior portion was active for
the VO condition. The activation levels near the STS region were distinctly higher than in other
known polysensory regions.
Studies have also shown that regions of human STS respond more strongly to biological stimuli,
such as faces, animals, or human bodies (Haxby et al., 1999; Kanwisher et al., 1997; Puce et al., 1995),
than to non-biological stimuli, while the middle temporal gyrus shows strong responses to
(non-biological) objects (Beauchamp et al., 2002; Chao and Martin, 1999). Since our visual stimuli
consisted of the lower half of a speaker's face, and since visual speech is purely biological,
these findings further support the view that the STS is actively involved in audiovisual speech
integration. Our anatomical model for the effective connectivity analyses assumed the posterior
superior temporal sulcus region to be the main site of auditory-visual speech convergence. The fit
indices of our model (Section 4.2) also attest that having STS as the focal point of convergence
was a reasonable hypothesis.
4.2.3.2 Visual-Temporal-Parietal Interactions
The NH Group
The path model for normally hearing subjects for the speechreading task displayed strong
positive connections from visual cortex to STS, both directly (V→STS) and indirectly
through AG (V→AG→STS). The alternate pathway V→IPT→STS seemed to be less
prominent than V→AG→STS; the path coefficients for IPT→STS (0.109 for the right
hemisphere, -0.083 for the left hemisphere) are far less than the connection strengths of
AG→STS (0.456 for the right hemisphere, 0.761 for the left hemisphere).
In addition to examining the static patterns of connectivity for the speechreading network, we
also investigated changes in connectivity patterns associated with the availability of auditory
speech input. By studying such changes, if there are any, one can determine which regions
and connections are more important for the visual aspect of speech processing. The direct
connection from the visual area to the secondary auditory cortex (V→STS) increased in
strength when auditory information was absent. Although this difference between the tasks is
not statistically significant according to the pair-wise comparison test (P = 0.07, Z = -1.783),
it can be deemed acceptable with a less stringent criterion. The left hemisphere of the
hearing subjects displayed a more definite change in this particular pathway: the V→STS
pathway was clearly more active for the VO condition (0.442) than for the AV
condition (0.121) in normally hearing subjects, with P < 0.05. So when auditory information
is no longer available for speech perception, the direct connection from the visual area to STS may
be recruited to facilitate visual speech processing.
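The pair-wise comparison statistic used here (the critical ratio for a difference between two path coefficients) can be illustrated with a short Python sketch. This simplified version assumes the two estimates are independent, which the actual SEM software does not require, and the coefficients and standard errors below are invented for illustration rather than taken from our models.

```python
from math import sqrt
from scipy.stats import norm

# Sketch of the "critical ratio for differences": a Z statistic comparing
# the same path coefficient estimated under two conditions (e.g., VO vs. AV).
# Assumes independent estimates; values below are illustrative only.
def critical_ratio(b_vo, se_vo, b_av, se_av):
    z = (b_av - b_vo) / sqrt(se_vo**2 + se_av**2)
    p_two_sided = 2 * norm.sf(abs(z))
    return z, p_two_sided

print(critical_ratio(0.45, 0.12, 0.15, 0.10))  # e.g., z ~ -1.9, p ~ 0.055
```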
Continuing with exploring the left hemisphere of normally hearing subjects, statistics reveal
that V→IPT→STS becomes much more active in the AV condition. This trend was also
consistent in the right hemisphere, but the pair-wise comparison test did not yield statistically
significant differences between the two conditions for the V→IPT or IPT→STS pathways.
The CD and HA Groups
Unlike normally hearing subjects, the connection weight from V to STS for our congenitally
deaf subjects was -0.101, implying that this pathway is most likely not engaged when deaf
subjects are speechreading. Based on these results, the deaf subjects appear to use more
highly processed visual speech information rather than lower-level visual input. Also, the
difference between the V→IPT→STS and V→AG→STS path coefficients is considerably
smaller in our deaf subjects, suggesting that the pathway through IPT may be just as critical
as the pathway through AG in these subjects.
In comparison to the NH (Right) model, the CD (Right) model exhibited different patterns of
connectivity changes for conditions when auditory information is deprived versus available.
Most notably, the pathways involving AG, more specifically V→AG, IPT→AG, STS→AG
and AG→M, exhibited significant changes in their connection strengths between the two
experimental tasks. Both the V→AG and IPT→AG pathways displayed increased path
coefficients in the Visual-Only condition compared to the Audio-Visual condition. The
STS→AG pathway seemed to have been inhibited more when the auditory speech is absent
than when it is available (the connection strength was -0.249 for the Visual-Only condition
and -0.067 for the Audio-Visual condition). In contrast to what we observed in the NH group,
the V→STS connections were weak for both conditions: -0.101 for the Visual-Only condition
and 0.090 for the Audio-Visual task. Finally, it should be noted that although our deaf
subjects are profoundly deaf by clinical definition, their residual hearing appears to be
utilized in differentiating the two experimental tasks; this was more evident in our SEM
analysis than in the standard fMRI analyses.
The HA (Right) model behaved very similarly to the CD (Right) model; the changes in
connection strengths were mostly in the same direction as in the CD model. However, some
pairwise comparisons showed more statistically significant changes in connectivity patterns
between the VO and AV conditions.
Discussions
Results from our effective connectivity analyses provide strong evidence for a crucial role of
angular gyrus in visual speech processing in all subject groups. However, in hearing impaired
subjects, the V→IPT→STS pathway seems to be just as critical as the pathway through
angular gyrus.
The functional role of IPT is generally agreed to be associated with color information
processing and with face, word and number recognition. Studies also report that the IPT region
is strongly activated during face processing and is responsive to gurning stimuli in both
hearing and congenitally deaf individuals (MacSweeney et al., 2002a). Also, the visual
cortex, particularly BA 19, is known to be well connected with IPT. Hence, the involvement
of IPT in auditory-visual speech perception was expected; however, it was not anticipated
that the V→IPT→STS pathway would be significantly less active in the VO condition than in
the AV condition. In sum, these coefficient estimates support the view that the projection
pathway from the visual area to the angular gyrus is the primary path of visual speech
processing in normally hearing individuals.
As mentioned previously, angular gyrus is associated with an array of functional roles from
sound perception, touch, memory, speech processing, visual processing and language
comprehension to out-of-body experiences. From our study, the connectivity coefficients of
the path models tested make it evident that angular gyrus is involved in auditory-visual speech
processing as well. This was particularly the case for the hearing impaired individuals. The
right angular gyrus was shown to be active in the CVCV Visual-Only condition and it was
also shown that only "good" speechreaders in the deaf group yielded significant activation in
right angular gyrus. However, the regression analysis of speechreading scores with neural
activities of the deaf group during speechreading did not implicate the right angular gyrus.
On the other hand, in hearing subjects, no activity was observed in angular gyrus for either
the VO or AV condition, but it was shown to be active when subjects were categorized as good
and bad speechreaders (with a fixed effects analysis), with higher activation levels in good
speechreaders than in bad speechreaders (Section 5.1.1). Also, according to our
effective connectivity analyses, pathways (in both directions) between angular gyrus and
other nodes on the network were consistently found to be sensitive to experimental context
for all subject groups. Although it is unclear what the exact functional role of angular gyrus
might be, it seems to serve a critical role in visual speech perception when auditory
information is absent for hearing subjects, and when auditory information is available for
hearing impaired subjects. These findings may be indicating that angular gyrus is recruited
when information from a less familiar modality needs to be mapped onto a more familiar
modality.
It is also generally believed that multimodal integration follows the hierarchical processing
stream where sensory inputs enter their corresponding unimodal cortical areas and only after
unimodal sensory processing in primary cortical areas, multisensory processing occurs in
some multisensory region(s). This traditional view of hierarchical processing is challenged
by results from recent studies which suggest that multimodal integration might even occur
early in primary cortical areas (Fort et al., 2002; Giard and Peronnet, 1999). If subadditive
responses are considered not to be a requirement for a multisensory region as in Calvert et al.
(2001), auditory and visual cortices may also be potential polysensory sites, as they were
found to show superadditivity in audiovisual speech perception. Recent tracing studies have
revealed previously unknown direct feedback connections from the primary and secondary
auditory regions to peripheral regions of primary and secondary visual cortex (Falchier et al.,
2002; Rockland and Ojima, 2003). These connections are known to be sparsely projected to
the uppermost layer in primary visual cortex (V1), but they are much more densely projected
to both the upper and lower layers of secondary visual cortex (V2) (Rockland and Ojima,
2003). The connections between auditory and visual cortices are not known to be reciprocal,
but this may possibly be the case.
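For clarity, the superadditivity criterion referred to above can be stated as a simple inequality: a voxel (or region) is a candidate multisensory site if its audiovisual response exceeds the sum of its unimodal responses. The beta estimates in this sketch are invented for illustration and are not values from this study.

```python
import numpy as np

# Superadditivity check: AV response greater than the sum of the A-only
# and V-only responses (values are invented illustrations).
beta_av = np.array([1.8, 0.9])
beta_a  = np.array([0.7, 0.6])
beta_v  = np.array([0.6, 0.5])
superadditive = beta_av > (beta_a + beta_v)
print(superadditive)  # [ True False]
```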
Somewhat in keeping with these recent findings, we also found that anatomical models that
included a direct V to STS connection provided the best fit measures for the structural equation
modeling analyses, and in our hearing group it was shown that the direct pathway from V to STS
was strengthened when visual-only speech was perceived. Although visual speech
perception does not involve actual integration of audiovisual speech per se, these known
anatomical connections between V2 and auditory cortices may somehow be utilized in
mapping visual speech onto an auditory percept.
Although not included in our anatomical model, another region of interest in parietal cortex is
the cortex of the intraparietal sulcus (IPS), a
very plausible auditory-visual speech
convergence site since it is a known polysensory region and is anatomically connected to
both visual and auditory cortical regions. The cortex of IPS, more specifically the posterior
third of the lateral bank of the IPS, is known to receive direct inputs from a number of visual
areas, including V2, V3, V3a, MT, MST, V4, and IT (Beck and Kaas, 1999; Blatt et al.,
1990; Nakamura et al., 2001). The auditory inputs to the IPS seem to be less direct and
dense, but nonetheless they project from the dorsolateral auditory belt and parabelt (Hackett
et al., 1998). In agreement with anatomical and physiological evidence, functional data from
the current study also show activity in this region during auditory-visual speech perception.
4.2.3.3 Fronto-Temporal Interactions
The premotor/motor area (M) and inferior frontal gyrus (IFG) are two components of the
speech motor network that were included in our anatomical model. In addition to the fMRI
analyses results (Chapter 3), our SEM results further support the idea that motor-articulatory
strategies are employed in visual speech perception, as suggested in previous studies (Ojanen
et al., 2005; Paulesu et al., 2003).
Overall, it is evident that M and IFG are strongly influenced by the activity of STS, while M
had little effect on STS; IFG tended to have a strong negative effect on STS in all subjects
during the VO conditions.
For the NH (Right) model, the path coefficients clearly indicate that STS→M has enhanced
connectivity strength for the Visual-Only task compared to the Audio-Visual task. Along the
same line, the opposite pathway (i.e., M→STS) had a negative connection during the
Visual-Only condition, while a strong positive connection was present in the Audio-Visual
condition. Another connection of interest is IFG→STS, which showed a decreased negative
connection strength for the Visual-Only condition in comparison to the Audio-Visual condition.
One clearly distinguishable pattern change between the VO and AV tasks across the hemispheres is
the interaction between STS and M (both STS→M and M→STS), and IFG. In particular, the
left IFG (Broca's area) is very strongly influenced by STS in both the VO and AV conditions
(0.943 and 0.924, respectively), and the reciprocal connection IFG→STS had strong negative
interactions (-1.021 for VO, -1.003 for AV); these parameter estimates are significantly
higher in magnitude compared to other connections in the network.
These results generally agree with the view that a listener's speech mirror neuron system
recruits speech motor regions to simulate the articulatory movements of the speaker during
visual speech perception, and uses this simulation to facilitate perception when auditory information is
absent and gestural information is available. Broca's area is commonly known to be
activated during speech production (Friederici et al., 2000; Grafton et al., 1997; Huang et al.,
2002), but results from speech production studies (Huang et al., 2002; Wise et al., 2001)
seem to support the notion that Broca's area is not explicitly involved in controlling
articulation, since activation is associated not just with oral speech, but also with production of
sign language (Corina, 1999) and with mere observation and imitation of meaningful goal-directed movements (Grezes et al., 1999; Koski et al., 2002). It may be that Broca's
area's role is not just limited to speech production, but rather encompasses general
mechanisms for multimodal perception and action. In support of this, results from the current
SEM analyses provide evidence for connectivity between STS and Broca's area, which
seems to be facilitating auditory-visual speech integration.
The STS→M pathway in the right hemisphere showed a bias towards the Visual-Only condition,
whereas the converse was true for the connection in the left hemisphere, in which the connection
was stronger for the Audio-Visual condition. The M→STS pathway connections were
negative for both hemispheres in the Visual-Only condition, but displayed a strong positive
interaction from M to STS in the Audio-Visual condition.
4.2.3.4 Network Differences between NH and CD Groups
To systematically investigate whether significant network differences between two of our subject
groups (the hearing and the congenitally deaf subject groups) existed for our anatomical model,
another multi-group analysis was performed for the following four different sets of nested
models: CVCV Visual-Only (Left), CVCV Visual-Only (Right), CVCV Audio-Visual (Left),
and CVCV Audio-Visual (Right). The null model for this analysis restricted the parameter
estimates to be equal for both the CD and the NH subject groups, and the alternative free
model allowed parameters to take on different values for each subject group.
In this
particular multi-group analysis, 'multi-group' consisted of data from two different subject
groups whereas in the analyses described in the previous section, 'multi-group' represented
data from two distinct experimental tasks. So, essentially with the same data, but with
different partitions of the data, we were able to analyze another set of model comparisons. The same
array of fit indices and statistics was used as previously; these measures and indices are
listed in Table 4-2.
Using identical criteria for evaluating the overall fit and model
comparisons, only the AV (Right) model was found to satisfy all the requirements to be
properly interpreted; its estimated parameters are listed in Table A-4 and summarized by the
path diagrams in Figures 4-9 and 4-10. Although the comparison between the subject groups for
the VO (Right) model did not meet the conventional significance level of 0.05 (P = 0.067),
since the other indices were well within the threshold values, we decided to include the results
from this model as well.
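The nested-model comparison reported in Table 4-2 is a chi-square difference test: the constrained (null) model forces the CD and NH parameters to be equal, and the unconstrained (free) model lets them differ. A minimal Python sketch is given below; the input chi-square values are illustrative placeholders, and only the degrees of freedom (15) follows Table 4-2.

```python
from scipy.stats import chi2

# Chi-square difference test between a constrained (null) and an
# unconstrained (free) SEM; inputs are placeholder values.
def chi_square_difference(chi2_constrained, chi2_unconstrained, df_diff=15):
    diff = chi2_constrained - chi2_unconstrained
    p_value = chi2.sf(diff, df_diff)
    return diff, p_value

print(chi_square_difference(32.0, 8.0))  # e.g., (24.0, p ~ 0.065)
```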
In Figures 4-9 and 4-10, the thicker black arrows represent
connections that increased in strength for the hearing subjects (interpretable as connections
that decreased in strength for the deaf subjects), and the thicker blue arrows indicate
connections that increased in strength for the deaf subjects (or decreased in strength for the
hearing subjects). The black text represents estimates for the NH group, and the blue text is
used for the CD group.
By examining the results from the pair-wise comparison tests (Table A-4 in Appendix A), and
as shown in Figures 4-9 and 4-10, the following pathways were found to differ across
the subject groups: V→STS, V→IPT, STS→IPT, STS→AG, AG→M and IFG→M. In
particular, the V→STS and STS→AG pathways were stronger for NH subjects during the VO
condition (Figure 4-9), but the V→STS pathway was less inhibited for CD subjects in the AV
condition (Figure 4-10). Based on these results, it can be conjectured that the pathways from
V to STS and then to angular gyrus play more prominent roles in normally hearing
individuals during visual-only speech perception than in CD subjects. The V→STS pathway
also seems to be less involved in the auditory-visual speech integration task for NH subjects than
for CD subjects. Stronger connectivity between IFG and M was also observed in hearing
subjects, suggesting that a stronger interaction exists between these two regions in NH
subjects. However, the AG→M pathway strength for deaf subjects (0.326) was almost twice
that of hearing subjects (0.167), and this pattern was also consistent in the Visual-Only
(Right) model.
In this particular analysis, the significant changes in connection strengths can be considered
to support more of a physiological or anatomical explanation than in the previous analysis,
since the deaf individuals likely have different anatomical connectivity patterns due to a lack of
acoustic sound exposure. So, here the differences in network patterns may reflect anatomical
differences between the two subject groups, whereas in the previous SEM analyses, the differences
in connectivity patterns represented more of a change in the interactions between cortical
regions for certain tasks.
[Table 4-2 body not reliably recoverable from the extracted text. For each of the four nested-model comparisons (VO Left, VO Right, AV Left, AV Right), the table reports the goodness-of-fit indices (chi-square, P, RMR, GFI, AGFI, RMSEA) and the stability index for the unconstrained and constrained models, together with the chi-square difference (df = 15) and its P value for the model comparison.]

Table 4-2 Goodness-of-fit and stability indices of SEM models for the CVCV Visual-Only and CVCV
Audio-Visual conditions: both null (constrained: CD = NH) and free (unconstrained) models for each
hemisphere [P < 0.05 for the model comparison (last column) represents a significant difference between
the constrained and unconstrained models].
Figure 4-9 VO (right): estimated path coefficients [black text: NH, blue text: CD; thicker black
arrows: connections with significant increase in strength for the NH group, thicker blue arrows:
connections with significant increase in strength for the CD group].
Figure 4-10 AV (right): estimated path coefficients [black text: NH, blue text: CD; thicker black
arrows: connections with significant increase in strength for the NH group, thicker blue arrows:
connections with significant increase in strength for the CD group].
4.3 Dynamic Causal Modeling
There are several known weaknesses associated with SEM, the main one being that
fMRI data are probably dominated by observation error (Friston et al., 2003).
As
previously mentioned, DCM utilizes much more sophisticated mechanisms to incorporate
hemodynamic response modeling of the neuronal activity in different regions and to
transform these neuronal activities into a measured response.
Summarizing briefly, dynamic causal modeling treats the brain as a deterministic nonlinear
dynamic system with multiple inputs and outputs (MIMO). By doing this, the problem of
measuring connectivity reduces to a fairly standard nonlinear system identification procedure
which uses Bayesian estimation for the parameters of a deterministic input-state-output
dynamic system.
As done in structural equation modeling, it is necessary to construct a reasonably realistic
neuronal model of interacting brain areas. However, as stated above, in dynamic causal
modeling, the neuronal model is supplemented with a forward model which describes the
synaptic activity and its relationship to observed data. In terms of fMRI measurements, the
hemodynamic response model would be the supplementary forward model.
Since this method has not been available very long, there are relatively few studies that have
implemented dynamic causal models to analyze neural connectivity. Friston et al. (2003)
have shown that the results obtained from dynamic causal models are consistent with those
obtained using structural equation modeling (Buchel and Friston, 1997) and Volterra
formulation (Friston and Buchel, 2000). Mechelli et al. (2003) combined fMRI and dynamic
causal modeling to investigate object category effects and obtained some promising results.
Penny et al. (2004a) recently published a paper on a formal procedure for directly comparing
dynamic causal models and their hypotheses. In this section, a summary of the theory
underlying DCM is presented, followed by the results obtained from the DCM analyses
performed in our study.
4.3.1 Theory
Consider a model shown below.
[Figure 4-11 depicts an example DCM with four interconnected regions: a stimulus-bound perturbing direct input u1(t) (e.g., visual speech) enters one region, an indirect modulatory input u2(t) (e.g., attention) modulates selected connections, and each of the four regions produces a measured output (y1 to y4).]

Figure 4-11 Example DCM model [adapted from Friston et al. (2003)].
The above model consists of m inputs (m = 1) and l outputs (one output per region, l = 4),
where:
• m inputs: The inputs correspond to experimental design (e.g., boxcar or stick
stimulus functions) and are exactly the same as those used to form design matrices in
conventional analyses of fMRI.
• l outputs: Each of the l regions produces a measured output that corresponds to the
observed BOLD signal. In this example, there are 4 regions, therefore 4 outputs.
Each of these regions defined in the model also consists of five state variables. Thus, in this
particular example, the entire model would have twenty state variables in total (4 regions
multiplied by 5 state variables).
The five state variables for each region are:
1. neuronal activity (z),
2. vasodilatory signal (s),
3. normalized flow (f),
4. normalized venous volume (v), and
5. normalized deoxyhemoglobin content (q).
[Figure 4-12 is a flow diagram of the hemodynamic model: neuronal input z drives an activity-dependent vasodilatory signal s, which induces changes in flow f; flow in turn changes venous volume v and deoxyhemoglobin content q, and the BOLD output is computed as the hemodynamic response y = g(q, v).]

Figure 4-12 The hemodynamic model [adapted from Friston et al. (2003)].
Four of these state variables (s, f, v, q) correspond to the state variables of the
hemodynamic model presented in Friston et al. (2000), as shown in Figure 4-12. These state
variables are of secondary importance since they are required only to compute the
observed BOLD response for one particular cortical region, and are not explicitly
influenced by the states of other brain regions. The first state variable listed above (neuronal
activity, z) of each region plays the central role in the estimation of effective connectivity.
The neuronal activity state variable corresponds to neuronal or synaptic activity and is a
function of the neuronal states of other brain regions.
Each region or node in DCM has its associated hemodynamic model g(q, v). The function
g(q, v) takes the five associated state variables for that particular region in question, and
computes the estimated output y. Again, four of these state variables are independent of what
is happening in other regions of the model; only the state variable z plays a role in
interactions between regions.

The flow diagram in Figure 4-12 depicts the steps involved in calculating the output of a specific
region, where the set of five state variables {z, s, f, v, q} is used in computing the
hemodynamic response y. The state equations for the last four state variables {s, f, v, q},
known as the hemodynamic state equations, are as follows:

\dot{s}_i = z_i - \kappa s_i - \gamma (f_i - 1)
\dot{f}_i = s_i
\tau \dot{v}_i = f_i - v_i^{1/\alpha}
\tau \dot{q}_i = f_i E(f_i, \rho)/\rho - v_i^{1/\alpha} q_i / v_i

The output equation used is:

y_i = g(q_i, v_i) = V_0 \left( k_1 (1 - q_i) + k_2 (1 - q_i / v_i) + k_3 (1 - v_i) \right),

where

k_1 = 7\rho, \quad k_2 = 2, \quad k_3 = 2\rho - 0.2.

Thus far, there are five unknown parameters in computing the hemodynamic responses, and
they are:

\theta^h = \{\kappa, \gamma, \tau, \alpha, \rho\}.
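To make these hemodynamic state equations concrete, the following Python sketch integrates them for a single region with a simple Euler scheme. The parameter values are typical prior means from the DCM literature and the step size is arbitrary; both are illustrative assumptions rather than estimates from this study.

```python
import numpy as np

# Hemodynamic (balloon) model for one region: neuronal activity z drives a
# vasodilatory signal s, flow f, venous volume v and deoxyhemoglobin q,
# which are combined into a BOLD signal y. Parameters are illustrative.
kappa, gamma, tau, alpha, rho, V0 = 0.65, 0.41, 0.98, 0.32, 0.34, 0.02

def bold_response(z, dt=0.1):
    s, f, v, q = 0.0, 1.0, 1.0, 1.0          # resting-state values
    y = np.zeros(len(z))
    for t, zt in enumerate(z):
        E = 1.0 - (1.0 - rho) ** (1.0 / f)    # oxygen extraction fraction
        ds = zt - kappa * s - gamma * (f - 1.0)
        df = s
        dv = (f - v ** (1.0 / alpha)) / tau
        dq = (f * E / rho - v ** (1.0 / alpha) * q / v) / tau
        s, f, v, q = s + dt * ds, f + dt * df, v + dt * dv, q + dt * dq
        k1, k2, k3 = 7.0 * rho, 2.0, 2.0 * rho - 0.2
        y[t] = V0 * (k1 * (1 - q) + k2 * (1 - q / v) + k3 * (1 - v))
    return y

z = np.zeros(300); z[10:30] = 1.0             # brief burst of neuronal activity
print(round(bold_response(z).max(), 4))       # delayed BOLD peak
```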
As for the last state variable z: since, for instance, the state variable z_1 is influenced by the
states of other regions (i.e., z_2, z_3, z_4), the state equations become much more complicated.
Let z = [z_1, z_2, z_3, z_4] represent the set of four neuronal states for the four different regions of
our example model. Then, simply put, the state equation of z is \dot{z} = F(z, u, \theta), where F is
some nonlinear function describing the neurophysiological influences that activity z in all l
brain regions and inputs u exert upon activity changes in the others. The parameter vector \theta
represents the parameters of the model whose posterior density we require for making
inferences.
Since F is some nonlinear function, a bilinear approximation is used to estimate the state
equations. Note that any linear approximations to nonlinear function F can be implemented
in this step. Here, a bilinear approximation was chosen by the inventors of dynamic causal
modeling because it is fairly simple to implement while providing a good approximation.
However, the key reason was that using bilinear terms allows an experimental manipulation
to be interpreted as activating a 'pathway' rather than a specific cortical region, as further
explained below.
The bilinear approximation reduces the parameters to three distinctive, independent sets
(Friston et al., 2003; Penny et al., 2004b):
1. the direct or extrinsic influence of inputs on brain states in any particular area
2. the intrinsic or latent connections that couple responses in one area to the state of
others, and
3. changes in this intrinsic coupling induced by inputs
After applying the bilinear approximation, the neuronal state equation becomes:
\dot{z} \approx A z + \sum_j u_j B^j z + C u = \Bigl( A + \sum_j u_j B^j \Bigr) z + C u,

where

A = \frac{\partial F}{\partial z} = \frac{\partial \dot{z}}{\partial z}, \qquad
B^j = \frac{\partial^2 F}{\partial z \, \partial u_j} = \frac{\partial}{\partial u_j} \frac{\partial \dot{z}}{\partial z}, \qquad
C = \frac{\partial F}{\partial u}.
Expansion into matrix form, showing all regions, results in the following equations:
\begin{bmatrix} \dot{z}_1 \\ \vdots \\ \dot{z}_l \end{bmatrix}
= \left( \underbrace{\begin{bmatrix} a_{11} & \cdots & a_{1l} \\ \vdots & \ddots & \vdots \\ a_{l1} & \cdots & a_{ll} \end{bmatrix}}_{\text{latent connectivity}}
+ \sum_j u_j \underbrace{\begin{bmatrix} b^j_{11} & \cdots & b^j_{1l} \\ \vdots & \ddots & \vdots \\ b^j_{l1} & \cdots & b^j_{ll} \end{bmatrix}}_{\text{induced connectivity}} \right)
\begin{bmatrix} z_1 \\ \vdots \\ z_l \end{bmatrix}
+ C \begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix},
\qquad \text{i.e.,} \qquad
\dot{z} = \Bigl( A + \sum_j u_j B^j \Bigr) z + C u.
Here, another set of parameters, \theta^c = \{A, B^j, C\}, are the unknowns; these are the connectivity
or coupling matrices that we wish to identify in order to define the functional architecture and
interactions among brain regions at a neuronal level.
These connectivity matrices (parameters of the model) are interpreted as follows:
• The Jacobian or connectivity matrix A represents the first-order connectivity among
the regions in the absence of input.
• The matrices B^j are effectively the change in coupling induced by the jth input. They
encode the input-sensitive changes in \partial \dot{z} / \partial z or, equivalently, the modulation of
effective connectivity by experimental manipulations. (Because the B^j are second-order
derivatives, these terms are referred to as bilinear.)
• Finally, the matrix C embodies the extrinsic influences of inputs on neuronal
activity.
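To illustrate how the matrices A, B^j and C enter the neuronal state equation, the short Python sketch below integrates the bilinear model for a hypothetical four-region, two-input network. All coupling values, inputs and the integration step are invented for illustration and do not correspond to the models estimated in this study.

```python
import numpy as np

# Bilinear DCM neuronal dynamics: dz/dt = (A + sum_j u_j * B[j]) z + C u.
l, m = 4, 2                      # 4 regions, 2 inputs
A = -0.5 * np.eye(l)             # latent (intrinsic) connectivity
A[1, 0] = 0.4                    # region 1 drives region 2
B = np.zeros((m, l, l))
B[1, 1, 0] = 0.2                 # input 2 modulates the region 1 -> 2 coupling
C = np.zeros((l, m))
C[0, 0] = 1.0                    # input 1 (the stimulus) drives region 1

def simulate(u, dt=0.1):
    """Euler integration of the neuronal states given inputs u (T x m)."""
    z = np.zeros(l)
    traj = []
    for ut in u:
        J = A + sum(ut[j] * B[j] for j in range(m))  # effective coupling
        z = z + dt * (J @ z + C @ ut)
        traj.append(z.copy())
    return np.array(traj)

u = np.zeros((200, m)); u[20:60, 0] = 1.0; u[40:60, 1] = 1.0
print(simulate(u)[-1])           # final neuronal states
```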
Returning to the original dynamic causal model in Figure 4-11, when all the state variables
are included, an overview of the model can be diagrammed as shown in Figure 4-13.
Finally, the full dynamic causal model can be summarized by three equations:

x = \{z, s, f, v, q\},
\dot{x} = f(x, u, \theta), and
y = \lambda(x),

with parameters \theta = \{\theta^c, \theta^h\}, where \theta^c = \{A, B^j, C\} and \theta^h = \{\kappa, \gamma, \tau, \alpha, \rho\}.
[Figure 4-13 redraws the example DCM of Figure 4-11 with all state variables shown: each region i has a neuronal state z_i governed by the bilinear state equation and its own hemodynamic model g(v_i, q_i) that produces the measured output y_i; the stimulus input u_1 enters region 1 directly (through a term c_1 u_1), and the modulatory input u_2 alters one of the couplings through a term of the form (a_{42} + u_2 b_{42}) z_2.]

Figure 4-13 Example DCM model with its state variables [adapted from Friston et al. (2003)].
Given this full model, the parameters are estimated using Bayesian estimation along with the
expectation-maximization (EM) procedure.
p(\theta \mid y) \propto p(y \mid \theta) \, p(\theta)

The models of hemodynamics in a single region and the models of neuronal states are used
to compute the likelihood p(y \mid \theta), whereas the prior constraints are used for estimating the priors on the
parameters, p(\theta). The hemodynamic priors encoded in the DCM technique are those used in
Friston (2002), which are basically the mean and the variance of the five hemodynamic
parameters obtained from 128 voxels using the single word presentation data.
Model Interpretation
Threshold
The final results of dynamic causal modeling are in the form of probabilities, providing a
measure of the level of statistical confidence. From these, the posterior probability distribution
of the coefficients (elements of the matrices A, B and C) can be computed.
In order to obtain the final coupling strengths, it is necessary to specify a threshold in Hz.
The coupling strength is specified in activity per second (i.e., Hz), and the result is a
probability that the effect is equal to or greater than the specified threshold. Normally zero is
used as the threshold.
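When the marginal posterior of a coupling parameter is approximately Gaussian, the probability that the effect exceeds the chosen threshold reduces to a normal tail probability, as in the Python sketch below. The posterior means and standard deviations shown are invented examples, and the 0.90 cut-off simply mirrors the significance convention used later in this section.

```python
from scipy.stats import norm

# P(coupling > threshold) under an approximately Gaussian posterior.
def prob_exceeds(post_mean, post_sd, threshold=0.0):
    return norm.sf(threshold, loc=post_mean, scale=post_sd)

print(prob_exceeds(0.25, 0.10))  # ~0.99: would count as significant (>= 0.90)
print(prob_exceeds(0.02, 0.10))  # ~0.58: would not
```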
Connections
The strength of the intrinsic connections reflects coupling strength in the absence of
contextual modulation, computed across the entire time series. A given parameter from A
describes that component of the change in the neuronal state of a region which depends on
the neuronal state in other region(s). So, intrinsic connections in DCM are computed as the
average across the entire time series, which includes the time points corresponding to the
condition of interest.
4.3.2 Results
The anatomical model constructed for SEM analyses was used in the DCM analyses as well,
but with the single addition of a direct input (Figure 4-14).
Since DCM does not
accommodate bidirectional connections, the pathways between IPT and AG, and IFG and M
were changed to unidirectional connections IPT→AG and IFG→M, respectively. The visual
speech signal was considered to be a direct perturbing input to the DCM (connected to region
V specifically). In SEM, there was a set of connections with statistically significant changes
between VO and AV conditions. The difference between the two conditions was presence vs.
absence of auditory speech signal. So in DCM, the presence or absence of auditory speech
signal was considered as the "modulation" or "context-dependency" rather than a direct input.
Those connections shown to have significant differences in parameter estimates were
hypothesized to be subjected to the modulatory effect in our DCM models (Figure 4-14).
Since we were able to analyze four different SEM models - which are NH (Left), NH (Right),
CD (Right), and HA (Right) - only these four models were tested further with the DCM
method, and the connectivity matrices were compared to the parameter estimates obtained
from SEM.
[Figure 4-14 shows the anatomical model used for the DCM analyses: the regions V, IPT, AG, STS, IFG and M, with the visual speech stimulus entering as a direct input to V.]

Figure 4-14 The anatomical model for DCM analyses.
Local maxima were identified for each subject using the same procedure as in SEM analyses.
Activities in these regions were extracted using the voxels of interest (VOI) time-series
extraction option in SPM for two experimental conditions (CVCV VO and AV conditions).
The VOIs were spherically shaped with 5 mm radius and were created for each run and for
each subject. Since 7 to 10 runs were collected for each subject, 7 to 10 DCM models were
constructed for each subject (for each hemisphere in the NH group). Once the parameters of
all DCMs were estimated, they were averaged to form a single subject group DCM for both
hemispheres: NH (Left), NH (Right), CD (Right), HA (Right). Averaging was performed
using Matlab scripts from the DCM toolbox (Penny et al., 2004a).
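The averaging step can be thought of as a precision-weighted combination of the per-run posterior estimates; the DCM toolbox does this within a full Bayesian framework, so the Python sketch below, which assumes independent Gaussian posteriors with known variances, is only a simplified illustration with invented numbers.

```python
import numpy as np

# Precision-weighted average of per-run coupling estimates (simplified
# stand-in for Bayesian parameter averaging); inputs are invented.
def average_dcm(means, variances):
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    avg_var = 1.0 / total_precision
    avg_mean = avg_var * sum(p * m for p, m in zip(precisions, means))
    return avg_mean, avg_var

runs_mean = [np.array([[0.0, 0.3], [0.1, 0.0]]) + 0.05 * i for i in range(3)]
runs_var = [np.full((2, 2), 0.02) for _ in range(3)]
print(average_dcm(runs_mean, runs_var)[0])   # averaged A-matrix estimate
```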
Of the four DCM models tested, only the NH (Left) and the HA (Right) models' parameters
converged to finite values. The estimated values for the intrinsic connectivity matrix for the
NH and HA groups are shown in Figures 4-15 and 4-16 and summarized in Table A-5 (in
Appendix A). The DCM parameter estimates for the CD group and the right hemisphere
model for the NH group failed to converge; hence only the results for the hearing and hearing
aid groups are presented in this section. In Figures 4-15 and 4-16, the numerical values in
black font color represent entries from the intrinsic connectivity matrix, whereas blue font
color is used to display the modulatory effect sizes (i.e., entries from matrix B) for the
presence of the auditory speech signal. We also investigated the modulatory effects of the absence
of auditory speech signal; these results are presented in Appendix A. The corresponding
posterior probability values for intrinsic connection and modulatory effect strengths are not
displayed in Figures 4-15 and 4-16 (listed in Table A-5), but all statistically significant
connection strengths (posterior probability >= 0.90) are shown as solid arrows and nonsignificant (posterior probability < 0.90) connections as dashed arrows. As for modulatory
effect strengths, only the statistically significant values are shown in the figures. In other
words, the pathways with values in blue color are the pathways that were modulated by the
context of the experiment, which in this study was the presence of auditory speech signal.
[Figure 4-15 shows the DCM path diagram for the NH (left) model; the intrinsic connection and modulatory effect estimates displayed on the diagram are listed in Table A-5.]

Figure 4-15 NH (left): Results from DCM analysis [black: intrinsic connection estimates for both
conditions combined; blue: modulatory effect estimates when auditory speech is present].
[Figure 4-16 shows the DCM path diagram for the HA (right) model; the intrinsic connection and modulatory effect estimates displayed on the diagram are listed in Table A-5.]

Figure 4-16 HA (right): Results from DCM analysis [black: intrinsic connection estimates for both
conditions combined; blue: modulatory effect estimates when auditory speech is present].
Since a direct comparison of magnitudes cannot be made between parameter estimates
obtained from DCM and SEM, only qualitative observations are made. The results from the
DCM analyses are in approximate agreement with the connectivity values obtained from
SEM. The intrinsic connectivity values show that V→AG→STS is stronger than
V→IPT→STS for both the NH and HA subject groups, although the differences are smaller
than in the SEM analysis. The pathways that were found to be sensitive to the experimental
conditions in the SEM analysis were also found to have statistically significant modulatory
effects in the DCM analysis. There were two exceptions to this observation: AG→STS in the NH
(Left) model, and V→IPT in the HA (Right) model. However, the general trend of changes was similar
in both the SEM and DCM analyses. For example, when there were increases in connection
strengths for the AV condition compared to the VO condition in SEM, the same pattern of
changes was also present in DCM, as represented by positive values of modulatory effect
strengths. In particular, the recruitment of the V→IPT→STS pathway for AV speech
integration in NH subjects was also found in the DCM analyses. One notable
difference is that for the hearing subjects' model, the intrinsic connection V→STS was
equal to zero, suggesting that these two regions are not connected at all. This conflicts with
what was reported in our SEM analyses, indicating that our general anatomical model might
not be comprehensive and robust enough to produce consistent results across different
analytical frameworks. However, most results are in agreement, and the results are further
discussed in the next chapter.
5 Summary of Results and Discussion
Summaries of results on the measures of speechreading and the identification of cortical
networks for AV speech perception in the three subject groups are presented in Section 5.1
and 5.2, along with some key findings from connectivity analyses, followed by concluding
remarks and future work in Section 5.3.
5.1 Normally Hearing
The normally hearing subjects displayed similar regions of cortical activation for
speechreading (i.e., the VO task) and audiovisual speech integration, except that auditory
cortical areas were considerably more active for the AV condition. The cortical areas found
to be active for speechreading included: visual cortex, auditory cortex (but not primary
auditory cortex), speech motor network areas (which include the lip area of primary motor cortex,
premotor cortex, inferior frontal gyrus, left insula and supplementary motor area),
supramarginal gyrus, thalamus, superior parietal cortex and fusiform gyrus. Thus, results
from our study add to existing evidence of the engagement of motor-articulatory strategies in
visual speech perception.
We also found that an individual's ability to process visual speech is related to the amount of
activity in superior temporal cortical areas, including primary auditory cortex (A1), pre-SMA,
IFS and right AG, where good speechreaders showed greater activation in A1 and right AG
and less activation in pre-SMA and IFS. This result helps to resolve contradictory findings
and claims from recent studies on whether visual speech perception (watching articulatory
gestures) can activate human primary auditory cortex: speechreading ability varies
widely from person to person, and it is significantly correlated with activity in auditory
cortical areas during visual speech perception.
Although the brain regions included in the VO and AV speech perception networks
overlapped extensively, the dynamics or interactions among these cortical areas seemed to
differ significantly across the two tasks, as demonstrated in effective connectivity analyses.
In particular, interactions in the left hemisphere, among visual areas, angular gyrus and the
posterior part of superior temporal cortex (V→AG→STS and V→STS) seemed to be the
prominent pathway from visual to temporal cortex when hearing participants were
speechreading (Figure 5-1), while connections between visual areas, inferoposterior temporal
lobe and posterior superior temporal area (V→IPT→STS) were significantly active only
during audiovisual speech integration (Figure 5-2). Consistent with a finding that the
V→AG→STS pathway increases its connection strength in the VO condition, the contrast
map obtained from subtracting the AV condition from the VO condition showed significant
activity in left angular gyrus. The left angular gyrus was less likely to be inhibited during the
VO condition than when the auditory speech signal was available, indicating that left angular
gyrus may be recruited when the task of speech perception becomes more difficult, i.e.,
without any auditory information.
The premotor/motor area and IFG were strongly
positively influenced by the activity of STS, and IFG showed a significantly strong negative
effect on STS.
Figure 5-1 NH subjects for the CVCV Visual-Only condition [black arrow: positive connection, blue
arrow: negative connection; thin arrow: weak connection; thick arrow: strong connection].
Figure 5-2 NH subjects for the CVCV Audio-Visual condition [black arrow: positive connection, blue
arrow: negative connection; thin arrow: weak connection; thick arrow: strong connection].
5.2 Hearing-Impaired
The speechreading network for the congenitally deaf subjects included most regions that
were reported in Calvert's study (1997), including Heschl's gyrus, right angular gyrus,
cerebellum and regions around right inferior frontal sulcus (IFS) in the frontal lobe. There
were several notable differences in cortical activation patterns for the hearing impaired
subjects in comparison to hearing subjects. First, there was far less activity in visual areas
and significantly more activity in auditory areas during VO and AV conditions for CD
subjects. This result is probably due to neural plasticity: a lack of acoustic input from the
earliest stages of development results in neural reorganization, moving much of visual speech
processing from visual areas to functionally vacant auditory cortical areas. As a caveat, we
observe that this result also may be due to the baseline condition not being an appropriate
control condition for our hearing impaired subjects.
Furthermore, there was a clear right hemisphere bias for both conditions for the CD group.
The amount of hemispheric bias seemed to be greater for the CD group than the HA group;
however both hearing impaired subject groups yielded a significantly larger amount of
activity in the right hemisphere. This was also confirmed in a simple regression analysis: the
amount of HA users' hearing impairment was found to be more significantly correlated with
neural activity in right STS/G regions than left STS/G. Finally, activations in the frontal lobe
near inferior frontal sulcus region also seemed to be generally related to the amount of
hearing loss, as it exhibited the greatest amount of activity in the CD group, somewhat less in
the HA group, and not at all in the NH group. The IFS area also seemed to be more active in
deaf participants with good speechreading skills than those with poor speechreading abilities.
IFS activity was not found to be correlated with amount of hearing impairment, but it was
shown to be correlated with speechreading test scores.
Additionally, the amount of acoustic signal gained by using hearing aids was significantly
correlated with right inferior frontal gyrus activity, whereas the amount of acoustic "speech"
signal gained was correlated with left inferior frontal gyrus (Broca's area) activity. These
results suggest that the right and left IFG may be crucial components in learning or adopting
new sound and speech information respectively.
The SEM analyses for the CD and HA groups, and the DCM analysis for the HA group
produced similar results in terms of identifying pathways that may underlie AV speech
perception.
Although the standard fMRI analyses yielded no statistically significant
differences between activation maps for CVCV Visual-Only and CVCV Audio-Visual
conditions in hearing impaired subject groups, we were able to detect differences using
effective connectivity analyses.
The direct pathway from visual areas to the superior
temporal sulcus region (V→STS) was actually found to be weak for both the VO (Figure 5-3)
and AV (Figure 5-4) conditions for the CD group. This result contrasts with that from the
NH group's model, in which V→STS was strengthened for the VO condition. Overall, the
CD group's network pattern for the VO condition was found to be more similar to the NH
group's pattern in the AV condition than in the VO condition, in that the pathway involving IPT
seemed to be active. However, there seems to be a stronger interaction between M and IFG in the
hearing impaired subjects than in hearing subjects.
The pathway V→AG→M→STS was also found to be recruited for speech perception in the
hearing impaired groups when they were presented with residual acoustic information in the AV
conditions.
Figure 5-3 Hearing impaired subjects for the CVCV Visual-Only condition [black arrow: positive
connection, blue arrow: negative connection; thin arrow: weak connection; thick arrow: strong
connection].
Figure 5-4 Hearing impaired subjects for the CVCV Audio-Visual condition [black arrow: positive
connection, blue arrow: negative connection; thin arrow: weak connection; thick arrow: strong
connection].
5.3 Concluding Remarks and Future Work
This dissertation addressed the question of how humans integrate sensory information or,
more specifically, how auditory and visual speech information are fused to form a single
speech percept. We examined cortical networks that may underlie auditory-visual speech
perception in hearing participants.
We also focused on what kind of effects sensory
deprivation would have on the auditory-visual speech perception network by studying two
separate groups of hearing impaired individuals. Most studies on this topic have reported
results from standard neuroimaging analyses in which only static activation patterns were
obtained. We have supplemented standard fMRI analyses with SEM and DCM analyses to
explore the dynamics or interactions amongst brain regions, and also conducted
psychophysical tests to measure speechreading skills and examined their correspondences to
cortical activation patterns. Overall, to our knowledge, this dissertation is by far the most
comprehensive study to date that has investigated the neural processes associated with auditory-visual speech perception and the effects of hearing status on these neural processes.
As with any study, there are areas where more work can be done to yield additional
informative results. Most research studies involving subjects with impairment are faced with
problems of confounding factors and variability within the subject group; this was the case
for our hearing impaired subject populations. We attempted to overcome this problem by
performing a number of regression analyses in our hearing aid user group, to quantify
relationships between characteristic measures and activation patterns, while selecting a more
homogeneous congenitally deaf group. Clearly, more subjects with varying characteristics
would have added more to the findings obtained from the hearing aid group.
The effective connectivity analysis methods implemented are far from perfect, and as these
methods continue to be created, developed and improved, there will be opportunities for
using them in further investigations. Additionally, the anatomical model in our analyses is
not completely accurate or comprehensive, as evidenced by the lack of convergence in some
SEM and DCM results.
This was not unexpected since we implemented one generic
anatomical model for three different groups of subjects and even within each group there
were distinct differences in their speechreading skills and amount of hearing impairment.
Optimally, the most appropriate model for each of the different groups would have been
identified and constructed, but such an approach also comes at a high cost of time.
Furthermore, our current knowledge of human brain connectivity is still very limited. FMRI
data or any other neuroimaging data alone cannot provide exact criteria for determining and
identifying anatomically accurate regions and interconnections associated with specific
functional tasks. While anatomical data will provide more anatomically accurate models, it
will be difficult to keep the models simple enough to be tractable, since determining key
components of the model will require additional functional information. Ideally, data sets of
different forms (i.e., anatomical, physiological and functional) should be combined and
assessed collectively when constructing anatomical models for effective connectivity
analyses.
As for ensuring neuroanatomical accuracy of the models, most of the current knowledge of
the anatomical cortical circuits in humans is based on extrapolating from studies in monkeys.
Although there are many homologues between human and monkey brain regions,
corresponding regions and their boundaries are not always clearly defined and functional
differences exist in some constituent areas of cortex. So, the region definitions based on
anatomical, physiological, and functional data may not always agree, and the differences
among these methods may even give rise to conflicting outcomes.
Fortunately, recent
advances in diffusion tensor imaging and tracing technology show promise for resolving this
issue.
This research community has been moving towards producing a more detailed
picture of human neuroanatomy; hence, the results from these studies should be used in
building more neuroanatomically accurate models in the near future.
Appendix A
[Table A-1 body not reliably recoverable from the extracted text. For each path in the model (V→AG, V→IPT, V→STS, IPT→AG, AG→STS, STS→AG, IPT→STS, STS→IPT, STS→M, M→STS, STS→IFG, IFG→STS, IFG→M, IPT→M, AG→M), the table lists the estimated path coefficient, standard error, and P value for the CVCV Visual-Only and CVCV Audio-Visual conditions, together with the critical ratio and P value for the pair-wise comparison of the two conditions, for the NH (Left) and NH (Right) models.]

Table A-1 SEM results for the NH group models (left and right hemispheres). Estimated path
coefficients are shown for the CVCV Visual-Only and CVCV Audio-Visual conditions in the
unconstrained model [*** = P < 0.001; ** = P < 0.01; * = P < 0.05].
[Table A-2 body not reliably recoverable from the extracted text. For each path in the model, the table lists the estimated path coefficient, standard error, and P value for the CVCV Visual-Only and CVCV Audio-Visual conditions, together with the critical ratio and P value for the pair-wise comparison of the two conditions, for the CD (Right) model.]

Table A-2 SEM results for the CD (right hemisphere) model. Estimated path coefficients are shown
for the CVCV Visual-Only and CVCV Audio-Visual conditions in the unconstrained model [*** = P
< 0.001; ** = P < 0.01; * = P < 0.05].
[Table A-3 body not reliably recoverable from the extracted text. For each path in the model, the table lists the estimated path coefficient, standard error, and P value for the NH and CD groups, together with the critical ratio and P value for the pair-wise comparison of the two groups, for the VO (Right) and VO (Left) models.]

Table A-3 SEM results for the CVCV Visual-Only condition models (right and left hemispheres):
estimated path coefficients for the NH and CD groups in the unconstrained model [*** = P < 0.001;
** = P < 0.01; * = P < 0.05].
[Table A-4 body not reliably recoverable from the extracted text. For each path in the model, the table lists the estimated path coefficient, standard error, and P value for the NH and CD groups, together with the critical ratio and P value for the pair-wise comparison of the two groups, for the AV (Right) and AV (Left) models.]

Table A-4 SEM results for the CVCV Audio-Visual condition models (right and left hemispheres).
Estimated path coefficients are shown for the NH and CD groups in the unconstrained model [*** =
P < 0.001; ** = P < 0.01; * = P < 0.05].
Intrinsic Connections
Model
Modulatory Effects
Est'd
Path
Posterior
Probability
Est'd
Path
Posterior
Probability
Est'd
Posterior
Path
Probability
Coeff (A)
(pA)
Coeff (B)
(pB)
Coeff (C)
(pC)
NH (Left)
V 4 AG
.386
*
-.015
.235
*
.316
*
*
V 4 IPT
.451
*
.110
*
V 4 STS
.000
*
-.022
.827
IPT 4 AG
.554
*
.013
.555
AG 4 STS
.102
*
-.022
*
STS 4 AG
.127
*
.008
.842
iPT 4 STS
.003
*
.031
*
STS 4 IPT
.216
*
-.061
*
STS + M
.113
*
.010
.834
M 4 STS
-.005
.329
.008
.716
IFG
.243
*
.034
.505
IFG 4 STS
-.421
*
.014
.677
IFG 4 M
.025
.781
.013
.668
IPT 4 M
-.020
.254
.011
.699
AG 4 M
.021
.408
.021
.571
STS 4
Direct Input
HA (Right)
V + AG
.482
*
-. 104
*
V 4 IPT
.233
*
.016
*
V 4 STS
.031
*
.056
*
IPT 4 AG
.257
*
.002
.812
AG 4 STS
.378
*
-.003
.551
STS 4 AG
-. 212
*
.057
*
IPT 4 STS
-.154
*
-.016
.845
STS 4 IPT
.417
*
-.002
.744
STS 4 M
.315
*
.067
.704
M 4 STS
-. 093
*
.101
STS 4 IFG
.424
*
.010
.708
IFG 4 STS
-.200
*
-.005
.445
IFG 4 M
.282
.505
.081
.818
IPT 4 M
.046
.800
.015
.624
AG 4 M
-. 103
*
.018
*
*
Table A-5 DCM results for the NH (left) and HA (right) models: estimated path coefficients for
intrinsic connections (A) and their posterior probabilities (pA), estimated modulatory effect values
(B) and their posterior probabilities (pB), estimated coefficient for direct input connection (C) and its
posterior probability (pC) [* = posterior probability >= 0.900].
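For reference, the asterisks in Table A-5 mark connections whose posterior probability reaches the 0.900 threshold. The sketch below (Python, with illustrative names and an assumed posterior standard deviation, since the table does not report posterior variances) shows how such a probability can be computed from a Gaussian approximation to the posterior of a coupling parameter, in the spirit of SPM-style DCM:

from math import erf, sqrt

def posterior_prob_beyond(mean, sd, threshold=0.0):
    """Posterior mass on the same side of `threshold` as the posterior mean,
    assuming a Gaussian (Laplace-approximated) posterior. Illustrative only."""
    # Gaussian CDF evaluated at the threshold
    cdf = 0.5 * (1.0 + erf((threshold - mean) / (sd * sqrt(2.0))))
    return 1.0 - cdf if mean >= threshold else cdf

# Example with the V -> AG intrinsic coefficient from the NH (left) model
# (posterior mean .386 from Table A-5; the SD of 0.2 is an assumed value).
p = posterior_prob_beyond(0.386, 0.2)
print(round(p, 3), p >= 0.900)   # the 0.900 criterion marked by '*' in the table

Connections whose probability falls below 0.900 (for example, M → STS in the NH model, pA = .329) are left unstarred in the table.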
References
Arbib, M., and Bota, M. (2003). Language evolution: neural homologies and neuroinformatics. Neural
Netw 16, 1237-1260.
Arbuckle, J., and Wothke, W. (1999). AMOS 4.0 user's guide (Chicago, Smallwaters Corporation, Inc.).
Arnold, P., and Hill, F. (2001). Bisensory augmentation: A speechreading advantage when speech is clearly
audible and intact. Br J Psychol 92 Part 2, 339-355.
Averbeck, B. B., Chafee, M. V., Crowe, D. A., and Georgopoulos, A. P. (2002). Parallel processing of
serial movements in prefrontal cortex. Proc Natl Acad Sci U S A 99, 13172-13177.
Averbeck, B. B., Chafee, M. V., Crowe, D. A., and Georgopoulos, A. P. (2003a). Neural activity in
prefrontal cortex during copying geometrical shapes. I. Single cells encode shape, sequence, and metric
parameters. Exp Brain Res 150, 127-141.
Averbeck, B. B., Crowe, D. A., Chafee, M. V., and Georgopoulos, A. P. (2003b). Neural activity in
prefrontal cortex during copying geometrical shapes. II. Decoding shape segments from neural ensembles.
Exp Brain Res 150, 142-153.
Barbas, H., Ghashghaei, H., Dombrowski, S. M., and Rempel-Clower, N. L. (1999). Medial prefrontal
cortices are unified by common connections with superior temporal cortices and distinguished by input
from memory-related areas in the rhesus monkey. J Comp Neurol 410, 343-367.
Baylis, G. C., Rolls, E. T., and Leonard, C. M. (1987). Functional subdivisions of the temporal lobe
neocortex. J Neurosci 7, 330-342.
Beauchamp, M. S., Lee, K. E., Haxby, J. V., and Martin, A. (2002). Parallel visual motion processing
streams for manipulable objects and human movements. Neuron 34, 149-159.
Beauchamp, M. S., Lee, K. E., Haxby, J. V., and Martin, A. (2003). FMRI responses to video and point-light displays of moving humans and manipulable objects. J Cogn Neurosci 15, 991-1001.
Beck, P. D., and Kaas, J. H. (1999). Cortical connections of the dorsomedial visual area in old world
macaque monkeys. J Comp Neurol 406, 487-502.
Belin, P., Zilbovicius, M., Crozier, S., Thivard, L., Fontaine, A., Masure, M. C., and Samson, Y. (1998).
Lateralization of speech and auditory temporal processing. J Cogn Neurosci 10, 536-540.
Bentler, P. M., and Freeman, E. H. (1983). Tests for stability in linear structural equation systems.
Psychometrika 48, 143-145.
Bernstein, L. E., Auer, E. T., Jr., Moore, J. K., Ponton, C. W., Don, M., and Singh, M. (2002). Visual
speech perception without primary auditory cortex activation. Neuroreport 13, 311-315.
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S., Springer, J. A., Kaufman, J. N., and Possing,
E. T. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cereb Cortex 10, 512-528.
Binnie, C. A. (1977). Attitude changes following speechreading training. Scand Audiol 6, 13-19.
Blatt, G. J., Andersen, R. A., and Stoner, G. R. (1990). Visual receptive field organization and corticocortical connections of the lateral intraparietal area (area LIP) in the macaque. J Comp Neurol 299, 421-445.
Bohland, J. W., and Guenther, F. H. (2006). An fMRI investigation of syllable sequence production.
Neuroimage 32, 821-841.
Bruce, C., Desimone, R., and Gross, C. G. (1981). Visual properties of neurons in a polysensory area in
superior temporal sulcus of the macaque. J Neurophysiol 46, 369-384.
Buchel, C., and Friston, K. J. (1997). Modulation of connectivity in visual pathways by attention: cortical
interactions evaluated with structural equation modelling and fMRI. Cereb Cortex 7, 768-778.
Buchsbaum, B., Pickell, B., Love, T., Hatrak, M., Bellugi, U., and Hickok, G. (2005). Neural substrates for
verbal working memory in deaf signers: fMRI study and lesion case report. Brain Lang 95, 265-272.
Burnham, D., and Dodd, B. (2004). Auditory-visual speech integration by prelinguistic infants: perception
of an emergent consonant in the McGurk effect. Dev Psychobiol 45, 204-220.
Burton, M. W., Locasto, P. C., Krebs-Noble, D., and Gullapalli, R. P. (2005). A systematic investigation of
the functional neuroanatomy of auditory and visual phonological processing. Neuroimage 26, 647-661.
Callan, D. E., Callan, A. M., Honda, K., and Masaki, S. (2000). Single-sweep EEG analysis of neural
processes underlying perception and production of vowels. Brain Res Cogn Brain Res 10, 173-176.
Callan, D. E., Callan, A. M., Kroos, C., and Vatikiotis-Bateson, E. (2001). Multimodal contribution to
speech perception revealed by independent component analysis: a single-sweep EEG case study. Brain Res
Cogn Brain Res 10, 349-353.
Callan, D. E., Jones, J. A., Munhall, K., Callan, A. M., Kroos, C., and Vatikiotis-Bateson, E. (2003). Neural
processes underlying perceptual enhancement by visual speech gestures. Neuroreport 14, 2213-2218.
Callan, D. E., Jones, J. A., Munhall, K., Kroos, C., Callan, A. M., and Vatikiotis-Bateson, E. (2004).
Multisensory integration sites identified by perception of spatial wavelet filtered visual speech gesture
information. J Cogn Neurosci 16, 805-816.
Calvert, G. A., Brammer, M. J., Bullmore, E. T., Campbell, R., Iversen, S. D., and David, A. S. (1999).
Response amplification in sensory-specific cortices during crossmodal binding. Neuroreport 10, 2619-2623.
Calvert, G. A., Bullmore, E. T., Brammer, M. J., Campbell, R., Williams, S. C., McGuire, P. K., Woodruff,
P. W., Iversen, S. D., and David, A. S. (1997). Activation of auditory cortex during silent lipreading.
Science 276, 593-596.
Calvert, G. A., and Campbell, R. (2003). Reading speech from still and moving faces: the neural substrates
of visible speech. J Cogn Neurosci 15, 57-70.
Calvert, G. A., Campbell, R., and Brammer, M. J. (2000). Evidence from functional magnetic resonance
imaging of crossmodal binding in the human heteromodal cortex. Curr Biol 10, 649-657.
Calvert, G. A., Hansen, P. C., Iversen, S. D., and Brammer, M. J. (2001). Detection of audio-visual
integration sites in humans by application of electrophysiological criteria to the BOLD effect. Neuroimage
14, 427-438.
Campbell, R., MacSweeney, M., Surguladze, S., Calvert, G., McGuire, P., Suckling, J., Brammer, M. J.,
and David, A. S. (2001). Cortical substrates for the perception of face actions: an fMRI study of the
specificity of activation for seen speech and for meaningless lower-face acts (gurning). Brain Res Cogn
Brain Res 12, 233-243.
Carmichael, S. T., and Price, J. L. (1995). Sensory and premotor connections of the orbital and medial
prefrontal cortex of macaque monkeys. J Comp Neurol 363, 642-664.
Catani, M., Jones, D. K., and ffytche, D. H. (2005). Perisylvian language networks of the human brain. Ann
Neurol 57, 8-16.
Chao, L. L., and Martin, A. (1999). Cortical regions associated with perceiving, naming, and knowing
about colors. J Cogn Neurosci 11, 25-35.
Corina, D. P. (1999). On the nature of left hemisphere specialization for signed language. Brain Lang 69,
230-240.
Crosson, B., Benefield, H., Cato, M. A., Sadek, J. R., Moore, A. B., Wierenga, C. E., Gopinath, K.,
Soltysik, D., Bauer, R. M., Auerbach, E. J., et al. (2003). Left and right basal ganglia and frontal activity
during language generation: contributions to lexical, semantic, and phonological processes. J Int
Neuropsychol Soc 9, 1061-1077.
D'Esposito, M., Ballard, D., Aguirre, G. K., and Zarahn, E. (1998). Human prefrontal cortex is not specific
for working memory: a functional MRI study. Neuroimage 8, 274-282.
Dale, A. M., Fischl, B., and Sereno, M. I. (1999). Cortical surface-based analysis. I. Segmentation and
surface reconstruction. Neuroimage 9, 179-194.
Deacon, T. W. (1992). Cortical connections of the inferior arcuate sulcus cortex in the macaque brain.
Brain Res 573, 8-26.
Della-Maggiore, V., Sekuler, A. B., Grady, C. L., Bennett, P. J., Sekuler, R., and McIntosh, A. R. (2000).
Corticolimbic interactions associated with performance on a short-term memory task are modified by age. J
Neurosci 20, 8410-8416.
Desimone, R., and Gross, C. G. (1979). Visual areas in the temporal cortex of the macaque. Brain Res 178,
363-380.
Desjardins, R. N., and Werker, J. F. (2004). Is the integration of heard and seen speech mandatory for
infants? Dev Psychobiol 45, 187-203.
Dodd, B. (1979). Lipreading in infants: attention to speech presented in- and out-of synchrony. Cognitive
Psychology 11, 478-484.
Doya, K. (1999). What are the computations of the cerebellum, the basal ganglia and the cerebral cortex?
Neural Netw 12, 961-974.
Dronkers, N. F. (1996). A new brain region for coordinating speech articulation. Nature 384, 159-161.
Erber, N. P. (1979). An Approach to Evaluating Auditory Speech Perception Ability. Volta Rev 80, 344-350.
Evans, A. C., Collins, D. L., Mills, S. R., Brown, E. D., Kelly, R. L., and Peters, T. M. (1993). 3D statistical
neuroanatomical models from 305 MRI volumes. Paper presented at: IEEE-Nuclear Science Symposium
and Medical Imaging Conference.
Falchier, A., Clavagnier, S., Barone, P., and Kennedy, H. (2002). Anatomical evidence of multimodal
integration in primate striate cortex. J Neurosci 22, 5749-5759.
Fiez, J. A., Raife, E. A., Balota, D. A., Schwarz, J. P., Raichle, M. E., and Petersen, S. E. (1996). A
positron emission tomography study of the short-term maintenance of verbal information. J Neurosci 16,
808-822.
Fine, I., Finney, E. M., Boynton, G. M., and Dobkins, K. R. (2005). Comparing the effects of auditory
deprivation and sign language within the auditory and visual cortex. J Cogn Neurosci 17, 1621-1637.
Finney, E. M., Fine, I., and Dobkins, K. R. (2001). Visual stimuli activate auditory cortex in the deaf. Nat
Neurosci 4, 1171-1173.
Fischl, B., Sereno, M. I., and Dale, A. M. (1999). Cortical surface-based analysis. II: Inflation, flattening,
and a surface-based coordinate system. Neuroimage 9, 195-207.
Fort, A., Delpuech, C., Pernier, J., and Giard, M. H. (2002). Early auditory-visual interactions in human
cortex during nonredundant target identification. Brain Res Cogn Brain Res 14, 20-30.
Fox, J. (1980). Effect analysis in structural equation models. Sociological Methods and Research 9, 3-28.
Fox, P. T., Huang, A., Parsons, L. M., Xiong, J. H., Zamarippa, F., Rainey, L., and Lancaster, J. L. (2001).
Location-probability profiles for the mouth region of human primary motor-sensory cortex: model and
validation. Neuroimage 13, 196-209.
Foxe, J. J., Morocz, I. A., Murray, M. M., Higgins, B. A., Javitt, D. C., and Schroeder, C. E. (2000).
Multisensory auditory-somatosensory interactions in early cortical processing revealed by high-density
electrical mapping. Brain Res Cogn Brain Res 10, 77-83.
Friederici, A. D., Wang, Y., Herrmann, C. S., Maess, B., and Oertel, U. (2000). Localization of early
syntactic processes in frontal and temporal cortical areas: a magnetoencephalographic study. Hum Brain
Mapp 11, 1-11.
Friston, K. J. (2002). Bayesian estimation of dynamical systems: an application to fMRI. Neuroimage 16,
513-530.
Friston, K. J., and Buchel, C. (2000). Attentional modulation of effective connectivity from V2 to V5/MT
in humans. Proc Natl Acad Sci U S A 97, 7591-7596.
Friston, K. J., Buechel, C., Fink, G. R., Morris, J., Rolls, E., and Dolan, R. J. (1997). Psychophysiological
and modulatory interactions in neuroimaging. Neuroimage 6, 218-229.
Friston, K. J., Harrison, L., and Penny, W. (2003). Dynamic causal modelling. Neuroimage 19, 1273-1302.
Friston, K. J., Mechelli, A., Turner, R., and Price, C. J. (2000). Nonlinear responses in fMRI: the Balloon
model, Volterra kernels, and other hemodynamics. Neuroimage 12, 466-477.
Gabrieli, J. D., Poldrack, R. A., and Desmond, J. E. (1998). The role of left prefrontal cortex in language
and memory. Proc Natl Acad Sci U S A 95, 906-913.
Ghazanfar, A. A., and Logothetis, N. K. (2003). Neuroperception: facial expressions linked to monkey calls.
Nature 423, 937-938.
Giard, M. H., and Peronnet, F. (1999). Auditory-visual integration during multimodal object recognition in
humans: a behavioral and electrophysiological study. J Cogn Neurosci 11, 473-490.
Gizewski, E. R., Lambertz, N., Ladd, M. E., Timmann, D., and Forsting, M. (2005). Cerebellar activation
patterns in deaf participants for perception of sign language and written text. Neuroreport 16, 1913-1917.
Grafton, S. T., Fadiga, L., Arbib, M. A., and Rizzolatti, G. (1997). Premotor cortex activation during
observation and naming of familiar tools. Neuroimage 6, 231-236.
Graziano, M. S., Reiss, L. A., and Gross, C. G. (1999). A neuronal representation of the location of nearby
sounds. Nature 397, 428-430.
Green, K. P., and Gerdeman, A. (1995). Cross-modal discrepancies in coarticulation and the integration of
speech information: the McGurk effect with mismatched vowels. J Exp Psychol Hum Percept Perform 21,
1409-1426.
Green, K. P., Kuhl, P. K., Meltzoff, A. N., and Stevens, E. B. (1991). Integrating speech information across
talkers, gender, and sensory modality: female faces and male voices in the McGurk effect. Percept
Psychophys 50, 524-536.
Grezes, J., Costes, N., and Decety, J. (1999). The effects of learning and intention on the neural network
involved in the perception of meaningless actions. Brain 122 (Pt 10), 1875-1887.
Guenther, F. H., Ghosh, S. S., and Tourville, J. A. (2006). Neural modeling and imaging of the cortical
interactions underlying syllable production. Brain Lang 96, 280-301.
Hackett, T. A., Stepniewska, I., and Kaas, J. H. (1998). Subdivisions of auditory cortex and ipsilateral
cortical connections of the parabelt auditory cortex in macaque monkeys. J Comp Neurol 394, 475-495.
Hackett, T. A., Stepniewska, I., and Kaas, J. H. (1999). Callosal connections of the parabelt auditory cortex
in macaque monkeys. Eur J Neurosci 11, 856-866.
Haxby, J. V., Ungerleider, L. G., Clark, V. P., Schouten, J. L., Hoffman, E. A., and Martin, A. (1999). The
effect of face inversion on activity in human neural systems for face and object perception. Neuron 22, 189-199.
Hikosaka, O., Sakai, K., Miyauchi, S., Takino, R., Sasaki, Y., and Putz, B. (1996). Activation of human
presupplementary motor area in learning of sequential procedures: a functional MRI study. J Neurophysiol
76, 617-621.
Hoffmeister, R. J. (1994). Metalinguistic skills in deaf children: knowledge of synonyms and antonyms in
ASL. Paper presented at: the Post Milan: ASL and English Literacy Conference (Washington, DC,
Gallaudet University Press).
Hoffmeister, R. J. (1999). American Sign Language Assessment Instrument (ASLAI). In Center for the
Study of Communication and the Deaf, Boston University (Boston, MA).
Horwitz, B., McIntosh, A. R., Haxby, J. V., and Grady, C. L. (1995). Network analysis of brain cognitive
function using metabolic and blood flow data. Behav Brain Res 66, 187-193.
Hu, L., and Bentler, P. (1999). Cutoff criteria for fit indices in covariance structure analysis: conventional
criteria versus new alternatives. Structural Equation Modeling 6, 1-55.
Huang, J., Carr, T. H., and Cao, Y. (2002). Comparing cortical activations for silent and overt speech using
event-related fMRI. Hum Brain Mapp 15, 39-53.
Jancke, L., Mirzazade, S., and Shah, N. J. (1999). Attention modulates activity in the primary and the
secondary auditory cortex: a functional magnetic resonance imaging study in human subjects. Neurosci
Lett 266, 125-128.
Johansen-Berg, H., Behrens, T. E., Robson, M. D., Drobnjak, I., Rushworth, M. F., Brady, J. M., Smith, S.
M., Higham, D. J., and Matthews, P. M. (2004). Changes in connectivity profiles define functionally
distinct regions in human medial frontal cortex. Proc Natl Acad Sci U S A 101, 13335-13340.
Jones, J. A., and Callan, D. E. (2003). Brain activity during audiovisual speech perception: an fMRI study
of the McGurk effect. Neuroreport 14, 1129-1133.
Jones, J. A., and Munhall, K. G. (1997). The effects of separating auditory and visual sources on
audiovisual integration of speech. Canadian Acoustics 25, 13-19.
Jurgens, U. (1984). The efferent and afferent connections of the supplementary motor area. Brain Research
300, 63-81.
Kaas, J., and Collins, C. E. (2004). The resurrection of multisensory cortex in primates: connection patterns
that integrate modalities. In The Handbook of Multisensory Processes, G. A. Calvert, C. Spence, and B. E.
Stein, eds. (Cambridge, MA, MIT Press), pp. 285-293.
Kaas, J. H., and Hackett, T. A. (2000). Subdivisions of auditory cortex and processing streams in primates.
Proc Natl Acad Sci U S A 97, 11793-11799.
Kanwisher, N., McDermott, J., and Chun, M. M. (1997). The fusiform face area: a module in human
extrastriate cortex specialized for face perception. J Neurosci 17, 4302-4311.
Kent, R. D., and Tjaden, K. (1997). Brain Functions Underlying Speech. In The Handbook of Phonetic
Sciences, W. J. Hardcastle, and A. Marchal, eds., pp. 220-255.
Kerns, J. G., Cohen, J. D., Stenger, V. A., and Carter, C. S. (2004). Prefrontal cortex guides context-appropriate responding during language production. Neuron 43, 283-291.
Koechlin, E., Ody, C., and Kouneiher, F. (2003). The architecture of cognitive control in the human
prefrontal cortex. Science 302, 1181-1185.
Kohler, E., Keysers, C., Umilta, M. A., Fogassi, L., Gallese, V., and Rizzolatti, G. (2002). Hearing sounds,
understanding actions: action representation in mirror neurons. Science 297, 846-848.
Koski, L., Wohlschlager, A., Bekkering, H., Woods, R. P., Dubeau, M. C., Mazziotta, J. C., and Iacoboni,
M. (2002). Modulation of motor and premotor activity during imitation of target-directed actions. Cereb
Cortex 12, 847-855.
Krauss, G. L., Fisher, R., Plate, C., Hart, J., Uematsu, S., Gordon, B., and Lesser, R. P. (1996). Cognitive
effects of resecting basal temporal language areas. Epilepsia 37, 476-483.
Lehéricy, S., Ducros, M., Thivard, L., Van de Moortele, P., Francois, C., Poupon, C., Swindale, N., Ugurbil,
K., and Kim, D. S. (2004). Diffusion tensor fiber tracking shows distinct corticostriatal circuits in humans.
Annals of Neurology 55, 522-529.
Levy, I., Hasson, U., Avidan, G., Hendler, T., and Malach, R. (2001). Center-periphery organization of
human object areas. Nat Neurosci 4, 533-539.
Liberman, A. M., and Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition 21,
1-36.
Liberman, A. M., and Mattingly, I. G. (1989). A specialization for speech perception. Science 243, 489-494.
Lu, M. T., Preston, J. B., and Strick, P. L. (1994). Interconnections between the prefrontal cortex and the
premotor areas in the frontal lobe. J Comp Neurol 341, 375-392.
Lui, G., Van Ostrand, E., and Newman, N. (1999). Cranial Nerve II and Afferent Visual Pathways. In
Textbook of Clinical Neurology, C. Goetz, and E. Pappert, eds. (Philadelphia, W.B. Saunders Co.), pp.
102-121.
Luppino, G., Matelli, M., Camarda, R., and Rizzolatti, G. (1993). Corticocortical connections of area F3
(SMA-proper) and area F6 (pre-SMA) in the macaque monkey. J Comp Neurol 338, 114-140.
MacSweeney, M., Amaro, E., Calvert, G. A., Campbell, R., David, A. S., McGuire, P., Williams, S. C.,
Woll, B., and Brammer, M. J. (2000). Silent speechreading in the absence of scanner noise: an event-related fMRI study. Neuroreport 11, 1729-1733.
MacSweeney, M., Calvert, G. A., Campbell, R., McGuire, P. K., David, A. S., Williams, S. C., Woll, B.,
and Brammer, M. J. (2002a). Speechreading circuits in people born deaf. Neuropsychologia 40, 801-807.
MacSweeney, M., Campbell, R., Calvert, G. A., McGuire, P. K., David, A. S., Suckling, J., Andrew, C.,
Woll, B., and Brammer, M. J. (2001). Dispersed activation in the left temporal cortex for speech-reading in
congenitally deaf people. Proc R Soc Lond B Biol Sci 268, 451-457.
MacSweeney, M., Campbell, R., Woll, B., Brammer, M. J., Giampietro, V., David, A. S., Calvert, G. A.,
and McGuire, P. K. (2006). Lexical and sentential processing in British Sign Language. Hum Brain Mapp
27, 63-76.
MacSweeney, M., Campbell, R., Woll, B., Giampietro, V., David, A. S., McGuire, P. K., Calvert, G. A.,
and Brammer, M. J. (2004). Dissociating linguistic and nonlinguistic gestural communication in the brain.
Neuroimage 22, 1605-1618.
MacSweeney, M., Woll, B., Campbell, R., McGuire, P. K., David, A. S., Williams, S. C., Suckling, J.,
Calvert, G. A., and Brammer, M. J. (2002b). Neural systems underlying British Sign Language and audiovisual English processing in native users. Brain 125, 1583-1593.
Massaro, D. W., Thompson, L. A., Barron, B., and Laren, E. (1986). Developmental changes in visual and
auditory contributions to speech perception. J Exp Child Psychol 41, 93-113.
Matsuzaka, Y., Aizawa, H., and Tanji, J. (1992). A motor area rostral to the supplementary motor area
(presupplementary motor area) in the monkey: neuronal activity during a learned motor task. J
Neurophysiol 68, 653-662.
Mattingly, I. G., and Studdert-Kennedy, M., eds. (1991). Modularity and the motor theory of speech
perception (Hillsdale, NJ, Erlbaum).
McGurk, H., and MacDonald, J. (1976). Hearing lips and seeing voices. Nature 264, 746-748.
McIntosh, A. R., and Gonzalez-Lima, F. (1992). The application of structural modeling to metabolic
mapping of functional neural systems. In NATO ASI series: Advances in metabolic mapping techniques
for brain imaging of behavioral and learning functions, F. Gonzalez-Lima, T. Finkenstadt, and H. Scheich,
eds. (Dordrecht, Kluwer Academic), pp. 219-258.
McIntosh, A. R., and Gonzalez-Lima, F. (1994). Structural equation modeling and its application to
network analysis in functional brain imaging. Hum Brain Mapp 2.
McIntosh, A. R. (1998). Understanding neural interactions in learning and memory using functional
neuroimaging. Ann N Y Acad Sci 855, 556-571.
McIntosh, A. R., Grady, C. L., Haxby, J. V., Ungerleider, L. G., and Horwitz, B. (1996). Changes in limbic
and prefrontal functional interactions in a working memory task for faces. Cereb Cortex 6, 571-584.
Mechelli, A., Price, C. J., Noppeney, U., and Friston, K. J. (2003). A dynamic causal modeling study on
category effects: bottom-up or top-down mediation? J Cogn Neurosci 15, 925-934.
Meltzoff, A. N. (1990). Towards a developmental cognitive science: The implications of cross-modal
matching and imitation for the development of representation and memory in infants. Annals of New York
Academy of Sciences 608.
Miller, E. K., and Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annu Rev
Neurosci 24, 167-202.
Mistlin, A. J., and Perrett, D. I. (1990). Visual and somatosensory processing in the macaque temporal
cortex: the role of 'expectation'. Exp Brain Res 82, 437-450.
Morel, A., Garraghty, P. E., and Kaas, J. H. (1993). Tonotopic organization, architectonic fields, and
connections of auditory cortex in macaque monkeys. J Comp Neurol 335, 437-459.
Mottonen, R., Krause, C. M., Tiippana, K., and Sams, M. (2002). Processing of changes in visual speech in
the human auditory cortex. Brain Res Cogn Brain Res 13, 417-425.
Munhall, K. G., and Tohkura, Y. (1998). Audiovisual gating and the time course of speech perception. J
Acoust Soc Am 104, 530-539.
Nakamura, H., Kuroda, T., Wakita, M., Kusunoki, M., Kato, A., Mikami, A., Sakata, H., and Itoh, K.
(2001). From three-dimensional space vision to prehensile hand movements: the lateral intraparietal area
links the area V3A and the anterior intraparietal area in macaques. J Neurosci 21, 8174-8187.
Ojanen, V., Mottonen, R., Pekkola, J., Jaaskelainen, I. P., Joensuu, R., Autti, T., and Sams, M. (2005).
Processing of audiovisual speech in Broca's area. Neuroimage 25, 333-338.
Olson, I. R., Gatenby, J. C., and Gore, J. C. (2002). A comparison of bound and unbound audio-visual
information processing in the human cerebral cortex. Brain Res Cogn Brain Res 14, 129-138.
Paulesu, E., Perani, D., Blasi, V., Silani, G., Borghese, N. A., De Giovanni, U., Sensolo, S., and Fazio, F.
(2003). A functional-anatomical model for lipreading. J Neurophysiol 90, 2005-2013.
Pekkola, J., Laasonen, M., Ojanen, V., Autti, T., Jaaskelainen, I. P., Kujala, T., and Sams, M. (2006).
Perception of matching and conflicting audiovisual speech in dyslexic and fluent readers: an fMRI study at
3 T. Neuroimage 29, 797-807.
Pekkola, J., Ojanen, V., Autti, T., Jaaskelainen, I. P., Mottonen, R., Tarkiainen, A., and Sams, M. (2005).
Primary auditory cortex activation by visual speech: an fMRI study at 3 T. Neuroreport 16, 125-128.
Penny, W. D., Stephan, K. E., Mechelli, A., and Friston, K. J. (2004a). Comparing dynamic causal models.
Neuroimage 22, 1157-1172.
Penny, W. D., Stephan, K. E., Mechelli, A., and Friston, K. J. (2004b). Modelling functional integration: a
comparison of structural equation and dynamic causal models. Neuroimage 23 Suppl 1, S264-274.
Petrides, M. (1985). Deficits on conditional associative-learning tasks after frontal- and temporal-lobe
lesions in man. Neuropsychologia 23, 601-614.
Petrides, M. (1991). Functional specialization within the dorsolateral frontal cortex for serial order memory.
Proc Biol Sci 246, 299-306.
Petrides, M., Alivisatos, B., and Frey, S. (2002). Differential activation of the human orbital, mid-ventrolateral, and mid-dorsolateral prefrontal cortex during the processing of visual stimuli. Proc Natl Acad
Sci U S A 99, 5649-5654.
Petrides, M., and Pandya, D. N. (2002). Comparative cytoarchitectonic analysis of the human and the
macaque ventrolateral prefrontal cortex and corticocortical connection patterns in the monkey. Eur J
Neurosci 16, 291-310.
Puce, A., Allison, T., Asgari, M., Gore, J. C., and McCarthy, G. (1996). Differential sensitivity of human
visual cortex to faces, letterstrings, and textures: a functional magnetic resonance imaging study. J Neurosci
16, 5205-5215.
Puce, A., Allison, T., Bentin, S., Gore, J. C., and McCarthy, G. (1998). Temporal cortex activation in
humans viewing eye and mouth movements. J Neurosci 18, 2188-2199.
Puce, A., Allison, T., Gore, J. C., and McCarthy, G. (1995). Face-sensitive regions in human extrastriate
cortex studied by functional MRI. J Neurophysiol 74, 1192-1199.
Rauschecker, J. P. (1997). Processing of complex sounds in the auditory cortex of cat, monkey, and man.
Acta Otolaryngol Suppl 532, 34-38.
Rauschecker, J. P., Tian, B., and Hauser, M. (1995). Processing of complex sounds in the macaque
nonprimary auditory cortex. Science 268, 111-114.
Reisberg, D., McLean, J., and Goldfield, A. (1987). Easy to hear but hard to understand: a lipreading
advantage with intact auditory stimuli. In Hearing by Eye: The Psychology of Lip-Reading, B. Dodd, and R.
Campbell, eds. (Hillsdale, New Jersey, Lawrence Erlbaum Associates), pp. 97-113.
Rizzolatti, G., and Arbib, M. A. (1998). Language within our grasp. Trends Neurosci 21, 188-194.
Rizzolatti, G., and Fadiga, L. (1998). Grasping objects and grasping action meanings: the dual role of
monkey rostroventral premotor cortex (area F5). Novartis Found Symp 218, 81-95; discussion 95-103.
Rizzolatti, G., Fogassi, L., and Gallese, V. (2002). Motor and cognitive functions of the ventral premotor
cortex. Curr Opin Neurobiol 12, 149-154.
Rizzolatti, G., Luppino, G., and Matelli, M. (1998). The organization of the cortical motor system: new
concepts. Electroencephalogr Clin Neurophysiol 106, 283-296.
Robert-Ribes, J., Schwartz, J. L., and Escudier, P. (1995). A comparison of models for fusion of the
auditory and visual sensors in speech perception. Artificial Intelligence Review 9.
Rockland, K. S., and Ojima, H. (2003). Multisensory convergence in calcarine visual areas in macaque
monkey. Int J Psychophysiol 50, 19-26.
Romanski, L. M., Bates, J. F., and Goldman-Rakic, P. S. (1999). Auditory belt and parabelt projections to
the prefrontal cortex in the rhesus monkey. J Comp Neurol 403, 141-157.
Rosenblum, L. D. (2005). The primacy of multimodal speech perception. In Handbook of Speech Perception,
D. Pisoni, and R. Remez, eds. (Malden, MA, Blackwell), pp. 51-78.
Rosenblum, L. D., Schmuckler, M. A., and Johnson, J. A. (1997). The McGurk effect in infants. Percept
Psychophys 59, 347-357.
Sakai, K. L., Tatsuno, Y., Suzuki, K., Kimura, H., and Ichida, Y. (2005). Sign and speech: amodal
commonality in left hemisphere dominance for comprehension of sentences. Brain 128, 1407-1417.
Saleem, K. S., Suzuki, W., Tanaka, K., and Hashikawa, T. (2000). Connections between anterior
inferotemporal cortex and superior temporal sulcus regions in the macaque monkey. J Neurosci 20, 5083-5101.
Sams, M., Aulanko, R., Hamalainen, M., Hari, R., Lounasmaa, O. V., Lu, S. T., and Simola, J. (1991).
Seeing speech: visual information from lip movements modifies activity in the human auditory cortex.
Neurosci Lett 127, 141-145.
Scott, S. K., Blank, C. C., Rosen, S., and Wise, R. J. (2000). Identification of a pathway for intelligible
speech in the left temporal lobe. Brain 123 Pt 12, 2400-2406.
Sekiyama, K. (1997). Cultural and linguistic factors in audiovisual speech processing: the McGurk effect in
Chinese subjects. Percept Psychophys 59, 73-80.
Sekiyama, K., and Tohkura, Y. (1991). McGurk effect in non-English listeners: few visual effects for
Japanese subjects hearing Japanese syllables of high auditory intelligibility. J Acoust Soc Am 90, 1797-1805.
Sekiyama, K., and Tohkura, Y. (1993). Inter-language differences in the influence of visual cues in speech
perception. Journal of Phonetics 21, 427-444.
Seltzer, B., and Pandya, D. N. (1991). Post-rolandic cortical projections of the superior temporal sulcus in
the rhesus monkey. J Comp Neurol 312, 625-640.
Seltzer, B., and Pandya, D. N. (1994). Parietal, temporal, and occipital projections to cortex of the superior
temporal sulcus in the rhesus monkey: a retrograde tracer study. J Comp Neurol 343, 445-463.
Shima, K., Hoshi, E., and Tanji, J. (1996). Neuronal activity in the claustrum of the monkey during
performance of multiple movements. J Neurophysiol 76, 2115-2119.
Shima, K., and Tanji, J. (1998). Both supplementary and presupplementary motor areas are crucial for the
temporal organization of multiple movements. J Neurophysiol 80, 3247-3260.
Shima, K., and Tanji, J. (2000). Neuronal activity in the supplementary and presupplementary motor areas
for temporal organization of multiple movements. J Neurophysiol 84, 2148-2160.
Skipper, J. I., Nusbaum, H. C., and Small, S. L. (2005). Listening to talking faces: motor cortical activation
during speech perception. Neuroimage 25, 76-89.
Sumby, W., and Pollack, I. (1954). Visual Contribution to Speech Intelligibility in Noise. Journal of
the Acoustical Society of America 26, 212-215.
Summerfield, A. Q. (1987). Some preliminaries to a comprehensive account of audiovisual speech
perception. In Hearing by eye: The psychology of lipreading, B. Dodd, and R. Campbell, eds. (Hillsdale,
NJ, Erlbaum).
Summerfield, A. Q. (1991). Visual perception of phonetic gestures. In Modularity and the motor theory of
speech perception, I. G. Mattingly, and M. Studdert-Kennedy, eds. (New Jersey, Erlbaum), pp. 117-137.
Summerfield, A. Q., MacLeod, A., McGrath, M., and Brooke, N. M. (1989). Lips, teeth, and the benefits of
lipreading. In Handbook of Research in Face Processing, A. W. Young, and H. D. Ellis, eds. (Amsterdam).
Surguladze, S. A., Calvert, G. A., Brammer, M. J., Campbell, R., Bullmore, E. T., Giampietro, V., and
David, A. S. (2001). Audio-visual speech perception in schizophrenia: an fMRI study. Psychiatry Res 106,
1-14.
Tanji, J., and Shima, K. (1994). Role for supplementary motor area cells in planning several movements
ahead. Nature 371, 413-416.
Tanji, K., Suzuki, K., Yamadori, A., Tabuchi, M., Endo, K., Fujii, T., and Itoyama, Y. (2001). Pure
anarthria with predominantly sequencing errors in phoneme articulation: a case report. Cortex 37, 671-678.
Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., Mazoyer, B.,
and Joliot, M. (2002). Automated anatomical labeling of activations in SPM using a macroscopic
anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15, 273-289.
Watkins, K., and Paus, T. (2004). Modulation of motor excitability during speech perception: the role of
Broca's area. J Cogn Neurosci 16, 978-987.
Wessinger, C. M., VanMeter, J., Tian, B., Van Lare, J., Pekar, J., and Rauschecker, J. P. (2001).
Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging.
J Cogn Neurosci 13, 1-7.
Wilson, S. M., Saygin, A. P., Sereno, M. I., and Iacoboni, M. (2004). Listening to speech activates motor
areas involved in speech production. Nat Neurosci 7, 701-702.
Wise, R. J., Scott, S. K., Blank, S. C., Mummery, C. J., Murphy, K., and Warburton, E. A. (2001). Separate
neural subsystems within 'Wernicke's area'. Brain 124, 83-95.
Wright, T. M., Pelphrey, K. A., Allison, T., McKeown, M. J., and McCarthy, G. (2003). Polysensory
interactions along lateral temporal regions evoked by audiovisual speech. Cereb Cortex 13, 1034-1043.
Zatorre, R. J., and Belin, P. (2001). Spectral and temporal processing in human auditory cortex. Cereb
Cortex 11, 946-953.
Zatorre, R. J., Belin, P., and Penhune, V. B. (2002a). Structure and function of auditory cortex: music and
speech. Trends Cogn Sci 6, 37-46.
Zatorre, R. J., Bouffard, M., Ahad, P., and Belin, P. (2002b). Where is 'where' in the human auditory
cortex? Nat Neurosci 5, 905-909.