Introduction

This document outlines a functional magnetic resonance imaging (fMRI) experiment designed to probe specific aspects of the auditory representation of speech sounds. The experiment examines the neural response to a sudden disruption of the auditory feedback loop, elicited by an unexpected real-time acoustic shift of the subject's own speech.
Background
When infants first learn to speak, they are learning a mapping between speech gestures and the
sounds these gestures allow them to produce. Their own voices act as auditory feedback,
enabling a precise tuning, over time, of the motor-acoustic mapping. Adults with years of speech
practice have well-tuned neural mappings. When there is a disconnect between the expected and
observed acoustic consequences of an articulatory gesture, feedback control allows detection and
then correction of the error. By influencing subjects’ perception of their own speech, we can
induce such a discrepancy and elicit compensatory movements that counteract the perceived
error.
Introducing gradual shifts in formant structure [1] or pitch [2] causes subjects to gradually adapt to the perturbation, producing speech with an opposing shift. In addition, recent experiments [3,4] have shown that subjects can compensate for rapidly introduced perturbations, both acoustic and motor. In this experiment, sudden auditory perturbations of subjects' speech will be introduced to induce activity in the auditory error cells that detect this mismatch. Furthermore, we will attempt to distinguish between
Stimuli
Stimuli will consist of written text indicating the word the subject should speak aloud on each trial. The stimulus set will include one-syllable CVC words. On a randomly chosen fraction of trials, the subject's speech will be abruptly perturbed: the vowel's first formant (F1) will be shifted either up or down by approximately 200 Hz.
We have chosen a deviant rate of 20%. This is slightly larger than the 15% guideline used in other standard/deviant experiments, but because the direction of the shift is unpredictable (10% in each direction at random), the larger percentage of deviants maximizes usable data while staying within the time constraints of an fMRI experiment.
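For concreteness, the following sketch (Python; not the actual presentation software, and all names are assumptions) shows one way the 160 trials could be assigned so that each word receives 64 unperturbed, 8 up-shifted, and 8 down-shifted presentations, matching Table 2 below.

import random

WORDS = ["bet", "putt"]
REPS_PER_WORD = 80      # 80 repetitions of each word, 160 trials total
SHIFT_HZ = 200.0        # approximate magnitude of the F1 perturbation

def build_trial_list(seed=None):
    """Assign conditions per word: 80% unperturbed, 10% up-shifted, 10% down-shifted."""
    rng = random.Random(seed)
    trials = []
    for word in WORDS:
        n_dev = REPS_PER_WORD // 10   # 8 deviants in each direction per word
        conditions = (["up"] * n_dev + ["down"] * n_dev +
                      ["none"] * (REPS_PER_WORD - 2 * n_dev))
        rng.shuffle(conditions)
        for cond in conditions:
            shift = {"up": +SHIFT_HZ, "down": -SHIFT_HZ, "none": 0.0}[cond]
            trials.append({"word": word, "condition": cond, "f1_shift_hz": shift})
    rng.shuffle(trials)   # interleave the two words across the session
    return trials

# Quick check: counts should match Table 2 (128 unperturbed, 16 up, 16 down)
trials = build_trial_list(seed=1)
for cond in ("none", "up", "down"):
    print(cond, sum(t["condition"] == cond for t in trials))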
The direction and degree of perturbation depend on the asymmetry of the subject's phonetic boundaries. A subject whose normal production of "bet" is acoustically closer to their æ/ε boundary than their I/ε boundary will experience the percept shown in gray in Table 1: an upward shift will be heard as "bat," whereas a downward shift of the same acoustic amount will be heard as a bad version of "bet." However, a subject with different phonetic boundaries, or a different vowel pronunciation, could experience the reverse percept shown in white. Subjects will be screened prior to scanning to establish their phonetic boundaries, thus determining which directions and degrees of shift are appropriate.
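As an illustration of how the screening data determine the expected percept, here is a minimal sketch; the function and the F1 values are hypothetical placeholders, not the actual screening procedure, and the example numbers are chosen only to reproduce the "gray" case described above.

def predict_percepts(f1_production, f1_ae_eh_boundary, f1_ih_eh_boundary,
                     shift_hz=200.0):
    """Predict how an up/down F1 shift of a produced /ε/ ("bet") will be heard.

    f1_production:      subject's typical F1 for the vowel in "bet" (Hz)
    f1_ae_eh_boundary:  F1 of the subject's æ/ε boundary (higher-F1 side)
    f1_ih_eh_boundary:  F1 of the subject's I/ε boundary (lower-F1 side)
    """
    up = f1_production + shift_hz     # raising F1 pushes the vowel toward /æ/
    down = f1_production - shift_hz   # lowering F1 pushes the vowel toward /I/
    up_percept = "bat [bæt]" if up > f1_ae_eh_boundary else "bad bet *[bεt]"
    down_percept = "bit [bIt]" if down < f1_ih_eh_boundary else "bad bet *[bεt]"
    return up_percept, down_percept

# Placeholder values: production closer to the æ/ε boundary than to the I/ε boundary
print(predict_percepts(f1_production=580.0,
                       f1_ae_eh_boundary=700.0,
                       f1_ih_eh_boundary=350.0))
# -> ('bat [bæt]', 'bad bet *[bεt]'), i.e. the "gray" percept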
                          Text stimulus "bet"     Text stimulus "putt"
Up-pert     (gray)        bat [bæt]               pot [pɑt]
            (white)       bet *[bεt]              putt *[pʌt]
Down-pert   (gray)        bet *[bεt]              putt *[pʌt]
            (white)       bit [bIt]               put [pƱt]
Table 1. Possible stimulus percepts per condition. Each cell lists the "gray" percept described above and, below it, the reverse ("white") percept. An asterisk marks a percept heard as a poorly produced token of the intended word.
Stimulus   Unperturbed   Up-pert   Down-pert   Total
[bVt]      64            8         8           80
[pVt]      64            8         8           80
Total      128           16        16          160
Table 2. Number of stimulus presentations per condition.
Each stimulus will be presented visually for ~2 seconds, with eighty (80) repetitions of each word. Upon seeing the word, the subject will speak it aloud; the 2-second window following stimulus onset gives the subject time to produce the one-syllable word. During this execution phase the perturbation will be introduced: 10% of trials will be accompanied by a downward shift and 10% by an upward shift, while the remaining 80% of trials will be unperturbed.
After the 2-second delay, the stimulus presentation software will trigger the scanner to collect two
volumes of functional data. There will then be a pause of 10 seconds before the next trial to
allow for the return of the BOLD signal to the steady state.
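The per-trial structure can be summarized in a short sketch; show_word, apply_perturbation, and trigger_scanner are hypothetical callbacks standing in for the actual display, real-time feedback processing, and scanner-trigger hardware.

import time

STIM_DURATION_S = 2.0     # word on screen; the subject speaks during this window
REST_DURATION_S = 10.0    # pause for the BOLD signal to return to steady state
VOLUMES_PER_TRIAL = 2     # functional volumes acquired after the speech window

def run_trial(word, condition, show_word, apply_perturbation, trigger_scanner):
    """Run one trial: present the word, enable the perturbation, speak, scan, rest."""
    show_word(word)                     # visual stimulus onset
    apply_perturbation(condition)       # "up", "down", or "none" for this trial
    time.sleep(STIM_DURATION_S)         # execution phase: subject produces the word
    trigger_scanner(VOLUMES_PER_TRIAL)  # stimulus software triggers the scanner
    time.sleep(REST_DURATION_S)         # inter-trial rest before the next word

# Example with dummy callbacks
run_trial("bet", "up",
          show_word=lambda w: print("display:", w),
          apply_perturbation=lambda c: print("perturbation:", c),
          trigger_scanner=lambda n: print("acquire", n, "volumes"))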
Timeline for a single trial: the inter-stimulus interval is approximately 15-18 seconds.
- 150 trials x ~15-18 seconds = ~38-45 minutes total experiment length (breaks still to be added)
- 5 runs of 30 trials x ~18 seconds/trial = ~9 minutes per run, with 24 unperturbed, 3 up-pert, and 3 down-pert trials per run
- 160 trials x 20% perturbed stimuli = 32 deviant trials across (at most) 4 deviant conditions
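The timing arithmetic above can be checked with a few lines (a sketch using the trial counts from these notes):

TRIALS_PER_RUN = 30
SECONDS_PER_TRIAL = 18
N_RUNS = 5

run_minutes = TRIALS_PER_RUN * SECONDS_PER_TRIAL / 60             # 9.0 minutes per run
total_trials = N_RUNS * TRIALS_PER_RUN                            # 150 trials
total_minutes = (total_trials * 15 / 60, total_trials * 18 / 60)  # (37.5, 45.0) minutes
print(run_minutes, total_trials, total_minutes)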
Experimental Protocol
The experiment is event-related, employing the triggering mechanism used in past studies. Because the image acquisition is timed to occur several seconds after stimulus onset, subjects speak in relative silence. The acquisition parameters will be typical of those used in previous experiments (echo planar imaging; 30 slices covering the entire cortex and cerebellum, aligned to the AC-PC line; 5 mm slice thickness; 0 mm gap between slices; flip angle = 90°).
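For reference, the acquisition parameters above can be gathered into a simple configuration sketch; the field names are assumptions and do not correspond to any particular scanner or analysis API.

EPI_PARAMS = {
    "sequence": "echo planar imaging",
    "n_slices": 30,                  # covering the entire cortex and cerebellum
    "slice_alignment": "AC-PC line",
    "slice_thickness_mm": 5.0,
    "slice_gap_mm": 0.0,
    "flip_angle_deg": 90,
    "volumes_per_trial": 2,          # triggered ~2 s after stimulus onset
}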
Subjects
Subjects will consist of right-handed men and women, ages 18-55, whose first language is
American English.
References
[1] Houde JF and Jordan MI (1998). Sensorimotor adaptation in speech production. Science 279:1213-16.
[2] Jones JA and Munhall KG (2000). Perceptual calibration of F0 production: Evidence from feedback perturbation. J. Acoust. Soc. Am. 108(3):1246-51.
[3] Xu et al. (2004). Compensation for pitch-shifted auditory feedback during the production of Mandarin tone sequences. J. Acoust. Soc. Am. 116(2):1168-78.
[4] Tourville et al. (?). Effects of acoustic and articulatory perturbation on cortical activity during speech production. (poster, APE/PPB study)