Introduction

This document outlines two functional magnetic resonance imaging experiments, the theory behind them, and their expected results. At this time, only the first experiment is described in detail. Both experiments are designed to examine specific aspects of speech motor planning, in particular the vocalization of memory-guided sequences of syllables. Very generally, the first experiment aims to identify the neural substrates involved in speech sequence performance and considers how sequence items are represented. The second experiment deals with issues of timing, including rhythm, syllable stress, and speaking rate.

Background

Fluent speech requires the complex coordination of multiple movements at appropriate times and in the appropriate temporal order. The DIVA model does not currently address the sequencing of multiple speech movements realistically. Instead, individual phonemes are treated as targets, and once a target is reached, the next target is loaded algorithmically from a list. The neural mechanisms responsible for representing and executing a sequence of sounds, including temporal characteristics such as rhythm, stress, and speaking rate, need to be addressed more seriously, with a focus on understanding the true neural systems responsible for sequence regulation.

Summary of Data

Still working on this… coming soon.

Experiment I: Neural Correlates of Speech Sequences

Stimuli

Stimuli consist of sequences of five syllables that vary in sequential complexity and in syllabic complexity. Here, sequential complexity refers to the intricacy of the sequence across syllables. That is, a sequence in which the same syllable (irrespective of the syllable content itself) is repeated five times is the least complex sequence; a sequence consisting of five different syllables is the most complex sequence. Syllabic complexity refers to the intricacy of the articulations specified by an individual syllable.
For example, the syllable ‘ba’, which requires only a bilabial closure prior to the steady-state vowel /a/, has less syllabic complexity than the syllable ‘bla’, which requires an additional tongue movement to reach /l/ between the lip closure and the vowel.

Note: Sequences that vary in complexity have been used successfully in functional imaging of finger movement sequences (e.g. Boecker et al, 1998), with the idea that increasing complexity requires additional resources in areas used in sequence execution. These results are discussed in the Expected Results section below.

The four stimulus types to be used, as well as a control condition, are summarized below:

Condition 1: Low syllabic complexity, low sequential complexity
  Example: ba-ba-ba-ba-ba
  Description: Syllables are simple CVs, and each syllable is repeated 5 times, with each syllable of equal duration and without stress.

Condition 2: High syllabic complexity, low sequential complexity
  Example: stra-stra-stra-stra-stra
  Description: Syllables have more complex phonemic content (CCCVs), with each syllable repeated 5 times.

Condition 3: Low syllabic complexity, high sequential complexity
  Example: ba-di-ta-ro-gu
  Description: Syllables are CVs, but each syllable is a different CV, such that sequence content is richer.

Condition 4: High syllabic complexity, high sequential complexity
  Example: stra-fla-bro-kli-shro
  Description: Phonemic content of each syllable is more complicated than a CV, and each syllable is different.

Condition 5: Presentation of nonsense characters arranged similarly to conditions 1-4
  Description: Simple control that should help cancel out visual cue information and provide a baseline. Nonsense characters should prevent formation of a speech-like sequence.

In each case, the subject will be presented visually, for a short duration (~2 seconds), with an example of the sequence that they are to produce (using the notation shown above). The visual cue will then be replaced by a white fixation point for n_p milliseconds. Subjects are instructed to memorize the sequential and temporal content of the sequence, and to prepare to repeat the utterance.
Thus, the period during which subjects see the white fixation point can be thought of as a preparation phase. It is important that the sequence information is not available externally to the subject when the utterance is initiated, as externally guided and memory-guided sequence execution may use separate neural circuits (see Goldberg 1985, Passingham 1994 for instance).

Following the brief preparation phase (approximately 1-2 seconds, chosen randomly), the subject will either be presented with a “go” signal (the fixation point changes color), or nothing will change. In the case in which the “go” signal is received, the subject will then utter the sequence of syllables as trained. (Prior to the experiment, subjects will be given examples and will practice for a short time under the supervision of the experimenter to ensure that syllables are produced at a normal speaking rate, with approximately even duration and without stress on any particular syllable.) In the case where nothing changes, the subject will remain in the preparation phase, as nothing will have cued them to switch to execution, nor will they be able to anticipate a change in the experimental phase, since the length of the preparation phase is random.

Note: In order to save time, and to collect more data for each condition, we could limit the ‘no go’ case to one or two of stimulus types 1-4. Since the go type is really designed to look at motor execution, I would not expect to see different areas show up in the go/no-go contrasts for different stimulus types.

Following an approximately 2-3 second execution phase during which the scanner will remain silent (when the ‘go’ signal is not received, this phase can be effectively removed from the design), the stimulus presentation software will trigger the scanner to collect three full volumes of functional data (see Experimental Protocol below).
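The phases of a single trial can be laid out as a simple schedule. The following is only a minimal sketch, not the actual stimulus presentation software: the function names are hypothetical, the uniform distributions are an assumption (the design only says the durations are random), and the 2 s TR and 10 s recovery delay are the approximate values given elsewhere in this document.

```python
import random


def build_trial(go, rng, cue_s=2.0, prep_range_s=(1.0, 2.0),
                exec_range_s=(2.0, 3.0), n_volumes=3, tr_s=2.0, rest_s=10.0):
    """Return a list of (phase, onset_s, duration_s) tuples for one trial.

    All defaults are approximate values from the design; drawing the random
    durations from a uniform distribution is an assumption of this sketch.
    """
    phases = [
        ("cue", cue_s),                                 # visual presentation of the sequence
        ("preparation", rng.uniform(*prep_range_s)),    # white fixation point
    ]
    if go:
        # The silent execution phase exists only when the "go" signal is shown.
        phases.append(("execution", rng.uniform(*exec_range_s)))
    phases.append(("acquisition", n_volumes * tr_s))    # three triggered volumes
    phases.append(("rest", rest_s))                     # BOLD returns to steady state

    schedule = []
    onset = 0.0
    for name, duration in phases:
        schedule.append((name, onset, duration))
        onset += duration
    return schedule


def trial_length(schedule):
    """Total trial duration: onset plus duration of the final phase."""
    _, onset, duration = schedule[-1]
    return onset + duration


rng = random.Random(0)  # fixed seed so the example is repeatable
for name, onset, duration in build_trial(go=True, rng=rng):
    print(f"{name:12s} onset={onset:6.2f} s  duration={duration:5.2f} s")
```

Under these assumed values a ‘go’ trial lasts between 21 and 23 seconds, which is in line with the inter-stimulus interval of approximately 20-25 seconds stated for the design.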
Following this acquisition, there will be an approximately 10 second delay until the next stimulus presentation in order to allow the BOLD signal to return to steady state.

Trial phases, in order along the time axis:
1. Visual presentation of sequence to be uttered (n_s ms, fixed)
2. Preparation phase: white fixation point (n_p ms, pseudo-random)
3. Execution phase: red fixation point (n_e ms, random jitter)
4. Data acquisition phase (3 TRs + delay)

Figure 1: Timeline for the presentation of a single stimulus

In total, the inter-stimulus interval is approximately 20-25 seconds.

Experimental Protocol

The experiment is event-related, using the triggering mechanism we have developed and used in past studies. This will allow the subjects to speak in relative silence, which keeps the production task as close to speech under natural conditions as possible inside the magnet (see Munhall, 2001). The scanner will be triggered following the execution phase (see Figure 1), and three full volumes will be collected. The acquisition parameters will be typical of those used in our previous experiments (echo planar imaging, 30 slices covering the entire cortex and cerebellum aligned to the AC-PC line, 5 mm slice thickness, 0 mm gap between slices, flip angle = 90°), with the exception that the TR will be reduced to a minimal or near-minimal duration for the magnet used (approximately 2 s in the Siemens 3T Trio whole-body system present at MGH Bay 4).

Subjects

Subjects will consist of right-handed men and women whose first language was American English. Right-handedness is required to reduce the probability of right-lateralized speech processing.

Expected Results

By contrasting each of conditions 1-4 with condition 5, we should reveal the neural circuits involved in representing and executing sequences of speech sounds. I expect to see activations in SMA, pre-SMA, basal ganglia, cerebellum, primary motor cortex, ventral premotor cortex, and anterior cingulate, as well as in auditory areas responding to the sound of one’s own voice.
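The comparison of each speech condition with the control condition can be written as a vector of contrast weights. The sketch below is illustrative only: it assumes one response estimate (beta) per condition at a given voxel, which this document does not specify, and the function names and example beta values are hypothetical.

```python
N_CONDITIONS = 5   # conditions 1-4 are speech sequences, condition 5 is the control
CONTROL = 5


def contrast_vs_control(condition, n_conditions=N_CONDITIONS, control=CONTROL):
    """Weights for 'condition minus control': +1 on the condition, -1 on control."""
    if condition == control or not 1 <= condition <= n_conditions:
        raise ValueError("condition must be a non-control condition number")
    weights = [0.0] * n_conditions
    weights[condition - 1] = 1.0    # condition numbers are 1-based
    weights[control - 1] = -1.0
    return weights


def apply_contrast(weights, betas):
    """Weighted sum of per-condition estimates for one voxel."""
    return sum(w * b for w, b in zip(weights, betas))


# Hypothetical per-condition estimates for a single voxel (illustration only).
example_betas = [1.2, 1.5, 1.9, 2.3, 0.4]

for condition in range(1, 5):
    weights = contrast_vs_control(condition)
    effect = apply_contrast(weights, example_betas)
    print(f"condition {condition} vs control: weights={weights} effect={effect:.2f}")
```

Each weight vector sums to zero, so activity common to the condition and the control (such as the response to the visual cue) cancels in the subtraction.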
Go / No Go Contrasts

Because in some cases the subject will receive no ‘go’ signal, we can contrast each of conditions 1-4 with a ‘go’ signal against the corresponding ‘no go’ case. Since the sequential information remains stored (and presumably prepared and ready for use) in the ‘no go’ case, subtracting the ‘no go’ cases from the ‘go’ cases should reveal areas related to sequence selection and initiation, rather than sequence representation. I would expect to see increased activation in basal ganglia and SMA proper (but not pre-SMA) as well as motor cortex in these subtractions.

2 vs. 1 (and also 4 vs. 3): These contrasts only change the phonemic (articulatory) composition of each syllable in the sequence. Some researchers (e.g. MacNeilage, 1998) have suggested that the “frame” and “content” for speech sequences may be somewhat independent of one another. In these contrasts, the “frame” (meant here as the sequential content across syllables) is constant, but the “content” of each syllable is varied. Additionally, at least one case study (Ziegler et al, 1997) suggests that SMA may be “blind to the segmental content of each syllable.” If there is no significant difference in SMA and pre-SMA between the simple and complex syllable content tasks proposed here, this may lend support to these ideas. If there is a significant difference in activation, we might consider that sequential representations in SMA/pre-SMA have access to sub-syllabic information.

3 vs. 1 (and 4 vs. 2): The difference between these conditions is the level of sequential complexity. Presumably, the representation of a heterogeneous sequence requires more resources in the brain region(s) responsible for that representation than does a homogeneous sequence. This positive correlation between sequence complexity and BOLD response has been seen in non-speech sequencing tasks.
For example, Boecker et al (1998), using PET, showed positive correlations between rCBF and key-press sequence complexity in rostral SMA (pre-SMA) and the associated pallido-thalamic loop, as well as in parietal area 7 and primary motor cortex. This supports the notion that pre-SMA is primarily responsible for sequence representation, and I would expect to see a similar positive correlation between sequence complexity and BOLD response in speech tasks. Since I am hypothesizing that SMA proper is responsible for initiation of the motor plan, I would not expect to see such a correlation between complexity and BOLD response there.

References

Ackermann, H., Konczak, J., Hertrich, I. (1997). The temporal control of repetitive articulatory movements in Parkinson’s disease. Brain and Language, 56, 312-319.
Alexander, G.E., Crutcher, M.D. (1990). Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends in Neurosciences, 13(7), 266-271.
Boecker, H., Dagher, A., Ceballos-Baumann, A.O., Passingham, R.E., Samuel, M., Friston, K.J., Poline, J.-B., Dettmers, C., Conrad, B., Brooks, D.J. (1998). Role of the human rostral supplementary motor area and the basal ganglia in motor sequence control: Investigations with H₂¹⁵O PET. Journal of Neurophysiology, 79, 1070-1080.
Clower, W.T., Alexander, G.E. (1998). Movement sequence-related activity reflecting numerical order of components in supplementary and presupplementary motor areas. Journal of Neurophysiology, 80, 1562-1566.
Crosson, B., Sadek, J.R., Maron, L., Gökcay, D., Mohr, C.M., Auerbach, E.J., Freeman, A.J., Leonard, C.M., Briggs, R.W. (2001). Journal of Cognitive Neuroscience, 13(2), 272-283.
Goldberg, G. (1985). Supplementary motor area structure and function: review and hypotheses. Behavioral and Brain Sciences, 8, 567-616.
Hikosaka, O., Nakahara, H., Rand, M.K., Sakai, K., Lu, X., Nakamura, K., Miyachi, S., Doya, K. (1999). Parallel neural networks for learning sequential procedures. Trends in Neurosciences, 22(10), 464-471.
Hlustik, P., Solodkin, A., Gullapalli, R.P., Noll, D.C., Small, S.L. (2002). Functional lateralization of the human premotor cortex during sequential movements. Brain and Cognition, 49, 54-62.
Ho, A.K., Bradshaw, J.L., Cunnington, R., Phillips, J.G., Iansek, R. (1998). Sequence heterogeneity in parkinsonian speech. Brain and Language, 64, 122-145.
Ivry, R.B. (1996). The representation of temporal information in perception and motor control. Current Opinion in Neurobiology, 6, 851-857.
Jonas, S. (1987). The supplementary motor region and speech. In: The Frontal Lobes Revisited. New York: IRBN Press, 241-250.
Keller, E. (1990). Speech motor timing. In: Hardcastle, W.J., Marchal, A. (eds.), Speech Production and Speech Modelling. Dordrecht: Kluwer Academic Publishers, 343-364.
Lieberman, P. (2001). Human language and our reptilian brain: the subcortical bases of speech, syntax, and thought. Perspectives in Biology and Medicine, 44(1), 32-51.
Macar, F., Lejeune, H., Bonnet, M., Ferrara, A., Pouthas, V., Vidal, F., Maquet, P. (2002). Activation of the supplementary motor area and of attentional networks during temporal processing. Experimental Brain Research, 142, 475-485.
MacNeilage, P.F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21, 499-546.
Munhall, K.G. (2001). Functional imaging during speech production. Acta Psychologica, 107(1-3), 95-117.
Pai, M.C. (1999). Supplementary motor area aphasia: a case report. Clinical Neurology and Neurosurgery, 101, 29-32.
Picard, N., Strick, P.L. (2001). Imaging the premotor areas. Current Opinion in Neurobiology, 11, 663-672.
Pickett, E.R., Kuniholm, E., Protopapas, A., Friedman, J., Lieberman, P. (1998). Selective speech motor, syntax and cognitive deficits associated with bilateral damage to the putamen and the head of the caudate nucleus: a case study. Neuropsychologia, 36(2), 173-188.
Riecker, A., Ackermann, H., Wildgruber, D., Meyer, J., Dogil, G., Haider, H., Grodd, W. (2000). Articulatory/Phonetic sequencing at the level of the anterior perisylvian cortex: A functional magnetic resonance imaging (fMRI) study. Brain and Language, 75, 259-276.
Schubotz, R.I., von Cramon, D.Y. (2001). Interval and ordinal properties of sequences are associated with distinct premotor areas. Cerebral Cortex, 11, 210-222.
Tanji, J. (2001). Sequential organization of multiple movements: involvement of cortical motor areas. Annual Review of Neuroscience, 24, 631-651.
Tanji, K., Suzuki, K., Yamadori, A., Tabuchi, M., Endo, K., Fujii, T., Itoyama, Y. (2001). Pure anarthria with predominantly sequencing errors in phoneme articulation: a case report. Cortex, 37, 671-678.
Verwey, W.B., Lammens, R., van Honk, J. (2002). On the role of the SMA in the discrete sequence production task: a TMS study. Neuropsychologia, 40, 1268-1276.
Wildgruber, D., Kischka, U., Ackermann, H., Klose, U., Grodd, W. (1999). Dynamic pattern of brain activation during sequencing of word strings evaluated by fMRI. Cognitive Brain Research, 7, 285-294.
Ziegler, W., Kilian, B., Deger, K. (1997). The role of the left mesial frontal cortex in fluent speech: Evidence from a case of left supplementary motor area hemorrhage. Neuropsychologia, 35(9), 1197-1208.