Tracking the timecourse of multiple context effects in assimilated speech

Bob McMurray, Dept. of Brain and Cognitive Sciences, University of Rochester
David Gow, Massachusetts General Hospital
With thanks to Dana Subik, Joe Toscano & John Costalis

Overview
1) Bridging fields (Laboratory Phonology, Spoken Word Recognition) yields:
   • New solutions to old problems.
   • New questions.
2) Coping with Coronal-Place Assimilation during online recognition.
3) Implications for language processing & phonology.

Bridging Fields: Laboratory Phonology
Laboratory Phonology: How perceptual and articulatory constraints drive sound change and shape phonological systems.
Rich information source in the signal: Constraints inferred through acoustic and articulatory measures.
Do phonological constraints inform word recognition?
Can details of word recognition inform phonological constraints?

Bridging Fields: Spoken Word Recognition
Perceptual models tend to come in two varieties:
• Spoken word recognition models that assume phonemic inputs to the lexicon and meaning. (Ignore systematic acoustic variation.)
• Phoneme perception models that relate acoustic properties to categorical perception.
[Figure: categorical perception. Identification (% /pa/) and discrimination, each 0 to 100, plotted against VOT from B to P.]

Limits of categorical perception
Categorical perception (CP) is task-dependent, and does not appear to take place in tasks that involve spontaneous, naturalistic speech understanding.
McMurray, Aslin, Tanenhaus, Spivey & Subik (in prep)
Within-category variation that should be lost in CP affects lexical processes.
Andruski, Blumstein & Burton (1994); Gow & Gordon (1995); Utman, Blumstein & Burton (2000); Dahan, Magnuson, Tanenhaus & Hogan (2001); Gow (2001; 2002; 2003); McMurray, Tanenhaus & Aslin (2003)

Systematic acoustic variation and SWR
Speech perception and phonology relate signal properties to perception.
Properties of the signal must be related to meaning: lexical activation.
[Diagram label: Meaning]
Case study: English Place Assimilation

Assimilation
English coronal place assimilation:
/coronal # labial/ → [labial # labial]
/coronal # velar/ → [velar # velar]
Prior work has treated this change as
• discrete
• phonemically neutralizing
[g I m] # berries: nonword?
[k p ] # box: cat box? cap box?
How are words recognized despite neutralization?

Phonological inference (Gaskell & Marslen-Wilson, 1996; 1998; 2001)
If [labial # labial] infer /coronal # labial/
greem beans → green (Gaskell & Marslen-Wilson, 1996; Gow, 2001)
cap box → cat, cap (Gaskell & Marslen-Wilson, 2001; Gow, 2002)

Assimilation as Continuous Detail
Assimilatory modification is acoustically continuous.
[Figure: F2 and F3 transitions in /æC/ contexts. Frequency (Hz) by pitch period for coronal, assimilated, and labial tokens.]
Assimilation blends cues to two places of articulation.

An Alternative View
Assimilation redistributes and blends information.
cat box [ktp # bAks]
• Coronality of assimilated item
• Labiality of assimilating item
In theory: assimilation creates correlated cues…
Blend might facilitate recognition of context.
[ ktp # bAks ]
Assimilating context might disambiguate blend.

How can we determine if listeners use this information during recognition?
These questions require a method that:
• Measures lexical activation.
• Is sensitive to continuous acoustic detail.
• Is sensitive to temporal uptake of information.
• Measures consideration of multiple items in parallel.
Visual World Paradigm

Visual World Paradigm
• Subjects hear spoken language and manipulate objects in a visual world.
• Visual world includes a set of objects whose names represent competing hypotheses for the input.
• Eye-movements to each object are monitored throughout the task.
Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy (1995); Allopenna, Magnuson & Tanenhaus (1998)
• Fixation probability ~ lexical activation.
• Sensitive to within-category acoustic variability (McMurray, Tanenhaus & Aslin, 2003; Dahan, Magnuson, Tanenhaus & Hogan, 2001)
• Eye-movements fast and time-locked to speech: temporal dynamics.
• Multiple competitors in same trial.
• Meaning-based, natural task: subjects must interpret speech to perform task.

Experiment 1
Assimilation facilitates recognition.
Present subjects with assimilated or non-assimilated speech.
Measure activation for items that follow assimilation.

Methods
Subject hears “select the maroon goose” or “select the maroong goose”.
Prediction: More fixations to goose after assimilated consonants.
34 subjects. 16 sets of items.
Subjects exposed to pictures/names before each block.
Stimuli cross-spliced from natural tokens: assimilation is not complete… continuous acoustic information.

Spliced from “maroon duck”:
• “select the maroon duck”
• “select the maroon goose”
Spliced from “maroon goose”:
• “select the maroong duck” ***
• “select the maroong goose”

Eye-movements temporally aligned to onset of second word (goose or duck).
[Figure: fixations on individual trials (1, 2, 3, 4, 5, … many more trials) plotted over time; scale bar 200 ms.]
Target = Maroon Goose
Competitor = Maroon Duck
Unrelated = Patriotic Duck and Goose

Results
[Figure: Looks to the target (goose). Fixation proportion by time (ms), assimilated vs. non-assimilated; full view 0 to 1000 ms with a close-up of 0 to 500 ms.]
[Figure: Looks to the competitor (duck). Fixation proportion by time (ms), 0 to 1000 ms, assimilated vs. non-assimilated; p = .03*.]

Experiment 1: Summary
Continuous variation due to assimilation
• not variability to be conquered…
• signal to be used.
Assimilated coronals allow progressive operations.
• facilitate consistent targets
• exclude inconsistent competitors earlier
Consistent with prior work using priming (Gow, 2001; 2003; Gow & Im, in press)

Lexical Ambiguity?
Even incomplete modification can create lexical ambiguity.
cat box → catp box → ? (cat or cap)
Does subsequent context regressively modify the interpretation of assimilated segments?
catp box
catp drawing

Experiment 2
Subject hears “select the catp box” or “select the catp drawing”.
Prediction: Fixations to cat or cap are a function of context.
[Figure: catp box (Assim, Non-Coronal). Fixation proportion by time (ms), 0 to 1600 ms, for Coronal (cat) and Non-Coronal (cap).]
[Figure: catp drawing (Assim, Coronal). Fixation proportion by time (ms), 0 to 1600 ms, for Coronal (cat) and Non-Coronal (cap).]
Regressive effect is more biasing for non-felicitous assimilation… perceptual locus?

Progressive effects too?
Regressive effects: Context biases interpretation of ambiguous token.
Will we see a progressive effect at the same time?
Weaker effects
Possibly due to item variability
[Figure: Target and Competitor panels. Fixation proportion by time (ms), 0 to 1000 ms, non-assimilated vs. assimilated.]

Experiment 2: Summary
Assimilated coronals allow regressive operations:
• Context useful in resolving ambiguity.
Similar progressive operations to Experiment 1.
What kind of computation is responsible?
… relationship to continuous detail in signal important

Continuous Signal, Continuous Response
Progressive & regressive effects vary continuously across items.
[Figure: Experiment 1: Progressive effect on target, by item # (16 items).]
[Figure: Experiment 2: Regressive effect, by item # (16 items).]
What can the acoustic properties of these items tell us about perceptual variability?
Measured F1, F2, F3, Closure Duration of original items.
Regression: F1, F2, F3, Closure Interaction with Labiality.
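
The item-level regression described above could be set up roughly as follows. This is a minimal sketch and not the authors' analysis: it assumes an ordinary least-squares fit, reads the predictor list as four acoustic measures (F1, F2, F3, closure duration) plus their interactions with a binary labiality code, and uses random placeholder numbers in place of the measured values. The function names `build_predictors` and `r_squared` are hypothetical.

```python
import numpy as np


def build_predictors(acoustics, labiality):
    """Return z-scored acoustic measures, labiality, and their interactions.

    acoustics: (n_items, 4) array with columns F1, F2, F3, closure duration.
    labiality: (n_items,) array of 0/1 codes (ASSUMED coding of the context).
    """
    z = (acoustics - acoustics.mean(axis=0)) / acoustics.std(axis=0)
    interactions = z * labiality[:, None]
    return np.column_stack([z, labiality, interactions])


def r_squared(X, y):
    """Fit least squares with an intercept and return the variance explained."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# PLACEHOLDER data only: 16 items, as in the experiments, but random values.
rng = np.random.default_rng(0)
n_items = 16
acoustics = rng.normal(size=(n_items, 4))       # stand-in for F1, F2, F3, closure
labiality = np.tile([0.0, 1.0], n_items // 2)   # stand-in for context coding
effect = rng.normal(size=n_items)               # stand-in for a per-item effect

X = build_predictors(acoustics, labiality)
r2 = r_squared(X, effect)
print(f"Predictors: {X.shape[1]}, R^2 on placeholder data: {r2:.2f}")
```

In this sketch the model has nine predictors plus an intercept for only 16 items, so R² can be sizeable even for random data, which is one reason to read variance-explained figures from small item sets with caution.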
Not enough power (items) to reach significance, but model accounted for:
Experiment 1
• Progressive effect: 75% of variance
Experiment 2
• Progressive effect: 57% of variance
• Regressive effect: 37% of variance
Continuous acoustic variation is related to perceptual processes… how?

A Perceptual Account

Feature cue parsing (Gow, 2003)
[Figure: spectrogram, 0 to 3000 Hz over 0 to 0.760454 s, with segment labels.]
Any feature is encoded by multiple cues that are integrated.
Assimilation creates cues consistent with multiple places.
Extract feature cues.
Group feature cues by similarity and resolve ambiguity.
Integration is by the same process within a segment.
Standard component of information integration in perception.

Feature cue parsing (Gow, 2003), example: cat…
catp #:          [cor] [lab]
catp # box:      [cor] [lab] [LAB]
catp # drawing:  [cor] [COR] [lab]
Progressive and regressive effects fall out of grouping.

Implications for Phonology
Feature cue parsing based on basic perceptual grouping principles:
• Not specific to assimilation.
Parsing errors may lead to sound change:
• Pressure on languages to avoid errors.
• Maximize contrast between adjacent segments.
• Minimize juxtaposition of similar segments.
Feature parsing errors may lead to sound change, e.g. Shona Dissimilation (Ohala, 1981)
Pre-Shona: -b w a   [LAB] [labio-velar glide]
Shona:     -b a     [LAB] (labio) [velar glide]
Gow & Zoll (2002)

Conclusions
English coronal place assimilation neutralizes phonemic distinctiveness.
Perceptual recovery cannot be based on symbolic processes.
Continuous perceptual mechanisms sensitive to systematic acoustic variation yield
• Progressive activation of upcoming material.
• Regressive ambiguity resolution.
Such mechanisms may play a role in sound change.

Conclusions
Bridging spoken word recognition and laboratory phonology helps both fields.
• In phonology: perceptual mechanisms for handling variation may constrain languages’ sound structures.
• In SWR: assimilation is not noise to be conquered, but signal to be used.
[Diagram: Systematic Phonological Variation → Perceptual Processes → Sound Change → Perceptual Processes → …]