Categorical perception of speech: Task variations in infants and adults
Bob McMurray, Jessica Maye, Andrea Lathrop and Richard N. Aslin
And a big thanks to Julie Markant

Categorical Perception & Task Variations

Overview
Previous work
• Categorical perception and gradient sensitivity to subphonemic detail.
• Categorical perception in infants.
Reassessing this with HTPP (head-turn preference procedure) & AEM (anticipatory eye-movements)
• Infants show gradient sensitivity.
• A new methodology.
• Adult analogues.

Categorical perception & gradiency

Categorical Perception
Categorical perception: is subphonemic detail retained [and used] during speech perception?
For a long time… NO! Subphonemic variation is discarded in favor of a discrete label.

Non-categorical Perception
A number of psychophysical-type results showed listeners’ sensitivity to within-category detail.
4IAX Task
Pisoni & Lazarus (1974)
Speeded Response
Carney, Widin & Viemeister (1977)
Training
Samuel (1977)
Pisoni, Aslin, Hennessy & Perey (1982)
Rating Task
Miller (1997)
Massaro & Cohen (1983)

Word Recognition
These results did not reveal:
• Whether on-line word recognition is sensitive to such detail.
• Whether such sensitivity is useful during recognition.

Word Recognition
Mounting evidence that word recognition is sensitive:
• Lahiri & Marslen-Wilson (1991): vowel nasalization.
• Andruski, Blumstein & Burton (1994): VOT.
• Gow & Gordon (1995): word segmentation.
• Salverda, Dahan & McQueen (in press): embedded words and vowel length.
• Dahan, Magnuson, Tanenhaus & Hogan (2001): coarticulatory cues in the vowel.

Gradient Sensitivity
McMurray, Tanenhaus & Aslin (2002)
• Eye-movements to objects after hearing items from a 9-step VOT continuum.
• Systematic relationship between VOT and looks to the competitor.
[Figure: competitor fixations (looks to the competitor) as a function of VOT (ms), 0 to 40 ms, plotted separately by response, with the category boundary marked.]

Gradient Sensitivity
This systematic, gradient relationship between lexical activation and acoustic detail would allow the system to take advantage of fine-grained regularities in the signal.
Gow, McMurray & Tanenhaus (Sat., 6:00 poster session)
• Anticipate upcoming material.
• Resolve ambiguity.
If fine-grained detail is useful, we might expect infants and children to:
• Show gradient sensitivity to variation.
• Tune their sensitivity to the learning environment.
…BUT

Categorical perception in infants

Infant categorical perception
Early findings of categorical perception for infants (e.g. Eimas, Siqueland, Jusczyk & Vigorito, 1971) have never been refuted.
Most studies use:
• Habituation (many repetitions)
• Synthetic speech
• A single continuum
Perhaps a different method would be more sensitive?

Head-Turn Preference Procedure
Jusczyk & Aslin (1995)
Infants are exposed to a chunk of language:
• Words in running speech.
• Stream of continuous speech (à la statistical learning).
• Word list.
After exposure, memory for exposed items (or abstractions) is assessed by comparing listening time to consistent items with listening time to inconsistent items.

How do we measure listening time?
After exposure…
• The center light blinks, bringing the infant’s attention to the center.
• When the infant looks at the center, one of the side-lights blinks.
• When the infant looks at the side-light, she hears a word (“Beach… Beach… Beach…”).
• The word keeps playing as long as she keeps looking.

Infants show gradient sensitivity

Experiment 1: Gradiency in Infants
7.5-month-old infants were exposed to either 4 b-words or 4 p-words:
• b-words: Bomb, Bear, Bail, Beach
• p-words: Palm, Pear, Pail, Peach
80 repetitions total.
Infants form a category of the exposed class of words.
Measure listening time on:
                              Exposed to B   Exposed to P
  Original word               Bear           Pear
  Opposite                    Pear           Bear
  VOT closer to boundary      Bear*          Pear*

Experiment 1: Stimuli
Stimuli constructed by cross-splicing natural, recorded tokens of each end point.
• B: M = 3.6 ms VOT
• P: M = 40.7 ms VOT
• B*: M = 11.9 ms VOT
• P*: M = 30.2 ms VOT
All were judged /b/ or /p/ at least 90% consistently by adult listeners.
• B: 98.5%
• P: 99%
• B*: 97%
• P*: 96%

Measuring gradient sensitivity
Looking time is an indication of interest. After hearing all of those B-words, P sounds pretty interesting.
So: infants should look longer for pear than for bear.
What about in between?
[Figure: schematic of predicted listening time for Bear, Bear* and Pear under a categorical versus a gradient account.]

Individual Differences
Novelty/familiarity preference varies across infants and experiments.
We’re only interested in the middle stimuli (B*, P*).
Infants were categorized as novelty- or familiarity-preferring by their performance on the endpoints.
                Novelty   Familiarity
  Trained on B  27        11
  Trained on P  19        10
Within each group, will we see evidence for gradiency?

Novelty Results
Novelty infants, trained on B
VOT: p=.001**
Linear trend: p=.001**
[Figure: listening time (ms, 4000 to 10000) for B, B* and P; pairwise p-values annotated on the plot: .14 and .004**.]

Novelty Results
Novelty infants, trained on P
VOT: p=.001**
Linear trend: p=.001**
[Figure: listening time (ms, 4000 to 10000) for P, P* and B; pairwise p-values annotated on the plot: .001** and .1.]

Familiarity Results
Familiarity infants showed similar effects.
• P exposure: trend p=.009; P vs. P*: p=.057; P* vs. B: p=.096
• B exposure: trend p=.001; B vs. B*: p=.19; B* vs. P: p=.21
[Figure: two panels, "Trained on P" and "Trained on B", showing listening time (ms, 4000 to 10000) for the three test stimuli.]

Experiment 1: Conclusions
• 7.5-month-old infants show gradient sensitivity to subphonemic detail.
• Individual differences in familiarity/novelty preferences. Why?
  • Length of exposure?
  • Individual factors?
• Limitations of the paradigm may hinder further study. We need:
  • More repeated measures.
  • Better understanding of the “task”.
  • Wider age range.
Anticipatory Eye-Movements

A new methodology
An ideal methodology would:
• Yield an arbitrary, identification response.
• Yield a response to a single stimulus.
• Yield many repeated measures.
Much like a forced-choice identification.
Anticipatory Eye-Movements (AEM): train infants to look left or right in response to a single auditory stimulus.

Anticipatory Eye-Movements
• Visual stimulus moves under an occluder.
• Reemergence serves as the “reinforcer”.
• Concurrent auditory stimulus predicts the endpoint of the occluded trajectory.
• Subjects make anticipatory eye-movements to the expected location, before the stimulus appears.
[Figure: display with the two trained words, "Teak" and "Lamb".]

Anticipatory Eye-Movements
After training on the original stimuli, infants are tested on a mixture of:
• Original, trained stimuli (reinforced)
  • Maintain interest in the experiment.
  • Provide an objective criterion for inclusion.
• New, generalization stimuli (unreinforced)
  • Examine category structure/similarity relative to trained stimuli.

Experiment 2: Pitch and Duration
Goals:
• Use AEM to assess auditory categorization.
• Assess infants’ abilities to “normalize” for variations in pitch and duration…
• …or: are infants sensitive to acoustic detail during a lexical identification task?

Experiment 2: Pitch and Duration
Training:
• “Teak” -> rightward trajectory.
• “Lamb” -> leftward trajectory.
Test: Lamb & Teak with changes in:
• Duration: 33% and 66% longer.
• Pitch: 20% and 40% higher.
If infants ignore irrelevant variation in pitch or duration, performance should be good for generalization stimuli.
If infants’ lexical representations are sensitive to this variation, performance will degrade.

The Stimuli
[Video: training stimulus]

The Stimuli
[Video: testing stimulus]

Results
Each trial is scored as:
• correct: longer looking time to the correct side.
• incorrect: longer looking time to the incorrect side.
Binary DV, similar to 2AFC.
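The scoring rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study: the function names `score_trial` and `proportion_correct` are hypothetical, and the handling of ties (equal or zero looking time to both sides is dropped from the proportion) is an assumption that the slides do not specify.

```python
from typing import Iterable, Optional, Tuple


def score_trial(ms_correct_side: float, ms_incorrect_side: float) -> Optional[bool]:
    """Score one AEM trial as a binary outcome.

    True  -> longer looking time to the correct side.
    False -> longer looking time to the incorrect side.
    None  -> tie, including no anticipatory looking at all
             (ASSUMPTION: such trials are left unscored).
    """
    if ms_correct_side > ms_incorrect_side:
        return True
    if ms_incorrect_side > ms_correct_side:
        return False
    return None


def proportion_correct(trials: Iterable[Tuple[float, float]]) -> Optional[float]:
    """Proportion of scored trials that were correct.

    Each trial is a (ms_correct_side, ms_incorrect_side) pair.
    Returns None when no trial could be scored.
    """
    scored = [s for s in (score_trial(c, i) for c, i in trials) if s is not None]
    if not scored:
        return None
    return sum(scored) / len(scored)


if __name__ == "__main__":
    # Hypothetical looking times in ms, made up for illustration only.
    example = [(500, 200), (100, 300), (0, 0), (250, 100)]
    print(proportion_correct(example))
```

Because each trial reduces to a single correct/incorrect value, the per-infant proportion can be compared against chance (0.5) in the same way as a 2AFC score.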
On trained stimuli: 11 of 29 infants performed better than chance. This is a tough task for infants; perhaps more training is needed.

Results
On generalization stimuli:
• Duration: p=.002
• Pitch: p>.1
[Figure: proportion of correct trials for the training stimuli, D1 / P1 and D2 / P2, plotted separately for duration and pitch.]

Experiment 2: Conclusions
• Infants’ developing lexical categories show graded sensitivity to variation in duration.
• Possibly not to pitch.
• Might be an effect of “task relevance”.
AEM yields:
• More repeated measurements.
• A better understood task: 2AFC.
Could it yield a picture of the entire developmental time course?
Is AEM applicable to a wider age range?

Treating undergraduates like babies
Extreme case: adult perception.
Adults generally won’t:
• Look at blinking lights…
• Suck on pacifiers…
• Kick their feet at mobiles…
Result: few infant methodologies allow direct analogues to adults.
They do make eye-movements… could AEM be adapted?

Treating undergraduates like babies
Pilot study: 5 adults exposed to AEM stimuli.
Training:
• “Ba” -> left.
• “Pa” -> right.
Test: Ba to Pa VOT continuum (0-40 ms).

Results
[Figure: % /p/ responses as a function of VOT (0 to 40 ms) for the 2AFC and AEM groups.]
A second group of subjects was run in an explicit 2AFC task.
• Same category boundary.
• Steeper slope: less sensitivity to VOT.

Adult AEM: Conclusions
• The AEM paradigm can be used unchanged for adults.
• It should work with older children as well.
• Results show the same category boundary as traditional 2AFC tasks, and perhaps more sensitivity to fine-grained acoustic detail.
• Potentially useful for speech categorization when categories are not:
  • nameable
  • pictureable
  • immediately obvious

Conclusions
Like adults, 7.5-month-old infants show gradient sensitivity to subphonemic detail.
• VOT
• Duration
• Perhaps not pitch (w.r.t. lexical categories)

Conclusions
Task makes the difference:
• Moving to HTPP from habituation revealed subphonemic sensitivity.
• Taking individual differences into account is crucial.
• Moving to AEM yields:
  • Better ability to examine tuning over time.
  • Ability to assess perception across the lifespan with a single paradigm.

Natural Stimuli
Infants may show more sensitivity to natural speech.
Stimuli were constructed from natural tokens of actual words with progressive cross-splicing.
[Figure: cross-spliced tokens between "Palm" and "Bomb".]

Experiment 1: Reprise
It is difficult to examine how sensitivity might be tuned to environmental factors in the head-turn preference procedure.
• High variance/individual differences: can’t predict novelty/familiarity.
• Only a single point to look at.
• Between-subject comparison.
• Difficult interaction to obtain.
[Figure: schematic of listening time for Bear, Bear* and Pear at 6, 8 and 10 months.]

Experiment 1: Reprise
AEM presents a potential solution:
1) Looking at the whole continuum would yield more power.
[Figure: schematic of responses across the Bear-Pear continuum at 6, 8 and 10 months.]
2) Is AEM applicable to a wider age range?

The Stimuli
[Video: training stimulus]

Data analysis
Data were coded by naive coders from video containing pupil & scene monitors.
[Figure: coding regions of the display: Left, Left-in, Left-out, Center, Start, Right-out, Right-in, Right, Off.]

Data analysis
• Eye-movements were coded from the maximal size of the stimulus to its first appearance (or the end of the trial).
• Left-out, Right-out, Center & Start were treated as “neither”.
• Left-in and Left were treated as anticipation to the left.
• Right-in and Right were treated as anticipation to the right.
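The region-to-response mapping above can be written down as a small lookup, sketched here in Python. It is a hypothetical illustration rather than the coders' actual procedure: the names `classify_region` and `tally_frames` are invented, the input is assumed to be one region label per coded video frame, and treating "Off" as "neither" is an assumption, since the slides list that region without stating how it was handled.

```python
from collections import Counter
from typing import Dict, Iterable

# Regions counted as an anticipation to one side (from the slide).
ANTICIPATION = {
    "left-in": "left",
    "left": "left",
    "right-in": "right",
    "right": "right",
}

# Regions counted as "neither" (from the slide), plus "off",
# which is an ASSUMPTION: the slide does not say how it was treated.
NEITHER = {"left-out", "right-out", "center", "start", "off"}


def classify_region(region: str) -> str:
    """Map a coded gaze region to 'left', 'right' or 'neither'."""
    key = region.strip().lower()
    if key in ANTICIPATION:
        return ANTICIPATION[key]
    if key in NEITHER:
        return "neither"
    raise ValueError(f"unknown region label: {region!r}")


def tally_frames(regions: Iterable[str]) -> Dict[str, int]:
    """Count coded frames per response category for one trial."""
    counts = Counter(classify_region(r) for r in regions)
    return {side: counts.get(side, 0) for side in ("left", "right", "neither")}


if __name__ == "__main__":
    # Hypothetical frame-by-frame codes for a single trial.
    frames = ["Start", "Left-in", "Left", "Right-out", "Right"]
    print(tally_frames(frames))
```

Frame counts per side can then be converted to looking times and compared to decide which side received the longer anticipatory look on that trial.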