Randy Diehl
Psy 394U Speech Perception
Background Readings

General phonetics
Catford, J.C. Fundamental problems in phonetics (1977).
Ladefoged, P. A course in phonetics, 3rd Ed. (1993).
Hardcastle, W.J. & Laver, J. (eds.) The handbook of phonetic sciences (1997).

Distinctive feature theory
Jakobson, R. & Halle, M. Fundamentals of language (1956).
Jakobson, R., Fant, G., & Halle, M. Preliminaries to speech analysis (1963).
Chomsky, N. & Halle, M. The sound pattern of English (1968), Chapter 7.
Ladefoged, P. Phonetic prerequisites for a distinctive feature theory. In Valdman, A. (ed.), Papers in linguistics and phonetics to the memory of Pierre Delattre. The Hague: Mouton (pp. 273-285), 1972.
Vennemann, T. & Ladefoged, P. Phonetic features and phonological features. Lingua, 32, 61-74, 1973.
McCarthy, J.J. Feature geometry and dependency: A review. Phonetica, 43, 84-108, 1988.
Diehl, R.L. & Lindblom, B. Explaining the structure of feature and phoneme inventories. In S. Greenberg, W.A. Ainsworth, A. Popper, and R. Fay (eds.) Speech processing in the auditory system. New York: Springer (pp. 101-162), 2004.

Acoustic theory of speech production
Fant, G. Acoustic theory of speech production (1970).
Flanagan, J.L. Speech analysis, synthesis, and perception (1972).
Stevens, K.N. Acoustic phonetics (1998).

Signal processing and linear systems analysis
Markel, J.D. & Gray, A.H. Jr. Linear prediction of speech (1976).
Oppenheim, A.V. & Willsky, A.S. Signals and systems (1983).
Rosen, S. & Howell, P. Signals and systems for speech and hearing (1991, recently revised).

Anatomy and physiology of speech production
Minifie, F.D., Hixon, T.J., & Williams, F. (Eds.) Normal aspects of speech, hearing and language (1973). Chapters 3, 4, 5, 6.
Hardcastle, W.J. Physiology of speech production (1976).
Dickson, D.R. & Maue-Dickson, W. Anatomical and physiological bases of speech (1982).

Acoustic and articulatory phonetics
Denes, P.B. & Pinson, E.N. The speech chain (1963).
Lehiste, I. (Ed.)
Readings in acoustic phonetics (1967).
Lieberman, P. & Blumstein, S.E. Speech physiology, speech perception, and acoustic phonetics (1988).
Kent, R. & Read, C. The acoustic analysis of speech (1992).
Olive, J.P., Greenwood, A. & Coleman, J. Acoustics of American English Speech (1993).
Stevens, K.N. Acoustic phonetics (1998).
Pickett, J.M. The acoustics of speech communication (1999).

Early Haskins work: in search of the acoustic cues
Delattre, P.C., Liberman, A.M., & Cooper, F.S. Acoustic loci and transitional cues for consonants. JASA, 1955, 27, 769-773.
Delattre, P.C., Liberman, A.M., Cooper, F.S. & Gerstman, L. An experimental study of the acoustic determinants of vowel color: observations on one- and two-formant vowels synthesized from spectrographic patterns. Word, 1952, 8, 195-210.
Delattre, P.C., Liberman, A.M. & Cooper, F.S. Formant transitions and loci as acoustic correlates of place of articulation in American fricatives. Studia Linguistica, 1964, 104-121.
Harris, K.S., Hoffman, H.S., Liberman, A.M., Delattre, P.C., & Cooper, F.S. Effect of third-formant transitions on the perception of the voiced stop consonants. JASA, 1958, 30, 122-126.
Liberman, A.M. Some results of research on speech perception. JASA, 1957, 29, 117-123.
Liberman, A.M., Delattre, P.C., & Cooper, F.S. Some cues for the distinction between voiced and voiceless stops in initial position. Language and Speech, 1958, 1, 153-167.
Liberman, A.M., Delattre, P.C. & Cooper, F.S. The role of selected stimulus variables in the perception of unvoiced stop consonants. American Journal of Psychology, 1952, 65, 497-516.
Liberman, A.M., Delattre, P.C., & Cooper, F.S. Tempo of frequency change as a cue for distinguishing classes of speech sounds. Journal of Experimental Psychology, 1956, 52, 127-137.
Lisker, L. Closure duration and the intervocalic voiced-voiceless distinction in English. Language, 1957, 33, 42-49.
Lisker, L. Minimal cues for separating /w, r, l, y/ in intervocalic position.
Word, 1957, 13, 256-267.
O'Connor, J.D., Gerstman, L.J., Liberman, A.M., Delattre, P.C. & Cooper, F.S. Acoustic cues for the perception of initial /w, j, r, l/ in English. Word, 1957, 13, 24-43.

The speech mode hypothesis
Liberman, A.M., Cooper, F.S., Shankweiler, D.P. & Studdert-Kennedy, M. The perception of the speech code. Psychological Review, 1967, 74, 431-461.
Liberman, A.M. On the finding that speech is special. American Psychologist, 1982, 37, 148-167.
Liberman, A.M. & Mattingly, I.G. The motor theory of speech perception revised. Cognition, 1985, 21, 1-36.
Liberman, A.M. Speech: A special code (1996).

Categorical perception of speech
Lisker, L., & Abramson, A.S. The voicing dimension: Some experiments in comparative phonetics. Proceedings of the 6th International Congress of Phonetic Sciences (pp. 563-567). Prague: Academia, 1970.
Abramson, A.S. & Lisker, L. Discriminability along the voicing continuum: Cross-language tests. In Proceedings of the 6th International Congress of Phonetic Sciences (pp. 569-573). Prague: Academia, 1970.
Liberman, A.M., Harris, K.S., Eimas, P.D., Lisker, L. & Bastian, J. An effect of learning on speech perception: The discrimination of durations of silence with and without phonemic significance. Language and Speech, 1961, 4, 175-195.
Liberman, A.M., Harris, K.S., Hoffman, H.S. & Griffith, B.C. The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 1957, 54, 358-368.
Liberman, A.M., Harris, K.S., Kinney, J.A. & Lane, H.L. The discrimination of relative onset time of the components of certain speech and nonspeech patterns. Journal of Experimental Psychology, 1961, 61, 378-388.
Mattingly, I.G., Liberman, A.M., Syrdal, A.K., & Halwes, T.G. Discrimination in speech and non-speech modes. Cognitive Psychology, 1971, 2, 131-157.
Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A.M., Jenkins, J.J., & Fujimura, O.
An effect of linguistic experience: The discrimination of /r/ and /l/ by native speakers of Japanese and English. Perception & Psychophysics, 1975, 18, 331-340.
Repp, B.H. Categorical perception: Issues, methods, findings. In Lass, N.J. (ed.), Speech and language: Advances in basic research and practice, Vol. 10 (pp. 243-335). Academic Press, 1984.
Pastore, R.E. Categorical perception: Some psychophysical models. In S. Harnad (ed.), Categorical perception (pp. 29-52), 1987.
Rosen, S. & Howell, P. Auditory, articulatory, and learning explanations of categorical perception in speech. In S. Harnad (ed.), Categorical perception (pp. 113-160), 1987.
Harnad, S. (ed.). Categorical perception (1987).
Simos, P.G., Diehl, R.L., Breier, J.I., Molis, M.R., Zouridakis, G., & Papanicolaou, A.C. MEG correlates of categorical perception of a voice onset time continuum in humans. Cognitive Brain Research, 1998, 7, 215-219.
Sharma, A. & Dorman, M.F. Cortical auditory evoked potential correlates of categorical perception of voice-onset time. JASA, 1999, 106, 1078-1083.

Categorical perception of nonspeech
Miller, J.D., Wier, C.C., Pastore, R.E., Kelly, W.J. & Dooling, R.J. Discrimination and labeling of noise-buzz sequences with varying noise-lead times: An example of categorical perception. JASA, 1976, 60, 410-417.
Pisoni, D.B. Identification and discrimination of the relative onset time of two component tones: Implications for voicing perception in stops. JASA, 1977, 61, 1352-1361.
Holt, L.L., Lotto, A.J., & Diehl, R.L. Auditory discontinuities interact with categorization: Implications for speech perception. JASA, in press.

Speech perception by animals
Kuhl, P.K. & Miller, J.D. Speech perception by the chinchilla: Identification functions for synthetic VOT stimuli. JASA, 1978, 63, 905-917.
Kuhl, P.K. Discrimination of speech by nonhuman animals: Basic auditory sensitivities conducive to the perception of speech-sound categories. JASA, 1981, 70, 340-349.
Kuhl, P.K.
& Padden, D.M. Enhanced discrimination at the phonetic boundaries for the voicing feature in macaques. Perception & Psychophysics, 1982, 32, 542-550.
Kuhl, P.K. & Padden, D.M. Enhanced discrimination at the phonetic boundaries for the place feature in macaques. JASA, 1983, 73, 1003-1010.
Kluender, K.R., Diehl, R.L., & Killeen, P.R. Japanese quail can learn phonetic categories. Science, 1987, 237, 1195-1197.
Kluender, K.R. Effects of first formant onset properties on voicing judgments result from processes not specific to humans. JASA, 1991, 90, 83-96.
Kluender, K.R. & Lotto, A.J. Effects of first formant onset frequency on [-voice] judgments result from general auditory processes not specific to humans. JASA, 1994, 95, 1044-1052.
Lotto, A.J., Kluender, K.R. & Holt, L.L. Perceptual compensation for coarticulation by Japanese quail (Coturnix coturnix japonica). JASA, 1997, 102, 1134-1140.
Kluender, K.R., Lotto, A.J., Holt, L.L. & Bloedel, S.L. Role of experience for language-specific functional mappings of vowel sounds. JASA, 1998, 104, 3568-3582.

Electrophysiological studies of speech perception in animals
Young, E.D. & Sachs, M.B. Processing speech in the peripheral auditory system. In T. Myers, J. Laver & J. Anderson (Eds.), The cognitive representation of speech (pp. 75-92), 1981.
Sachs, M.B., Young, E.D. & Miller, M.I. Encoding of speech features in the auditory nerve. In R. Carlson & B. Granström (Eds.), The representation of speech in the peripheral auditory system (pp. 115-130), 1982.
Delgutte, B. & Kiang, N.Y.S. Speech coding in the auditory nerve: (I-IV). JASA, 1984, 75, 866-918.
Smoorenburg, G.F. Discussion of physiological correlates of speech perception. In M.E.H. Schouten (Ed.), The psychophysics of speech perception (pp. 393-399), 1987.
Sinex, D.G. & McDonald, L.P. Average discharge rate representation of voice onset time in the chinchilla auditory nerve. JASA, 1988, 83, 1817-1827.
Sinex, D.G. & McDonald, L.P.
Synchronized discharge rate representation of voice onset time in the chinchilla auditory nerve. JASA, 1989, 85, 1995-2004.
Sinex, D.G., McDonald, L.P. & Mott, J.B. Neural correlates of nonmonotonic temporal acuity for voice onset time. JASA, 1991, 90, 2441-2449.
Delgutte, B. Auditory neural processing of speech. In Hardcastle, W.J. & Laver, J. (eds.), The handbook of phonetic sciences (pp. 508-538), 1997.

Trading relations in speech and nonspeech
Best, C.T., Morrongiello, B. & Robson, R. Perceptual equivalence of acoustic cues in speech and nonspeech perception. Perception & Psychophysics, 1981, 29, 191-211.
Repp, B.H. Phonetic trading relations and context effects: New experimental evidence for a speech mode of perception. Psychological Bulletin, 1982, 92, 81-110.
Summerfield, A.Q. Differences between spectral dependencies in auditory and phonetic temporal processing: Relevance to the perception of voicing in initial stops. JASA, 1982, 72, 51-61.
Hillenbrand, J. Perception of sine-wave analogs of voice onset time stimuli. JASA, 1984, 75, 231-240.
Parker, E.M., Diehl, R.L. & Kluender, K.R. Trading relations in speech and nonspeech. Perception & Psychophysics, 1986, 34, 314-322.
Parker, E.M. Auditory constraints on the perception of stop voicing: The influence of lower-tone frequency on judgments of tone-onset simultaneity. JASA, 1988, 83, 1597-1607.

Infant perception of speech and nonspeech
Eimas, P.D., Siqueland, E.R., Jusczyk, P. & Vigorito, J. Speech perception in infants. Science, 1971, 171, 303-306.
Kuhl, P.K. Speech perception in early infancy: Perceptual constancy for spectrally dissimilar vowel categories. JASA, 1979, 66, 1669-1679.
Kuhl, P.K. & Meltzoff, A.N. The bimodal perception of speech in infancy. Science, 1982, 218, 1138-1141.
Aslin, R.N., Pisoni, D.B. & Jusczyk, P.W. Auditory development and speech perception in infancy. In M.M. Haith & J.J. Campos (Eds.), Infancy and the biology of development, 1983.
Jusczyk, P. W., Pisoni, D. B., Walley, A.
and Murray, J. Discrimination of relative onset time of two-component tones by infants. Journal of the Acoustical Society of America, 1980, 67, 262-270.
Jusczyk, P. W., Pisoni, D. B., Reed, M. A., Fernald, A. and Myers, M. Infants' discrimination of the duration of a rapid spectrum change in nonspeech signals. Science, 1983, 222, 175-177.
Jusczyk, P. W. Language acquisition: Speech sounds and the beginning of phonology. In Miller, J.L. & Eimas, P.D. Speech, language, and communication (pp. 263-301). San Diego: Academic Press, 1995.
Jusczyk, P.W. The discovery of spoken language (1997).
Werker, J.F. & Tees, R.C. Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 1984, 7, 49-63.
Werker, J. F. and Lalonde, C. E. Cross-language speech perception: Initial capabilities and developmental change. Developmental Psychology, 1988, 24, 672-683.
Werker, J.F. The ontogeny of speech perception. In I.G. Mattingly & M. Studdert-Kennedy (Eds.), Modularity and the motor theory of speech perception (pp. 91-109), 1991.
Mehler, J., Jusczyk, P., Lambertz, G., Halsted, N., Bertoncini, J. & Amiel-Tison, C. A precursor of language acquisition in young infants. Cognition, 1988, 29, 143-178.
Kuhl, P.K., Williams, K.A., Lacerda, F., Stevens, K.N. & Lindblom, B. Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 1992, 255, 606-608.
Kuhl, P.K. & Meltzoff, A.N. Infant vocalizations in response to speech: Vocal imitation and developmental change. JASA, 1996, 100, 2425-2438.
Saffran, J. R., Aslin, R. N., and Newport, E. L. Statistical learning by 8-month-old infants. Science, 1996, 274, 1926-1928.
Eimas, P.D. Segmental and syllabic representations in the perception of speech by young infants. JASA, 1999, 105, 1901-1911.

Intonation and stress
Fry, D.B. Duration and intensity as physical correlates of linguistic stress. JASA, 1955, 27, 765-768. (Also in Lehiste.)
Hadding-Kock, K. & Studdert-Kennedy, M. An experimental study of some intonation contours. Phonetica, 1964, 11, 175-185.
Hadding, K. & Studdert-Kennedy, M. Are you asking me, telling me, or talking to yourself? Journal of Phonetics, 1974, 2, 7-14.
Cutler, A. Phoneme-monitoring reaction time as a function of preceding intonation contour. Perception & Psychophysics, 1976, 20, 55-60.
Darwin, C. J. On the dynamic use of prosody in speech perception. In A. Cohen & S.G. Nooteboom (Eds.), Structure and process in speech perception (1975).
Nakatani, L.H. & Schaffer, J.A. Hearing "words" without words: Prosodic cues for word perception. JASA, 1978, 63, 234-245.
Pierrehumbert, J.B. The perception of fundamental frequency declination. JASA, 1979, 66, 363-369.
Thorsen, N.G. A study of the perception of sentence intonation--Evidence from Danish. JASA, 1980, 67, 1014-1031.
Steele, S.A. Interaction of vowel F0 and prosody. Phonetica, 1986, 43, 92-105.
Kohler, K.J. & van Dommelen, W.A. Prosodic effects on lenis/fortis perception: Preplosive F0 and LPC synthesis. Phonetica, 1986, 43, 70-75.
Silverman, K. F0 segmental cues depend on intonation: The case of the rise after voiced stops. Phonetica, 1986, 43, 76-91.
Whalen, D.H., Abramson, A.S., Lisker, L. & Mody, M. Gradient effects of fundamental frequency on stop consonant voicing judgments. Phonetica, 1990, 47, 36-49.
Sluijter, A. & Terken, J. Beyond sentence prosody: Paragraph intonation in Dutch. Phonetica, 1994, 50, 180-188.
Swerts, M. Prosodic features at discourse boundaries of different strength. JASA, 1997, 101, 514-521.

Timing, rate and segment duration
Klatt, D. Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. JASA, 1976, 59, 1208-1221.
Klatt, D. & Cooper, W.E. Perception of segment duration in sentence contexts. In A. Cohen & S.G. Nooteboom (Eds.), Structure and process in speech perception (1975).
Lehiste, I., Olive, J.P. & Streeter, L.A.
Role of duration in disambiguating syntactically ambiguous sentences. JASA, 1976, 60, 1199-1202.
Miller, J.L. & Liberman, A.M. Some effects of later-occurring information on the perception of stop consonant and semivowel. Perception & Psychophysics, 1979, 25, 457-465.
Miller, J.L. Effects of speaking rate on segmental distinctions. In P.D. Eimas & J.L. Miller (Eds.), Perspectives on the study of speech (1981).
Summerfield, A.Q. On articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance, 1981, 7, 1074-1095.
Fowler, C.A. Converging sources of evidence on spoken and perceived rhythms of speech: Cyclic production of vowels in monosyllabic stress feet. Journal of Experimental Psychology: General, 1983, 112, 386-412.
Diehl, R.L. & Walsh, M.A. An auditory basis for the stimulus-length effect in the perception of stops and glides. JASA, 1989, 85, 2154-2164.
Fowler, C.A. Sound-producing sources as objects of perception: Rate normalization and nonspeech perception. JASA, 1990, 88, 1236-1249. (Also read reply by Diehl, Walsh, and Kluender in JASA, 1991, 89, 2905-2909.)
Merzenich, M. M., Jenkins, W. M., Johnston, P., Schreiner, C., Miller, S. L., and Tallal, P. Temporal processing deficits of language-learning impaired children ameliorated by training. Science, 1996, 271, 77-81.
Tallal, P., Miller, S. L., Bedi, G., Byma, G., Wang, X., Nagarajan, S. S., Schreiner, C., Jenkins, W. M., and Merzenich, M. M. Language comprehension in language-learning impaired children improved with acoustically modified speech. Science, 1996, 271, 81-84.

Vowel perception
Ladefoged, P. & Broadbent, D.E. Information conveyed by vowels. JASA, 1957, 29, 98-104.
Lindblom, B. & Studdert-Kennedy, M. On the role of formant transitions in vowel recognition. JASA, 1967, 42, 830-843.
Carlson, R., Fant, G., & Granström, B. Two-formant models, pitch and vowel perception. In G. Fant & M.A.A.
Tatham (Eds.), Auditory analysis and perception of speech (pp. 55-82), 1975.
Chistovich, L.A. & Lublinskaya, V.V. The 'center of gravity' effect in vowel spectra and critical distance between the formants: Psychoacoustical study of the perception of vowel-like stimuli. Hearing Research, 1979, 1, 185-195.
Traunmüller, H. Perceptual dimension of openness in vowels. JASA, 1981, 69, 1465-1475.
Darwin, C.J. Perceiving vowels in the presence of another sound: Constraints on formant perception. JASA, 1984, 76, 1636-1647.
Darwin, C.J. & Gardner, T.B. Mistuning a harmonic of a vowel: Grouping and phase effects on vowel quality. JASA, 1986, 79, 838-845.
Strange, W., Jenkins, J.J., & Johnson, T.L. Dynamic specification of coarticulated vowels. JASA, 1983, 74, 695-705.
Strange, W. Information for vowels in formant transitions. Journal of Memory and Language, 1987, 26, 550-557.
Strange, W. Evolving theories of vowel perception. JASA, 1989, 85, 2081-2087.
Syrdal, A.K. & Gopal, H.S. A perceptual model of vowel recognition based on the auditory representation of American English vowels. JASA, 1986, 79, 1086-1100.
Nearey, T.M. Static, dynamic, and relational properties in vowel perception. JASA, 1989, 85, 2088-2113.
Miller, J.D. Auditory-perceptual interpretation of the vowel. JASA, 1989, 85, 2114-2134.
Andruski, J.E. & Nearey, T.M. On the sufficiency of compound target specification of isolated vowels and vowels in /bVb/ syllables. JASA, 1992, 91, 390-410.
Zahorian, S.A. & Jagharghi, A.J. Spectral shape versus formants as acoustic correlates for vowels. JASA, 1993, 94, 1966-1982.
Hoemeke, K.A. & Diehl, R.L. Perception of vowel height: The role of F1-F0 distance. JASA, 1994, 96, 661-674.
Moon, S.Y. & Lindblom, B. Interaction between duration, context, and speaking style in English stressed vowels. JASA, 1994, 96, 40-55.
Rosner, B.S. & Pickering, J.B. Vowel perception and production (1994).
Hillenbrand, J.M., Getty, L.A., Clark, M.J. & Wheeler, K.
Acoustic characteristics of American English vowels. JASA, 1995, 97, 3099-3111.
Hillenbrand, J.M. & Nearey, T.M. Identification of resynthesized /hVd/ utterances: Effects of formant contour. JASA, 1999, 105, 3509-3523.
Jenkins, J.J., Strange, W. & Trent, S.A. Context-independent dynamic information for the perception of coarticulated vowels. JASA, 1999, 106, 438-448.

Perception of concurrent vowels
Assmann, P.F. & Summerfield, Q. Modeling the perception of concurrent vowels: Vowels with different fundamental frequencies. JASA, 1990, 88, 680-697.
Summerfield, Q. & Assmann, P.F. Perception of concurrent vowels: Effects of harmonic misalignment and pitch-period asynchrony. JASA, 1991, 89, 1364-1377.
Assmann, P.F. & Summerfield, Q. The contribution of waveform interactions to the perception of concurrent vowels. JASA, 1994, 95, 471-484.
Culling, J.F. & Darwin, C.J. Perceptual separation of simultaneous vowels: Within and across-formant grouping by F0. JASA, 1993, 93, 3454-3467.
Culling, J.F. & Darwin, C.J. Perceptual and computational separation of simultaneous vowels: Cues arising from low frequency beating. JASA, 1994, 95, 1559-1569.
Culling, J.F. & Summerfield, Q. The role of frequency modulation in the perceptual segregation of concurrent vowels. JASA, 1995, 98, 837-846.
Assmann, P.F. Modeling the perception of concurrent vowels: Role of formant transitions. JASA, 1996, 100, 1141-1152.
de Cheveigné, A., Kawahara, H., Tsuzaki, M. & Aikawa, K. Concurrent vowel identification. I. Effects of relative amplitude and F0 difference. JASA, 1997, 101, 2839-2847.
de Cheveigné, A., McAdams, S., & Marin, C. M.H. Concurrent vowel identification. II. Effects of phase, harmonicity, and task. JASA, 1997, 101, 2848-2856.
de Cheveigné, A. Concurrent vowel identification. III. A neural model of harmonic interference cancellation. JASA, 1997, 101, 2857-2865.

The “perceptual magnet effect”
Grieser, D. & Kuhl, P.K.
Categorization of speech by infants: Support for speech-sound prototypes. Developmental Psychology, 1989, 25, 577-588.
Kuhl, P.K. Human adults and human infants show a ‘perceptual magnet effect’ for the prototypes of speech categories, monkeys do not. Perception & Psychophysics, 1991, 50, 93-107.
Kuhl, P.K., Williams, K.A., Lacerda, F., Stevens, K.N., & Lindblom, B. Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 1992, 255, 606-608.
Lively, S.E. & Pisoni, D.B. On prototypes and phonetic categories: A critical assessment of the perceptual magnet effect in speech perception. Journal of Experimental Psychology: Human Perception and Performance, 1995, 23, 1665-1679.
Kluender, K.R., Lotto, A.J., Holt, L.L. & Bloedel, S.L. Role of experience for language-specific functional mappings of vowel sounds. JASA, 1998, 104, 3568-3582.
Lotto, A.J., Kluender, K.R., & Holt, L.L. Depolarizing the perceptual magnet effect. Journal of the Acoustical Society of America, 1998, 103, 3648-3655. (See also Letters to the Editor by Guenther and Lotto, JASA, 2000, 3576-3580.)
Guenther, F.H., Husain, F.T., Cohen, M.A., & Shinn-Cunningham, B.G. Effects of categorization and discrimination training on auditory perceptual space. Journal of the Acoustical Society of America, 1999, 106, 2900-2912.

Talker normalization
Ladefoged, P. & Broadbent, D.E. Information conveyed by vowels. JASA, 1957, 29, 98-104.
Johnson, K. & Mullennix, J.W. (eds.) Talker variability in speech processing (1997).

Perception of consonant place: Invariant cues?
Stevens, K.N. & Blumstein, S.E. Invariant cues for place of articulation in stop consonants. JASA, 1978, 64, 1358-1368.
Kewley-Port, D. Time-varying features as correlates of place of articulation in stop consonants. JASA, 1983, 73, 322-335.
Sussman, H.M., McCaffrey, H.A., & Matthews, S.A. An investigation of locus equations as a source of relational invariance for stop place categorization. JASA, 1991, 90, 1309-1325.
Sussman, H.M.
The representation of stop consonants in three-dimensional acoustic space. Phonetica, 1991, 48, 18-31.
Fruchter, D. & Sussman, H.M. The perceptual relevance of locus equations. JASA, 1997, 102, 2997-3008.
Sussman, H.M., Fruchter, D., Hilbert, J. & Sirosh, J. Linear correlates in the speech signal: The orderly output constraint. Behavioral and Brain Sciences, 1998, 21, 241-259. (See commentaries and reply in same issue.)
Chennoukh, S., Carré, R. & Lindblom, B. Locus equations in light of articulatory modeling. Journal of the Acoustical Society of America, 1997, 102, 2380-2389.

Context and higher-order effects on speech perception
Warren, R.M. Perceptual restoration of missing speech sounds. Science, 1970, 167, 392-393.
Elman, J.L., Diehl, R.L. & Buchwald, S.E. Perceptual switching in bilinguals. JASA, 1977, 62, 971-974.
Mann, V.A. & Repp, B.H. Influence of vocalic context on perception of the /ʃ/-/s/ distinction. Perception & Psychophysics, 1980, 28, 213-228.
Ganong, W.F., III. Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 1980, 6, 110-125.
Luce, P., Pisoni, D. B., and Goldinger, S. Similarity neighborhoods of spoken words. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives. Cambridge, MA: MIT Press, 122-147, 1990.
Cutler, A. Spoken word recognition and production. In Miller, J.L. & Eimas, P.D. (eds.), Speech, language, and communication (pp. 97-136). San Diego: Academic Press, 1995.

Speech perception in a second language
Lively, S. E., Pisoni, D. B., Yamada, R. A., Tohkura, Y., and Yamada, T. Training Japanese listeners to identify English /r/ and /l/. III. Long-term retention of new phonetic categories. Journal of the Acoustical Society of America, 1994, 96, 2076-2087.
Best, C.T. A direct realist view of cross-language speech perception. In W.
Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 171-204). Baltimore: York Press, 1995.
Flege, J. E. Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 233-277). Baltimore: York Press, 1995.
Guion, S.G., Flege, J.E., Akahane-Yamada, R. & Pruitt, J.C. An investigation of current models of second language speech perception: The case of Japanese adults’ perception of English consonants. Journal of the Acoustical Society of America, 2000, 107, 2711-2724.

Visual information for speech perception
McGurk, H. & MacDonald, J. Hearing lips and seeing voices. Nature, 1976, 264, 746-748.
Kuhl, P.K. & Meltzoff, A.N. The bimodal perception of speech in infancy. Science, 1982, 218, 1138-1141.
Massaro, D.W. & Cohen, M.M. Evaluation and integration of visual and auditory information in speech perception. Journal of Experimental Psychology: Human Perception and Performance, 1983, 9, 753-771.
Massaro, D.W. & Cohen, M.M. Perception of synthesized audible and visible speech. Psychological Science, 1990, 1, 55-63.
Massaro, D.W., Cohen, M.M. & Smeele, P.M.T. Perception of asynchronous and conflicting visual and auditory speech. JASA, 1996, 100, 1777-1786.
Summerfield, Q. Visual perception of phonetic gestures. In Mattingly, I.G. & Studdert-Kennedy, M. (eds.), Modularity and the motor theory of speech perception (pp. 117-137). Hillsdale, N.J.: Lawrence Erlbaum, 1991.
Rosenblum, L.D., Schmuckler, M.A. & Johnson, J.A. The McGurk effect in infants. Perception & Psychophysics, 1997, 59, 347-357.
Grant, K.W., Walden, B.E. & Seitz, P.F. Auditory-visual speech recognition by hearing-impaired subjects: Consonant recognition, sentence recognition, and auditory-visual integration. JASA, 1998, 103, 2677-2690.
Perception of acoustically (electrically) altered speech (including cochlear implant speech perception)
Remez, R.E., Rubin, P.E., Pisoni, D.B. & Carrell, T.D. Speech perception without traditional speech cues. Science, 212, 947-950, 1981.
Remez, R.E., Rubin, P.E., Berns, S.M., Pardo, J.S. & Lang, J.M. On the perceptual organization of speech. Psychological Review, 1994, 101, 129-156.
Van Tassell, D.J., Soli, S.D., Kirby, V.M. & Widin, G.P. Speech waveform envelope cues for consonant recognition. Journal of the Acoustical Society of America, 1987, 82, 1152-1161.
Turner, C.W., Souza, P.E. & Gorget, L.N. Use of temporal envelope cues in speech recognition by normal and hearing-impaired listeners. Journal of the Acoustical Society of America, 1995, 97, 2568-2576.
Shannon, R.V., Zeng, F-G., Kamath, V., Wygonski, J. & Ekelid, M. Speech recognition with primarily temporal cues. Science, 1995, 270, 303-304.
Dorman, M.F., Loizou, P.C. & Rainey, D. Speech intelligibility as a function of the number of channels of stimulation for signal processors using sine-wave and noise-band outputs. Journal of the Acoustical Society of America, 1997, 102, 2403-2411.
Fishman, K., Shannon, R.V., & Slattery, W.A. Speech recognition as a function of the number of electrodes used in the SPEAK cochlear implant speech processor. Journal of Speech and Hearing Research, 1997, 40, 1201-1215.
Shannon, R.V., Zeng, F-G., & Wygonski, J. Speech recognition with altered spectral distribution of envelope cues. JASA, 1998, 104, 2467-2476.
Rosen, S., Faulkner, A. & Wilkinson, L. Adaptation by normal listeners to upward spectral shifts of speech: Implications for cochlear implants. Journal of the Acoustical Society of America, 1999, 106, 3629-3636.
Eisenberg, L.S., Shannon, R.V., Martinez, A.S., Wygonski, J. & Boothroyd, A. Speech recognition with reduced spectral cues as a function of age. Journal of the Acoustical Society of America, 2000, 107, 2704-2710.
Auditory models applied to speech perception
Bladon, R.A.W. & Lindblom, B. Modeling the judgment of vowel quality differences. Journal of the Acoustical Society of America, 1981, 69, 1414-1422.
Greenberg, S. Representation of speech in the auditory periphery (Theme issue). Journal of Phonetics, 1988, 16.
Cohen, J.R. Application of an auditory model to speech recognition. Journal of the Acoustical Society of America, 1989, 85, 2623-2629.
Jenison, R.L., Greenberg, S., Kluender, K.R. & Rhode, W.S. A composite model of the auditory periphery for the processing of speech based on the filter response functions of single auditory-nerve fibers. Journal of the Acoustical Society of America, 1991, 90, 773-786.
Hermansky, H. Should recognizers have ears? Speech Communication, 1998, 25, 3-24.
Kewley-Port, D. & Zheng, Y. Auditory models of formant frequency discrimination for isolated vowels. JASA, 1998, 103, 1654-1666.
Tchorz, J. & Kollmeier, B. A model of auditory perception as front end for automatic speech recognition. Journal of the Acoustical Society of America, 1999, 106, 2040-2050.
De Cheveigné, A. & Kawahara, H. Missing-data model of vowel identification. JASA, 1999, 105, 3497-3508.

Neuroimaging studies of speech perception
Zatorre, R.J., Meyer, E., Gjedde, A. & Evans, A.C. PET studies of phonetic processing of speech: Review, replication, and reanalysis. Cerebral Cortex, 1996, 6, 21-30.
Fitch, R.H., Miller, S., & Tallal, P. Neurobiology of speech perception. Annual Review of Neuroscience, 1997, 20, 331-353.
Simos, P.G., Diehl, R.L., Breier, J.I., Molis, M.R., Zouridakis, G., & Papanicolaou, A.C. MEG correlates of categorical perception of a voice onset time continuum in humans. Cognitive Brain Research, 1998, 7, 215-219.
Mummery, C.J., Ashburner, J., Scott, S.K., & Wise, R.J.S. Functional neuroimaging of speech perception in six normal and two aphasic subjects. JASA, 1999, 106, 449-457.

Theoretical approaches

Analysis by synthesis
Stevens, K.N. & Halle, M.
Remarks on analysis by synthesis and distinctive features. In W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form (1967).
Stevens, K.N. Toward a model for speech recognition. JASA, 1960, 32, 47-55.

Motor theory
(See references under "The speech mode hypothesis")

Feature detectors
Abbs, J.H. & Sussman, H.M. Neurophysiological feature detectors and speech perception: A discussion of theoretical implications. Journal of Speech and Hearing Research, 1971, 14, 23-36.
Eimas, P.D. & Corbit, J.D. Selective adaptation of linguistic feature detectors. Cognitive Psychology, 1973, 4, 99-109.
Diehl, R.L. Feature detectors for speech: A critical reappraisal. Psychological Bulletin, 1981, 89, 1-18.
Sussman, H.M. Neural coding of relational invariance in speech: human language analogs to the barn owl. Psychological Review, 1989, 96, 631-642.

Quantal theory and the invariance hypothesis
Stevens, K.N. & Blumstein, S.E. The search for invariant acoustic correlates of phonetic features. In P.D. Eimas & J.L. Miller (Eds.), Perspectives on the study of speech (pp. 1-38), Hillsdale, NJ: Lawrence Erlbaum, 1981.
Stevens, K.N. On the quantal nature of speech. Journal of Phonetics, 1989, 17, 3-45. (See also commentaries in same issue.)

Direct realism
Fowler, C.A. An event approach to the study of speech perception. Journal of Phonetics, 1986, 14, 3-28. (See also commentaries in same issue.)
Ohala, J. J. Speech perception is hearing sounds, not tongues. Journal of the Acoustical Society of America, 1996, 99, 1718-1725.
Fowler, C.A. Listeners do hear sounds, not tongues. JASA, 1996, 99, 1730-1741.

The theory of adaptive dispersion and the auditory enhancement hypothesis
Lindblom, B., MacNeilage, P. & Studdert-Kennedy, M. Self-organizing processes and the explanation of phonological universals. In B. Butterworth, B. Comrie, & Ö. Dahl (Eds.), Explanation for language universals (pp. 181-203), 1984.
Lindblom, B. Phonetic universals in vowel systems. In J.J. Ohala & J.J.
Jaeger (Eds.), Experimental phonology (pp. 13-44), 1986. Diehl, R.L., Lindblom, B., & Creeger, C.P. (2003). Increasing realism of auditory representations yields further insights into vowel phonetics. Proceedings of the 15th International Congress of Phonetic Sciences, Vol. 2, pp. 1381-1384. Adelaide: Causal Publications. Diehl, R.L., & Kluender, K.R. (1989). On the objects of speech perception. Ecological Psychology, 1, 121-144. (See commentaries and reply in the same issue.) Diehl, R.L., Kluender, K.R. & Walsh, M.A. Some auditory bases of speech perception and production. In W. Ainsworth (Ed.), Advances in speech, hearing and language processing (pp. 243-268), 1990. Diehl, R.L., Lotto., A.J., & Holt, L.L. Speech perception. Annual Review of Psychology, 2004, 55, 149-179. Quantitative detection and decision models Oden, G.C. & Massaro, D. W. Integration of featural information in speech perception. Psychological Review, 1978, 85, 172-191. Massaro, D.W. Categorical partition: A fuzzy-logical model of categorization behavior. In S. Harnad (Ed.), Categorical perception (pp. 254-286), 1987. Macmillan, N.A. Beyond the categorical/continuous distinction: A psychophysical approach to processing modes. In S. Harnad (Ed.), Categorical perception (pp. 5388), 1987. Nearey, T.M. The segment as a unit of speech perception. Journal of Phonetics, 1990, 18, 347-374. Nearey, T.M. Speech perception as pattern recognition. JASA, 1997, 101, 3241-3254. (See commentary by Kluender and Lotto, JASA, 1999, 105, 503-511.) Kingston, J. & Macmillan, N.A. Integrality of nasalization and F1 in vowels in isolation and before oral and nasal consonants: A detection-theoretic application of the Garner paradigm. Journal of the Acoustical Society of America, 1995, 97, 12611285. Macmillan, N.A., Kingston, J., Thorburn, R., Dickey, L.W. & Bartels, C. Integrality of nasalization and F1. II. Basic sensitivity and phonetic labeling measure distinct 13 sensory and decision-rule interactions. 
Journal of the Acoustical Society of America, 1999, 106, 2913-2932. Machine recognition and connectionist models Reddy, D.R. Machine models of speech perception. In R.A. Cole (Ed.), Perception and production of fluent speech (pp. 215-242), 1980. Klatt, D.H. Speech perception: A model of acoustic-phonetic analysis and lexical access. In R.A. Cole (Ed.), Perception and production of fluent speech (pp. 243-288), 1980. O'Shaughnessy, D. Speech communication: Human and machine (pp. 413-478), 1987. McClelland, J.L. & Elman, J.L. The TRACE model of speech perception. Cognitive Psychology, 1986, 18, 1-86. Elman, J.L. & Zipser, D. Learning the hidden structure of speech. Institute for Cognitive Science, Report 8701, Feb. 1987. Rabiner, L. R. and Juang, B. H. (1986). An introduction to hidden markov models. IEEE ASSP Magazine, January, 4-16. Poritz, A. B. (1988). Hidden markov models: A guided tour. ICASSP, 7-13, New York City. Norris, D. Shortlist: a connectionist model of continuous speech recognition. Cognition, 52, 189-234, 1994. Liu, S.A. Landmark detection for distinctive feature-based speech recognition. JASA, 1996, 100, 3417-3430. Ainsworth, W.A. Some approaches to automatic speech recognition. In Hardcastle, W.J. & Laver, J. (eds.), The handbook of phonetic sciences (pp. 721-743), 1997. 14