Speech Perception Re.. - HomePage Server for UT Psychology

Randy Diehl
Psy 394U
Speech Perception
Background Readings
General phonetics
Catford, J.C. Fundamental problems in phonetics (1977).
Ladefoged, P. A course in phonetics, 3rd Ed. (1993).
Hardcastle, W.J. & Laver, J. (eds.) The handbook of phonetic sciences (1997).
Distinctive feature theory
Jakobson, R. & Halle, M. Fundamentals of language (1956).
Jakobson, R., Fant, G., & Halle, M. Preliminaries to speech analysis (1963).
Chomsky, N. & Halle, M. The sound pattern of English (1968), Chapter 7.
Ladefoged, P. Phonetic prerequisites for a distinctive feature theory. In Valdman, A.
(ed.), Papers in linguistics and phonetics to the memory of Pierre Delattre. The
Hague: Mouton (pp. 273-285), 1972.
Vennemann, T. & Ladefoged, P. Phonetic features and phonological features. Lingua, 32,
61-74, 1973.
McCarthy, J.J. Feature geometry and dependency: A review. Phonetica, 43, 84-108, 1988.
Diehl, R.L. & Lindblom, B. Explaining the structure of feature and phoneme inventories.
In S. Greenberg, W.A. Ainsworth, A. Popper, and R. Fay (eds.) Speech processing
in the auditory system. New York: Springer (pp. 101-162), 2004.
Acoustic theory of speech production
Fant, G. Acoustic theory of speech production (1970).
Flanagan, J.L. Speech analysis, synthesis, and perception (1972).
Stevens, K.N. Acoustic phonetics (1998).
Signal processing and linear systems analysis
Markel, J.D. & Gray, A.H. Jr. Linear prediction of speech (1976).
Oppenheim, A.V. & Willsky, A.S. Signals and systems (1983).
Rosen, S. & Howell, P. Signals and systems for speech and hearing (1991, recently
revised).
Anatomy and physiology of speech production
Minifie, F.D., Hixon, T.J., Williams, F. (Eds.) Normal aspects of speech, hearing and
language (1973). Chapters 3, 4, 5, 6.
Hardcastle, W.J. Physiology of speech production (1976).
Dickson, D.R. & Maue-Dickson, W. Anatomical and physiological bases of speech
(1982).
Acoustic and articulatory phonetics
Denes, P.B. & Pinson, E.N. The speech chain (1963).
Lehiste, I. (Ed.) Readings in acoustic phonetics (1967).
Lieberman, P. & Blumstein, S.E. Speech physiology, speech perception, and acoustic
phonetics (1988).
Kent, R. & Read, C. The acoustic analysis of speech (1992).
Olive, J.P., Greenwood, A. & Coleman, J. Acoustics of American English Speech
(1993).
Stevens, K.N. Acoustic phonetics (1998).
Pickett, J.M. The acoustics of speech communication (1999).
Early Haskins work: in search of the acoustic cues
Delattre, P.C., Liberman, A.M., & Cooper, F.S. Acoustic loci and transitional cues for
consonants. JASA, 1955, 27, 769-773.
Delattre, P.C., Liberman, A.M., Cooper, F.S. & Gerstman, L. An experimental study of
the acoustic determinants of vowel color: observations on one- and two-formant
vowels synthesized from spectrographic patterns. Word, 1952, 8, 195-210.
Delattre, P.C., Liberman, A.M. & Cooper, F.S. Formant transitions and loci as acoustic
correlates of place of articulation in American fricatives. Studia Linguistica, 1964,
104-121.
Harris, K.S., Hoffman, H.S., Liberman, A.M., Delattre, P.C., & Cooper, F.S. Effect of
third-formant transitions on the perception of the voiced stop consonants. JASA,
1958, 30, 122-126.
Liberman, A.M. Some results of research on speech perception. JASA, 1957, 29, 117-123.
Liberman, A.M., Delattre, P.C., & Cooper, F.S. Some cues for the distinction between
voiced and voiceless stops in initial position. Language and Speech, 1958, 1, 153-167.
Liberman, A.M., Delattre, P.C. & Cooper, F.S. The role of selected stimulus variables in
the perception of unvoiced stop consonants. American Journal of Psychology,
1952, 65, 497-516.
Liberman, A.M., Delattre, P.C., & Cooper F.S. Tempo of frequency change as a cue for
distinguishing classes of speech sounds. Journal of Experimental Psychology,
1956, 52, 127-137.
Lisker, L. Closure duration and the intervocalic voiced-voiceless distinction in English.
Language, 1957, 33, 42-49.
Lisker, L. Minimal cues for separating /w, r, l, y/ in intervocalic position. Word, 1957,
13, 256-267.
O'Connor, J.D., Gerstman, L.J., Liberman, A.M., Delattre, P.C. & Cooper, F.S. Acoustic
cues for the perception of initial /w, j, r, l/ in English. Word, 1957, 13, 24-43.
The speech mode hypothesis
Liberman, A.M., Cooper, F.S., Shankweiler, D.P. & Studdert-Kennedy, M. The
perception of the speech code. Psychological Review, 1967, 74, 431-461.
Liberman, A.M. On the finding that speech is special. American Psychologist, 1982, 37,
148-167.
Liberman, A.M. & Mattingly, I.G. The motor theory of speech perception revised.
Cognition, 1985, 21, 1-36.
Liberman, A.M. Speech: A special code (1996).
Categorical perception of speech
Lisker, L., & Abramson, A.S. The voicing dimension: Some
experiments in comparative phonetics. Proceedings of the 6th International
Congress of Phonetic Sciences (pp. 563-567). Prague: Academia, 1970.
Abramson, A.S. & Lisker, L. Discriminability along the voicing continuum: Cross-language
tests. In Proceedings of the 6th International Congress of Phonetic
Sciences (pp. 569-573). Prague: Academia, 1970.
Liberman, A.M., Harris, K.S., Eimas, P.D., Lisker, L. & Bastian, J. An effect of learning
on speech perception: The discrimination of durations of silence with and without
phonemic significance. Language and Speech, 1961, 4, 175-195.
Liberman, A.M., Harris, K.S., Hoffman, H.S. & Griffith, B.C. The discrimination of
speech sounds within and across phoneme boundaries. Journal of Experimental
Psychology, 1957, 54, 358-368.
Liberman, A.M., Harris, K.S., Kinney, J.A. & Lane, H.L. The discrimination of relative
onset time of the components of certain speech and nonspeech patterns. Journal of
Experimental Psychology, 1961, 61, 378-388.
Mattingly, I.G., Liberman, A.M., Syrdal, A.K., & Halwes, T.G. Discrimination in speech
and non-speech modes. Cognitive Psychology, 1971, 2, 131-157.
Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A.M., Jenkins, J.J., & Fujimura, O.
An effect of linguistic experience: The discrimination of /r/ and /l/ by native
speakers of Japanese and English. Perception & Psychophysics, 1975, 18, 331-340.
Repp, B.H. Categorical perception: Issues, methods, findings. In Lass, N.J. (ed.),
Speech and language: Advances in basic research and practice, Vol. 10 (pp. 243-335).
Academic Press, 1984.
Pastore, R.E. Categorical perception: Some psychophysical models. In S. Harnad (ed.),
Categorical perception (pp. 29-52), 1987.
Rosen, S. & Howell, P. Auditory, articulatory, and learning explanations of categorical
perception in speech. In S. Harnad (ed.), Categorical perception (pp. 113-160),
1987.
Harnad, S. (ed.). Categorical perception (1987).
Simos, P.G., Diehl, R.L., Breier, J.I., Molis, M.R., Zouridakis, G., & Papanicolaou,
A.C. MEG correlates of categorical perception of a voice onset time continuum in
humans. Cognitive Brain Research, 1998, 7, 215-219.
Sharma, A. & Dorman, M.F. Cortical auditory evoked potential correlates of categorical
perception of voice-onset time. JASA, 1999, 106, 1078-1083.
Categorical perception of nonspeech
Miller, J.D., Wier, C.C., Pastore, R.E., Kelly, W.J. & Dooling, R.J. Discrimination and
labeling of noise-buzz sequences with varying noise-lead times: An example of
categorical perception. JASA, 1976, 60, 410-417.
Pisoni, D.B. Identification and discrimination of the relative onset time of two
component tones: Implications for voicing perception in stops. JASA, 1977,
1352-1361.
Holt, L.L., Lotto, A.J., & Diehl, R.L. Auditory discontinuities interact with
categorization: Implications for speech perception. JASA, in press.
Speech perception by animals
Kuhl, P.K. & Miller, J.D. Speech perception by the chinchilla: Identification
functions for synthetic VOT stimuli. JASA, 1978, 63, 905-917.
Kuhl, P.K. Discrimination of speech by nonhuman animals: Basic auditory sensitivities
conducive to the perception of speech-sound categories. JASA, 1981, 70, 340-349.
Kuhl, P.K. & Padden, D.M. Enhanced discrimination at the phonetic boundaries for the
voicing feature in macaques. Perception & Psychophysics, 1982, 32, 542-550.
Kuhl, P.K. & Padden, D.M. Enhanced discrimination at the phonetic boundaries for the
place feature in macaques. JASA, 1983, 73, 1003-1010.
Kluender, K.R., Diehl, R.L., & Killeen, P.R. Japanese quail can learn phonetic
categories. Science, 1987, 237, 1195-1197.
Kluender, K.R. Effects of first formant onset properties on voicing judgments result from
processes not specific to humans. JASA, 1991, 90, 83-96.
Kluender, K.R. & Lotto, A.J. Effects of first formant onset frequency on [-voice]
judgments result from general auditory processes not specific to humans. JASA,
1994, 95, 1044-1052.
Lotto, A.J., Kluender, K.R. & Holt, L.L. Perceptual compensation for coarticulation by
Japanese quail (Coturnix coturnix japonica). JASA, 1997, 102, 1134-1140.
Kluender, K.R., Lotto, A.J., Holt, L.L. & Bloedel, S.L. Role of experience for
language-specific functional mappings of vowel sounds. JASA, 1998, 104, 3568-3582.
Electrophysiological studies of speech perception in animals
Young, E.D. & Sachs, M.B. Processing speech in the peripheral auditory system. In T.
Myers, J. Laver & J. Anderson (Eds.), The cognitive representation of speech (pp.
75-92), 1981.
Sachs, M.B., Young, E.D. & Miller, M.I. Encoding of speech features in the auditory
nerve. In R.Carlson & B. Granström (Eds.), The representation of speech in the
peripheral auditory system (pp. 115-130), 1982.
Delgutte, B. & Kiang, N.Y.S. Speech coding in the auditory nerve: (I-IV). JASA, 1984,
75, 866-918.
Smoorenburg, G.F. Discussion of physiological correlates of speech perception. In
M.E.H. Schouten (Ed.), The psychophysics of speech perception (pp. 393-399).
1987.
Sinex, D.G. & McDonald, L.P. Average discharge rate representation of voice onset time
in the chinchilla auditory nerve. JASA, 1988, 83, 1817-1827.
Sinex, D.G. & McDonald, L.P. Synchronized discharge rate representation of voice onset
time in the chinchilla auditory nerve. JASA, 1989, 85, 1995-2004.
Sinex, D.G., McDonald, L.P. & Mott, J.B. Neural correlates of nonmonotonic
temporal acuity for voice onset time. JASA, 1991, 90, 2441-2449.
Delgutte, B. Auditory neural processing of speech. In Hardcastle, W.J. & Laver, J. (eds.),
The handbook of phonetic sciences (pp. 508-538), 1997.
Trading relations in speech and nonspeech
Best, C.T., Morrongiello, B. & Robson, R. Perceptual equivalence of acoustic cues in
speech and nonspeech perception. Perception & Psychophysics, 1981, 29, 191-211.
Repp, B.H. Phonetic trading relations and context effects: New experimental evidence
for a speech mode of perception. Psychological Bulletin, 1982, 92, 81-110.
Summerfield, A.Q. Differences between spectral dependencies in auditory and phonetic
temporal processing: Relevance to the perception of voicing in initial stops. JASA,
1982, 72, 51-61.
Hillenbrand, J. Perception of sine-wave analogs of voice onset time stimuli. JASA,
1984, 75, 231-240.
Parker, E.M., Diehl, R.L. & Kluender, K.R. Trading relations in speech and nonspeech.
Perception & Psychophysics, 1986, 34, 314-322.
Parker, E.M. Auditory constraints on the perception of stop voicing: The influence of
lower-tone frequency on judgments of tone-onset simultaneity. JASA, 1988, 83,
1597-1607.
Infant perception of speech and nonspeech
Eimas, P.D., Siqueland, E.R., Jusczyk, P. & Vigorito, J. Speech perception in
infants. Science, 1971, 171, 303-306.
Kuhl, P.K. Speech perception in early infancy: Perceptual constancy for spectrally
dissimilar vowel categories. JASA, 1979, 66, 1669-1679.
Kuhl, P.K. & Meltzoff, A. N. The bimodal perception of speech in infancy. Science,
1982, 218, 1138-1141.
Aslin, R.N., Pisoni, D.B. & Jusczyk, P.W. Auditory development and speech perception
in infancy. In M.M. Haith & J.J. Campos (Eds.), Infancy and the biology of
development, 1983.
Jusczyk, P. W., Pisoni, D. B., Walley, A. and Murray, J. Discrimination of relative
onset time of two-component tones by infants. Journal of the Acoustical
Society of America, 1980, 67, 262-270.
Jusczyk, P. W., Pisoni, D. B., Reed, M. A., Fernald, A. and Myers, M. Infants'
discrimination of the duration of a rapid spectrum change in nonspeech signals.
Science, 1983, 222, 175-177.
Jusczyk, P. W. Language acquisition: Speech sounds and the beginning of phonology. In
Miller, J.L. & Eimas, P.D. (eds.), Speech, language, and communication (pp. 263-301).
San Diego: Academic Press, 1995.
Jusczyk, P.W. The discovery of spoken language (1997).
Werker, J.F. & Tees, R.C. Cross-language speech perception: Evidence for perceptual
reorganization during the first year of life. Infant Behavior and Development, 1984,
7, 49-63.
Werker, J. F. and Lalonde, C. E. Cross-language speech perception: Initial capabilities
and developmental change. Developmental Psychology, 1988, 24, 672-683.
Werker, J.F. The ontogeny of speech perception. In I.G. Mattingly & M. Studdert-Kennedy
(Eds.), Modularity and the motor theory of speech perception (pp. 91-109), 1991.
Mehler, J., Jusczyk, P., Lambertz, G., Halsted, N., Bertoncini, J. & Amiel-Tison, C. A
precursor of language acquisition in young infants. Cognition, 1988, 29, 143-178.
Kuhl, P.K., Williams, K.A., Lacerda, F., Stevens, K.N. & Lindblom, B. Linguistic
experience alters phonetic perception in infants by 6 months of age. Science,
1992, 255, 606-608.
Kuhl, P.K. & Meltzoff, A.N. Infant vocalizations in response to speech: Vocal imitation
and developmental change. JASA, 1996, 100, 2425-2438.
Saffran, J. R., Aslin, R. N., and Newport, E. L. Statistical learning by 8-month-old
infants. Science, 1996, 274, 1926-1928.
Eimas, P.D. Segmental and syllabic representations in the perception of speech by young
infants. JASA, 1999, 105, 1901-1911.
Intonation and stress
Fry, D.B. Duration and intensity as physical correlates of linguistic stress. JASA, 1955,
27, 765-768. (Also in Lehiste.)
Hadding-Kock, K. & Studdert-Kennedy, M. An experimental study of some intonation
contours. Phonetica, 1964, 11, 175-185.
Hadding, K. & Studdert-Kennedy, M. Are you asking me, telling me, or talking to
yourself? Journal of Phonetics, 1974, 2, 7-14.
Cutler, A. Phoneme-monitoring reaction time as a function of preceding intonation
contour. Perception & Psychophysics, 1976, 20, 55-60.
Darwin, C. J. On the dynamic use of prosody in speech perception. In A. Cohen & S.G.
Nooteboom (Eds.), Structure and process in speech perception (1975).
Nakatani, L.H. & Schaffer, J.A. Hearing "words" without words: Prosodic cues for word
perception. JASA, 1978, 63, 234-245.
Pierrehumbert, J.B. The perception of fundamental frequency declination. JASA, 1979,
66, 363-369.
Thorsen, N.G. A study of the perception of sentence intonation--Evidence from Danish.
JASA, 1980, 67, 1014-1031.
Steele, S.A. Interaction of vowel F0 and prosody. Phonetica, 1986, 43, 92-105.
Kohler, K.J. & van Dommelen, W.A. Prosodic effects on lenis/fortis perception:
Preplosive F0 and LPC synthesis. Phonetica, 1986, 43, 70-75.
Silverman, K. F0 segmental cues depend on intonation: The case of the rise after voiced
stops. Phonetica 1986, 43, 76-91.
Whalen, D.H., Abramson, A.S., Lisker, L. & Mody, M. Gradient effects of fundamental
frequency on stop consonant voicing judgments. Phonetica, 1990, 47, 36-49.
Sluijter, A. & Terken, J. Beyond sentence prosody: Paragraph intonation in Dutch.
Phonetica, 1994, 50, 180-188.
Swerts, M. Prosodic features at discourse boundaries of different strength. JASA, 1997,
101, 514-521.
Timing, rate and segment duration
Klatt, D. Linguistic uses of segmental duration in English: Acoustic and perceptual
evidence. JASA, 1976, 59, 1208-1221.
Klatt, D. & Cooper, W.E. Perception of segment duration in sentence contexts. In A.
Cohen & S.G. Nooteboom (Eds.), Structure and process in speech perception
(1975).
Lehiste, I., Olive, J.P. & Streeter, L.A. Role of duration in disambiguating syntactically
ambiguous sentences. JASA, 1976, 60, 1199-1202.
Miller, J.L. & Liberman, A.M. Some effects of later-occurring information on the
perception of stop consonant and semivowel. Perception & Psychophysics, 1979,
25, 457-465.
Miller, J.L. Effects of speaking rate on segmental distinctions. In P.D. Eimas & J.L.
Miller (Eds.), Perspectives on the study of speech (1981).
Summerfield, A.Q. On articulatory rate and perceptual constancy in phonetic perception.
Journal of Experimental Psychology: Human Perception and Performance, 1981, 7,
1074-1095.
Fowler, C.A. Converging sources of evidence on spoken and perceived rhythms of
speech: Cyclic production of vowels in monosyllabic stress feet. Journal of
Experimental Psychology: General, 1983, 112, 386-412.
Diehl, R.L. & Walsh, M.A. An auditory basis for the stimulus-length effect in the
perception of stops and glides. JASA, 1989, 85, 2154-2164.
Fowler, C.A. Sound-producing sources as objects of perception: Rate normalization and
nonspeech perception. JASA, 1990, 88, 1236-1249. (Also read reply by Diehl,
Walsh, and Kluender in JASA, 1991, 89, 2905-2909.)
Merzenich, M. M., Jenkins, W. M., Johnston, P., Schreiner, C., Miller, S. L., and
Tallal, P. Temporal processing deficits of language-learning impaired children
ameliorated by training. Science, 1996, 271, 77-81.
Tallal, P., Miller, S. L., Bedi, G., Byma, G., Wang, X., Nagarajan, S. S., Schreiner,
C., Jenkins, W. M., and Merzenich, M. M. Language comprehension in
language-learning impaired children improved with acoustically modified
speech. Science, 1996, 271, 81-84.
Vowel perception
Ladefoged, P. & Broadbent, D.E. Information conveyed by vowels. JASA, 1957, 29,
98-104.
Lindblom, B. & Studdert-Kennedy, M. On the role of formant transitions in vowel
recognition. JASA, 1967, 42, 830-843.
Carlson, R., Fant, G., & Granström, B. Two-formant models, pitch and vowel
perception. In G. Fant & M.A.A. Tatham (Eds.), Auditory analysis and perception
of speech (pp. 55-82), 1975.
Chistovich, L.A. & Lublinskaya, V.V. The 'center of gravity' effect in vowel spectra and
critical distance between the formants: Psychoacoustical study of the perception of
vowel-like stimuli. Hearing Research, 1979, 1, 185-195.
Traunmüller, H. Perceptual dimension of openness in vowels. JASA, 1981, 69,
1465-1475.
Darwin, C.J. Perceiving vowels in the presence of another sound: Constraints on
formant perception. JASA, 1984, 76, 1636-1647.
Darwin, C.J. & Gardner, T.B. Mistuning a harmonic of a vowel: Grouping and phase
effects on vowel quality. JASA, 1986, 79, 838-845.
Strange, W., Jenkins, J.J., Johnson, T.L. Dynamic specification of coarticulated
vowels. JASA, 1983, 74, 695-705.
Strange, W. Information for vowels in formant transitions. Journal of Memory and
Language, 1987, 26, 550-557.
Strange, W. Evolving theories of vowel perception. JASA, 1989, 85, 2081-2087.
Syrdal, A.K. & Gopal, H.S. A perceptual model of vowel recognition based on the
auditory representation of American English vowels. JASA, 1986, 79, 1086-1100.
Nearey, T.M. Static, dynamic, and relational properties in vowel perception. JASA,
1989, 85, 2088-2113.
Miller, J.D. Auditory-perceptual interpretation of the vowel. JASA, 1989, 85,
2114-2134.
Andruski, J.E. & Nearey, T.M. On the sufficiency of compound target specification of
isolated vowels and vowels in /bVb/ syllables. JASA, 1992, 91, 390-410.
Zahorian, S.A. & Jagharghi, A.J. Spectral shape versus formants as acoustic correlates
for vowels. JASA, 1993, 94, 1966-1982.
Hoemeke, K.A. & Diehl, R.L. Perception of vowel height: The role of F1-F0 distance.
JASA, 1994, 96, 661-674.
Moon, S.Y. & Lindblom, B. Interaction between duration, context, and speaking style in
English stressed vowels. JASA, 1994, 96, 40-55.
Rosner, B.S. & Pickering, J.B. Vowel perception and production (1994).
Hillenbrand, J.M., Getty, L.A., Clark, M.J. & Wheeler, K. Acoustic characteristics of
American English vowels. JASA, 1995, 97, 3099-3111.
Hillenbrand, J.M. & Nearey, T.M. Identification of resynthesized /hVd/ utterances:
Effects of formant contour. JASA, 1999, 105, 3509-3523.
Jenkins, J.J., Strange, W. & Trent, S.A. Context-independent dynamic information for
the perception of coarticulated vowels. JASA, 1999, 106, 438-448.
Perception of concurrent vowels
Assmann, P.F. & Summerfield, Q. Modeling the perception of concurrent vowels:
Vowels with different fundamental frequencies. JASA, 1990, 88, 680-697.
Summerfield, Q. & Assmann, P.F. Perception of concurrent vowels: Effects of harmonic
misalignment and pitch-period asynchrony. JASA, 1991, 89, 1364-1377.
Assmann, P.F. & Summerfield, Q. The contribution of waveform interactions to the
perception of concurrent vowels. JASA, 1994, 95, 471-484.
Culling, J.F. & Darwin, C.J. Perceptual separation of simultaneous vowels: Within and
across-formant grouping by F0. JASA, 1993, 93, 3454-3467.
Culling, J.F. & Darwin, C.J. Perceptual and computational separation of simultaneous
vowels: Cues arising from low frequency beating. JASA, 1994, 95, 1559-1569.
Culling, J.F. & Summerfield, Q. The role of frequency modulation in the perceptual
segregation of concurrent vowels. JASA, 1995, 98, 837-846.
Assmann, P.F. Modeling the perception of concurrent vowels: Role of formant
transitions. JASA, 1996, 100, 1141-1152.
de Cheveigné, A., Kawahara, H., Tsuzaki, M. & Aikawa, K. Concurrent vowel
identification. I. Effects of relative amplitude and F0 difference. JASA, 1997, 101,
2839-2847.
de Cheveigné, A., McAdams, S., & Marin, C. M.H. Concurrent vowel identification. II.
Effects of phase, harmonicity, and task. JASA, 1997, 101, 2848-2856.
de Cheveigné, A. Concurrent vowel identification. III. A neural model of harmonic
interference cancellation. JASA, 1997, 101, 2857-2865.
The “perceptual magnet effect”
Grieser, D. & Kuhl, P.K. Categorization of speech by infants: Support for speech-sound
prototypes. Developmental Psychology, 1989, 25, 577-588.
Kuhl, P.K. Human adults and human infants show a ‘perceptual magnet effect’ for
the prototypes of speech categories, monkeys do not. Perception &
Psychophysics, 1991, 50, 93-107.
Kuhl, P.K., Williams, K.A., Lacerda, F., Stevens, K.N., & Lindblom, B. Linguistic
experience alters phonetic perception in infants by 6 months of age. Science,
1992, 255, 606-608.
Lively, S.E. & Pisoni, D.B. On prototypes and phonetic categories: A critical assessment
of the perceptual magnet effect in speech perception. Journal of Experimental
Psychology: Human Perception and Performance, 1995, 23, 1665-1679.
Kluender, K.R., Lotto, A.J., Holt, L.L. & Bloedel, S.L. Role of experience for
language-specific functional mappings of vowel sounds. JASA, 1998, 104,
3568-3582.
Lotto, A.J., Kluender, K.R., & Holt, L.L. Depolarizing the perceptual magnet
effect. Journal of the Acoustical Society of America, 1998, 103, 3648-3655.
(See also Letters to the Editor by Guenther and Lotto, JASA, 2000, 3576-3580.)
Guenther, F.H., Husain, F.T., Cohen, M.A., & Shinn-Cunningham, B.G. Effects of
categorization and discrimination training on auditory perceptual space. Journal of
the Acoustical Society of America, 1999, 106, 2900-2912.
Talker normalization
Ladefoged, P. & Broadbent, D.E. Information conveyed by vowels. JASA, 1957, 29,
98-104.
Johnson, K. & Mullennix, J.W. (eds.) Talker variability in speech processing (1997).
Perception of consonant place: Invariant cues?
Stevens, K.N. & Blumstein, S.E. Invariant cues for place of articulation in stop
consonants. JASA, 1978, 64, 1358-1368.
Kewley-Port, D. Time-varying features as correlates of place of articulation in stop
consonants. JASA, 1983, 73, 322-335.
Sussman, H.M., McCaffrey, H.A., Matthews S.A. An investigation of locus equations as
a source of relational invariance for stop place categorization. JASA, 1991, 90,
1309-1325.
Sussman, H.M. The representation of stop consonants in three-dimensional acoustic
space. Phonetica, 1991, 48, 18-31.
Fruchter, D. & Sussman, H.M. The perceptual relevance of locus equations. JASA,
1997, 102, 2997-3008.
Sussman, H.M., Fruchter, D. Hilbert, J. & Sirosh, J. Linear correlates in the speech
signal: The orderly output constraint. Behavioral and Brain Sciences, 1998, 21,
241-259. (See commentaries and reply in same issue.)
Chennoukh, S., Carré, R. & Lindblom, B. Locus equations in light of articulatory
modeling. Journal of the Acoustical Society of America, 1997, 102, 2380-2389.
Context and higher-order effects on speech perception
Warren, R.M. Perceptual restoration of missing speech sounds. Science, 1970, 167,
392-393.
Elman, J.L., Diehl, R.L. & Buchwald, S.E. Perceptual switching in bilinguals. JASA,
1977, 62, 971-974.
Mann, V.A. & Repp, B.H. Influence of vocalic context on perception of the /ʃ/-/s/
distinction. Perception & Psychophysics, 1980, 28, 213-228.
Ganong, W.F., III. Phonetic categorization in auditory word perception. Journal of
Experimental Psychology: Human Perception and Performance, 1980, 6, 110-125.
Luce, P., Pisoni, D. B., and Goldinger, S. Similarity neighborhoods of spoken words. In
G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic
and computational perspectives. Cambridge, MA: MIT Press, 122-147, 1990.
Cutler, A. Spoken word recognition and production. In Miller, J.L. & Eimas, P.D. (eds.),
Speech, language, and communication (pp. 97-136). San Diego: Academic Press,
1995.
Speech perception in a second language
Lively, S. E., Pisoni, D. B., Yamada, R. A., Tohkura, Y., and Yamada, T. Training
Japanese listeners to identify English /r/ and /l/. III. Long-term retention of new
phonetic categories. Journal of the Acoustical Society of America, 1994, 96,
2076-2087.
Best, C.T. A direct realist view of cross-language speech perception. In W. Strange
(Ed.), Speech perception and linguistic experience: Issues in cross-language
research (pp. 171-204). Baltimore: York Press, 1995.
Flege, J. E. Second language speech learning: Theory, findings, and problems. In W.
Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language
research (pp. 233-277). Baltimore: York Press, 1995.
Guion, S.G., Flege, J.E., Akahane-Yamada, R. & Pruitt, J.C. An investigation of current
models of second language speech perception: The case of Japanese adults’
perception of English consonants. Journal of the Acoustical Society of America,
2000, 107, 2711-2724.
Visual information for speech perception
McGurk, H. & MacDonald, J. Hearing lips and seeing voices. Nature, 1976, 264,
746-748.
Kuhl, P.K. & Meltzoff, A. N. The bimodal perception of speech in infancy. Science,
1982, 218, 1138-1141.
Massaro, D.W. & Cohen, M.M. Evaluation and integration of visual and auditory
information in speech perception. Journal of Experimental Psychology: Human
Perception and Performance, 1983, 9, 753-771.
Massaro, D.W. & Cohen, M.M. Perception of synthesized audible and visible speech.
Psychological Science, 1990, 1, 55-63.
Massaro, D.W., Cohen, M.M. & Smeele, P.M.T. Perception of asynchronous and
conflicting visual and auditory speech. JASA, 1996, 100, 1777-1786.
Summerfield, Q. Visual perception of phonetic gestures. In Mattingly, I.G. &
Studdert-Kennedy, M. (eds.), Modularity and the motor theory of speech perception
(pp. 117-137). Hillsdale, N.J.: Lawrence Erlbaum, 1991.
Rosenblum, L.D., Schmuckler, M.A. & Johnson, J.A. The McGurk effect in infants.
Perception & Psychophysics, 1997, 59, 347-357.
Grant, K.W., Walden, B.E. & Seitz, P.F. Auditory-visual speech recognition by
hearing-impaired subjects: Consonant recognition, sentence recognition, and auditory-visual
integration. JASA, 1998, 103, 2677-2690.
Perception of acoustically (electrically) altered speech (including cochlear implant speech
perception)
Remez, R.E., Rubin, P.E., Pisoni, D.B. & Carrell, T.D. Speech perception without
traditional speech cues. Science, 212, 947-950, 1981.
Remez, R.E., Rubin, P.E., Berns, S.M., Pardo, J.S. & Lang, J.M. On the perceptual
organization of speech. Psychological Review, 1994, 101, 129-156.
Van Tassell, D.J., Soli, S.D., Kirby, V.M. & Widin, G.P. Speech waveform envelope
cues for consonant recognition. Journal of the Acoustical Society of America,
1987, 82, 1152-1161.
Turner, C.W., Souza, P.E. & Gorget, L.N. Use of temporal envelope cues in speech
recognition by normal and hearing-impaired listeners. Journal of the Acoustical
Society of America, 1995, 97, 2568-2576.
Shannon, R.V., Zeng, F-G., Kamath, V., Wygonski, J. & Ekelid, M. Speech
recognition with primarily temporal cues. Science, 1995, 270, 303-304.
Dorman, M.F., Loizou, P.C. & Rainey, D. Speech intelligibility as a function of the
number of channels of stimulation for signal processors using sine-wave and
noise-band outputs. Journal of the Acoustical Society of America, 1997, 102, 2403-2411.
Fishman, K., Shannon, R.V., Slattery, W.A. Speech recognition as a function of the
number of electrodes used in the SPEAK cochlear implant speech processor.
Journal of Speech and Hearing Research, 1997, 40, 1201-1215.
Shannon, R.V., Zeng, F-G., & Wygonski, J. Speech recognition with altered spectral
distribution of envelope cues. JASA, 1998, 104, 2467-2476.
Rosen, S., Faulkner, A. & Wilkinson, L. Adaptation by normal listeners to upward
spectral shifts of speech: Implications for cochlear implants. Journal of the
Acoustical Society of America, 1999, 106, 3629-3636.
Eisenberg, L.S., Shannon, R.V., Martinez, A.S., Wygonski, J. & Boothroyd, A. Speech
recognition with reduced spectral cues as a function of age. Journal of the
Acoustical Society of America, 2000, 107, 2704-2710.
Auditory models applied to speech perception
Bladon, R.A.W. & Lindblom, B. Modeling the judgment of vowel quality
differences. Journal of the Acoustical Society of America, 1981, 69, 1414-1422.
Greenberg, S. Representation of speech in the auditory periphery (Theme issue). Journal
of Phonetics, 1988, 16.
Cohen, J.R. Application of an auditory model to speech recognition. Journal of the
Acoustical Society of America, 1989, 85, 2623-2629.
Jenison, R.L., Greenberg, S., Kluender, K.R. & Rhode, W.S. A composite model of the
auditory periphery for the processing of speech based on the filter response
functions of single auditory-nerve fibers. Journal of the Acoustical Society of
America, 1991, 90, 773-786.
Hermansky, H. Should recognizers have ears? Speech Communication, 1998, 25, 3-24.
Kewley-Port, D. & Zheng, Y. Auditory models of formant frequency discrimination for
isolated vowels. JASA, 1998, 103, 1654-1666.
Tchorz, J. & Kollmeier, B. A model of auditory perception as front end for automatic
speech recognition. Journal of the Acoustical Society of America, 1999, 106,
2040-2050.
De Cheveigné, A. & Kawahara, H. Missing-data model of vowel identification. JASA,
1999, 105, 3497-3508.
Neuroimaging studies of speech perception
Zatorre, R.J., Meyer, E., Gjedde, A. & Evans, A.C. PET studies of phonetic processing
of speech: Review, replication, and reanalysis. Cerebral Cortex, 1996, 6, 21-30.
Fitch, R.H., Miller, S., & Tallal, P. Neurobiology of speech perception. Annual Review
of Neuroscience, 1997, 20, 331-353.
Simos, P.G., Diehl, R.L., Breier, J.I., Molis, M.R., Zouridakis, G., & Papanicolaou,
A.C. MEG correlates of categorical perception of a voice onset time continuum in
humans. Cognitive Brain Research, 1998, 7, 215-219.
Mummery, C.J., Ashburner, J., Scott, S.K., & Wise, R.J.S. Functional neuroimaging of
speech perception in six normal and two aphasic subjects. JASA, 1999, 106, 449-457.
Theoretical approaches
Analysis by synthesis
Stevens, K.N. & Halle, M. Remarks on analysis by synthesis and distinctive features. In
W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form (1967).
Stevens, K.N. Toward a model for speech recognition. JASA, 1960, 32, 47-55.
Motor theory
(See references under "The speech mode hypothesis")
Feature detectors
Abbs, J.H. & Sussman, H.M. Neurophysiological feature detectors and speech
perception: A discussion of theoretical implications. Journal of Speech and
Hearing Research, 1971, 14, 23-36.
Eimas, P.D. & Corbit, J.D. Selective adaptation of linguistic feature detectors. Cognitive
Psychology, 1973, 4, 99-109.
Diehl, R.L. Feature detectors for speech: A critical reappraisal. Psychological Bulletin,
1981, 89, 1-18.
Sussman, H.M. Neural coding of relational invariance in speech: human language
analogs to the barn owl. Psychological Review, 1989, 96, 631-642.
Quantal theory and the invariance hypothesis
Stevens, K.N. & Blumstein, S.E. The search for invariant acoustic correlates of
phonetic features. In P.D. Eimas & J.L. Miller (Eds.), Perspectives on the
study of speech (pp. 1-38), Hillsdale, NJ: Lawrence Erlbaum, 1981.
Stevens, K.N. On the quantal nature of speech. Journal of Phonetics, 1989, 17, 3-45. (See also commentaries in same issue.)
Direct realism
Fowler, C.A. An event approach to the study of speech perception. Journal of
Phonetics, 1986, 14, 3-28. (See also commentaries in same issue.)
Ohala, J. J. Speech perception is hearing sounds, not tongues. Journal of the Acoustical
Society of America, 1996, 99, 1718-1725.
Fowler, C.A. Listeners do hear sounds, not tongues. JASA, 1996, 99, 1730-1741.
The theory of adaptive dispersion and the auditory enhancement hypothesis
Lindblom, B., MacNeilage, P. & Studdert-Kennedy, M. Self-organizing processes and
the explanation of phonological universals. In B. Butterworth, B. Comrie, & Ö.
Dahl (Eds.), Explanation for language universals (pp. 181-203), 1984.
Lindblom, B. Phonetic universals in vowel systems. In J.J. Ohala & J.J. Jaeger
(Eds.), Experimental phonology (pp. 13-44), 1986.
Diehl, R.L., Lindblom, B., & Creeger, C.P. (2003). Increasing realism of auditory
representations yields further insights into vowel phonetics. Proceedings of the 15th
International Congress of Phonetic Sciences, Vol. 2, pp. 1381-1384. Adelaide:
Causal Productions.
Diehl, R.L., & Kluender, K.R. (1989). On the objects of speech perception.
Ecological Psychology, 1, 121-144. (See commentaries and reply in the same
issue.)
Diehl, R.L., Kluender, K.R. & Walsh, M.A. Some auditory bases of speech perception
and production. In W. Ainsworth (Ed.), Advances in speech, hearing and language
processing (pp. 243-268), 1990.
Diehl, R.L., Lotto, A.J., & Holt, L.L. Speech perception. Annual Review of
Psychology, 2004, 55, 149-179.
Quantitative detection and decision models
Oden, G.C. & Massaro, D. W. Integration of featural information in speech perception.
Psychological Review, 1978, 85, 172-191.
Massaro, D.W. Categorical partition: A fuzzy-logical model of categorization behavior.
In S. Harnad (Ed.), Categorical perception (pp. 254-286), 1987.
Macmillan, N.A. Beyond the categorical/continuous distinction: A psychophysical
approach to processing modes. In S. Harnad (Ed.), Categorical perception (pp. 53-88), 1987.
Nearey, T.M. The segment as a unit of speech perception. Journal of Phonetics, 1990,
18, 347-374.
Nearey, T.M. Speech perception as pattern recognition. JASA, 1997, 101, 3241-3254.
(See commentary by Kluender and Lotto, JASA, 1999, 105, 503-511.)
Kingston, J. & Macmillan, N.A. Integrality of nasalization and F1 in vowels in isolation
and before oral and nasal consonants: A detection-theoretic application of the
Garner paradigm. Journal of the Acoustical Society of America, 1995, 97, 1261-1285.
Macmillan, N.A., Kingston, J., Thorburn, R., Dickey, L.W. & Bartels, C. Integrality of
nasalization and F1. II. Basic sensitivity and phonetic labeling measure distinct
sensory and decision-rule interactions. Journal of the Acoustical Society of
America, 1999, 106, 2913-2932.
Machine recognition and connectionist models
Reddy, D.R. Machine models of speech perception. In R.A. Cole (Ed.), Perception and
production of fluent speech (pp. 215-242), 1980.
Klatt, D.H. Speech perception: A model of acoustic-phonetic analysis and lexical access.
In R.A. Cole (Ed.), Perception and production of fluent speech (pp. 243-288), 1980.
O'Shaughnessy, D. Speech communication: Human and machine (pp. 413-478), 1987.
McClelland, J.L. & Elman, J.L. The TRACE model of speech perception. Cognitive
Psychology, 1986, 18, 1-86.
Elman, J.L. & Zipser, D. Learning the hidden structure of speech. Institute for Cognitive
Science, Report 8701, Feb. 1987.
Rabiner, L. R. and Juang, B. H. (1986). An introduction to hidden Markov models. IEEE
ASSP Magazine, January, 4-16.
Poritz, A. B. (1988). Hidden Markov models: A guided tour. ICASSP, 7-13, New York
City.
Norris, D. Shortlist: a connectionist model of continuous speech recognition. Cognition,
52, 189-234, 1994.
Liu, S.A. Landmark detection for distinctive feature-based speech recognition. JASA,
1996, 100, 3417-3430.
Ainsworth, W.A. Some approaches to automatic speech recognition. In Hardcastle, W.J.
& Laver, J. (eds.), The handbook of phonetic sciences (pp. 721-743), 1997.