
The feature [sonorant] in spoken word recognition
Chao-Yang Lee1, Danny R. Moates2, and Russell Fox2
1School of Hearing, Speech and Language Sciences, 2Department of Psychology
Ohio University, Athens, OH 45701, USA
Running title: Sonorant in spoken word recognition
Corresponding author:
Chao-Yang Lee
School of Hearing, Speech and Language Sciences
Grover Center W225
Ohio University
Athens, OH 45701, USA
Telephone: (740) 593-0232
Fax: (740) 593-0287
E-mail: leec1@ohio.edu
ABSTRACT
Distinctive features characterize the internal organization of phonetic segments,
but their role in representing and processing spoken words has not been evaluated
extensively. It is normally assumed in models of spoken word recognition that all
phonetic segments and features are treated equally in lexical processing, but this
assumption has been challenged by findings showing varying degrees of difficulty in
word reconstruction. The present study examined the role of the feature [sonorant] in two
form priming experiments with the lexical decision task. Participants responded to
prime-target pairs in which non-word primes contrasted with real-word targets in one
consonant, either matching the target consonant in the feature [sonorant] (e.g., another
obstruent replacing the [f] in conform) or mismatching it (e.g., [kənwɔrm], with the
sonorant [w] replacing the [f]). Responses were faster when the prime and target
matched in the feature [sonorant]. The effect, however, was limited to fricative targets:
stop targets, like sonorant targets, showed no advantage for the matching primes.
These findings suggest that speech sounds classified by the feature [sonorant] are
processed differently during spoken word recognition and that this processing difference
is modulated by further featural classifications.
INTRODUCTION
Spoken word recognition involves extracting information from the acoustic signal
and mapping the information onto the mental lexicon. Naturally, two major issues in the
study of spoken word recognition are the mechanism of the mapping process and the
nature of lexical representations. Cognitive models of spoken word recognition have
established that the mapping process involves lexical activation and competition
(e.g., Luce & Pisoni, 1998; Marslen-Wilson & Welsh, 1978; McClelland & Elman, 1986;
Norris, 1994). As for lexical representations, units of lexical processing have been
proposed at different levels ranging from spectral templates and distinctive features to the
phonetic segment and the syllable.
The purpose of this study was to examine the role of distinctive features in spoken
word recognition. In particular, the role of the feature [sonorant] in lexical processing
was examined in two form priming experiments. Although the nature of sublexical
representations remains debated, implicit in most models of spoken word recognition is
the assumption that all tokens of a sublexical unit are treated equally in lexical processing.
For example, for models with the phonetic segment as the basic unit, all phonetic
segments, vowels or consonants, are assumed to be equally effective in participating in
lexical activation and competition.
This assumption, however, has been challenged by studies showing the vowel
mutability effect (van Ooijen, 1996; Cutler, Sebastián-Gallés, Soler-Vilageliu, & van Ooijen,
2000). In particular, van Ooijen (1996) showed in a word reconstruction task that English
listeners tend to change vowels rather than consonants when asked to turn a non-word
sequence into a real word by changing one sound. For example, when given the non-word
teeble, listeners are more likely to propose table rather than feeble. Cutler et al. (2000)
further showed in a cross-linguistic study on word reconstruction that the tendency to
change vowels rather than consonants appeared to be language independent, reflecting
the intrinsic differences between the information provided by vowels and consonants. In
other words, not all phonetic segments are treated equally in lexical processing.
Obviously, not all sounds are equal in their internal structure. Linguists have long
noted that the phonetic segment can be further analyzed into bundles of distinctive
features (Jakobson, Fant, & Halle, 1952; Chomsky & Halle, 1968). Importantly, these
categorical features are grounded in physical principles governing the articulatory-acoustic-auditory relationships (Stevens, 1972, 1989, 1997). Specifically, non-linear or
“quantal” relations exist in the mapping from articulation onto acoustics and from
acoustics onto auditory responses. Consequently, continuous changes in one domain (e.g.,
articulation) could result in discrete changes in another domain (e.g., acoustics). It is
these non-linear relations that serve as the basis for the categorically specified distinctive
features (Stevens, 1972, 1989, 1997).
Given the internal organization of phonetic segments revealed by featural
specifications, it is conceivable that distinctive features would play a role in spoken word
recognition. Indeed, sensitivity to featural specification in lexical access has been
demonstrated in many experimental investigations (Connine, Blasko, & Titone, 1993;
Connine, Blasko, & Wang, 1994; Milberg, Blumstein, & Dworetzky, 1988). Distinctive
features have also figured in some cognitive models of spoken word recognition. For
example, TRACE (McClelland & Elman, 1986) incorporates a feature-level
representation in addition to phonemic and lexical nodes. The Cohort model has also
acknowledged the role of sublexical features in lexical activation. Specifically, candidacy
into word-initial cohort can tolerate certain featural mismatches (Marslen-Wilson, 1993;
Marslen-Wilson & Warren, 1994; Marslen-Wilson & Zwitserlood, 1989).
While ample evidence exists for the feature-level representation and for the role
of features in spoken word recognition, the implicit assumption remains that all types of
distinctive features are treated equally in lexical processing. However, from the vowel
mutability effect (van Ooijen, 1996; Cutler et al., 2000), it is clear that processing
differences are present between vowels and consonants. The next question is whether
similar processing differences also exist for other types of contrasts specified by
distinctive features. The answer to this question has implications for the speech sound
structure proposed by linguists (i.e., distinctive features and their organization) and the
relevance of these features in the processing of spoken words.
There are reasons to expect that lexical processing differences are present for
contrasts other than the vowel-consonant distinction. It has been proposed that distinctive
features are not an unorganized bundle but rather are grouped into a hierarchical structure
(Clements, 1985; McCarthy, 1988; Halle, 1992; Halle & Stevens, 1991). For example,
there is a consensus that the major class features [consonantal] and [sonorant] form the
“root” of a feature tree and that other features are derived from the root with further
reference to specific articulators (Kenstowicz, 1994). Stevens (2002, 2005) developed a
model for lexical access based on the distinctive features proposed by Halle (1992). In
this model, acoustic “landmarks” for consonants, vowels, and glides are first identified.
Acoustic parameters and cues are then extracted from the vicinity of the landmarks to
estimate the values of other features. Based on the estimations, lexical hypotheses are
generated and compared to words stored in the lexicon. Lexical access is achieved when
a match is found.
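To make this sequence of stages concrete, the sketch below traces the model's logic in a deliberately simplified form. The data structures, feature values, and matching rule are illustrative assumptions for exposition, not components of Stevens' published implementation.

```python
# A minimal sketch of lexical access in the spirit of Stevens (2002, 2005):
# estimated feature bundles are matched against ordered feature bundles
# stored in the lexicon. The feature values and the scoring rule are
# illustrative assumptions, not the model's actual machinery.

from typing import Dict, List

FeatureBundle = Dict[str, int]  # feature name -> +1 or -1

# Hypothetical lexicon: each word is an ordered list of feature bundles.
LEXICON: Dict[str, List[FeatureBundle]] = {
    "sun": [{"sonorant": -1, "continuant": +1},   # [s]
            {"sonorant": +1, "continuant": +1},   # vowel
            {"sonorant": +1, "continuant": -1}],  # [n]
    "ton": [{"sonorant": -1, "continuant": -1},   # [t]
            {"sonorant": +1, "continuant": +1},   # vowel
            {"sonorant": +1, "continuant": -1}],  # [n]
}

def match_score(estimated: List[FeatureBundle],
                stored: List[FeatureBundle]) -> float:
    """Proportion of stored feature values that agree with the estimate."""
    if len(estimated) != len(stored):
        return 0.0
    pairs = [(est.get(feat), val)
             for est, sto in zip(estimated, stored)
             for feat, val in sto.items()]
    return sum(e == v for e, v in pairs) / len(pairs)

def access(estimated: List[FeatureBundle]) -> str:
    """Return the word whose stored feature sequence best fits the estimate."""
    return max(LEXICON, key=lambda w: match_score(estimated, LEXICON[w]))

# A feature estimate with a fricative-initial sequence maps onto "sun".
print(access([{"sonorant": -1, "continuant": +1},
              {"sonorant": +1, "continuant": +1},
              {"sonorant": +1, "continuant": -1}]))
```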
Compared to other cognitive models of spoken word recognition, Stevens’ (2002,
2005) model explicitly specifies lexical representation in terms of distinctive features.
Procedures have also been developed for automatically estimating the landmarks and
some features. It is not clear, however, whether the proposed procedures also reflect
lexical processing by humans, particularly in real-time speech processing. That is, are
human listeners also engaged in consonant/vowel landmark detection prior to feature
estimation? Do human listeners evaluate all features simultaneously or do they give
preference to particular features? Studies showing the vowel mutability effect (van
Ooijen, 1996; Cutler et al., 2000) appear to have provided evidence for the processing
difference between consonants and vowels (i.e., the feature [consonantal]). What remains
to be evaluated is the processing of other features by human listeners.
The feature [sonorant] is a good candidate for addressing this issue. Every
language distinguishes sonorant consonants from obstruent consonants, just as all
languages distinguish consonants from vowels. This is part of the reason why [sonorant]
is one of the two major class features placed at the root of
feature geometry (Kenstowicz, 1994). Furthermore, [sonorant] is one of the articulator-free features (Halle, 1992), meaning that it does not specify any particular articulator but
rather reflects general characteristics of consonant constriction in the vocal tract and the
acoustic effect of forming the constriction (Stevens, 2002, 2005). In particular,
irrespective of the articulators involved, obstruent consonants are produced with
substantial intraoral air pressure and sonorant consonants are produced without such
significant pressure. Despite the seemingly fundamental status of these articulator-free
features, neither cognitive models of spoken word recognition nor Stevens' (2002,
2005) model indicates whether these features are processed any differently from other features.
Nonetheless, there exists some evidence for the processing of [sonorant] by
human listeners. Marks, Moates, Bond and Stockmal (2002) conducted a word
reconstruction study using American English and Spanish materials. Participants heard a
non-word (e.g., bavalry or mavalry) that could be changed into a real word (e.g., cavalry)
by changing just one consonant. The consonant to be recovered was an obstruent in half
the cases and a sonorant in the other half. Half the obstruents were replaced with other
obstruents (match in [sonorant]) and half were replaced with sonorants (mismatch in
[sonorant]). Similarly, half the sonorants were replaced with other sonorants (match) and
the other half were replaced with obstruents (mismatch). The results showed that when an
obstruent was replaced by another obstruent, reconstructing the correct word was
significantly more accurate than when the obstruent was replaced by a sonorant. In
contrast, sonorant target words showed no such effect. That is, accuracy of reconstructing
sonorant target words did not differ between the match and mismatch conditions.
Analogous to the vowel mutability effect, Marks et al. (2002) showed that speech
sounds, when divided into sonorants and obstruents, were not processed equally by
human listeners. When there was a match in the feature [sonorant], word reconstruction
was more accurate. However, this held true only for obstruent target words and
not for sonorant target words. The processing difference between sonorants and
obstruents was attributed to the observation that sonorants are phonetically similar to
vowels while obstruents are maximally distinct from vowels. That is, sonorant
consonants are probably more “mutable” than obstruent consonants in spoken word
recognition.
Marks et al. (2002) was the first study to evaluate the impact of the feature
[sonorant] in spoken word recognition by humans. The present study extended Marks et
al. (2002) in several ways. First, the form priming paradigm (Zwitserlood, 1996) with a
lexical decision task was used. Form priming has been used extensively to investigate the
nature of lexical representations and processing. The use of a task different from word
reconstruction could evaluate the generalizability of the [sonorant] effect found in Marks
et al. (2002). More importantly, the speeded-response task could provide a potentially
more sensitive measure than accuracy alone and would better assess the on-line nature of
lexical processing, as has been shown in earlier investigations of features (Connine et al.,
1993, 1994; Milberg et al., 1988). In the present study, prime-target pairs were
constructed where the target (e.g., conform) was preceded by one of two types of
non-word primes: one with a sound change matching the target in [sonorant] (e.g.,
another obstruent replacing the [f] in conform) and the other with a sound change
mismatching the target in [sonorant] (e.g., [kənwɔrm], with the sonorant [w] replacing
the [f]). If listeners are sensitive to the [sonorant] specification in word
recognition, response should be facilitated in the matching condition relative to the
mismatching condition.
Second, the present study divided obstruent consonants into fricatives and stops.
In a post hoc analysis not reported in the Marks et al. (2002) study, it was discovered that
among the obstruent words, response accuracy appeared to differ between fricative and
stop consonants. Coincidentally, these two classes of sounds are distinguished in the
feature system by another articulator-free feature [continuant]. A subsequent word
reconstruction study of the feature [continuant] also revealed that reconstruction of
fricative words was less error-prone in the match condition than in the mismatch
condition (Moates, Sutherland, Bond, & Stockmal, manuscript in preparation). In contrast,
stop words showed no such difference. In other words, fricative words alone could be
responsible for the mismatch effect found in the Marks et al. (2002) study. For these
reasons, it was decided to examine fricatives vs. sonorants (Experiment 1) and stops vs.
sonorants (Experiment 2) separately to evaluate the potential difference between
fricatives and stops.
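The three consonant classes contrasted across the two experiments can be stated compactly in terms of these two articulator-free features. The sketch below makes the classification explicit; the segment inventory is abbreviated and the feature values simplified, so it is an illustration rather than a full phonological analysis.

```python
# Classifying segments by the articulator-free features [sonorant] and
# [continuant]. The inventory is abbreviated and some values simplified
# (e.g., the [continuant] value of laterals varies across analyses).

FEATURES = {
    #        ([sonorant], [continuant])
    "f": (-1, +1), "v": (-1, +1), "s": (-1, +1), "z": (-1, +1),  # fricatives
    "p": (-1, -1), "b": (-1, -1), "t": (-1, -1), "d": (-1, -1),  # stops
    "k": (-1, -1), "g": (-1, -1),
    "m": (+1, -1), "n": (+1, -1),                                # nasals
    "l": (+1, +1), "r": (+1, +1), "j": (+1, +1), "w": (+1, +1),  # liquids/glides
}

def segment_class(segment: str) -> str:
    """Map a segment onto the stimulus class used in the two experiments."""
    sonorant, continuant = FEATURES[segment]
    if sonorant == +1:
        return "sonorant"
    return "fricative" if continuant == +1 else "stop"

print(segment_class("f"), segment_class("p"), segment_class("w"))
```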
EXPERIMENT 1
Method
Materials
Ninety-eight English words were selected as real-word targets in the priming
experiment. All words have one of 14 target phonemes including seven fricative
consonants [f, v, θ, ð, s, z, ʃ] and seven sonorant consonants [m, n, ŋ, l, r, j, w]. Half of
the words have two syllables and the other half have three syllables. Half of the words
have the target phoneme in the onset of the stressed syllable and the other half in the coda
of the stressed syllable. Target words in the fricative and sonorant lists were balanced for
variables affecting lexical access. A set of t tests showed no significant difference in
word frequency (p = 0.68), number of segments (p = 0.72), number of consonants (p =
0.97), and uniqueness point (p = 0.24). The consonant change occurred before the
uniqueness point in all target words. Ideally there would be a total of 112 items, including
14 (seven fricatives and seven sonorants) x 2 (two- vs. three-syllables) x 2 (syllable onset
vs. coda) x 2 (tokens). However, only 98 words could be selected due to phonotactic
constraints (e.g., [ŋ] does not appear in syllable-onset position; [j, w] do not appear in
syllable-coda position).
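As an illustration of this balancing check, an independent-samples t test can be run on each lexical variable; the frequency values in the sketch below are hypothetical placeholders rather than the actual stimulus statistics.

```python
# A sketch of the balancing check: compare the fricative and sonorant
# target words on a lexical variable with an independent-samples t test.
# The frequency values are hypothetical placeholders.

from scipy.stats import ttest_ind

fricative_freq = [12, 5, 33, 8, 21, 14]  # hypothetical word frequencies
sonorant_freq = [10, 7, 29, 11, 18, 16]

t, p = ttest_ind(fricative_freq, sonorant_freq)
print(f"word frequency: t = {t:.2f}, p = {p:.2f}")  # balanced if p > .05
```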
For each word, two non-word primes were constructed by replacing the target
phoneme in the real word: one with a phoneme matching the value of the feature
[sonorant] in the target phoneme, and the other with a sound mismatching the value of the
feature [sonorant] in the target phoneme. For example, for the word conform, where the
target phoneme is [f], a matching prime was [knrm] and a mismatching prime was
[knwrm].
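The logic of the prime construction can be sketched as follows. The replacement segments in the actual study were chosen by the experimenters, so the selection rule below (taking the first available segment) is only an illustrative assumption.

```python
# A sketch of prime construction: replace the target phoneme with a segment
# that matches or mismatches it in [sonorant]. The selection rule here is
# an illustrative assumption, not the study's hand-selected replacements.

OBSTRUENTS = ["f", "v", "s", "z", "p", "b", "t", "d", "k", "g"]
SONORANTS = ["m", "n", "l", "r", "j", "w"]

def make_primes(segments, index):
    """Return (matching, mismatching) primes for the phoneme at `index`."""
    target = segments[index]
    same = OBSTRUENTS if target in OBSTRUENTS else SONORANTS
    other = SONORANTS if target in OBSTRUENTS else OBSTRUENTS
    match = next(s for s in same if s != target)  # same [sonorant] value
    mismatch = other[0]                           # opposite [sonorant] value
    build = lambda seg: segments[:index] + [seg] + segments[index + 1:]
    return build(match), build(mismatch)

# e.g., conform with the target [f] in fourth position:
print(make_primes(["k", "ə", "n", "f", "ɔ", "r", "m"], 3))
```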
In addition to the real-word targets, 98 pronounceable non-word fillers were
constructed to serve as non-word targets. Similar to the word-target setup, these non-words included both two-syllable and three-syllable items with the target phoneme in
either the onset or coda of a stressed syllable. The 14 target phonemes were identical to
those used in the word targets. For each non-word, two non-word primes were
constructed by replacing the target phoneme in the non-word: one with a phoneme
matching the value of the feature [sonorant] in the target phoneme, and the other with a
sound mismatching the value of the feature [sonorant] in the target phoneme. For
example, for the target [rflv], where the critical sound is [f], a feature-matching
prime was [rblv] and a mismatching prime was [rmlv]. In other words, the
prime-target relationship in the non-word target set was identical to that in the word
target set. The complete set of stimuli is listed in Appendix A.
Participants
Forty undergraduate students (25 females and 15 males) at Ohio University
participated in the experiment. All were native speakers of American English with self-
reported normal hearing, speech, and language. They received partial course credit for
participating in the experiment.
Procedure
The stimuli were recorded by a phonetically-trained female speaker of American
English. The recording was made in a sound-treated booth in the School of Hearing,
Speech and Language Sciences at Ohio University with a high-quality microphone
(Audio-Technica AT825 field recording microphone) connected through a preamplifier
and A/D converter (USBPre microphone interface) to a Windows personal computer
(Dell). The recording was sampled using the Brown Lab Interactive Speech System
(BLISS, Mertus, 2000) at 20 kHz with 14-bit quantization. The stimuli, saved as
individual audio files, were imported to AVRunner, the subject-testing program in BLISS,
for stimulus presentation.
Two stimulus lists were constructed with the following considerations. The
relationship between the prime and target (match, mismatch) was intended to be a within-subject factor. It was also determined that participants were not to hear the same stimulus
more than once during the experiment to avoid any familiarity or learning effects. To
these ends, two stimulus lists were constructed so that, for a given target, each of the
two primes was assigned to a different list; each list thus included both prime
types without repeating any stimulus. The fillers (non-word primes and non-word targets)
were assigned to the two lists in the same way. Therefore, no primes or targets were
repeated in any list. In sum, each list included 196 targets (98 word targets and 98 non-word targets) with the two prime types (match, mismatch) equally distributed. Each
participant was randomly assigned to be tested on one list only. The presentation of lists
was counterbalanced across participants such that the two lists were presented equally
often across participants. For each participant, AVRunner assigned a uniquely
randomized presentation order such that no two participants received the same order of
presentation. The inter-stimulus interval between the prime and target was 50
milliseconds. The inter-trial interval was three seconds.
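The list construction amounts to a simple counterbalancing scheme. A minimal sketch is given below, assuming item triples of (target, matching prime, mismatching prime); the item format is an assumption for illustration.

```python
# A minimal sketch of the two-list counterbalancing: each target occurs once
# per list, its two primes are split across the lists, and each participant
# hears a freshly randomized order. The item structure is an assumed format.

import random

def build_lists(items):
    """items: (target, match_prime, mismatch_prime) triples.
    Returns two lists of (prime, target) trials."""
    list_a, list_b = [], []
    for i, (target, match, mismatch) in enumerate(items):
        if i % 2 == 0:  # alternate so both lists get both prime types
            list_a.append((match, target))
            list_b.append((mismatch, target))
        else:
            list_a.append((mismatch, target))
            list_b.append((match, target))
    random.shuffle(list_a)  # unique presentation order per participant
    random.shuffle(list_b)
    return list_a, list_b
```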
Participants were tested individually in a quiet room in the School of Hearing,
Speech and Language Sciences at Ohio University. They listened to the stimuli through a
pair of high-quality headphones (Koss R80) connected to a Windows personal computer
(Dell). The participants were told that they would be listening to pairs of auditory stimuli,
where the second item could be a real word or a non-word in English. Their task was to
judge whether the second item in a pair was a real word or a non-word by pressing the
computer keys labeled with YES (for real words) or NO (for non-words). They were also
instructed to respond as quickly as possible as reaction time would be measured. Prior to
the actual experiment, 10 practice trials, none of which appeared in the experiment proper, were
given to familiarize the participants with the experimental procedure.
Many of the target words were of low frequency, and it was possible that
participants might not know some of them. Following the experiment, participants
received a word check sheet to test whether any target words were unfamiliar to them.
The sheet listed 20 phonotactically legal non-words and the 20 target words having the
lowest word frequencies. Participants were asked to circle all items they thought were not
words. If a participant circled any real words, those words were removed from that
participant's data set.
Data analysis
Response accuracy and reaction time were recorded by BLISS automatically.
Reaction time was measured from the onset of the target. Only responses to real word
targets were analyzed and only correct responses were included in the reaction time
analysis. Repeated measures ANOVAs were conducted on response accuracy and
reaction time with relation between prime and target (match, mismatch) and target
phoneme (fricative, sonorant) as fixed factors and participants as a random factor.
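A sketch of this analysis is shown below, assuming long-format data with one mean per participant per cell; the column names and reaction-time values are placeholders, not the reported data.

```python
# A sketch of the 2 (relation) x 2 (target phoneme) repeated measures ANOVA
# on reaction time, using statsmodels' AnovaRM. Values are placeholders.

import pandas as pd
from statsmodels.stats.anova import AnovaRM

df = pd.DataFrame({
    "subject":  [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "relation": ["match", "match", "mismatch", "mismatch"] * 3,
    "phoneme":  ["fricative", "sonorant"] * 6,
    "rt":       [960, 930, 995, 935, 955, 940, 990, 945, 970, 938, 1000, 941],
})

result = AnovaRM(df, depvar="rt", subject="subject",
                 within=["relation", "phoneme"]).fit()
print(result)
```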
Results
Figure 1 shows the average reaction time of lexical decision by relation and target
phoneme. The ANOVA revealed a significant main effect of relation (F (1, 39) = 7.88, p
< .01). In particular, response was faster when there was a match between the prime and
target phonemes (950 ms) than when there was no match (968 ms). The main effect of
target phoneme was also significant (F (1, 39) = 10.97, p < .005). Specifically, response
was faster for sonorants (943 ms) than for fricatives (976 ms). The relation-target
phoneme interaction was also significant (F (1, 39) = 5.82, p < .05). As Figure 1 shows,
the interaction arose because a [sonorant] feature mismatch slowed responses to
fricative targets but not to sonorant targets.
Table 1 shows the average number of errors in the lexical decision task by relation
and target phoneme. Overall, participants made very few errors. Still, the ANOVA
revealed a significant main effect of target phoneme (F (1, 39) = 41.6, p < .0001).
Specifically, response was more accurate for sonorants (0.7 out of 49) than for fricatives
(1.7 out of 49). The relation-phoneme interaction was also significant (F (1, 39) = 5.03, p
< .05). The interaction arose because the [sonorant] feature mismatch resulted in more
errors for fricatives but not for sonorants. The pattern of errors is similar to that of the
reaction time, indicating no tradeoff between speed and accuracy.
Summary
As predicted, lexical decision response was faster when there was a match in the
feature [sonorant] between the prime and target, indicating that a match or mismatch in this
feature does impact lexical processing. However, the feature match facilitated response
only when the target phoneme was a fricative consonant. In contrast, the feature match
did not make a difference when the target phoneme was a sonorant consonant. This
pattern is identical to what was found in Marks et al. (2002) with a word reconstruction
task. Together these findings suggest that the effect of feature match hinges on the type of
target phoneme involved. A mismatch in [sonorant] disrupted response to fricative targets
but not sonorant targets.
Would this result generalize to all obstruents? The next experiment examined the
feature match effect with another group of obstruent consonants, the stop consonants, to
evaluate whether the feature match effect was limited only to fricative consonants.
EXPERIMENT 2
Method
Materials
Ninety-two English words were selected to be the real-word targets in the priming
experiment. All words have one of 13 target phonemes including six stop consonants [p,
b, t, d, k, ɡ] and seven sonorant consonants [m, n, ŋ, l, r, j, w]. As in the previous
experiment, half of the words have two syllables and the other half have three syllables.
Half of the words have the target phoneme in the onset of the stressed syllable and the
other half in the coda of the stressed syllable.
Target words in the stop and sonorant lists were balanced for variables affecting
lexical access. A set of t tests showed no significant difference in word frequency (p =
0.57), number of segments (p = 0.45), number of consonants (p = 0.38), and uniqueness
point (p = 0.74). The consonant change occurred before the uniqueness point in all target
words. Ideally there would be a total of 104 items, including 13 (six stops and seven
sonorants) x 2 (two- vs. three-syllables) x 2 (syllable onset vs. coda) x 2 (tokens).
However, only 92 words could be selected due to phonotactic constraints, as was noted in
Experiment 1.
For each word, two non-word primes were constructed by replacing the target
phoneme in the real word: one with a phoneme matching the value of the feature
[sonorant] in the target phoneme, and the other with a phoneme mismatching the value of
the feature [sonorant] in the target phoneme. For example, for the word pothole, where
the target phoneme is [p], a matching prime was [bɑthol] and a mismatching prime was
[wɑthol].
In addition to the real-word targets, 92 pronounceable non-word fillers were
constructed to serve as non-word targets. Similar to the word-target setup, these non-words included both two-syllable and three-syllable items with the target phoneme in
either the onset or coda of a stressed syllable. The target phonemes were identical to
those used in the word targets. For each non-word, two non-word primes were
constructed by replacing the target phoneme in the non-word: one with a sound matching
the value of the feature [sonorant] in the target phoneme, and the other with a sound
mismatching the value of the feature [sonorant] in the target phoneme. For example, for
the target [ptl], where the critical sound is [p], a feature-matching prime was
[stl] and a mismatching prime was [ntl]. In other words, the prime-target
relationship in the non-word target set was identical to that in the word target set. The
complete set of stimuli is listed in Appendix B.
Participants
Forty undergraduate students (26 females and 14 males) at Ohio University
participated in the experiment. All were native speakers of American English with self-reported normal hearing, speech, and language. They received partial course credit for
participating in the experiment. None of the participants participated in the previous
experiment.
Procedure
The procedure was identical to that of Experiment 1.
Results
Figure 2 shows the average reaction time of lexical decision by relation and target
phoneme. The ANOVA revealed no significant main effects or interaction. The average reaction
time for the matching relation was 925 ms (SD = 121); for the mismatching relation it
was 933 ms (SD = 117). The average reaction time for stops was 926 ms (SD = 119); for
sonorants it was 933 ms (SD = 119). Although a mismatch appeared to slow responses to
stops more than to sonorants, the interaction was not statistically significant.
Table 2 shows the average number of errors in the lexical decision task by relation
and target phoneme. Again, participants made very few errors. The ANOVA revealed a
significant main effect of relation (F (1, 39) = 40.03, p < .0001). In particular, response
was more error-prone when the prime and target phonemes matched in the feature [sonorant] (1.8
out of 46) than when they did not match (0.9 out of 46). This result is rather
counter-intuitive given that one would expect the matching relation to facilitate responses
and to generate fewer errors. However, since the actual number of errors is very small
(4% for match and 2% for mismatch), the statistical difference found here may not be
meaningful.
Summary
In contrast to the fricative-sonorant comparison in Experiment 1, results from the
stop-sonorant comparison showed no significant reaction time difference between the
match and mismatch conditions for either stop or sonorant target phonemes. While the
null result for sonorant target phonemes was consistent with the finding from Experiment
1, the lack of effect for stop consonants suggests that the feature matching effect did not
apply to all obstruent consonants.
GENERAL DISCUSSION
The research question in this study was whether a match or mismatch in the
feature [sonorant] would impact spoken word recognition by humans. Two form priming
experiments with the lexical decision task investigated this issue. The results showed that
reaction was faster when the prime and target matched in [sonorant] for obstruent
consonants but not sonorant consonants. The results further showed that the match effect
was restricted to fricative consonants only. Stop consonants did not show the match effect.
The first result replicated findings from Marks et al. (2002), who found word
reconstruction to be more accurate when target and replacing segments were matched on
the feature [sonorant] than when they were mismatched. This effect occurred for
obstruents but not sonorants, just as was found in the present study. Taken together,
the finding that feature mismatch resulted in more errors and increased reaction time in
word recognition indicates that distinctive features, originally posited on the basis of
articulation and acoustics, are also implicated in perceptual processing. The findings also
challenge the assumption that all phonetic segments and features are treated equally in
spoken word recognition.
What could be the reason for the different patterns between sonorants and
obstruents? As noted earlier, Marks et al. (2002) speculated that sonorants were
phonetically similar to vowels, which could explain why a feature mismatch did not
disrupt word reconstruction for sonorants as substantially as it did for obstruents.
Articulatorily, both are consonants, which are produced with a narrow constriction in the
vocal tract. Acoustically, the formation and subsequent release of the constriction
each introduce an acoustic discontinuity into the signal. However,
obstruents and sonorants are different in that the articulation of sonorants involves an
abrupt switching of the airflow to a different path in the vocal tract without a substantial
increase in intraoral air pressure (Stevens, 1998, 2002). Liu (1996) developed a sonorant
detector as part of an algorithm for automatic speech recognition. She noted that energy
in the second to fourth formant range decreases at the formation and increases at the
release of a sonorant consonant. During the constriction, the spectrum remains relatively
steady, especially at low frequencies, since the vocal tract shape is relatively constant.
Given that vowel landmarks are where there is maximum amplitude in the first formant
range (Stevens, 2002), there seems to be some merit to the argument that sonorant
consonants are phonetically similar to vowels. This acoustic similarity could account
for the obstruent-sonorant contrast observed in Marks et al. (2002) and the current study.
This account is also consistent with the finding that vowels are appreciably easier to
change in the word reconstruction task (van Ooijen, 1996; Cutler et al., 2000).
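Liu's acoustic observation can be made concrete with a rough band-energy computation, sketched below. The band edges and the decision threshold are illustrative assumptions, not the parameters of her detector.

```python
# A rough sketch of the acoustic pattern noted above (cf. Liu, 1996): at the
# formation of a sonorant constriction, energy in the F2-F4 region drops
# while energy near F1 stays comparatively steady. Band edges and the
# threshold are illustrative assumptions, not Liu's actual parameters.

import numpy as np
from scipy.signal import spectrogram

def band_energy_db(signal, fs, lo, hi):
    """Frame-by-frame energy (dB) in the band [lo, hi] Hz."""
    f, t, sxx = spectrogram(signal, fs=fs, nperseg=int(0.025 * fs))
    band = (f >= lo) & (f <= hi)
    return t, 10 * np.log10(sxx[band].sum(axis=0) + 1e-12)

def sonorant_closure_candidates(signal, fs, drop_db=9.0):
    """Frames where F2-F4 energy falls sharply but F1-region energy holds."""
    t, high = band_energy_db(signal, fs, 1200, 3500)  # rough F2-F4 region
    _, low = band_energy_db(signal, fs, 100, 800)     # rough F1 region
    d_high, d_low = np.diff(high), np.diff(low)
    return t[1:][(d_high < -drop_db) & (np.abs(d_low) < drop_db / 3)]
```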
The finding that responses to fricatives are different from responses to stops is
also noteworthy. Although obstruents overall generated a feature match effect in Marks et
al. (2002), the contrast between the two experiments in the current study further showed
that the source of the effect is limited to fricatives. The feature match effect is thus not
spread evenly across all obstruents. As noted, the feature [continuant] classifies obstruent
consonants into fricatives and stops. Stops are produced with a complete closure in the
vocal tract; therefore they generate abrupt amplitude decrease and increase at consonant
closure and release. Fricatives, on the other hand, are produced with a sustained narrow
constriction and thus continuous turbulent noise (Stevens, 1998, 2002, 2005). Given their
distinct articulatory and acoustic properties, it is perhaps not surprising that they are
processed differently during spoken word recognition, as revealed by the current study.
How would the current results compare to other spoken word recognition studies
using the form priming paradigm? In the literature, two types of form priming have been
shown. First, the prime and target overlap at the stimulus onset in one or more segments,
using either words or pseudowords as primes. Priming has been shown with both lexical
decision and naming tasks. Second, the prime and target overlap in the rime of the target
word. Rime overlap generally leads to priming more often than does onset overlap, and
pseudoword primes produce greater facilitation than word primes. Rime priming has been
shown with both lexical decision and naming tasks.
Several studies illustrate these two types of priming. Radeau, Morais and Segui
(1995) used three-phoneme words as both primes and targets in lexical decision and
naming tasks. The primes overlapped the targets in either the first two segments or the
last two segments. Final overlap produced facilitation in both tasks, but initial overlap
produced no facilitation. Slowiaczek, McQueen, Soltano and Lynch (2000) also showed
final overlap priming with monosyllabic words in both naming and continuous lexical
decision tasks. The rime was a major contributor to the priming effect, but the amount of
phonological overlap was also an important contributor.
The Radeau et al. (1995) and Slowiaczek et al. (2000) studies used monosyllabic
targets, but the targets in our two studies were 2- and 3-syllable words. Marslen-Wilson,
Moss and van Halen (1996, Exp. 1) showed rime priming with 2- and 3-syllable Dutch
words. These words were semantic mediators to targets in a lexical decision task.
Nonword primes to the mediators differed from the mediators in only the first segment.
Emmorey (1989) used pairs of 2-syllable words in a lexical decision task. Words with a
strong-weak syllabic stress pattern (the pattern in 79% of our 2-syllable words) showed
large priming effects when primes and targets shared the last syllable. Sharing only the
rime (vowel plus final consonants) did not produce priming. Burton (1992) also used 2-syllable primes and targets in lexical decision and naming (shadowing) tasks. Both tasks
showed facilitation for second syllable overlap but no effect for initial syllable overlap.
Priming with initial overlap was shown by Corina (1992). Two-syllable items,
overlapping in the first syllable, produced significant priming in a lexical decision task.
The present experiments used primes that differed from the targets in only one
segment, showing a mix of onset and rime overlap. As noted in the Method, the target
segment varied in its position in the target word. For 66 (41%) of the words, the target
fell in the onset of the word. This is the condition for rime priming. An additional 12
items (7%) fell in the coda of the second or third syllable. This is the condition for onset
priming. The remaining 83 items fell in the onset of the second or third syllable or the
rime of the first syllable, conditions that do not clearly match the conditions for either
onset or rime priming. Nonetheless, post hoc analyses showed no interactions between
feature matching and target location (onset vs. coda) or between feature matching and
number of syllables (2 vs. 3), indicating the feature matching effect was uniform across
all stimuli.
The results reported in this study have some implications for models of spoken
word recognition. Three models of spoken word recognition incorporate feature
processors: TRACE (McClelland & Elman, 1986), the Distributed Cohort Model
(Gaskell & Marslen-Wilson, 1997), and Stevens' Feature-Based Model (Stevens, 2002,
2005). In TRACE, speech input is first assessed by a set of feature detectors. The effect
of a feature mismatch in the prime, as used in the present two experiments, is to reduce
the number of appropriate features activated for the target segment and its target word,
thereby reducing activation of the target word and its probability of being recognized. In
this manner, TRACE explains the effect of feature mismatch when fricatives are targets.
When the targets were stops or sonorants, however, there was no effect for feature
mismatch, and TRACE seemingly has no mechanism for explaining that outcome.
The Gaskell and Marslen-Wilson (1997) model is a distributed connectionist
model that represents lexical knowledge in a distributed substrate having abstract
representation of both the forms and meanings of words. Successive sets of distinctive
features, representing connected speech, are input to the model. The identity of the
phonological form of a word is assessed by the goodness of fit between the output
computed by the model and the nearest word entry in the distributed network. The effect
of feature mismatch in the primes, as occurred with fricatives in our Experiment 1, can be
explained by the poorer goodness of fit. When feature mismatch occurred with stops and
sonorants, as in our Experiment 2, the model predicts poorer goodness of fit as well, but
the results of the experiment showed no increase in latencies relative to the match
condition.
Stevens' Feature-Based Model estimates distinctive features from the acoustic
properties and landmarks in the speech signal. These features are matched against
ordered bundles of features representing the phonology of words in the mental lexicon.
Word identification occurs when a sequence of estimated features matches a sequence in
the mental lexicon. When a feature mismatch occurs, the latency for identifying a word
should increase, as occurred in the feature mismatch for fricatives in our Experiment 1.
The model does not explain why latencies were unaffected by the feature mismatch
condition for stops and sonorants.
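The prediction shared by all three models can be illustrated with a toy featural-fit computation, sketched below. The feature vectors and the assumed link between fit and latency are illustrative only, not taken from any of the models.

```python
# A toy illustration of the shared prediction: a prime mismatching the
# target in [sonorant] yields a poorer featural fit, which should slow
# recognition. Vectors and the fit-latency link are illustrative assumptions.

def featural_fit(prime, target):
    """Proportion of positions where prime and target share [sonorant]."""
    return sum(p == t for p, t in zip(prime, target)) / len(target)

# one [sonorant] value per segment position: +1 sonorant, -1 obstruent
target = [-1, +1, +1, -1, +1, +1, +1]          # e.g., conform
match_prime = [-1, +1, +1, -1, +1, +1, +1]     # [f] -> another obstruent
mismatch_prime = [-1, +1, +1, +1, +1, +1, +1]  # [f] -> a sonorant

print(featural_fit(match_prime, target))     # 1.0: full agreement
print(featural_fit(mismatch_prime, target))  # ~0.86: one-feature mismatch
```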
REFERENCES
Burton, M. W. (1992, November). Syllable priming in auditory word recognition.
Poster presented at the 33rd Annual Meeting of the Psychonomic Society, St. Louis, MO.
Chomsky, N. and Halle, M. (1968). The Sound Pattern of English. New York:
Harper & Row.
Clements, G. N. (1985). The geometry of phonological features. Phonology
Yearbook, 2, 225-252.
Connine, C. M., Blasko, D. G., and Titone, D. (1993). Do the beginnings of words
have a special status in auditory word recognition? Journal of Memory & Language, 32,
193-210.
Connine, C. M., Blasko, D. G., and Wang, J. (1994). Vertical similarity in spoken
word recognition: Multiple lexical activation, individual differences, and the role of
sentence context. Perception & Psychophysics, 56, 624-636.
Corina, D. P. (1992). Syllable priming and lexical representations: Evidence from
experiments and simulations. In Proceedings of the Fourteenth Annual Conference of the
Cognitive Science Society (pp. 779-784). Bloomington: Indiana University.
Cutler, A., Sebastián-Gallés, N., Soler-Vilageliu, O., and van Ooijen, B. (2000).
Constraints of vowels and consonants on lexical selection: Cross-linguistic comparisons.
Memory & Cognition, 28, 746-755.
Emmorey, K. D. (1989). Auditory morphological priming in the lexicon.
Language and Cognitive Processes, 4, 73-92.
Gaskell, M. G., and Marslen-Wilson, W. D. (1997). Integrating form and
meaning: A distributed model of speech perception. Language and Cognitive Processes,
12, 613-656.
Halle, M. (1992). Features. In W. Bright (ed.), Oxford International Encyclopedia
of Linguistics (pp. 207-212). New York: Oxford University Press.
Halle, M., and Stevens, K. N. (1991). Knowledge of language and the sounds of
speech. In J. Sundberg, L. Nord, and R. Carlson (Eds.), Music, Language, Speech and
Brain (pp. 1-19). London: MacMillan.
Jakobson, R., Fant, C. G. M., and Halle, M. (1952). Preliminaries to speech
analysis: the distinctive features and their correlates. MIT Acoustics Laboratory
Technical Report 13. Reprinted 1967, Cambridge, MA: MIT Press.
Kenstowicz, M. (1994). Phonology in Generative Grammar. Cambridge:
Blackwell Publishers.
Liu, S. A. (1996). Landmark detection for distinctive feature-based speech
recognition. Journal of the Acoustical Society of America, 100, 3417-3430.
Luce, P. A., and Pisoni, D. B. (1998). Recognizing spoken words: The
Neighborhood Activation Model. Ear & Hearing, 19, 1-36.
Marks, E. A., Moates, D. R., Bond, Z. S., and Stockmal, V. (2002). Word
reconstruction and consonant features in English and Spanish. Linguistics, 40, 421-438.
Marks, E. A., Moates, D. R., Bond, Z. S., and Vazquez, L. (2002). Vowel
mutability: The case of monolingual Spanish listeners and bilingual Spanish-English
listeners. Southwest Journal of Linguistics, 21, 73-99.
Marslen-Wilson, W. D. (1993). Issues of process and representation in lexical
access. In G. Altmann & R. Shillcock (Eds.), Cognitive Models of Speech Processing:
The Second Sperlonga Meeting (pp. 187-210). Hillsdale, NJ: Erlbaum.
Marslen-Wilson, W. D., Moss, H. E., and van Halen, S. (1996). Perceptual
distance and competition in lexical access. Journal of Experimental Psychology: Human
Perception and Performance, 22, 1376-1392.
Marslen-Wilson, W. D., and Warren, P. (1994). Levels of perceptual
representation and process in lexical access: Words, phonemes and features.
Psychological Review, 101, 653-675.
Marslen-Wilson, W. D., and Welsh, A. (1978). Processing interactions and lexical
access during word recognition in continuous speech. Cognitive Psychology, 10, 29-63.
Marslen-Wilson, W. D., and Zwitserlood, P. (1989). Accessing spoken words:
The importance of word onsets. Journal of Experimental Psychology: Human Perception
& Performance, 15, 576-585.
McCarthy, J. J. (1988). Feature geometry and dependency: a review. Phonetica,
45, 84–108.
McClelland, J. L., and Elman, J. L. (1986). The TRACE model of speech
perception. Cognitive Psychology, 18, 1-86.
Mertus, J. A. (2000). BLISS: The Brown Lab Interactive Speech System. Brown
University.
Milberg, W., Blumstein, S. E., and Dworetzky, B. (1988). Phonological factors in
lexical access: Evidence from an auditory lexical decision task. Bulletin of the
Psychonomic Society, 26, 305-308.
Moates, D. R., Sutherland, M. T., Bond, Z. S., and Stockmal, V. The feature
[continuant] in word reconstruction. Manuscript in preparation.
Norris, D. (1994). SHORTLIST: A connectionist model of continuous speech
recognition. Cognition, 52, 189-234.
van Ooijen, B. (1996). Vowel mutability and lexical selection in English:
Evidence from a word reconstruction task. Memory & Cognition, 24, 573-583.
Radeau, M., Morais, J., and Segui, J. (1995). Phonological priming between
monosyllabic spoken words. Journal of Experimental Psychology: Human Perception
and Performance, 21, 1297-1311.
Slowiaczek, L. M., McQueen, J. M., Soltano, E. G., and Lynch, M. (2000).
Phonological representations in prelexical speech processing: Evidence from form-based
priming. Journal of Memory and Language, 43, 530-560.
Stevens, K. N. (1972). The quantal nature of speech: Evidence from articulatory-acoustic data. In P. B. Denes and E. E. David, Jr. (Eds.), Human Communication: A
Unified View (pp. 51–66). New York: McGraw-Hill.
Stevens, K. N. (1989). On the quantal nature of speech. Journal of Phonetics, 17,
3-46.
Stevens, K. N. (1997). Articulatory-acoustic-auditory relationships. In W. J.
Hardcastle and J. Laver (Eds.), The Handbook of Phonetic Sciences (pp. 507-538).
Oxford: Blackwell Publishers.
Stevens, K. N. (1998). Acoustic Phonetics. Cambridge: MIT Press.
Stevens, K. N. (2002). Toward a model for lexical access based on acoustic
landmarks and distinctive features. Journal of the Acoustical Society of America, 111,
1872-1891.
Stevens, K. N. (2005). Features in speech perception and lexical access. In D. B.
Pisoni and R. E. Remez (Eds.), The Handbook of Speech Perception (pp. 125-155).
Cambridge: Blackwell Publishers.
Zwitserlood, P. (1996). Form priming. Language and Cognitive Processes, 11,
589-596.
ACKNOWLEDGMENTS
We thank Sara Kellgreen for administering the experiment and assisting in data
analysis and Carla Youngdahl for recording the materials. We also thank Z. S. Bond for
many helpful discussions.
Table 1. Average number of errors (out of a possible 49) in the lexical decision task in
Experiment 1. Standard deviations are shown in parentheses.

                          Prime-target relation
Target phoneme        Match            Mismatch
Fricatives            1.53 (1.38)      1.88 (1.54)
Sonorants             0.80 (1.09)      0.60 (0.81)
Table 2. Average number of errors (out of a possible 46) in the lexical decision task in
Experiment 2. Standard deviations are shown in parentheses.

                          Prime-target relation
Target phoneme        Match            Mismatch
Stops                 1.65 (0.89)      1.05 (0.93)
Sonorants             1.85 (0.95)      0.73 (0.88)
FIGURE CAPTIONS
Figure 1. Average reaction time (+SE) for fricative and sonorant targets in
Experiment 1.
Figure 2. Average reaction time (+SE) for stop and sonorant targets in Experiment
2.