The interplay of root, suffix and whole-word frequency in processing derived words*

Cristina Burani and Anna M. Thornton

In three lexical decision experiments we investigated whether the relative frequency of root, derivational suffix and whole word affects the processing of printed Italian derived stimuli. Experiment 1 considered pseudowords made up of pseudoroots combined with high-, medium-, or low-frequency suffixes. Only pseudowords with high-frequency suffixes resulted in increased decision times and higher error rates relative to nonsuffixed pseudowords. Experiments 2 and 3 dealt with suffixed derived words. In Experiment 2, low-frequency words with high- or low-frequency roots and with high- or low-frequency suffixes were orthogonally contrasted. Lexical decision latencies were a function of the frequency of both the root and the suffix. However, post-hoc comparisons showed an effect of whole-word familiarity. In Experiment 3, low-frequency derived words with orthogonal variation of root and suffix frequency, and equal whole-word familiarity, were investigated and contrasted with low-frequency nonderived words. Words with high-frequency roots showed quicker and more accurate lexical decision responses, irrespective of suffix frequency. By contrast, words with low-frequency roots, irrespective of suffix frequency, did not differ from nonderived words. These results are interpreted within Schreuder and Baayen’s (1995) parallel dual-route model of morphological processing, as evidence for both benefits and costs of morphemic access, due to the balancing of the quantitative characteristics of root, suffix and whole word.

1. Introduction

In most models of morphological processing, it is assumed that the probability of accessing the morphological constituents of words is conditioned, at the different processing stages, by many properties of the morphologically complex words.
Among these properties, the frequency of morphological constituents relative to the frequency of the complex word as a whole form can play a major role. Evidence for reliance on morphological structure in accessing printed complex words comes from low-frequency words which include higher-frequency constituents (see, e.g., Andrews 1986; Burani and Caramazza 1987; Meunier and Segui 1999). In parallel dual-route models of lexical access (see, e.g., Burani and Laudanna 1992; Chialant and Caramazza 1995; Frauenfelder and Schreuder 1992; Schreuder and Baayen 1995), words composed of more than one morpheme may activate in parallel two types of access units, namely units corresponding to the whole word and units corresponding to the morphemes included in the stimulus. In these models, the relative frequency of the whole word and of the constituent morphemes affects the relative time-course of activation of the different units in the different components. Hence, frequency is the major determinant of the relative probability that lexical access is either whole-word-based or morpheme-based. The assumption underlying these models is that the higher the frequency of a given lexical unit, be it a word, a root or an affix, the greater the likelihood that this unit is quickly activated and processed in the different processing components. What is crucial in determining the probability that lexical access is provided by either whole-word or morpheme processing is the complex balance between the frequency of the whole word and the frequency of its constituent morphemes, both roots and affixes; that is, what matters is relative frequency rather than absolute frequency (for a similar proposal, see Hay 2000; 2001).
Hence, it might be predicted that a transparent derived word which has low frequency in the language, like Italian bassezza (‘lowness’), but is composed of a very frequent root (i.e., bass-, ‘low’) and a very frequent suffix (i.e., -ezza, ‘-ness’) is likely to be accessed via activation of its morphemic constituents, rather than via the unit corresponding to the whole word, which is supposed to be only weakly activated. This prediction implies that the frequencies of both the root and the suffix are capable of affecting processing, and calls for evidence concerning the roles of both root and suffix frequency. Surprisingly, however, only the frequency of roots has been considered so far, while the frequency of affixes has usually been neglected. No study on lexical access to derived words has systematically varied the quantitative values of derivational affixes. By contrast, these values have been investigated in the context of pseudoword processing (see below).

1.1. Studies on words

Several studies conducted in different languages, including English, Italian, French and Dutch, have shown that access times and accuracy for suffixed derived words are significantly affected by root frequency (see, e.g., Beauvillain 1996; Bradley 1979; Burani and Caramazza 1987; Colé, Beauvillain and Segui 1989; Holmes and O’Regan 1992; Schreuder, Burani, and Baayen 2002). Lexical decisions were faster and more accurate when a suffixed word, usually of low frequency, included a root of high frequency. The facilitatory effect of high-frequency root morphemes was found both when calculation of root frequency included the frequency of the base word and its inflected forms only (Burani and Caramazza 1987), and when it was extended to include the frequencies of all the derived word-forms sharing the same root (Colé, Beauvillain and Segui 1989).
The root frequency effect has been found in the context of suffixed derived words that were both orthographically and phonologically transparent with respect to their base root, and were usually semantically transparent with respect to the meaning of their base. However, there has also been evidence of root frequency effects for derived words that included bound roots, or could be rated as semantically opaque with respect to their base (see, e.g., Holmes and O’Regan 1992; Schreuder, Burani and Baayen 2002). In the studies in which root frequency effects have been found, suffixes were usually productive and had high frequency. Suffix frequency was not directly investigated per se, but it was usually kept constant across categories by including the same suffixes in the high-frequency and low-frequency root sets.

1.2. Studies on pseudowords

An investigation of suffix frequency per se was recently made by adopting pseudoword contexts made up of illegal root-suffix combinations. Burani et al. (1997) submitted, to both lexical decision and naming, pseudowords that were made up of real roots combined with derivational suffixes not compatible with the root. In order to demonstrate that the probability of access through activation of morphemic units corresponding to suffixes is constrained by their frequency values (see also Laudanna and Burani 1995), Burani et al. (1997) made use of suffixes belonging to two distinct frequency ranges. In one experimental set, roots were combined with high-frequency suffixes, and the resulting pseudowords were contrasted with pseudowords in which the same roots were combined with control sequences that had analogous orthographic frequency in final position of Italian words, but were not suffixes. In the second set, a comparison was made between pseudowords composed of roots plus low-frequency suffixes, and the same roots combined with control low-frequency orthographic final sequences.
Roots were in both cases of medium frequency. In order to control for asymmetries in the possibility of assigning meaning to suffixed pseudowords of the two kinds (i.e., with high- and low-frequency suffixes, respectively), suffixed pseudowords in the two frequency sets were matched for mean interpretability values derived from participants’ empirical ratings. The lexical decision results by Burani et al. (1997) showed that the interference effect which is usually found on pseudowords that include real affixes (see, e.g., Caramazza, Laudanna and Romani 1988; Taft and Forster 1975; see also, for derivational suffixes, Jarvella and Wennstedt 1993) is conditioned by the frequency of the embedded suffixes: Longer reaction times and higher error rates were found, with respect to control pseudowords, only when pseudowords included high-frequency suffixes. By contrast, pseudowords with low-frequency suffixes took no longer to be rejected than control pseudowords. From these results on suffixed pseudowords, Burani et al. (1997) concluded that the probability that suffixes will affect processing is conditioned by their frequency (see also Laine 1996, for Finnish productive derivational suffixes causing interference effects on pseudoword lexical decision).

1.3. Suffix frequency, suffix numerosity, and productivity

In considering the frequency of suffixes, two main quantitative measures can be adopted. On the one hand, frequency in the proper sense is calculated on word tokens, by summing up the cumulative frequency in a given corpus of all the word tokens in which a given suffix occurs. On the other hand, suffix frequency can be measured by calculating the number of word types in which a given suffix occurs in a given language. This second measure could be named numerosity of the suffix (as proposed by Burani et al. 1995).
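The two measures just defined can be illustrated with a minimal sketch. It assumes a corpus given as a word-to-count dictionary and identifies suffixed words by simple string matching on the word ending; both are simplifications introduced here, since string matching also counts pseudosuffixed words, which a real suffix count would exclude through morphological analysis. The function name `suffix_counts` and the toy counts are hypothetical, not values from the corpus used in the study.

```python
from collections import defaultdict

def suffix_counts(word_freqs, suffixes):
    """Return {suffix: (token_frequency, numerosity)}.

    token_frequency: summed corpus frequency of all words ending in the suffix.
    numerosity: number of distinct word types ending in the suffix.
    Assumption: an ending match stands in for true morphological suffixation.
    """
    token_freq = defaultdict(int)
    types = defaultdict(set)
    for word, freq in word_freqs.items():
        for suffix in suffixes:
            # Require at least one letter before the suffix.
            if word.endswith(suffix) and len(word) > len(suffix):
                token_freq[suffix] += freq
                types[suffix].add(word)
    return {s: (token_freq[s], len(types[s])) for s in suffixes}

# Hypothetical toy corpus counts (illustrative only).
toy_corpus = {"bassezza": 3, "bellezza": 40, "tristezza": 12, "lavaggio": 5}
counts = suffix_counts(toy_corpus, ["ezza", "aggio"])
print(counts)
```

In this toy case the two measures rank the suffixes in the same order, which mirrors the correlation between token frequency and numerosity that the text goes on to describe; the sketch keeps them separate so that either can be inspected on its own.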
There could be reasons for considering numerosity (i.e., suffix type frequency) as a better quantitative characterization for suffixes and a stronger predictor of performance in access tasks. Suffix numerosity is closely related to suffix productivity, thus allowing the suffix to “emerge” as a separate processing unit (see, e.g., Baayen 1989; 1992; Bybee 1995a). However, there is a strong link and a complex interplay among suffix type and token frequency, productivity and probability of morphemic parsing (Hay and Baayen 2002). In the study by Burani et al. (1997), suffix numerosity and suffix frequency were not disentangled because, after inspection of frequency distributions, it was found that suffix token frequency and suffix numerosity tended to be highly correlated. Suffix frequency was used in a broad sense to subsume the two quantitative measures that could affect processing. Consequently, suffixes were either high or low on both dimensions, frequency and numerosity, calculated on a corpus of Italian written language (Istituto di Linguistica Computazionale CNR 1989).

1.4. Other properties of suffixes relevant for word processing

Some recent research has investigated the role of properties of derivational suffixes in lexical access to words (see Bertram, Laine and Karvinen 1999, for Finnish; Bertram, Schreuder and Baayen 2000, for Dutch). For both Finnish and Dutch, properties like suffix productivity and suffix homonymy (i.e., suffix ambiguity in serving more than one semantic function) were found to affect processing, with words including productive and nonhomonymous suffixes being more likely to induce morpheme-based processing (see also, for Japanese, Hagiwara et al. 1999). In the studies by Bertram, Laine and Karvinen (1999) and Bertram, Schreuder and Baayen (2000), no information was given on the frequency values of the productive vs.
unproductive suffixes, and the issue of suffix frequency/numerosity was not assessed directly. A variation in productivity usually corresponds to a variation in frequency/numerosity. However, although closely related to suffix productivity, suffix frequency and numerosity do not necessarily correspond to productivity. There may be differences in suffix numerosity that do not correspond to differences in suffix productivity. At the same time, it is not always the case that differences in productivity correspond to differences in suffix frequency or numerosity.² Thus there are reasons for investigating suffix frequency/numerosity in derived words, without identifying these quantitative measures with productivity.

2. The present study

While there is evidence that root frequency affects access to printed suffixed derived words, evidence for a role of quantitative properties of suffixes in visual processing comes almost exclusively from pseudowords. The present study aimed at assessing simultaneously the roles of both root and suffix frequency in Italian derived words, by testing different combinations of roots and suffixes with differing frequency. If the role of quantitative properties of morphemes is to be assessed per se, derived words with morphemic constituents of different frequencies should be matched for a number of factors, including orthographic/phonological transparency, semantic transparency, and whole-word frequency. Italian derivation occurs mostly through agglutination of suffixes to roots which are not themselves occurring words (see Peperkamp 1995). Orthographic/phonological transparency of derived forms with respect to their roots is quite common, is widespread across different frequency ranges of both words and morphemes, and can be easily controlled for. Semantic transparency can also be kept under control, while varying suffix frequency/numerosity.
Suffix numerosity, i.e., the number of word types in which a given affix occurs, is one determinant of semantic transparency, but it is not identical to it (see Bybee 1995a). Moreover, in intramodal tasks there might be reasons for expecting effects of morphological constituency also in derived words that are less transparent for meaning or in semantically opaque words (see, e.g., Bentin and Feldman 1990; Feldman and Soltano 1999; Plaut and Gonnerman 2000; Schreuder, Burani and Baayen 2002; Stolz and Feldman 1995; Vannest and Boland 1999; but see also, for contrasting evidence in cross-modal tasks, Marslen-Wilson et al. 1994). Given the basic prediction that low-frequency words with two high-frequency constituents should be the best candidates for access through morphemes, which predictions could be made for low-frequency words that include only one high-frequency constituent (either the root or the affix)? Would lexical access be equally sensitive to the higher-frequency constituent, whichever it is? Or would it be differentially sensitive to the frequency of the root and the affix, respectively? In some studies the assumption has been made that root frequency effects should manifest themselves in low-frequency derived words with frequent and productive suffixes. To our knowledge, there have been no investigations of whether root frequency effects would show up in the context of low-frequency suffixes. Similarly, no study has investigated whether high-frequency suffixes may affect the probability of morpheme-based access in the context of low-frequency roots. In sum, no study has addressed the issue of whether low-frequency derived words that include one or two low-frequency constituents are likely to activate morphemic units at all.
Our predictions were developed in the framework of the model proposed first by Schreuder and Baayen (1995) (see, for recent updates, Baayen and Schreuder 1999; 2000; Baayen, Schreuder and Sproat 2000). This is a race model for the recognition of morphologically complex words in which there are two parallel access routes, one based on whole-form information, and the other based on morphemic decomposition. One assumption of the model is that, for the visual modality, the complete input is available from the start. Thus in principle, for low-frequency derived words, both root and suffix frequency should affect processing at the stages in which morphemic access representations are activated over time by the sensory input. In the framework of this race model, it is crucial to assess the complex balance between access through storage (whole-word activation) and access through computation (morphemic activation). An open issue is how processing proceeds for low-frequency words which include low-frequency constituent morphemes. This implies assessing the relation existing, in terms of processing costs and benefits, between components in which morphemic units are segmented and activated, and subsequent components in which they are re-combined in order to derive meaning. So far, the probability of faster access through whole-word activation has been suggested for high-frequency derived words that tend to be highly lexicalized (Baayen and Neijt 1997; Burani and Laudanna 1992; Bybee 1995b; Chialant and Caramazza 1995; Frauenfelder and Schreuder 1992;³ Schreuder and Baayen 1995). Additionally, whole-word storage has been proposed for English derived words which include non-neutral affixes and whose stems tend to cluster around recurring patterns thus constituting “gangs” (Alegre and Gordon 1999b), or for derived words which include unproductive suffixes (Bertram, Laine and Karvinen 1999; Bertram, Schreuder and Baayen 2000; Hagiwara et al. 1999).
However, the possibility should also be considered that low-frequency derived words which include low-frequency morphemic constituents, even if phonologically transparent, are more likely to be accessed as whole forms through direct whole-word access. For these words, the reduced probability of access through morphemic decomposition would derive from the fact that the slight difference between the frequency of morphemes and whole-word frequency is not large enough for morphological processing to result in benefits relative to whole-word-based lexical access. The following visual lexical decision experiments addressed the latter issues by combining evidence from pseudoword and word processing. Experiment 1 was conducted on pseudowords, whereas both Experiments 2 and 3 involved words. Experiment 1 on pseudowords aimed at replicating and extending results on the role of suffix quantitative properties in nonlexical contexts. It addressed issues that should help in interpreting results from the two following experiments on words. By including suffixes in pseudoword contexts in which the initial orthographic sequence did not correspond to an existing root, the role of suffix frequency/numerosity could be differentiated from its consequences on the semantic transparency or interpretability of a newly derived form with respect to its base. At the same time, by investigating stimulus contexts in which the lexical morphemic unit (the suffix) occurred in the rightmost part of the stimulus in the absence of a morphemic unit on its left side, we aimed at providing evidence for a role of morphemic units which is independent of sequential left-to-right processing. Experiments 2 and 3 addressed the issue of the interplay of root, suffix and whole-word frequency in access to derived words, by orthogonally varying high- and low-frequency roots and suffixes in low-frequency transparent derived Italian words.
In order to specify the balance between the processing routes based on whole-word and morphemic units, respectively, the derived words were contrasted with nonderived words of analogously low frequency (Experiment 3).

3. Experiment 1

In Experiment 1, we aimed at replicating, with three sets of suffixes, the effect of suffix frequency in lexical decision to pseudowords found by Burani et al. (1997). In that study, suffixes were combined with real roots to form pseudowords. In the present experiment, the pseudowords were obtained by combining suffixes of varying frequencies with orthographic sequences that did not correspond to real roots. If suffix frequency effects were to occur in lexical decision to pseudowords of this sort, strong evidence for the role of suffixes in visual processing would be provided. If activation of the morphemic units comprising a stimulus occurs irrespective of their sequential positions within the stimulus, provided they are frequent enough, the prediction could be made that high-frequency suffixes are activated and play some role also when affixed after nonroots. The expected result has two implications. On the one hand, if high-frequency suffixes delay lexical decisions to pseudowords even when they occur in stimuli that do not contain real roots, it could be concluded that the effects of frequency/numerosity of suffixes occur at a processing stage in which affix morphemes are available independently of their semantic content. When suffixes are combined with nonroots, no interpretability of the combination should be expected, because of the absence of a meaningful component in first position (for the effects of interpretability of new root-suffix combinations in lexical access, see Burani et al. 1999; see also, for the effects of interpretability on novel Dutch compounds, Coolen, van Jaarsveldt and Schreuder 1991; van Jaarsveldt, Coolen and Schreuder 1994).
On the other hand, if we were to show a frequency effect induced by a morphemic unit located in the rightmost position of the stimulus, within an orthographic context in which no lexical or morphemic unit occurs in left position, we would challenge a sequential search model, which predicts that the frequency of the second constituent should not affect lexical processing (Taft and Forster 1976). Evidence against this sort of model has been provided by studies which used both compound nonwords (Lima and Pollatsek 1983), and real compound words (Andrews 1986; Andrews and Davies 1999; Pollatsek, Hyönä and Bertram 2000). The main finding of these studies, which mainly employed lexical decision, but also the recording of eye movements in sentence reading (Pollatsek, Hyönä and Bertram 2000), was an effect of the lexical status or of the frequency of the second constituent. Evidence for a frequency effect of the second constituent when this is a suffix is still lacking. However, even in a theoretical framework which incorporates principles of interactive activation (Taft 1994), it is still assumed that, while inflectional endings would be stripped off in word processing, derivational suffixes would not, because of their different role in processing. Within the latter framework, all the pseudowords that are tested in our experiment, provided they are equated in their leftmost nonlexical part on purely orthographic grounds, should be rejected equally fast, irrespective of the presence of a suffix on their rightmost side. In contrast with these predictions, if interference effects on nonword lexical decision do arise when the stimulus includes a high-frequency suffix in combination with a nonroot orthographic sequence, a model in which the processing system activates frequent morphemic units, irrespective of their relative locations within the word, would be supported. In Experiment 1, three sets of Italian derivational suffixes were selected.
The three sets, matched for length in letters and phonemes, and for orthographic/phonological structure, differed only in frequency, calculated both on word tokens and on word types. Suffixes in the three sets could be considered of high, medium and low frequency, respectively, by considering the overall distribution of frequency values of Italian suffixes of the same length. The main prediction was that high-frequency suffixes should cause more interference on nonword decision when included in pseudoword contexts, relative to low-frequency suffixes. Suffixes of medium frequency might either not constitute sufficiently activated processing units for interference to occur, or they might show interference effects of a smaller size than high-frequency suffixes.

3.1. Method

3.1.1. Materials and design

Nine suffixes were selected, equally subdivided into three experimental sets, of high, medium and low frequency, respectively. Frequencies, in this experiment and in all the following experiments, were derived from a corpus of Italian written language of 1.5 million tokens (Istituto di Linguistica Computazionale CNR 1989). The mean suffix frequencies, calculated on word tokens, were 1,060 per 1.5 million (range: 639-1,557), 68.3 (range: 55-90), and 12.3 (range: 7-18) for the three sets, respectively. Differences in mean suffix numerosity between the three sets, calculated on word types in the corpus, paralleled differences in frequency. The mean numbers of word types in which suffixes occurred were 165 (range: 145-187), 20.6 (range: 11-39), and 6.3 (range: 2-10) for the three sets, respectively. There were both nominal and adjectival suffixes. No suffix was homonymous with another Italian suffix. All suffixes were four letters long and were matched across sets for length in phonemes and for syllabic structure. For each suffix, a control sequence with similar orthographic and syllabic structure was selected.
The sequences corresponding to a suffix and the control sequences were matched for orthographic frequency in word final position in each set. The mean frequencies of control sequences in word final position were: 1,538 per 1.5 million for the first set; 472 for the second set; 64 for the third set. These values were matched to the mean frequencies of the orthographic strings corresponding to the selected suffixes, calculated in word final position and including both real suffixes and pseudosuffixes: 1,335 per 1.5 million for the high-frequency set; 411 for the medium-frequency set, and 62 for the low-frequency set, respectively. The suffixes were combined with orthographically legal letter sequences that did not correspond to any existing root. Each suffix was combined with four different pseudoroots, for a total of twelve pseudowords in each set. The length of pseudowords fell within the length range of Italian words including the same suffix, and respected as much as possible the distribution of word length for each suffix in the Italian language (calculations were based on Ratti et al. 1988). Mean lengths in letters of the pseudowords were 9.1, 8.6, and 8.7 for the three sets, respectively. Each suffixed pseudoword was matched with a control pseudoword that included the same pseudoroot in combination with the orthographic sequence that constituted the control sequence for the suffix. Thus pseudowords in each suffixed-control pair had the same length, the same syllabic structure, similar orthographic/phonological structure, and similar orthographic frequency of the final part, either corresponding to a suffix or to a nonsuffix. Pseudowords including suffixes and control sequences were also matched for bigram frequency.
Mean bigram frequencies, expressed as natural logarithms, were: 10.49, 10.57, and 10.60 for pseudowords in the three suffixed sets; they were 10.80, 10.55, and 10.44 for pseudowords in control sets. In combining initial letter strings with suffixes and control letter sequences, we avoided the presence in the pseudowords of embedded real words. Pseudowords in the six sets were also matched for their overall degree of orthographic similarity to a real word, i.e., for the number of orthographic neighbors. Adopting the N-count measure (Coltheart et al. 1977), i.e., the total number of words that can be obtained from each pseudoword by replacing one letter at a time with another letter, while preserving the other letters’ positions, we determined that the great majority of pseudowords had a null N-count (with a few exceptions of N-count = 1, balanced across sets), i.e., we obtained pseudowords that were equally dissimilar from existing words, according to the N-count metric. In sum, there were six sets of pseudowords, arranged in a 2 × 3 design, in which the main factors were the presence vs. absence of a suffix, and the high vs. medium vs. low frequency of the orthographic sequence corresponding either to a suffix or to a nonsuffix. In each of the six sets, there were 12 pseudowords (4 for each suffix or control sequence), for a total of 72 experimental stimuli, 36 suffixed and 36 controls. The experimental items, with the mean RT and percent error for each item, are reported in Appendix A. In order to avoid presenting the same pseudoroot to the same participant both in the suffixed and in the nonsuffixed control condition, each experimental set was split into two subsets of 6 items each. Each participant was presented with 36 experimental pseudowords, 18 suffixed and 18 pseudosuffixed, in which no pseudoroot was repeated. In each subset there were two instances of the same suffix or final control sequence.
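The N-count measure used above for matching the pseudoword sets (Coltheart et al. 1977) can be sketched as follows. This is a minimal illustration under stated assumptions: the lexicon is a plain set of lowercase word forms, the substitution alphabet is the 26 unaccented Latin letters, and the function name `n_count` and the toy lexicon are hypothetical rather than the materials or tools used in the study.

```python
import string

def n_count(item, lexicon, alphabet=string.ascii_lowercase):
    """Number of lexicon words obtainable from `item` by replacing exactly
    one letter with another, keeping all other letters in position."""
    item = item.lower()
    neighbors = set()
    for i, original in enumerate(item):
        for letter in alphabet:
            if letter == original:
                continue  # a neighbor must differ from the item itself
            candidate = item[:i] + letter + item[i + 1:]
            if candidate in lexicon:
                neighbors.add(candidate)
    return len(neighbors)

# Hypothetical toy lexicon (illustrative only).
toy_lexicon = {"cane", "pane", "rane", "cani", "casa"}
print(n_count("cane", toy_lexicon))      # neighbors: pane, rane, cani
print(n_count("crofusso", toy_lexicon))  # no one-letter neighbor in this lexicon
```

A null N-count, as reported for most of the experimental pseudowords, corresponds to the function returning 0 against the full reference lexicon.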
For each set of suffixed and control pseudowords, the entire set of single scores was provided by two participants presented with two complementary sublists. In each sublist, the 36 experimental pseudowords were presented together with 66 filler pseudowords and 102 filler words. Each participant was presented with a total of 204 stimuli. Filler stimuli were the same in each of the two sublists: Words included medium/low frequency singular nouns and adjectives, either derivationally suffixed or nonsuffixed, in a proportion that reflected the composition of the Italian basic dictionary in the medium/low frequency range (see Thornton, Iacobini and Burani 1994; 1997). Each suffix and each control final sequence that occurred in experimental pseudowords was also included in the same number of filler words. Filler pseudowords were drawn from words analogous to the filler words by changing one or two letters in different positions. Mean length was the same for words and pseudowords (range: 6-11 letters). The list was presented to participants in a single experimental session, arranged in three randomized blocks of 68 items each. For each block, participants were assigned to one of two different randomizations of items. Each experimental list was preceded by a practice list of 50 items, 25 words and 25 pseudowords, assigned in the same proportion to two randomized blocks.

3.1.2. Procedure

Participants were tested individually in a soundproof experimental booth. They received standard lexical decision printed instructions in which they were asked to decide as quickly and as accurately as possible whether a presented letter string was an Italian word or not. If it was a word (YES response), they had to press the right one of two response keys, otherwise (NO response) the left one. For left-handed participants, the order of the response buttons was reversed.
Each trial started with the presentation of a fixation mark (a cross) in the center of the screen for 400 ms, followed after 300 ms by the stimulus centered at the same position. Stimuli were presented on a monitor in white uppercase letters on a dark background and remained on the screen until the participant pressed one of the two response buttons. They disappeared after 1,500 ms if no response was given. A new trial began 1,200 ms after responding or time-out. If a participant responded more slowly than the preset limit of 1,500 ms, the words FUORI TEMPO (‘out of time’) appeared on the screen. If the participant gave the wrong response, the word ERRORE (‘error’) appeared on the screen. This signal was displayed for 500 ms. The interval between the disappearance of the feedback and the next warning signal was 1,200 ms. There was a pause after each block of stimuli. The total duration of the experimental session was approximately 20 minutes.

3.1.3. Participants

Forty-eight participants, mostly university students, were paid to participate in the experiment. All were native speakers of Italian.

3.2. Results and discussion

The data of four participants, whose mean reaction times for correct responses or whose error rates were more than 2 s.d. from the mean of all participants, were excluded from further analysis. Using the remaining forty-four participants, the mean reaction times and error rates for all items were obtained, and one pair of items in the medium-frequency set was removed because the number of errors for one of the two members of the pair (crofusso) was more than 2.5 s.d. above the mean. When means for length, bigram frequency and N-count were recalculated after removing the two paired items, the sets were still balanced. The remaining observations were used to calculate participants’ and items’ mean reaction times and error scores.
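The participant exclusion rule described above (a score more than 2 s.d. from the mean of all participants) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the sample or the population standard deviation was used, so the sample version is assumed here, and the function name `exclude_outliers`, the participant labels and the reaction times are hypothetical.

```python
from statistics import mean, stdev

def exclude_outliers(scores, criterion=2.0):
    """Keep participants whose score lies within `criterion` standard
    deviations of the group mean.

    scores: {participant_id: mean RT or error rate}.
    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    values = list(scores.values())
    m = mean(values)
    sd = stdev(values)
    return {pid: x for pid, x in scores.items() if abs(x - m) <= criterion * sd}

# Hypothetical mean RTs in ms for ten participants (illustrative only).
toy_rts = {
    "p1": 680, "p2": 690, "p3": 700, "p4": 710, "p5": 720,
    "p6": 690, "p7": 700, "p8": 710, "p9": 700, "p10": 1200,
}
kept = exclude_outliers(toy_rts)
print(sorted(kept))
```

Under the rule stated in the text, the same screening would be run once on mean reaction times and once on error rates, and a participant failing either check would be dropped; the item-level criterion of 2.5 s.d. above the mean is one-sided and would need a separate check.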
Mean reaction times by items and percentages of errors for the three experimental categories and their respective controls are shown in Table 1.

Table 1. Experiment 1. Mean reaction times by items in ms and % error. Suffixed and control (nonsuffixed) pseudowords, with high-frequency (HF), medium-frequency (MF), and low-frequency (LF) final sequences.

                 Suffixed   Control   Difference
HF   Mean RT        739       700        +39
     % Error       13.6       6.1        +7.5
MF   Mean RT        701       701          0
     % Error        4.1       7.4        -3.3
LF   Mean RT        680       679         +1
     % Error        4.6       4.2        +0.4

Results were submitted to a mixed three-way analysis of variance with two within-participants factors: Suffixedness (suffixed vs. nonsuffixed pseudowords) and frequency of final sequence, both suffix and control (high vs. medium vs. low). The third factor, between participants, was list (first vs. second sublist, each administered to one half of participants). The ANOVAs were performed both by participants and by items and showed an interaction between suffixedness and frequency on both reaction times (F1(2,84) = 7.59, p<.001, MSE= 1,412.1; F2(2,58) = 3.34, p=.04, MSE= 977.3) and errors (F1(2,84) = 5.92, p=.004, MSE= 117.2; F2(2,58) = 5.97, p=.004, MSE= 1.48). Results differed across the three experimental sets when suffixed pseudowords were compared to their respective controls. Comparisons between suffixed-control pairs based on the Duncan test on means by items revealed that suffixed pseudowords in the high-frequency set were significantly (39 ms) slower (p=.03) and gave rise to significantly (7.5%) more errors (p=.003) than controls. By contrast, suffixed pseudowords were as fast as their controls in both the medium-frequency and the low-frequency sets (p>.1 in both cases). Error rates on suffixed pseudowords were 3.3 percentage points lower and 0.4 percentage points higher than on controls in the medium-frequency and in the low-frequency sets, respectively. These differences were not significant (p>.1 in both cases).
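The suffixed-minus-control differences in Table 1 can be recomputed directly from the cell means. The short Python sketch below does so. The dictionary layout and the function name `interference` are ours, introduced only for illustration; the values are transcribed from Table 1.

```python
# Cell means transcribed from Table 1: (mean RT in ms, % error).
table1 = {
    "HF": {"suffixed": (739, 13.6), "control": (700, 6.1)},
    "MF": {"suffixed": (701, 4.1), "control": (701, 7.4)},
    "LF": {"suffixed": (680, 4.6), "control": (679, 4.2)},
}


def interference(cells):
    """Suffixed minus control difference for RT (ms) and errors (percentage points)."""
    rt_s, err_s = cells["suffixed"]
    rt_c, err_c = cells["control"]
    return rt_s - rt_c, round(err_s - err_c, 1)


effects = {freq: interference(cells) for freq, cells in table1.items()}
for freq, (d_rt, d_err) in effects.items():
    print(f"{freq}: RT {d_rt:+d} ms, errors {d_err:+.1f} points")
```

Only the high-frequency set shows a sizeable positive difference on both measures, which is the pattern behind the suffixedness by frequency interaction.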
An effect of frequency was found on both reaction times (F1(2,84) = 19.78, p<.001, MSE= 1,505.6; F2(2,58) = 10.12, p<.001, MSE= 977.3), and errors (F1(2,84) = 5.76, p=.004, MSE= 122.5; F2(2,58) = 6.28, p=.003, MSE= 1.48). The main effect of suffixedness (suffixed vs. nonsuffixed pseudowords) was found on reaction times by participants only (F1(1,42) = 6.57, p=.013, MSE= 900.46; F2(1,58) = 2.56, p>.1, MSE= 977.3). No main effect of suffixedness was found on errors (F1(1,42) = 2.03, p>.1, MSE= 72.23; F2(1,58) = 1.27, p>.1, MSE= 1.48).

As revealed by the strong interaction between suffixedness and frequency of final sequence, and by post-hoc comparisons, response latencies and percentages of errors to suffixed pseudowords were higher, relative to their controls, only when the pseudowords included a high-frequency suffix. By contrast, pseudowords that included either medium-frequency or low-frequency suffixes showed neither longer reaction times nor lower accuracy than matched orthographic controls. These results confirm those obtained by Burani et al. (1997): High-frequency suffixes activate corresponding morphemic access units in pseudoword contexts. By contrast, no access unit seems to be available for suffixes that are either medium- or low-frequency, at least not in pseudoword contexts and within the time required to perform lexical decision.

The present results allow us to build on the findings by Burani et al. (1997). In the present experiment, the interference effect caused by high-frequency suffixes occurred with suffixes that were combined with nonexisting roots. Hence, activation of morphemic lexical units corresponding to suffixes occurred in the absence of a real root on their left side.
This finding hardly seems compatible with sequential search accounts (Taft and Forster 1976), and with recent reformulations (Taft 1994), which predict that the frequency of the second constituent, the derivational suffix, should not affect lexical processing in the absence of a lexical unit as first constituent.

4. Experiment 2

Experiment 1 provided evidence for a role in processing of suffix frequency, with high-frequency suffixes significantly affecting rejection latencies in visual lexical decisions to pseudowords. In Experiment 1 there was no evidence that suffixes of medium/low frequency, which extended up to a frequency of 90 per 1.5 million, constituted effective processing units: No interference arose in lexical decision when a suffix was either medium- or low-frequency.

The role of suffix frequency in lexical decision to real words was assessed in Experiment 2, by varying both root and suffix frequency in transparent derived words of low surface frequency. Derived words included suffixes belonging to two sets of differing frequencies. Suffixes of high frequency were contrasted with suffixes that were of medium/low frequency. In Experiment 1 there was no evidence for differences between medium- and low-frequency suffixes. Hence, suffixes from both the latter frequency ranges were pooled together in a single set. For simplicity, hereafter we will refer to medium/low-frequency suffixes as low-frequency suffixes.

All low-frequency derived words should in principle be accessed through constituent morphemes – even when both the root and the suffix are low-frequency, the derived word is nevertheless lower in frequency than its constituent morphemes.
Hence, for low-frequency derived words that are equated for all the relevant properties except for frequency of the two constituent morphemes, either high or low, predictions were that both reaction times and error rates should not be a function of whole-word frequency, but should rather reflect differences in the frequency of morphemic constituents. Words including higher-frequency morphemes were expected to be accessed more quickly and more accurately than words including lower-frequency morphemes, with words including both root and suffix of high frequency being the fastest and the most accurate, and words with low-frequency root and suffix being the slowest and the least accurate. Derived words in which only one morpheme, either the root or the suffix, is of high frequency, were expected to show intermediate reaction times and error rates. If for printed stimuli simultaneous parallel activation of both root and affix is assumed, irrespective of their relative positions within the word, words in which the high-frequency constituent is either the root or the suffix were not expected to differ in activation times.

4.1. Method

4.1.1. Materials and design

Four sets of equally low-frequency suffixed derived words were selected. In the four sets, root and suffix frequency varied orthogonally: The first set included high-frequency roots and high-frequency suffixes (HH); the second set included low-frequency roots and high-frequency suffixes (LH); the third set included high-frequency roots and low-frequency suffixes (HL); the fourth set included low-frequency roots and low-frequency suffixes (LL). Suffixes were either high- or low-frequency on both tokens and types, i.e., on both frequency tout court and numerosity. The root frequency measure included the cumulative frequency of both the inflected and the derived forms of the base.
Thirteen words (nouns and adjectives) were included in each set, for a total of nine different suffixes in each set. No suffix was homonymous with a different Italian suffix. The same suffixes were included in the two high-frequency suffix sets and in the two low-frequency suffix sets, respectively. Suffixes were three to five letters long. High-frequency suffixes and low-frequency suffixes were matched for length and syllabic structure. Roots were different in the four sets. Root length was balanced across sets. The roots belonged to different grammatical categories (i.e., nouns, adjectives and verbs) that were balanced across sets. All the derived words were presented in singular citation form. They were orthographically and phonologically transparent with respect to their bases, i.e., there was no orthographic/phonological assimilation at the boundary between root and suffix. Across the four sets, words were matched for surface frequency, length, syllable structure and bigram frequency.

The 52 experimental words were presented together with 108 filler words and 160 filler pseudowords, for a total of 320 stimuli. Any suffix that occurred in experimental words occurred also in the same number of filler pseudowords. Filler pseudowords were drawn from words analogous to the filler words by changing one or two letters in different positions in the word. Filler words included medium/low-frequency singular nouns and adjectives, either morphologically complex or simple, in a proportion that reflected the composition of the Italian basic dictionary in the medium/low frequency range (Thornton, Iacobini and Burani 1994; 1997). Mean length was the same for words and pseudowords (range: 6-11 letters). The list was presented to participants in a single experimental session, arranged in four randomized blocks of 80 items each. Each participant was presented with a different block randomization and with a different randomization of items within each block.
Each experimental list was preceded by a practice list of 40 items, 20 words and 20 pseudowords, assigned in the same proportion to two randomized blocks.

4.1.2. Procedure

The procedure was the same as in Experiment 1. The experimental session lasted about 30 minutes.

4.1.3. Participants

Forty-five participants, mostly University students, were paid to participate in the experiment. All were native speakers of Italian.

4.2. Results and discussion

The data of ten participants, who made more than 15 percent errors on the experimental words, were excluded from further analysis. Using the remaining thirty-five participants, the mean reaction times and error rates for all items were obtained. We removed from the data set four experimental words that showed error rates exceeding 40%. One word (tenerume) was removed in set HL, two words (aratore and larvale) in set LH, and one word (ameboide) in set LL. One item (rimanenza) was removed in set HH because it was the only word which included a prefixed bound root of an irregular verb. Removal of these items did not affect the matching of the four sets for the relevant variables. In Table 2 the mean values with standard deviations for the variables in each experimental set are reported. The list of the experimental items, with root frequency, suffix frequency, word frequency, mean RT and percent error for each item, is reported in Appendix B. The remaining observations were used to calculate participant and item mean reaction times and error scores. Mean reaction times by items and percentages of errors for the four experimental sets are shown in Table 3.

Table 2. Experiment 2. Mean values and standard deviations (s.d.) for the relevant variables.
HH = Derived words with high-frequency root and high-frequency suffix
HL = Derived words with high-frequency root and low-frequency suffix
LH = Derived words with low-frequency root and high-frequency suffix
LL = Derived words with low-frequency root and low-frequency suffix

                                 HH                HL                LH                LL
                             Mean    s.d.      Mean    s.d.      Mean    s.d.      Mean    s.d.
Root frequency                501   412.1       507   409.8      33.5    19.1      33.3    21.9
Family size                  15.7    8.05     10.20    3.82       4.4     2.5       4.2     1.7
Suffix frequency            1,859   1,515        57    33.7     1,636   1,260        58    32.5
Suffix numerosity             246   123.6      17.3    10.8       217  101.39      17.7    10.6
Semantic relatedness         3.77    0.43      3.47    0.89      3.66    0.68      3.45    0.95
Familiarity                  6.57    0.78      6.05    1.03      6.15     0.9      5.75    1.01
Bigram frequency            10.79     0.4     10.69     0.3     10.56     0.3     10.60     0.3
Word length in letters        8.2     1.5       8.2     1.1       8.4     1.0       8.2     0.9
Root length in letters        4.4     1.4       4.3     1.1       4.5     1.0       4.4     0.8
Suffix length in letters      3.8     0.6       3.9     0.5       3.9     0.5       3.8     0.6
Word frequency                3.1     2.8       3.5     3.2       2.2     2.3       1.7     1.2

Table 3. Experiment 2. Mean reaction times by items in ms and % error. Suffixed derived words with high-frequency root and high-frequency suffix (HH); high-frequency root and low-frequency suffix (HL); low-frequency root and high-frequency suffix (LH); low-frequency root and low-frequency suffix (LL).

                        HH      HL      LH      LL
Mean Reaction Time     597     624     634     670
% Error                2.5     5.9     6.7    12.2

Results were submitted to two-way analyses of variance, with root frequency (high vs. low) and suffix frequency (high vs. low) as the two factors. There were main effects of both root frequency and suffix frequency on both reaction times and error rates. For root frequency, F1(1,34)=77.83, p<.001, MSE= 771.3; F2(1,43)= 7.96, p<.01, MSE= 2,551.55 on reaction times; F1(1,34)= 13.96, p<.001, MSE= 68.9; F2(1,43)= 6.08, p<.025, MSE= 6.59 on error rates.
For suffix frequency, F1(1,34)= 28.92, p<.001, MSE= 1,088; F2(1,43)= 4.77, p<.05, MSE= 2,551.55 on reaction times; F1(1,34)= 11.00, p=.002, MSE= 63.59; F2(1,43)= 4.38, p<.05, MSE= 6.59 on error rates. There was no interaction between the two factors (p>.1 in all the analyses).

Results on both reaction times and error rates closely paralleled differences in frequency of constituent morphemes, with words including high-frequency constituents determining quicker and more accurate performance. No differential role of root frequency with respect to suffix frequency was apparent in the data. Hence, results seemed to confirm the hypothesis of access through activation of morphemic constituents for low-frequency derived words.

We controlled post-hoc for possible residual asymmetries in the properties of the experimental words that could have contributed to the effect. Three properties of the derived words were considered: the semantic relatedness to the base, the word’s morphological family size, and the word familiarity.

4.2.1. Ratings of semantic relatedness with the base

According to some authors, effects of morphological structure should be found preferentially in derived words that are semantically transparent (or related) with respect to the base (Marslen-Wilson et al. 1994; but see also, for contrasting data and accounts, Bentin and Feldman 1990; Feldman and Soltano 1999; Plaut and Gonnerman 2000; Schreuder, Burani and Baayen 2002; Stolz and Feldman 1995; Vannest and Boland 1999). In selecting stimuli, we aimed at balancing words in the four sets for the degree of semantic transparency. We controlled for semantic transparency of the derived words with respect to their bases by excluding semantically opaque words, and by including in each set approximately the same number of suffixes that could be considered either productive or unproductive on the basis of different measures of productivity.
In each frequency set there were suffixes that could be considered productive because they had been used to coin a substantial number of neologisms in the last fifty years, or could be rated as productive on the basis of the quantitative measure of productivity proposed by Baayen (1989; 1992). Analogously, in each set there was a similar number of suffixes that could be considered scarcely productive or unproductive on either one or both the latter measures.

However, it could not be excluded that, independently of productivity rated in the latter ways, derived words including high-frequency suffixes might result in greater semantic transparency with respect to their bases by virtue of suffix numerosity itself. High suffix numerosity is related to a greater number of derived words which tend to share a similar part of meaning, namely the meaning carried by the suffix. Analogously, it could not be excluded that words might have different semantic transparency values due to specific idiosyncrasies.

Derived words were submitted to empirical ratings for semantic relatedness with their base. Each derived word was paired with its base word. The printed list of word-pairs was presented in different random orders to thirty-eight University students who had not participated in the lexical decision experiment. Participants had to rate, for each pair, on a five-point scale ranging from “Very unrelated” to “Very related”, how “related in meaning” they thought the first word (the derived word) was to the second word (the base word). Mean ratings of semantic relatedness with the base were 3.77 (s.d. 0.43) for HH words; 3.47 (s.d. 0.89) for HL words; 3.66 (s.d. 0.68) for LH words; 3.45 (s.d. 0.95) for LL words, respectively. A two-way ANOVA with root frequency (high vs. low) and suffix frequency (high vs. low) as factors was performed on semantic relatedness rating means both by participants and by items.
A suffix frequency effect was found, by participants only (F1 (1,37) = 15.65, p<.001, MSE= 195.1; F2 (1,43) = 1.35, p=.25, MSE=.55), with words including high-frequency suffixes being rated as significantly more related in meaning to their base words (mean semantic relatedness: 3.71) than words including low-frequency suffixes (mean semantic relatedness: 3.46). Neither root frequency nor the interaction was significant (F<1 in both cases).

In order to ensure that the suffix frequency effect was not confounded with the fact that words with high-frequency suffixes were more transparent in meaning than words with low-frequency suffixes, two further analyses were carried out. First, the results of lexical decision were reanalyzed by excluding from the two sets with low-frequency suffixes the least transparent items (four items in all), thus obtaining new sets that were perfectly matched for semantic transparency. After matching sets for semantic transparency, the results of ANOVAs did not change, but showed even stronger effects of both root and suffix frequency (F2 (1,39) = 7.3, p<.025, MSE= 2,314.56; F2 (1,39) = 9.46, p<.005, MSE= 2,314.56 on reaction times for root and suffix, respectively; F2 (1,39) = 6.08, p<.025, MSE= 6.36; F2 (1,39) = 7.88, p<.01, MSE= 6.36, on error rates for root and suffix, respectively), and no interaction (p>.1). Furthermore, post-hoc correlation analysis did not reveal a significant correlation (one-tailed test) of reaction time with semantic relatedness (r = -.14, t(45)=0.92, p>.1). Hence we could exclude that semantic transparency was responsible for the effects found (see also, for evidence that semantic transparency itself cannot explain why some suffixes induce decomposition while others do not, Vannest and Boland 1999).

4.2.2. Morphological family size

Recently, Bertram, Baayen and Schreuder (2000) reported evidence for the role, in the lexical processing of Dutch complex words, of morphological family size, i.e., the type count of derived words and compounds with a given base word as a constituent. This type count of the number of morphological family members, which has been found to be a strong independent co-determinant of response latencies for Dutch monomorphemic and inflected words (de Jong, Schreuder and Baayen 2000; Schreuder and Baayen 1997), also affected latencies to derived words. Bertram, Baayen and Schreuder (2000) suggested that a large family size of the base word facilitates lexical processing for most suffixed derived words. According to the authors, the facilitatory effect of a large family size is due to semantic activation spreading from a complex word to its family members.

The family size of the base root – i.e., the root numerosity – could in principle affect processing of Italian derived words. In our study, a larger morphological family size should be expected in both sets with high root frequency, relative to the sets with low root frequency. For each target word, the number of word types that share the same root in the corpus was counted. The obtained mean number of morphological family members was 15.7 (s.d. 8.05) for HH words, 10.2 (s.d. 3.82) for HL words, 4.4 (s.d. 2.5) for LH words, and 4.2 (s.d. 1.7) for LL words, respectively. A two-way ANOVA with root frequency (high vs. low) and suffix frequency (high vs. low) as the two factors was performed on family size values in the four sets. As expected, a difference in the numerosity of the morphological family between words with high and low frequency roots was found (F(1,43)= 39.01, p<.0001, MSE= 22.5).
A difference in morphological family size between words with high and low frequency suffixes was also found (F(1,43)= 4.24, p<.05, MSE= 22.5), together with a marginally significant interaction (F(1,43)= 3.43, p=.07, MSE=22.5). A two-tailed t-test between HH and HL sets revealed a significant difference in family size between the two sets with high-frequency root (t(22)= 2.11, p<.05). The differences in family size among the experimental sets were in the same direction as the differences in response times, with HH words having a larger mean number of family members (15.7) than HL words (10.2), and words with high-frequency suffixes a larger mean number of family members (10.1) than words with low-frequency suffixes (7.2). Hence we assessed whether these differences could be responsible for part of the effect that was found in lexical decision.

Two further analyses were made. First, we reanalyzed the results of lexical decision by excluding two items in each of the two sets with high-frequency roots, to obtain new sets that were matched for mean family size. After matching sets for family size, the results of ANOVAs did not change, but still showed effects of both root and suffix frequency (F2(1,39)= 7.82, p=.008, MSE= 2,602.1; F2 (1,39) = 4.31, p<.05, MSE= 2,602.1, on reaction times for root and suffix, respectively; F2 (1,39) = 6.7, p=.01, MSE= 6.75; F2 (1,39) = 4.54, p= .04, MSE= 6.75, on error rates for root and suffix, respectively), and no interaction (p>.1). Furthermore, post-hoc correlation analysis did not reveal any correlation of reaction time with family size (r=.04, t(45)=0.27, p>.1). Thus the hypothesis that differences in family size were responsible for part of the effects found in lexical decision could be rejected.

4.2.3. Familiarity ratings

Words from the low-frequency range of a corpus may differ in familiarity, and familiarity is usually a good predictor of lexical decision performance (Connine et al. 1990; Gernsbacher 1984). Although our derived words were matched for whole-word frequency, we made a post-hoc check for familiarity. The derived words were submitted to twenty-seven University students for familiarity ratings. Participants had to rate the printed words on a seven-point scale ranging from “Unknown” (1) to “Very well known” (7). All the derived words received high familiarity ratings. However, there were differences between the four groups. Mean familiarity ratings were: 6.57 (s.d. 0.78) for HH words; 6.05 (s.d. 1.03) for HL words; 6.15 (s.d. 0.9) for LH words; 5.75 (s.d. 1.01) for LL words.

A two-way ANOVA with root frequency (high vs. low) and suffix frequency (high vs. low) as the two factors, performed on familiarity rating means both by participants and by items, showed significant effects of both factors, by participants only (F1(1,26) = 15.03, p<.001, MSE= .51; F2(1,43) = 2.56, p>.1, MSE= .88 for root frequency; F1(1,26) = 44.11, p<.001, MSE= .46; F2(1,43) = 1.74, p>.1, MSE= .88 for suffix frequency), and no interaction (F<1). Results on familiarity ratings paralleled results on reaction times and accuracy, with words including high-frequency roots and high-frequency suffixes being rated as more familiar than words with low-frequency roots and suffixes. Differences in rated familiarity were so clear-cut that we could not select post-hoc a subset matched for familiarity. Moreover, post-hoc correlation analysis revealed a strong correlation of reaction time with familiarity (r = -.66, t(45) = 5.83, p<.0001).

4.2.4. Ad interim considerations

In sum, the results of post-hoc controls on three possible factors contributing to the effects found in lexical decision (i.e., semantic relatedness with the base, morphological family size, and word familiarity) pointed to word familiarity as a possible determinant of lexical decision performance. There could be reasons for adopting differences in rated familiarity of morphologically complex words as evidence per se for access to morphemic constituency (see Bertram, Baayen and Schreuder 2000; Schreuder and Baayen 1997). However, it could also be the case that word familiarity provides a different source of explanation for the effect we found. In Experiment 3, we tried to disentangle the role of morphemic frequency from that of word familiarity, while addressing further processing issues.

5. Experiment 3

The aim of Experiment 3 was twofold. First, we aimed at detailing the effects found in Experiment 2 with different and larger sets of derived words, better balanced for properties like semantic relatedness with the base, morphological family size and rated familiarity. Second, we assessed whether slower reaction times and higher error rates to words including both a root and a suffix of low frequency (LL words) were a function of the low frequency of constituent morphemes, or whether performance was merely a function of low whole-word surface frequency. The latter possibility would imply that, for low-frequency derived words with both constituents of low frequency, the output of lexical decision does not result from morphological processing, but from whole-word processing.
For low-frequency derived words whose constituents are both low-frequency, the moderate difference between the frequency of morphemes and whole-word frequency (with root and suffix only slightly higher in frequency than the whole word) might not be large enough for morphological processing to result in benefits relative to access based on the whole word.

To address the latter issue, a set of nonderived words (ND) was matched for frequency to four new sets of low-frequency derived words that included morphemes of differing frequencies. Analogously to Experiment 2, the first set of derived words included high-frequency roots and high-frequency suffixes (HH); the second set included low-frequency roots and high-frequency suffixes (LH); the third set high-frequency roots and low-frequency suffixes (HL); the fourth set low-frequency roots and low-frequency suffixes (LL). Words in the five sets (the four sets of derived words and the set of nonderived words) had the same mean low surface frequency and were matched for all the relevant variables, including familiarity. The only difference was that words in the fifth (nonderived) set did not include any derivational suffix.4

Predictions were the following. If LL derived words with both low-frequency constituents are accessed preferentially as whole words and no morpheme-based access succeeds because their roots and suffixes are exceedingly low in frequency, LL derived words should show similar results to nonderived (ND) words: For both LL and ND words, reaction times and error rates should be a function of surface frequency only. If, by contrast, activation of morphemes is involved in access to LL derived words, and if morphemic activation implies processing advantages because of accessing a root and a suffix which are, although slightly, higher in frequency than the whole word, LL derived words should be quicker than nonderived words.
The latter, in fact, do not benefit from any constituent morpheme of higher frequency.5 If the closer matching for familiarity, semantic relatedness and morphological family size obtained for words in the experimental sets of Experiment 3 does not make any contribution to results, HL and LH derived words, which include one high-frequency constituent (either the root or the suffix), should show intermediate reaction times and error rates relative to LL words on the one hand, and HH derived words on the other. Accordingly, HH derived words with both constituents of high frequency should be the most quickly and most accurately recognized.

5.1. Method

5.1.1. Materials and design

The five experimental sets included seventeen words each, nouns and adjectives in analogous proportions in each set. In the four derived sets, there were nine different high-frequency suffixes, and ten different low-frequency suffixes, with analogous distributions in the two high-frequency suffix sets, and in the two low-frequency suffix sets, respectively. No suffix was homonymous with a different Italian suffix. Suffixes were three to five letters long. High-frequency suffixes and low-frequency suffixes were matched for length and syllabic structure. Roots were different in the four sets, and belonged to different grammatical categories (i.e., nouns, adjectives and verbs) that were balanced across sets. Root length was balanced across sets. Analogously to Experiment 2, suffixes were either high- or low-frequency on both word tokens (suffix frequency) and word types (suffix numerosity), and the root frequency included the cumulative frequency of both the inflected and the derived forms of the base. The high-frequency root words also had a larger mean family size than low-frequency root words. Words were matched for mean morphological family size across sets with the same root frequency.
All the derived words were orthographically and phonologically transparent with respect to their bases. All words in the five sets were presented in the citation form. Words were matched, across the five sets, for surface frequency, length, syllabic structure and bigram frequency. The five sets were also matched for rated familiarity (familiarity ratings were obtained on a seven-point scale from twenty-four participants, see Experiment 2 for details about the method), and the four sets of derived words were matched for semantic relatedness with the base (semantic relatedness ratings were obtained on a seven-point scale from twenty-four different participants; see Experiment 2 for details concerning the method). The mean values with standard deviations for the relevant variables in each experimental set are reported in Table 4. The list of stimuli, with root frequency, suffix frequency, word frequency, mean RT and percent error for each item, is reported in Appendix C.6

The 85 experimental words were presented together with 115 filler words and 200 filler pseudowords, for a total of 400 stimuli. Any suffix that occurred in the experimental words occurred also in the same number of filler pseudowords. The filler pseudowords were drawn from words analogous to the filler words by changing one or two letters in different positions in the word. The filler words included medium/low-frequency singular nouns and adjectives, either morphologically complex or simple, in a proportion that reflected the composition of the Italian basic dictionary in the medium/low frequency range (Thornton, Iacobini and Burani 1994; 1997). The mean length was the same for words and pseudowords (range: 6-11 letters). The list was presented to participants in a single experimental session, arranged in five randomized blocks of eighty items each.
Each participant was presented with a different block randomization and with a different randomization of items within each block. Each experimental list was preceded by a practice list of 40 items, 20 words and 20 pseudowords, assigned in the same proportion to two randomized blocks.

Table 4. Experiment 3. Mean values and standard deviations (s.d.) for the relevant variables.
HH = Derived words with high-frequency root and high-frequency suffix
HL = Derived words with high-frequency root and low-frequency suffix
LH = Derived words with low-frequency root and high-frequency suffix
LL = Derived words with low-frequency root and low-frequency suffix
ND = Nonderived words
Freq: Frequency; Num: Numerosity; Rel: Relatedness

                      HH                HL                LH                LL                ND
                  Mean    s.d.      Mean    s.d.      Mean    s.d.      Mean    s.d.      Mean    s.d.
Root Freq.         547   536.5       554   526.2        38   16.72        31    22.9       9.5    7.98
Family Size       12.3    8.81      11.1    4.05       4.6    1.94       4.1    2.11       1.8    0.97
Suffix Freq.     2,119   1,653        75   37.26     1,892   1,479        68    43.9         -       -
Suffix Num.        247   141.2        21    12.6       227  127.33        21    12.5         -       -
Semantic Rel.     5.41    0.56      5.01    0.77      5.34    0.65      5.24    0.82         -       -
Familiarity       6.42    0.34      6.24     0.6      6.22    0.46      6.15    0.48      6.21    0.68
Bigram Freq.     10.72    0.44     10.74    0.25     10.58    0.52     10.47    0.27     10.75    0.29
Word Length        8.2    1.29       8.6    1.22       8.1    1.20       8.8    0.88       7.9    0.83
Root Length        4.4    1.06       4.5    1.23       4.3    1.22       5.0    0.94         -       -
Suffix Length      3.8    0.66       4.2    0.64       3.8    0.66       3.8    0.66         -       -
Word Freq.         5.3    4.04       4.6    4.94       3.1    2.68       3.1    3.53       0.4    3.47

5.1.2. Procedure

The procedure was the same as in Experiment 2. The experimental session lasted about 30 minutes.

5.1.3. Participants

Fifty participants, mostly University students, were paid to participate in the experiment. All were native speakers of Italian.

5.1.4. Results and discussion

The data of three participants, whose mean reaction times for correct responses or whose error rates were more extreme than 2 s.d. from the mean of all participants, were excluded from further analysis.
Using the remaining forty-seven participants, the mean reaction times and error rates for all items were obtained. Mean reaction times by items and percentages of errors for the five experimental sets are shown in Table 5.

Table 5. Experiment 3. Mean reaction times by items (in ms) and % error. Suffixed derived words with high-frequency root and high-frequency suffix (HH); high-frequency root and low-frequency suffix (HL); low-frequency root and high-frequency suffix (LH); low-frequency root and low-frequency suffix (LL); nonderived words (ND).

                      HH      HL      LH      LL      ND
Mean Reaction Time    603     605     641     645     640
% Error               5.4     8.6     14.1    17.1    13.6

Results on the five sets were submitted to one-way ANOVAs. Additionally, two-way analyses of variance, with root frequency (high vs. low) and suffix frequency (high vs. low) as the two factors, were performed on the four derived sets. Results of the one-way ANOVAs showed a significant difference among experimental categories, on both reaction times (F1(4,184) = 35.77, p < .0001, MSE = 483.08; F2(4,80) = 4.05, p < .005, MSE = 1,816.13) and errors (F1(4,184) = 21.09, p < .0001, MSE = 1.43; F2(4,80) = 2.84, p < .03, MSE = 29.35). Post-hoc comparisons based on Duncan's test on the means by items showed that both HH and HL derived words gave rise to significantly faster reaction times than words in the other three sets (for all comparisons involving HH or HL words, p < .025). Additionally, HH and HL derived words did not differ from one another (p > .1). Analogously, no differences were found among LH, LL and ND words (always p > .1). A similar pattern was found on errors, with significant differences between HH words on the one hand, and LH, LL, and ND words, on the other (always p < .05), and between HL and LL words (p = .05). The two-way ANOVAs on the four derived sets, with root frequency (high vs.
low) as the two factors, confirmed a root effect only, by both participants and items, on both reaction times and errors (F1(1,46) = 129.3, p < .0001, MSE = 438.24; F2(1,64) = 14.59, p < .0001, MSE = 1,740.71 for reaction times; F1(1,46) = 66.38, p < .0001, MSE = 1.53; F2(1,64) = 9.14, p < .004, MSE = 30.66 for errors). On reaction times, no suffix effect was found (F < 1), and no interaction (F < 1). A suffix effect was found on errors, in the analysis by participants only (F1(1,46) = 9.83, p < .003, MSE = 1.35; F2(1,64) = 1.20, p > .1, MSE = 30.7), with words with low-frequency suffixes giving rise to more errors than words with high-frequency suffixes. No interaction was found on errors (F < 1).

Results of Experiment 3 confirmed morpheme-based processing for HH words. As expected, these derived words showed faster reaction times and higher accuracy because they include high-frequency morphological constituents. The outcomes of Experiment 3 also suggested whole-word processing for LL derived words: these words, whose constituent morphemes were low-frequency, did not show any advantage with respect to nonderived words of the same surface frequency. The results of Experiment 3 were not in accordance with the predictions made for HL and LH derived words. In Experiment 2, the two sets of derived words that include one high-frequency constituent did not differ from each other, and showed intermediate reaction times and accuracy with respect to words including two high-frequency constituents on the one hand, and words including two low-frequency constituents on the other. In Experiment 3, by contrast, HL and LH words gave rise to contrasting results. Apparently, the imperfect balance in rated familiarity among word sets was responsible for part of the effects found in Experiment 2. A better matching among experimental sets led to a different pattern of results in Experiment 3.
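The size of the contrasts behind the two-way analysis can be read directly off the cell means in Table 5. The sketch below is only descriptive arithmetic on those published means (it does not reproduce the ANOVAs, which require the participant-level and item-level data); the function name and the dictionary layout are assumptions made for the illustration.

```python
# Cell means from Table 5 (Experiment 3), derived sets only.
# First letter of each key = root frequency, second letter = suffix frequency.
rt = {"HH": 603, "HL": 605, "LH": 641, "LL": 645}      # ms
err = {"HH": 5.4, "HL": 8.6, "LH": 14.1, "LL": 17.1}   # % error


def main_effects(cells):
    """Return (root contrast, suffix contrast, interaction contrast) from 2x2 cell means."""
    # Low-frequency root minus high-frequency root, averaged over suffix frequency.
    root = (cells["LH"] + cells["LL"]) / 2 - (cells["HH"] + cells["HL"]) / 2
    # Low-frequency suffix minus high-frequency suffix, averaged over root frequency.
    suffix = (cells["HL"] + cells["LL"]) / 2 - (cells["HH"] + cells["LH"]) / 2
    # Difference between the suffix contrasts at low and at high root frequency.
    interaction = (cells["LL"] - cells["LH"]) - (cells["HL"] - cells["HH"])
    return root, suffix, interaction


print(main_effects(rt))
print(main_effects(err))
```

On these means the root contrast amounts to 39 ms and 8.6 percentage points, against 3 ms and 3.1 percentage points for the suffix contrast, which is the asymmetry the analyses above describe.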
While words with high-frequency root and low-frequency suffix (HL words) were as fast as words with both root and suffix of high frequency (HH words), words with a high-frequency suffix but a low-frequency root (LH words) differed neither from words in which both constituents were low-frequency (LL words) nor from words with no morphological constituency, i.e., nonderived (ND) words. From these results it could be argued that the major determinant of lexical decision performance to suffixed derived words is root frequency, with no role for the frequency of the suffix. In the General discussion, a processing account will be proposed that reconciles these results on words with the apparently contrasting results on suffixed pseudowords, including the results of Experiment 1, in which a strong role for suffix frequency was found.

6. General discussion

Three lexical decision experiments investigated how the frequency of morphemic constituents, namely roots and suffixes, affects the visual processing of low-frequency derived stimuli. In Experiment 1, pseudowords made up of Italian derivational suffixes of various frequencies and meaningless pseudoroots were contrasted with pseudowords in which the same pseudoroots were combined with control orthographically legal final sequences. These final sequences were matched to the suffixes for frequency and for other relevant variables, but were not suffixes themselves. There were three sets of suffixed/control pseudowords, differing in the frequency of the final sequences, either high, medium or low. Only pseudowords that included high-frequency suffixes showed interference, namely longer decision times and higher error rates, with respect to their matched nonsuffixed controls. Neither pseudowords including medium-frequency suffixes nor pseudowords with low-frequency suffixes differed from their respective controls. These results are in accordance with previous results by Burani et al.
(1997), and extend them to pseudoword contexts in which no root is present. The results provide further evidence that the activation of morphemic access units corresponding to suffixes is constrained, in visual tasks, by the quantifiable characteristics of the suffixes themselves (see also Laudanna and Burani 1995; Laudanna, Burani and Cermele 1994).

Experiments 2 and 3 explored whether the frequency of the root or the suffix affected lexical decision to low-frequency suffixed derived Italian words. Four sets of derived nouns and adjectives were contrasted in Experiment 2. All of the words in the four sets had a low frequency, but differed with respect to the frequency of their morphemic constituents, either high or low with orthogonal variation. The results showed that reaction times and accuracy to derived words were affected by the frequency of both roots and suffixes. Lexical decisions were fastest and most accurate when the derived words included two high-frequency constituents, slowest and least accurate when both constituents had low frequency, and intermediate in times and error rates when the derived words included only one high-frequency constituent, either the root or the suffix. No differential effects of root and suffix frequency were found. However, post-hoc controls showed that the effects of morphemic frequency on lexical decision performance were possibly confounded with effects of word familiarity, which differed among the four sets in a way that paralleled differences in the frequency of morphological constituents.

Four new sets of low-frequency derived words were tested in Experiment 3. The derived words differed in the frequency of the constituent roots and suffixes, which were orthogonally varied as in Experiment 2. However, the derived words were also matched for ratings of familiarity.
In this experiment, the sets of derived words were contrasted with a set of low-frequency nonderived words, matched to the derived words for all the relevant variables, including word familiarity. When word familiarity was controlled, lexical decisions to derived words were found to be a function of the frequency of the root only, irrespective of suffix frequency. Decision latencies to the two sets of derived words with high-frequency roots, irrespective of suffix frequency, were faster than to the two sets of derived words with low-frequency roots. Moreover, for items with low-frequency roots, suffix frequency had no effect. Finally, performance to words with low-frequency roots did not differ from performance to nonderived words, thus suggesting analogous processing for derived words with low-frequency roots and nonderived words.

Collectively, these findings reveal effects of suffix as well as root units in the processing of derived stimuli. They also indicate that results may be affected by word familiarity. We will first discuss the results on real words, and then show how they can be reconciled with the results on pseudoword processing, in which an effect of suffix frequency was found.

Experiment 3 showed that, when low-frequency derived words were equated for ratings of familiarity, only those that included a high-frequency root resulted in faster access relative to nonderived words of similar frequency/familiarity. Thus access through activation of morphemes is beneficial only for derived words with high-frequency roots. By contrast, suffixed derived words that did not include any high-frequency constituent, or only included a high-frequency suffix, did not show any processing advantage with respect to nonderived words of similarly low frequency.
Hence, lexical decision latencies to words with low-frequency roots, irrespective of suffix frequency, seem to be a function of surface (whole-word) frequency, and are analogous to words with no morphemic constituency. Thus the main result of Experiment 3 is that access to low-frequency derived words is not always obtained via morphological parsing. The full-form access route is also involved, and in some cases it determines response latency. How could this pattern of results be interpreted? In the framework of a parallel dual-route model, whole-word activation would start simultaneously and proceed with analogous time courses for all the five categories of words considered in Experiment 3 because of their similarly low surface frequency. However, different outcomes are expected for the different words, due to the different probability of morphemic access. That words with high-frequency roots and high-frequency suffixes (HH words) should result in faster and more accurate performance with respect to the other words is an expected finding, because the higher frequency of morphemes with respect to the frequency of the whole-word combination should affect positively both the speed and accuracy of access through morphemic activation. The result for derived words in which both the root and the suffix are low-frequency (LL words) is also expected, on the assumption that the slightly higher frequency of the constituents might not be large enough to result in advantageous morphemic processing with respect to whole-word processing. However, the asymmetrical results between words with high-frequency root and low-frequency suffix (HL words), and words with low-frequency root and high-frequency suffix (LH words) call for some refinement of a model in which whole-word activation and morphemic activation are assumed to occur in parallel. 
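The need for refinement can be made concrete with a toy race between the two routes. The following is a deliberately naive sketch, not the model of Schreuder and Baayen (1995): the latency function, its parameters and the combination cost are arbitrary assumptions, and the morphemic route is assumed to treat root and suffix symmetrically. The input frequencies are the set means from Table 4.

```python
import math


def latency(frequency, base=800.0, gain=40.0):
    """Toy finishing time in ms that decreases with log frequency (arbitrary parameters)."""
    return base - gain * math.log(frequency)


def symmetric_race(word_freq, root_freq, suffix_freq, combine_cost=50.0):
    """Return (winning time, winning route) for a whole-word vs. morphemic race."""
    whole_word = latency(word_freq)
    # Symmetric assumption: root and suffix contribute equally to the morphemic route.
    morphemic = (latency(root_freq) + latency(suffix_freq)) / 2 + combine_cost
    if whole_word <= morphemic:
        return whole_word, "whole-word"
    return morphemic, "morphemic"


# Mean (word, root, suffix) frequencies of the derived sets in Table 4.
sets = {
    "HH": (5.3, 547, 2119),
    "HL": (4.6, 554, 75),
    "LH": (3.1, 38, 1892),
    "LL": (3.1, 31, 68),
}
for name, freqs in sets.items():
    print(name, symmetric_race(*freqs))
```

Because the morphemic route in this sketch depends only on the two constituent frequencies taken together, swapping root and suffix frequencies leaves the outcome unchanged, and the LH set is not predicted to be slower than the HL set. The observed means (605 ms for HL against 641 ms for LH) therefore cannot come from a race of this symmetric kind, whatever the parameter values.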
Some suggestions come from studies that have investigated how, in silent sentence reading, eye fixations on compounds are affected by component morphemes. In these studies, it has been shown that words composed of eight letters or more, like our derived words, typically require more than one eye-fixation. When (at least) two eye-fixations are involved, the frequency of the first morphemic constituent strongly influences the duration of the first fixation on the target word (Hyönä and Pollatsek 1998). The frequency of the second morpheme influences gaze duration, but its influence is not immediate, and only affects relatively later processing (Pollatsek, Hyönä and Bertram 2000). Furthermore, second constituent effects may overlap in time with whole-word effects: The frequency of the whole word influences eye movements at least as early as the frequency of the second constituent, and can have an effect even before access of the second constituent (Pollatsek, Hyönä and Bertram 2000). In the framework suggested by these authors, the identification of morphologically complex words involves parallel processing of both morphological constituents and whole-word representations. The two morphological constituents each have an effect but differ in the time course. While the probability of whole-word processing would be the same for all the derived words with similar frequency/familiarity, a head start to access through morphological segmentation might be available only for words with a high-frequency first constituent (in our case, the root). A high-frequency second constituent (here, the suffix) would not result in a significant advantage with respect to whole-word access.
In a framework in which no different head start is assumed for morphemes occurring in different word positions, not even when they are of relevant length (Baayen and Schreuder 2000; Baayen, Schreuder and Sproat 2000), the present results could be interpreted as arising at stages following segmentation, in which the morphemes are combined and the result of composition is checked for licensing. The model proposed by Baayen, Schreuder and coworkers distinguishes, in the route for morphological processing, a number of successive processing stages. According to Baayen and Schreuder (1999) (see also Baayen and Schreuder 2000; Baayen, Schreuder and Sproat 2000), access representations are activated over time by the sensory input through the stages of perceptual identification and segmentation. Access representations are assigned resting activation levels that are proportional to their frequency of occurrence, with high-frequency morphemes reaching a pre-set activation threshold more quickly than low-frequency morphemes.⁸ Once an access representation for a root or an affix reaches the threshold activation level, it is copied to a short-term memory buffer. Once the representations provide full, non-overlapping spannings of the input, they are passed on to the following processing stages of licensing (the checking for subcategorisation compatibility), composition (the compositional computation of the meaning of the whole from its parts) and semantic activation of semantically related representations in long-term memory.

In this model, derived words with different morphological constituents should result in different processing times. The access representations of words whose constituents are both high-frequency (HH words) would be passed on to the processing stages of licensing, composition and semantic activation more quickly than the other words.
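The sequence of threshold crossing, buffering and full spanning described above can be illustrated with a small sketch. It is not the implementation of Baayen and Schreuder: the threshold, the growth rate, the use of log frequency for the resting level and all function names are assumptions, and the segmentation of pazzoide into pazz- and -oide is used only as an example (frequencies as listed for this item in Appendix C).

```python
import math


def ticks_to_threshold(frequency, threshold=10.0, rate=1.0):
    """Time steps needed to reach threshold from a frequency-based resting level (toy values)."""
    resting = math.log(frequency + 1)  # assumption: resting level grows with log frequency
    return max(0, math.ceil((threshold - resting) / rate))


def spans(input_word, buffer):
    """True if buffered strings can be concatenated, without overlap, to cover the input."""
    if input_word == "":
        return True
    return any(
        input_word.startswith(unit) and spans(input_word[len(unit):], buffer)
        for unit in buffer
        if unit
    )


def first_full_spanning(input_word, candidates):
    """candidates maps access representations to frequencies.

    Returns (tick, buffer contents) at the first moment the buffer spans the input,
    or None if no full spanning is ever available.
    """
    arrivals = sorted((ticks_to_threshold(freq), unit) for unit, freq in candidates.items())
    buffer = []
    for tick, unit in arrivals:
        buffer.append(unit)  # the unit has reached threshold and is copied to the buffer
        if spans(input_word, buffer):
            return tick, list(buffer)
    return None


print(first_full_spanning("pazzoide", {"pazzoide": 1, "pazz": 173, "oide": 12}))
```

In this sketch a full spanning is available as soon as the slower of the two constituents reaches threshold, so the time at which a parse can be handed on to licensing and composition is set by the less frequent morpheme, which is the point the following paragraph develops for HL and LH words.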
For words in which only one constituent is high-frequency (HL and LH words), the access representation corresponding to the high-frequency constituent is quickly copied to the short-term memory buffer. However, the full segmentation of the input would be delayed by the slow activation of the access representation corresponding to the low-frequency morpheme. At the segmentation stage, at least one frequent constituent should be easily identifiable for effective segmentation to occur, in that identification of one highly activated morpheme should leave the remainder as a prime candidate for a second constituent. When no morphemic access representation can be easily segmented, as in LL words in which both constituents are low-frequency, additional time-consuming procedures have to be called upon. Thus, there might be some extra segmentation cost involved for morphemic access to LL words, which would give an advantage to whole-word access.

At the segmentation stage, HH words would be favoured by morphological processing with respect to both HL words and LH words. However, at the composition and licensing stages, a main role for the root morpheme could be assumed. At these stages, for both HH and HL suffixed words, a highly activated root is available. In both cases, the root representation activates the set of derivational affixes that are compatible with it. The size of this set is similar for both HH and HL words, which have the same mean derivational family size (see Table 4). Thus little evidence from the orthographic input is needed in both cases, in order to activate the correct suffix among the set of pre-activated ones: Independently of its frequency, the presented suffix should be activated, combined, and the combination checked for licensing, among the same (small) number of possible suffixes. In the case of LH words, no highly activated root is available from the start.
The representation that is highly activated at the composition and licensing stages is the one corresponding to the suffix. At these stages, the representation corresponding to a high-numerosity suffix is compatible with a large number of representations corresponding to roots (as many as 240 root types on average, if calculated in the corpus; about 650 root types if calculated in the dictionary). Hence, a large amount of orthographic evidence is required from the input for the correct root to be activated, composed and finally licensed. But such evidence is not available when the stimulus includes a scarcely activated low-frequency root, as in the case of LH derived words. The process of composition and licensing would be long for LH words, thus allowing processing through the whole-word route to win the race. In summary, for both LH and LL words, for which activation via morphemes is so drastically slowed down in different processing components, whole-word processing would be faster in providing access to the lexicon, not differently from nonderived (ND) words, for which whole-word processing is the only possible route.

The results on suffixed pseudowords that were obtained in Experiment 1 can now be reconciled with results on suffixed words. In Experiment 1, pseudowords made up of a pseudoroot and a high-frequency/numerosity suffix resulted in longer reaction times and lower accuracy with respect to matched pseudowords in which the same pseudoroots were combined with pseudosuffixes. Pseudowords made up of a pseudoroot and a high-frequency suffix are in a way not that different from LH words, which include a low-frequency root and a high-frequency suffix. Therefore, analogously to suffixed LH words, pseudowords of this sort are subject to morphemic segmentation because of the very high frequency/numerosity of one constituent, namely the suffix.
After reaching the threshold activation level and being copied to the short-term memory buffer, the access representation corresponding to the suffix cannot be passed on to the subsequent stages of composition and licensing because no full spanning of the input is provided. However, the very same attempt at morphemic segmentation causes delay or interference in the process of nonword decision, thus resulting in slower decision times and higher error rates. By contrast, neither delay nor interference would occur in the case of pseudowords made up of a pseudoroot and a medium/low-frequency suffix, because here no morphemic constituent would be frequent enough to trigger morphemic segmentation (at least not within the deadline for lexical decision).

The results obtained by Burani et al. (1997) can be given a similar account. The pseudowords investigated by Burani et al. (1997) were made up of a real root of medium/low frequency and a suffix that could be either high- or low-frequency. Pseudowords with medium-frequency root and high-frequency suffixes activate two access representations that are both likely to be passed on to the composition and licensing stages. Thus they were subject to (morpho-)lexical interference in nonword decision. However, interference was not found in lexical decision to either pseudowords made up of a (medium-frequency) root and a low-frequency suffix, or to pseudowords made up of a (medium-frequency) root and a pseudosuffix. In the latter cases no morphemic constituent was frequent enough to trigger morphemic segmentation that would leave the remainder as a prime candidate for a second constituent.
In summary, the results reported in the present study on both suffixed derived words and suffixed pseudowords, together with previous results on morphological pseudowords, point to the conclusion that the assumption of a higher probability of morphemic parsing for low-frequency derived words (Burani and Laudanna 1992; Frauenfelder and Schreuder 1992) needs to be further specified. Not all low-frequency derived words may be equally likely to be accessed via morphological parsing. In formulating predictions for access through morphemic decomposition, the complex balancing between root and affix properties, including quantitative properties like frequency and numerosity, is to be taken into account.

The complex balancing of root and suffix quantitative properties should be relevant for expecting costs of morphological parsing, as well as benefits. In the studies by Bertram, Laine and Karvinen (1999) and by Bertram, Schreuder and Baayen (2000), conducted on Finnish morphologically complex words, it has been proposed that some derived words (those with unproductive or homonymous suffixes) are accessed on the basis of storage, and thus are as quick as monomorphemic Finnish words. This prediction finds confirmation in the lexical decisions to our derived words with low-frequency root and low-frequency suffix (LL words), which did not differ from nonderived (ND) words. However, this same prediction would be too simplistic in the case of derived words with a low-frequency/numerosity suffix and a high-frequency root (HL words) on the one hand, and for derived words with a high-frequency/numerosity suffix but a low-frequency root (LH words) on the other hand. Apparently, neither Bertram, Laine and Karvinen (1999), nor Bertram, Schreuder and Baayen (2000) specifically investigated the differences between these cases. The present results on Italian suffixed derived words provide some evidence for the processing of derived words of this sort.
Appendices

Appendix A

Experiment 1. Suffixed pseudowords and their controls with mean RT and % Error.

HF = Pseudowords with high-frequency final sequence

Suffixed                   Control
ITEM         RT    %E      ITEM         RT    %E
prucezza     723   13.6    prucondo     638   0
matirezza    757   9.1     matirondo    699   13.6
accogezza    748   9.1     accogondo    759   18.2
sillerezza   734   13.6    sillerondo   709   4.5
feldismo     768   22.7    feldanca     624   0
rovollismo   714   4.5     rovollanca   683   0
rachenismo   718   13.6    rachenanca   704   13.6
cabilismo    771   22.7    cabilanca    806   9.1
cempenista   788   22.7    cempenosto   693   0
livonista    754   18.2    livonosto    662   4.5
ascobista    744   9.1     ascobosto    711   0
pirgista     655   4.5     pirgosto     718   9.1
MEAN         739   13.6    MEAN         700   6.1

MF = Pseudowords with medium-frequency final sequence

Suffixed                   Control
ITEM         RT    %E      ITEM         RT    %E
lengardo     710   9.1     lengerta     660   0
elbirardo    725   0       elbirerta    721   4.5
fozzardo     681   0       fozzerta     747   9.1
mevinardo    674   9.1     mevinerta    737   22.7
varbesco     757   0       varbanna     649   13.6
orcittesco   742   0       orcittanna   683   4.5
sicoresco    728   9.1     sicoranna    748   4.5
dolmibesco   664   0       dolmibanna   707   9.1
stemigno     719   13.6    stemusso     702   0
trudigno     686   0       trudusso     676   9.1
seltigno     622   4.5     seltusso     684   4.5
MEAN         701   4.1     MEAN         701   7.4

LF = Pseudowords with low-frequency final sequence

Suffixed                   Control
ITEM         RT    %E      ITEM         RT    %E
gurnense     713   4.5     gurnombe     653   4.5
adricense    685   0       adricombe    634   4.5
tirfense     633   0       tirfombe     686   0
enovense     614   0       enovombe     660   9.1
siramigia    735   13.6    siramegio    687   0
amittigia    718   13.6    amittegio    678   4.5
furnigia     662   0       furnegio     684   9.1
davelligia   691   0       davellegio   765   4.5
parcinoide   739   4.5     parcinaudo   647   0
balmoide     688   4.5     balmaudo     701   4.5
taripoide    642   9.1     taripaudo    672   4.5
cettoide     643   4.5     cettaudo     676   4.5
MEAN         680   4.6     MEAN         679   4.2

Appendix B

Experiment 2. Experimental items with mean RT and % Error.
HH = Words with high-frequency root and high-frequency suffix
HL = Words with high-frequency root and low-frequency suffix

HH                                                HL
WORD         Root    Suffix  Word   RT    %E      WORD         Root    Suffix  Word   RT    %E
             Freq.   Freq.   Freq.                             Freq.   Freq.   Freq.
bassezza     414     1,557   1      665   0       umanoide     324     12      1      683   17.2
giornalismo  307     639     9      588   0       pazzoide     175     12      1      631   2.9
centrismo    501     639     2      652   5.7     ferrigno     369     55      2      739   14.3
acquario     1,120   1,177   3      540   0       testardo     698     60      4      548   0
nudista      253     984     1      613   0       donnesco     1,556   90      2      658   11.4
macchinista  305     984     8      600   0       guerresco    630     90      1      623   8.6
amatore      603     2,074   3      563   8.6     levatoio     927     102     5      606   0
tiratore     524     2,074   1      601   2.9     spogliatoio  108     102     1      590   0
goloso       117     1,335   2      539   2.9     violaceo     126     13      5      677   14.3
dentale      250     888     1      622   5.7     pacifico     475     53      10     547   2.9
ottimale     125     888     1      573   0       frutteto     335     46      9      622   0
credibile    1,485   262     5      603   2.9     roseto       363     46      1      567   0
MEAN         501     1,859   3.1    597   2.5     MEAN         507     57      3.5    624   5.9

Appendix C

Experiment 3. Experimental items with mean RT and % Error.
HH = Words with high-frequency root and high-frequency suffix
HL = Words with high-frequency root and low-frequency suffix

HH                                                HL
WORD         Root    Suffix  Word   RT    %E      WORD         Root    Suffix  Word   RT    %E
             Freq.   Freq.   Freq.                             Freq.   Freq.   Freq.
liberismo    877     639     1      661   14.9    donnesco     1,556   90      2      651   27.7
carnoso      329     1,335   4      619   0       campestre    646     59      5      604   4.3
godibile     187     956     3      620   19.1    guerresco    630     90      1      603   2.1
scenario     244     1,177   13     559   4.3     muraglia     510     184     17     580   6.4
numerale     475     4,942   1      642   6.4     sanguigno    540     55      10     575   0
consumismo   160     639     2      628   0       pacifico     475     53      10     549   0
temibile     223     956     6      600   6.4     spogliatoio  108     102     1      585   0
pensatore    2,204   2,074   1      613   0       parlatoio    2,075   102     2      687   12.8
ordinanza    773     1,485   8      568   0       bambinesco   693     90      1      610   6.4
erboso       236     1,335   6      605   4.3     onorifico    215     53      1      664   17.0
piccolezza   937     1,557   5      600   0       pazzoide     173     12      1      620   21.3
serale       1,266   4,942   11     604   12.8    romanzesco   109     90      1      606   2.1
velocista    150     984     3      575   0       testardo     698     60      4      571   0
bruttezza    228     1,557   3      605   4.3     alpestre     187     59      5      651   34.0
bestiale     233     4,942   8      573   0       canneto      204     46      13     597   8.5
grossezza    391     1,557   2      610   17      roseto       363     46      1      576   4.3
navale       394     4,942   13     573   2.1     poliziesco   238     90      3      549   0
MEAN         547     2,119   5.29   603   5.4     MEAN         554     75.3    4.6    605   8.6

Experiment 3. Experimental items with mean RT and % Error.
LH = Words with low-frequency root and high-frequency suffix
LL = Words with low-frequency root and low-frequency suffix

LH                                                LL
WORD         Root    Suffix  Word   RT    %E      WORD         Root    Suffix  Word   RT    %E
             Freq.   Freq.   Freq.                             Freq.   Freq.   Freq.
limatura     22      1,482   2      696   31.9    luridume     10      23      2      679   42.5
astrale      68      4,942   3      616   10.6    zingaresco   33      90      2      636   2.1
flautista    26      984     3      677   23.4    bugiardo     77      60      13     535   2.1
maestoso     25      1,335   10     611   4.3     serpentesco  66      90      1      730   14.9
igienista    30      984     1      635   14.9    beffardo     49      60      5      608   12.8
gaiezza      33      1,557   2      660   14.9    viscidume    8       23      2      743   40.4
rozzezza     33      1,557   2      653   14.9    orinatoio    10      102     1      702   31.9
rugoso       21      1,335   3      610   2.1     agrumeto     22      46      1      660   23.4
intuibile    32      956     3      692   40.4    cupidigia    20      7       6      645   14.9
rudezza      33      1,557   2      607   2.1     brodaglia    44      184     1      700   46.8
doganale     22      4,942   1      660   4.3     gigantesco   72      90      10     576   0
schedario    31      1,177   8      620   6.4     lerciume     11      23      2      690   36.2
punibile     69      956     1      600   14.9    fiabesco     10      90      1      564   2.1
lagnanza     45      1,485   1      641   27.7    serbatoio    29      102     1      576   2.1
lanoso       69      1,335   2      659   19.1    burlesco     12      90      1      608   6.4
fenomenale   54      4,942   1      651   8.5     sudiciume    22      23      2      629   0
pessimismo   35      639     7      602   0       prolifico    28      53      1      679   12.8
MEAN         38.1    1,892   3.1    641   14.1    MEAN         31      68      3.1    645   17.1

Experiment 3. Experimental items with mean RT and % Error.
ND = Nonderived words

ND
WORD         Root    Word    RT    %E
             Freq.   Freq.
avezzo       18      6       670   12.8
caparbio     9       2       684   14.9
damigiana    6       2       692   31.9
sbilenco     8       4       702   31.9
cospicuo     12      4       651   19.1
giaguaro     2       1       661   12.8
aragosta     8       2       602   8.5
scirocco     12      8       561   2.1
farabutto    8       4       631   14.9
inerme       9       7       682   25.5
ascesso      1       1       649   6.4
collasso     3       3       559   0
scarlatto    35      15      668   17.0
laringe      3       2       668   25.5
calunnia     14      7       605   2.1
rimorchio    9       4       630   4.3
trapezio     4       3       570   2.1
MEAN         9.5     4.4     640   13.6

Notes

* We are indebted to Emanuela Rellini for running Experiment 2, and to Alberto Spuntarelli for his precious contribution to the various phases of Experiment 3. We also thank Harald Baayen, Rob Schreuder, Alessandro Laudanna, Daniela Traficante, Lisa S. Arduino, Francesca M. Dovetto and Sergio Carlomagno for valuable discussions.
Finally, we thank the referees, Laurie Feldman, Jen Hay and Harald Baayen, whose revisions and comments helped to improve the paper.

1. An exception is Bradley's (1979) study, in which words involving different suffixes with different effects on the orthographic/phonological characteristics of the stem were contrasted (see also Tsapkini, Kehayia and Jarema 1998; 1999; Vannest and Boland 1999).

2. For instance, the three Italian suffixes -ista, -ezza and -oide could be compared. While the first two have high frequency and high numerosity in a corpus (frequency: 984 and 1,557/1,500,000, respectively; numerosity: 145 and 187, respectively), they differ in productivity, if we take as indicator of productivity the number of words containing these suffixes found in five Italian dictionaries of neologisms. While -ista appears in as many as 417 neologisms, -ezza appears in only 3 neologisms. These data can be further compared with those concerning the suffix -oide, which has a very low frequency and low numerosity in the corpus (frequency: 12/1,500,000; numerosity: 10), but appears in 11 neologisms, i.e., in about four times as many neologisms as the very frequent suffix -ezza.

3. If the higher probability of storage (or whole-word access) vs. computation (or morphemic access) is determined by high whole-word frequency, inflected high-frequency words should also be likely to be accessed as whole forms. Recent evidence for storage of high-frequency regularly inflected English words comes from Alegre and Gordon (1999a). Related evidence can be found in Baayen, Dijkstra and Schreuder (1997), and in Baayen, Burani and Schreuder (1997), for Dutch and Italian regularly inflected words, respectively.

4. Although nonderived words did not include any derivational affix, they included, like the derived words and like almost any Italian word, an inflectional suffix for number and gender. For this reason we name them "nonderived", not "monomorphemic".

5. We selected nonderived words with very low cumulative root frequency, in such a way that the cumulative root frequency was as close as possible to surface frequency. Any contribution of the frequency of the inflection to morphemic access would be the same for all the five word sets.

6. The total number of selected items was not as large as we had aimed at. It turned out that in trying to match items for all the variables, we had exploited all the possible words in our corpus. Specifically, derived words in the sets with low-frequency suffixes (HL and LL), that were already very few in the corpus, had to be further reduced in number in order to be matched with the other words. Many LL words were excluded because they had lower familiarity with respect to words in the other sets. Additionally, the inclusion of nonderived words made it even more difficult to match words across sets, because of the tendency of nonderived words to be shorter than derived words (see Campos 1993).

7. Interestingly, matching derived words for rated familiarity left the robust root effect unchanged, while suffix frequency effects were completely washed out. This suggests that familiarity rating is a task that is much more sensitive to properties of the suffix than to properties of the root. Recent evidence from our laboratory (Burani, Bimonte and Barca, in preparation) has shown that, on a larger sample of low-frequency suffixed derived words (N = 122), familiarity ratings had a significant correlation with (log) suffix frequency (r = .43, p < .0001), while they had no correlation with (log) root frequency (r = -.06). Further research is required in order to investigate why this should be the case.

8.
8. This "segmentation-through-recognition" approach should avoid the objections raised by Andrews and Davis (1999) to the interactive activation accounts of morphological decomposition, by assuming that "The activation weights of access representations are increased only for constituents that are aligned with the left or right edge of the word, or that are aligned with access representations that have reached threshold and that themselves are edge-aligned, either with the word edge itself or with another edge-aligned constituent in the short-term-memory buffer." (Baayen and Schreuder 2000: 5).
References
Alegre, Maria, and Peter Gordon 1999a Frequency effects and the representational status of regular inflections. Journal of Memory and Language 40: 41–61.
Alegre, Maria, and Peter Gordon 1999b Rule-based versus associative processes in derivational morphology. Brain and Language 68: 347–354.
Andrews, Sally 1986 Morphological influences on lexical access: Lexical or nonlexical effects? Journal of Memory and Language 25: 726–740.
Andrews, Sally, and Colin Davis 1999 Interactive activation accounts of morphological decomposition: Finding the trap in the Mousetrap? Brain and Language 68: 355–361.
Baayen, R. Harald 1989 A corpus-based approach to morphological productivity. Statistical analysis and psycholinguistic interpretation. Ph.D dissertation, Department of Linguistics, Vrije Universiteit, Amsterdam.
Baayen, R. Harald 1992 Quantitative aspects of morphological productivity. In Yearbook of morphology 1991, Geert E. Booij and Jaap van Marle (eds.), 109–149. Dordrecht: Kluwer.
Baayen, R. Harald, Cristina Burani, and Robert Schreuder 1997 Effects of semantic markedness in the processing of regular nominal singulars and plurals in Italian. In Yearbook of morphology 1996, Geert E. Booij and Jaap van Marle (eds.), 13–33. Dordrecht: Kluwer.
Baayen, R. Harald, Ton Dijkstra, and Robert Schreuder 1997 Singulars and plurals in Dutch: Evidence for a parallel dual-route model. Journal of Memory and Language 37: 94–117.
Baayen, R. Harald, and Anneke Neijt 1997 Productivity in context: a case study of a Dutch suffix. Linguistics 35: 565–587.
Baayen, R. Harald, and Robert Schreuder 1999 War and peace: Morphemes and full forms in a noninteractive activation parallel dual-route model. Brain and Language 68: 27–32.
Baayen, R. Harald, and Robert Schreuder 2000 Towards a psycholinguistic computational model for morphological parsing. Philosophical Transactions of The Royal Society (Series A: Mathematical, Physical and Engineering Sciences) 358: 1–13.
Baayen, R. Harald, Robert Schreuder, and Richard Sproat 2000 Morphology in the mental lexicon: A computational model for visual word recognition. In Lexicon development for speech and language processing, Frank van Eynde and Dafydd Gibbon (eds.), 267–291. Dordrecht: Kluwer.
Beauvillain, Cécile 1996 The integration of morphological and whole-word form information during eye-fixations on prefixed and suffixed words. Journal of Memory and Language 35: 801–820.
Bentin, Shlomo, and Laurie B. Feldman 1990 The contribution of morphological and semantic relatedness to the repetition effect at long and short lags: Evidence from Hebrew. Quarterly Journal of Experimental Psychology 42(A): 693–711.
Bertram, Raymond, R. Harald Baayen, and Robert Schreuder 2000 Effects of family size for derived and inflected words. Journal of Memory and Language 42: 390–405.
Bertram, Raymond, Matti Laine, and Katja Karvinen 1999 The interplay of word formation type, affixal homonymy, and productivity in lexical processing: Evidence from a morphologically rich language. Journal of Psycholinguistic Research 28: 213–226.
Bertram, Raymond, Robert Schreuder, and R. Harald Baayen 2000 The balance of storage and computation in morphological processing: The role of word formation type, affixal homonymy, and productivity. Journal of Experimental Psychology: Learning, Memory and Cognition 26: 489–511.
Bradley, Dianne C. 1979 Lexical representation of derivational relation. In Juncture, Mark Aronoff and Marie Louise Kean (eds.), 37–55. Cambridge, MA: MIT Press.
Burani, Cristina, Daniela Bimonte, and Laura Barca in prep. Knowledge of morphology as an aid to word comprehension and vocabulary learning.
Burani, Cristina, and Alfonso Caramazza 1987 Representation and processing of derived words. Language and Cognitive Processes 2: 217–227.
Burani, Cristina, Francesca M. Dovetto, Alberto Spuntarelli, and Anna M. Thornton 1999 Morpho-lexical naming of new root-suffix combinations: The role of semantic interpretability. Brain and Language 68: 333–339.
Burani, Cristina, Francesca M. Dovetto, Anna M. Thornton, and Alessandro Laudanna 1997 Accessing and naming suffixed pseudo-words. In Yearbook of morphology 1996, Geert E. Booij and Jaap van Marle (eds.), 55–72. Dordrecht: Kluwer.
Burani, Cristina, and Alessandro Laudanna 1992 Units of representation of derived words in the lexicon. In Orthography, phonology, morphology, and meaning, Ram Frost and Leonard Katz (eds.), 361–376. Amsterdam: North-Holland.
Burani, Cristina, Anna M. Thornton, Claudio Iacobini, and Alessandro Laudanna 1995 Investigating morphological non-words. In Crossdisciplinary approaches to morphology, Wolfgang U. Dressler and Cristina Burani (eds.), 37–53. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Bybee, Joan 1995a Regular morphology and the lexicon. Language and Cognitive Processes 10: 425–455.
Bybee, Joan 1995b Diachronic and typological properties of morphology and their implications for representation. In Morphological aspects of language processing, Laurie B. Feldman (ed.), 225–246. Hove: Erlbaum.
Campos, Alfredo 1993 Simple and derived words: Influence on other values of words. Perceptual and Motor Skills 77: 1193–1194.
Caramazza, Alfonso, Alessandro Laudanna, and Cristina Romani 1988 Lexical access and inflectional morphology. Cognition 28: 297–332.
Chialant, Doriana, and Alfonso Caramazza 1995 Where is morphology and how is it processed? The case of written word recognition. In Morphological aspects of language processing, Laurie B. Feldman (ed.), 55–76. Hove: Erlbaum.
Colé, Pascale, Cécile Beauvillain, and Juan Segui 1989 On the representation and processing of prefixed and suffixed derived words: A differential frequency effect. Journal of Memory and Language 28: 1–13.
Coltheart, Max, Eileen Davelaar, Jon T. Jonasson, and Derek Besner 1977 Access to the internal lexicon. In Attention and performance II, Stanislav Dornič (ed.), 335–355. New York: Academic Press.
Connine, Cynthia M., John Mullennix, Eve Shernoff, and Jennifer Yelen 1990 Word familiarity and frequency in visual and auditory word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition 16: 1084–1096.
Coolen, Riet, Henk J. van Jaarsveld, and Robert Schreuder 1991 The interpretation of isolated novel nominal compounds. Memory and Cognition 19: 341–352.
de Jong, Nivja H., Robert Schreuder, and R. Harald Baayen 2000 The morphological family size effect and morphology. Language and Cognitive Processes 15: 329–365.
Feldman, Laurie B., and Emily G. Soltano 1999 Morphological priming: The role of prime duration, semantic transparency, and affix position. Brain and Language 68: 33–39.
Frauenfelder, Uli H., and Robert Schreuder 1992 Constraining psycholinguistic models of morphological processing and representation: the role of productivity. In Yearbook of morphology 1991, Geert E. Booij and Jaap van Marle (eds.), 165–183. Dordrecht: Foris.
Gernsbacher, Morton Ann 1984 Resolving 20 years of inconsistent interactions between lexical familiarity and orthography, concreteness, and polysemy. Journal of Experimental Psychology: General 113: 256–281.
Hagiwara, Hiroko, Yoko Sugioka, Takane Ito, Mitsuru Kawamura, and Junichi Shiota 1999 Neurolinguistic evidence for rule-based nominal suffixation. Language 75: 739–763.
Hay, Jennifer 2000 Causes and consequences of word structure. Ph.D diss., Field of Linguistics, Northwestern University.
Hay, Jennifer 2001 Lexical frequency in morphology: is everything relative? Linguistics 39: 1041–1070.
Hay, Jennifer, and R. Harald Baayen 2002 Parsing and productivity. In Yearbook of morphology 2001, Geert E. Booij and Jaap van Marle (eds.). Dordrecht: Kluwer.
Holmes, Virginia M., and J. Kevin O’Regan 1992 Reading derivationally affixed French words. Language and Cognitive Processes 7: 163–192.
Hyönä, Jukka, and Alexander Pollatsek 1998 Reading Finnish compound words: Eye fixations are affected by component morphemes. Journal of Experimental Psychology: Human Perception and Performance 24: 1612–1627.
Istituto di Linguistica Computazionale CNR 1989 Corpus di italiano contemporaneo. Unpublished manuscript. Pisa.
Jarvella, Robert J., and Ola Wennstedt 1993 Recognition of partial regularity in words and sentences. Scandinavian Journal of Psychology 34: 76–85.
Laine, Matti 1996 Lexical status of inflectional and derivational suffixes: Evidence from Finnish. Scandinavian Journal of Psychology 37: 238–248.
Laudanna, Alessandro, and Cristina Burani 1995 Distributional properties of derivational affixes: implications for processing. In Morphological aspects of language processing, Laurie B. Feldman (ed.), 345–364. Hove: Erlbaum.
Laudanna, Alessandro, Cristina Burani, and Antonella Cermele 1994 Prefixes as processing units. Language and Cognitive Processes 9: 295–316.
Lima, Susan D., and Alexander Pollatsek 1983 Lexical access via an orthographic code? The basic orthographic syllabic structure (BOSS) reconsidered. Journal of Verbal Learning and Verbal Behavior 22: 310–332.
Marslen-Wilson, William, Lorraine K. Tyler, Rachelle Waksler, and Lianne Older 1994 Morphology and meaning in the mental lexicon. Psychological Review 101: 3–33.
Meunier, Fanny, and Juan Segui 1999 Morphological priming effect: The role of surface frequency. Brain and Language 68: 54–60.
Peperkamp, Sharon 1995 Prosodic constraints in the derivational morphology of Italian. In Yearbook of morphology 1994, Geert E. Booij and Jaap van Marle (eds.), 207–244. Dordrecht: Foris.
Plaut, David C., and Laura M. Gonnerman 2000 Are non-semantic morphological effects incompatible with a distributed connectionist approach to lexical processing? Language and Cognitive Processes 15: 445–485.
Pollatsek, Alexander, Jukka Hyönä, and Raymond Bertram 2000 The role of morphological constituents in reading Finnish compound words. Journal of Experimental Psychology: Human Perception and Performance 26: 820–833.
Ratti, Daniela, Lucia Marconi, Giovanna Morgavi, and Claudia Rolando 1988 Flessioni, rime e anagrammi. Bologna: Zanichelli.
Schreuder, Robert, and R. Harald Baayen 1995 Modeling morphological processing. In Morphological aspects of language processing, Laurie B. Feldman (ed.), 131–154. Hove: Erlbaum.
Schreuder, Robert, and R. Harald Baayen 1997 How complex simple words can be. Journal of Memory and Language 36: 118–139.
Schreuder, Robert, Cristina Burani, and R. Harald Baayen 2002 Parsing and semantic opacity. In Reading complex words, Egbert Assink and Dominiek Sandra (eds.), 159–189. Dordrecht: Kluwer.
Stolz, Jennifer A., and Laurie B. Feldman 1995 The role of orthographic and semantic transparency of the base morpheme in morphological processing. In Morphological aspects of language processing, Laurie B. Feldman (ed.), 109–129. Hove: Erlbaum.
Taft, Marcus 1994 Interactive-activation as a framework for understanding morphological processing. Language and Cognitive Processes 9: 271–294.
Taft, Marcus, and Ken Forster 1975 Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior 14: 637–647.
Taft, Marcus, and Ken Forster 1976 Lexical storage and retrieval of polymorphemic and polysyllabic words. Journal of Verbal Learning and Verbal Behavior 15: 607–620.
Thornton, Anna M., Claudio Iacobini, and Cristina Burani 1994 BDVDB. Una base di dati sul Vocabolario di Base della lingua italiana. Roma: Istituto di Psicologia del CNR.
Thornton, Anna M., Claudio Iacobini, and Cristina Burani 1997 BDVDB. Una base di dati sul Vocabolario di Base della lingua italiana. Con un intervento di Tullio De Mauro. 2nd edition. Roma: Bulzoni.
Tsapkini, Kyrana, Eva Kehayia, and Gonia Jarema 1998 The psycholinguistic reality of morphophonological changes during derivation. Brain and Cognition 37: 166–168.
Tsapkini, Kyrana, Eva Kehayia, and Gonia Jarema 1999 Phonological change in derivation: A psycholinguistic study. Brain and Language 68: 318–323.
van Jaarsveld, Henk J., Riet Coolen, and Robert Schreuder 1994 The role of analogy in the interpretation of novel compounds. Journal of Psycholinguistic Research 23: 111–137.
Vannest, Jennifer, and Julie E. Boland 1999 Lexical morphology and lexical access. Brain and Language 68: 324–332.