Structures behind grammaticalization<1>

Elly van Gelderen
Centre for Advanced Study, Oslo, and Arizona State University
August 2005 version

There are many linguistic changes where words lose meaning and gain grammatical function. This grammaticalization often involves a full phrase becoming one word, or a verb becoming an auxiliary. The current paper provides a characterization of structural ways to examine grammaticalization. Within a Minimalist framework, it uses the Head Preference and Late Merge Principles for this purpose. Thus, it assumes that change is cyclical and provides grammar-internal reasons for this. It also uses Feature Grammaticalization to account for cross-linguistic similarities.

1. Introduction

Grammaticalization involves the loss of semantic and phonological information and the increase of grammatical function (see e.g. Heine & Reh 1984; Traugott & Heine 1991). Well-known examples include verbs changing to auxiliaries and prepositions to complementizers. Many of the accounts of the last 20 years are functional, but recently there have been attempts to account for grammaticalization in a formal, structural way. Thus, Roberts & Roussou (2003) and van Gelderen (2004) discuss changes of lexical heads to functional heads (e.g. verbs to modals) in structural terms, and van Gelderen (2004) adds the change from specifier to head (e.g. demonstratives to articles). Simpson & Wu (2002) discuss specifier to higher specifier changes (e.g. negative DPs to negatives). In this paper, I’ll show that several kinds of grammaticalization processes can be given a formal account and add examples to those given in the literature. I will also focus on why, cross-linguistically, certain lexical words develop into certain grammatical categories and will call this phenomenon Feature Grammaticalization.

It has long been recognized that languages change from synthetic to analytic and back again (e.g. Bopp 1868). This fact has also been denied, e.g.
by Jespersen (1922) and more recently by Norde (1997). Hodge (1970) calls this cyclical phenomenon the ‘Linguistic Cycle’. In this paper, I show what some of the structural reasons behind the cycle are, recognizing the difficulties in using the terms synthetic and analytic. I will formulate a way to characterize analytic and synthetic in a Minimalist framework.

The outline is as follows. In section 2, I’ll first provide some background on Minimalist phrase structure and the two Economy Principles compatible with this framework. In section 3, I provide a few examples of the change from phrase to head, phrase to higher phrase, and head to higher head and discuss their relationship. Under this view, grammaticalization is uni-directional, brought about by structural factors. In section 4, I examine whether these principles are relevant to concepts such as synthetic and analytic. The data used come from the Helsinki Corpus, the Old English Dictionary Project, and Middle English electronic texts from the Oxford Text Archive.

2 Some structures

Within the generative tradition (e.g. Chomsky 1986), syntactic structures are built up using general rules, such as that each phrase consists of a head (X in (1)), a complement (ZP in (1)), and a specifier (YP in (1)):

(1) [XP YP [X' X ZP]]

In early work, this schema is quite strict, e.g. specifiers and complements are always full phrases. This changes with the introduction of (minimalist) bare phrase structure in the 1990s (Chomsky 1995). A verb and a pronoun object can merge with each other, as in (2), while one of the two heads projects, in this case V:

(2) [VP [V see] [D it]]

Phrase structures are built using merge and move. `Merge' combines two items, e.g. see and it, of which one projects into a higher level and transmits its categorial features. The VP domain is seen as the thematic layer, i.e. where theta-roles are determined.
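The operation `Merge' in (2) can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not part of the Minimalist formalism itself: the class SyntacticObject, the function merge, and the convention of printing a projected label with a final "P" are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SyntacticObject:
    """A lexical item or the result of merging two objects (hypothetical model)."""
    label: str                                   # categorial feature, e.g. "V" or "D"
    word: Optional[str] = None                   # only lexical items carry a word
    parts: Tuple["SyntacticObject", ...] = ()    # the two merged daughters, if any

    def is_head(self) -> bool:
        return self.word is not None

    def bracket(self) -> str:
        """Render as labelled brackets; a projection of X is shown as XP."""
        if self.is_head():
            return f"[{self.label} {self.word}]"
        inner = " ".join(part.bracket() for part in self.parts)
        return f"[{self.label}P {inner}]"


def merge(a: SyntacticObject, b: SyntacticObject,
          projecting: SyntacticObject) -> SyntacticObject:
    """Combine two items; the projecting one transmits its categorial label."""
    if projecting is not a and projecting is not b:
        raise ValueError("the projecting item must be one of the two merged items")
    return SyntacticObject(label=projecting.label, parts=(a, b))


# Example (2): the verb "see" and the pronoun "it" merge, and V projects.
see = SyntacticObject("V", "see")
it = SyntacticObject("D", "it")
vp = merge(see, it, projecting=see)
print(vp.bracket())  # [VP [V see] [D it]]
```

The sketch deliberately leaves out intermediate projections (X'), move, and feature checking; it only shows that merge takes two objects and that one of them determines the label of the result.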
After functional categories such as I and C are merged to VP, ‘agree’ ensures that features in IP and CP find a noun or verb with matching (active) features to check agreement and Case. Movement to the specifier of IP occurs in those languages that have EPP features, but head movement may be seen as part of the PF component.

Using general Minimalist principles, one can argue that checking between two heads, also referred to as incorporation, is more economical than between a specifier and a head. This is formulated in van Gelderen (2004: 11) as (3):

(3) Head Preference Principle: Be a head, rather than a phrase.

Principle (3) holds for merge (projection) as well as move (checking). The preferred structures are (4a) and (4b) rather than (4c), where FP stands for any functional category and where, for instance, a pronoun is merged in the head position in (4a), moved to it in (4b), but occupies the specifier position in (4c):

(4) a. [FP . [F' [F pro] ...]]     (pro merged in F)
    b. [FP . [F' [F pro] ...]]     (pro moved to F)
    c. [FP pro [F' F ...]]         (pro in the specifier)

As I show below, the Head Preference Principle is relevant to a number of historical changes: whenever possible, a word is seen as a head rather than a phrase. In this way, pronouns change from emphatic full phrases to clitic pronouns to agreement markers and negatives from full DPs to negative adverb phrases to heads.

Within recent Minimalism, there is a second economy principle (see e.g. Chomsky 1995: 348). Merge, as in (2) above, "comes `free' in that it is required in some form for any recursive system" (Chomsky 2001: 3) and is "inescapable" (Chomsky 1995: 316; 378). This means that merge is more economical than (merge and) move. Thus, it is less economical to merge early and then move than to wait as long as possible before merging. In van Gelderen (2004: 12), this is formulated as (5):

(5) Late Merge Principle: Merge as late as possible.

Chomsky (2001: 7-8) reformulates the notions of merge and move as external and internal merge respectively.
"Argument structure is associated with external merge (base structure); everything else with internal merge (derived structure)" (p. 8). The latter leaves a copy in place, but is otherwise similar to merge. In this system, internal and external merge are variants of each other. I will argue that internal merge (i.e. earlier move) is still less economical since there is an additional copy in the derivation. For convenience, I will continue to use the term move rather than internal merge.

How does Late Merge account for language change? If non-theta-marked elements can wait to merge outside the VP (Chomsky 1995: 314-5), through external merge, they will do so. I will therefore argue that if, for instance, a preposition is less relevant to the argument structure (e.g. to, for, and of in ModE), it will tend to merge higher (in IP or CP) rather than merge early (in VP) and then move. Why certain words are more appropriate than others will be seen as due to Feature Grammaticalization, relevant to both principles (3) and (5). Like (3), Late Merge is argued to be a motivating force of linguistic change, accounting for the change from specifier to higher specifier and head to higher head.

3 Examples of change due to Economy

In this section, it is shown how (3) and (5) account for changes traditionally referred to as grammaticalization. The change from specifier to (higher) specifier follows from (5), that from specifier to head from (3), and that from head to (higher) head again from (5).

3.1 Specifier to Specifier

Without using Late Merge, Simpson & Wu (2002: 291 ff.) analyze a change in negation in the history of French as in (6). Initially, the negative ne selects a Focus projection below the NegP but above the VP. The negative element pas moves to the specifier of FocP, as in (6a). This object then becomes base generated in the specifier of FocP, as in (6b), and subsequently in Spec NegP, as in (6c):

(6) a. [NegP [Neg ne] [FocP [Spec pas_i] [Foc' Foc [VP V t_i]]]]
    b. [NegP [Neg ne] [FocP [Spec pas] [Foc' Foc [VP V NP]]]]
    c. [NegP [Spec pas] [Neg' [Neg ne] [VP V NP]]]

The change from specifier to higher specifier falls under the Late Merge Principle, as in (5) above, since in (6b) there is less movement than in (6a) and the negative is merged latest<2> in (6c). The next step will be for pas to become a head, in accordance with (3). This has presumably happened in varieties of French where ne has disappeared.

3.2 Specifier to head

English negatives provide evidence for the Head Preference Principle in (3) because they change from specifier or full phrase to head. Initially, there is a negative nominal, as in the Old English (7), with a structure as in (9a) below. Next, the negative becomes restricted to na wiht/na thing, as in (8) and represented in (9b). Finally, the negative specifier changes to a single word or head, not, represented in (9c):

(7) Æt nyxtan næs nan heofodman Þæt ..
    At night not-was no headman who
    `At night there wasn't a headman who ...'
    (Peterborough Chronicle, anno 1010.26, Thorpe's edition)
(8) ne fand Þær nan Þing buton ealde weallas
    not found there no thing
    (Peterborough Chronicle, anno 963.18)

The different stages can of course be represented in the same text, as (7) and (8) show. The initial stage is one from specifier to a higher specifier, as shown in (9a), in accordance with (5) above. After the negative phrase becomes generated in the specifier, as in (9b), it can then become a head, as in (9c). Much has been written on this cycle since Jespersen (1916), but by using (3) above, we find a structural explanation:

(9) a. [CP . [C' [C n-æs_i] [NegP [DP_j [D nan] [NP man]] [Neg' [Neg t_i] [VP t_j ...]]]]]
    b. [CP . [C' [C n-is_i] [NegP [A na(w)uht] [Neg' [Neg t_i] ...]]]]
    c. [CP . [C' C [NegP Ø [Neg' [Neg not/n't] ...]]]]

In the history of English, as soon as stage (c) is reached, the verb and not are written as one word, as in (10).
This is quite frequent in letters such as the 15th century Paston Letters, which have benot, darnot, letnot, shalnot, woldnot, and many others. It takes another 300 years before the auxiliaries start to contract with the negative, as in (11). (Both sentences are from the Helsinki Corpus):

(10) Þat we cannot tell of
     (Wycliffite Sermons, sermo 16, I, 285, c1380)
(11) But I shan't put you to the trouble of farther Excuses, if you please this Business shall rest here.
     (John Vanbrugh, The Relapse, c1680)

In texts that write the forms together, ne is no longer used as a negative head. The change shown in (9) is a traditional grammaticalization that can be accounted for by two structural principles, (3) and (5) above. This change results in a loss of semantic specificity and phonological weight. Thus, na wiht means ‘no creature’ and is more specific than just the negative marker, and the loss of phonology between nawiht and not is obvious. What happens is that the semantic feature [negative] on D is reanalyzed as a grammatical one. I will refer to this as Feature Grammaticalization.

Other instances of specifier to head grammaticalization provided in van Gelderen (2004) involve relative and demonstrative pronouns becoming complementizers, and demonstratives becoming articles. Table 1 lists a few of the most common ones without further discussion.

________________________________________________________________________
Demonstrative pronoun that to complementizer
Demonstrative pronoun to article
Negative adverb to negation marker
Adverb to aspect marker
Adverb to complementizer
Pronoun to agreement
________________________________________________________________________
Table 1: Examples of specifier to head changes

3.3 From head to head

After a phrase becomes a head, further loss of meaning and increase in grammatical function comes about if the head changes to a higher head, one with less lexical content. Another possibility is for the head to disappear, an option I do not discuss in this paper.
Like the change from specifier to higher specifier, the change to higher head follows from Late Merge. Clear examples are those where verbs become auxiliaries. Since verbs need to move to higher categories to check their agreement features and since they do not contribute to the theta-roles, they can wait to merge later. Another example of this concerns the preposition for. I will show how features are transformed in this process, in accordance with Feature Grammaticalization.

In the Peterborough Chronicle<3> (henceforth PC and, as before, quoted with the entry year from Thorpe's edition), for is used as a preposition of causation, as in (12) and (13).

(12) þa luuede se kining hit swiðe for his broðer luuen Peada. 7 for his wedbroðeres luuen Oswi. 7 and for Saxulfes luuen þes abbodes
     `Then loved the king it much for love of his brother Peada and for his pledgebrother Oswiu and for love of the abbot Saxulf'
     (PC, anno 656.4).
(13) ouþer for untrumnisse ouþer for lauerdes neode ouþer for haueleste ouþer for hwilces cinnes oþer neod he ne muge þær cumon
     `either from infirmity or from his lord's need or from lack of means or from need of any other kind he cannot go there'
     (PC, anno 675.30).

It is remarkable how many of these concern constructions in which the PP headed by for is preposed, as in (13), (14), and (15):

(14) for mine londe 7 for mine feo. mine eorles fulle to mine cneo
     for my land and for my property my earls fell to my knees
     (Layamon, Caligula 1733-4).
(15) þu 3ef þeseluen for me to lese me fra pine
     `you gave yourself to me to release me from pain'
     (Wohunge 88-9).

According to van Dam (1957: 6), this fronting occurs regularly in OE. In (15), for is ambiguous between P and C, and hence the language learner ends up reanalyzing the P as C, and the DP as a topicalized element. In this connection, it is remarkable that the first instances of that-deletion listed in the OED (entry for that II 10) are as in (16) and (17), from the 14th century, i.e.
where a for-phrase has been fronted and can serve as C:

(16) I dred me sare, for benison He sal me giue his malison
     I dread me sore for blessing he will me give his curse
     (Cursor Mundi, Cotton, 3665).
(17) Joab .. slowh Abner, for drede he scholde be ...
     `Joab killed Abner, out of fear that he should be ...'
     (Gower, Confessio I. 263).

In Old and Middle English, forðæm also functions as `because', as in (18). This shows again that an original PP is functioning as C:

(18) forþam Trumbriht wæs adon of þam biscopdome
     `because T had been deprived of his bishopric'
     (anno 685.1).

The preposing is explained by Late Merge. The PP containing for is not relevant to the argument structure, so it can wait. The preposition for includes a semantic feature [cause] that can also be expressed in C, which is why for is reanalyzed as a C, as in (19) and others below.

The earliest instance of for as a finite complementizer we know of in English is in the PC and is from the entry for the year 1135, as in (19). There are two others from the entry for 1135, as in (20) and (21):

(19) for þæt ilc gær warth þe king ded
     because (in) that same year was the king dead
     (PC, 1135, 6)
(20) for æuric man sone ræuede oþer þe mihte
     because every man soon robbed another that could
     `because everyone that could robbed someone else'
     (PC, 1135, 8).
(21) for agenes him risen sona þa rice men
     `because against him soon rose the powerful men'
     (PC, 1135, 18).

This locates the first use of complementizer for with the second scribe of the PC, who starts adding information from 1132 on. Between 1135 and 1154, the use increases dramatically compared to the period before 1135, as (22) to (28) show for the next year that there is an entry:

(22) for he hadde get his tresor
     because he had got his treasure
     (PC, 1137, 3).
(23) for æuric rice man his castles makede
     `because every powerful man made his castles'
     (PC, 1137, 13-4).
(24) for ne uuæren næure nan martyrs swa pined alse hi wæron
     `because never were martyrs as tortured as they were'
     (PC, 1137, 20).
(25) for nan ne wæs o þe land
     `because none was in that land'
     (PC, 1137, 42).
(26) for ouer siþon ne forbaren hi nouther circe ne ...
     `because nowhere did they forbear a church nor ...'
     (PC, 1137, 46).
(27) for hi uueron al forcursæd
     `because they were all accursed'
     (PC, 1137, 53).
(28) for þe land was al fordon mid suilce dædes
     `because the land was all fordone by such deeds'
     (PC, 1137, 54-5).

Excluding the verb for `went', for occurs 101 times as preposition and complementizer in the PC. Of these, 16 are finite complementizers recorded during the last few years, as illustrated in (19) to (28). So, the stages are (a) preposing of the PP, due to Late Merge, and (b) reanalysis of for, due to the Head Preference Principle.

Table 2 shows some other examples from the history of English, again not further discussed in this paper:

__________________________________________________________
After: P > C
On: P > ASP
Like: P > C
To: P > ASP > M > C
Modals and do: v > ASP
__________________________________________________________
Table 2: Examples of the change from head to head.

Structurally, the changes from specifier to higher specifier in (6), the ones from specifier to head in (9), and from head to higher head in (19) to (28) can be seen as resulting from Economy Principles at work in a derivation. In terms of language typology, the last change results in a more analytic language, but the first one is the beginning of a change that can lead to a more synthetic language. The stages can be represented as in figure 1:

Spec to (higher) spec > spec to head > head to (higher) head > head to dependent
________________________________________________________________________
Figure 1: Four stages of grammaticalization

I will briefly turn to this cycle as well as the terms analytic and synthetic in the next section.

4. Analytic and Synthetic and the Cycle

Von Schlegel (1818) is the first to use the terms analytic and synthetic where languages are concerned. However, as Schwegler (1990) points out, from the beginning these terms are imprecise since they include gradations, such as “elles penchent fortement vers” and “une certaine puissance de”. Von Schlegel’s reasons for postulating these terms may have been to distinguish the more ‘perfect’ synthetic languages from the less perfect ones. He sees the reason for change as “les conquérans barbares” (1818: 24) who acquired Latin imperfectly. In the 20th century, Sapir picks up the two notions and adds a third, polysynthetic, and he tries to distinguish syntax, morphology, and meaning where these terms are concerned (1921: 135-6).

Languages such as Mandarin are analytic in that grammatical categories such as aspect are expressed as separate words, e.g. by the perfective marker le that has grammaticalized from the verb liao meaning `to complete' among other meanings (Sun 1996: 85; 178; Shi 2002). This means that a light verb comes to be generated higher, in ASP, a development that is in accordance with the Late Merge Principle. Synthetic languages such as Old English change into more analytic languages using Late Merge as well. For instance, verbs inflected for mood and tense come to be replaced by auxiliaries generated in positions just expressing mood and tense, originating in verbs. However, the terms are hard to use in that many languages, e.g. Modern English and Mandarin Chinese, cannot be characterized as completely analytic languages since, as we have seen above, endings are created in the case of English negatives in (10) and (11), and Mandarin le is always supported by another element, e.g. a verb.

Terms such as analytic and synthetic are controversial but the idea of the linguistic cycle is perhaps even more so. As Hodge (1970) points out, it is an old concept but much criticized by Jespersen (1922) and others.
The idea is that language change proceeds in a cycle. This does not mean that change reverses itself, as many opponents of unidirectionality have claimed. Rather, changes proceed as sketched in e.g. figure 1. Hodge provides examples from many other languages and language families, such as Chinese, Egyptian, and Finno-Ugric.

My proposal is to consider the switch from inflectional/synthetic to isolating/analytic as due to Late Merge and the one from isolating/analytic to inflectional/synthetic as due to the Head Preference Principle. The former speaks for itself but the latter requires some explanation. If an element becomes a head, it will often become an affix due to the extremely common process of head-to-head movement. Negative and aspectual heads will therefore be `picked up' by the verb moving through these positions.

5 Conclusion

In this paper, I have given examples of how two principles, Head Preference and Late Merge, are compatible with Minimalism and account for two types of grammaticalization.

Notes

1 This paper was written while at the Centre for Advanced Study in Oslo, Norway. I would very much like to thank the participants at the Centre as well as the audience at DIGS where this paper was presented.
2 Roberts & Roussou (2003) examine this change, skipping the focus stage, but use the Lexical Subset Principle (if an element always occurs in one environment: reanalyze it) to account for the change to (6c).
3 The Peterborough Chronicle contains entries for years in the history of Britain from the time of Caesar to 1154. The early part up to 1121 is copied and then some entries are added before a second scribe takes over in 1132, and this stage shows very fast change.

References

Bopp, Franz 1868. Vergleichende Grammatik. Berlin: Dümmler.
Brook, G. & R. Leslie (eds) 1963. Layamon: Brut. Oxford: Oxford University Press, EETS 250.
Chomsky, Noam 1986. Barriers. Cambridge: MIT Press.
Chomsky, Noam 1995. The Minimalist Program. Cambridge: MIT Press.
Chomsky, Noam 2001. "Beyond Explanatory Adequacy". MIT Occasional Papers in Linguistics 20. Cambridge, MA: MITWPL.
Dam, Johannes van 1957. The Causal Clause and Causal Prepositions in Early Old English Prose. Groningen: J.B. Wolters.
Dobree, B. & G. Webb (eds) 1927. The Complete Works of Sir John Vanbrugh. Bloomsbury: The Nonesuch Press.
Gelderen, Elly van 2004. Grammaticalization as Economy. (Linguistik Aktuell/Linguistics Today 71). Amsterdam: John Benjamins.
Greenberg, Joseph 1954 [1960]. “A quantitative approach to the morphological typology of language”. International Journal of American Linguistics 26.178-194.
Heine, Bernd & Mechthild Reh 1984. Grammaticalization and Reanalysis in African Languages. Hamburg: Helmut Buske Verlag.
Hodge, Carleton 1970. “The Linguistic Cycle”. Linguistic Sciences 13.1-7.
Hudson, A. (ed.) 1983. Wycliffite Sermons. Oxford University Press.
Jelinek, Eloise 1987. “Auxiliaries and Ergative Splits”, in Martin Harris & Paolo Ramat (eds) Historical Development of Auxiliaries, 85-105. Berlin: Mouton de Gruyter.
Jespersen, Otto 1916. Negation in English and other Languages. Copenhagen: A.F. Høst.
Jespersen, Otto 1922. Language: Its Nature, Development, and Origin. London: Allen & Unwin.
Lightfoot, David 1979. Principles of Diachronic Syntax. Cambridge: Cambridge University Press.
Macaulay, G. (ed.) 1957. Confessio Amantis. London: Early English Text Society.
Morris, Richard (ed.) 1874-1893. Cursor Mundi, 7 Parts. Trübner & Co.
Norde, Muriel 1997. The history of the genitive in Swedish: A case study in degrammaticalization. Ph.D. dissertation. Universiteit van Amsterdam.
Oxford English Dictionary (OED) 1933. Oxford: Oxford University Press. (online version used)
Roberts, Ian & Anna Roussou 2003. Syntactic Change: A Minimalist Approach to Grammaticalization. Cambridge: Cambridge University Press.
Sapir, Edward 1921. Language: An Introduction to the Study of Speech. New York: Harcourt Brace.
Schlegel, August Wilhelm von 1818. Observations sur la langue et la littérature provençales. Paris: Librairie grecque-latine-allemande.
Schwegler, Armin 1990. Analyticity and Syntheticity: A Diachronic Perspective with Special Reference to Romance Languages. (Empirical Approaches to Language Typology, 6). Berlin: Mouton de Gruyter.
Shi, Yuzhi 2002. The Establishment of Modern Chinese Grammar: The Formation of the Resultative Construction and its Effects. (Studies in Language Companion Series 59). Amsterdam: John Benjamins.
Simpson, Andrew & Xiu-Zhi Zoe Wu 2002. "Agreement Shells and Focus". Language 78.287-313.
Sun, Chaofen 1996. Word-order Change and Grammaticalization in the History of Chinese. Stanford: Stanford University Press.
Thompson, W. 1958. Þe Wohunge of ure Lauerd. London: Oxford University Press.
Thorpe, Benjamin 1861. Anglo-Saxon Chronicle I and II. London: Longman.
Traugott, Elizabeth Closs & Bernd Heine 1991. Grammaticalization. Amsterdam: John Benjamins.