BASIC ISSUES IN THE THEORY OF SYNTAX: TOWARDS A MINIMALIST PERSPECTIVE Eric Reuland Utrecht institute of Linguistics OTS Revised October 2004 1 Preface I wrote this text since I wanted to explain some of the basic issues in linguistics to an audience that has an interest in current linguistic theory but little or no background in linguistics, and also is not in a position to work through a full-fledged text book. So, in principle this text does not presuppose any previous knowledge of linguistics and I would very much appreciate it if my readers could tell me where I forgot to add a necessary explanation. I also wanted my text to be as concise as possible. Not a full book, with lots of examples and explanations, but just the basics. Because of this it may be dense at times. For instance, in my overview of grammar as an assembly line I use symbols (a, b, etc.) instead of lexical items. The reason? It saves space, and does not detract from the formal properties of sentence construction. This text is certainly not an introduction to linguistics in the usual sense. Linguistics is a fascinating field. Over the last decades we made amazing discoveries about the way language works. In recent years we also made significant progress in linking our discoveries about the architecture of the language faculty to the way things work (or don’t work) in our brains when we are dealing with language. There are many works around that are intended to convey the fascination linguists have for their field, and share it with you. Pinker's The Language Instinct, and Words and Rules are great examples. There are also many textbooks that carefully introduce you into the ways linguistic arguments are being structured.1 In preparing this text I assumed that for some reason or other you have already decided that you need some knowledge of linguistics, and also that, for the moment, you need not concern yourself with details of linguistic argumentation and theory construction. The goal of this text is to give you a quick introduction to some of the technical aspects of current linguistics, and sufficient background to be able to access and assess (some of) that literature. So, in a nutshell, it is intended as a shortcut between the discussion of linguistics in works 1 For instance, Haegeman (1991), Fromkin et al. (2000). 2 such as Pinker's books, but also Kandel et al. 's Principles of Neural Science, and the current technical linguistic literature. I should add that the material this text covers is rather limited. Only structure building is covered in some detail. Other areas, such as binding are mentioned, but not really discussed. However, for someone without previous background in linguistics who has gone through this text, some of the current literature on binding will be accessible, and the same holds true for other fields.2 In my discussion of structure building I stay close to the standard conception in current syntax. The way I introduce 'movement' differs conceptually. Most linguistic literature stresses the status of 'movement' as dislocation, and hence an operation that at least raises the suspicion of being an imperfection (but see Chomsky 1998, 1999). In the approach I use here, "movement" is simply "one element doing double duty". The computational system has to meet various interface requirements, and it is in fact quite parsimonious to use one element to meet more than one requirement. Ideally, this is all there is to the phenomenon. As the layout indicates, I distinguish two levels of detail. 
The reader can skip the parts in small print, and still follow the subsequent parts of the text. I would be grateful for any comments and suggestions I receive. Eric Reuland eric.reuland@let.uu.nl 2 Reuland (2002) and Reuland and Everaert 2000 provide technical discussion of the state of the art in binding theory. In Reuland (2003) I discuss how binding facts shed light on cognitive architecture. 3 1. The goals of the theory of grammar If it comes to defining what a language is, people will generally be able to converge on something like (1) (1) A language is a systematic pairing of forms and meanings. Human languages are particular instances of such systematic form meaning pairings. There seems to be little doubt that the property of having language, in its common sense interpretation, which we take as a starting point, is something that sets apart humans from (other) animals, and it is human language or natural language that will concern us here. The various animal communication systems even if they convey information and intentions to their fellow animals vastly differ from human language in terms of the quality of the information they convey and in the particular way in which they do so in ways that are so obvious that they need no lengthy discussion here. However, humans have devised many artificial systems that might also qualify for the label language (mathematical systems, logics, programming languages), hence as an indication of a domain of empirical inquiry the above definition may well be not informative enough. However, one of the things that clearly sets apart natural language from artificial formmeaning pairings is that children acquire natural languages without explicit instruction (as will be more extensively discussed below). Whereas the many artificial languages are surely manifestations of what one may broadly call human intelligence in the case of natural language this is clearly an empirical issue, and there are strong indications that many aspects of the faculty of language stand rather apart from general properties of intelligence and other kinds of symbolic behaviour (see Smith and Tsimpli (1995) for extensive argumentation). A helpful picture of the position of the language system among the other cognitive systems is provided by the following Evolutionary fable (Chomsky (1998:6)): 4 Given a primate with the human mental architecture and sensorimotor apparatus in place, but not yet a language organ. What specifications does some language organ FL have to meet if upon insertion it is to work properly? Thus the theoretical aim of the linguist is to arrive at a precise characterization of that part of our cognitive system that is dedicated to language. A working hypothesis is that language is optimally designed. That it, is a ‘perfect system’, the optimal solution to the task of relating the two interface levels (reflecting form and interpretation respectively). An effect of this working hypothesis is that it discourages solving descriptive problems by introducing substantive hypotheses specific to language per se. Any such substantive hypothesis is like an imperfection in the design. Presumably, there are such imperfections, but they should only be postulated as a last resort. Rather, whenever one sees such a problem the first step should always be to investigate how it can be reduced to issues concerning the design of natural language in view of its general task. So, human language consists of a systematic relation between form and meaning. 
Forms are typically realized and perceived in the form of auditory stimuli, but a route via the visual system is also available (sign language, and in a sense also reading). Since neither the auditory nor the visual systems are dedicated specifically to language these may be considered as external to the language system per se. Yet, the language system is apparently able to "talk" to the articulatory-perceptual systems via some interface. This interface is usually referred to as the PFinterface (PF=Phonetic Form). On the meaning side, at least this much is clear that our cognitive system is able to process information, form concepts, feel emotions, form intentions, and experience thoughts independent of language. (Although language definitely helps in articulating those.) Following Chomsky and others we may dub this part of the system the system of thought (with perhaps also a "language of thought" in so far as concepts can be combined without using the linguistic system proper). By the same token this system is in some relevant sense external to the language system. Yet, again, the systems must be able to communicate, which takes place using the ConceptualIntentional Interface (C-I interface). 5 Taking his picture a little further, a grammar of a language is a system defining a systematic mapping between a set of expressions F reflecting the forms of the language, and a set of expressions I reflecting their interpretations. But grammar thus conceived is not a system that itself 'runs' the sensori-motor system so as to yield actual pronunciations, nor is it concerned with idiosyncratic interpretations some expressions may acquire (i.e. in so far as It is cold here may be understood as a suggestion to close the window this is not part of the mapping as understood here). Grammar as such is in fact medium neutral (as is independently necessary in order to account for the existence of sign languages). We can accommodate this by conceiving the grammar as a computational system specifying the mapping between a set of expressions at the PF interface with the sensori-motor system and the CI interface with the systems of thought. The core of the grammar is a combinatory system defined over lexical elements, or morphemes.3These morphemes are the minimal units containing the basic ingredients for a sound-meaning pairing. That is, they contain a set of phonological features that are visible to the PF-interface, a set of semantic features that are visible to the C-I interface, and (possibly) a set of grammatical features that are just there to drive grammatical computations.4 Elaborating the view sketched in the fable, we thus arrive at the following schematic picture of the (minimal) structure of the language system: 3 Morphemes are the building blocks of words. So, the word worked is composed of two morphemes: the stem work- and the suffix –ed representing the past inflection, the word (noun) coldness is composed of the adjectival stem cold- and the affix –ness yielding an abstract noun. More about this below. 4 Note, that we view language as a discrete system. 6 (2) (External: Sensori-motor system) | Phonetic Form (PF) -interface | Computational system of Human Language (CHL) (+Lexicon) | Conceptual-Intentional Interface (C-I interface) | (External: System of thought) This sketch of the grammatical system leads to two further issues, namely the locus of cross-linguistic variation, and issues of acquisition. 
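The view of morphemes as bundles of features visible at different interfaces can be made concrete with a small illustration. This is a sketch only; the attribute names and the particular feature values are illustrative assumptions, not part of any specific proposal:

```python
from dataclasses import dataclass, field

@dataclass
class Morpheme:
    """A lexical item as a bundle of three kinds of features."""
    phonological: dict = field(default_factory=dict)  # visible at the PF interface
    semantic: dict = field(default_factory=dict)      # visible at the C-I interface
    grammatical: dict = field(default_factory=dict)   # only drive the computation (CHL)

# The past-tense morpheme -ed of 'worked': a pronunciation, a meaning
# contribution (past tense), and a grammatical requirement (it attaches to verbs).
past = Morpheme(
    phonological={"form": "-ed"},
    semantic={"tense": "past"},
    grammatical={"selects": "V"},
)
```

Nothing in such a representation fixes a particular spell-out medium, which is what makes the system medium neutral in the sense discussed above.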
One may safely assume that there is no variation between humans in the nature of their sensori-motor systems, their systems of thought, and the interfaces to these systems. There clearly is extensive variation in their lexicons. The question is whether there are also differences in the computational system, the way in which the morphemes are put together. Prima facie, it seems that there must be, since languages, for instance, differ in word order. However, by itself this is not compelling. Word order properties could also be encoded as part of the grammatical make up of the individual morphemes ("if the system uses me, this is where I go"). In fact, one of the major issues over the last decades has been the division of labor between computational system and lexicon in accounting for cross-linguistic variation. In current generative theories it is generally assumed that the computational system is extremely simple, and hence the burden of variation inevitably lies in certain subparts of the lexicon (more specifically, the elements of the functional system, see below) . The discussion of acquisition in generative grammar is inextricably linked to the 'innateness' debate. The question of whether language is innate has sparked an incredible amount of controversy over the last decades, much of which is quite surprising given what the participants apparently do agree on. 7 In line with much generative work (e.g. Chomsky (1986, 1995, 1998, 1999, etc.) one may formulate the issue as follows. A human infant that is exposed to language will in the course of roughly 6-8 years5 acquire a knowledge of that language that is indistinguishable from that of an adult human being who knows the same language (with the possible exception of the lexicon, which may require some more time to reach adult level and also may keep on growing). In this respect she is entirely unlike apes, dogs, bees, etc., who regardless of how much they are exposed to a language will never reach anything like human adult competence. We also know that in the period between birth and 8 years of age the competence of a child goes through a number of stages. Given that a child exposed to Dutch ends up learning Dutch, and not, for instance, Chinese and that the converse holds for a child exposed to Chinese, we know that the input plays an important role in determining which language ends up being learned. On the other hand, it is equally uncontroversial that a child is not more predisposed to learning one language rather than another: put a child of Dutch provenance in a Chinese environment and she will learn Chinese as well as any Chinese child; the converse is equally easily to observe. As a consequence, language acquisition can be schematically represented as in (3), where the various stages a child goes through are represented by Si and the Di are the data it is exposed to, yielding a change in state: (3) S0 ------ S1 ------ S2 ---.....---- Si ---........--- Sn ------ Sn------Sn | | | | | | D1 D2 Di Dn Dn+1 Dn+2 ....... S0 is termed the initial state, before the child is exposed to language.6 Sn is the steady adult state that does not change anymore if over time more data are presented (again, with the exception of the lexicon). 5 I am using 8 rather than 6 since for instance some aspects of morphology in morphologically rich languages appear not to be fully mastered before the age of 8. 6 Note, that for the discussion it is immaterial at what age we put S0. 
If exposure to language can already take place in some form in the womb, as there is evidence to believe, S0 can be put just before the first exposure in the womb. 8 In so far as it is rational, the controversy can hardly involve this sketch as such. Rather it involves the status of S0. By definition, S0 is the state of being able to acquire language, distinguishing humans from apes, dogs, etc. That is, it reflects man's innate capacity for language. The empirical questions are i) what its content is, and ii) how it relates to man's other cognitive capacities. The question of content can be approached by carefully investigating necessary and sufficient conditions for the learnability of languages of the human type. The question of the relation to other cognitive capacities requires an understanding not only of the human language capacity, but also of those other cognitive capacities among which it is embedded. So, again, logically S0 could be empty. But this would imply no difference between humans and animals, so empirically, this cannot be correct. A next question is whether S0 has any subsystem that is specialized for language. That is, whether S0 has any properties that cannot be explained on the basis of their role in "other" cognitive faculties. Notice, that we can only start addressing this question on the basis of substantive theories about language and what enables humans to acquire it, and of those other cognitive capacities. For instance, a statement that language is an emergent property of our general intelligence is entirely empty in the absence of a substantive theory of human intelligence, specifying the operations involved in our reasoning, and what we can and cannot do. If we would have such a theory, the next step would be to carefully show that all properties of language can be derived from those general principles (note, that it would be essential that none of the properties in the explanans have been introduced with recourse to the explanandum). The matter of fact is that our knowledge of most of our cognitive capacities, including our general intelligence, is still far too rudimentary to warrant any substantive comparisons. In addition to language only the visual system has been described into any depth. What is striking about the visual system is that it is hard-wired to a substantial degree, with highly specialized neuron groups for various sub-tasks, such as color recognition, but also the detection of movement (see Kandel et al. (2000), and Wijnen and Verstraten (2001) for overviews). In addition to low-level processing of information the visual system has a higher9 level component in which the various types of information are integrated. It is very clear that much of our capacity for vision is innate, and, uncontroversially, genetically determined (surprisingly, without sparking any great philosophical debates). One might add that it is hard to see how the types of processes taking place in the visual system could be shared with the language faculty. We may conclude, that in contrast to what much of the discussion suggests, the main issues underlying the "innateness debate" are not philosophical but entirely empirical. Resolving them requires i) specifying the properties S0 must have in order to be compatible with the fact that language can be acquired given the limited access to data as has been observed; ii) keeping an open mind for questions like what properties of S0 that are necessary for language acquisition are also uniquely involved in it.7 1.1. 
Interlude: Some basic issues in language acquisition

In order to understand certain basic issues in language acquisition it is helpful to take a very simplified picture as a starting point. Of course we are ultimately interested in questions such as: What precisely does a person who knows a language know? Undoubtedly a lot, even a lot that our theories are as yet not able to specify. A much simpler question is: What does a person who knows a language minimally know? To this a possible answer is: A person who knows a language knows at least which strings of words belong to that language and which strings don't. For instance, a person who knows Russian knows at least which strings of Russian words are proper Russian and which strings aren't. So, acquiring a language entails at least determining which strings of words are sentences of that language and which strings aren't. So, in order to get an impression of how hard a task learning a language minimally is, we may try to assess how hard it is to determine which strings of words are sentences of that language and which strings aren't, given the vocabulary of words (once more introducing a considerable simplification of the task).

7 Note that even proponents of a general intelligence approach must be prepared to make nontrivial assumptions about the patterns our intelligence naturally imposes on reality, and subsequently show that it is these patterns that explain our linguistic generalizations.

So, in this simplified picture we view a language as a subset of the set of all expressions one can form over a given vocabulary. That is, assuming that the vocabulary of English contains the elements the, dog, bites, man, the set of English sentences will contain the dog bites, the dog bites the man, etc., but not the bites man, bites dog the, etc. The task of the child acquiring English is therefore, minimally, to determine what the full set of English sentences is like on the basis of the sentences it is exposed to for some period of time, let's say for 6 years. The question is then to get an impression of how hard this task is. For principled reasons the number of well-formed sentences of a language is taken to be infinite; that is, there is no upper bound to the length of individual sentences. However, even if one limits oneself to sentences under a reasonable length, the number of well-formed sentences is truly astronomical. It has been calculated that the number of grammatical English sentences of 20 words and less is 10^20. (Note that this very normal sentence is already over this limit, being exactly 21 words and costing 9 seconds to pronounce.) At an average of six seconds per sentence it will cost 19,000,000,000,000 years to say (or hear) them all. One might now wonder what percentage of these a kid could have heard in six years' time. In the case of non-stop listening the percentage is 0.000000000031%, clearly still a gross overestimation as compared to what the child can actually be expected to hear. So, on the basis of at most such an extremely small percentage the kid gets to know "how language works". There are many further complicating factors we abstracted away from, such as lack of homogeneity and the presence of errors in the data. If we were to take these into account, the task would only become more formidable.
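The figures just cited are easy to verify with a back-of-the-envelope calculation. A minimal sketch; the 10^20 estimate and the six-second average come from the text above, and the year length is the only added assumption:

```python
# Back-of-the-envelope check of the figures in the text.
SENTENCES = 10**20          # estimated number of English sentences of 20 words or less
SECONDS_PER_SENTENCE = 6    # average time to say one sentence
SECONDS_PER_YEAR = 365 * 24 * 3600

# Time needed to say (or hear) them all, in years.
years_to_hear_all = SENTENCES * SECONDS_PER_SENTENCE / SECONDS_PER_YEAR
print(f"{years_to_hear_all:.1e} years")      # ~1.9e13, i.e. about 19,000,000,000,000 years

# Fraction a child could hear in six years of non-stop listening.
heard = 6 * SECONDS_PER_YEAR / SECONDS_PER_SENTENCE
print(f"{100 * heard / SENTENCES:.1e} %")    # ~3.2e-11 %, i.e. about 0.00000000003 %
```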
This leads us to the principled point. We can formulate the "logical problem of language acquisition" as the following task:

(4) Consider an infinite set of which a finite subset is given. Determine the full set on the basis of this subset.

Or:

(5) Consider an infinite series of elements (e.g., the natural numbers). Determine, given a finite beginning of the series, how it continues.

In their generality these are tasks that are not only hard, but impossible (in fact, as given, these tasks are even impossible for finite supersets of the set given). These tasks illustrate what has become known as Plato's problem (see Chomsky (1986)).

(6) How comes it that human beings, whose contacts with the world are brief and personal and limited, are nevertheless able to know as much as we do know?

For a more concrete illustration, consider the following task involving the completion of a series of numbers:

(7) 1, 2, 3, 4, 5, 6, 7, ….

One might be inclined to say that finding the next number is easy: it should obviously be 8, etc. But of course, this is not at all guaranteed. It is easy to think of a perfectly reasonable function that takes the first seven natural numbers and enumerates them followed by their doubles, triples, quadruples, etc. Or a function that enumerates the first seven natural numbers, skips the next seven, starts again at 15, etc. Even among those functions that nicely continue with 8 there are vast differences; if one enumerates the days of the month there is no successor of 31 (unless one starts anew with 1). If one enumerates the days of the year there is no successor of 366. It requires some work for a non-mathematician, but one can also easily find sequences that are monotonically increasing and yet have a substantial overlap in their initial part. Consider the sequence 1, 5, 11, 19, 29, …. It can be rendered by the following instruction: a(a+1)-1, for a a natural number, yielding (1x2)-1, (2x3)-1, etc. Following the instruction the sequence continues as (6x7)-1=41, (7x8)-1=55. However, there is an alternative instruction: start with 1, subsequently follow the sequence of prime numbers, omit the first two primes, print the next one, omit one prime, print the next one, and continue omitting alternately two and one prime before printing one. This yields the sequence 1, (2,3), 5, (7), 11, (13,17), 19, (23), 29, (31,37), 41, (43), 47, (53,59), 61, …. Note that the two instructions start diverging only after 41. With a bit of ingenuity one can always think of such diverging alternatives. All this serves to illustrate a very simple point: There is no general procedure to establish the "correct" completion of some initial part of a sequence. If this is true here, the same necessarily holds true for language acquisition:

(8) There is no general procedure to establish the full (grammar of a) language on the basis of some given finite subpart.

However, we know that children are able to do so. The question is then how? The reason language can be acquired is that tasks of this sort can be carried out if the type of regularity determining the structure of the sequence has been given in advance. For instance, the task of completing (7) becomes trivial if it is given that there is a constant difference between each member of the series and its successor. 1, 2, 3, 4, 5, 6, 7, .... then receives a unique solution, just like 2, 4, 6, 8, ..., or 1, 4, 7, 10, ...., etc. In fact, the given instruction defines a class of sequences that all share the property that giving the first two elements of any sequence fixes the rest. Similarly, one can define a class of sequences such that, given a member of the series, its successor in the series can be obtained by multiplying by a constant factor.
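The two points just made, that superficially identical beginnings can diverge, and that a restricted hypothesis space turns the task from impossible into trivial, can be made concrete with a small sketch. The function names are purely illustrative; the prime-based rule follows the informal instruction given above:

```python
# Two different rules that agree on 1, 5, 11, 19, 29, 41 and then diverge.

def rule_a(n):
    """The instruction a(a+1)-1, for a = 1, 2, 3, ..."""
    return [a * (a + 1) - 1 for a in range(1, n + 1)]

def rule_b(n):
    """Start with 1, then walk the primes, alternately omitting two and one
    before printing the next one (the second instruction in the text)."""
    def primes():
        candidate = 2
        while True:
            if all(candidate % p for p in range(2, int(candidate ** 0.5) + 1)):
                yield candidate
            candidate += 1
    out, gen, skip = [1], primes(), 2
    while len(out) < n:
        for _ in range(skip):            # omit 'skip' primes ...
            next(gen)
        out.append(next(gen))            # ... then print the next one
        skip = 1 if skip == 2 else 2     # alternate between omitting two and one
    return out

print(rule_a(7))   # [1, 5, 11, 19, 29, 41, 55]
print(rule_b(7))   # [1, 5, 11, 19, 29, 41, 47]

# A learner restricted in advance to constant-difference sequences identifies
# the whole series from its first two members: the hypothesis space does the work.
def identify_constant_difference(first, second):
    step = second - first
    return lambda i: first + step * (i - 1)   # the i-th member of the series

f = identify_constant_difference(1, 2)
print([f(i) for i in range(1, 8)])   # [1, 2, 3, 4, 5, 6, 7]
```

The last few lines make the point in miniature: once the class of admissible hypotheses is fixed in advance, two data points suffice.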
It is also easy to define a class of instructions that would include the first, but not the second, option for completing the sequence 1, 5, 11, 19, 29.

Optional section:

1.1.1. Learnability as a property of classes

What is still implicit in this discussion, but important to bring out, is that learnability is not a property of languages per se, but only a property a language/grammar has with respect to a certain class of languages/grammars containing it, or a property of such classes with respect to a certain mode of presentation. A groundbreaking work on learnability theory is Gold (1967).8 In the standard discussions of learnability it is mostly learnability by presentation that is considered. The learning task is generally formulated in terms of "identifiability in the limit":

An algorithm A is a learning algorithm by presentation for a class of languages C with grammars G iff for each language Li in C and for each finite sequence of expressions Ej from Li, A selects from G a grammar Gk generating Lk containing Ej.

The term "learning by presentation" is used to express that this learnability model does not allow for feedback. That is, there is no negative evidence in the data, just as the child does not receive negative evidence (evidence of the type "this string is not in the language").

A class of languages C is learnable by presentation iff there is a learning algorithm A such that for each language Li there is (at least) one finite sequence Ei of expressions of Li, such that A given Ei selects a grammar Gk generating Lk containing Ei, and for each extended sequence EiSn (Ei followed by further expressions Sn) A selects Gk generating Lk containing EiSn.

Informally speaking, what is guaranteed is that there is a finite sequence of expressions such that the algorithm need never change its mind after having encountered that sequence. If the algorithm does not change its mind in the long run, it has identified the language. Note that even if this sequence has been encountered, an observer need never become aware of this. What is required for this type of learning task to have been effectively carried out is "no change in the long run". Hence the term "identifiability in the limit".

8 E. Gold. 1967. Language identification in the limit. Information and Control 10, 447-474.

It is important to note that many very simple classes of languages are not learnable in this sense. For instance, any class of languages consisting of all finite languages and one infinite language over a common vocabulary is not learnable. On the other hand, take an arbitrary language L that is so complex that it is undecidable (i.e. there is no algorithm to determine in finite time whether a string is in that language). Suppose L is the only member of the class of languages C. The trivial algorithm A selecting L in C identifies L in the sense required. Thus, learnability has nothing to do with the complexity of the individual language/grammar, but only depends on the structure of the class with respect to which the selection must be carried out.

1.2 Conclusion

Coming back to our earlier informal discussion, crucial for the task to be carried out is that an infinity of logical possibilities is not considered. In the theory of language acquisition such a limitation on the type of instruction is called a limitation on the hypothesis space. We can now see that:

Without an initial restriction on the hypothesis space, language cannot be learned.
Our mind cannot be a tabula rasa The necessity to determine the restrictions on the hypothesis space is the essence of the connection between the investigation of Universal Grammar (the properties that all language share) and language acquisition research. Taking a big leap, surprisingly, in order for an organism to be able to effectively learn, it must be limited in the options it entertains. Paradoxically, if the capability of learning language would indeed turn out to be merely a matter of general intelligence, humans must in fact be less intelligent and less imaginative than people have a tendency to assume.9 9 Deacon (1997) proposes that our capacity to learn language is in fact not so surprising given that language is a human product, and in a sense "made to be learned". However, contra to appearances, this is just a restatement of the problem. Properly analyzed, it gives rise to the whole range of issues discussed here, only in a slightly altered terminology: how is the property of language that it can be learned reflected in its structure, and what does the fact that we apparently require this system tell us about our cognitive abilities? 14 2. The grammatical model: basic properties Since the publication of Chomsky (1981) a conception of grammar has developed in which the language faculty was conceived as an essentially modular system with a number of components which I list here with their standard names followed by a non-technical explanation (see Haegeman (1991) for an overview): 1. Lexicon - the inventory of elements from which linguistic expressions are built 2. Phrase structure - general principles of structure building 3. Movement theory - the property of displacement in natural language, including variation in position. For instance, in Dutch an object can occur to the left or the right of certain adverbials. To see this, compare the following sentences: i) ik denk dat Jan gisteren het boek gelezen heeft 'litt.: I think that John yesterday the book read has'; ii) ik denk dat Jan het boek gisteren gelezen heeft 'litt.: I think that John the book yesterday read has'. But the following sentence is impossible: iii) *ik denk het boek dat Jan gisteren gelezen heeft 'litt.: I think the book that John yesterday read has' Movement theory accounts for which displacements are possible, and which are not. 4. Theta-theory - Theta-theory is a theory of semantic roles. Predicates assign semantic roles to their arguments; for instance, in i) John opened the door with a key, John is an agent; the door undergoes the action of opening, and the key is the instrument. In ii) the key opened the door the agent is not expressed; in iii) the door opened, neither agent nor instrument are expressed. However, iv) *the key opened is illformed. I.e. it sounds incomplete in a way ii) and iii) do not. Thetatheory accounts for these and other properties of language. 5. Case theory - In many languages the form of a nominal element, such as I, he, John, the door, the key, depends on its relation to a verb or other predicate. In English there a contrast between a nominative I, he and an object form such as me, him only in the case of pronominals; other languages such as Latin or Russian, distinguish a whole range 15 of different forms (for instance, in Russian, with a key in 4.i above would be an instrumental). Case theory accounts for the crosslinguistic patterns in case systems, and the relation between Case and semantic roles. 6. 
Binding Theory - Many languages have expressions that depend on some other element in the sentence for their interpretation; examples are English himself, Dutch zich or zichzelf, French se, Russian s'eb'a, etc. Binding theory is concerned with the principles governing their interpretation and also with certain restrictions on the interpretation of pronouns. 7. Control theory - In computing the interpretation of sentences we often understand elements to be active in a position that are not pronounced there. For instance in I promised Bill to arrive on time, I is not only the subject of promise, but also the understood subject of arrive. In I asked Bill to arrive on time, Bill is not only the person asked, but is also understood as the subject of arrive. Control theory is concerned with the interpretation of such unpronounced elements but its current status is very close to that of binding theory. 8. Bounding theory - Bounding theory is concerned with restrictions on movement, but more properly discussed in connection with Movement Theory To these subtheories two interpretive components should be added: i) a component handling the expression of language in a medium (Phonetic Form=PF), and ii) a component handling the semantic interpretation (mapping a possibly abstract syntactic structure, Logical Form=LF into extra-linguistic meaning). In recent years much of the focus of research has been on the foundations of the theory, specifically examining the computational mechanisms and attempting to reduce and unify them as much as possible. To mention a few foci of research: - eliminating overlap between components (Movement theory, binding theory, bounding theory) 16 - unifying components where they are conceptually similar (Binding theory and Control theory). - assessing the division of labor between typically syntactic and typically interpretive mechanisms (binding theory) - minimizing the number of basic mechanisms 3. The lexicon 3.1. Morphemes: Affixes and stems As stated above, the lexicon (“dictionary”) of a language contains the basic building blocks of the grammar. Those basic building blocks are smaller than words. As a moment's reflection on the following English words shows, many of them can be decomposed into smaller units. (9) witch, bewitch; destroy, destroys, destroyed, destruction, destructible, destructibility, indestructible, indestructibility; sad, sadden, saddens, saddened; work, works, worked, workable, unworkable; real, reality, unreal; boy, boys, boyish, boyhood; ox, oxen; coherent, incoherent; inhere, inherent. These smallest recurring combinations of form and meaning, into which words can be analysed, are morphemes. For convenience sake we will briefly go over a few terms to talk about word structure, without attempting to give full definitions.10 In the examples of (9) we typically find one part that carries the main lexical content of the word, as in (10): (10) witch, destroy, sad, work, on, real, boy, .... These elements are the stems. In addition to stems we have affixes. Affixes may be attached either before or after the stem, which yields prefixes and suffixes respectively.11 (11) Affixes prefixes: be-, in-, un-, etc. 10 These terms are just conveniences, without much theoretical import; they are mostly selfexplanatory on the basis of a few examples. 11 Some language also have infixes: material that is inserted into the stem (for instance the same stem vic from Latin vici 'I won' also occurs in 'invincible'). In English, infixation is not productive, though. 
17 suffixes: -ion, -able, -ity, -ent,-en, -ish, -hood; -s, -ed, -s, -en, etc. Affixes can be attached directly to the stem, but also to an affix that has already been attached. Affixation processes are not always entirely transparent. Sometimes the changes are relatively minor as in the case of sad+en sadden. As can be seen from the 'family' destroy, destroys, destroyed, destruction, .....we must assume that there are processes more fundamentally modifying a stem if affixes are added, or alternatively that a stem has two alternating forms (in this case destroy/destruct) with the selection depending on the affix chosen. Also, when affixes are added to affixes forms may change (able+ity ability). Furthermore, whereas in most cases the stem may occur as an independent word on its own, the family formed by coherent, incoherent, inhere, inherent shows that this is not always the case. The pair inhere, inherent suggests the existence of a suffix –ent making the adjective inherent from the verb inhere, but there is no verb *cohere in English from which –ent could have formed the adjective coherent. Similarly, comparing coherent and inherent, where in- is a well-known prefix and co- looks like one too (as suggested by (co-) author), one is inclined to first strip the prefix, leaving –*herent, which is not an English adjective, and then to strip – ent, leaving -*here- which is not an English verb. Note, that if one goes all the way back to Latin, there is a verb haerere which indeed underlies the formations under discussion. In order to reflect this one may say that the forms coherent, incoherent, inhere, inherent share the element –here- as their common root. So usually, the term root is used for the case where the stripping of affixes yields an element that in terms of systematic sound-meaning pairing has no (longer an) independent existence in the language.12 (11) gives the words of (9) with morpheme boundaries represented: (11) witch, be-witch; de(-?)stroy13, destroy-s, destroy-ed, destruction, destruct-able, destruct-abil-ity, in-destruct-able, in-destructabil- ity; sad, sadd-en, sadd-en-s, sadd-en-ed; work, work-s, work-ed, work-able, un-work-able; real, real-ity, un-real; boy, boy-s, boy-ish, boy-hood; ox, ox-en; co-her-ent, in-co-her-ent; in-here, in-her-ent. 12 In historical linguistics the term root is used for a common, sometimes rather abstract element shared a a number of related languages, such as the root kntm shared by English hundred, Dutch honderd, Latin centum, French cent, Russian sto, etc. In current morphological theory the term root is used for a minimal meaningful unit abstracting away from its lexical category. 13 Note, that one could also consider further analyzing destroy into de-stroy. Although deexists as an independent prefix, stroy does not exist as an independent stem in English, but struere was an independent word in Latin. 18 3.2. Derivation and inflection From another perspective affixes are also subdivided into derivational affixes and inflectional affixes. Derivational affixes typically represent a semantic operation on a stem mostly (but not always) coupled with a change in word class. So, -ion applied to a (latinate) verbal stem yields a noun denoting the action represented by the verb; -able applied to a verb yields the adjective denoting the property of being able to undergo the action represented by the verb; -ity applied to an adjective yields the abstract noun denoting the property represented by the adjective. 
The prefix in- applied to an adjective yields an expression denoting the absence of the property expressed by the adjective, etc. A couple of these options come together in a multiply derived noun such as in-destructibil-ity. Inflectional affixes essentially represent properties that encode grammatical relations between elements of a sentence, or enter into encoding them. Case in languages like Latin, Russian, or German is a typical instantiation of inflection. In English remnants of morphological Case marking can only be found in the form of the genitive 's and in the pronominal system (oppositions such as he/him, etc.). Other instantiations of inflection are agreement on the verb, such as the –s in works marking 3rd person singular, plural marking on nouns, such as the –s in boys, and the –en in oxen, and tense marking of verbs, such as the –ed in worked. Also –ing in participles and gerunds and –ly in adverbs are inflectional affixes. Inflectional affixes always follow all derivational affixes.14 2.1.3. Free versus bound morphemes Another traditional division of the class of morphemes is into free morphemes and bound morphemes. Free morphemes can be used as independent words (cat, witch, work, etc.) bound morphemes must be attached to a word/stem in order to be used (-s, –ed, -ity, etc.). Most content words are free morphemes in English, but some, as we have seen, are not (the -here- from inherent, or even the –concert- from dis-concerting).15 -ing in gerunds and –ly in adverbs can be taken to indicate that the boundaries between inflection and derivation are not always entirely sharp. However, the property of having to follow all other affixes is sufficient as a rule of thumb criterium. 15 Note that in highly inflectional languages the number of free morphemes will be very limited, since nouns and verbs that in English serve as stock examples of free morphemes all obligatorily require an inflectional morpheme expressing Case, Number, Tense, etc. 14 19 3.3. Words and segmentation As we saw in the case of destroy/destruction, the relation between forms in the same family of words is not always entirely transparent. Furthermore, there are certain affixes with the same function that are a bit picky as to which stem they attach to. For instance, the standard form of the plural affix for nouns is –s. But ox does not take the –s; instead it requires –en, a property which it shares with very few other nouns. In the case of oxen we can still see a separate segment representing the plural. However, in the case of the pairs man/men, woman/women with just a vowel contrast or deer/deer with no contrast at all, we cannot morphologically distinguish a segment for plural. For instance, in the case of man/men, we cannot say that the – e- represents the plural; rather it is replacing –a- by –e- that represents the plural. The only way to perspicuously represent such cases is by distinguishing between the morphological and the syntactic structure of a word. In the case of boy/boys morphological and syntactic structure match. The plural can be represented as in (12): (12) [N boy] + [s PLUR] boys In the case of man/men both structures do not match. Their relation is given in (13): (13) [N man + PLUR] men In cases like (13) one lexical element realizes two more abstract 'morphemes'. As one can see in (14) the same can happen with tense morphology on verbs (so-called strong verbs): (14) a. [V work] + [ed PAST] worked b. 
[V hold + PAST] held In a case such as deer in its plural use there are in principle two options. One is to represent it using the pattern of (13), as in (15): (15) [N deer + PLUR] deer 20 The alternative is that the stem deer is like the stem ox in that it requires a special form, -en in the case of oxen, but zero (notated Ø) in the case of deer. as in (16): (16) [N deer] + [Ø PLUR] deer The concept of a zero morpheme is important, and an essential tool for the description of inflectional paradigms, both for languages with poor inflection and for languages with rich inflection.16 The difference between regular and irregular forms plays an important role in the psycholinguistic literature, especially in the debates between proponents of a symbolic, rule-based, or a connectionist architecture. Debates focus on whether our language system treats regular and irregular forms differently or similarly in acquisition and storage. Apart from this, the contrast is also important in the study of language impairment. 3.4. Lexical and functional categories A final important distinction within the lexicon is that between lexical and functional categories. Lexical categories are the well-known word classes noun, verb, and adjective, preposition, abbreviated as N, V, A, P. Lexical categories are open. That is, they are large, and they grow pretty rapidly over time, and in fact one may argue that even at one point of time their membership is not finite. Furthermore, they consist of content words; that is they contain words that reflect our conceptualization of the world. Note, that the category P is a bit of an outsider in some respects it is nevertheless generally considered to be a lexical category.17 The 16 For instance in describing the declension of Russian masculine inanimate nouns it is easiest to say that the nominative, genitive, dative, accusative, instrumental and locative endings are respectively: -Ø, -a, -u, - Ø, -om, -e, yielding stol, stola, stolu, stol, stolom, stole for the noun stol='table'. 17 The number of core prepositions is quite small, suggesting that it is functional; however, many languages allow quite productive extensions of this class, which makes the class lexical. Many prepositions have purely grammatical uses as in the on in John depended on his luck, however, on in John put the chair on the table clearly reflects a relational concept. 21 set of lexical categories is universal in the sense that no language has more lexical categories; but some have less. Functional categories are word classes such as determiner (D), and complementizer (C). Determiners are words such as the, a, this, that, some, every, etc. Complementizers are words such as that, whether, for, if, because, etc., which introduce subordinate clauses. Members of functional categories either express relations between parts of the sentence (e.g. the complementizer that signals that what follows is a subordinate clause), or express conditions on the use of a phrase. For instance, if I am reporting to someone I saw a man in the garden, the indefinite article of a man expresses that this man is new in some sense, has not been talked about before. The definite article in the garden expresses that the garden is familiar. Functional categories are closed classes. The number of elements they contain is definitely finite, in fact rather small. They are also quite stable over time. The complete universal inventory of functional categories is a matter under debate. 
Determiners and complementizers are uncontroversial, and so is a third functional category Tense (T). Tense is a bit special in that in many languages its members are all bound morphemes attached to the verb, and expressing Tense and Modality. So, in English the –ed of worked is a member of Inflection, and so is –s in works. The same holds true of tense endings in Dutch, but also in languages such as Russian or Latin. On the other hand, English auxiliaries such as can, will, must and the various forms of the "proverb" do are also members of the category T. 3.5. Lexical structure As we saw, words have internal structure. They can be analyzed into smaller components, namely individual morphemes. From the perspective of a grammar, as an explicit procedure to describe a language, we can envisage a process that forms words from the smallest units by an operation of concatenation. The question is then whether just anything goes, or whether there are restrictions on this 22 operation, and furthermore, whether concatenation is a binary operation or one that just glues an arbitrary array of elements together. That is, if we have a word that is composed of, let us say, three morphemes a, b, and c (in that order), there are the following logical possibilities: i) all three morphemes have the same status within the whole, indicated as [a-b-c]; ii) a and b form a unit to which c is attached, indicated as [[a-b]-c], or iii) a is attached to the unit b-c, indicated as [a-[b-c]]. In actual fact, option (i) is never realized; in all cases where three or more morphemes are put together we can see clear indications of a hierarchical structure. Consider for instance, the word in-destruct-able. In- is a negative prefix that only attaches to adjectives. Destroy is not an adjective but a verbal stem. So, it could not combine with in-. Adding -able to the stem (changing the stem into destruct- and – able into -ible) creates an adjective (meaning being able to be destroyed), this in turn may be prefixed by in- yielding not being able to be destroyed. So, the structure is [in[destruct-able]]. To this expression one may add the suffix –ity, which applied to an adjective, yields an abstract property. So, [[in-[destruct-able]]-ity], realized as indestructibility is the property of not being able to be destroyed. In cases such as these the order of attachment is forced by the requirements each of the elements puts on its environment: -able requires a verb, in- requires an adjective, so the adjective must be formed first, etc. (Similarly in the case of unhappiness, its structure being [[un-happi]-ness], rather than [un-[happi-ness]]). The order of attachment is generally reflected in the interpretation of these complex expressions. It follows that concatenation is best viewed as a binary operation, putting morphemes together into words. A certain selectiveness in terms of what can be combined with what is the rule rather than the exception. It is easily seen that many combinations are ruled out: (17) a. b. c. d. *happi-un-ness *woman-able *man-ity *develop-hood Such properties must be reflected in the information contained in the lexical entry. Thus, a lexical entry must contain information about its 23 form, its (contribution to) interpretation, and its combinatorial properties. Words can also be formed by combining free morphemes, as for instance in history teacher, front door, rest-room attendant, black board eraser shop, gold fish watcher, black board eraser shop attendant, etc. 
The possibility of combining nouns into new ones is rather unrestricted, and more limited by stylistic considerations than by grammatical ones.18 The order of combining such nouns is not in general grammatically restricted. Difference in the order of composition gives an intuitively plausible account of the fact that some three- or more-membered compounds are ambiguous. A well-known example is California history teacher, which can either mean 'history teacher from California' or 'teacher of Californian history'. Other examples can be easily constructed (e.g. top school teacher, ambiguous between 'teacher at a top school' versus a 'top school-teacher', etc.). Such differences in interpretation can be straightforwardly rendered as differences in hierarchical structure, in turn following from the order of concatenation:

(18) a. [top [squadron commander]]
     b. [[top squadron] commander]

(19) a. [California [history teacher]]
     b. [[California history] teacher]

The idea that hierarchy reflects what is combined with what for purposes of interpretation has turned out to be an important tool not only in morphology, but also for syntax and the way syntax relates to the interpretive system. To cite a well-known example, cats chase mice does not mean the same as mice chase cats. Yet the words are exactly the same. Thus, in order to interpret a sentence we need not only the meaning of the words, but also information about the way they have been combined. Hierarchy also plays a role in accounting for the ambiguity of sentences like John hit the dog with the stick. Under one interpretation it is the dog with the stick that is hit by John (with whatever instrument), under another John uses the stick to hit the dog. Structurally:

(20) a. John [hit [[the dog] [with the stick]]]
     b. John [[hit [the dog]] [with the stick]]

18 Usage may differ across linguistic communities; in Dutch and German very long compounds are not at all uncommon, unlike what is found in English (although the famous Dutch hazewindhondshalsbandslotsleutelgaatjespennetjesmakersjongensgereedschapsmandje may seem to overdo it a little).

4. The computational system

Just like morphemes combine to form words, words combine to form larger, syntactic, structures. So, syntax is the combinatorial system used by human language, also called the 'computational system of human language', abbreviated as CHL. A standard assumption nowadays is that, just as in morphology, concatenation in syntax is binary, yielding binary branching structures throughout. In current syntactic literature the term Merge is mostly used instead of concatenate. Here, we will be using both terms. As in morphology, not every combination of words gives a well-formed structure, as can be easily assessed. Some of the restrictions can be viewed as purely formal: a determiner such as the, a, every, combines with a nominal element, not with a verbal element. A complementizer such as if, that, whether, combines with a clausal structure, and not with a nominal structure, etc. Other restrictions are directly related to more conceptual aspects of meaning, as reflected in the contrasts between the expressions in (21) and (22):

(21) a. John loves Mary
     b. ??The brick loves Mary

(22) a. John opened the lock / the key opened the lock
     b. ??Serenity opened the lock
Many of these restrictions are intuitively quite straightforward for anyone with some knowledge of sentence analysis in traditional grammar. However, the formal restrictions must somehow be incorporated in the computational system. As in the case of morphology these restrictions are best encoded as lexical properties of the elements involved. If we have a binary process of concatenation, it should in principle be possible to locally check whether the relevant 25 condition is met. The semantic restrictions must follow from the way in which the meaning of the whole is computed from the meaning of the parts. An important property of the syntactic system is that it allows recursion. It is easy to see that recursion in language is in fact all over the place. Verbs take sentences as arguments which contains verbs having sentences as arguments, nominal expressions can be modified by relative clauses, which in turn contain nominal elements modified by relative clauses, prepositions take arguments that may in turn contain prepositions, etc. Suppose one has a set of instructions as to how to put together well-formed sentences, like John told me a story, he gave a book to Mary, he told Bill a story, he told Cindy a story, etc. It is clear that we have a different sentence, any time we replace the indirect object me, Bill, Cindy, by another noun. However, we can also replace a story, for instance by a lie, but, crucially, also by a sentence, or more precisely, by that followed by a sentence. This may give, for instance, John told me that he gave a book to Mary. But, of course, also John told me that he told Bill a story. Replacing a story, we get John told me that he told Bill that he gave a book to Mary. Suppose, John is a real bore, so, he may start telling me that he told Bill that he told Cindy that he gave a book to Mary. Nothing prevents that he told the same thing to Cindy, namely that he told Bill that he told Cindy that ….. etc. So we get John told me that he told Bill that he told Cindy that he told Bill, that he told Cindy, ………..that he gave a book to Mary. Note, that this not a somewhat contrived assumption about language I am using. John may just be an absolute bore, who is rambling on virtually indefinitely. I am just using the recursive property of language to report this to you. So, while carrying out the instructions for forming sentences, the same set of instructions can be invoked again, before finishing the first run, the second run, the third run, etc. A very simply example, is one that involves recursion of modification: a nice kitty, a nice, nice kitty, a nice, nice, nice, kitty, a nice, nice, nice, ……., nice kitty, etc. Again, a bit boring, but no principle of grammar is violated. It is easiest to break up the task of specifying the structure of wellformed sentences into a number of smaller sub-tasks, following the traditional division of the sentence into parts of speech. We will obviously have to limit ourselves to a summary of the main facts about the syntactic structure of sentences and their components, as they are accepted and used today. We saw above that natural languages make use of (at most) four lexical categories: N,V,A,P, and in addition a number of functional 26 categories, including D, I, and C. In the Indo-European languages (from English to Dutch, French, Russian, etc.) 
all four lexical categories are instantiated, and in our discussions we will mainly restrict ourselves to the three functional categories introduced so far (where necessary we may introduce some additional ones).

4.1 Basic principles of structure building, with the category V for a starter

It will be clear that if concatenation is a binary operation, there will be little difference in structure between the expressions in the various categories. This becomes most transparent if we take a very abstract example. Take an arbitrary verb, call it V. We know that verbs can take direct objects, as in love Mary, love the cat, love the cat that jumped out of the window yesterday, etc. That is, the verb can combine with a simplex expression straight from the lexicon, or with an expression that has itself been derived. Let's call it a, putting off the question of its properties. Now we can represent the result of their concatenation as in (23):

(23) V a

There is a traditional insight that if you combine a verb with an object, you end up with something verbal (the predicate), and not nominal, even if the object is a noun. This is indeed correct, and adopted by contemporary syntactic theory. So the result of concatenating V with a is verbal. This must be expressed in the system. So, (23) is not enough (note that generally we represent concatenation by juxtaposition, although we could write V+a if we wanted to be entirely explicit). We express that the property of being verbal is inherited by the resulting structure by labeling this structure as V, as in (24a) or (24b) (these are equivalent notations):

(24) a. [V V a]
     b.     V
           / \
          V   a

This notation expresses that V and a belong together: they form a constituent, and the category of the constituent is V. Another way of expressing this is by saying that V projects. The resulting structure is also called a V-projection. It is customary to use a distinct notation for the verbal element as it is grabbed from the lexicon and for the verbal element resulting from concatenation. To this end the notation V' (V-bar) is used for the complex expression, as in (25):

(25) a. [V' V a]
     b.     V'
           /  \
          V    a

Verbs, of course, do not only have objects, they also have subjects, as in He loves Mary. To put it abstractly, just like V concatenates with a, it must also be able to concatenate with b. However, given binary branching, the option of directly concatenating with V is already taken by a. The remaining option is that b concatenates with V'. Again the representation must indicate whether or not the resulting structure is verbal. It can indeed be shown to be verbal; however, it is still to be distinguished from V'. That it is verbal but yet higher in the hierarchy is indicated by the label VP (Verb Phrase), as in (26):

(26) a. [VP b [V' V a]]
     b.     VP
           /  \
          b    V'
              /  \
             V    a

Since V determines the nature of the whole constituent, it is called the head of the constituent. a is called the complement of the head, since it is concatenated directly with the head; b is called its specifier (being concatenated one step higher). What we have found here is the basic, trivial pattern of linguistic structure, which is the same for all categories: take a head, combine it with a complement, add a specifier.19 In (26'a, b and c) this is expressed for each of the other lexical categories. In (26'd) its general character is expressed by using a variable X that ranges over categories:

(26') a. [NP b [N' N a]]
      b. [AP b [A' A a]]
      c. [PP b [P' P a]]
      d. [XP b [X' X a]]

19 Note that (26') represents the existing options. If a verb expresses a concept that does not require, or even allow for, a complement (for instance the verb rain), there is no need to select a complement.
[NP b [N' N a]] [AP b [A' A a]] [PP b [P' P a]] [XP b [X' X a]] The principles of structure building in their simplest form can be summarized as follows (using the metaphor of an 'assembly line'): i) access the store (lexicon) and take out a simplex part a; (a is a head) ii) select from the store or assembly line a part b (simplex if taken from the store, already assembled if taken from the assembly line) that fits with a (is selected by a); (b is the complement of a) iii) put a and b together (merge a and b) as an a-type part. iv) put the unit back on the assembly line for later use, or v) select from either store or assembly line a part c (simplex or complex) that fits with a (is selected by a) vi) put the a-type part and c together as an a-type part; (c is a specifier of a) vii) at this stage the unit can be put back on the assembly line viii) access the store and take out a simplex part d; (d is a head) ix) put d and an element e from the assembly line together; if d selects e the result is a d-type part; x) etc. - If two elements a and b are merged, which of the two projects is determined by which of the two selects the other: the element that selects projects. - If an element is taken from the assembly line, it can only be put back if some part has been added to its exterior (no adding to its internal structure) require, or even allow for a complement (for instance the verb rain) there is no need to select a complement. 29 - If a new head is concatenated with the structure that has been formed up to a certain point, and it selects that structure the new head "takes over" (projects) Later on we will discuss some possible modifications of this simple picture, but still staying quite close to it. Let's continue the discussion of the category V. The verb loves in our example is marked for Tense. A basic verb form in the lexicon does not carry Tense. Whether or not the verb has Tense is an independent choice as can be seen by the fact that English verbs may lack a marking for Tense (and agreement), but be used in their infinitival form with to, as in Cindy expected John to love Mary or even without a marker, as in John saw Mary work hard. The fact that a sentence has Tense is expressed by the functional category T. T concatenates with VP and takes it as its complement. Thus, if a tensed sentence is to be formed, the next step is (27): (27) a. [T' T [VP b b. [V' V a]]] T' T VP V' b V a (27) illustrates what we said earlier, namely that if a new head is concatenated with the structure that has been formed up to a certain point, the new head takes over: the head T concatenated with VP yields a T-category, no longer a V-category. Importantly, only heads have the property of changing the projection type. As a rule of thumb we can say that the notation VP indicates that the V-projection stops at that point. V' indicates an in-between level of projection. As already indicated in (26') this generalizes over categories: for any category X (be it functional or lexical) we can distinguish between the level of the head X itself (also called X0, or X-zero) the X'-level (=X with its complement), and the XP-level (X' with its specifier at the point where the projection stops, which is always, except, of course, at the top of 30 the sentence, the point where another head takes over). This general pattern can be represented as in (28) (28) XP Specifier X' X0 Complement This general schema has been argued to apply across the board to all syntactic categories. 
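For readers who find it helpful to see the assembly-line instructions made fully explicit, here is a small sketch in Python. It is purely illustrative: the class and function names are invented for this text and have no theoretical status. The sketch implements nothing more than what the instructions above say: Merge is binary, the selecting element projects its label, and the output of Merge can itself be merged again, which is all that recursion requires.

```python
# A toy rendering of the structure-building instructions above (illustrative only;
# all names are invented for this text). The selecting element projects its label.

class Head:
    def __init__(self, form, category, selects=()):
        self.form = form          # e.g. 'love'
        self.category = category  # e.g. 'V', 'T', 'D'
        self.selects = selects    # categories this head can take
        self.label = category     # a bare head is its own (minimal) projection

class Phrase:
    def __init__(self, label, left, right):
        self.label = label        # the category of the projecting element
        self.left, self.right = left, right

def merge(x, y):
    """Binary Merge: if one element is a head selecting the other, it projects;
    otherwise a specifier is added, and the already built phrase projects."""
    for a, b in ((x, y), (y, x)):
        if isinstance(a, Head) and b.label in a.selects:
            return Phrase(a.category, a, b)
    phrase = x if isinstance(x, Phrase) else y
    specifier = y if phrase is x else x
    return Phrase(phrase.label, specifier, phrase)

def show(node):
    """Print a node in labelled-bracket notation."""
    if isinstance(node, Head):
        return node.form
    return f"[{node.label} {show(node.left)} {show(node.right)}]"

# Building a skeleton of 'he love(-s) Mary':
love  = Head('love', 'V', selects=('D',))
he    = Head('he', 'D')
mary  = Head('Mary', 'D')
tense = Head('-s', 'T', selects=('V',))

v_bar = merge(love, mary)     # head + complement
vp    = merge(he, v_bar)      # add the specifier
tp    = merge(tense, vp)      # the new head T takes over the projection

print(show(tp))               # [T -s [V he [V love Mary]]]
```

The output corresponds to the structure in (27), with the bar-levels (V', VP) left implicit, since they can be read off the structure.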
But, as we have seen at `the beginning of section 4., there is more to linguistic structure than just this schema. Although the X-X'-XP notation is still in use, in current theory its use is a matter of convenience rather than of principle. For instance, whether we have a V, a V' or a VP can be read off the structure. So, strictly speaking it is superfluous to express it on the label itself. 4.2 Local dependencies 4.2.1. Heads and arguments: s-selection and c-selection It is intuitively easy to understand why the verb love combines with an object and a subject: the underlying concept specifies that love is associated both with an object of love (or a source of the love) and an experiencer of that love, and it is also our knowledge of that concept which underlies our feeling that there is something odd with the brick loves Mary. Similarly, we find John opened the lock/the key opened the lock. The sentence ??serenity opened the lock is a bit strange, however. (Although quite possible in a fairy tale.) Such selectional restrictions are expected to be relatively constant across languages, but sensitive to the type of world depicted. Since they are closely linked to the meaning of the elements involved, this type of selection is also called s-selection (from semantic selection). The grammar expresses these restrictions in the form of thematic roles that predicates assign to their arguments and the properties arguments must have in order to be able to carry a certain thematic role. Typical examples of thematic roles are: - agent, e.g. John in John hit the ball 31 - instrument, e.g. a knife in John cut the salami with a knife - cause, e.g. the wind in the wind opened the door - experiencer, e.g. John in John worried about his health, but also Mary in John gave Mary the creeps - goal (sometimes benefactor) e.g. Mary, in John gave Mary a book, Boston, in John went to Boston - theme/patient, e.g. the cat in John kicked the cat, or the couch in John moved the couch - source, e.g. the police, from prison in Max escaped the police /from prison As can be easily determined, agents and experiencers are typically animate; causes, instruments, etc. need not be. As these and the following examples show, only certain roles are associated with a fixed position. Some roles are obligatory, others are optional, but often if one role is omitted, one of the remaining roles gets linked to a different position. Below I just give a number of illustrative examples: (29) a. b. c. d. - John opened *(the door) - The key opened *(the door) - The door opened (*the key, *by John) - *Opened the door - John worried (about his health) - His health worried *(John) - John moved the car - The car moved - John broke the vase - The vase broke Recent work has made considerable progress in unearthing the system underlying this variation (e.g. Reinhart 2002). For reasons of time and space I will leave at these few remarks. The difference in the position in which an argument is realized can also be accompanied by a specific morphological change in the verb and the auxiliary. This is the case in what is known as passive, and illustrated in (30): 32 (30) a. b. c. d. 
-John discovered *(Mary) - Mary was discovered (by John) - John fed the cat - The cat was fed by John - John gave (Mary) *(a book) - Mary was given a book (by John) - John depended on Bill - Bill was depended on In passive we see a systematic combination of three factors: i) the verb is in participial form ii) there is a form of to be as a passive auxiliary iii) the object shows up in subject position Cross-linguistically it is usually the direct object that is shifted; in English (sometimes also in Dutch) an indirect object can be passivized as well (30c); English also allows passivization out of prepositional phrases. In addition to restrictions that follow from the underlying concepts and the ways they can be realized, there are also restrictions that are of a formal nature: the category of the argument is selected as well. This is called c-selection (from category-selection, also called strict subcategorization. For instance, in English the object of love is realized as a DP as in John loves Mary. In Dutch its translation equivalent is John houdt van Mary = litt. John holds of Mary. Dropping the van gives a correct Dutch sentence, but with a different meaning. So, in Dutch houden in this sense c-selects a PP, in English its equivalent c-selects a DP. Similarly, in English we find John died for hunger; its Dutch equivalent is Jan stierf van de honger. In general we find considerable cross-linguistic variation in such requirements that are formal, and cannot be easily related to properties of the conceptual system (think of the difficulty finding the right preposition for prepositional complements in another language). These restrictions must somehow be encoded among the properties of the head that carries them. 33 Also the relation between functional heads and their complements is one of selection. For instance, T in (27) selects a Verbal category (V/VP), not for instance an NP or PP.20 Summarizing: heads - assign thematic roles to arguments - carry s-selectional restrictions on arguments - carry formal restrictions on arguments (c-selection) Going back to the general schema of (28) it is important to note that there is a principled distinction between the scheme itself and the way the positions it defined are filled. (28) XP Specifier X' X0 Complement It is the properties of the computational system in its barest form that determine the scheme (or a similar alternative); it is the lexical properties of the elements inserted in the positions defined that determine to what extent a structure makes sense.21 4.2.2. Heads and arguments: Case and agreement Heads do not only put requirements on the content or the type of category they are merged with, but they sometimes also impose 20 We will leave aside the question of how precisely this type of selection is motivated, formally or conceptually. 21 This contrast between general computational mechanisms and properties of lexical elements has been argued to also have a neurocognitive basis (Ullman 2001) in that: - the computational system involves procedural memory - the lexicon involves declarative memory This provides exciting possibilities for further convergence between linguistic and neurocognitive research. 34 morphological requirements. That is, they require that their complement or specifier carry a certain inflection. 
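Before looking at concrete cases, it may help some readers to see how such requirements could in principle be recorded and checked. The following toy sketch (in Python; the feature names and the mini-lexicon are made up for this text and are not part of any actual grammar formalism) records a c-selection requirement and a Case requirement as lexical properties of a head and checks them locally when the head is merged with its complement.

```python
# Illustrative only: formal requirements of a head encoded as lexical properties
# and checked locally under Merge (all feature names invented for this sketch).

LEXICON = {
    # English 'love': c-selects a DP complement and requires it to carry accusative
    'love':   {'cat': 'V', 'c_select': 'DP', 'case_required': 'acc'},
    # Dutch 'houden (van)': same concept, but c-selects a PP
    'houden': {'cat': 'V', 'c_select': 'PP', 'case_required': None},
}

def complement_ok(head, comp_category, comp_case=None):
    """Check the head's formal (c-selection and Case) requirements on its complement."""
    entry = LEXICON[head]
    if entry['c_select'] != comp_category:
        return False                              # wrong category
    if entry['case_required'] and comp_case != entry['case_required']:
        return False                              # wrong Case / inflection
    return True

print(complement_ok('love', 'DP', 'acc'))    # True:  John loves Mary
print(complement_ok('houden', 'DP', 'acc'))  # False: Dutch needs a PP (Jan houdt van Mary)
print(complement_ok('houden', 'PP'))         # True
```

The only point of the sketch is that both types of requirement are stated on the head and can be verified within the head-complement configuration, exactly as with the Case and agreement facts discussed next.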
Case and agreement are put together here because they are similar in many important respects, and accordingly many analyses treat them identically (although there are also differences, and other researchers conclude that they are therefore not to be completely identified; I will leave this issue open). Well-known examples come from languages varying from Latin to Russian or German, in which object DPs carry accusative Case, indirect object DPs carry dative Case and subjects carry nominative Case, as illustrated in (31):

(31) Der-NOM Mann hat dem-DAT Mädchen einen-ACC Brief geschrieben
     The man has the girl a letter written

In such constructions the Case is determined by properties of the head with which the DP has been merged. A very similar type of dependency is agreement. A subject agrees in person and number (sometimes also gender) with the tense-marked verb, as in (32):

(32) Der-NOM Mann-3SG hat-3SG/*hast-2SG dem-DAT Mädchen einen-ACC Brief geschrieben
     The man has/*have the girl a letter written

An adjective agrees in Case and gender with the Noun it modifies, as in (33):

(33) Der-NOM Mann-3SG hat-3SG einen-ACC.MASC schönen-ACC.MASC Brief-ACC.MASC geschrieben

Although Dutch and English are very impoverished in Case and agreement, there are some rudiments (more in Dutch than in English). The pronominal system still has some Case distinctions (I/me, he/him, she/her, we/us, they/them; ik/mij, jij/jou/je, hij/hem, zij/haar, wij/ons, jullie/je, zij/hun/hen), the tensed verb has forms with and without the -s in English, and Dutch distinguishes forms with -t and -en. In Dutch there is also some marginal adjectival agreement left (with or without -e). All these dependencies can be checked within the basic configuration of (28), i.e. either in a head-complement or a head-specifier configuration; we can therefore call them strictly local. Not all dependencies are strictly local, as we will see in the next section. However, we will also see that all dependencies are local in an extended sense. Before moving to the next section, let's note one more property of the structures as defined so far that will turn out to be important:

(34) The binary character of the Merge operation leads to an asymmetry in linguistic structures: specifiers are higher in the structure than both heads and complements. Or: internally to one chunk of structure the order of Merger can be read off the structure.

This asymmetry plays a crucial role in conditions on grammatical dependencies. It is this asymmetry that is crucial for the extended notion of locality that is relevant for grammatical principles.

4.3. Dislocation

As a start of our discussion of what has been called the dislocation property of natural language, note that there is something remarkable about the structure of (27). Even though T is concatenated to the left of VP, the order in which T, subject and verb are pronounced is not immediately reflected in the structure. The order of concatenation would lead us to expect the order T b V a (-s he love Mary); what we find instead is (35):

(35) He loves Mary

That is, we find the order Subject-Verb-Tense-Object (b V T a).
In a sentence in which Tense is not realized as a bound morpheme (-s) but as a free morpheme such as will, the order is still one with the subject to the left of T, as in (36):

(36) He will love Mary

Both (35) and (36) illustrate at a micro-level an important property of natural language, namely the dislocation property: sometimes elements are realized in positions that are different from the positions where they are first merged (and where other family members do show up). For instance, if T is an auxiliary it does show up to the left of V, as expected if VP is its complement; however, the subject does not show up on the right of T, but rather to its left. If T is a bound morpheme it is attached to the right of V, and hence again to the right rather than to the left of the subject. In order to understand the latter fact, it suffices to see syntax as the primary source, and morphological spell-out as a process that overlays it. We can represent this as in (37):

(37) a. syntax:      [T [Subject [V Object]]]
     b. morphology:  Subject V-T Object

So, in the realization the morphological requirement that a bound morpheme realizing T must be attached as a suffix to the verb wins out over other requirements. However, syntactically and semantically the T is just as external to the VP as an auxiliary that is effectively realized externally. This is just a first example of the fact that linguistic structures occasionally have to meet conflicting requirements. A standard way of doing so is by what is metaphorically called 'movement'. Movement expresses that one and the same element is active in two different positions. This brings us to the second issue, namely the position of the subject. From a conceptual point of view it is clear that he in He will love Mary realizes the experiencer of the love for Mary. As such it is independent of T (note that the restrictions on the subject of love do not vary with the choice of T). Yet, it appears that we have to concatenate he with a constituent headed by will rather than with the one headed by love. It has been found, however, that the requirement for he to occur to the left of T must have an entirely different reason, unrelated to conceptualization. The point is that languages such as English and Dutch require some element to appear in that position even if there is no conceptual reason for it. That is, both English and Dutch tensed sentences require a subject; where necessary a so-called expletive subject is used, even though it carries no meaning. (38) gives a number of examples of sentences with an expletive subject (indicated in bold):

(38) a. It will rain22
     b. There arrived a man
     c. (Ik denk) dat het zal regenen
        (I think) that it will rain
     d. Er heeft iemand gisteren een glas gebroken
        'There has someone yesterday a glass broken'

Let's for the moment be content with stipulating that this is so. In many languages tensed sentences require that a (nominal) element is concatenated with the phrase headed by T. In cases like (38) a pronominal element (it, there, het, er) with very marginal semantic properties is selected from the lexicon and concatenated with the T-phrase to fulfil that requirement. The basic intuition underlying current work is that in cases such as He will love Mary the requirement can be met in a more economical way, namely by 'reusing' an element that already is part of the structure, as illustrated in (39) and (40): (39) a. [TP b [T' T [VP (b) b.
[V' V a]]]] TP b T' T VP V' (b) (40) V a TP He T' T will VP V' (he) V love Mary So, the element b is taken from the position in which it was first concatenated and concatenated anew in the specifier position of T. It 22 A good test is the impossibility to question. So it is weird to ask *What will rain? Or *Who/what did a man arrive? 38 is important to note, that, the interpretive system still has to treat b as if it were in its original position. b truly has a dual role: it is interpreted in its 'original' position, but also makes it possible to meet a formal requirement in the most economical way. In fact it can be argued that this formal requirement has its basis in a further interpretive requirement. We will come back to that at the end of this section. A common assumption nowadays is that this property of dual use can be best captured by leaving a copy in the original position. In this introduction we will indicate the fact that it is a copy by putting it in brackets. Informally, being put between brackets means that the element performs a role in that position, but is not accompanied by an instruction to the sensori-motor system to pronounce it there. This is necessary in order to avoid generating strings such as *He will he come, which could otherwise be derived. Going back to our initial concern, note that the T-projection and the V-projection, different as they are conceptually, are identical in structure. They both exhibit the standard X'- structure: (41) XP Specifier X' X0 Complement As can be easily seen in embedded clauses, TP may be concatenated with yet a further functional head: (42) (I noticed) that John loved his cat That is one member of a class of 'clause introducers' or complementizers, abbreviated as C. Other members of the class are if, whether, as, though, while, etc. Such elements indicate that a clause is part of a larger clause, and they indicate how the clause they head is to be connected to the higher clause (as a specifier or complement, and if so as an assertion (that) or a question (if, whether), or as an adverbial 39 clause used as a time adverbial (while), a conditional (if; ambiguous!!), an adversative (though), etc. Just like the category T, also C has a full-fledged projection following the pattern of (41). So, building this structure yields (43): (43) a. [CP ---- [C' C [TP b [T' T [VP (b) b. [V' V a]]]]]] CP ---- C' C TP b T' T VP V' (b) V a Unlike the specifier of T, the specifier of C, that is, the position indicated by ---- in (43) is often empty in languages such as English or Dutch. It is the position typically occupied by question words, as in (44): (44) John was wondering whom he loved Whereas in English the presence of a question word causes the C-position to be empty, in (slightly substandard) Dutch one can easily see that the question word occurs to the left of the complementizer: (45) Jan vroeg zich af wat of Marie vertelde litt: Jan asked himself what if Marie told Such sentences once more illustrate the fact that an element that already has been concatenated to the structure once may be moved out and reused: whom in (44) functions both as the object of loved and as the element up-front that signals that there is a question about that object. The same holds true of wat in (45). This can be expressed by extending the structure of (42) by one more step as in (46): (46) a. [CP a [ C' C [TP b [T' T [VP (b) [V' V (a) ]]]]]] 40 b. 
CP a C' C TP b T' T VP V' (b) V (a) As discussed, in our abstract structures so far, we indicate a situation in which a word (or phrase) has been moved out and reused by putting it in brackets in the position of its first merge. Similarly, we can use this notation also for the real language examples of (44) and (45), as in (47): (47) a. b. John was wondering whom he loved (whom) Jan vroeg zich af wat of Marie (wat) vertelde Jan asked himself what if Marie told Instead of doubling, most of the literature still uses a notation, in which the position of first merge is occupied by a trace, indicated by t. Since often in a structure more than one element has been reused, the trace and the reused element carry a subscript, an index telling what traces are related to what element. Using this notation, (47) comes out as (48): (48) a. b. John was wondering whomi he loved ti Jan vroeg zich af wat i of Marie ti vertelde There are potential differences between a theory that simply reuses what has been inserted, a theory that copies what has been inserted, and a theory in which movement leaves a trace. Here is not the place to enter into an extensive discussion. From the perspective of the internal structure of linguistic theory the differences are quite subtle. However, it may well be the case that from a processing perspective the choice is much easier. But this is a matter of further investigation. If so one may wonder about the situation in clauses that are not embedded, so-called root-clauses. Compare (49a) with (49b): 41 (49) a. b. John loved Mary Whom did John love (whom) Whom is up front in its clause, as in (44). Using a methodological principle saying that structures are uniform unless proven to be different, whom should be in the specifier of CP as in (44). The question is then, where is C? Since if and whether not only express that what follows is a question, but also that what follows is subordinated, they cannot be used. How then can C be identified? Note that in addition to whom being up front, also the position of T has shifted. In (49a) T is realized on V as discussed earlier, in (49b) we see that the verb is without Tense, and that the sentence contains an auxiliary, did, that appears to have no other function than that of bearing Tense and that occurs to the left of John. A standard assumption is that in (48) not only whom plays a dual role, namely that of expressing the object of love, and expressing that it is this object that is being questioned, but that also did does: it expresses T and also has the role of filling the root C. As a consequence, we arrive at a structure to be represented as in (50): (50) a. [CP a [C' [CT] [TP b [T' (T) [VP (b) b. [V' V (a) ]]]]]] CP a C' C T TP b T' (T) VP V' (b) V (a) Using the trace notation we obtain the following conventional tree representation of (49b), where the indices indicate which elements go together : 42 (51) CP whomh C' C TP didj Johni T tj T' VP V' ti V love th What, then, about (49a)? Does it also contain a C-domain, or is it just a TP? One could of course think of arbitrary solutions, such as i) in most cases we know of there is direct evidence that root clauses are CPs, hence also those without direct evidence for should be so classified; or, alternatively, ii) we don't see any incontrovertible CP material, hence it must be an TP. At the moment the point seems moot; whether or not there is a real issue will ultimately depend on the nature of the relation between syntax and the interpretative system. 
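For those who like to see the 'copy' idea made concrete, here is a small sketch (Python, with an invented flat representation; it is not how any parser actually works). A dislocated element occurs in two positions, as in (49b)/(51); only the highest occurrence is handed to the sensori-motor system for pronunciation, while the lower one remains available for interpretation. For simplicity the sketch identifies copies by their word form, which is of course too crude for real sentences.

```python
# Toy illustration of 'one element doing double duty': the same token occupies two
# positions, but only its highest occurrence is pronounced (invented representation).

# Positions of (49b)/(51), listed from the top of the tree downwards:
positions = [
    ('Spec-CP', 'whom'),   # re-merged occurrence: pronounced
    ('C',       'did'),    # did also does double duty (T reused in C)
    ('Spec-TP', 'John'),
    ('T',       'did'),    # silent copy of did
    ('V',       'love'),
    ('Compl-V', 'whom'),   # silent copy of whom: interpreted as the object of love
]

def spell_out(positions):
    """Pronounce each token once, in its highest position; lower copies stay silent."""
    spoken, already_pronounced = [], set()
    for _, word in positions:
        if word not in already_pronounced:
            spoken.append(word)
            already_pronounced.add(word)
    return ' '.join(spoken)

print(spell_out(positions))   # whom did John love
```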
The status of the functional structure may also be approached from the following perspective. Suppose we have a simplex verb such as kiss, and two arguments, Cindy and Tom. With these three elements what one may call the predicational core (i.e. what does who to whom) of the sentence Cindy kiss Tom is fully specified. However, at this point we still have an incomplete sentential structure. In order for a predication to be properly semantically evaluated more information is needed. Informally: one needs to know for which coordinates in space/time it should hold and with which degree of certainty. This is the type of information that is encoded in the T-system. Note further that the triple (kiss, Cindy, Tom) is not only the predicational core of the assertion Cindy kissed Tom, but also of the question of whether Cindy kissed Tom. That is, in order to have a complete sentence, not only the Tense of a sentence must be specified, but also its Force: is it an assertion, a question or a command? This is what happens in the C-system (nowadays by some authors explicitly called the Force-system). So, in a complete sentence, the core predication is contained in two layers, or shells: [Force [ Tense/mood [ Predication ] ] ]. In order for the derivation to be completed, values for Tense and Force also have to be specified by selecting appropriate functional heads.23 Dislocation, or dual use, is then caused by the fact that certain elements not only play a role in the core predication, but also in the Tense or Force systems. Here, we will occasionally and informally omit the CP layer in simple declaratives without attaching any theoretical significance to this. It is important to note that our discussion so far was based on rather abstract considerations and on one of the simplest of verbs, namely the two-place predicate love. What was said about the T-system and the C-system is, however, entirely independent of the verb selected. Any verb, be it give, sell, exchange, promise, request, will be embedded under the same 'functional shell'. Their differences will be reflected in what we see inside the VP. Whereas the verb love, just like verbs such as hit, has a rather simple argument structure, with a subject and an object, such verbs also allow adverbial modification. So, we can say John will probably love Mary or The bullet yesterday hit the vehicle. From the perspective of the simple structural model given in (41) there appear to be two ways in which these and other modifiers could be introduced into the structure. Let us note for a start that modifiers cannot be introduced as specifiers or complements of a lexical category. They are not the type of elements that receive a semantic role such as agent or patient. One way is to analyze them as belonging to the functional structure of the clause. If so, they are treated on the same footing as Tense or C, that is, as elements that merge with a VP (or TP if higher in the structure). A proper term covering adverbial elements such as possibly, probably, certainly is Modality, abbreviated Mod. In accordance with the X'-schema discussed earlier, it should have a complement (VP in (52)), and may have a specifier. Possible candidates for specifiers of Mod are phrases such as with absolute certainty, with great probability, etc., since, being phrases, such expressions cannot occur in the head position of the ModP. This possibility is illustrated in (52) (note that the specifier and the head cannot be filled at the same time).
23 Traditionally, the Force domain is represented by one functional head, namely C, and the T-domain by either one (interchangeably called I or T) or two (T and AgrS, i.e. the carrier of subject agreement). Currently, many researchers take it that the number of functional projections in these domains may be substantially larger. Taking T to stand for the Tense domain and C for the Force domain, allows us to abstract away from these issues. 44 (52) TP T' Johni T will ModP Mod probably VP ti V' V love Mary This line, proposed by Cinque (1999), gives rise to a great many functional projections (see Nilsen 2003 for discussion and an alternative). A fairly traditional approach takes it that the X'-schema of (41) is just a little too restricted. Instead of the two relations, specifier and complement a third relation is admitted, namely that of adjunct. Adjuncts are typically inserted in the middle part of the projection, but can also occur at the top part. These configurations are illustrated in (53): (53) XP Adjunct XP Specifier X' Adjunct X0 X' Complement Under that approach probably or yesterday could be analyzed as VPadjuncts, as in (54): (54) TP T' Johni T will VP VP probably ti V' V love Mary 45 In the light of recent developments sketched in Nilsen (2003) one may conclude that both approaches are actually closer than one might initially expect. Here, we will be primarily using the adjunction notation, mainly for expository reasons, because in a number of cases it yields more intuitive structures. 4.4. The category N Nouns determine a class of structures that canonically fit into slots that are traditionally termed subject, direct object, indirect object, prepositional complement. This class of expressions varies from simple pronominals, such as I, me, he, him, etc. to proper names such as John, Mary, Henry VIII, and complex expressions such as Elmer's pet porcupine, the cat that danced on the mat while the dog got too fat, etc. The canonical positions where we find these elements are generally called argument positions, and upon interpretation such expressions generally (but not always!!) function as arguments in the predicate logical sense of some predicate (expressed by a verb, adjective, of preposition). As in the verbal projection one can distinguish a lexical layer and a functional layer within these expressions. To take a very simple example: in an expression like the cat, the noun cat represents the lexical layer, the determiner the represents the functional layer. The noun cat is called the lexical head of the cat, the big beautiful cat, etc. The determiner the is the functional head of the expression, as in (55) and (56): (55) [D' [D the] [N cat]] Just like T in the verbal domain links the core predication to a set of coordinates in space/time, D relates the noun set to other relevant sets, and expresses notions such as (in)definiteness, etc. The lexical layer is expandable, as in the big beautiful cat of the neighbors, and also the functional layer is, as in all the many cats. Of course these expansions can be combined as in all the many big beautiful cats of the neighbors. Let's discuss (55) in some more detail. (55) is the result of merging the and cat. The and cat are both heads. Hence, in order to determine the structure of the cat we have to 46 determine which of the two projects. Just like T selects a verbal projection, D selects a nominal projection. Thus, the projects and cat is its complement as indicated. 
According to the principles of the X'-system a D together with its complement forms a D'. However, what is the status of cat? It is an N, but the projection line stops, hence it is also maximal, i.e. an NP. The solution is to say that being an XP is not an intrinsic property of a constituent, but determined by its environment. Thus, cat is an N that functions as an NP. Or, to return to the metaphor of the assembly line, cat is a part that is ready for use as a complement of D. The same question arises about the D'. It is a D', but the projection stops there. If we put it in a sentence, as in John saw the cat, a V-projection takes over. Hence it is maximal and functions as a DP. So, we could represent the structure as in (56): (56) a. [DP [D the] [NP cat]] b. DP D the NP cat Accordingly, expressions such as the cat, the big beautiful dog, every impoverished member of the Russian nobility are all DPs. But, if we take the idea that the bar-level can be read off the structure seriously, we can also notate the structure as in (57) without loss of information: (57) [D [D the] [N cat]] For convenience sake we will mostly use the more entrenched notation with XP, etc., and generally simplify the structures as much as possible. We have already seen many examples of DPs with modification by adjectives and prepositional phrases. We come back to their internal structure in the next two paragraphs. For the moment it suffices that just like nouns and verbs these heads project as phrases, which are referred to as AP and PP respectively. As modifiers they belong to the N-part of the projection. As is shown by DPs such as the ugly big bad dumb wolf with the rotten teeth from across the mountains both the number of modifying APs and the number of modifying PPs are in principle unlimited. If an AP left merges with an N-projection we get an N-projection; if a PP right-merges with an N-projection we again 47 get an N-projection. There is no structural limit to this process, nor does interpretation impose any limitations. Thus, the structural pattern is as indicated in (58), with --- and … standing for slots that can be filled : (58) DP Spec D' D NP (Spec) N' --- N' N' … N' … --N' N0 Complement The complement position is taken by elements that just like arguments of a verb receive a semantic role from the noun, like, for instance, Carthage in the destruction of Carthage, or the CP that history is bunk in the DP the claim that history is bunk. Note, that not only APs and PPs can be modifiers, but also clauses, in the form of relative clauses (the cat that I liked, the claim that Henry Ford made). An analysis of relative clauses would lead us beyond the scope of the present overview, however. What about the specifier position of N? Just like nouns are able to assign semantic roles to their complements, such as of Carthage in the destruction of Carthage, they can also assign agent roles as in Rome's destruction of Carthage. In clauses, as we saw, the subject DP is moved out of the VP to the specifier position of T. Similarly, in DPs the 'subject' shows up in the D-system, as can be seen from the fact that a determiner such as the or every and an agent cannot occur simultaneously (*the Rome's destruction of Carthage). The simplest solution is that here too the 'subject' does double duty and is moved from the specifier of N- to the specifier of D-position. 
The 's is taken to occupy the D-position, and the structure is as in 48 (59) with a trace in the specifier of N-position (and with the internal structure of the PP left unspecified): (59) DP Romei D' D NP 's N' ti N destruction PP of Carthage The position occupied by Rome in (59) is the same as the position occupied by John in John's cat, where John is the possessor of the cat (in so far as the cat allows this). For sake of completeness it should be mentioned that the DP has one position that is more to the left. This is the position of all in all John's cats. In line with the type of approach we found for the sentential domain, one may assume that the presence of all reflects a functional head that may take a DP as its complement. In addition to DPs like the big bad ugly wolf, we also have DPs such as these three big bad ugly wolves with a numeral in between. As we see from the fact that *these big three bad ugly wolves, or *these big bad ugly three wolves but also three these …. wolves are impossible, numerals occur to the right of determiners, but to the left of modifiers. This can be encoded in the structure by assuming that merging a numeral with an N-projection gives an expression that is no longer an N-projection. The ensuing projection is standardly analyzed as a number projection, with a Number head (Num) taking an NP as its complement. These three little pigs then comes out as: (60) DP D these NumP NP24 Num three N' AP A little N' N pigs 24 This notation hides a problem since it is not clear what motivates the structure [NP [N' AP [N' N]]. Simply following basic instructions for merger, one would expect to find [NP AP N] here. However, this notation does not differentiate between adjuncts and specifiers (and strictly speaking, not even between these and complements). One way to define a specifier is that it is a sister of an X' topped by an XP. If so, sufficient structure to encode this must be assumed. There are other ways to encode the difference between specifiers and adjuncts; discussing these would lead us too far away. 49 So far we have been discussing DPs with a head noun and a Determiner, such as the cat. Cat represents the type of noun that cannot be used as the complement of a verb by itself as shown by *John saw cat; it needs a determiner or being marked plural, as in John saw a/the cat/cats. Proper names and pronominals do not take determiners in English or Dutch. In fact, proper names combine nominal and determiner properties. This can be expressed by first merging a proper name as an N and then moving it into the D-position. Pronominals are generally taken to be just Ds. 4.5. The category A As is to be expected by now, the basic structural pattern of Aprojections is identical to that of V and N-projections. Adjectives may have complements, as her daughter in proud of her daughter. They may be modified, as in cases like the unexpectedly bright student. A large class of adjectives allows a special type of degree modification as in a very tall girl, that girl is taller than her brother, etc. Thus, APs fall squarely in the pattern of X'- structure given in (41). Some researchers analyze degree modifiers such as very, etc. as specifiers, others assume a functional category Degree (Deg) on top of the AP. Since for current purposes we may abstract away from this issue, we will simply use AP, leaving its internal structure unspecified. One further property should be noted. APs may occur within the NP to the left of the N, they may also occur in other positions, e.g. 
postnominally, as complements to verbs such as the copula be, etc. APs to the left of the N are said to be in attributive position. In other positions they are predicative. In English, attributive APs only allow a very simple structure. Predicative APs are not so restricted. So, we have the proud mother, the very proud mother, but not *the proud of her daughter mother. I saw the mother, proud of her daughter is fine though. In Dutch such a grammatical restriction does not apply, as shown by de op haar dochter zo trotse moeder 'litt: the at her daughter so proud mother'. 4.6. The category P Like in the case of the other categories, the basic structure of Pprojections follows from the fact that a P can be merged with a complement, such as the DP the room in into the room. In turn, this phrase may be specified by an expressions such as right, as in right into the room, or under as in under in the closet. Such structures will be analyzed as in (61): 50 (61) PP under P' P DP in D NP the N closet 5. Syntactic dependencies As discussed in the previous sections, the basic patterns of language arise from a simple combinatorial system operating on lexical elements (reflecting an underlying conceptual system) and functional elements representing instructions for their 'use'). This part of the computational system is rather trivial. Prima facie less trivial are the principles of our computational system governing the dual use of elements, that is, the dislocation property expressed by movement. For reasons of space we will limit ourselves to two types of movement, both of which we have already encountered. One is movement into the T-system, the other movement into the C-system. The former instantiates so-called A-movement, the latter A'-movement. 5.1. A-Movement As we saw in section 4.2.1, an object may show up in subject position if i) the verb is in participial form; ii) there is a form of to be as a passive auxiliary. The relevant configuration is given in (62). (62) T' T was VP V' V chased DP the mouse Due to the passive (participial) morphology on chased the latter cannot assign an agent role to its specifier. Hence the specifier stays empty. Given the fact that T needs a DP as in the case of it rains 51 discussed earlier, it attracts the only DP in the VP, namely the mouse, yielding (63) (using the trace notation): (63) TP T' DPi the mouse T was VP V' V chased ti This type of movement is called A-movement, since the DP moves into an argument position. Why do we call this a dependency? In fact there are two reasons. First, T must be able to 'see' the mouse, and attract it so it can be reused to help meet its own requirement that it needs a DP. But also very importantly, the dislocation interacts with the locality of thematic role assignment. Abstracting away from the particular mechanism we used, any person interpreting the mouse was chased, or Mary was depended on must be able to compute an interpretation in which the mouse in the mouse was chased is assigned the same role as the mouse in Fluffy chased the mouse. Furthermore, the complement position of chased cannot be filled by anything else, i.e. we cannot have *the mouse was chased Fluffy. We can have the agent in a so-called by-phrase, though, as in the mouse was chased by Fluffy. Note that a proper understanding of the passive morphology is essential for getting the right interpretation. 
The passive morphology represents the following instructions to the interpreter:
i) do not assign to the subject the role the active verb assigns to its subject;
ii) look for a gap, and assign to the subject the role that would otherwise have been assigned to an element in the position of the gap.

Sometimes this complex process is facilitated by semantic properties of the predicate, as for instance in the case of the apple was eaten by John. Here hard and fast superficial processing strategies will tell the interpreter that apples usually don't eat people. This sentence instantiates what is usually called a non-reversible passive. The Fluffy-sentence is instead a reversible passive. Fluffy could very well be a small kitten, and thus either chase a mouse, or be chased by it. So, only the passive morphology tells the interpreter that Fluffy is chased rather than doing the chasing herself. This condition is naturally satisfied in the mature and intact language system, but it need not be met in an immature or impaired system. It is a well-known fact that both for young children and for patients with certain types of language impairment this condition is not met, and hence such speakers have considerable problems with reversible passives.

A-movement is somewhat more general than discussed so far, since it can also take place out of certain infinitival clauses (clauses without marking for Past or Present). Consider to this end (64):

(64) It seems that the cat chased the mice

The verb seems has as its complement the sentence the cat chased the mice. It does not assign a thematic role to its specifier. This can be seen from the fact that we have the expletive subject it. The cat is simply the subject of chased the mice. Compare (64) to (65):

(65) The cat seems to chase the mice

Here, we still want to say that the cat has the role of the mice chaser: that she chased the mice seems to be the case. Semantically, (65) does not differ from (64) as to what is the argument of what. This is expressed by (66):

(66) [TP the cati [T' -s [VP seem- [TP ti [T' to [VP ti [V' chase the mice]]]]]]]

So, here the cat does dual (in fact triple) duty as the element getting the thematic role from chase, and satisfying the requirements of the T-system in the higher clause (plus satisfying the requirements of the T-system in the lower clause). A-movement can reach even further down as illustrated in (67), where the complement of seem is a passive clause:

(67) a. [TP The mousei [T' -s [VP seem [TP ti [T' to [VP have been [VP ti chased ti]]]]]]]
     b. (tree diagram corresponding to the bracketing in (a))

In our earlier discussion of dependencies we noted the local character of dependencies such as selection, sub-categorization and agreement. (67) shows that the domain of A-movement is more extended. However, there are also important restrictions on A-movement. For one thing, it cannot cross another DP. This is shown by the impossibility of (68), although there is no semantic reason why it would not make perfect sense:

(68) *Cindyi seemed that it appeared ti to be sick

A rough first approximation of the relevant principle is that structures of the following type cannot be derived by A-movement (but see below):

(69) DPi ......DPk....... ti
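The restriction in (69) can again be pictured in a toy sketch (Python; the flat list representation and the function name are invented here and greatly simplify the real structural relations). T, needing a DP in its specifier, searches its complement from the top down and can only attract the first DP it encounters; a DP further down, such as Cindy in (68), is therefore unreachable.

```python
# Sketch of 'attract the closest DP' (invented, simplified representation).

def attract_closest_dp(structure):
    """structure: (category, word) pairs, ordered from closest to furthest from T."""
    for category, word in structure:
        if category == 'DP':
            return word          # the first DP found is the only possible target
    return None

# Passive 'was chased the mouse': the object is the closest (and only) DP,
# so it is reused as the subject: 'The mouse was chased'.
print(attract_closest_dp([('V', 'chased'), ('DP', 'the mouse')]))

# (68) '*Cindy seemed that it appeared __ to be sick': the expletive 'it'
# intervenes, so 'Cindy' cannot be attracted across it.
print(attract_closest_dp([('V', 'appeared'), ('DP', 'it'), ('DP', 'Cindy')]))
# -> 'it', not 'Cindy'
```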
A-movement and A'-movement, as discussed in the next section, represent a general type of dependency that is called binding. Characteristic of binding is that a certain position must be interpreted by a computation crucially involving an element in another position. So, for the language processor the value of the traces in (67) can only be computed by having recourse to the value of the moved element the mouse. These traces are said to be bound by the mouse. Although in general binding can go far down, it is still local in some sense, as illustrated in (70). Intuitively, (70) expresses that the binder must be merged (reused in the cases discussed so far) with a constituent containing the bindee. The relation between XP and the bindee in (70) is called c-command (it has various definitions, but these in principle boil down to the configuration in (70)). The c-command condition on binding is one of the most pervasive structural conditions on dependencies in natural language.

(70) c-command: XPi [YP ...... bindeei...... ]

Thus, reversing XP and the bindee in (70) yields an impossible binding configuration, but also (71) is a configuration in which no binding between XP and the bindee can obtain:

(71) [ZP ...XPi... ] [YP ...... bindeei...... ]

Thus, in all cases of binding (72) holds:

(72) The binder c-commands the bindee
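Since c-command will return repeatedly, a toy definition may be useful for some readers. In the sketch below (Python; trees are simply nested pairs, an invented shorthand for the structures we have been drawing) a node c-commands everything contained in its sister, which is just configuration (70) restated.

```python
# Sketch: checking the c-command condition (72) on toy binary trees.
# A node c-commands everything contained in its sister node.

def contains(tree, target):
    if tree == target:
        return True
    return isinstance(tree, tuple) and any(contains(part, target) for part in tree)

def c_commands(tree, binder, bindee):
    """True if somewhere in the tree the binder's sister contains the bindee."""
    if not isinstance(tree, tuple):
        return False
    left, right = tree
    if left == binder and contains(right, bindee):
        return True
    if right == binder and contains(left, bindee):
        return True
    return c_commands(left, binder, bindee) or c_commands(right, binder, bindee)

# (67)-style configuration: 'the mouse' is merged with a constituent containing its trace.
tree = ('the mouse', ('-s', ('seem', ('t', ('to', ('have been', ('chased', 't')))))))
print(c_commands(tree, 'the mouse', 't'))   # True: the binder c-commands the bindee

# A configuration like (71): a binder buried inside a sister phrase does not c-command out.
tree2 = (('of', 'the mouse'), ('was', ('chased', 't')))
print(c_commands(tree2, 'the mouse', 't'))  # False
```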
Let us now turn to the type of movement we find in questions. This type of movement is called A'-movement because it typically puts an element in a position that is not an argument position of a predicate nor a position in which Case is assigned.

5.2. A'-Movement

A'-movement as instantiated by question formation we have already come across. In languages like Dutch, English and German, there is a standard procedure for questioning the value of a certain position in the sentence: Merge a question word (generally abbreviated as Wh-word) in the position of which you wish to elicit the value, and link it to the Force layer of the clause by moving it there:

(73) a. (John was wondering) whom he loved
     b. (John was wondering) [ --- [he loved whom] ]
     c. (John was wondering) [whomi [he loved ti ]]

In both English and Dutch this movement is possible over an unbounded domain, as illustrated in (74) for English:

(74) Whoi did you say that Bill told Mary that he was willing to bet a million bucks that she never considered to promise Cindy she would leave ti alone?

As in the case of A-movement it is easy to see that question word movement creates a dependency between the source and the target positions: i) it is impossible to fill both the source position and the target position: *John was wondering whom he loved Bill; ii) in the case of A'-movement the moved element can still show where it comes from in the form of its Case. So, the form whom in (73) can only be understood if we relate it to the object position. In English this is marginal due to the marginal Case system, and the one-way contrast of who/whom (who can be both subject and object, but whom can only be an object). In languages with rich morphological case, such as German or Russian, this is entirely transparent.25

25 It is important to note that the presence of such a dependency is entirely theory-neutral. There are other grammatical theories that do not employ the movement metaphor, but any theory acknowledges that there is a 'gap' in (72) or (73) that has to be interpreted by relating it to its antecedent.

Note that it is crucial for the interpretation of Wh-sentences that the person interpreting knows at least two things: i) a wh-element up front of the clause is part of the Force layer, and must therefore be interpreted as signalling a question; ii) a wh-element up front must be related to a gap (a trace, silent copy, etc.), and the interpreter's computational system must be able to figure out where that gap is. Both conditions are naturally satisfied in the intact language system, but they need not be met in an impaired system. Actually, it is a well-known fact that in certain types of impaired language systems these conditions are not met, and hence such patients have considerable problems interpreting wh-questions.

Locality conditions on Wh-movement

Just like in the case of A-movement, the interpretation of the trace/copy left by moving a question word must be 'reconstructed' from the moved element by binding. Consequently, Wh-movement also obeys the c-command condition. That is, the wh-word must c-command its trace, as in (75):

(75) c-command: Whi [YP ...... ti...... ]

There are also a number of interesting restrictions on Wh-movement.26

26 In actual fact, these restrictions also apply to A-movement, but this rarely shows up since other far stricter conditions prevent them from becoming relevant.

One very general condition with no known exception cross-linguistically is the impossibility of moving wh-words out of adverbial clauses. A case like (76) is true word salad in any language:

(76) *whomi did John write a novel after Cindy neglected ti

Facts like (76) instantiate what is generally known as island phenomena. Although island phenomena are well-described, and their general patterns are known, not all of them have yet been derived from the general workings of the computational system. A large class has been derived, however. Earlier in our discussion of A-movement we introduced a rough approximation of the relevant principle:

(77) Likes cannot be moved across likes

This derives the class of so-called Wh-islands. The general pattern is that Wh-words cannot be moved out of a clause headed by a Wh-word. A classical example is given in (78):

(78) *[CP1 whatj did [TP1 you ask [CP2 whoi [TP2 ti read tj ]]]]

In order to see how this works, let's take as a starting point the situation where both Wh's are still outside the Force domain:

(79) [CP2 C [TP2 whoi [read whatj ]]]

The requirement of dual use can be encoded by assuming that C has a feature that 'needs' Wh. In order to meet this need it attracts Wh. As a result one of the two Wh-elements must move to Spec-C. Take it to be the subject, yielding (80):

(80) [CP2 whoi C [TP2 ti read whatj]]

Suppose structure building goes further, yielding (81), where also the matrix C attracts Wh:

(81) [CP1 C did [TP1 you ask [CP2 whoi (C) [TP2 ti read whatj]]]]

However, given (77) the matrix C is not allowed to attract what. What cannot move over who. Hence attraction of what fails and (82) cannot be derived.

(82) *[CP1 whatj C did [TP1 you ask [CP2 whoi (C) [TP2 ti read tj]]]]

This is just one illustration of how potential movement can be blocked by an intervening element. Such intervention effects are quite pervasive in natural language, and they do not only raise the question of how precisely they are encoded in the grammatical mechanisms. Being so pervasive, they also raise the question of whether they can be explained in terms of the neural mechanisms that are involved in the execution of linguistic processes. The same holds true for this other pervasive property of language, namely that binding-type dependencies require c-command. The investigation of these issues constitutes one of the important challenges facing us.
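The stepwise attraction story in (79)-(82) can also be stated as a toy procedure (Python; again purely illustrative, with invented names). A C that needs a Wh-element can only attract the closest one; asking it to attract a more deeply embedded Wh across a closer one fails, which is the Wh-island effect of (78).

```python
# Sketch of the intervention effect in (77)-(82): C attracts only the closest Wh-element.

def attract_wh(candidates, target):
    """candidates: Wh-phrases ordered from closest to furthest from the attracting C."""
    if not candidates:
        return None, "nothing to attract"
    closest = candidates[0]
    if closest == target:
        return target, "ok"
    return None, f"'{target}' cannot be attracted across '{closest}'"

# Embedded clause 'who read what': the embedded C attracts the closest Wh, 'who', as in (80).
print(attract_wh(['who', 'what'], 'who'))
# -> ('who', 'ok')

# The matrix C of '*What did you ask who read?' tries to attract 'what',
# but 'who' is closer, so the attraction fails: a Wh-island, as in (82).
print(attract_wh(['who', 'what'], 'what'))
# -> (None, "'what' cannot be attracted across 'who'")
```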
Literature

Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.
Chomsky, Noam. 1986. Barriers. Cambridge, Mass.: MIT Press.
Chomsky, Noam. 1995. The Minimalist Program. Cambridge, Mass.: MIT Press.
Chomsky, Noam. 1998. Minimalist Inquiries. MIT Working Papers in Linguistics. Cambridge, Mass.: MIT.
Chomsky, Noam. 1999. Derivation by Phase. MIT Working Papers in Linguistics. Cambridge, Mass.: MIT.
Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-linguistic Perspective. Oxford: Oxford University Press.
Deacon, Terrence. 1997. The Symbolic Species. New York: Norton & Company.
Fromkin, Victoria, et al. 2000. Linguistics: An Introduction to Linguistic Theory. London: Blackwell.
Haegeman, Liliane. 1991. Introduction to Government & Binding Theory. Oxford: Blackwell.
Kandel, Eric, James Schwartz, and Thomas Jessell. 2000. Principles of Neural Science (4th edition). New York: McGraw-Hill.
Nilsen, Øystein. 2003. Eliminating Positions. LOT Dissertation Series. Utrecht: Roquade.
Pinker, Steven. 1994. The Language Instinct. New York: Morrow.
Pinker, Steven. 1999. Words and Rules: The Ingredients of Language. New York: Basic Books.
Reinhart, Tanya. 2002. The Theta System - an Overview. Theoretical Linguistics 28(3).
Reuland, Eric. 2002. Binding Theory. In: Nadel, L. (ed.) Encyclopedia of Cognitive Science, Vol. 1. London: Nature Publishing Group, 382-390.
Reuland, Eric. 2003. State-of-the-article. Anaphoric dependencies: A window into the architecture of the language system. GLOT International 7.1/2, 2-25.
Reuland, Eric and Martin Everaert. 2000. Deconstructing Binding. In: Mark Baltin and Chris Collins (eds.) Handbook of Syntactic Theory. London: Blackwell.
Smith, Neil, and Ianthi Tsimpli. 1995. The Mind of a Savant. London: Blackwell.
Ullman, Michael. 2001. The Declarative/Procedural Model of Lexicon and Grammar. Journal of Psycholinguistic Research 30(1), 3-69.
Wijnen, Frank, and Frans Verstraten. 2001. Het brein te kijk. Lisse: Swets en Zeitlinger.