BMN ANGD A2 Linguistic Theory
Lecture 3: Movement
Mark Newson

1 Pre-theoretical Concepts of Movement

The informal idea that elements move about in linguistic expressions is very natural and has probably been around for centuries. All of the following sentences seem to indicate that things move from one place to another in certain constructions:

(1) a  the goalie kicked the ball           the ball was kicked by the goalie
    b  you saw him                          who did you see
    c  you will tell him                    will you tell him
    d  he took a gun from the drawer        from the drawer, he took a gun
    e  a soldier in full uniform arrived    a soldier arrived in full uniform

It seems that there are numerous cases where sentences with at least related, if not virtually identical, meanings are related to each other through the reorganisation of their elements. It is therefore quite an intuitive and natural idea that natural language grammars involve processes by which things move about. Despite the obviousness of the idea, however, no one really took it beyond the informal level and certainly no one considered the consequences of having grammatical rules which change the position of words and phrases in a structure. Thus, we might find informal descriptions such as: ‘to form a question, move the question word to the front of the sentence’, or ‘in the passive, the object moves to the subject position’, but it was never discussed what kind of grammatical rule might be able to do such a thing.

2 Transformational Grammar

Last week we introduced Chomsky’s formalisation of the structuralists’ notion of Constituent Structure analysis, the Phrase Structure Grammar, and we briefly demonstrated why Chomsky thought that such a grammar was not an adequate one for modelling natural language phenomena. His main criticism was that Phrase Structure Grammars allow no way for independently generated structures to be connected.
This is problematic precisely because of sentences such as those in (1), where it is clear that the sentences in each pair are connected in some way. A phrase structure grammar would generate these sentences as it does all others and hence all sentences should have the same status, each related to every other sentence to exactly the same degree. Note that it is not just sharing the same lexical items that relates two sentences:

(2) a  John loves Mary
    b  Mary loves John
    c  Mary, John loves

All of these sentences share the same lexical stock, but (2a) and (2c) are more strongly related to each other than either is to (2b).

Chomsky also pointed out that there are ambiguities that do not appear to have a lexical or structural explanation. For example, consider:

(3) the shooting of the hunters

This is ambiguous in terms of whether the hunters are doing the shooting or getting shot. This is clearly not a case of lexical ambiguity as all the words mean the same thing in both interpretations. Yet it does not seem to be a case of structural ambiguity either, as the PP seems to be a modifier of the noun in both cases, presumably analysed by a simple PSG as follows:

(4) [NP [Det the] [N shooting] [PP of the hunters]]

Note that this nominal phrase is related to two very different sentences:

(5) a  the hunters shot something
    b  someone shot the hunters

What this gives us is virtually the opposite of the case where two syntactically different sentences are related, as in the sentences in (1):

(6) Mary was loved by John
    John loves Mary
    Mary, John loves

(7) the hunters shot something
    the shooting of the hunters
    someone shot the hunters

What is needed, Chomsky argued in 1957, is a new kind of rule that has the power to relate sentences. The kind of rule Chomsky envisioned, he termed a transformational rule.
In 1957, the system that Chomsky described was very different from the one he subsequently developed and, as it had little consequence for the subsequent development of transformational grammar, I will deal with it only briefly. Chomsky proposed that we need to identify a subset of a language’s sentences, called the kernel, which are more basic than the others. These were to be generated by a simplified phrase structure grammar. Transformational rules were rules which operated on kernel sentences to produce all of the others, thus accounting for relatedness between sentences: a kernel sentence will be related to all those which are produced from it via the transformational rules. Thus, the model looked like the following:

(8) Phrase Structure Grammar → Kernel sentences → Transformations → Transformed sentences

One obvious problem with this model is not only identifying which sentences are the kernel sentences but also justifying why this set should be selected as more basic than the others. Chomsky argued that kernel sentences were active, positive and declarative and that passive, negative and interrogative sentences were all produced by transformation from a kernel sentence. However, there is no real justification for claiming that active, positive, declarative sentences are any more basic than others and so the system seemed arbitrary at best. The model did not survive and so we will spend no more time on it.

From the start of the 1960s Chomsky developed Transformational Grammar along more familiar lines, with each expression of a language associated with two structural descriptions: one that holds before transformations take place, called a Deep Structure, and one that results from the operation of transformations, called a Surface Structure.
Phrase Structure Rules were responsible for forming Deep Structures and Transformations acted on these to form Surface Structures:

(9) Phrase Structure Rules → Deep Structure → T-Rules → Surface Structure

Although this meant that no two sentences were directly related to each other by transformations, Chomsky was still able to maintain a connection between different sentences by maintaining that there was a sufficient similarity in their Deep Structures. Thus an active and a passive sentence might both start with an identical Deep Structure, but end up with different Surface Structures by the application of different transformations in each case. Obviously this gets rid of the need to define a set of kernel sentences and so there is no need to explain why one type of sentence should be considered more basic than any other.

3 The Expansion of Transformations

During the 1960s a great number of transformational rules were proposed to account for a wide range of linguistic observations, mainly from English, but also from a steadily expanding set of other languages. It is worth considering a few of these to demonstrate how they worked and to see what problems they faced.

Let us start with the treatment of the English verbal system that Chomsky first proposed in his 1957 book. As we mentioned last week, Chomsky at first assumed the traditional perspective that the auxiliary verbs form a structural unit with the verb:

(10) [S [NP John] [VP [Verb [Aux has] [V read]] [NP the paper]]]

Thus we start with the following phrase structure rules:

(11) S → NP VP
     VP → Verb NP
     Verb → Aux V
     Aux → (have) (be)

Note that the use of the auxiliaries have and be is not mutually exclusive and is optional. This is represented by the brackets around the two Aux elements in the last rule.
But now we have to face the fact that, depending on which auxiliary is used, a different form of the verb appears:

(12) a  John has read the paper
     b  John is reading the paper

The key to understanding what is going on here can be found by observing the following sentence:

(13) John has been reading the paper

Note that the main verb is in its –ing form, while the auxiliary be is in its perfective form, represented by the morpheme –en. Given that the main verb is in the perfective form in (12a), where it follows the auxiliary have, we can conclude that any verb that follows have will be in its perfective form. By extension then, we can say that any verb that follows the auxiliary be will be in its –ing form. Therefore we have an association between have and the morpheme –en, and one between be and the morpheme –ing, and both of these morphemes end up on the following verbal element. To capture the association between the auxiliary verbs and the morphemes, Chomsky proposed the following:

(14) Aux → (have + en) (be + ing)

In other words, when the Aux node is expanded into words, both the auxiliary verb and its associated morpheme are inserted together and this accounts for the association of the two elements. Obviously the insertion of each auxiliary is an option. But the point is that if the option is taken, both the auxiliary and the morpheme will be inserted.
The rule set will produce the following sequences of elements:

(15) a  John + read + the + paper
     b  John + have + en + read + the + paper
     c  John + be + ing + read + the + paper
     d  John + have + en + be + ing + read + the + paper

To account for the fact that the morphemes end up on the following verbal element, Chomsky proposed the following transformation:

(16) structural description:  aff + V
     structural change:       # V + aff #

What this says is that if in a structure we find a sequence where an affix precedes a verb, we move the affix behind the verb and make the two together a single word (indicated by the word boundary symbols ‘#’). Thus applying this rule to (15) we get:

(17) a  John + read + the + paper
     b  John + have + # read + en # + the + paper
     c  John + be + # read + ing # + the + paper
     d  John + have + # be + en # + # read + ing # + the + paper

Which, when phonological adjustments have been made, come out as:

(18) a  John read the paper
     b  John has read the paper
     c  John is reading the paper
     d  John has been reading the paper

The analysis became known as the ‘affix hopping’ analysis because of the way that the affix hops backwards onto the following verbal element.

An important part of this analysis was the treatment Chomsky gave to the tense morpheme. Essentially he wanted to give the same treatment to this as to the other verbal morphemes and so it should be subject to the affix hopping transformation too. The important observation is that the tense morpheme always appears on the first verbal element and therefore we can surmise that its underlying position is in front of this element so that it can hop backwards onto it:

(19) ed + have → # have + ed # = had

To achieve this all we need to do is include the tense element in the Aux phrase structure rule:

(20) Aux → T (have + en) (be + ing)

Once this is done, the tense morpheme will be generated at the front of the verbal elements and by the affix hopping rule will become attached to whichever verb follows it.
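The Aux rule and the affix hopping transformation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (expand_aux, affix_hop, derive), the flat list-of-morphemes representation and the small AFFIXES and VERBS inventories are all hypothetical and are not part of Chomsky’s formalism. Leaving out the tense argument reproduces rule (14); passing a tense morpheme reproduces rule (20).

```python
# Assumed inventories for this sketch only; they are not an exhaustive lexicon.
AFFIXES = {"ed", "en", "ing"}
VERBS = {"have", "be", "read"}


def expand_aux(tense=None):
    """Enumerate the expansions of Aux, in the order of (15a-d).

    With tense=None this follows (14): Aux -> (have + en) (be + ing).
    With a tense morpheme it follows (20): Aux -> T (have + en) (be + ing).
    """
    expansions = []
    for use_have, use_be in [(False, False), (True, False),
                             (False, True), (True, True)]:
        aux = [tense] if tense else []
        if use_have:
            aux += ["have", "en"]   # auxiliary and its morpheme enter together
        if use_be:
            aux += ["be", "ing"]
        expansions.append(aux)
    return expansions


def affix_hop(morphemes):
    """Apply (16) left to right: aff + V  becomes  # V + aff #."""
    out = []
    i = 0
    while i < len(morphemes):
        cur = morphemes[i]
        nxt = morphemes[i + 1] if i + 1 < len(morphemes) else None
        if cur in AFFIXES and nxt in VERBS:
            out.append(f"# {nxt} + {cur} #")   # affix hops onto the verb
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


def derive(aux, subject="John", verb="read", obj=("the", "paper")):
    """Build subject + Aux + V + object and apply affix hopping."""
    return " + ".join(affix_hop([subject, *aux, verb, *obj]))


if __name__ == "__main__":
    for aux in expand_aux():
        print(derive(aux))
```

Run as a script, the loop prints the four strings in (17). The sketch stops before the phonological adjustments that turn, for example, # have + ed # into had.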
Note that T is an obligatory element and so every sentence will have a tense element in it. To account for infinitives and participles, Chomsky proposed that the infinitival to and the gerundive –ing could also be taken as instances of T. As to is not an affix, it would not undergo affix hopping, but the gerundive morpheme will behave like all the others. Thus we will get:

(21) … ed + read …                 read
     … ed + have + en + read …     had read
     … to + have + en + read …     to have read
     … ing + have + en + read …    having read

Now let us consider the slightly more complex transformation which is involved in forming the passive. As is well known, there are at least four differences between active and passive sentences: the object of the active corresponds to the subject of the passive; the subject of the active corresponds to the NP inside an optional by phrase in the passive which follows the verb; and there is an auxiliary be in the passive as well as a passive morpheme situated on the main verb:

(22) a  the goalie saved the ball
     b  the ball was saved (by the goalie)

Assuming the basic phrase structure rules in (11), we can assume that the basic subject position is before the verb and the basic object position follows it. Therefore in the active no transformation acts to move these elements. Of course, there are transformations which operate in active sentences, affix hopping for example, and so it is not the case that the Deep Structure and the Surface Structure of active sentences are identical. However, something more radical applies with the passive. We might propose the following:

(23) SD:  NP1 + Aux + V + NP2
     SC:  NP2 + Aux + be + en + V (by NP1)

Essentially what this says is that if we find a structure in which there is a subject (NP1), a verb and an object (NP2), we can apply the transformation in which the object replaces the subject, the auxiliary be is inserted followed by the morpheme –en, and the subject can optionally be placed in a by phrase following the verb.
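The passive rule in (23) can be written out as a small Python function. This is a minimal sketch under stated assumptions: the function name passivise, its argument layout and the keep_agent flag are hypothetical, and the structure is simplified to a flat sequence of morphemes with no tree geometry.

```python
def passivise(np1, aux, verb, np2, keep_agent=True):
    """Apply (23) to NP1 + Aux + V + NP2.

    Returns the changed sequence NP2 + Aux + be + en + V (by NP1),
    or None when the structural description is not met.
    """
    if np2 is None:
        # No object: the structural description of (23) fails.
        return None
    result = [np2, *aux, "be", "en", verb]
    if keep_agent:
        # The by phrase is optional in (23).
        result += ["by", np1]
    return result


if __name__ == "__main__":
    # Deep Structure of (22a), with the tense morpheme 'ed' as the Aux.
    print(" + ".join(passivise("the goalie", ["ed"], "save", "the ball")))
```

The output sequence still has its affixes in front of the verbal elements, so it would need to pass through affix hopping before it corresponds to (22b).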
Assuming that the passive transformation applies before the affix hopping rule, the passive morpheme will be treated like the others and land on the following main verb.

Although this is a little primitive and there are obviously technical issues to be addressed, even at this point we can point out some positive aspects of the analysis. First, it is predicted that only transitive verbs will be able to passivise. This is because for the transformation to be able to operate, its structural description must be met. This states that there must be an object and only a transitive verb will have an object. Hence John was smiled will be ungrammatical because no transformation could produce it. Second, any restrictions which apply to a verb’s subject and object need only be stated with respect to Deep Structures; they will apply equally to active and passive sentences and will not have to be restated in mirror image for each. So the grammaticality of (24a) has the same source as the grammaticality of (25a) and the ungrammaticality of (24b) has the same source as that of (25b):

(24) a    sincerity frightens John
     b  * John frightens sincerity

(25) a    John is frightened by sincerity
     b  * sincerity is frightened by John

4 The Problem With Transformations

The passive transformation is a good example of what came to be seen as the main problem for this type of analysis. This one transformation apparently has the ability to move or perhaps simply delete the subject, move the object into the vacated subject position and insert various things into the structure, thereby radically changing the whole thing. Of course, at first, this was seen as the most positive aspect of transformations: you can analyse anything with them! But this turns out to be a huge problem: if you can do anything with a transformation, it is impossible to explain why certain things happen in certain constructions in certain languages.
In other words, transformations offer a very good way of describing linguistic phenomena, but they cannot explain them.

The same argument can be put in terms of language acquisition, which, as we have already mentioned, is what Chomsky claimed had to be accounted for before a theory could be considered explanatorily adequate. Suppose human infants are born possessing a rudimentary grammar, with a phrase structure component for generating Deep structures and a transformational component for generating Surface structures. What they have to learn is the details of the grammar: what particular phrase structure rules and transformations are in operation in the language being spoken around them. If transformations are unlimited, this component of the grammar must start as a complete blank – the child may know that there are transformations, but they can have no idea about which transformations are actually being used. The task would then be to figure out, from listening to the sentences spoken, what transformations are in use. Given the complexity and number of transformations needed, this task would be impossible.

The only way to proceed from this is to limit the power of transformations. If there are limits on what can possibly be a transformational rule, these will simplify the task of language acquisition as they will automatically discount possible hypotheses that the learner might otherwise consider. To give an analogy: it is like looking for a needle in a haystack, but being told which square metre of the haystack to search. Obviously, the greater the limitations we place on the search space, the easier it is to find the needle!

Other technical issues arise from consideration of the passive transformation. For example, consider the deletion process assumed to optionally take place with the subject.
Presumably if grammars are to be used in the expression and decoding of utterances, a hearer must have the ability to identify the relevant Deep structure for any given sentence. However, any deletions, unless very restricted, will make the construction of the exact Deep structure rather hard. For example, given the passive sentence in (26), it might have been formed from any of the Deep structures given in (27):

(26) John was seen

(27) a  Bill + saw + John
     b  Mary + saw + John
     c  that + man + saw + John
     d  that + tall + man + saw + John
     e  that + tall + man + who + just + left + saw + John
     f  …

Clearly the set of possible Deep structures associated with (26) is limitless and therefore the task of recovering the right one is impossible. Of course, one might consider strategies that could be adopted to ensure that some Deep structure could be recovered: taking contextual clues for example, or constructing a nonspecific subject with a meaning similar to ‘someone’. However, there are reasons to believe that human grammars do not do this sort of thing and that when they have to reconstruct missing information in a sentence, they do so under very strictly limited conditions. For example, consider a straightforward case of ellipsis, in which part of a sentence is simply left unsaid:

(28) I don’t want to be late and neither do you

It is blatantly clear what the missing part of this sentence is: want to be late, and indeed, it couldn’t be interpreted as anything else. This holds even if we try to set up a context in which some other interpretation would be possible, e.g. one in which the topic of the conversation is something that you don’t want to do:

(29) Mary just told me that you don’t want to go to the party. But I don’t want to be late and neither do you (≠ want to go to the party)

It seems that what is going on here is that when we need to recover missing elements, we can do so only from a very restricted source – essentially a corresponding position in a preceding coordinate sentence.
For reasons such as these, syntacticians started to see deletion processes as problematic and therefore to be avoided in analyses. In fact, this fitted with the idea that transformations needed to be limited: if we assume that transformations do not delete material, we have made a start on the whole business of restricting the transformational component.

To return to the analysis of the passive then, what can we say about the subject? Obviously, it would be simpler if we assumed that the Deep structure of a passive sentence such as (26) simply lacks a subject and therefore such an element does not have to be reconstructed. We might therefore revise the transformation thus:

(30) SD:  __ + Aux + V + NP
     SC:  NP + Aux + be + en + V

Ignoring the auxiliary and the morpheme for the time being, the passive transformation is therefore much simplified as the movement of the object into an empty subject position.

Of course, this raises the question of the by phrase: where does it come from? The simple answer would be to say that this is not inserted by the transformation at all, but is optionally included at Deep structure as any other modifying PP is:

(31) John was seen (by Bill) (at 10 o’clock) (in the coffee shop) (in full military dress)

It remains to be explained why we do not have such a by phrase in active sentences, but this is not difficult. To start with, there are no problems with by phrases per se in active sentences:

(32) a  he built the structure by hand
     b  he hadn’t finished the project by midday

Specifically what is ruled out is the case where the by phrase has the same interpretation as the subject:

(33) * Bill saw John by Mary

But this is not unique to the passive. PPs which have the same interpretation as other parts of the sentence are not allowed either:

(34) * I saw him on Sunday on Saturday

Clearly these sorts of sentences contradict themselves and this is the source of their unacceptability.
This is exactly what is going on in (33) and so nothing new has to be said about this. In a passive where there is no subject, there is no conflict with such a by phrase and hence its appearance is without issue.

Note that this analysis allows us to simplify the rule so that not only is the subject not optionally deleted, but the by phrase is not optionally inserted either. All that happens is that with a passive Deep structure, which inherently lacks a subject, the object obligatorily moves to the subject position.

However, as things stand, we are still allowing for the insertion of the auxiliary verb and the passive morpheme. Is this entirely necessary? Given that we have a phrase structure rule which is responsible for the insertion of auxiliaries and their associated morphemes, it is slightly odd that we should have a transformation that does the same thing. So again it would be simpler if we were to assume that this aspect of the passive was not part of the transformation at all, in which case the passive transformation reduces to the simple process of moving the object to the subject position.

Of course, it still needs to be accounted for why the passive auxiliary and morpheme are obligatorily present in passive structures. One obvious way to approach this would be to make the auxiliary and morpheme part of the structural description of the transformation so that passivisation could not apply if they were absent:

(35) SD:  __ + be + en + V + NP
     SC:  NP + be + en + V

However, this is perhaps not the best idea as it only accounts for why passive movement is not possible in the absence of the auxiliary and morpheme, not why passive sentences are ungrammatical without them. It seems that there are more basic conditions that require these elements to be present in passive structures and presumably these conditions apply to Deep Structures and have nothing to do with the movement itself.
Another reason why the auxiliary and morpheme should not be seen as part of the condition on this movement is that there are very similar cases of the movement that do not involve these things. Consider the following:

(36) a  I saw the building demolished
     b  the building got demolished
     c  the building appeared to be old

In the first case, we appear to have an embedded passive sentence in which the building is interpreted as the object of the verb, but sits in a position in front of the verb reminiscent of the subject. Importantly, there is no auxiliary present. Therefore if we wish to use the passive transformation to account for this structure, the transformation cannot make reference to the presence of the auxiliary. In this case, the morpheme is still present however. But in the second case, while the morpheme is present, it is not placed on the verb whose subject position is moved into by the passivised object (i.e. get). In the final case we do not have an instance of passivisation at all. However, there does appear to be a movement which has very similar properties to the passive. This can be seen by considering the following related sentence:

(37) it appeared that the building was old

In this case, the building is in the subject position of the embedded sentence and, given that (36c) has a similar interpretation, we can assume that this is the same position that this element starts off in in (36c). Thus, a reasonable analysis of (36c) would be:

(38) Deep Structure:     __ appeared [the building to be old]
     Surface Structure:  the building appeared [ to be old]

It is fairly obvious that this shares with the passive construction the movement of an NP into a higher empty subject position.
Indeed, the fact that we actually get passive constructions identical to this demonstrates that the similarity is more than coincidental:

(39) a  it was believed [that the building was old]
     b  the building was believed [to be old]

As we would need to add the possibility of (39b) to the passive transformation, it would be an obvious extension to include (36c) as well. Yet in order to do this the transformation cannot refer to the passive auxiliary or morpheme as these have nothing to do with the movement in the latter. What we end up with is the following:

(40) SD:  __ … V NP
     SC:  NP … V

In other words, whenever we have a Deep structure where there is an empty subject and an NP immediately following a verb, the NP is moved into the subject position.

There are two important things to note at this point. Comparing this version of the ‘passive transformation’ to the original one in (23), (40) is simpler and more general. It is simpler because it does not include all of the specific properties of particular passive constructions (the by phrase and the auxiliary, for example), and it is more general because it applies to more structures, some of which are not even passives. For this reason, during the 1970s this movement became known as NP movement and was assumed to be part of the analysis of other kinds of structure too.

We will see next week that the development we have traced here, in which the adding of restrictive assumptions about transformations led to a situation in which transformations themselves became simpler and more general, was a remarkably common one in the history of generative grammar. It is clearly not a logical necessity that restriction should lead to simplicity and generality, and so the fact that these were results of this restrictive programme of development has been taken as indicative that this programme is heading in the right direction.
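The generality of (40) can be illustrated with a short Python sketch in which one function handles both a passive and a structure like (38). Everything about the encoding is an assumption made for illustration: structures are flat lists of (category, text) pairs, the empty subject ‘__’ is represented as ("NP", None), the function name np_move is hypothetical, and clause boundaries such as the brackets in (38) are ignored.

```python
# Hypothetical encoding of the empty subject position '__'.
EMPTY = ("NP", None)


def np_move(structure):
    """Apply (40): __ ... V NP  becomes  NP ... V.

    structure is a list of (category, text) pairs. Returns the changed
    list, or None when the structural description is not met.
    """
    if not structure or structure[0] != EMPTY:
        return None                       # no empty subject to move into
    for i in range(1, len(structure) - 1):
        if structure[i][0] == "V" and structure[i + 1][0] == "NP":
            moved = structure[i + 1]      # the NP immediately after the verb
            # The NP fills the subject position; everything else stays put.
            return [moved] + structure[1:i + 1] + structure[i + 2:]
    return None


if __name__ == "__main__":
    passive = [EMPTY, ("Aux", "be + en"), ("V", "see"), ("NP", "John")]
    raising = [EMPTY, ("V", "appeared"), ("NP", "the building"),
               ("Inf", "to be old")]
    for deep in (passive, raising):
        print(" ".join(text for _, text in np_move(deep)))
```

The function never mentions be, en or a by phrase, which is the sense in which (40) is simpler than (23); the same code applies to the passive and to the appear structure, which is the sense in which it is more general.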
5 Conclusion

What we have presented in this lecture is a rather condensed and somewhat ‘tunnel-visioned’ view of the development of the analysis of the passive construction from the early 1960s to the mid 1970s. It is tunnel-visioned because this development did not take place in isolation: the treatment of other constructions was also developing along similar lines at the same time. However, the passive offers a good example of the more general process of grammatical development.

One of the lines of development we have concentrated on above is the idea that the transformational component of the grammar became simpler and more general over time. To start, it consisted of many rules which were specific to the kinds of constructions they were supposed to generate. This can be clearly seen in the original idea that there was a subset of sentence types, the kernel sentences, that were basic and that all other sentence types had to be produced from these via a series of transformations specifically designed to generate them. For every non-kernel sentence type, then, there would have to be at least one transformation, and possibly more. Over time, the transformational component became smaller as construction types were no longer in a one-to-one correspondence with transformations, but a few transformations were seen to be involved in many different constructions. This progression marks a change in view concerning what linguists should be doing when investigating language, from developing grammatical descriptions of specific phenomena, towards the identification of the general processes involved in linguistic generation. Specific linguistic phenomena from this point of view are merely the epiphenomena of the far more interesting linguistic system.
Of course, this ties in well with Chomsky’s aim of producing a Universal Grammar, which would go a long way to account for the fact of language acquisition, as transformations which are construction specific are necessarily going to be language specific, while more general transformations stand a chance of being applicable to a range of phenomena cross-linguistically.

Another part of this development which we have not said much about is the change of view of the nature of the levels of description for linguistic expressions, particularly that of Deep Structure. The original concept was to generate one sentence type from another. So the underlying forms were proper sentences in their own right. This rapidly changed so that Deep Structures were not considered to be governed by the conditions on surface forms and so were considerably more abstract. However, in the earlier versions of the theory, the Deep Structures were probably more closely related to certain surface forms than others and as such represented certain surface phenomena more closely. So, the Deep Structure for the passive still mirrored the general form of the active, including the presence of a subject. Later, Deep Structures became more associated with particular surface structures, but more abstract as well. Thus active and passive sentences were associated with different Deep Structures and were not connected by sharing the same one. The Deep Structure of the passive lacked a subject and hence was nothing like any grammatical Surface Structure.

To some extent then, this development moves away from the original motivation for transformational rules: something designed to account for connections between different sentences. While active and passive sentences are still connected by having certain features represented at Deep Structure in common (the object is in object position in both), this connection is more subtle than it was at first assumed to be.
However, by this time, the generality of the analysis gained by using transformations had become its own motivation and hence Chomsky’s original arguments were not so central.