BMN ANGD A2 Linguistic Theory
Lecture 3: Movement
Pre-theoretical Concepts of Movement
The informal idea that elements move about in linguistic expressions is very natural and has
probably been around for centuries. All of the following sentences seem to indicate that
things move from one place to another in certain constructions:
the goalie kicked the ball / the ball was kicked by the goalie
you saw him / who did you see
you will tell him / will you tell him
he took a gun from the drawer / from the drawer, he took a gun
a soldier in full uniform arrived / a soldier arrived in full uniform
It seems that there are numerous cases where sentences with at least related, if not virtually
identical meanings are related to each other through the reorganisation of their elements. It is
therefore quite an intuitive and natural idea that natural language grammars involve processes
by which things move about.
Despite the obviousness of the idea, however, no one really took the idea beyond the informal
level and certainly no one considered the consequences of having grammatical rules which
change the position of words and phrases in a structure. Thus, we might find informal
descriptions such as: ‘to form a question, move the question word to the front of the
sentence’, or ‘in the passive, the object moves to the subject position’, but it was never
discussed what kind of a grammatical rule might be able to do such a thing.
Transformational Grammar
Last week we introduced Chomsky’s formalisation of the structuralists’ notion of Constituent
Structure analysis, the Phrase Structure Grammar, and we briefly demonstrated why
Chomsky thought that such a grammar was not an adequate one for modelling natural
language phenomena. His main criticism was that Phrase Structure Grammars allow no way
for independently generated structures to be connected. This is problematic precisely because
of sentences such as those in (1), where it is clear that these pairs of sentences are connected
in some way. A phrase structure grammar would generate these sentences as it does all others
and hence all sentences should have the same status, related to any other sentence to exactly
the same degree. Note that it is not just sharing the same lexical items that relates two sentences:
John loves Mary
Mary loves John
Mary, John loves
All of these sentences share the same lexical stock, but (2a) and (2c) are more strongly related
to each other than either is to (2b).
Chomsky also pointed out that there are ambiguities that do not appear to have a lexical or
structural explanation. For example, consider:
the shooting of the hunters
Mark Newson
This is ambiguous in terms of whether the hunters are doing the shooting or getting shot. This
is clearly not a case of lexical ambiguity as all the words mean the same thing in both
interpretations. Yet it does not seem to be a case of structural ambiguity either as the PP
seems to be a modifier of the noun in both cases, presumably analysed by a simple PSG as
the shooting of the hunters
Note that this nominal phrase is related to two very different sentences:
the hunters shot something
someone shot the hunters
What this gives us is virtually the opposite to the case where two syntactically different
sentences are related, as in the sentences in (1):
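[Diagram: on one side, the three related sentences ‘Mary was loved by John’, ‘John loves Mary’ and ‘Mary, John loves’; on the other, the single phrase ‘the shooting of the hunters’ linked to both ‘the hunters shot something’ and ‘someone shot the hunters’.]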
What is needed, Chomsky argued in 1957, is a new kind of rule that has the power to relate
sentences. Chomsky termed the kind of rule he envisioned a transformational rule. The system
he described in 1957 was very different from the one he subsequently developed, and as it had
little consequence for the subsequent development of transformational grammar, I
will deal with it only briefly. Chomsky proposed that we need to identify a subset of a
language’s sentences, called the kernel, which are more basic than the others. These were to be
generated by a simplified phrase structure grammar. Transformational rules were rules which
operated on kernel sentences to produce all of the others, thus accounting for relatedness
between sentences: a kernel sentence will be related to all those which are produced from it
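via the transformational rules. Thus, the model looked like the following:
[Diagram: simplified phrase structure grammar → kernel sentences → transformational rules → all other sentences]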
One obvious problem with this model is not only identifying which sentences belong to the kernel
but also justifying why this set should be selected as more basic than the others. Chomsky
argued that kernel sentences were active, positive and declarative and that passive, negative
and interrogative sentences were all produced by transformation from a kernel sentence.
However, there is no real justification for claiming that active, positive declarative sentences
are any more basic than others and so the system seemed arbitrary at best. The model did not
survive and so we will spend no more time on it.
From the start of the 1960s Chomsky developed Transformational Grammar along more
familiar lines, with each expression of a language associated with a structural description
which applied before transformations took place, called a Deep Structure, and a structural
change effected by the operations of transformations, called a Surface Structure. Phrase
Structure Rules were responsible for forming Deep Structures and Transformations acted on
these to form Surface Structures:
Phrase Structure Rules
  ↓
Deep Structure
  ↓ (Transformations)
Surface Structure
Although this meant that no two sentences were directly related to each other by
transformations, Chomsky was still able to maintain a connection between different sentences
by maintaining that there was a sufficient similarity in their Deep Structures. Thus an active
and a passive sentence might both start with an identical Deep Structure, but end up with
different Surface Structures by the application of different transformations in each case.
Obviously this gets rid of the need to define a set of kernel sentences and so there is no need
to explain why one type of sentence should be considered more basic than any other.
The Expansion of Transformations
During the 1960s a great number of transformational rules were proposed to account for a
wide range of linguistic observations, mainly from English, but also from a steadily
expanding set of other languages. It is worth considering a few of these to demonstrate
how they worked and to see what problems they faced.
Let us start with the treatment of the English verbal system that Chomsky first proposed in
his 1957 book. As we mentioned last week Chomsky at first assumed the traditional
perspective that the auxiliary verbs form a structural unit with the verb:
[VP [Verb [Aux has] [V read]] [NP the paper]]
Thus we start with the following phrase structure rules:
VP → Verb NP
Verb → Aux V
Aux → (have) (be)
Note that the use of the auxiliaries have and be is not mutually exclusive and is optional. This
is represented by the brackets around the two Aux elements in the last rule. But now we have
to face the fact that depending on which auxiliary is used a different form of the verb appears:
John has read the paper
John is reading the paper
The key to understanding what is going on here can be found by observing the following example:
John has been reading the paper
Note that the main verb is in its –ing form, while the auxiliary be is in its perfective form
represented by the morpheme –en. Given that the main verb is in the perfective form in (12a)
where it follows the auxiliary have, we can conclude that any verb that follows have will be
in its perfective form. By extension then, we can say that any verb that follows the auxiliary
be will be in its –ing form. Therefore we have an association between have and the
morpheme –en, and one between be and the morpheme –ing, and both of these morphemes
end up on the following verbal element. To capture the association between the auxiliary
verbs and the morphemes, Chomsky proposed the following:
Aux → (have + en) (be + ing)
In other words, when the Aux node is expanded into words, both the auxiliary verb and its
associated morpheme are inserted together and this accounts for the association of the two
elements. Obviously the insertion of each auxiliary is an option. But the point is that if the
option is taken, both the auxiliary and the morpheme will be inserted. The rule set will
produce the following sequences of elements:
John + read + the + paper
John + have + en + read + the + paper
John + be + ing + read + the + paper
John + have + en + be + ing + read + the + paper
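The way the optional brackets in the Aux rule yield exactly these four strings can be illustrated with a minimal Python sketch. The names OPTIONAL_AUX, expand_aux and sentences are hypothetical and introduced only for illustration. The sketch assumes that each auxiliary is inserted as a unit with its affix and that the order have before be is fixed, as in the rule.

```python
from itertools import product

# Each optional Aux element is an auxiliary bundled with its affix
# (hypothetical encoding of the rule Aux -> (have + en) (be + ing)).
OPTIONAL_AUX = [["have", "en"], ["be", "ing"]]

def expand_aux():
    """Yield every expansion of Aux: each bracketed element is in or out."""
    for choices in product([False, True], repeat=len(OPTIONAL_AUX)):
        seq = []
        for chosen, pair in zip(choices, OPTIONAL_AUX):
            if chosen:
                seq.extend(pair)  # auxiliary and affix always come together
        yield seq

def sentences(subject="John", verb="read", obj=("the", "paper")):
    """Build the flat morpheme strings generated by the rule set."""
    return [" + ".join([subject, *aux, verb, *obj]) for aux in expand_aux()]

for s in sentences():
    print(s)
```

Two optional elements give four combinations. No expansion contains have without en or be without ing, which is the association the rule is meant to capture.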
To account for the fact that the morphemes end up on the following verbal element, Chomsky
proposed the following transformation:
structural description: aff + V
structural change: # V + aff #
What this says is that if in a structure we find a sequence where an affix precedes a verb, we
move the affix behind the verb and make the two together a single word (indicated by the
word boundary symbols ‘#’). Thus applying this rule to (15) we get:
John + read + the + paper
John + have + # read + en # + the + paper
John + be + # read + ing # + the + paper
John + have + # be + en # + # read + ing # + the + paper
When phonological adjustments have been made, these come out as:
John read the paper
John has read the paper
John is reading the paper
John has been reading the paper
The analysis became known as the ‘affix hopping’ analysis because of the way that the affix
hops backwards onto the following verbal element.
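The affix hopping transformation can be sketched as a simple rewriting procedure over a flat list of morphemes. This is an illustration only, not Chomsky's formalism. The sets AFFIXES and VERBS and the function affix_hop are hypothetical names, and the sketch assumes that an affix hops only when a verbal element immediately follows it.

```python
AFFIXES = {"en", "ing"}           # assumed inventory of verbal affixes
VERBS = {"have", "be", "read"}    # assumed inventory of verbal elements

def affix_hop(morphemes):
    """Rewrite every 'aff + V' sequence as the single word '# V + aff #'."""
    out, i = [], 0
    while i < len(morphemes):
        cur = morphemes[i]
        nxt = morphemes[i + 1] if i + 1 < len(morphemes) else None
        if cur in AFFIXES and nxt in VERBS:
            # structural description met: place the affix behind the verb
            out.append(f"# {nxt} + {cur} #")
            i += 2
        else:
            out.append(cur)
            i += 1
    return out

deep = "John + have + en + be + ing + read + the + paper".split(" + ")
print(" + ".join(affix_hop(deep)))
# John + have + # be + en # + # read + ing # + the + paper
```

A string with no affix, such as John + read + the + paper, does not meet the structural description and passes through unchanged.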
An important part of this analysis was the treatment Chomsky gave to the tense morpheme.
Essentially he wanted to give the same treatment to this as the other verbal morphemes and so
it should be subject to the affix hopping transformation too. The important observation is that
the tense morpheme always appears on the first verbal element and therefore we can surmise
that its underlying position is in front of this element so that it can hop backwards onto it:
ed + have → # have + ed # = had
To achieve this all we need to do is include the tense element in the Aux phrase structure rule:
Aux → T (have + en) (be + ing)
Once this is done, the tense morpheme will be generated at the front of the verbal elements
and by the affix hopping rule will become attached to whichever verb follows it. Note that it
is an obligatory element and so every sentence will have a tense element in it. To account for
infinitives and participles, Chomsky proposed that the infinitival to and the gerundive –ing
could also be taken as instances of T. As to is not an affix, it would not undergo affix
hopping, but the gerundive morpheme will behave like all the others. Thus we will get:
… ed + read … → read
… ed + have + en + read … → had read
… to + have + en + read … → to have read
… ing + have + en + read … → having read
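The different behaviour of the three instances of T can be sketched in the same way. In the following minimal Python illustration, to is simply left out of the set of affixes, so it never hops. The SPELL_OUT table is a hypothetical stand-in for the phonological adjustments, which the analysis itself does not spell out.

```python
AFFIXES = {"ed", "en", "ing"}     # 'to' is deliberately absent: it is not an affix
VERBS = {"have", "be", "read"}

# Hypothetical lookup table standing in for the phonological adjustments
SPELL_OUT = {
    ("have", "ed"): "had",
    ("have", "ing"): "having",
    ("read", "ed"): "read",
    ("read", "en"): "read",
}

def affix_hop(morphemes):
    """Attach each affix to the verb that immediately follows it."""
    out, i = [], 0
    while i < len(morphemes):
        if (morphemes[i] in AFFIXES and i + 1 < len(morphemes)
                and morphemes[i + 1] in VERBS):
            out.append((morphemes[i + 1], morphemes[i]))  # (verb, affix) = one word
            i += 2
        else:
            out.append(morphemes[i])
            i += 1
    return out

def spell_out(units):
    """Replace each (verb, affix) word with its surface form."""
    return " ".join(SPELL_OUT[u] if isinstance(u, tuple) else u for u in units)

for tense in ("ed", "to", "ing"):
    print(spell_out(affix_hop([tense, "have", "en", "read"])))
# had read
# to have read
# having read
```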
Now let us consider the slightly more complex transformation which is involved in forming
the passive. As is well known, there are at least four differences between active and passive
sentences: the object of the active corresponds to the subject of the passive; the subject of the
active corresponds to the NP inside an optional by phrase in the passive which follows the
verb; there is an auxiliary be in the passive as well as a morpheme situated on the main verb:
the goalie saved the ball
the ball was saved (by the goalie)
Assuming the basic phrase structure rules in (11), we can assume that the basic subject
position is before the verb and the basic object position follows it. Therefore in the active no
transformation acts to move these elements. Of course, there are transformations which
operate in active sentences, affix hopping for example, and so it is not the case that the Deep
Structure and the Surface Structure of active sentences are identical. However, something
more radical applies with the passive. We might propose the following:
NP1 + Aux + V + NP2
NP2 + Aux + be + en + V (by NP1)
Essentially what this says is that if we find a structure in which there is a subject (NP1), a verb
and an object (NP2), we can apply the transformation in which the object replaces the subject,
the auxiliary be is inserted followed by the morpheme –en and the subject can optionally be
placed in a by phrase following the verb. Assuming that the passive transformation applies
before the affix hopping rule, the passive morpheme will be treated like the others and land
on the following main verb.
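The passive transformation as just stated can be sketched as a function that first checks the structural description and then carries out the structural change. This is a minimal Python illustration. The encoding of a structure as a list of (category, morphemes) pairs and the name passivise are assumptions made for the example.

```python
def passivise(structure, keep_agent=True):
    """Apply NP1 + Aux + V + NP2  =>  NP2 + Aux + be + en + V (by NP1).

    `structure` is a list of (category, morphemes) pairs. Returns None when
    the structural description is not met.
    """
    categories = [cat for cat, _ in structure]
    if categories != ["NP", "Aux", "V", "NP"]:
        return None  # e.g. an intransitive verb: no object, no passive
    (_, np1), (_, aux), (_, verb), (_, np2) = structure
    result = [*np2, *aux, "be", "en", *verb]
    if keep_agent:  # the by phrase is optional
        result += ["by", *np1]
    return result

transitive = [("NP", ["the", "goalie"]), ("Aux", ["ed"]),
              ("V", ["save"]), ("NP", ["the", "ball"])]
intransitive = [("NP", ["John"]), ("Aux", ["ed"]), ("V", ["smile"])]

print(passivise(transitive))
# ['the', 'ball', 'ed', 'be', 'en', 'save', 'by', 'the', 'goalie']
print(passivise(intransitive))
# None
```

If affix hopping then applies to the output, ed attaches to be and en to save, giving the ball was saved by the goalie after phonological adjustment.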
Although this is a little primitive and there are obviously technical issues to be addressed,
even at this point we can point out some positive aspects of the analysis. First it is predicted
that only transitive verbs will be able to passivise. This is because for the transformation to be
able to operate, its structural description must be met. This states that there must be an object
and only a transitive verb will have an object. Hence John was smiled will be ungrammatical
because no transformation could produce it. Second, any restrictions which apply to a verb’s
subject and object need only to be stated with respect to Deep Structures and they will apply
equally to active and passive sentences and they will not have to be restated in mirror image
for each. So the grammaticality of (24a) has the same source as the grammaticality of (25a)
and the ungrammaticality of (24b) has the same source as (25b):
sincerity frightens John
* John frightens sincerity
John is frightened by sincerity
* sincerity is frightened by John
The Problem With Transformations
The passive transformation is a good example of what came to be seen as the main problem for
this type of analysis. This one transformation apparently has the ability to move or perhaps
simply delete the subject, move the object into the vacated subject position and insert various
things into the structure, thereby radically changing the whole thing. Of course, at first, this was
seen as the most positive aspect of transformations: you can analyse anything with them!
But this turns out to be a huge problem: if you can do anything with a transformation, it is
impossible to explain why certain things happen in certain constructions in certain languages.
In other words, transformations offer a very good way of describing linguistic phenomena,
but they cannot explain them.
The same argument can be put in terms of language acquisition, which, as we have already
mentioned, is what Chomsky claimed had to be accounted for before a theory could be
considered explanatorily adequate. Suppose human infants are born possessing a rudimentary
grammar, with a phrase structure component for generating Deep structures and a
transformational component for generating Surface structures. What they have to learn is the
details of the grammar: what particular phrase structure rules and transformations are in
operation in the language being spoken around them. If transformations are unlimited, this
component of the grammar must start as a complete blank – the child may know that there are
transformations, but they can have no idea about which transformations are actually being
used. The task would then be to figure out, from listening to sentences spoken, what
transformations are in use. Given the complexity and number of transformations needed, this
task would be impossible.
The only way to proceed from this is to limit the power of transformations. If there are limits
on what can possibly be a transformational rule, these will simplify the task of language
acquisition as it will automatically discount possible hypotheses that the learner might
otherwise consider. To give an analogy: it is like looking for a needle in a haystack, but being
told which square meter of the haystack to search. Obviously, the greater the limitations we
place on the search space, the easier it is to find the needle!
Other technical issues arise from consideration of the passive transformation. For example,
consider the deletion process assumed to optionally take place with the subject. Presumably if
grammars are to be used in the expression and decoding of utterances, a hearer must have the
ability to identify the relevant Deep structure for any given sentence. However, any deletions,
unless very restricted, will make the construction of the exact Deep structure rather hard. For
example, given the passive sentence in (26), it might have been formed from any of the Deep
structures given in (27):
John was seen
Bill + saw + John
Mary + saw + John
that + man + saw + John
that + tall + man + saw + John
that + tall + man + who + just + left + saw + John
Clearly the set of possible Deep structures associated with (26) is limitless and therefore the task of
recovering the right one is impossible.
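The recoverability problem amounts to the observation that a transformation with free deletion is many-to-one, so it cannot be inverted. A minimal Python sketch makes the point. The function passive_with_deletion and its participle table are hypothetical, and the auxiliary and morpheme are collapsed into a ready-made surface form for brevity.

```python
def passive_with_deletion(deep):
    """Toy passive: promote the object and delete the subject outright."""
    subject, verb, obj = deep
    # Hypothetical table standing in for be + en plus phonological adjustment
    participle = {"saw": "was seen"}[verb]
    return f"{obj} {participle}"  # the subject leaves no trace in the output

deep_structures = [
    ("Bill", "saw", "John"),
    ("Mary", "saw", "John"),
    ("that man", "saw", "John"),
    ("that tall man", "saw", "John"),
    ("that tall man who just left", "saw", "John"),
]

surfaces = {passive_with_deletion(d) for d in deep_structures}
print(surfaces)
# {'John was seen'}
```

Five distinct inputs collapse onto a single output, and since subjects can be extended indefinitely, the set of inputs is unbounded. Nothing in the output identifies which Deep structure it came from.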
Of course, one might consider strategies that could be adopted to ensure that some Deep
structure could be recovered: taking contextual clues for example, or constructing a non-specific
subject with a meaning similar to ‘someone’. However, there are reasons to believe
that human grammars do not do this sort of thing and that when they have to reconstruct
missing information in a sentence, they do so under very strictly limited conditions. For
example, consider a straightforward case of ellipsis, in which part of a sentence is simply left out:
I don’t want to be late and neither do you
It is blatantly clear what the missing part of this sentence is: want to be late. Indeed, it
couldn’t be interpreted as anything else, even if we tried to set up a context in which some
other interpretation would be possible, e.g. one where the topic of the conversation is something that you
don’t want to do:
Mary just told me that you don’t want to go to the party.
But I don’t want to be late and neither do you (≠ want to go to the party)
It seems that what is going on here is that when we need to recover missing elements, we can do so only from a
very restricted source: essentially a corresponding position in a preceding coordinate clause.
For reasons such as these, syntacticians started to see deletion processes as problematic and
therefore to be avoided in analyses. In fact, this fitted with the idea that transformations
needed to be limited: if we assume that transformations do not delete material, we have made a
start on the whole business of restricting the transformational component.
To return to the analysis of the passive then, what can we say about the subject? Obviously, it
would be simpler if we assumed that the Deep structure of a passive sentence such as (26)
simply lacks a subject and therefore such an element does not have to be reconstructed. We
might therefore revise the transformation thus:
__ + Aux + V + NP
NP + Aux + be + en + V
Ignoring the auxiliary and the morpheme for the time being, the passive transformation is
therefore much simplified as the movement of the object into an empty subject position. Of
course, this raises the question of the by phrase: where does it come from? The simple answer
would be to say that this is not inserted by the transformation at all, but is optionally included
at Deep structure as any other modifying PP is:
John was seen (by Bill) (at 10 o’clock) (in the coffee shop) (in full military dress)
It remains to be explained why we do not have a by phrase in active sentences, but this is not
difficult. To start with, there are no problems with by phrases per se in active sentences:
he built the structure by hand
he hadn’t finished the project by midday
Specifically what is ruled out is the case where the by phrase has the same interpretation as
the subject:
* Bill saw John by Mary
But this is not unique to the passive. PPs which have the same interpretation as other parts of
the sentence are not allowed either:
* I saw him on Sunday on Saturday
Clearly these sorts of sentences contradict themselves and this is the source of their
unacceptability. This is exactly what is going on in (33) and so nothing new has to be said
about this. In a passive where there is no subject, there is no conflict with such a by phrase
and hence its appearance is without issue.
Note that this analysis allows us to simplify the rule so that not only is the subject not
optionally deleted, but the by phrase is not optionally inserted either. All that happens is that
with a passive Deep structure, which inherently lacks a subject, the object obligatorily moves
to the subject position. However, as things stand, we are still allowing for the insertion of the
auxiliary verb and the passive morpheme. Is this entirely necessary? To some extent, given
that we have a phrase structure rule which is responsible for the insertion of auxiliaries and
their associated morphemes it is slightly odd that we should have a transformation that does
the same thing. So again it would be simpler if we were to assume that this aspect of the
passive was not part of the transformation at all. In which case the passive transformation
reduces to the simple process of moving the object to the subject position. Of course, it still
needs to be accounted for why the passive auxiliary and morpheme are obligatorily present in
passive structures. One obvious way to approach this would be to make the auxiliary and
morpheme part of the structural description of the transformation so that passivisation could
not apply if they were absent:
__ + be + en + V + NP
NP + be + en + V
However, this isn’t perhaps the best idea as it only accounts for why passive movement is not
possible in the absence of the auxiliary and morpheme, not why passive sentences are
ungrammatical without them. It seems that there are more basic conditions that require these
elements to be present in passive structures and presumably these conditions apply to Deep
Structures and are nothing to do with the movement itself.
Another reason why the auxiliary and morpheme should not be seen as part of the condition
on this movement is that there are very similar cases of the movement that do not involve
these things. Consider the following:
I saw the building demolished
the building got demolished
the building appeared to be old
In the first case, we appear to have an embedded passive sentence in which the building is
interpreted as object of the verb, but sits in a position in front of the verb reminiscent of the
subject. Importantly, there is no auxiliary present. Therefore if we wish to use the passive
transformation to account for this structure, the transformation cannot make reference to the
presence of the auxiliary. In this case, the morpheme is still present however. But in the
second case, while the morpheme is present, it is not placed on the verb into whose subject
position the passivised object moves (i.e. get). In the final case we do not have an
instance of passivisation at all. However, there does appear to be a movement which has very
similar properties to passive. This can be seen by considering the following related sentence:
it appeared that the building was old
In this case, the building is in the subject position of the embedded sentence and given that
(36c) has a similar interpretation, we can assume that this is also the position in which this
element starts off in (36c). Thus, a reasonable analysis of (36c) would be:
Deep Structure: __ appeared [the building to be old]
Surface Structure: the building appeared [ __ to be old]
It is fairly obvious that this shares with the passive construction the movement of an NP into a higher
empty subject position. Indeed, the fact that we actually get passive constructions identical to
this demonstrates that the similarity is more than coincidental:
it was believed [that the building was old]
the building was believed [to be old]
As we would need to add the possibility of (39b) to the passive transformation, it would be
an obvious extension to include (36c) as well. Yet in order to do this the
transformation cannot refer to the passive auxiliary or morpheme as these have nothing to do
with the movement in the latter.
What we end up with is the following:
__ … V NP
NP … V
In other words, whenever we have a Deep structure where there is an empty subject and an
NP immediately following a verb, the NP is moved into the subject position.
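The generality of this rule can be illustrated with a minimal Python sketch in which one function handles both a passive and a non-passive case. The flat list encoding, the use of None for the empty subject and the name np_move are assumptions made for the example. In particular, the embedded clause is flattened, so the bracketing of the Deep structure is ignored.

```python
EMPTY_SUBJECT = ("NP", None)  # assumed encoding of the empty subject position '__'

def np_move(structure):
    """Move the NP immediately following a verb into an empty subject position.

    `structure` is a flat list of (category, text) pairs. It is returned
    unchanged when the structural description is not met.
    """
    if not structure or structure[0] != EMPTY_SUBJECT:
        return structure  # subject position already filled: nothing moves
    for i in range(1, len(structure) - 1):
        if structure[i][0] == "V" and structure[i + 1][0] == "NP":
            moved = structure[i + 1]
            # the moved NP replaces the empty subject and leaves its old position
            return [moved] + structure[1:i + 1] + structure[i + 2:]
    return structure

passive = [EMPTY_SUBJECT, ("Aux", "be + en"), ("V", "see"), ("NP", "John")]
raising = [EMPTY_SUBJECT, ("V", "appeared"), ("NP", "the building"),
           ("Inf", "to be old")]

print(np_move(passive))
# [('NP', 'John'), ('Aux', 'be + en'), ('V', 'see')]
print(np_move(raising))
# [('NP', 'the building'), ('V', 'appeared'), ('Inf', 'to be old')]
```

The function makes no reference to be, en or a by phrase, which is why the same rule covers the appear case as well as the passive.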
There are two important things to note at this point. Comparing this version of the ‘passive
transformation’ to the original one in (23), (40) is simpler and more general. It is simpler
because it does not include all of the specific properties of particular passive constructions:
the by phrase and the auxiliary, for example, and it is more general because it applies to more
structures, some of which are not even passives. For this reason, during the 1970s this
movement became known as NP movement and was assumed to be part of the analysis of
other kinds of structure too.
We will see next week that the development we have traced here in which the adding of
restrictive assumptions about transformations led to a situation in which transformations
themselves became simpler and more general, was a remarkably common one in the history
of generative grammar. It is clearly not a logical necessity that restriction should lead to
simplicity and generality, and so the fact that these were results of this restrictive programme
of development has been taken as an indication that this programme is heading in the right direction.
What we have presented in this lecture is a rather condensed and somewhat ‘tunnel-visioned’
view of the development of the analysis of the passive construction from the early 1960s to the mid 1970s. It is
tunnel-visioned because this development did not take place in isolation: the treatment of
other constructions was also developing along similar lines at the same time. However, the
passive offers a good example of the more general process of grammatical development. One
of the lines of development we have concentrated on above is the idea that the
transformational component of the grammar became simpler and more general over time. To
start, it consisted of many rules which were specific to the kinds of constructions they were
supposed to generate. This can be clearly seen in the original idea that there was a subset of
sentence types, the kernel sentences, that were basic and then all other sentence types had to
be produced from these via a series of transformations specifically designed to generate them.
For every non-kernel sentence, then, there would have to be at least one transformation, and possibly more.
Over time, the transformational component became smaller as construction
types ceased to be in one-to-one correspondence with transformations and a few
transformations were seen to be involved in many different constructions. This progression
marks a change in view concerning what linguists should be doing when investigating
language, from developing grammatical descriptions of specific phenomena, towards the
identification of the general processes involved in linguistic generation. Specific linguistic
phenomena from this point of view are merely the epiphenomena of the far more interesting
linguistic system. Of course, this ties in well with Chomsky’s aim of producing a Universal
Grammar, which would go a long way to account for the fact of language acquisition, as
transformations which are construction specific are necessarily going to be language specific,
while more general transformations stand a chance of being applicable to a range of
phenomena cross-linguistically.
Another part of this development which we have not said much about is the change of view
concerning the nature of the levels of description for linguistic expressions, particularly that of
Deep Structure. The original concept was to generate one sentence type from another. So the
underlying forms were proper sentences in their own right. This rapidly changed so that Deep
Structures were not considered to be governed by the conditions on surface forms and so
were considerably more abstract. However, in the earlier versions of the theory, the Deep
Structures were probably more closely related to certain surface forms than others and as
such represented certain surface phenomena more closely. So, the Deep Structure for the
passive still mirrored the general form of the active, including the presence of a subject.
Later, Deep Structures became more associated with particular surface structures, but more
abstract as well. Thus active and passive sentences were associated with different Deep
Structures and were not connected by sharing the same one. The Deep Structure of the
passive lacked a subject and hence was nothing like any grammatical Surface Structure. To
some extent then, this development moves away from the original motivation for
transformational rules: something designed to account for connections between different
sentences. While active and passive sentences are still connected by having certain features
represented at Deep Structure in common (the object is in object position in both), this
connection is more subtle than it was at first assumed to be. However, by this time, the
generality of the analysis gained by using transformations had become its own motivation
and hence Chomsky’s original arguments were not so central.