

BMN ANGD A2 Linguistic Theory

Lecture 11: Explanation

1 Introduction

One of the main thrusts of Chomskyan linguistics since the 1950s is the desire to find an explanatory theory: a theory which explains linguistic phenomena rather than just describes them. This raises the question of what distinguishes an explanatory theory from a descriptive one. It turns out that this is a rather complicated issue.

2 Take One: levels of adequacy

Chomsky (1965) attempted to provide a framework in which we might be able to judge linguistic theories. The point is that there is an infinite number of formal grammars which might be compatible with a set of linguistic data. Therefore we face the problem of distinguishing between them: which ones are better than the others? It is quite easy to demonstrate the infinite number of grammars compatible with a given language. Consider a simple formal language which consists of sequences of one or more instances of a word “a”.

Thus, the following are grammatical sentences in this language:

(1) a aa aaa aaaa etc.

The following phrase structure grammar is compatible with this language:

(2)

S → a

S → Sa

This grammar is capable of generating strings of “a” of any length, due to the recursive second rule. Thus it predicts that all of the sentences in (1) are grammatical and that sentences which are not in this set, such as those that have “b” in them, are ungrammatical.

The same is true for the following grammar too:

(3)

S → a

S → aS

Given that these grammars are distinct, we now have two grammars which are compatible with this language.
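The equivalence of the two grammars can be checked mechanically. The following is a minimal sketch, not part of the original argument: the function name `language`, the length bound `max_len` and the encoding of each rule as the string on its right-hand side are all assumptions made for illustration. It enumerates every terminal string up to a given length that each grammar derives.

```python
from collections import deque

# Right-hand sides of the rules for S (illustrative encoding, not from the lecture).
GRAMMAR_2 = ["a", "Sa"]  # (2): S -> a, S -> Sa
GRAMMAR_3 = ["a", "aS"]  # (3): S -> a, S -> aS

def language(rules, start="S", max_len=5):
    """Return all terminal strings of length <= max_len derivable from `start`."""
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if "S" not in form:
            results.add(form)      # no non-terminal left: a sentence of the language
            continue
        i = form.index("S")        # each sentential form contains exactly one S
        for rhs in rules:
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

print(sorted(language(GRAMMAR_2), key=len))
print(language(GRAMMAR_2) == language(GRAMMAR_3))
```

Up to the chosen bound, both grammars yield exactly the same set of strings, which is all that is being claimed at this point; nothing here says anything yet about the structures they assign.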

To demonstrate that there are an infinite number of grammars compatible with this language, consider the following transformational grammar:

(4) S' → Sb

S → a

S → Sa

Mark Newson

“b” deletion transformation (obligatory):

SD: A b

SC: A

This grammar produces sentences made up of strings of “a” followed by a single “b”, which is then obligatorily deleted by the transformation. Hence at the surface all sentences will contain “a”s but no “b”s, and the grammar generates all the sentences of the language in (1) and no others. Since this is a perfectly legitimate grammar, albeit a perfectly nonsensical one, it should be obvious that we can have an infinite number of distinct grammars which introduce different numbers of linguistic elements at D-structure and then delete them at S-structure, all generating the same language.
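The argument can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a formalisation of transformational grammar: the function names are invented, and the parameter `n_b` (the number of dummy “b”s introduced at D-structure) generalises the single “b” of (4) in the way the paragraph above suggests.

```python
def deep_structure(n_a, n_b):
    """D-structure string: n_a instances of "a" followed by n_b dummy "b"s.

    n_b = 1 corresponds to grammar (4); other values are hypothetical variants.
    """
    return "a" * n_a + "b" * n_b

def delete_b(d_structure):
    """Obligatory deletion (SD: A b, SC: A), applied until no final "b" remains."""
    while d_structure.endswith("b"):
        d_structure = d_structure[:-1]
    return d_structure

def surface_language(n_b, max_len=5):
    """Surface strings, up to max_len "a"s, of the variant grammar with n_b dummy "b"s."""
    return {delete_b(deep_structure(k, n_b)) for k in range(1, max_len + 1)}

# Every variant generates the same surface language as (2) and (3).
print(all(surface_language(n) == surface_language(1) for n in range(1, 10)))
```

Each value of `n_b` stands for a distinct grammar, and there is no upper limit on `n_b`, so the family of observationally equivalent grammars is unbounded.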

Chomsky’s basic level for an adequate linguistic theory is observational adequacy. A grammar is observationally adequate if it predicts to be grammatical all and only the sentences of the language. In this respect, all three grammars in (2) to (4) above are observationally adequate with respect to the language in (1).

But note that these grammars ascribe completely different structures to the sentences of the language. The one in (3) has rightward branching structures whereas those in (2) and (4) have leftward branching structures:

(5) rightward branching: [S a [S a [S a]]]

    leftward branching: [S [S [S a] a] a]
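The difference between the two kinds of structure in (5) can be made concrete with a short sketch. The labelled-bracket notation and the function names below are illustrative assumptions; the point is only that the two analyses have the same terminal string but different constituency.

```python
def parse_left(n):
    """Leftward branching structure for n "a"s, as built by S -> a, S -> Sa."""
    tree = "[S a]"
    for _ in range(n - 1):
        tree = f"[S {tree} a]"
    return tree

def parse_right(n):
    """Rightward branching structure for n "a"s, as built by S -> a, S -> aS."""
    tree = "[S a]"
    for _ in range(n - 1):
        tree = f"[S a {tree}]"
    return tree

def terminal_yield(tree):
    """Strip brackets and category labels, leaving the string of words."""
    tokens = tree.replace("[", " ").replace("]", " ").split()
    return "".join(t for t in tokens if t != "S")

print(parse_left(3))   # [S [S [S a] a] a]
print(parse_right(3))  # [S a [S a [S a]]]
print(terminal_yield(parse_left(3)) == terminal_yield(parse_right(3)))
```

Observational adequacy only looks at the output of `terminal_yield`; the question of which bracketing is correct is not decided at that level.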

It is generally considered possible to determine the underlying structure of an expression of a language, largely through the intuitions of native speakers concerning the acceptability of certain permutations. Thus a grammar which structures expressions in a way that is compatible with native speaker judgements is better than one that does not, even if they are both observationally adequate. The level of adequacy attained by such a grammar is that of descriptive adequacy.

Still, there may be a large number of grammars which attain descriptive adequacy for any given language and, given that speaker intuitions do not extend to the details of the grammatical system itself, it may be that no amount of investigation can sort out which of them is most accurate. The third level that Chomsky proposed is that of explanatory adequacy. This then brings us to our topic: what is an explanatorily adequate theory? It is important to note that Chomsky proposed this definition as a technical way of distinguishing between competing theories rather than as a definition of what counts as an explanation.

Essentially Chomsky proposed that any theory that is able to shed light on the fact of language acquisition reaches explanatory adequacy. Obviously, of the set of descriptively adequate grammars, some will be radically different from descriptively adequate grammars for other languages. Such grammars could not be considered to be the product of an acquisition process: if the mind were able to conceive of them, the set of grammars able to be conceived of would be boundless and hence unlearnable. Only those grammars which fall into a set which shares common properties with grammars for other languages could be considered to be the product of a Universal Grammar, which, being innate to the species, would impose restrictions on the hypotheses entertained by human infants in learning language. In this way we can see that explanatory adequacy is associated with Universal Grammar rather than the grammar of a particular language. The latter is more to do with descriptive adequacy.

3 Problems for Explanatory Adequacy

The way I have presented Chomsky’s levels of adequacy above, indeed the way it is usually presented and was in fact presented by Chomsky himself in 1965, indicates that attainment of adequacy at one level implies attainment of adequacy at lower levels. Thus a descriptively adequate grammar must also be observationally adequate, though not necessarily explanatorily adequate. This turns out to be too restrictive a condition on determining theoretical adequacy. The problem is that there is a tension between the attainment of descriptive adequacy and explanatory adequacy. The more freedom we allow in a grammar, the more readily it yields descriptions, but the less readily it fits into any useful notion of Universal Grammar. The more restrictions we place on a grammar, the more easily it fits with a general theory of Universal Grammar, but the harder it is to provide complete descriptive coverage of any given language. As explanation is seen as the higher goal, the tendency within Chomskyan circles has been to relinquish complete accurate description for the good of attaining explanatory adequacy, the hope being that as we progress we might come to understand things better and therefore go back and provide fuller descriptions in places where details had necessarily been sparse. This was seen as better than struggling to attain better descriptive coverage at the expense of losing explanation. Given that descriptions may or may not lead to explanation, we are not likely to push our understanding much further by taking this route and hence time might be wasted exploring descriptive possibilities that lead us nowhere.

Not everyone agrees with this point of view however. Some have argued that if we do not pay attention to the details of description, we are not likely to make much in the way of progress in gaining explanation. After all, what is the use of an explanatory theory that cannot describe the basic observations? Probably there are aspects to this debate which are confused over different uses of the term ‘explanation’. As I pointed out, Chomsky’s use is a technical one to do with selecting out of competing theories. A theory that attains explanatory adequacy may shed light on the problem of language acquisition, but it does not necessarily provide us with a better explanation of a certain linguistic phenomenon than a theory that does not attain this level, whereby I mean ‘explanation’ in the more standard and less technical sense. For example, there may be facts about language which are completely unrelated to the problem of language acquisition. A theory which sheds light on these facts, but which does not attain explanatory adequacy may be judged against an ‘explanatorily adequate’ one which says nothing about these facts. Of course, the best theory is one which addresses both these facts and ones concerning language acquisition, but in the absence of such an ideal theory what are we to say: which theory is better?

The point is that there are any number of things about human language, all of which need explaining. Is it reasonable to pull out one of these, i.e. facts about language acquisition, and stand it above all the others? Certainly, using any observation to limit the choice of possible theories, given all the logically possible ones, will enable progress to be made. There is also a connection between limitation and explanation, which we will discuss in more detail in the next section. However, it is misleading to equate one such set of limitations with the notion of explanation, which seems to have been the unfortunate consequence of Chomsky’s term ‘explanatory adequacy’.

4 Restriction and Explanation

The discussion above highlights an aspect of explanation which is a core part of the concept in most scientific exploration. Essentially the more phenomena one can account for with the same number of assumptions, the more the explanation achieved. In other words if the theory is simpler than the data observed then it counts as an explanation. The simpler the theory, the more explanatory it is.

It should be noted that the notion of ‘simplicity’ I am referring to here is not one that necessarily equates to ‘easy to understand’. I refer instead to a structural notion of simplicity which is relative to linguistic data. For example, suppose the principles of a grammar and the linguistic phenomena the grammar described were in a one to one correspondence. Then the grammar would be of the same complexity as the data it applied to. Suppose we now reduce the grammar so that there are fewer grammatical principles, but that the grammar still describes the data. In this case the grammar is relatively more simple than the data it applies to and can, to some extent be said to explain the data. Obviously, the more simple the grammar can be made, the more explanatory it becomes.

This is the basis for the idea that restrictions to a theory make the theory more explanatory. The more restrictions we place on a theory, the relatively more simple it has to be in order to operate within the restrictions. So, for example, a grammar that operates without deletion rules is simpler than one that operates with such rules.

Consideration of facts such as the learnability of language does much the same thing as adding restrictions, only in this case what we do is increase the complexity of the data whilst keeping that of the grammar stable. Either way, the grammar ends up more simple than the data and thus its explanatory content is increased.

5 The Problem of Reduction

What we have been considering so far is essentially a reductionist approach: if we simplify a grammar but maintain its descriptive capacity, some principle of the more simple theory must account for what more than one principle of the more complex one did. In other words, two or more principles of the complex grammar must reduce to one principle of the simple one.

The problem with this is that taken to its extreme, the explanation provided by reduction is not particularly useful in helping us to understand something. For example, the most extreme version of reductionism would account for everything in the universe, including human language, as a result of the collision of two hydrogen atoms at the very start of time. But such an account of language would be impossible for us to comprehend, even if it were possible for us to discover, as the human mind is incapable of following the chain of events that led from the big bang to the development of language in the human species.

In narrower terms, another problem with reductionism is knowing where to stop. If reducing two grammatical principles to one counts as an explanation of the data dealt with by those principles, the question then arises as to what explains the data dealt with by the single principle. A reductionist approach would then look for an even more general principle to reduce the single principle (and a few others) to. But then the same question arises about the new principle. A theory based on these lines is like building castles in the air: there can be no foundation.

6 The Minimalist Programme

Since the early 1990s, Chomsky has attempted to increase the explanatory content of the theory by reductionist methods, though with some rationale to provide a foundation. He reasons in the following way: the simplest and therefore most explanatory linguistic theory would be a theory that contained only principles that were necessary in that no theory of language would be possible without them. This approach then aims to reduce the grammar to the bare minimum.

Of course, it is no straightforward matter to identify what the bare minimum is. To exert some leverage on this question Chomsky assumes that in order for any linguistic theory to be a linguistic theory it must at least account for use of language, which he takes to be the externalisation of thought. That is, the linguistic system bridges the gap between inner (nonlinguistic) mental systems and the physical events which allow for the transmission of the linguistic signal, i.e. the movement of the speech organs or hands, etc. These events are also controlled by brain mechanisms which control bodily movements. Thus the linguistic system can be seen as that part of the brain which links two other parts of the brain, one which deals with conceptualisation and understanding and one which deals with movement and perception.

In the simplest case the linguistic system would contain nothing that was not imposed on it by the requirement of its function: linking the conceptual system to the expressive system. This would mean that there would be nothing to explain as every part of the grammar would be justified by the conditions of its use. If there is any aspect of the linguistic system that is not thus motivated, then further explanation is required and the theory is non-minimalist.

To give some idea of how this works, let us take movement phenomena as an example. By the 1980s, within Government and Binding theory, there were three types of movement phenomena recognised: A-movement (a.k.a. NP-movement), Ā-movement (wh-movement, etc.) and Head movement. Of course, all these were seen as different versions of Move α, the general transformation which licensed the movement of any element into any position. These three versions of movement could be distinguished in terms of what was moved, the place it was moved to and the general conditions accounting for why the movement was grammatical.

For example, with A-movement, an argument NP moves to the subject position and the movement is invariably made from a Caseless position:

(6) a John₁ was seen t₁               * it was seen John

    b John₁ seems [ t₁ to be rich]    * it seems [John to be rich]

Given the assumption that passive verbs lose the ability to assign accusative Case to their object, the object position in a passive construction is Caseless, accounting for the ungrammaticality if the object does not move from this position. Similarly, the subject position of a non-finite clause is generally Caseless, and hence if an NP remains in this position the result is ungrammatical. A-movement was sometimes referred to, therefore, as Case motivated movement, as the movement allowed for the satisfaction of the Case Filter. Of course, to say that the movement had a motivation was metaphorical in the GB system, as the grammar worked by allowing all kinds of movements, but then filtering out structures, either involving movement or not, which violated the constraints. Thus, no movement was motivated; it is just that some movements helped to achieve the conditions for the satisfaction of constraints that would have been violated if the movement had not taken place.

The most well known of Ā-movements is wh-movement and this has nothing to do with Case. Generally a wh-element moves to the front of the clause, into the complementiser system, into a position that is not associated with Case. If the wh-element is an NP argument, then it always moves from a Case position, though not all wh-elements are NPs or arguments and so have nothing to do with Case:

(7) a who₁ did you see t₁             * who₁ was it seen t₁

    b how₁ did he fix the car t₁

As we can see in (7a), the movement of a wh-object from a transitive verb is grammatical whereas the same movement from a passive verb is ungrammatical. The difference between the two, as already mentioned, is that the first is a Case position whereas the second is not. In (7b) the wh-element, how, is neither an NP nor an argument. It functions as a verbal modifier and so is not associated with Case at all. The fact that the two grammatical instances of wh-movement are virtually identical, though one involves an NP argument for which Case is relevant and the other does not, shows that wh-movement does not have anything to do with Case. Yet, if wh-movement does not take place, the relevant kind of question is not formed:

(8) a you saw who

    b he fixed the car how

Neither of these sentences is ungrammatical if not interpreted as wh-questions – they can be interpreted as ‘echo questions’ in which someone repeats something someone else has said, replacing misheard or doubtful material with a wh-element. (8a) might be an appropriate response to someone announcing that they saw Elvis the other night. However, these are not grammatical wh-questions and hence some principle of the grammar must be violated if the wh-element does not move in a wh-clause. It seems that this has more to do with interpretation. For example, consider the following:

(9) a who₁ did he believe [ John met t₁ ]      * he believed [ who₁ John met t₁ ]

    b * who₁ did he ask [ John met t₁ ]        he asked [ who₁ John met t₁ ]

Note that the place that the wh-element moves to is related to the verb of the main clause. The verb believe takes a declarative clause complement and ask takes an interrogative one. The ungrammaticality of wh-movement to the front of the complement clause of believe indicates that if a wh-element moves to the front of the clause it must be interpreted as interrogative, and the ungrammaticality of not moving the wh-element to the front of the complement clause of ask indicates that if a wh-element does not move to the front of a clause it is not interpreted as interrogative.

Of course, under the assumption that syntax is independent of semantics, it cannot be interpretation that is directly responsible for wh-movement. Usually there are assumed to be syntactic features involved. Thus, suppose we assume that clauses are marked syntactically as [±Wh] on their complementisers, so that a clause with interrogative interpretation has a [+Wh] marked complementiser and a declaratively interpreted clause has a [-Wh] complementiser. Suppose that pronouns are also marked for [±Wh], so that interrogative pronouns are marked [+Wh]. It then seems that a [+Wh] pronoun must be at the front of a [+Wh] marked clause:

(10) a [[+Wh] who₁ did he believe [ John met t₁ ]]      * he believed [[-Wh] who₁ John met t₁ ]

     b * [[-Wh] who₁ did he ask [ John met t₁ ]]        he asked [[+Wh] who₁ John met t₁ ]

Therefore there seems to be an agreement relationship between [+Wh] pronouns and [+Wh] complementisers. Assuming the standard CP analysis of the clause, this works out as follows:

(11) [CP who[+Wh] [C' C[+Wh] IP ]]

It seems then that the movement is ‘motivated’ by the need for the wh-element to agree with a local complementiser.
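The agreement requirement can be stated as a one-line check. The sketch below is only an illustration: the function name, the Boolean encoding of [±Wh] and the clause descriptions are assumptions, and it covers nothing beyond the four clauses at issue in (10).

```python
def wh_agreement_ok(comp_is_wh, wh_pronoun_at_front):
    """A clause passes if its complementiser is [+Wh] exactly when a [+Wh]
    pronoun sits at its front (True stands for [+Wh], False for [-Wh])."""
    return comp_is_wh == wh_pronoun_at_front

# (clause, complementiser is [+Wh]?, [+Wh] pronoun at front?) as marked in (10)
clauses = [
    ("main clause of 'who did he believe John met'", True, True),
    ("complement of 'he believed who John met'", False, True),
    ("main clause of 'who did he ask John met'", False, True),
    ("complement of 'he asked who John met'", True, True),
]

for description, comp, front in clauses:
    verdict = "ok" if wh_agreement_ok(comp, front) else "*"
    print(f"{verdict:2} {description}")
```

The check flags exactly the two clauses that (10) marks as ill formed, which is what one would expect if the relation between the pronoun and the complementiser is one of agreement.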

Finally we turn to head movement. One of the most straightforward cases of this movement concerns the movement of aspectual auxiliaries into the tense position. We can see this in the following examples:

(12) a he may secretly have left

     b * he secretly has left

     c he has secretly left

As we see, when there is a modal auxiliary, which is assumed to occupy the tense position, the aspectual auxiliary have can follow an adverb like secretly. However, when there is no modal and the tense is marked on the aspectual auxiliary, the auxiliary must precede the adverb. This indicates that the auxiliary undergoes a movement from inside the VP to the tense position immediately after the subject:

(13) D-structure: he [-past] secretly [VP have left]

     S-structure: he has₁ secretly [VP t₁ left]

Obviously, moving the auxiliary to the tense position allows the auxiliary to bear the tense. One way to envisage this is to assume that what occupies the tense position is a tense morpheme and that this, being a bound morpheme, needs to attach to something. The movement of the auxiliary then facilitates this morphological requirement. Hence, if we are looking for a motivation for this head movement, it seems that the motivation is to do with morphology.

To summarise briefly, there are three types of movement, all of which appear to satisfy different conditions. A-movement allows the Case filter to be satisfied, Ā-movement allows semantically motivated features to agree with each other and head movement allows bound morphemes to attach to a host.

From a Minimalist perspective, however, this is far from Minimal. In the Minimalist Programme, building on the notion of economy that Chomsky introduced at the end of the 1980s, it was proposed that all movements must be motivated: if a movement were unmotivated, it wouldn’t happen, as unnecessary movements are uneconomical. But, if the system is to be reduced to a bare minimum, one would imagine that there should be one reason at most why elements move. Indeed, following Minimalist principles, movement should be entirely motivated by the need for the linguistic system to fulfil its function of linking the conceptual system to the motor system.

To try to meet this aim, Chomsky envisages the workings of the linguistic system in the following way:

(14) words
       ├→ PF → motor system
       └→ LF → conceptual system

In this model, the linguistic system is conceptualised as a structure building process which takes words and step by step builds a structure from them by combining them into a successively larger structure. This process is called merger and it is motivated by the fact that a set of words can only be interpreted either semantically or phonetically if they are combined into an appropriate structure: it would be impossible to know how to interpret a set of unconnected words either in terms of how they relate to each other semantically or in terms of which one to pronounce first, etc. Thus, there must be a process which combines words into structures and the simplest version of this is one which takes two objects and forms them into a new object, thus:

(15) Word 1, Word 2 → [Phrase Word 1 Word 2]

The process will then take another word and combine it either with the phrase just built or with a fourth word to form an independent phrase which can then be merged with the first phrase etc. This process will continue until all the words have been built into a single structure and can be interpreted at its end points.
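As a rough illustration of the structure building process, merger can be modelled as a function that takes two objects and returns a new one. The representation of phrases as Python pairs and the names `merge` and `leaves` are assumptions made for this sketch; the example words are taken from (7b).

```python
def merge(x, y):
    """Combine two syntactic objects (words or phrases) into a new phrase."""
    return (x, y)

def leaves(obj):
    """Read the words off a structure from left to right."""
    if isinstance(obj, str):
        return [obj]
    return [word for part in obj for word in leaves(part)]

# Build "fix the car" step by step: first [the car], then [fix [the car]].
np = merge("the", "car")
vp = merge("fix", np)

print(vp)          # ('fix', ('the', 'car'))
print(leaves(vp))  # ['fix', 'the', 'car']
```

Because `merge` always takes exactly two objects, every phrase it builds is binary branching, and a set of words only receives an order and a grouping once it has been through the process.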

Now, Chomsky envisages movement to be another step in the structure building process. This is rather like merger only instead of taking two independent elements and merging them, it takes two elements where one is already built into the other, such as Word 2 and the phrase that is built by merging Word 1 and Word 2 in (15). These two things will be merged and a new phrase will be built:


(16) [Phrase 1 Word 1 Word 2] → [Phrase 2 Word 2 [Phrase 1 Word 1 Word 2]]

Thus by a sequence of mergers and movements a structure can be built.
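Movement in this sense can be sketched as a merger whose first input is already contained in its second. The pair representation and the names `contains` and `move` are assumptions made for illustration, and the sketch leaves the original occurrence of the moved element in place, as the description of (16) above does.

```python
def contains(obj, part):
    """True if `part` is `obj` itself or occurs somewhere inside it."""
    if obj == part:
        return True
    return isinstance(obj, tuple) and any(contains(child, part) for child in obj)

def move(phrase, part):
    """Merge `part`, which must already be inside `phrase`, with `phrase` itself."""
    if phrase == part or not contains(phrase, part):
        raise ValueError("only an element already built into the phrase can move")
    return (part, phrase)

phrase_1 = ("Word 1", "Word 2")
phrase_2 = move(phrase_1, "Word 2")
print(phrase_2)  # ('Word 2', ('Word 1', 'Word 2'))
```

The only difference from ordinary merger is the containment condition, which is why movement can be regarded as another step in the same structure building process rather than as a separate kind of rule.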

The motivation for merger is that a structure must be built for the words to be interpreted. But this cannot be the motivation for movement, as a movement does not combine elements into a structure which are not already part of a structure. Thus movement must have another motivation. Chomsky claims that the motivation for all movements is to enable the deletion of pure grammatical features which would be uninterpretable either semantically or phonetically and hence would cause problems if allowed to remain in the structure to its end points (LF and PF). For example, take agreement features on a verb. Verbs, we know, can be morphologically marked with features which reflect those of the subject: 2nd person, singular, masculine, for example. But while such features are perfectly interpretable in semantic terms on the subject, it is not clear that they have any meaning on the verb. If such grammatical features were presented to the conceptual system for interpretation, confusion would arise.

Thus, an aim of the structure building process is to get rid of these grammatical features. It does this by pairing uninterpretable grammatical features on one element with interpretable features on another. For example, the grammatical features on the verb are paired with the semantic features on the subject and if both ‘agree’, the grammatical features can be deleted:

(17) a he[3s] reads[3s] → agreement → he[3s] reads (the verb's [3s] is deleted)

     b we[1pl] reads[3s] → no agreement → we[1pl] reads[3s] = uninterpretable outcome

In (17a) the grammatical features on the verb are deleted as they agree with the ones on the subject. Therefore the outcome is perfectly interpretable and so grammatical. In (17b) however, as the features do not agree, no deletion is possible and the outcome contains uninterpretable features. The result is an ungrammaticality.
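The logic of (17) can be sketched as follows. This is a deliberately simplified illustration, not Chomsky's formal system: the feature labels, the function names and the treatment of agreement as simple identity are all assumptions, and we is given first person plural features here.

```python
def check_agreement(subject_features, verb_features):
    """Return the verb's uninterpretable features left over after checking.

    If they match the subject's interpretable features they are deleted;
    otherwise they all survive to the interface.
    """
    if verb_features == subject_features:
        return set()
    return set(verb_features)

def converges(subject_features, verb_features):
    """A structure is interpretable only if no uninterpretable feature remains."""
    return not check_agreement(subject_features, verb_features)

HE, WE, READS = {"3", "sg"}, {"1", "pl"}, {"3", "sg"}

print(converges(HE, READS))  # (17a): features agree and are deleted
print(converges(WE, READS))  # (17b): features remain, so the outcome is uninterpretable
```

Note that nothing in the sketch rules out (17b) directly; it is excluded only because something uninterpretable is left over at the end, which is the sense in which the explanation is driven by the interfaces.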

The general idea then is that things move in order to enter into relationships with other elements so that uninterpretable features can be deleted. This process is called feature checking. Moreover, the claim is that feature checking is the sole motivation for all movements. Consider wh-movement. We have seen that this seems to want to put a wh-pronoun in a syntactic relationship with an agreeing complementiser: both should be [+Wh]. It is not difficult to see how this can be made to fit the feature checking scenario outlined above, though it may be difficult to say which [+Wh] feature, the one on the pronoun or the one on the complementiser, is uninterpretable. I will not consider this problem here.

Case motivated movement is also fairly easily interpreted within a feature checking system. Suppose that nominal elements enter a structure already with a Case. This is not hard to envisage for the pronouns, whose form indicates the Case they bear, but we can extend the assumption to other nominals too. Case features, such as nominative and accusative, are not interpretable, being morphological reflexes of grammatical functions determined by syntactic position in languages such as English. Thus, like agreement morphemes on verbs, they must be deleted before the end of the structure building process. Obviously movement plays a part in this and Cases are checked in certain positions: object position for accusative and subject of a finite clause for nominative. By placing nominal elements in these positions, their Case features can be checked and if appropriate deleted:

(18) a he[Nom] seems [he[Nom] to be rich]     nominative is checked and so deleted

     b it seems [he[Nom] to be rich]          nominative is unchecked and so ungrammatical

Finally we turn to head movement. Just like we can assume that Case is something that is marked on a nominal before it enters a structure, we can assume that verbal inflections are also present on words prior to structure building. The tense position then is not the place where the verbal inflections are inserted independently, but simply the position to which verbs move to check their tense features. Again, if the tense feature on the verb checks with the relevant element, an uninterpretable feature can be deleted and the structure can be appropriately interpreted.

While I have not attempted to give full details of the above processes, the point being made is that all movements can be handled in a similar way and although it may look as though they have different motivations, under this view they turn out to be able to be treated in very similar ways. This is entirely compatible with the aims of the Minimalist Programme as there is nothing in the linguistic system to explain: all movements fulfil the same role which is necessary to allow the linguistic system to fulfil its purpose. If we found that some movements were motivated by other considerations, there would be a need to explain why and hence the whole grammar would be less explanatory.

References

Chomsky, Noam 1965 Aspects of the Theory of Syntax, MIT Press, Cambridge, Mass.
