
To appear in Ted Gibson and Neal Pearlmutter (eds.), The Processing and Acquisition of Reference, MIT Press.

PROCESSING OR PRAGMATICS? - EXPLAINING THE COREFERENCE DELAY.

Tanya Reinhart

Chien and Wexler (1990) were pioneering in establishing the basic generalization regarding the acquisition of anaphora. Based on experiments with a large number of children (177, aged 2;6 to 7;0), they showed that acquisition delays are found only with coreference, and not with the binding theory in general: Children performed well on the variable-binding aspects of the binding theory, including condition B, but they performed poorly on coreference in condition B environments. This conformed with Reinhart's (1983) theoretical conclusion that variable binding and coreference are governed by different types of linguistic conditions. The conditions on binding are absolute output conditions, while the conditions on coreference are relative and context dependent. The acquisition question has been why this difference should entail a delay in children's performance on coreference.

In Reinhart (1983) the condition on coreference was perceived as belonging to pragmatics. It involves an inference based on knowledge of grammar, meaning and appropriateness to context, and I believed it could be viewed as an instance of Gricean generalized implicatures - another area where poor performance of children has recently been discovered (Chierchia et al. 2001; Gualmini et al. 2001). Chien and Wexler formulated a similar intuition and argued that children's coreference performance reflects a delay in acquiring the context considerations underlying a pragmatic principle.

Grodzinsky and Reinhart (1993) took a different perspective on this question. Their point of departure was another seminal result of Chien and Wexler's study. Virtually all studies on the acquisition of coreference have found not just vaguely poor performance, but results ranging around 50% adult-like answers. Such figures are, in principle, consistent with chance performance. Chien and Wexler conducted careful statistical analysis, including individual data, and found that many of the children perform at chance level individually, namely they sometimes answer "yes" and sometimes "no" on the same condition. If so, this is a guessing pattern, which is not known to be common in acquisition. Grodzinsky and Reinhart argued that the account for the coreference delay should also explain this specific pattern.

Grodzinsky and Reinhart's account rests on a later development of the coreference condition, stated as rule I (Intrasentential Coreference). While in the seventies and eighties everything that was not governed by syntax proper was lumped together as "pragmatics", in the nineties the concept of the interface was beginning to emerge. Rule I views the coreference restriction as belonging to the context interface, where all the components required for the coreference inference are available (syntax, semantics and context). In today's terminology, Rule I is a procedure involving reference-set computation, namely an optimality type procedure comparing two competing representations. To determine whether coreference is permitted in a given derivation, another representation with a bound variable should be constructed. Coreference is permitted only if the two are not equivalent in the given context.

The computation involved in coreference is, thus, more complex than that involved in binding, and Grodzinsky and Reinhart argue that it is the computational complexity, rather than just the appeal to context, which explains children's difficulties in the relevant tasks. The processing poses too big a load on their working memory, which is known to be less developed than that of adults. Failing the execution, they may resort to guessing.

Thornton and Wexler (1999) adopt rule I, under its reformulation in Heim (1998), but they raise several arguments against the processing account, and conclude that there is no reason to assume children have any difficulties in processing reference-set computation of the type required by rule I. They maintain that the coreference delay reflects a pragmatic deficiency, and develop an analysis of the pragmatic factors underlying rule I that children have not yet acquired.

The broader question underlying this debate is the role of processing considerations in acquisition. This factor has hardly been considered in studies on the acquisition of syntactic competence. However, it has been independently established that working memory limitations exist in children - for extensive surveys of the findings see Gathercole and Hitch, 1993; Gathercole and Baddeley, 1993. It would make sense, therefore, to determine which effects of acquisition delays can be traced to this factor. I will survey this debate here, and argue in favor of the processing approach. This requires, first, a more detailed presentation of rule I and its relations to the binding theory. As always, the same intuitive idea can be implemented in various ways, and my presentation of rule I will follow its implementation in Reinhart (2000).

1. Rule I (Intrasentential Coreference).

It is by now well established that intrasentential pronominal anaphora has two interpretations: binding and covaluation (coreference). In the first, the pronoun (originally a free variable) is bound by some operator; in the second, the pronoun picks up the same value (reference) as some other argument in the sentence. The most obvious instance of covaluation is coreference, where the value is a referential discourse entity, but other instances are discussed in Heim (1998) and Reinhart (2000). Quantified DPs cannot serve as antecedents for coreference (having no reference), so they can normally enter only bound anaphora relations. But referential DPs allow both relations. E.g., there are two anaphoric construals for (1) that can be represented as in (2).

1) Lucie thinks she is smart.

2 a) Lucie (λx (x thinks x is smart))
  b) Lucie (λx (x thinks she is smart)) & she = Lucie

In (2b) the pronoun corefers with Lucie. In (2a), the pronoun is bound by the λ operator. In the framework of syntactic binding theory, the conditions on binding must be stated in terms of relations between arguments (DPs). Hence, Lucie is said to bind the pronoun in this representation. However, this means that syntactic binding must be defined as in (3) (from Reinhart, 2000; see also Heim 1998).

3) Binding: α binds β iff α is an argument of a λ-predicate whose operator binds β.


4) Lucie thinks she is smart and Lily does too.

(2a) and (2b) are of course equivalent, in isolation. But, as was discovered in the seventies (since Keenan 1971), certain contexts show that there is a real ambiguity here. E.g. assuming that she = Lucie, the elliptic second conjunct of (4) can mean either that Lily thinks that Lucie is smart (the 'strict' reading), or that Lily thinks Lily herself is smart (the 'sloppy' reading). The first is obtained if the elided predicate is construed as in (2b), and the second if it is the predicate of (2a).
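The divergence of the two construals under ellipsis can be made concrete with a small sketch. The Python below is an illustration only: the toy model THINKS_SMART and all function names are hypothetical and are not part of the analysis itself. It treats the predicates of (2a) and (2b) as functions and applies each to Lily, as in the elided conjunct of (4).

```python
# Hypothetical toy model: pairs (thinker, person judged smart).
THINKS_SMART = {("Lucie", "Lucie"), ("Lily", "Lucie")}


def thinks_smart(thinker, target):
    """True if `thinker` thinks `target` is smart in the toy model."""
    return (thinker, target) in THINKS_SMART


def bound_predicate(x):
    """(2a): λx (x thinks x is smart); the pronoun is a bound variable."""
    return thinks_smart(x, x)


SHE = "Lucie"  # covaluation: the free pronoun is assigned the value Lucie


def covaluation_predicate(x):
    """(2b): λx (x thinks she is smart), with she = Lucie."""
    return thinks_smart(x, SHE)


# Applied to Lucie, the two construals cannot be told apart.
same_in_isolation = bound_predicate("Lucie") == covaluation_predicate("Lucie")

# Applied to Lily (the elided conjunct of (4)), they come apart.
sloppy = bound_predicate("Lily")        # Lily thinks Lily is smart
strict = covaluation_predicate("Lily")  # Lily thinks Lucie is smart
```

In this toy model Lily considers Lucie, but not herself, smart, so the strict reading of (4) comes out true and the sloppy reading false, although the first conjunct is true on both construals.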

The conditions under which bound-variable anaphora is possible are pretty much agreed upon, and they are summarized in (5).

5) (Variable) binding condition.
   β can be construed as a variable bound by α iff
   a. α c-commands β, and
   b. β is a free variable, and
   c. in the local domain of α, β is not a pronoun (Condition B).

(5a) defines the structural configuration for binding assumed since Reinhart (1976).¹ (5b) need not be stated as a specific condition, but it is the obvious condition imposed by logic: Only free variables can be bound by an operator. Pronouns and anaphors are commonly viewed as variables, so these are the candidates for being bound (leaving aside here more complex instances of free variables). (5c), by contrast, is a specific condition of natural language. Condition B of the binding theory determines that pronouns cannot be bound in the local domain of the binder. Only anaphors can be bound in that domain. Thus, in (6a) the first two conditions of (5) are met: A free variable is c-commanded by a potential binder. Nevertheless, the interpretation (6b) is ruled out by (5c). There are various views on the formulation of condition B, as well as the question why it should exist in natural language, but this topic is irrelevant for the present discussion.

6 a) Every lady praised her.
  b) *Every lady (λx (x praised x))

7 a) Lucie praised her.
  b) *Lucie (λx (x praised x))
  c) *Lucie (λx (x praised her)) & her = Lucie

The same condition rules out, obviously, the binding construal with a referential DP, as (7b) for (7a). However, this is not sufficient to rule out anaphora in (7a). In principle, a free pronoun can pick up its value anywhere, so nothing so far rules out its picking up Lucie, as in (7c). In fact, however, the sentence cannot have this covaluation reading. So we need also to define the conditions on covaluation. These conditions are more complex. On the one hand, covaluation is much freer than binding, and it does not require c-command (The woman who praised him hates Max). On the other, the two anaphora types still obey some shared restrictions. Specifically, covaluation appears to also obey condition B (7c). This presently means that condition B has to be stated so that it restricts both binding and covaluation (coreference).

1 Note, however, that (5a), as stated here, does not rule out cases of weak crossover, because it does not specify at which stage of the derivation c-command should hold. In His mother loves every boy, every boy can c-command the pronoun after QR, and (5a) will not rule out binding in this derivation. The covaluation conditions I turn to directly would also not rule this out. As is standard, I assume now that weak crossover is handled by a different generalization. (In my earlier work I assumed that c-command must hold at the overt structure, hence the same condition also rules out weak crossover.)

How this could be done is not a trivial question. As we saw, anaphora can mean two very different things (binding and covaluation). For this reason, it is unreasonable to assume that both interpretations can be captured by one and the same coindexing mechanism, as in the classical binding theory. (A survey of the problems can be found in Reinhart 1983, 2000.) Let us assume such problems can be solved. (E.g. binding and covaluation are captured by different types of indices, or condition B is stated twice, in slightly different terms - for binding and for covaluation). Still we should note that there is another way to approach this problem, which avoids such questions.

An observational generalization which emerges so far is that covaluation is generally free, except in the c-command domain. Recall that this is the domain that enables variable binding. In this domain, it turns out that if binding is excluded, covaluation is excluded as well. Let us state this in (8).

8) Covaluation condition (temporary).
   α and β cannot be covalued if
   a. α is in a configuration to bind β (namely, α c-commands β), and
   b. α cannot bind β.

Suppose in the processing of (7a) (Lucie praised her), we are considering assigning the pronoun the value of Lucie, namely the covaluation of her and Lucie (7c). For that, (8) needs to be consulted. Lucie is in a configuration enabling it to bind the pronoun, but actual binding is ruled out by condition B (5c). Hence, (8b) determines that the covaluation in (7c) is also disallowed.

In their empirical coverage the two approaches to condition B effects are precisely identical.

But in the second, rather than checking a direct structural restriction on covaluation, we need to consider an altogether different question, namely whether binding is possible in our given derivation. On this view, covaluation is not directly governed by a condition of the computational system, but by an interface strategy that takes into account the options open for the computational system in generating the given derivation. At this point, this second approach may seem a weirdly indirect way to capture the given facts. But this becomes less so, when we consider another set of facts.

It was noted in Reinhart (1983) that in the case of coreference, we can find systematic violations of condition B, as in (9). (Coreference is marked by italics.)

9 a) Despite the big fuss about Felix's candidacy, when we counted the votes, we found out that, in fact, only Felix himself voted for him. (Reinhart, 1983)
  b) I dreamt that I was Brigitte Bardot and I kissed me. (George Lakoff, discussed in Heim, 1998)
  c) You are you and she is she. Don't lose your ego!

10) *Oscar is depressed these days. He almost seems t to hate him. (Meaning - to hate himself.)

Contextually similar examples were noted in Evans (1980) for (what became known as) condition C.² Evans argued that the reason why his equivalent examples are permitted is that although the pronoun ends up coreferring with a c-commanding NP, it is not referentially dependent on that NP, but rather it picks up its value from the previous mention of this referent in the discourse. If this is the correct explanation, it is not clear why condition B violations are not always possible. It is known that actual discourse tends to maintain referential continuity, so in a large majority of cases, a potential antecedent has been mentioned already before. Still, an arbitrary instance of referential continuity, like (10), does not allow condition B violation.

I argued in Reinhart (1983) that the reason why coreference is possible in (9) is that the coreference interpretation is clearly distinguishable from the interpretation that would be obtained by variable-binding. No such distinction can be found in (10). (The conditions under which the two readings are distinguishable are discussed further in Grodzinsky and Reinhart 1993, and, at greater depth, in Heim 1998. I will illustrate them briefly shortly.) If a comparison with the bound interpretation is relevant for deciding whether covaluation is possible in a condition B environment, it is very difficult to see how this could even be stated by a purely structural condition. The covaluation condition (8), by contrast, easily enables stating this by adding a clause, as in (11). (11) is Rule I (Intrasentential Coreference) of Grodzinsky and Reinhart (1993), as reformulated in Reinhart (2000).

11) Covaluation Rule I.
    α and β cannot be covalued if
    a. α is in a configuration to bind β (namely, α c-commands β), and
    b. α cannot bind β, and
    c. the covaluation interpretation is indistinguishable from what would be obtained if α binds β.
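Read procedurally, the three clauses of (11) are consulted in order, and the comparison required by clause c is reached only when the first two clauses hold. The following Python sketch makes that order explicit. It is an illustration under stated assumptions: the class Configuration and its boolean fields are hypothetical stand-ins, and the field distinguishable merely records the outcome of the semantic comparison rather than computing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """Hypothetical summary of one candidate covaluation of α and β."""
    c_commands: bool       # clause (11a) holds: α c-commands β
    can_bind: bool         # if True, clause (11b) fails: α may bind β
    distinguishable: bool  # if True, clause (11c) fails: readings differ


def covaluation_allowed(cfg: Configuration) -> bool:
    """Covaluation is blocked only if clauses a, b and c of (11) all hold."""
    if not cfg.c_commands:
        return True  # (11a) fails: covaluation is free
    if cfg.can_bind:
        return True  # (11b) fails: clause c is never consulted
    # Only here must the bound construal be built and compared (11c).
    return cfg.distinguishable


# Classifications follow the discussion in the text; the encoding is assumed.
EXAMPLES = {
    "(1) Lucie thinks she is smart": Configuration(True, True, False),
    "(7a) Lucie praised her": Configuration(True, False, False),
    "(9a) Only Felix voted for him": Configuration(True, False, True),
    "The woman who praised him hates Max": Configuration(False, False, False),
}
verdicts = {s: covaluation_allowed(c) for s, c in EXAMPLES.items()}
```

On this encoding only (7a) comes out as excluding covaluation. The sketch says nothing about how distinguishability is established, which is exactly the step that requires constructing and comparing two representations.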

In (9a) (Only Felix voted for him), Felix is in a configuration to bind him (11a); hence (11b) needs to be consulted for covaluation. Felix cannot bind him; hence (11c) needs to be consulted. But covaluation and binding are distinguishable here. Hence the third conjunct of (11) does not hold, so (11) does not rule out covaluation. In the case of (9a), the distinction is clearly truth conditional: the reading obtained by covaluation (Only Felix (λx (x voted for Felix))) is true only when no-one else voted for Felix, while the reading that would have been obtained by binding (Only Felix (λx (x voted for x))) may be true if many people voted for Felix, but he is the only person who voted for himself.

2 As noted in Reinhart (1983), coreference where principle B blocks binding is much harder to find than coreference in condition C environments. E.g. in the context of (9a), it would be more natural to express the idea with When we counted the ballots … Only Felix voted for Felix. The reason suggested there is that using the full proper name is the more explicit way to capture the intended meaning. (The pronoun requires the further task of identifying its value.) Nevertheless, examples like (9) are possible, with effort.
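The truth-conditional difference between the two readings of (9a) can be checked on toy voting scenarios. The sketch below is a hedged illustration, not the formalism of the text: the voters other than Felix, the two scenarios and the function names are assumptions made for the example, and "only Felix P" is modelled simply as "the set of individuals satisfying P is exactly {Felix}".

```python
def satisfiers(voters, predicate):
    """Set of voters for whom the predicate holds."""
    return {x for x in voters if predicate(x)}


def only_felix_covaluation(voters, votes):
    """Only Felix (λx (x voted for Felix)): no-one else voted for Felix."""
    return satisfiers(voters, lambda x: (x, "Felix") in votes) == {"Felix"}


def only_felix_bound(voters, votes):
    """Only Felix (λx (x voted for x)): no-one else voted for themselves."""
    return satisfiers(voters, lambda x: (x, x) in votes) == {"Felix"}


VOTERS = {"Felix", "Max", "Lucie"}  # hypothetical electorate

# Scenario A: Felix is his own only supporter; nobody else self-votes.
SCENARIO_A = {("Felix", "Felix"), ("Max", "Lucie"), ("Lucie", "Max")}
# Scenario B: everybody voted for Felix; only Felix voted for himself.
SCENARIO_B = {("Felix", "Felix"), ("Max", "Felix"), ("Lucie", "Felix")}

coval_a = only_felix_covaluation(VOTERS, SCENARIO_A)
bound_a = only_felix_bound(VOTERS, SCENARIO_A)
coval_b = only_felix_covaluation(VOTERS, SCENARIO_B)
bound_b = only_felix_bound(VOTERS, SCENARIO_B)

# The readings are distinguishable if some scenario separates them.
distinguishable = (coval_a != bound_a) or (coval_b != bound_b)
```

Scenario A makes both readings true, but Scenario B makes the bound reading true and the covaluation reading false, so the two representations are not equivalent and clause c of (11) does not hold.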

More broadly, in all the examples of (9), applying (11c) would show that the bound variable interpretation is distinguishable from the covaluation reading. Heim (1998) points out that defining the notion 'distinguishable interpretation' is not a trivial matter, and develops the notion of guises to account for some of the contexts that she believes are not captured this way. However, in many of her cases, the contexts remain the same, namely those where a contrast can be found with the bound-variable interpretation (I return directly to the area where our views differ). For (9b), Grodzinsky and Reinhart (1993) argue that the two readings are distinguishable because most likely Lakoff's dream did not involve an act of self-kissing. Heim's concept of guises can capture this intuition in a different way. In the identity cases like (9c), Reinhart (1983) argued that the bound variable interpretation (You (λx (x is x))) is a tautology, while the intended covaluation reading (You (λx (x is you))) is an empirical statement. Heim (1998) developed the concept of 'structured meaning' to handle such cases. In (10), as in (9), (the trace of) he c-commands him (11a) and cannot bind it (11b). But here, neither the internal semantics of the sentence, nor the context, provide any possible distinction between the covaluation interpretation and the binding interpretation. Since all conjuncts of (11) hold, it rules out covaluation.

It is because of clause c of (11) that the covaluation rule cannot be a simple structural condition on coindexation outputs. Rather, it is an optimality type condition. To compute (11c), a reference (comparison) set must be constructed. E.g. for (12a), the binding construal (12c) is ruled out by condition B, which is an absolute, non-negotiable, condition. But suppose we consider assigning the free pronoun the value of Oscar, namely (12b). For this, we need to construct the reference set (12b,c). Although the derivation at hand does not allow the interpretation (12c), it needs to be constructed, and compared with (12b). Only if the two are distinct in the relevant context is (12b) allowed.

12 a) Oscar hates him.
   Reference set for covaluation:
   b) Oscar hates him & him = Oscar
   c) Oscar (λx (x hates x))

There are several approaches to the question why the computation of covaluation requires a comparison of representations in a reference set, namely what is behind Rule I. Initially I assumed (Reinhart 1983) that what governs the covaluation condition is the fact that in configurations of c-command one could also opt for binding. A predecessor of this view was Dowty (1980), who proposed that the underlying principle was 'avoid ambiguity'. The scope of his proposal was only instances of (what became) condition B: For (12a), he observed that replacing the pronoun with the reflexive anaphor would yield an unambiguous anaphora interpretation, while the choice of a pronoun allows both an anaphoric and non-anaphoric interpretation. I argued that this is a more general phenomenon (also when opting for variable binding still leaves the derivation ambiguous). Assuming that binding is in general a more explicit way to express anaphora than covaluation, then avoiding it with no interpretative reason suggests non-coreference. This is also the approach taken by Grodzinsky and Reinhart (1993), stated there in more general terms of economy. There are several more sophisticated lines attempting to explain rule I in terms of "least effort" economy, most notably Fox (1998) and Reuland (2001).³

However, there have always been some problems with this interpretation of the economy requirement underlying rule I. If variable binding is always preferred over coreference we would expect a sentence like (13a) to allow only the bound variable construal of the pronoun. In fact, it allows both construals, as witnessed by the classical ambiguity of the VP ellipsis in (13b). The problem for this economy view is how the construal (13c) is generated for (13b), given that (13a) allows also variable binding.

13 a) Max likes his mother.
   b) Max likes his mother and Felix does too.
   c) Max [likes his mother (his = Max)] and Felix does [like his mother (his = Max)] too.
   d) *Max praised him and Lucie did too (him = Max).
   e) *He likes Max's mother, and Felix does too (he = Max).

3 Within this view of economy, rule I could be stated without clause b. of (11), as in (i), which is essentially how it was viewed in Grodzinsky and Reinhart (1993) (modulo technical changes introduced in Reinhart (2000)).

i) α and β cannot be covalued if
   a. α is in a configuration to A-bind β, and
   b. the covaluation interpretation is indistinguishable from what would be obtained if α A-binds β.

That variable binding is more economical is possibly defensible, in terms of semantic processing. Compare the two interpretations of (ii).

ii) Max loves his mother.
    a) Max (λx (x loves x's mother))
    b) Max (λx (x loves z's mother) & (z = Max))

In (iia), where the pronoun is bound, the VP forms a set, and we just have to check whether Max is in it. In (iib), the pronoun remains a free variable. The VP remains an open property, and it has to be held open until the pronoun is assigned a value. Only when this happens can assessment take place. If it turns out that the intended value is, anyway, Max, then it is not obvious why we had to go through assignment at all. The economy requirement would be, then, "get rid of free variables - i.e. close open properties - as soon as possible". So this appears to be an instance of the 'least effort' principle of economy. This view of the economy requirement is developed, under a different terminology, in Fox (1998).

Reuland (2001), assuming a generalization like (i), offers a different rationale for why 'least effort' requires that (i) should hold. On his analysis, variable binding is a procedure taking place within the computational system (forming a chain), while coreference is a discourse procedure. He argues, roughly, that in general, procedures applying during the derivation are more economical than those applying at the interface. So when the first is available, an interpretation based on the second is excluded.

Nevertheless, in view of problems surveyed partially below, I argued in Reinhart (2002), (forthcoming) that these potential economy considerations do not, in fact, play a role in anaphora resolution, and there is no general preference for variable binding over coreference or covaluation, when both are allowed by the CS.


At first glance, it may be suggested that (13c) is licensed by the ellipsis context (namely, that the more economical variable binding construal can be avoided, because the covaluation reading is distinct in this context). But this is not so. Although ellipsis contexts enable the two construals to be distinct, they crucially do not license covaluation in and of themselves. In (13d), the fact that we want to use the predicate (λx (x praised him, him = Max)) in the elided conjunct does not enable covaluation of Max and him in the first. The same point is illustrated for condition C environments in (13e). More generally, evaluating whether the bound reading is distinct from the covaluation reading can be based only on information in the derivation itself (perhaps relative to its previous context), but not on considerations of how it would affect upcoming discourse. (In this sense, this type of economy remains local, as in other instances of economy. See Fox (1995) for an extensive discussion of this point, in the case of QR in ellipsis structures.) Thus, sentences like (13b), which have served as the classical illustration of the availability of two construals of anaphora, remain unexplained within this view of economy. (Grodzinsky and Reinhart (1993) could address this problem only with a stipulation (in their footnote 13). Other attempts are discussed in Reinhart (2000).)

In view of this and other problems, I proposed in Reinhart (2000, forthcoming) that the type of economy involved here is different. Rule I as stated in (11) prohibits free covaluation only when binding is disallowed in a given derivation (clause b of (11)). Since in (13a) binding is permitted, covaluation is also free, without consulting clause c of (11). I argued that rule I is an instance of a broader interface strategy that can be labeled 'minimize interpretative options'. The problem for users of linguistic derivations is how to minimize the set of possible interpretations of a given PF. The more options there are, the more mysterious is the fact that speakers manage to understand each other. Hence, for efficient use, an interpretative option ruled out by the computational system should not be sneaked back in arbitrarily by procedures available at the discourse level. In our case, if binding is ruled out, namely the set of interpretative options of the pronoun is restricted by the CS, rule I determines that one cannot obtain precisely the same interpretation by using the discourse option of covaluation. Other instances of 'minimize interpretative options' are discussed in Reinhart (forthcoming).

One may wonder, then, why rule I applies just in the case of condition B environments. The answer is that it does not. In fact, Rule I is a general restriction on covaluation. The other environment where covaluation is not allowed is when the pronoun c-commands the antecedent. This was originally perceived as a structural condition on coreference. The condition has essentially remained unchanged since its first formulation in Reinhart (1976) (14a), and it is presently known as condition C (14b):

14 a) "A given NP cannot be interpreted as coreferential with a non-pronoun in its c-command domain." (Reinhart, 1976)
   b) Condition C (Chomsky, 1981): An R-expression is free.
      Definitions:
      i. An NP is bound iff it is coindexed with a c-commanding NP.
      ii. An NP is free iff it is not bound.
      iii. An R-expression is any DP which is not a free pronoun or anaphor.


The term pronoun in (14a) was defined to include also reflexive pronouns, so a non-pronoun is neither pronoun nor reflexive. If coreference is to be captured by coindexation, (14a) determines that a non-pronoun cannot be coindexed with a c-commanding NP. (14b) captures the same generalization, by use of the definitions (i-iii). 'Free' is defined as not coindexed (i.e. not coreferential) with a c-commanding NP. An R-expression is a non-pronoun (and non-anaphor). The term has nothing to do with reference - bound variables, like wh-traces, are also defined as R-expressions. The two formulations in (14), thus, express the same condition, based on the view that both binding and coreference (or more broadly covaluation) are guided by a structural rule.

Let us assume, for the moment, that condition C, just like condition B, is a condition on (variable) binding, namely, that it should be added to the binding conditions in (5). This means that binding is impossible in (15a), namely that (15b) is ruled out. But just as in the case of condition B (exemplified in (7)), this is not sufficient to rule out the covaluation construal in (15c).

15 a) She thinks that Lucie is smart.
   b) *Lucie (λx (x thinks that x is smart))
   c) *She (λx (x thinks that Lucie is smart)) & she = Lucie

However, once rule I (11) is assumed, nothing needs to be added to it to rule out (15c). In (15a), she c-commands Lucie. Hence, if we are considering covaluation of the two, (11b) needs to be consulted. By Condition C, she cannot bind Lucie (15b). The fate of (15c) now depends on clause c of rule I. In the given context, the covaluation reading (15c) is equivalent to (i.e. indistinguishable from) the bound reading (15b). Hence (15c) is ruled out.

In (16), by contrast, the two readings are truth-conditionally distinct. (In (16b) considering oneself smart is said to hold only of Lucie; in (16c) considering Lucie smart holds only of Lucie.) Hence Rule I allows covaluation here.

16 a) Only she thinks that Lucie is smart.
   b) Only Lucie (λx (x thinks that x is smart))
   c) Only she (λx (x thinks that Lucie is smart)) & she = Lucie

I will return to other instances where Rule I permits coreference in apparent violation of condition C. In fact, the evidence for clause c of rule I is much stronger in the case of condition C than with condition B. (See footnote 2 for a potential reason.) All of Evans' (1980) original examples were with apparent condition C violations, and historically my major motivation in Reinhart (1983) was capturing the interpretative and conceptual problems posed by condition C.

So far we assumed that, similarly to condition B, condition C is still needed as a structural condition on binding, independently of covaluation. Let us now turn to the question whether this is indeed so. To check this, let us incorporate condition C into the binding condition we assumed before in (5), as clause d. of (17).

17) (Variable) Binding condition (with condition C).
    β can be construed as a variable bound by α iff
    a. α c-commands β, and
    b. β is a free variable, and
    c. in the local domain of α, β is not a pronoun (Condition B), and
    d. β is not an R-expression (Condition C).

Recall that the definition of R-expression covers anything which is not a free pronoun or anaphor, namely anything that cannot be construed as a free variable. (Wh-traces are R-expressions.) Thus, (17d) just repeats (17b). Clause b. is, of course, crucial for binding, but as I mentioned, it is just a prerequisite of logic, that does not have to be stated as a specific linguistic condition. (One cannot imagine that natural language would be useable at the interface, if it had a concept of variable-binding undefined in logic.) Our original binding condition (5) is, thus, sufficient to capture the restrictions on variable binding under consideration. (As in the standard theory, weak crossover is not captured by what has been stated here; see footnote 1.)

Condition C is, thus, superfluous as far as binding goes. However, since the linguistics community seems attached to condition C, we may leave open here the question whether clause d. of (17) is needed independently of clause b., namely, whether condition C exists.

Either way, condition C, just like condition B, only restricts variable binding, and the crucial point here is that covaluation (coreference) in condition C environments is governed by the covaluation rule I.

Let us examine a further example of how clause b. of (17), or condition C, interacts with rule I in classical cases where it is assumed that condition C is at work.

18 a) Who does she think t is smart?
   b) Who (λz (she think z is smart))
   c) *Who (λz (z think z is smart))

In the strong crossover structure (18b), the pronoun she c-commands the trace z. But by (17b) (namely 5b), it cannot bind it, because the trace is a bound, rather than free, variable. Who can still bind the free pronoun she. In this case we would obtain (18c), where the pronoun and the trace are covalued - both get the value z. (That the pronoun she does not bind the trace z can be verified with definition (3).) However, since she both c-commands the trace z and cannot bind it, Rule I (11) determines that this covaluation is ruled out. Condition C effects in quantification contexts (*She_i thinks that every lady_i is smart) are precisely analogous, assuming that the quantified DP undergoes QR.

Turning now to the acquisition of rule I, as established since Chien and Wexler (1990), there is a sharp contrast between children's performance on bound anaphora and on coreference. Specifically, children rule out variable binding construals in condition B contexts at a rate of 80%-90%, but they perform at around 50% on ruling out coreference in these contexts. Grodzinsky and Reinhart (G&R) assume that both the binding conditions and Rule I (or the broader strategy behind it) are innate, and fully available to the child, namely, there is no deficiency of information, or any factor that awaits acquiring. The child also masters innately the basic laws of logic, and has the tools to compute logical equivalence, as required by clause c. of Rule I. But the difference in children's performance on binding and coreference follows from the different types of computation involved in resolving these two types of anaphora. Computing clause c. of rule I requires constructing, holding, and carrying out a semantic comparison of a reference set with two representations, and G&R argue that the amount of processing required by this step exceeds children's working memory capacity, which is not yet as developed as that of adults.


The MIT Encyclopedia of Cognitive Sciences defines the working memory system as follows: "Cognitive scientists now assume that the major function of the system in question is to temporarily store the outcomes of intermediate computations when problem solving, and to perform further computations on these temporary outcomes (e.g. Baddeley 1986)" (Smith 1999, p. 888). It is obvious that reference set computation relies heavily on this ability to store and perform further computation on temporary outcomes. Independently of this task, it is by now well established in psychology that children's working memory is not yet fully developed. An extensive survey of the literature and findings in this area can be found in Gathercole and Hitch (1993) and Gathercole and Baddeley (1993). (For an example of experiments on the linguistic effects of this limitation with pre-school children, see Gathercole and Adams 1993 and Adams and Gathercole 1996.)

4

Given this limitation, one may assume that children know precisely what they have to calculate in order to answer the questions in rule I tasks, but they fail to execute the required procedure. I will return to more details of how this works in section 2.1.

G&R argue that the crucial indication of working memory failure is in the statistics of children's performance. What repeated experiments on coreference in condition B environments have confirmed is that, in the relevant experimental setting, the results are at approximately 50% adult-like performance. As mentioned, Chien and Wexler (1990) showed that, in these circumstances, chance performance is also found in individual children (conflicting answers on the same condition), which indicates a guess pattern. The same results were confirmed in Thornton and Wexler (1999), who showed that (for most children in their experiments) the analysis of the individual results corresponds to the binomial model, that is, the probability of arbitrary choices between two options. I will turn to their statistical findings and their significance in section 4, where I also discuss the experimental conditions under which chance performance is to be expected. 50% performance consistent with a guess pattern is not commonly found in acquisition. If children don't know a given rule, one may still expect a uniform performance pattern of individual children. But if the source of the difficulty is a processing failure, these results are explained: To resort to a guess, the child has to know that he is missing something. (Otherwise he would operate uniformly based on his assumptions about what the relevant rule is, or, in the case of default strategies, according to the default.) This condition is met here because the child knows innately that he has to execute the comparison required by clause c. of rule I. Since he gets stuck in the execution, and there is pressure to answer either "yes" or "no", one of these is chosen arbitrarily.
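The binomial model invoked here can be illustrated with a short calculation. The sketch below is only an illustration: the number of trials (four) and the function names are assumptions introduced for the example, not figures or procedures taken from Chien and Wexler (1990) or Thornton and Wexler (1999). It computes, for a child who chooses arbitrarily between "yes" and "no" on every trial, the probability of each possible number of adult-like answers, and the probability of giving conflicting answers on the same condition.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k adult-like answers in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_mixed_answers(n: int, p: float = 0.5) -> float:
    """Probability that a guessing child gives both "yes" and "no" answers
    across n trials of the same condition (i.e. does not answer uniformly)."""
    return 1.0 - binomial_pmf(0, n, p) - binomial_pmf(n, n, p)


if __name__ == "__main__":
    n_trials = 4  # hypothetical number of trials per condition
    for k in range(n_trials + 1):
        print(k, binomial_pmf(k, n_trials))
    print("conflicting answers:", prob_mixed_answers(n_trials))
```

Under these assumptions, a uniform answer pattern (all "yes" or all "no") is expected from only one guessing child in eight, whereas a child who follows a fixed rule or a fixed default would always answer uniformly. This is why individual data, and not only the group average, bear on the guess hypothesis.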

4 The working memory system should not be confused with memory resources in general (long term memory). E.g. an anonymous reviewer of this paper argues, against the claim put forth here, that "children are capable of memorizing a large number of new words; they are capable of learning rules of new games, etc." and that the same is true of aphasic patients. But these tasks concern memory resources in general, regarding which I am not aware of any evidenced limitation in children. Smith (1999) explains that the view of working memory as a gateway to long term memory has been undermined by neuropsychological studies that found patients who are impaired on working memory tasks but perform normally on long-term memory tasks.

Note also that the precise details of how working memory develops (whether memory capacity itself increases, or only efficiency in allowing more resources to be employed in storage) are a subject of debate. But these details are not important for the present discussion, because either way, children's working memory was found not to operate as efficiently as adults'.


2. Thornton and Wexler's arguments against the processing account.

Thornton and Wexler (1999) (henceforth T&W) argue against the processing account. They assume a version of Rule I that follows its reformulation in Heim (1998). As mentioned, on Heim's view, there are some contexts where what enables a coreference reading is that the two NPs pick up the shared referent under distinct guises. In all other contexts, Heim's analysis is the same as outlined above. But T&W extend her analysis to all contexts and argue that it is not the need to construct and evaluate a comparison set that hinders children's performance on coreference, but rather a pragmatic deficiency in identifying the use of guises. Thus, children's difficulties do not reflect a processing limitation, but rather problems with contextual orientation, which develops with age.

Let us first examine the main arguments of T&W against the processing account of Grodzinsky and Reinhart (1993). One argument is conceptual. They say:

"On Grodzinsky and Reinhart's account, the processing bottleneck that children encounter is 'of the sort known to diminish with age' (1993, 91). Thus, they do not share the assumption that children have access to a universal parser (see Crain and Wexler 1999; Crain and Thornton 1998). Rather, the child's processing system has different properties from adults, and Rule I remains problematic until this system matures" (T&W, 1999, 47).

I definitely share the theoretical assumption of a universal parser made in the references cited by T&W. Indeed, G&R's point of departure is precisely that the children's parser, being innate, is identical to that of adults. They argue that "there is no known reason to assume that any of the steps [of Rule I] requires knowledge that surpasses children's innate endowment… But the execution of all these steps, in the specific case of structures like [ Oscar touched him ] puts a much heavier burden on working memory than do other rules (e.g. the binding conditions)… If this is so, then presented with [such sentences], children know exactly what they are required to do by Rule I, but getting stuck in the execution process, they give up and guess" (G&R, 88). The difference between children and adults, in this case, is only in the size of their working memory. It is commonplace wisdom that precisely one and the same parser (software), applying in two hardware systems differing only in the size of their memory, may fail at some tasks in the one but not in the other. A difference in memory space cannot be described as a different parser, nor, strictly speaking, as a different processing system. Rather, acknowledging the commonplace wisdom in psychology that children's working memory is smaller than adults' enables us to explain how the same innate computational system and parser can still fail in children's processing.

Conceptual issues aside, T&W raise two arguments against G&R's analysis, which they summarize as follows: "There are two main problems with Grodzinsky and Reinhart's account… First, there is little or no evidence to support the proposal that some sentences containing pronouns (e.g. Mama bear is washing her) cause a processing overload whereas others (e.g. Mama bear is washing her face…) do not. Second, there are reliable experimental findings showing that whereas children misinterpret pronouns in principle B structures, they do not have difficulty with parallel principle C structure (i.e. Mama bear is washing her vs. She is washing Mama Bear). On rule I account, both should be equally difficult to process" (p. 52). Let us examine each of these arguments.

2.1. Processing load.

The empirical prediction of G&R is that all tasks that require the processing of clause c. of rule I (as stated here) lead to chance performance in children, which G&R take as the evidence for a processing load. In the present formulation, clause c. is the step that requires a semantic reference-set computation. Let us see, first, how this works. Rule I is repeated below.

11) Covaluation Rule I.
α and β cannot be covalued if
a. α is in a configuration to bind β (namely, α c-commands β), and
b. α cannot bind β, and
c. the covaluation interpretation is indistinguishable from what would be obtained if α binds β.

Suppose the child is considering coreference assignment in a given derivation. This means Rule I must be consulted. If either clause a. or clause b. of (11) does not hold, the assessment ends here, with nothing complex about it.

19 a) Max's mother loves him (& him = Max).
b) The woman next to Max praised him (& him = Max).

20 a) Mama Bear is washing her face (& her = Mama Bear).
b) Mama Bear is washing herself (& herself = Mama Bear).

In (19), clause a. of rule I (11a) does not hold: neither of the candidates for covaluation c-commands the other. Hence, clause b. (11b) need not even be consulted, and the covaluation goes through. In (20a), clause a. holds, so clause b. must be checked. However, under the present formulation of rule I (see the discussion of (13) above), clause b. does not hold, since binding conditions B and C do not rule out the binding of the anaphoric element. So the assessment ends here, and coreference is permitted. The same is true of (20b). There is no evidence that the need to check clause b. of (11) poses any processing difficulties for children.

But if both clause a. and clause b. hold, assessment must go through clause c., which is the costly step. These are the cases of coreference in apparent violation of condition B (21) and condition C (22, 23).

21 *Mama Bear is washing her (& her = Mama Bear).
22 *She is washing Mama Bear (& she = Mama Bear).
23 Only she is washing Mama Bear (& she = Mama Bear).

In these cases, a comparison representation must be constructed and compared to the intended coreference representation. In terms of processing it does not matter whether the final verdict of rule I is "allow", as in (23), or "disallow", as in (21, 22). In both cases, the decision requires a complex computation.
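The order of assessment just described can be sketched as a small decision procedure. This is a minimal illustration under stated assumptions, not an implementation of the parser: the class `Candidate`, the function `rule_i`, and the hand-coded boolean values for each example are hypothetical stand-ins for the outcome of the c-command check, the binding conditions, and the semantic comparison. The sketch only makes explicit which examples reach clause c.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Candidate:
    """One covaluation candidate. The boolean fields are hand-coded
    assumptions standing in for the outcome of each check in (11)."""
    sentence: str
    alpha_c_commands_beta: bool                            # clause a.
    binding_blocked: Optional[bool] = None                 # clause b.
    indistinguishable_from_binding: Optional[bool] = None  # clause c.


def rule_i(c: Candidate) -> Tuple[bool, bool]:
    """Return (covaluation_allowed, clause_c_was_needed)."""
    if not c.alpha_c_commands_beta:   # clause a. does not hold: stop here
        return True, False
    if not c.binding_blocked:         # clause b. does not hold: stop here
        return True, False
    # Clauses a. and b. both hold: the costly comparison is required.
    return (not c.indistinguishable_from_binding), True


EXAMPLES = {
    "19a": Candidate("Max's mother loves him", False),
    "20a": Candidate("Mama Bear is washing her face", True, False),
    "21": Candidate("Mama Bear is washing her", True, True, True),
    "22": Candidate("She is washing Mama Bear", True, True, True),
    "23": Candidate("Only she is washing Mama Bear", True, True, False),
}

if __name__ == "__main__":
    for label, cand in EXAMPLES.items():
        allowed, costly = rule_i(cand)
        verdict = "allowed" if allowed else "ruled out"
        print(label, verdict, "| clause c. needed:", costly)
```

In this sketch the second returned value, not the verdict, is what matters for the processing account: (21), (22) and (23) all reach clause c., although (23) is allowed and (21, 22) are ruled out.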


A question that arises is what it is precisely about step c. of (11) that exceeds the processing ability of children. In fact, two procedures take place in applying this clause. It is easiest to spell them out from the perspective of the comprehension side of the parser. First, in order to determine whether coreference is distinct from binding, the binding representation needs to be constructed. This representation is not available at the input derivation (which is associated with the phonological input the parser receives), since the input derivation does not allow binding. So the parser has to construct an alternative representation with variable binding. (The details of the procedure of constructing the alternative derivation are discussed in Reinhart, 2000.) The next step is semantic computation: The two representations need to be compared against the context, and only if they are distinct, coreference is allowed. The second procedure seems similar in nature to that involved in semantic disambiguation, where two representations need to be compared in order to select the one appropriate to the context.

It is known that semantic disambiguation itself already poses a processing load, because it requires holding two (or more) representations in working memory. So one may ask which of the parsing procedures involved in reference-set computation surpasses the capacity of children's working memory. G&R assumed that the latter task is already beyond children's ability, and cited children's performance on lexical disambiguation as evidence. (Faced with a lexically ambiguous word, children select the reading that is statistically more frequent, rather than comparing the competing readings against the context.) As pointed out in T&W, in the case of lexical disambiguation, alternative analyses of children's performance are available.

Still, it is known that children (like, in fact, adults) tend to develop defaults to bypass the parsing of semantic disambiguation, which is some evidence for the greater processing load posed by this task (see, e.g., Crain, Ni and Conway 1994). Nevertheless, I suggested in Reinhart (1999) that the conclusion of G&R was mistaken, and it is only the full complex involved in reference-set computation which leads to a processing crash of the child's parser.

More empirical work is needed on children's performance on semantic disambiguation, but the hypothesis put forth here is that we expect a processing failure only when the computation requires also the first step of constructing a derivation not available at the parser's input.

The theoretical expectation of the present framework is that reference-set computation (of the relevant global type) is always associated with group performance in the 50% range in dual-choice tasks in acquisition. That is, if other areas of language are found that require reference-set computation, with properties similar to clause c. of rule I, then (under the appropriate experimental setting) we should find children's performance in the 50% range in these areas as well. The proposed explanation is that children are aware of the innately required computation, but they cannot carry it out because of their limited memory resources, and they resort instead to strategies that bypass it. In Reinhart (forthcoming) I argue that two such bypassing strategies are possible: One is simple guessing, witnessed by individual performance in the range of 50%. The other, dominant in tasks involving semantic disambiguation, is the selection of an arbitrary default, which may be fixed for a given child across tasks. But since the choice of the default is, itself, arbitrary, the group results remain in the 50% range.
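The arithmetic behind the last point can be made explicit with a small sketch. It rests on an assumption that is not stated in the text in these terms: that an "arbitrary" default means the fixed defaults are split evenly between "yes" and "no" across children. The group sizes and the names `expected_yes_rate` and `group_yes_rate` are hypothetical and serve only the illustration.

```python
from fractions import Fraction


def expected_yes_rate(strategy: str) -> Fraction:
    """Expected proportion of "yes" answers for one child in a two-choice task."""
    if strategy == "guess":        # arbitrary choice on every trial
        return Fraction(1, 2)
    if strategy == "default_yes":  # fixed default: always "yes"
        return Fraction(1)
    if strategy == "default_no":   # fixed default: always "no"
        return Fraction(0)
    raise ValueError(f"unknown strategy: {strategy}")


def group_yes_rate(strategies: list) -> Fraction:
    """Expected group-level proportion of "yes" answers."""
    total = sum((expected_yes_rate(s) for s in strategies), Fraction(0))
    return total / len(strategies)


# Hypothetical groups of 20 children each.
guessers = ["guess"] * 20
defaulters = ["default_yes"] * 10 + ["default_no"] * 10  # assumed even split

if __name__ == "__main__":
    print("guessers:  ", group_yes_rate(guessers))
    print("defaulters:", group_yes_rate(defaulters))
```

Both hypothetical groups come out at one half, but the individual patterns differ: each guesser is expected to hover around 50%, while each child with a fixed default is expected at 0% or 100%. Group averages alone therefore cannot tell the two strategies apart.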

As we will see, in the area of rule I, the guess strategy is found in all anaphora tasks involving step c. of this rule, including in tasks similar to (23), where this clause rules coreference in. However, T&W offer an alternative account for children's difficulties in all these cases (except 23). Hence it is appropriate for them to raise the question of what independent evidence exists that the source of difficulty here is indeed the processing load. In principle, it should be possible to test the processing load in rule I tasks directly, by standard measurement of processing time or by more sophisticated eye-tracking experiments. To my knowledge, this has not been done. Another type of possible independent evidence would be performance in the 50% range in areas other than anaphora, where, on the one hand, reference-set computation has been established, and, on the other, T&W's pragmatic analysis cannot apply.

I argue in Reinhart (forthcoming) that such evidence has indeed emerged in the area of stress-shift for focus. Chierchia et al. (2001) and Gualmini et al. (2001) found the same in the area of implicatures, which also involve semantic reference-set computation. (Since semantic disambiguation is at work in both these areas, the dominant strategy in both is the arbitrary fixed default.) 5

5 T&W also try to provide empirical counterevidence to G&R's claim that it is the complexity of the computation which is responsible for children's difficulties. This is based on the assumption that there are other areas of anaphora that involve equally complex computations and still pose no difficulties for children. As they put it, "indeed, there are several empirical findings in the literature showing that for many complex structures, children can hold two representations in memory and compare them for the purposes of computing the reference of a pronoun" (T&W, 46). However, the two examples they discuss of such complex computations do not, in fact, involve any reference-set computation, nor can it be argued that they pose a comparable complexity of computation.

One example concerns discourse anaphora instances as in (i) (T&W's (39), p. 46).

i a) No mouse/every mouse came to Simba's party. He wore a hat.
b) A mouse came to Simba's party. He wore a hat.

The pronoun can refer to the indefinite in (ib), but not to the quantified DP in (ia). T&W cite experiments of Crain and of Conway that found that children performed almost adult-like on anaphora tasks in such sentences, and they conclude that this is despite the fact that "clearly children must be able to hold both sentences in memory in order to apply the relevant constraint" (T&W, 47).

It is not obvious why this is so clear. I am not aware of an analysis that requires a reference-set computation in such tasks, and if one exists, it is unmotivated. In this case, there is even no need to hold two representations at all. It has been established in DRT (and other frameworks) that indefinites introduce a discourse referent that can be picked up in subsequent discourse, (ib), while quantified NPs normally do not (special circumstances, absent in (ia), aside). This generalization can be stated under many theoretical formulations, but the task involved is establishing which item in the discourse referents storage is available for the pronoun to get its value from. The mouse entity is available in this storage for (ib) but not for (ia). In any case, the task requires looking at the discourse storage, rather than retaining two representations, let alone comparing them.

The other example is with quantified (bound) anaphora in the domain of condition B. T&W cite experiments by Crain (1991) and Thornton (1990), which checked sentences like (iia).

ii a) I know who scratched him - Bert.
b) Every turtle scratched him and Bert did too.

Children correctly rejected (iia) if Bert was shown scratching himself, which means that they had no difficulties in processing the sentence. But this is hardly surprising. T&W view (iia) as an instance of ellipsis, namely a VP needs to be copied or reconstructed for Bert. If so, this is precisely equivalent to the type of task in (iib), which was the focus of T&W's own experiments (although T&W did not experiment precisely with sentences like (iib), but rather with the reverse ordering of the conjuncts, which, they note, may have been an oversight). The construal of the second conjunct is determined by the construal of the first. But in the first, there is no coreference option to begin with, because the antecedent is quantified. Binding condition B disallows construing the pronoun as bound, namely forming the predicate λx (x scratched x), hence this predicate cannot be reconstructed in the second conjunct. (As shown already in Chien and Wexler (1990) and reiterated in T&W, children do not have difficulties with the variable binding aspects of condition B.) Next, the parallelism condition determines that a coreference interpretation in the second (elided) conjunct is only possible if it is available in the first. According to T&W, children essentially master this condition. In any case, it seems that in this example, T&W's analysis and the reference-set analysis have precisely the same predictions. So if this experiment poses a problem, it is a problem for both.

2.2. Condition C.

The second major argument of T&W regards condition C. The theoretical stand underlying their argument is that condition C must be assumed as an independent syntactic condition. (This follows Heim (1998), who modified rule I to apply only in condition B environments, leaving the covaluation problems with condition C for future research.) However, as we saw in section 1, whether condition C is needed for binding or not is independent of the question under consideration, namely children's performance on the coreference aspects of condition C. Be that as it may, it seems to me that T&W should expect exactly the same behavior in condition C environments as G&R do. This is because they state that, as in Chien and Wexler (1990), they continue to assume that the pragmatic generalization governs both the coreference aspects of condition B and of condition C (p. 31), and they provide an alternative account of why children's performance on condition C coreference appears better than that on condition B (see below). This is not surprising, since both G&R and Chien and Wexler (1990), further developed in T&W, share the assumption of Reinhart (1983) that variable binding and coreference are governed by different types of rules. Chien and Wexler's study was pioneering in establishing that, correspondingly, children perform well on binding tasks (in both condition B and C environments), but perform at around 50% in coreference tasks. So, under both analyses one may expect the same with the coreference aspects of condition C. Nevertheless, T&W also argue that given that condition C is innate, children should not have problems with its coreference aspects, and use this as an argument against G&R's analysis. Let us follow this argument.

With the exception of Grimshaw and Rosen (1990), who found near-chance performance on sentences like their (24a) 6, most studies found that children rule out coreference disallowed by condition C at a much higher rate than on condition B.

24 a) *He said that Bert touched the box. (he = Bert)
b) Because he heard a lion, Tommy ran fast. (he = Tommy)

However, G&R argued that the apparent improved performance on condition C might reflect an independent factor: In a right branching language, the most frequent instances of condition C violations also involve backwards anaphora. With the exception of Crain and McKee (1985), studies that found a high rate of rejection of anaphora in (24a) also found that children reject backward anaphora in structures like (24b), where it is permitted by condition C. These studies attribute both results to an independent directionality factor, and conclude that children reject backward anaphora regardless of condition C. (To mention just a few: Tavakolian 1977, Solan 1978, Lust and Clifford 1982.)

6 In the reported experiments, children allowed coreference in (23) in 37.5% of the cases, which was not significantly different from their performance on condition B violations.

In their chapter 2, T&W argue in reply that the findings regarding directionality effects are just a product of the experimental setting, specifically, of using act-out or elicited imitation tasks. Thus, they conclude that there is no evidence for children's independent difficulties with backward anaphora, which means that the reason they reject anaphora in structures like (24a) can only be adherence to condition C (p. 49). Interestingly, however, in chapter 3, T&W encounter a directionality problem for their own analysis. Their account of children's performance on condition B is that they have not yet mastered the use of guises. Thus, they may obtain "local coreference" under two different guises in (25a), where the adult conditions for obtaining distinct guises are not met. One would expect, then, that children should also allow local coreference in the condition C environment (25b), by precisely the same procedure of assigning different guises to the relevant entity. Still, in T&W's experiments, children rejected local coreference in sentences like (25b) at a high rate of 92%, while they rejected coreference in sentences like (25a) only at the rate of 57%, with mostly chance performance.

25 a) Mama Bear washed her.
b) She washed Mama Bear. (T&W, 25, p. 106)

T&W explain that "the crucial difference between sentences subject to principle B and those subject to principle C is the obvious one: In the former, the pronoun is in object position, and in the latter, the pronoun is in subject position" (p. 106), namely the crucial factor is directionality. They proceed to offer two reasons why, when the pronoun precedes the potential antecedent (as also in 24a), anaphora computation will be blocked independently of the guises options. One is in terms of processing: Pronouns are assigned a reference as soon as they are encountered. If she in (25b) has been assigned the reference of Mama Bear (from the previous discourse), then deciding whether it could corefer with the next occurrence of Mama Bear, under a different guise, would require backtracking on this step and starting a new guise-computation. It is this backtracking which is difficult for children, or as T&W conclude, "obviously, an on-line incremental parser would find this amount of computation burdensome" (p. 107). 7 Presumably, then, children do not even consider the coreference option in such contexts.

7 In fact, T&W make a stronger claim that "backtracking in order to reconsider the interpretation of the pronoun, is not likely to be within the parser's capacity, either for children or for adults" (p. 107). This cannot be true, since the adults' parser can clearly deal with backward anaphora, as well as with the apparent condition C violations permitted by rule I, such as (16) above, or Evans's (i):

i) I know what Ann and Bill have in common. She thinks that Bill is a genius, and he thinks that Bill is a genius.

Needless to say, if T&W's generalization holds also for adults, then there is very little evidence for condition C in right branching languages, since most of what it rules out would also be ruled out by the special parser limitation.


Whether this is the precise formulation of the directionality factor or not, it confirms Grodzinsky and Reinhart's conclusion that with backwards anaphora there is an independent factor that disables the application of rule I for children. Possibly, the factor is not just any directionality, as G&R assumed, following previous studies, but only directionality involving the subject, as proposed by T&W. In any case, it is this directionality factor that explains why children reject coreference in the common condition C contexts, independently of rule I.

As it turns out in chapter 3, also within T&W's analysis, children's improved performance on condition C tasks does not provide any evidence for their mastery of the computation of the coreference aspects of condition C. In fact, they assume, just like G&R, that children bypass rule I, or the guises computation, in these environments.

Grodzinsky and Reinhart (1993) suggest that to circumvent the directionality factor, children's performance on the coreference aspects of condition C should be checked in the few instances where this factor is absent, most notably in reconstruction contexts such as (26).

26) *Near Ann, she saw a lion ( she = Ann).

Under the reconstruction analysis, condition C (and hence, rule I) blocks coreference here because, once reconstructed to its original position, the PP is c-commanded by the pronoun. On the other hand, there is no directionality effect here, since during processing the pronoun follows the antecedent. Indeed, in such environments, experiments reached a clear consensus: children perform poorly, with exact figures varying according to the experimental method. (E.g. Ingram and Shaw 1981, Taylor-Brown 1983, Lust, Loveland and Kornet 1980.)

T&W dismiss these findings as well (chapter 2), based on the claim that one of them (Taylor-Brown 1983) used what they consider deficient methodology. (They ignore in this chapter the question of how children all of a sudden come to master guises computation in these contexts.) They argue further that a more recent study, Chierchia and Guasti (2000), has proved unequivocally children's mastery of condition C in reconstruction contexts. In fact, however, Chierchia and Guasti's study focused on bound variable anaphora, such as (the Italian version of) (27).

27) *In the barrel of every pirate_i, he_i carefully put a gun.

Indeed, children rejected anaphora in such sentences 90% of the time. But this is precisely the expected result for both Chien and Wexler (1990) and G&R. Rule I is not involved in the processing of (27). Bound variable anaphora is governed directly by the binding conditions, whether the relevant condition here is condition C, or clause b of (17) (the logical requirement that only free variables can be bound). The crucial assumption of G&R (as of T&W) is that children should face no problem in the processing of variable binding. In G&R's terms this is because a heavy computational load is only involved in coreference tasks, where rule I needs to be consulted to determine whether coreference is still permitted although binding is ruled out. Chierchia and Guasti, in fact, emphasize this point in that same paper, stating explicitly that they did not study coreference in these structures, but that their theoretical expectation is that the same difference would be found between children's performance on bound-variable (quantified) anaphora and on coreference as is found in condition B environments.


It remains the case that to properly check children's performance on rule I in condition C environments, one should abstract away from possible directionality factors. An unexpected further confirmation that, if this is controlled, children perform at chance level comes from T&W's own experiments on coreference in VP ellipsis, with sentences like (28).

28) The kiwi bird cleaned Flash Gordon and he did too.

In the story context, Kiwi bird and Flash Gordon fell in the mud. Flash Gordon asked a third participant to help clean him, but that one refused. Kiwi bird helped clean Flash Gordon, but mostly, Flash Gordon had to clean himself on his own. Children accepted (28) as true in this context 54% of the time. In other words, they allowed the pronoun he to corefer with Flash Gordon at chance level.

Let us first examine the type of computation required to determine whether coreference is permitted here. At the interpretation stage, the λ predicate formed in the first conjunct (29a) is present also in the second conjunct (29b) (whether copied from the first, or just deleted at PF, but present at LF).

29 a) The kiwi bird (λx (x cleaned Flash Gordon)) and
b) He did (λx (x cleaned Flash Gordon)) too (& he = Flash Gordon)

Now, we are considering assigning the pronoun the value of Flash Gordon , which would result in a covaluation configuration in clause (29b). The first clause of rule I (11) holds: the pronoun c-commands Flash Gordon in (29b), hence rule I must be further checked. Clause b. holds as well - the pronoun cannot bind Flash Gordon , by condition C, or its equivalent logical prohibition (clause b. of 17). So clause c. of rule I must be applied, namely the representation (30) should be constructed and compared with (29b).

30 He did (λx (x cleaned x)) too (& he = Flash Gordon).

The coreference construal (29b) is permitted only if in the context of (29a), (29b) is distinguishable from (30). The fact of the matter is that it is. The parallelism requirement is that the predicates in the two conjuncts be identical (under the relevant definition). The predicate in (30) does not occur in (29a). The only candidate for parallelism is the predicate as construed in (29b). In more intuitive terms, the property shared by the two events is that of cleaning Flash Gordon , not of cleaning oneself.

The type of meaning distinctness this example shows is similar to that in (31), observed by Evans (1980), and discussed in Reinhart (1983), Grodzinsky and Reinhart (1993) and Heim (1998). Heim labeled such contexts 'structured meaning' contexts.

31) I know what Ann and Bill have in common: She thinks that Bill is terrific, and he thinks that Bill is terrific. (adapted from Evans 1980: (49))

32 a) She (λx (x thinks that Bill is terrific)) and
b) He (λx (x thinks that Bill is terrific)) (& he = Bill).

33) He (λx (x thinks that x is terrific)) (& he = Bill).


The last conjunct in (31) violates condition C. Nevertheless, the coreference interpretation (32b) is permitted. In this case, parallelism is not imposed by ellipsis, but by the content of the preceding context, which requires identifying a shared property of Ann and Bill.

Although the proposition (32b) is equivalent to (33), the properties attributed to their subjects are not identical (they denote different sets). It is only the property in (32b) that is indeed shared by (32a, b), or by Bill and Ann. This suffices for rule I to allow the coreference construal in (32b). Typical of parallelism configurations like both (28) and (31) is that the shared material must be destressed (in 31) or fully suppressed (in 28), which entails that in both, the subject pronoun is stressed.
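The contrast between equivalent propositions and distinct properties can be illustrated with a small sketch. The domain, the individual Carl, and the facts about who thinks whom terrific are assumptions made up for the illustration; only the two λ-properties come from (32b) and (33).

```python
# Toy model (all facts below are illustrative assumptions, not data).
DOMAIN = {"Ann", "Bill", "Carl"}

# Pairs (x, y) meaning "x thinks that y is terrific".
THINKS_TERRIFIC = {("Ann", "Bill"), ("Bill", "Bill"), ("Carl", "Carl")}


def extension(prop):
    """Return the set of individuals in DOMAIN that have the property."""
    return {x for x in DOMAIN if prop(x)}


def thinks_bill_terrific(x):
    """The property in (32b): λx (x thinks that Bill is terrific)."""
    return (x, "Bill") in THINKS_TERRIFIC


def thinks_self_terrific(x):
    """The property in (33): λx (x thinks that x is terrific)."""
    return (x, x) in THINKS_TERRIFIC


# Applied to Bill, both properties give the same truth value.
same_proposition = thinks_bill_terrific("Bill") == thinks_self_terrific("Bill")

# As properties, however, they denote different sets in this model.
set_32b = extension(thinks_bill_terrific)
set_33 = extension(thinks_self_terrific)
distinguishable = set_32b != set_33

# Only the (32b) property is shared by Ann and Bill.
shared_32b = {"Ann", "Bill"} <= set_32b
shared_33 = {"Ann", "Bill"} <= set_33

print(same_proposition, distinguishable, shared_32b, shared_33)
```

In this toy model the two properties agree on Bill but differ elsewhere, and only the (32b) property holds of both Ann and Bill, which is the sense in which the construals count as distinguishable.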

By this computation, then, (28) comes out as an instance of coreference ruled in by rule I.⁸ This contrasts with some claims in the theoretical literature on coreference that (28) is ruled out for adults (e.g. Fiengo and May, 1994). But that this is indeed the correct verdict is shown by the results of T&W's adult control experiment, in which adults accepted (28) at a rate of 83%. The less than 100% acceptance rate here may be typical of rule I computations, which require more effort from adults as well, but it is still in sharp contrast to their full rejection of coreference in condition C environments ruled out by rule I, such as He cleaned Flash Gordon with no ellipsis context.

Recall that unlike the adults, children performed here at chance level, allowing coreference at 54%. For G&R, this is the expected result whenever clause c. of rule I needs to be processed. Whether coreference is ruled in or ruled out by rule I cannot be relevant, since children cannot complete the computation anyway. The question then is why this expectation is confirmed only in the ellipsis context (28), and not in the experiments with the same sentences as matrix, as in (25b - She washed Mama bear) or He cleaned Flash Gordon.

T&W provide an answer: In the VP ellipsis context, the on-line processing factor, which blocks even considering coreference in the matrix cases, does not play a role, because the full predicate attributed to the subject pronoun in the elliptic conjunct is available to the child from the previous conjunct. So when the reference of the pronoun is decided, the full proposition is available for computation (p. 129). Thus, the offensive backtracking is not required, or, in our terms, the directionality factor is avoided.

8 As explained in Fox (1995), an ellipsis context is not sufficient, by itself, to license coreference by rule I (see also the discussion of (13) above). Thus, in the reverse order in (i), coreference in the first conjunct is not allowed, even though this would enable the interpretation that both the kiwi bird and Flash Gordon himself cleaned Flash Gordon.
i) He cleaned Flash Gordon, and the kiwi bird did too.
However, the prohibition stated by Fox is (roughly) against letting future discourse affect the processing of a given derivation. In (28), by contrast, the computation of rule I applies after the relevant predicate has already been formed in the previous context.
That apparent condition C violations are possible in the given ellipsis context has been noted before. T&W mention that Fiengo and May (1994) suggested for sentences like Mary likes John and he thinks that Sally does too that an operation of "vehicle change" (roughly) changes the status of John in the second conjunct to that of a pronoun. As T&W point out, however, this would not work for the local context of (28), where a pronoun is ruled out as well. (Fiengo and May argue that sentences like (28) are indeed ruled out, but given the answers of the adults in the experiment, who accepted it 83% of the time, this cannot be true.) The account T&W offer for why coreference is permitted in (28) is that since the pronoun is stressed, it is taken as a different guise of Flash Gordon. Hence this is an instance of coreference under different guises.

This, then, is a novel direct confirmation of G&R's prediction: In contexts where the directionality factor is neutralized, children's performance on condition C aspects of rule I should be at chance level. Nevertheless, T&W present this experimental finding as a major and decisive argument against G&R's analysis, in two chapters of their book: "On Grodzinsky and Reinhart's view", they argue, "the proposed asymmetry between matrix and VP ellipsis structures is not to be expected. For Grodzinsky and Reinhart, children should respond at chance levels to matrix sentences governed by principle C because rule I requires two representations to be compared" (p. 129). "Thus, Grodzinsky and Reinhart's account cannot be correct in its present form" (p. 201).

This same experimental finding also sheds some light on another argument of T&W against G&R's analysis. Under G&R's account, the crucial factor leading to chance performance is that children are unable to carry out the computation required by rule I, because of their underdeveloped working memory. This means that what the actual verdict of rule I is (for adults) cannot have an effect on children's performance. Whether rule I permits coreference or not, they will not be able to complete the computation to decide this. In T&W's account, by contrast, the explanation for children's delay rests on their extending adults' conditions for the creation of guises: "Children create guises in a superset of the contexts in which adults do" (18, p. 102). It follows from this principle that children should allow coreference under distinct guises wherever adults do, though they extend this also to areas where adults don't.

Thus, when rule I permits coreference in conditions B and C environments, T&W predict that children should perform like adults. T&W acknowledge that these environments have not been studied experimentally (overlooking the relevance of their own experiment on (28-29), where, as we saw, children performed at chance level in a condition C environment which happens to be ruled in by Rule I). Nevertheless, they expect that these are the results that would be found, once the experiments are done, and explain that such findings would be a further argument against G&R's analysis that predicts the opposite (p. 48).

3. Questions of learnability.

In the absence of any actual argument against a processing account, the two accounts for coreference delay in acquisition appear, so far, equivalent in their empirical coverage, with the exception of one area - derivations ruled in by rule I. Assuming that T&W's analysis can be modified to handle such cases, this is an interesting situation, where two different accounts appear equally possible for the same phenomenon. This is particularly interesting since the two accounts are based on essentially the same view of the linguistic background, namely on the division of labor between binding theory and the coreference restrictions. Both follow the view in Reinhart (1983) that binding theory (under its various formulations) restricts only variable binding, and its operations are of the familiar type of output conditions of the computational system. Coreference (or covaluation), on the other hand, is governed by a different type of procedure, which is based on context-dependent inference. In principle, it is just as reasonable to assume that what causes coreference delay in acquisition is the type of computation involved, executing which requires larger working memory than children have (G&R), or that it is some deficiency of the relevant context-dependent factors, which children have not mastered yet (T&W). Our next question, in this and the next section, is whether it is possible, nevertheless, to decide between these two possible accounts.

Note, first, that for the processing account, no question of learnability arises. G&R assume that children know innately everything that is required for coreference computation. So as soon as their working memory matures, they will be able to execute it. T&W's account is based on some deficiency in knowledge, which has to be acquired. So the question how it is acquired is relevant. To assess how T&W answer this question, more details of their analysis of guises are needed.

What T&W find particularly attractive in Heim's (1998) reformulation of rule I, is that in some areas it enables coreference computation to depend only on the identification of guises, without applying rule I, namely, with no comparison of representations. The clearest example is what Heim labels "identity debate contexts" as in her (34).

34 Speaker A: Is this speaker Zelda?

Speaker B: How can you doubt it? She praises her to the sky. No competing candidates would do that.

In such contexts, it can be argued that her refers to the person that A and B identify as Zelda, while she refers to the person who is the speaker (say on stage), and whom B does not manage to identify. The same entity (Zelda) is then presented here under two guises. Heim argues that in this case, a comparison of a coreference representation with the bound variable representation must identify them as logically equivalent. So rule I as stated would wrongly rule coreference out. She proposes that, in fact, when entities are represented under different guises, their relation does not count as coreference (roughly, since pronouns have guises as their denotation, and not individuals; hence they do not have the same denotation here). Thus, they are not subject to rule I at all. Though this is not crucial for the present discussion, I should mention that I do not share Heim's intuition that the reason why speaker B's utterance in (34) is appropriate is that no coreference is involved here. Though postulating noncoreference may solve a technical problem, my own intuition is that the fact that she and her corefer is crucial for the interpretation - it is the inference that speaker B wants speaker A to draw.

Instances of 'debated identity' seem to me related to what Heim labeled "structured meaning" cases. Namely, what matters in the context is the difference of the properties, rather than the full propositions, which are equivalent in both cases. In the identity debate contexts, the property one ascribes to an individual under discussion should help establish his identity. Praising oneself (λx (x praises x)) and praising her = Zelda (λx (x praises her)) are distinct properties. If we identify someone as belonging to the set of those who praise themselves to the sky, this cannot help in establishing this person's identity, in the given context, but locating her in the set of those who praise Zelda to the sky enables the inference that she is Zelda. (In Heim's example, the context also spells out that Zelda is probably the only member in this set. But the same inference would be licensed also without this addition.) Obviously, we do not have yet the formal tools to describe precisely this type of inference (which may rest on notions like relevance). This is the worry that led Heim to exclude such problems from the range of rule I. Doing so enables us to keep the term "distinguishable interpretation", which is used in rule I, purely truth-conditional. But it is not clearly getting us closer to understanding either the inference in question, or the conditions under which speakers are allowed to opt for coreference rather than variable binding.

Let us, however, assume with Heim that identity debate contexts are instances of distinct guises. If so, then the coreference task for the child is just to determine whether it is possible that two referring expressions represent, in the given context, different guises of a discourse entity. If this happens, then coreference is permitted. T&W argue that given that guises are coded, they must be innate. What children do not master yet are the conditions under which speakers associate different guises with the same discourse entity, since children have a general deficiency in identifying speakers' contextual intentions.

However, for the analysis to work, T&W extend Heim's analysis much further, e.g. to the contexts Heim labeled "structured-meaning" that we examined in (31), changed in (35) so it illustrates condition B environments.

35) I know what Ann and Bill have in common: She adores him passionately and he adores him passionately.

For Heim, he and him in the italicized clause cannot possibly pass as two distinct guises of Bill. Allowing this would deprive the intuitive concept of guises of any content, since there is nothing here that suggests speakers' uncertainty concerning identity, or a dual perception of the same individual. Heim assumes that in this case, rule I applies as in Reinhart (1983), or G&R, namely a semantic comparison of representations is needed, though she offers further refinement of the conditions under which they are distinguishable.

T&W, by contrast, argue that two guises are involved here as well. There is Bill "in the guise of the individual, in the flesh" who adores someone, and there is Bill in the guise of the person that Ann adores (p. 94, adapted here from T&W's different example (9)). Bearing different guises, as used here, seems to mean just bearing different θ-roles. T&W label this type of guise distinction "role reversal guise" (p. 101). The same of course can be said of any instance of permitted coreference. E.g. in Bill adores himself, there is Bill the agent, and Bill the patient. But it also holds for all instances of blocked coreference. In Mama bear washed her we have automatically two guises.

T&W appear aware of this, and they add another condition on the identification of guises. They note that in the relevant clause in (35) the subject pronoun is stressed, and propose the generalization that "stress on the pronoun has the effect of presenting [Bill] in a different guise, in virtue of the unexpected property of self-admiration" (p. 93). More generally, stress is a major clue for identifying guises in T&W's analysis, and they argue that except for identity debate contexts, it is required in all cases of local coreference under distinct guises. The heavy stress marks that there is something surprising, and non-characteristic about the situation expressed in the sentence. With this, then, T&W can identify one of the factors that children have to acquire when they eventually reach adult coreference use.

36) "Children must learn that stress … marks the speaker's intention to convey the local coreference interpretation by bringing [the stressed element] into focus." (T&W, p. 205)

While it is true that in many instances of coreference approved by rule I in condition B contexts there is heavy stress on one constituent or the other (in (35) it is on the subject; in (37), on the object), the same stress pattern is found in many other instances where it does not have the effect of allowing coreference. E.g. (37) is a context that T&W believe allows coreference for Mama Bear washed her, with the help of the heavy stress (a judgment not shared by all). But the same stress pattern, with the same sentence, in the contexts of (38) has precisely the opposite effect of enforcing a non-coreference interpretation (Mama Bear could only wash Daisy Duck).

37) Mama Bear did not wash Miss Piggy. Mama Bear washed her.

38 a) First Daisy Duck washed Mama Bear and then Mama Bear washed her.
b) First Daisy Duck washed Miss Piggy and then Mama Bear washed her.

39) Children must learn that stress marks the speaker's intention to convey non coreference interpretation, by bringing the stressed element into focus.

By the same logic, then, we should add to the conditions the child must learn, the one in (39).

A theory equipped with both (36) and (39) can never fail, because it covers the whole domain of options (stress either means coreference, or non-coreference). In a sense, it captures the facts accurately: the child eventually knows that heavy stress is sometimes associated with coreference, and sometimes with non-coreference, as is the state of affairs in the adults' world. Nevertheless, it is not clear that this is the type of theory we want.⁹ A more appealing conclusion for such a state of affairs would be that stress is not the factor that determines coreference options in condition B contexts.
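The sense in which a theory with both (36) and (39) "can never fail" can be shown by enumeration. This is a minimal sketch under a deliberately crude assumption: an observation is encoded as a pair (stressed, coreferential), and (36) and (39) are each read as a bare statement about what stress signals. The encoding and function names are hypothetical and are not T&W's formulation.

```python
from itertools import product

# Hypothetical encoding (not from T&W): an observation pairs whether the
# pronoun is stressed with whether the attested reading is coreferential.
OBSERVATIONS = list(product([True, False], repeat=2))


def compatible_with_36(stressed, coreferential):
    """(36) alone, simplified: a stressed pronoun signals coreference."""
    return coreferential if stressed else True


def compatible_with_36_and_39(stressed, coreferential):
    """(36) plus (39): a stressed pronoun signals coreference or non-coreference."""
    if stressed:
        return coreferential or (not coreferential)
    return True


# Observations each theory would count as counterexamples.
excluded_by_36 = [o for o in OBSERVATIONS if not compatible_with_36(*o)]
excluded_by_both = [o for o in OBSERVATIONS if not compatible_with_36_and_39(*o)]

print(excluded_by_36)    # [(True, False)]
print(excluded_by_both)  # []
```

Under this simplification, (36) alone excludes one kind of observation, while the combined theory excludes none, so no pattern of stress and coreference could count against it.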

If it were indeed possible to reduce all instances of coreference approved by rule I to distinct guises, then there would be no motivation to assume any reference-set computation for coreference, to begin with. It is only necessary to determine for a given sentence whether the two referential occurrences are under the same or a different guise, which does not involve constructing and comparing semantic representations. Indeed, T&W mention, in passing, that possibly, "rule I can be dispensed with entirely" (p. 104). If so, then the processing account is of course unmotivated, and we are left indeed only with pragmatic considerations.

This would not be the first attempt to dismiss the problem posed by coreference computation by enriching the set of referential distinctions, namely, to capture it directly by properties of the participating arguments, rather than by properties of the full representations. A whole family of accounts, starting with Evans (1980), attempted to distinguish coreference from 'referential dependence' and argue that the binding conditions restrict only the latter. E.g. Fiengo and May (1994) argue that coreference is always possible for two given NPs, as long as "it is not part of the meaning of the sentence that they are co-valued" (their linking rule). Like Evans, they do not consider why, then, coreference is not just simply always possible (see the discussion of (10) above). The apparent success of such attempts rests on using undefined notions. Thus, as I mentioned, Evans' insight was in exposing and illustrating virtually all contexts that allow coreference in apparent violation of the binding conditions. But his description of the distinction he assumes would equally allow the same everywhere else. (For a more detailed survey of this point, see Reinhart, 1983.) A theory based on undefined distinctions is always true, by virtue of being unfalsifiable.

9 This either-or condition is reminiscent of the original Principle P that Chien and Wexler (1990) offered to account for coreference: Assuming that binding conditions B and C always enforce contraindexing, the principle says that "contraindexed NPs are noncoreferential unless the context explicitly forces coreference". In other words, contraindexed NPs are either coreferential or not, depending on unspecified context considerations.

T&W are probably aware of the danger of unfalsifiability posed by their description of guises as signaled by heavy stress. So they appear to view this just as a necessary condition (in all but identity debate contexts). Stress alone is not sufficient to determine guise interpretation. In addition, they assume that there are special contextual cues that speakers use as "markers of the speaker's intended interpretation" (p. 105). It is only when these special cues are used that the sentence can be associated with what T&W view as surprising, or non-characteristic traits of the situation expressed by the sentence, which in turn allow coreference. In other words, these cues determine when, by using heavy stress, the speaker actually intends to use the expressions as different guises. It is these cues that the child has to learn. Regarding what these cues are, T&W do not say much, but rather refer the reader to Heim (1998). As we saw, however, Heim argues that no guises are involved at all in the examples under consideration.

On the view of Heim (1998) and Reinhart (1983), determining that coreference is possible here is not based on any cues, but rather on applying logic: If the coreference representation is logically distinguishable from the bound one, coreference is permitted.

This set of presently unspecified contextual conditions (cues) is, then, what children are missing at the age of the experiments, and will acquire in the next couple of months or years. Based on innately specified pragmatic principles of the Gricean sort, "children count on speakers to make their intended interpretation clear whenever possible, using whatever means are at their disposal. Children learn from experience that specific contextual cues accompany the local coreference interpretation, such as the factor of 'surprise'…" (p. 103). How does this learning from experience happen, in the absence of negative evidence? T&W suggest that "once children have witnessed a sufficient number of examples of the local coreference interpretation in contexts that contain the relevant contextual cues, they will thereafter refrain from assigning this interpretation in the absence of these special markers of the speaker's intended interpretation" (ibid).

The underlying assumption of T&W is probably that the acquisition of contextual abilities, and identification of speakers' intentions, is of a different type than found with innate UG principles. Nevertheless, the same learnability questions still arise. As mentioned, actual examples of coreference in condition B environments are quite rare in discourse. One may wonder if by the age of about 6, all children have had sufficient exposure to such uses. One may also wonder how each child decides at this age that the examples in the corpus he has encountered so far cover the whole set of options of use. Let us assume that once the set of "special markers of the speaker's intended interpretation" is defined with some greater precision, these questions can be answered.

4. Explaining chance performance.

Assuming that the pragmatic and the processing accounts may fare roughly the same in predicting the areas of delay in the acquisition of coreference, and even if it turned out that they are equally plausible in terms of learnability, we may still ask whether they both, indeed, explain the experimental findings. To address this, we need to get clearer about what the problem is that requires an explanation.

Though it is standard to describe the experimental findings as indicating a delay in the acquisition of coreference, the findings are much more specific than that. Acquisition delay can take several forms. If children don't know a given rule, or have set the parameter wrongly, the most natural result to expect is in the vicinity of 90-100% non-adult performance. (A variety of different group statistical results is to be expected if children's performance differs individually.) But in all experiments on condition B coreference (of the relevant type - see below), the group statistics of children's performance is around 50%. This in and of itself is a curious result, but it becomes more puzzling once it is established that this is indeed chance performance, taking into consideration the performance of individual subjects. As mentioned, Chien and Wexler (1990) provide statistical analyses of individual performance, showing that many children perform individually at chance level (they sometimes reject and sometimes accept coreference under the same experimental conditions). G&R's point of departure was that chance performance of this kind indicates guessing, which requires an explanation.

Let me first clarify the experimental conditions at which 50% performance is found, as explained in G&R. The target sentence is preceded by, or embedded in another sentence which also provides an antecedent for the pronoun, as, e.g. in (40a).

40 a) This is A. This is B. Is A washing him?
Picture/story context:
b) A washes A
c) A washes B

The story or picture accompanying the sentence includes either the situation in (40b) or in (40c). In both Chien and Wexler (1990), and Grimshaw and Rosen (1990), children had no problem answering "yes" in the vicinity of 90% in the context (40c), but they had around 50% performance in the context of (40b). It is this condition (40b) (the "mismatch" condition), which is relevant for our discussion. By comparison, Chien and Wexler found that in the same context, if A is a quantified DP, like every bear, rather than a referential DP, children at the age of 5 gave the adult answer "no" 85% of the time.

On G&R's account, the reason why chance performance occurs only for (40b) is that only in this context (clause c of) rule I needs to be consulted. Though G&R do not explain this, rule I applies when a coreference interpretation is considered. In the context (40c) the option of coreference is not suggested by the context, so there is no reason for the child to even examine the option in deciding his answer. In (40b), by contrast, a coreference interpretation corresponds to the context situation. So rule I has to determine whether the target sentence allows coreference (in which case the answer to (40a) is "yes") or not (with the answer "no").¹⁰ Since in this sentence binding is disallowed, clause c of rule I needs to be processed. Adults would complete the task successfully and answer "no", but children cannot complete the execution, hence they perform at chance, or guess.

10 This is in general the case with interface reference-set computation. In Reinhart (forthcoming) I argue that the same computation is found with QR and stress shift for focus. In these cases as well, the computation needs to be carried out only if the relevant interpretation is considered. This means, e.g. that not all interpretations of quantifier scope are equally complex. To compute whether in (i) a woman can take scope over every bear no special computation needs apply.
(i) A woman washed every bear.
But if the option of a QR interpretation is considered (wide scope for every bear) a semantic reference set needs to be constructed, so the computation is more costly.

Not all subsequent experiments confirmed chance performance also at the level of individual children, but Thornton and Wexler (1999) point out that usually the experiments' results have not been sufficiently analyzed statistically to determine that (no comparison was made with the binomial model of the results expected from guessing between two choices). In the detailed statistical analysis of T&W of their own experiments, a similar pattern to that of Chien and Wexler (1990) was found. By their own conclusions, the binomial model was confirmed at least for a group of about 75% of the children (15 out of the 19 subjects).

The group's performance on condition B sentences like Bert brushed him was approval of coreference 58% of the time.¹¹ The individual subject data reveals that out of the 19 subjects, 8 children accepted 3/4 or 4/4 trials, 7 children accepted 2/4 trials, 1 child accepted 1/4 trials and 3 children accepted 0/4 trials. Note that seven children showed an equal number of "yes" and "no" on the four trials of the same condition. But this is not the only indication of individual chance performance (since chance allows different individual numbers). The combined group results are almost consistent with the binomial model for guess selection between two options. T&W point out that the number of correct answers (adult-like coreference rejection) in 3/4 or all 4 of the trials is a bit higher than the probability in a binomial model: The model predicts 2 such children, while there are 4 (p. 175). T&W propose to identify these 4 children as a separate subgroup. For the other 15 children (or, statistically, for 17 of the 19 children), the response pattern is fully consistent with the binomial model of pure chance, or guessing.

11 In the VP ellipsis sentences like Bert brushed him and the Tin man did too, coreference acceptance rate was 43%. Their combined rate is 50.5%. Usually, the more results are combined in a chance pattern performance, the closer it gets to precisely 50%.

As for the subgroup of 4 children that rejected coreference in condition B environments, T&W assume that they have reached adult knowledge. In their terms, this means they have mastered early the cues to guise identification, or (if correct), in G&R's terms, this would mean that their working memory has developed early, so they are able to execute the computation. Technically, only 2 children diverge from the binomial statistics, as we just saw. But T&W followed this group in all conditions of the sequence of experiments, and they found out that the same children perform equally well in all conditions involving coreference in condition B environments. (E.g. in the ellipsis condition with sentences like Bert brushed him and the Tin Man did too, this subgroup permitted coreference incorrectly in 1/16 trials, precisely the same result as in the non-ellipsis condition Bert touched him, that we have just examined.)

This uniform behavior across conditions justifies singling out all four children as a separate group. However, T&W's conclusion that the reason they are singled out is that, unlike the other children, they have reached adult knowledge does not follow automatically. In fact, it is not consistent with another of T&W's findings. In the VP ellipsis tests of condition C that we discussed in (28), repeated in (41), the children as a group performed at chance level, allowing the construal he cleaned Flash Gordon 54% of the time. But on this task, unlike the condition B tasks, there is no significant difference between the two groups of children, as seen in (42) (T&W p. 200).

41) The kiwi bird cleaned Flash Gordon and he did too.

42) Acceptance of the interpretation 'He cleaned Flash Gordon & he = Flash Gordon':
a) Group I (4 children): 44% (7/16)
b) Group II (15 children): 57% (32/56)

The adult control group accepted coreference here 83% of the time, but the group of 4 children that are presumably "little adults" in their knowledge of guises (or rule I) performed here at 44%, which is in the range of chance performance. (T&W do not provide the individual data for this group of 4 children on this condition.) Recall that this "structured meaning context" is an instance of coreference ruled in by rule I, although binding is ruled out here by condition C. In T&W's analysis this is a case of distinct guises. If the four children in question had mastered adults' guise understanding, as T&W assume in order to explain their performance on condition B tasks, they should have manifested this also in the present task.
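The arithmetic behind (42) can be checked with a short sketch. It pools the two groups' counts to recover the overall rate and runs an exact two-sided binomial test against a 50% guessing rate. Treating every trial as an independent coin flip is an assumption made for the illustration, and the function names are hypothetical; this is not T&W's own analysis.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k acceptances in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: total probability of all outcomes
    that are no more likely than the observed one."""
    observed = binom_pmf(k, n, p)
    return sum(
        binom_pmf(i, n, p)
        for i in range(n + 1)
        if binom_pmf(i, n, p) <= observed * (1 + 1e-9)
    )


# Counts as reported in (42): (acceptances, trials).
group_1 = (7, 16)
group_2 = (32, 56)

pooled_accept = group_1[0] + group_2[0]
pooled_trials = group_1[1] + group_2[1]
pooled_rate = pooled_accept / pooled_trials  # 39/72, about 54%

p_group_1 = two_sided_p(*group_1)  # roughly 0.80
p_group_2 = two_sided_p(*group_2)

print(round(100 * pooled_rate), round(p_group_1, 2), p_group_2 > 0.05)
```

On these assumptions the pooled rate matches the 54% reported for the children as a group, and neither group's count is distinguishable from guessing at the conventional 0.05 level.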

It is in principle possible that when children are unable to execute a given task, some of them would develop some sort of a strategy (default) to deal uniformly with such tasks without applying the difficult procedure. Children operating by a strategy end up performing uniformly across different tasks, and, depending on the default strategy and the experimental condition, it can happen to be the adult-like response. It is not crucial for the present discussion to determine what strategy could explain the full data of the performance of these four children.¹² Still the facts suggest that a strategy is at play, for these children, in the case of condition B. It seems rather less likely that they would have adult knowledge in one case and not in the other.

12 It is possible, in fact, to formulate such a strategy, but it is hard to see where it could come from: It would be to disallow coreference whenever a pronoun can be bound, skipping rule I altogether. This would rule out coreference in condition B environments but not in (40), where the pronoun cannot be bound. Hence, rule I still needs to be processed, leading to the familiar failure and guessing. There is another interesting finding of T&W that appears consistent with such a strategy. This regards the strict interpretation of reflexives in VP ellipsis contexts such as (ib), namely, the question whether children allow a coreference interpretation for the reflexive in (ia), as opposed to its interpretation as a bound variable. On this issue, there is no reason to expect 50% performance, under either theory, and indeed it was not found. But the two groups performed dramatically differently, as summarized in (ii) (T&W p. 195).
i a) Hawkman fanned himself.
b) Hawkman fanned himself and the baby boy did too.
ii Acceptance of the strict interpretation of (ib) (The baby boy fanned the Hawkman)
a) Group I (4 children): 13% (1/16)

In any case, abstracting away from the group of four, the crucial finding confirmed again in

T&W's experiments is that at least for the majority of children, performance in Rule I environments is at chance level, consistent with individual guessing..

So, a crucial question about a given analysis of coreference delay is whether it can explain this specific guess pattern, which, as mentioned, is not a common finding in all areas of acquisition. Even if not all children show this pattern, those who do need explaining. The processing analysis of G&R has taken this finding as its point of departure, and it provides a straightforward answer: A guess pattern is found when, on the one had the child knows what needs to be computed to provide an answer, and on the other hand, he is not able to complete the task. So, given that there are two options to choose from - "yes" or "no", the choice is arbitrary - guessing.

On the pragmatic account of T&W, it is hard to see how chance performance could be derived even for a minority of the children. On their account, the source of children's coreference delay is their extension of the conditions allowing distinct guises: They permit distinct guise-interpretation in a super set of the conditions under which adults permit them

(T&W's 'extended guise creation', p. 102). Their performance, then, should be determined by the size and properties of the superset they adopt. Suppose, e.g. children accept freely what

T&W labeled "role reversal guises", namely they allow every thematic role to correspond to a separate guise. In this case they should always allow coreference in condition B environments, because (like in any other instance of coreference) the two occurrences have different thematic roles. So their performance should be close to 100% acceptance. Suppose they take heavy stress as always allowing distinct guises. Then their performance should depend on the experimental conditions. In sentences where heavy stress is used, they should allow, again, coreference at the range of 100%. But if heavy stress is avoided (as in most of the experiments) - they should perform "adult like" and disallow coreference at the same range. So the guises super-set analysis can indeed predict correctly that children's performance would differ from adults, but it cannot predict the specific way it differs, namely the actual findings of individual chance performance.

References b) Group II (15 children): 81% (21/26)

The 4 children that T&W identified as little adults disallowed it; the others allowed it. From the perspective of rule I alone, coreference should be permitted in (ia), because clause b of rule I does not hold - Hawkman can bind himself . (See the discussion of (20) in section 2. The question why this is not an option taken by adults more commonly is an independent issue, to which I do not know the answer.) Children who apply rule I will be able to get this far, and since this clause does not hold, they will allow coreference here. Children who bypass rule I and operate by the strategy just outlined will rule out coreference in (ia) because the reflexive can be bound.


Adams, A.-M. and Gathercole, S.E. (1996). Phonological working memory and spoken language development in young children. Quarterly Journal of Experimental Psychology, 49A, 216-233.

Baddeley, A.D. (1986). Working Memory. Oxford: Oxford University Press.

Chien, Y.-Ch., and K. Wexler (1990). Children's knowledge of locality conditions in binding as evidence for the modularity of syntax and pragmatics. Language Acquisition 1, pp. 225-295.

Chierchia, G. & M.T. Guasti (2000). "Backwards vs. Forward Anaphora: Reconstruction in Child Language", Language Acquisition, 8.2: 129-170.

Chierchia, G., S. Crain, M.T. Guasti, A. Gualmini and L. Meroni (2001). "The Acquisition of Disjunction: Evidence for a Grammatical View of Scalar Implicatures", in Anna H.-J. Do et al. (eds), BUCLD 25 proceedings, Somerville, MA: Cascadilla Press: 157-168.

Chomsky, N. (1981). Lectures on Government and Binding. Foris, Dordrecht.

Crain, S. and McKee, C. (1985). The acquisition of structural restrictions on anaphora. In S. Berman, J.W. Choe and J. McDonough (Eds), NELS 16, Amherst: University of Massachusetts, GLSA.

Crain, S., W. Ni, and L. Conway (1994). Learning, Parsing, and Modularity. In Ch.

Crain, S. and R. Thornton (1998). Investigations in Universal Grammar: A guide to experiments on the acquisition of syntax and semantics. Cambridge, MA: MIT Press.

Dowty, D. (1980). 'Comments on the Paper by Bach and Partee.' In K.J. Kreiman and A.E. Ojeda (Eds), Papers from the Parasession on Pronouns and Anaphora, Chicago Linguistic Society.

Evans, G. (1980). "Pronouns" Linguistic Inquiry Vol 11, no 2, p. 337-362.

Fiengo, Robert and Robert May (1994). Indices and Identity, MIT Press, Cambridge, Mass.

Fox, Danny (1995). Economy and scope. Natural Language Semantics 3:283-341.

Fox, Danny (1998). Locality in variable binding. In Is the best good enough? Optimality and competition in syntax, ed. Pilar Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis, and David Pesetsky, 129-155. Cambridge, MA: MIT Press and MIT Working Papers in Linguistics.

Gathercole, S.E. and A. Adams (1993). Phonological working memory in very young children. Developmental Psychology, 29, 770-778.

Gathercole, S. and A. Baddeley (1993). Working memory and language. Essays in cognitive psychology. Hove: Lawrence Erlbaum.

Gathercole, S. and G. Hitch (1993). Developmental changes in short-term memory: a revised working memory perspective. In Collins, A. F., Gathercole, S. E., Conway, M. A., Morris, P. E. (eds.), Theories of Memory. Hove: Lawrence Erlbaum, 189-211.


Grimshaw, Jane and Sara Thomas Rosen (1990). Knowledge and obedience: the developmental status of the binding theory. Linguistic Inquiry 21: 187-222.

Grodzinsky, Yoseph and Reinhart, Tanya (1993). 'The innateness of binding and coreference', Linguistic Inquiry 24(1): 69-102.

Gualmini, A., S. Crain, L. Meroni, G. Chierchia and M.T. Guasti (2001). At the Semantics/Pragmatics Interface in Child Language. Proceedings of SALT XI, 231-247, Cornell University, Ithaca, NY.

Heim, Irene (1998). Anaphora and semantic interpretation: A reinterpretation of Reinhart’s approach. In The interpretative tract, ed. U. Sauerland and O. Percus. MIT Working Papers in Linguistics 25. MITWPL, MIT, Cambridge, MA.

Ingram, D. and C. Shaw (1981). The comprehension of pronominal reference in children. MS. University of British Columbia, Vancouver.

Keenan, E. (1971). "Names, quantifiers and a solution to the sloppy identity problem", Papers in Linguistics, vol. 4, no. 2.

Lust, B. and T. Clifford (1982). The 3D study: Effects of depth, distance and directionality on children’s acquisition of Mandarin Chinese. In J. Pustejovsky and P. Sells (eds.), NELS 12. Amherst: University of Massachusetts, GLSA.

Lust, B., K. Loveland and R. Kornet (1980). The development of anaphora in first language. Linguistic Analysis 6, 217-249.

Reinhart, T. (1976). The syntactic domain of anaphora, PhD dissertation, MIT, Cambridge, Mass. Distributed by MITWPL.

Reinhart, T. (1983). Anaphora and semantic interpretation. Croom-Helm; Chicago University Press.

Reinhart, T. (2000). 'Strategies of anaphora resolution'. In Hans Bennis, Martin Everaert and Eric Reuland (eds), Interface Strategies. Royal Netherlands Academy of Arts and Sciences, Amsterdam, the Netherlands.

Reinhart, T. (forthcoming). Interface Strategies. Cambridge, MA: MIT Press.

Reuland, E. (2001). Primitives of Binding. Linguistic Inquiry 32.3: 439-492.

Solan, L. M. (1978). Anaphora in child language. Unpublished doctoral dissertation, Univ. of Massachusetts, Amherst.

Smith, Edward E. (1999). Working memory. In Robert A. Wilson and Frank C. Keil (eds), The MIT Encyclopedia of the Cognitive Sciences. Cambridge, MA: The MIT Press.

Thornton, Rosalind, and Kenneth Wexler (1999). Principle B, VP ellipsis and interpretation in child grammars. Cambridge, MA: The MIT Press.


Thornton, R. (1990). Adventures in long-distance moving: The acquisition of complex wh-questions. Unpublished doctoral dissertation, Univ. of Connecticut, Storrs.

Tavakolian, Susan L. (1977). Structural principles in the acquisition of complex sentences. Unpublished doctoral dissertation, Univ. of Massachusetts, Amherst.

Taylor-Browne, K. (1983). Acquiring restrictions on forwards anaphora: A pilot study. In Calgary Working Papers in Linguistics 9. Calgary, Alberta: University of Calgary, Department of Linguistics.
