Natural Language and Linguistic Theory manuscript

Feature sharing in agreement

Dag Trygve Truslew Haug · Tatiana Nikitina

Abstract This article discusses the mechanism of feature sharing in the analysis of agreement across theories. We argue that there are agreement phenomena that require an agreement mechanism which is both symmetric and feature sharing. Our main argument relies on a Latin nominalized clause construction which has until now remained ill understood. We show that this construction requires a feature sharing and symmetrical approach to agreement. We also show that phenomena in Tsez and in Algonquian that have so far been described in terms of long distance agreement lend themselves to a treatment in terms of feature sharing, and we look at the consequences for the theory of agreement. We show that there are also cases of agreement which resist a feature-sharing treatment. This means that we cannot pin down a single agree mechanism. Some agreement phenomena require feature sharing, others do not, and yet others are incompatible with feature sharing.

Keywords Agreement · Feature sharing · Long distance agreement · Latin

1 Introduction

Although deceptively simple on the surface, agreement has in recent years proven to be a complex phenomenon and a rich source for linguistic theorizing. In its canonical form, agreement involves a set of features being realized in two different positions. However, the features are only ‘real’ (in the sense of being either inherent or syntactically or semantically interpretable) in one of these positions, called the controller, while they are redundant in the other position, called the target, cf.
(1) from Latin.¹

(1) rosa        spinosa          floruit
    rose:NOM;F  thorny:NOM;F;SG  bloomed:PST;3SG
    ‘The thorny rose bloomed.’

Dag Trygve Truslew Haug
Department of Philosophy, Classics, History of Arts and Ideas, PO Box 1020 Blindern, 0315 Oslo, Norway
E-mail: d.t.t.haug@ifikk.uio.no

Tatiana Nikitina
UMR 8135 LLACAN, CNRS, 7 rue Guy Moquet, 94801 Villejuif, France
E-mail: tavnik@gmail.com

¹ To make the examples easier to read, information that can be expressed in the translation of the word is not repeated in the glosses, e.g. there is no gloss for number on nouns. The following glosses are used in the paper: 1 - first person; 3 - third person; I - IV - noun class I - IV; ABL - ablative; ABS - absolutive; ACC - accusative; CAUS - causative; COMP - complementizer; CONJ - conjunct; DAT - dative; DEF - definite; DIR - direct; EMPH - emphatic; ERG - ergative; F - feminine; GEN - genitive; IC - initial change; IMPF - imperfect; INCL - inclusive; INF - infinitive; LOC - locative; M - masculine; N - neuter; NEG - negation; NMLZ - nominalization; NOM - nominative; OBJ - object; OBL - oblique; PASS - passive; PFV - perfective; PL - plural; PRES - present; PRF - perfect; PST - past; PTCP - participle; REFL - reflexive; SBJV - subjunctive; SECOBJ - second object; SG - singular; SUBJ - subject; TA - transitive animate; TI - transitive inanimate; TRANS - transitive.

Table 1 Analytical choices in theories of agreement

              No feature sharing     Feature sharing
Asymmetric    Standard Minimalism    e.g. Frampton and Gutmann (2000); Pesetsky and Torrego (2007)
Symmetric     Standard LFG           e.g. Kathol (1999); Ackema and Neeleman (2013)

The noun rosa is feminine because this is an inherent property of the noun, nominative because it heads an NP whose grammatical function demands a nominative (syntactic interpretability), and singular because it denotes a single object (semantic interpretability).
The adjective spinosa carries the same features, but in this case they are neither inherent in the lexeme nor interpreted. So rosa is the controller and spinosa the target.

Such examples motivate the simple and powerful idea that agreement is just feature matching: the redundant exponents of features on the target must match those on the controller. This leads to an asymmetric view of agreement, since the status of the two sets of features is fundamentally different. In particular, since the target features must match those of the controller, the controller cannot be underspecified for features that are present on the target. However, the target can be underspecified, since there is no matching requirement in the other direction.

Not all instances of agreement lend themselves easily to an asymmetric account. For example, Pollard and Sag (1994, 64) point to agreement with null (‘prodrop’) arguments, giving the Polish examples in (2), where the verb agrees with a null subject.²

(2) kochałem   I(masc) loved
    kochałam   I(fem) loved
    kochałeś   you(masc) loved
    kochałaś   you(fem) loved
    kochał     he loved
    kochała    she loved

To maintain an asymmetric view of agreement, we are essentially forced to assume that the examples in (2) involve a multiplicity of phonetically null pronominals, one for each distinct form of the verb. However, if we abandon asymmetry, we can have underspecified controllers. Hence, we can simply assume that there is a single null argument, which is unspecified for gender and person. In the general case, the target and the controller in a symmetric approach cospecify information about a single syntactic entity and in null argument structures it just so happens that all the information comes from the target.

Symmetry is one dimension, then, along which theories of agreement may differ.
Traditionally, derivational theories of syntax assumed asymmetric agreement and non-derivational theories symmetric agreement, but recently Ackema and Neeleman (2013) have argued for symmetric agreement within an otherwise derivational, minimalist approach. We return to symmetry in more detail in section 1.1.

Another dimension where theories of agreement may differ is whether they assume what we will call feature sharing, i.e. whether the features involved in an agreement configuration are available in the syntactic loci of both the controller and the target, or only in that of the controller. Returning to (1), it is clear that both rosa and spinosa are morphologically singular and equally clear that only rosa is semantically singular, since spinosa denotes a property and hence semantic number does not apply at all. The question, then, is whether the syntax goes with the morphology or with the semantics. Is there a syntactic feature NUMBER sg in the syntactic locus of spinosa or not?

The standard answer is no. On the asymmetric, “matching” view of agreement, spinosa is usually taken to have a non-interpretable NUMBER feature which gets deleted during the derivation. On the symmetric view of agreement found in LFG, it is normally assumed that spinosa contributes a feature directly to its head, so again there is no feature sharing. But some linguists, working both within symmetric and asymmetric theories, have assumed feature sharing. This is true for example of Kathol (1999), who works in HPSG, but also of Pesetsky and Torrego (2007), who uphold the standard minimalist distinction between interpretable controller features and uninterpretable target features, but develop a view which dissociates interpretability and valuation: when Agree matches a pair of uninterpretable and interpretable features, the result is that the features are valued in both the controller and the target position. In other words, the features in the target position will be valued, but uninterpretable.
This yields an asymmetric theory with feature sharing. Similar views are found in e.g. Frampton and Gutmann (2000); Legate (2005); Bobaljik (2008). We discuss feature sharing in more detail in section 1.2.

We conclude that the question of the best architecture for a theory of agreement is both non-trivial and of cross-theoretical interest. Table 1 sums up the analytical choices that some selected theories of agreement make.

In this paper we argue that there are phenomena that require an agreement mechanism which is both symmetric and feature sharing. Our main argument relies on a Latin nominalized clause construction which has until now remained ill understood: this construction is presented in section 3, and its special agreement properties are analyzed in section 4. We will also show that phenomena that have so far been described in terms of long distance agreement (see e.g. Polinsky, 2003; Boeckx, 2009) lend themselves to a treatment in terms of feature sharing (section 5), although the converse is not true: current theories of long distance agreement cannot deal with the Latin data.

² A similar point was made by Barlow (1992).

Section 6 looks at the consequences for the theory of agreement. We show that there are also cases of agreement which resist a feature-sharing treatment. This means that we cannot pin down a single agree mechanism. Some agreement phenomena require feature sharing, others do not, and yet others are incompatible with feature sharing. Neither the feature sharing nor the non-feature sharing theory of agreement is more expressive than the other: they are needed for different phenomena. This does not seem to be the case with the distinction between symmetric and asymmetric agreement.
Asymmetric theories are mostly justified by metatheoretical concerns, such as restrictiveness of the theory; but this means that if a single phenomenon can be shown to require symmetry, as the Latin data in fact do, the justification for asymmetry disappears. We do not want to rule out that empirical justification for asymmetric agreement could eventually be found, but in its absence we tentatively conclude that only symmetric agreement is required, and hence that only the two types of agreement defined by the absence or presence of feature sharing are needed.

1.1 The symmetry of agreement

While (2) shows that an asymmetric theory of agreement has to adopt an uneconomical analysis of agreement with a null argument, there is at least a way out by positing multiple null pronouns. But there are other cases that seem to be incompatible with an asymmetric approach. Consider so-called ‘unagreement’ in Spanish, as in (3) (= Ackema and Neeleman 2013, ex. 27a, originally from Harmer and Norton 1957, 270). Our discussion here will be cursory, as the goal in this section is to illustrate the theoretical and empirical issues that are at stake in the discussion on symmetry, not to argue that the Spanish data force a symmetric analysis. For more details we refer to Ackema and Neeleman (2013) and the references there.

(3) ¡Qué  desgraciad-as     somos   las       mujer-es!
    how   unfortunate-F.PL  be:1PL  DEF;F;PL  women(F)-PL
    ‘How unfortunate we women are!’

Clearly, the person features on the verb somos and its subject mujeres do not match: if we take mujeres to be specified as third person, there is a direct contradiction, and if we take mujeres to lack a person feature (e.g. by underspecification or by defining the third person as the absence of a person feature), then the verb (the target) appears to specify a person feature that is not present in the NP (the controller).
To deal with such facts, an asymmetric theory of agreement will either have to assume a hidden subject with the appropriate features (to which the overt ‘subject’ is in apposition or anaphorically linked), or use hidden features. In this particular case the latter solution would involve the unlikely assumption that there is a noun mujeres that carries a first person feature.³ The first solution seems more promising, but Ackema and Neeleman (2013, p. 22) argue that the distribution of such unagreeing subjects matches that of regular subjects (in particular, they need not be clause-peripheral).

A symmetric theory seems better placed to deal with these facts, since target features have the same status as controller features. In (3), then, the controller mujeres is simply lexically underspecified for person and hence compatible with any person value on the target verb, whose features are just as much bearers of information (rather than being matching, ‘superfluous’ features) as controller features are.⁴ We can illustrate the situation as in (4), continuing with the simplification that the third person is represented as the absence of a person feature.⁵

\[
\text{target}\quad
\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\;\not\sqsubseteq\;\not\sqsupseteq\;
\begin{bmatrix}\text{GENDER} & \text{f}\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad\text{controller}
\tag{4}
\]

We can relate such feature structures by subsumption. A feature structure $f$ subsumes (symbolically $f \sqsubseteq g$, negated $f \not\sqsubseteq g$) another structure $g$ iff $f$ is at least as general (contains the same or less information) as $g$. Subsumption is a partial order, so there are four possible situations. Either the controller and the target are not comparable (neither subsumes the other), as in (4), or they both subsume each other (they are identical), as in (5), or the target subsumes the controller (6), or the controller subsumes the target (7).

\[
\text{target}\quad
\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\;\sqsubseteq\;\sqsupseteq\;
\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad\text{controller}
\tag{5}
\]

\[
\text{target}\quad
\begin{bmatrix}\text{NUMBER} & \text{pl}\end{bmatrix}
\;\sqsubseteq\;
\begin{bmatrix}\text{GENDER} & \text{f}\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad\text{controller}
\tag{6}
\]

\[
\text{target}\quad
\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\;\sqsupseteq\;
\begin{bmatrix}\text{NUMBER} & \text{pl}\end{bmatrix}
\quad\text{controller}
\tag{7}
\]

³ Alternatively, a reviewer suggests that las could be an underspecified default spellout of a first person transitive D element. For our purposes here, we do not need to dwell on such responses to the analysis in Ackema and Neeleman (2013), as the issue is orthogonal to our discussion.

⁴ As a reviewer notes, this analysis predicts that mujeres could also occur with a second person verb and this prediction is borne out.

⁵ See e.g. Dalrymple and Kaplan (2000) for a more sophisticated treatment using set-valued features where the third person is the empty set.

The notion of (a)symmetry in theories of agreement relates to the status of the target features: in asymmetric theories target features are not independent but must be licensed by a corresponding feature on the controller. Asymmetric theories therefore come built in with the strong claim that universally, the agreement features of the target must subsume the agreement features of the controller, i.e. only (5) and (6) are allowed by universal grammar, and the situations in (4) and (7) do not occur. The strength of this claim obviously depends on how surface exceptions such as (3), which seems to instantiate (4), are dealt with. We will see that the Latin case we discuss in this article is particularly challenging for asymmetric theories.

Let us now see more concretely how agreement works in an asymmetric theory.⁶ We illustrate the standard minimalist model in (8), which gives a simplified tree for (3).

\[
\text{NP}\;\varphi\begin{bmatrix}\text{GENDER} & \text{f}\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\qquad
\text{V}\;u{:}\varphi\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\tag{8}
\]

Like other features, agreement (φ-) features come in two types: interpretable and uninterpretable ones. The latter are prefixed with u: in (8) and occur on the agreement target. Features that are uninterpretable must be deleted before they reach Logical Form (LF), and this deletion occurs via checking (matching) against an interpretable counterpart.
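The subsumption relations in (4)–(7), and the requirement that every uninterpretable target feature be checked against an interpretable counterpart, can be made concrete in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: feature structures are flat dictionaries, third person is represented as the absence of PERSON, and the names `subsumes`, `classify` and `checks_asymmetrically` are hypothetical helpers, not part of any of the formalisms cited.

```python
from typing import Dict

# Assumption: a flat attribute-value structure, e.g. {"NUMBER": "pl"}.
FeatureStructure = Dict[str, str]


def subsumes(f: FeatureStructure, g: FeatureStructure) -> bool:
    """True iff f is at least as general as g (every pair in f is also in g)."""
    return all(g.get(attr) == val for attr, val in f.items())


def classify(target: FeatureStructure, controller: FeatureStructure) -> str:
    """Return which of the four configurations (4)-(7) a pair instantiates."""
    t_sub_c = subsumes(target, controller)
    c_sub_t = subsumes(controller, target)
    if t_sub_c and c_sub_t:
        return "identical"                    # as in (5)
    if t_sub_c:
        return "target subsumes controller"   # as in (6)
    if c_sub_t:
        return "controller subsumes target"   # as in (7)
    return "incomparable"                     # as in (4)


def checks_asymmetrically(target: FeatureStructure,
                          controller: FeatureStructure) -> bool:
    """Asymmetric checking: each target feature needs a matching controller feature."""
    return subsumes(target, controller)


# The four configurations, with the values given in (4)-(7).
ex4 = ({"PERSON": "1", "NUMBER": "pl"}, {"GENDER": "f", "NUMBER": "pl"})
ex5 = ({"PERSON": "1", "NUMBER": "pl"}, {"PERSON": "1", "NUMBER": "pl"})
ex6 = ({"NUMBER": "pl"}, {"GENDER": "f", "NUMBER": "pl"})
ex7 = ({"PERSON": "1", "NUMBER": "pl"}, {"NUMBER": "pl"})

for label, (target, controller) in [("(4)", ex4), ("(5)", ex5),
                                    ("(6)", ex6), ("(7)", ex7)]:
    print(label, classify(target, controller),
          "| checked:", checks_asymmetrically(target, controller))
```

On these assumptions, checking succeeds exactly when the target subsumes the controller, so only the pairs in (5) and (6) pass, while the unagreement pair in (4) does not.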
Thus, if the agreement target has features not present on the controller, they cannot be checked and will remain at LF, causing the derivation to crash. This will be the case in (8), where the uninterpretable PERSON feature does not have an interpretable counterpart. This is why ‘unagreement’ is problematic on an asymmetric approach. As we saw above, one way out would be to assume that the NP in (3) is not itself the subject, but is in apposition to a subject first person plural pronoun.⁷ In this structure, shown on the left side of (9), target and controller features match, and the target features can be checked by the controller and removed, to yield the structure on the right side of (9), which is the input to semantic interpretation.

\[
\text{NP}\;\varphi\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad
\text{V}\;u{:}\varphi\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad\Rightarrow\quad
\text{NP}\;\varphi\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad
\text{V}
\tag{9}
\]

⁶ In derivational theories, especially when making use of covert or string-vacuous movement, the question will also arise whether agreement is upwards or downwards. We do not need to take a position in this debate here (see e.g. Zeijlstra (2012) and Preminger (2013)), as our goal is simply to illustrate the general workings of the theory.

⁷ As we also saw above, this solution is criticized in Ackema and Neeleman (2013). We present it here for expository purposes only.

Contrast this with the symmetric approach, as illustrated with LFG’s standard agreement theory. The idea here is that agreement is co-specification of a single syntactic entity. When a verb agrees with its subject, it directly specifies features of its subject. For example, a simplified feature structure for the verb somos in (3) might be as in (10).

\[
g\begin{bmatrix}
\text{``BE''} & \\
\text{TENSE} & \text{pres}\\
\text{SUBJ} & h\begin{bmatrix}\text{AGR} & \begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}\end{bmatrix}
\end{bmatrix}
\tag{10}
\]

The feature structures are labelled: g is the verb’s feature structure and h that of its subject.
Notice that LFG makes use of recursive feature structures, so the subject’s feature structure is the value of the SUBJ attribute inside the verb’s feature structure. Inside SUBJ in turn, the feature structure AGR bundles the agreement features. Thus, the structure in (10) makes the verb (partially) specify its subject’s features, in particular in this case, the values of PERSON and NUMBER. A simplified feature structure of the subject is given in (11).

\[
i\begin{bmatrix}
\text{``WOMAN''} & \\
\text{AGR} & \begin{bmatrix}\text{NUMBER} & \text{pl}\\ \text{GENDER} & \text{f}\end{bmatrix}
\end{bmatrix}
\tag{11}
\]

Other principles of the grammar ensure that i is identified as the subject of g, i.e. i = h. i and h must therefore unify, and since there are no conflicting features in the two structures, we get the well-formed feature structure in (12).

\[
g\begin{bmatrix}
\text{``BE''} & \\
\text{TENSE} & \text{pres}\\
\text{SUBJ} & h,i\begin{bmatrix}
\text{``WOMAN''} & \\
\text{AGR} & \begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\\ \text{GENDER} & \text{f}\end{bmatrix}
\end{bmatrix}
\end{bmatrix}
\tag{12}
\]

The AGR features of the subject have various origins: PERSON 1 is contributed by the verb, GENDER f by the noun, and NUMBER pl by both the verb and the noun. Agreement results from the unification of two AGR attributes via the solving of an equation: the verb specifies information about h AGR and the subject specifies information about i AGR, so when the grammatical function assignment tells us that h = i, we know that h AGR and i AGR are just two different names for the same feature structure. Importantly, therefore, the resulting feature structure only has a single AGR attribute, in the syntactic locus of the subject: there is no AGR in the outer, verbal feature structure g.

We have illustrated the asymmetric theory with Minimalism, and the symmetric theory with LFG, but the symmetry of agreement cuts across theories. For example, LFG has a mechanism of constraining equations, which unlike other equations merely checks for the presence of a feature, in much the same way as uninterpretable features in Minimalism.
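The unification step that derives (12) from (10) and (11), and the contrast with a constraining equation, can be sketched as follows. This is a minimal sketch under stated assumptions, not LFG's actual solution algorithm: AGR values are flat Python dictionaries, and `unify` and `holds_as_constraint` are hypothetical helper names introduced only for illustration.

```python
from typing import Dict, Optional

# Assumption: an AGR bundle is a flat attribute-value dictionary.
AGR = Dict[str, str]


def unify(a: AGR, b: AGR) -> Optional[AGR]:
    """Merge two AGR bundles; return None if they clash on any attribute."""
    result = dict(a)
    for attr, val in b.items():
        if attr in result and result[attr] != val:
            return None  # conflicting values: unification fails
        result[attr] = val
    return result


def holds_as_constraint(structure: AGR, attr: str, val: str) -> bool:
    """Constraining equation: only checks a value, contributes no information."""
    return structure.get(attr) == val


verb_on_subject = {"PERSON": "1", "NUMBER": "pl"}  # what somos says about h AGR, cf. (10)
noun_agr = {"NUMBER": "pl", "GENDER": "f"}         # i AGR of mujeres, cf. (11)

# Defining equations: both words contribute, and h = i forces unification, cf. (12).
subject_agr = unify(verb_on_subject, noun_agr)
print(subject_agr)

# The same PERSON specification read as a constraining equation is not
# satisfied by the noun's own features, since the noun supplies no PERSON.
print(holds_as_constraint(noun_agr, "PERSON", "1"))
```

On this sketch, PERSON 1 ends up in the subject's AGR only because the verb's specification is treated as defining; read as a constraint, it would fail against (11), which parallels the unchecked uninterpretable feature in (8).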
Although we are not aware of any attempts within LFG to reduce all agreement to constraining equations, specific phenomena have been treated in terms of such equations (see e.g. Andrews (1982) on morphological blocking, Dalrymple and Kaplan (2000) on feature indeterminacy and Wechsler (2011) on mixed agreement). And within Minimalism, Ackema and Neeleman (2013) have argued for a symmetric theory of agreement. In other words, the distinction between symmetric and asymmetric agreement is of cross-theoretical relevance.

1.2 Feature sharing

In addition to the question of symmetry, agreement theories differ in where in the syntax the agreement features are located. There are two possible options: either we assume that agreement features are present in the syntactic loci of both the controller and the target (‘feature sharing’), just as they are morphologically expressed in both positions; or the features are only present in the syntactic locus of the controller, despite their morphological realization in both positions. The latter view is standard in both LFG and Minimalism, so the theories are similar on this parameter.⁸ The motivation is primarily semantic: the features are only represented in the locus where they are interpreted. For example, although the first person feature in (3) is overtly represented on the target (the verb), its interpretation is that the denotation of the controller (the subject NP) is a set including the speaker.

⁸ Similar enough for our purposes, that is. There are two main differences: First, the relevant notion of syntactic locus is different in the two theories. As we just saw, Minimalism typically represents agreement features in the tree structure, whereas LFG situates them in a feature structure, an attribute-value matrix which in graph-theoretical terms is a directed, possibly cyclic graph, not a tree. Second, one could argue that agreement features are not entirely absent from the target locus in standard Minimalism, since they are present as uninterpretable features, as shown in the lefthand side of (9). However, in the end result of the derivation (the righthand side of (9)) they disappear, and hence they cannot act as controllers in another agreement relation at the same time.

But analyses in terms of feature sharing have also been proposed, sometimes within an otherwise asymmetric model (Frampton and Gutmann, 2000; Legate, 2005; Pesetsky and Torrego, 2007; Bobaljik, 2008), sometimes in a symmetric setting (Wechsler and Zlatic, 2003; Ackema and Neeleman, 2013). One way to think about the motivation for this is as an interface problem. When two items agree in a feature, this means they both bear a morphological exponent of the same value for that feature. In that sense, there is symmetry in the morphology, and that is the basis for observing agreement in the first place. By contrast, there is no doubt that semantically, agreement features are licensed (i.e. are either inherent or interpreted) on the controller only.⁹ This is equally crucial to the notion of agreement: if two lexical items are both semantically plural, or both inherently feminine, then we would not say that they agree with each other. Jointly, the morphological symmetry and the semantic asymmetry are necessary and sufficient definitional properties of agreement. They are also necessary and sufficient conditions for distinguishing targets and controllers.

⁹ The situation with case is different from that of number and gender, since case has no semantics. However, case features are typically still interpreted on the controller only in the sense that they specify the controller’s function, not that of the target.

This leaves open what happens at the syntactic level. The standard answer is that syntax pairs with semantics, i.e. the agreement features are only realized in the syntactic locus of the controller, not in that of the target. This is not to say that target features play no role at all: as we saw above, they do serve to ‘filter’ syntactic structures. In derivational approaches, they rule out structures where the uninterpretable features cannot be matched by the agreement mechanism; and in LFG, they rule out feature structures where unification fails. However, neither the V in the output structure of (9) nor the verbal feature structure g in (12) contains the agreement features. Put another way, syntax goes with semantics: the verb is morphologically first person plural, but it is neither syntactically nor semantically a first person plural. Instead, the first person plural features are syntactically represented in the locus of the noun phrase, which is also where they belong semantically, since they indicate that the reference of the noun phrase is plural and includes the speaker; they do not change the interpretation of the verb as such.

This is not a forced conclusion. It is possible to argue that syntax goes with the morphology, i.e. that features are present in the syntactic projection of words where they have morphological exponence, even when they are not licensed semantically. In other words, the mismatch is not between morphology on one side and syntax-semantics on the other, but between morphology-syntax on one side and semantics on the other. We will refer to this view as syntactic feature-sharing. (13) and (14) show what the resulting syntactic structures could look like in Minimalism and in LFG, replacing (9) and (12), respectively. (These are just possible instantiations of a feature sharing analysis, which we will explore in more detail in section 4.3.)

\[
\text{NP}\;\varphi\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad
\text{V}\;u{:}\varphi
\quad\Rightarrow\quad
\text{NP}\;\varphi\,\boxed{1}\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\end{bmatrix}
\quad
\text{V}\;\varphi\,\boxed{1}
\tag{13}
\]

\[
\begin{bmatrix}
\text{``BE''} & \\
\text{TENSE} & \text{pres}\\
\text{AGR} & \boxed{1}\\
\text{SUBJ} & \begin{bmatrix}
\text{``WOMAN''} & \\
\text{AGR} & \boxed{1}\begin{bmatrix}\text{PERSON} & 1\\ \text{NUMBER} & \text{pl}\\ \text{GENDER} & \text{f}\end{bmatrix}
\end{bmatrix}
\end{bmatrix}
\tag{14}
\]

The tag 1 in (14) indicates that the two instances of AGR share the same feature structure value. This type of feature sharing agreement is sometimes assumed in HPSG. For example, Sag et al. (2003, 238) assume the Specifier-Head Agreement Constraint in (15), which ensures that all inflecting lexemes agree with their specifier.

\[
\textit{infl-lxm}:\;
\begin{bmatrix}
\text{SYN} & \begin{bmatrix}
\text{HEAD} & \begin{bmatrix}\text{AGR} & \boxed{1}\end{bmatrix}\\
\text{VAL} & \begin{bmatrix}\text{SPR} & \left\langle \begin{bmatrix}\text{AGR} & \boxed{1}\end{bmatrix} \right\rangle\end{bmatrix}
\end{bmatrix}
\end{bmatrix}
\tag{15}
\]

The index 1 indicates structure sharing (just as in the LFG representation in (14)), the presence of the same value in two different positions in the attribute-value graph. So this constraint enforces structure sharing of the agreement features (AGR) between any inflecting head and its specifier (SPR), leading to feature sharing in our terms.

Not all work in HPSG assumes feature sharing. Pollard and Sag (1994, 82) posit the alternative structure in (16) for a 3.sg. verb in English.

\[
\begin{bmatrix}
\text{CATEGORY} & \begin{bmatrix}
\text{HEAD} & \begin{bmatrix}\text{VFORM} & \text{fin}\end{bmatrix}\\
\text{SUBCAT} & \left\langle \text{NP[nom]}_{\boxed{1}\,[\text{3rd},\,\text{sing}]} \right\rangle
\end{bmatrix}\\
\text{CONTENT} & \begin{bmatrix}\text{RELATION} & \text{walk}\\ \text{WALKER} & \boxed{1}\end{bmatrix}
\end{bmatrix}
\tag{16}
\]

Here too, there is structure sharing, but not feature sharing in our sense. The verb imposes agreement by constraining the first (and only) element on its SUBCAT list to be 3rd person singular and that element is structure shared with the value of the WALKER feature. This way we get a symmetric approach with cospecification of the agreement features. But there is no sense in which the verb itself “has” the features 3rd person singular in the syntax, and the verb could not be the controller of agreement in these features. (It may seem strange to assume that the verb would be an agreement controller, but as we will see, this is what happens in Latin.) In fact there is a strand of work in HPSG (see e.g.
Wechsler and Zlatic, 2003) which assumes that both these agreement mechanisms are available and that feature sharing (15) appears in so-called CONCORD agreement (prototypically CASE, GENDER, NUMBER agreement inside NPs), while standard symmetry (16) is typical of INDEX agreement (prototypically PERSON, NUMBER, CASE agreement in predicate-argument structures). We return to this distinction in section 6.2.

The motivations for assuming feature sharing differ between authors. For some, such as Ackema and Neeleman (2013), which we will discuss in more detail in section 4.3, it appears to be mainly a byproduct of their way of ensuring symmetry. For Legate (2005), it is related to her view of phases in the syntactic derivation. In Pesetsky and Torrego (2007), whose feature sharing theory is not symmetric, feature sharing is mainly a way of passing information from lexical to functional categories, e.g. from the finite verb in vP to TnsP or between a relative phrase in spec,CP and the head C. This is a peculiarity of Minimalism’s Agree and not what we generally associate with agreement phenomena in a theory-neutral perspective.

From a more empirical point of view, the most comprehensive defense of a feature-sharing approach is found in Kathol (1999). One of his arguments is particularly relevant here, namely the observation that a non-feature sharing approach cannot explain why there often is a close morphological relationship between the form of the selector category and the category selected. Consider for example the Latin case in (17) (= Kathol, 1999, ex. 13).

(17) illarum         duarum        bonarum        feminarum
     those:GEN;F;PL  two:GEN;F;PL  good:GEN;F;PL  women:GEN;F;PL
     ‘of those two good women’

Clearly, we can simplify the morphology-syntax interface considerably if we assume that the affix -arum contributes the features genitive, feminine, plural to whatever stem it attaches to.
As Kathol observes, such morphological correspondences are less common in predicate-argument agreement than in NP-internal agreement, but they do exist, cf. (18) (= Kathol 1999, ex. 14, originally from Welmers 1973, 171) from Swahili.

(18) a. Kikapu   kikubwa  kimoja  kilianguka.
        basket   large    one     fell
        ‘One large basket fell.’
     b. Vikapu   vikubwa  vitatu  vilianguka.
        baskets  large    three   fell
        ‘Three large baskets fell.’

Kathol’s argument is conceptual rather than empirical in nature, since we can get the right predictions if we just complicate our theory of how affixes contribute features to the stem they attach to.¹⁰ Nevertheless, it is interesting to observe that Latin participle-subject agreement, which we will present in sections 2–3, displays exactly the same morphological pattern.

So there are several ways of justifying feature sharing. But as far as we are aware, only Legate (2005) has made use of the property of feature sharing which is crucial to our analysis of Latin participles, namely that since the agreement features will be available in the locus of the target, the target can act as a controller of further agreement processes in these same features, yielding ‘cyclic agreement’.¹¹ This, as we will see, is the key to understanding the Latin dominant participle construction and moreover, it allows for a feature sharing treatment of so-called long distance agreement in general without violating syntactic locality, a concept to which we now turn.

1.3 Syntactic locality and multiple agreement

Syntactic locality has long been an important concept in generative grammar. It goes back at least to Chomsky (1965) who formulated a strict locality condition on subcategorization, which was further refined by Kajita (1968).
As pointed out by Sag (2010), the locality of subcategorization gives us a non-trivial prediction that there cannot be a verb evorp which is like prove except it imposes the non-local constraint that its complement clause be transitive:

(19) a. Lee evorped that someone bought the car.
     b. *Lee evorped that someone died.
     c. *Lee evorped that someone ran into the room.

Locality considerations are relevant for many linguistic phenomena; see Sag (2010) for an overview. Most syntactic theories assume some locality principle. Although the exact implementation will differ between frameworks, they all seek to restrict the application of syntactic processes, including agreement, to local domains, and to analyze apparent non-local processes as (sequences of) local processes.

What is a local domain? In a phrase structure grammar, the strictest definition of a local domain will restrict syntactic processes to configurations such as head-specifier, head-complement and head-adjunct. Similar notions may be given dependency-based definitions in dependency grammars and in LFG’s feature structures. For our purposes, however, a more comprehensive and less theory-dependent notion is desirable. We adopt what Polinsky and Potsdam (2001, 609) dub ‘the clause-mate assumption’ in (20).

(20) The controller and the target are in the same clause at some level of syntactic representation.

Following Chomsky (2000), a major conceptual argument in favor of cyclicity/locality is that it allows a major reduction in computational complexity, in the sense that it limits the search space that the linguistic mechanisms have to consider. The smaller we assume local domains to be, the stronger this argument becomes, for the more we limit the search space. Given that there are clear instances of verbs agreeing with oblique arguments, the clause-mate assumption is the strongest locality constraint on agreement that we can plausibly entertain. As such, it is conceptually desirable.
However, much modern work in Minimalism assumes that the clause-mate assumption is too strong. Instead, it assumes that syntactic locality (including in agreement) follows the Phase-Impenetrability Condition (Chomsky, 2000, 108) in (21).

(21) In phase α with head H, the domain of H is not accessible to operations outside α, only H and its edge [i.e. any specifiers of H, our comment] are accessible to such operations.

(21) is more liberal than the clause-mate assumption in that it makes an exception for the edge/specifiers of H. Polinsky (2003) argues empirically that we need this exception in order to account for apparent long-distance agreement phenomena. More radically, Bošković (2007) argues that Agree is not subject to the Phase-Impenetrability Condition at all and that agreement (but not movement) can look into phases, i.e. there are no locality constraints on agreement.

¹⁰ A reviewer objects that it is not the case that ‘morphological identity of exponents is a crucial factor for feature sharing, and that lack of such identity interferes with feature sharing’. We agree, but this is not the force of Kathol’s argument. Rather, the point is that whenever there is formal identity of exponents, feature sharing theories can assume that the exponents are in fact the same. A non-feature sharing theory, on the other hand, will have to conclude that the two surface-identical exponents are in fact different and only one of them expresses an interpretable feature (in Minimalist terms), or contributes information directly about the word it attaches to (in LFG terms).

¹¹ The use of cycles of feature-sharing Agree to pass information up the tree in Pesetsky and Torrego (2007) is similar, but as we noted above this is a theory-internal use of Agree in Minimalism, not connected with what is usually understood as agreement.

Both Polinsky and Bošković base their arguments on so-called long distance agreement.
(22) (= Polinsky and Potsdam, 2001, ex 48a) shows an example from Tsez (Tsezic, Northeast Caucasian). (22) eni-r [už-ā magalu b-āc’-ru-ìi] b-iy-xo mother-DAT boy-ERG bread:III ; ABS III-eat-PST; PTCP - NMLZ III-know-PRES ‘The mother knows the boy ate the bread.’ In Tsez, verbs regularly agree with their absolutive argument in noun class (glossed with Roman numerals). But in (22), we see that the matrix verb b-iy-xo ‘know’ bears a noun class III feature that apparently comes from the noun magalu inside the complement clause. In other words, the matrix verb agrees, not with its own absolutive argument, but with the absolutive of its complement clause. Hence the term ‘long-distance agreement’. Polinsky (2003) and Bošković (2007) both conclude that such agreement is truly long-distance in that it violates the clause-mate assumption in (20). But Polinsky (2003) argues that the weaker Phase-Impenetrability Condition (21) is not violated: magalu ‘bread’ undergoes topicalization to the edge of the complement clause, which means that it is on the edge of that phase and hence available for agreement with the matrix verb. Note that this topicalization must be covert, for magalu in (22) is not overtly at the edge of the complement clause, but rather preceded by the ergative argument už-ā. Bošković (2007) argues against such covert topicalization. He concludes that Tsez long distance agreement is even more radical in that it violates not only the clause-mate assumption but also the Phase-Impenetrability Condition. In fact, he claims there are no locality constraints on agreement, only intervention effects: the verb must agree with the closest eligible controller. Since only absolutives are eligible in Tsez, we get the long distance agreement in (22). We will return to the analysis of Tsez long distance agreement in section 5. 
But let us observe already at this stage that the agreement in (22) seems to be ‘mediated’ via agreement on the verb of the complement clause, b-āc’-ru-ìi, which is also marked for noun class III. This suggests a feature sharing analysis: the embedded verb agrees with its absolutive argument and thereby shares its noun class feature, which in turn is passed up to the matrix verb, i.e., we have cycles of local agreement relations rather than proper long distance agreement. We will explore this analysis in more detail in section 5. For now, we note that such an analysis, if feasible, would account well for the data without violating syntactic locality. If so, we have a powerful conceptual argument in favor of maintaining locality. All varieties of generative syntax hold that a strong locality constraint is a desirable principle because it reduces computational complexity. If some, or even most, syntacticians working within Minimalism now think the strongest possible locality constraint – the clause-mate assumption – does not apply to agreement, it is precisely because of the long distance agreement data. If this data can be explained through feature sharing operating purely locally, we should on conceptual grounds – the reduction of computational complexity – prefer such an account to one that uses long-distance agreement. But the issue is not purely conceptual. We will see in section 4 that a feature sharing, local notion of agreement is needed for the Latin data and in section 5 that such an account will generalize to previously reported data on long distance agreement, while the non-local analyses that have been devised for the latter cases cannot conversely be generalized to the Latin data. In sum, the local analysis is not only conceptually attractive but also empirically more successful.
2 Latin agreement
The specific agreement phenomenon that is the focus of our article occurs in so-called dominant participle constructions.
To motivate our analysis of these, we first briefly survey the different usages of Latin participles and the agreement facts in these contexts. Latin has three different participles. There is a future participle whose distribution in Classical Latin is limited to periphrastic forms. We will ignore this form here. The two other participles are the present (active) and the perfect (passive) participles. These are similar to the English participles in -ing and -ed. The present active and the perfect passive participle both have a variety of uses illustrated in (23)-(29): attributive (23), nominalized (24), subject predicative (25), object predicative (26), periphrasis (27), free predicative (28) and ablative absolute (29). In these examples the participle is bold-faced and its agreeing NP (if present) is in italics. Notice that the attributive (23) and the free predicative (28) are generally identical;12 it is a matter of textual interpretation which analysis is correct for a given example.
12 As far as we can tell from the written text, that is. But it is likely that attributive participles, unlike free predicatives, formed constituents with their nouns. This constituency could have been marked prosodically, but such evidence is of course no longer available to us.
(23) rosa florens pulchra est
rose:NOM;F blooming:NOM;F;SG beautiful:NOM;F;SG is
‘The blooming rose is beautiful.’ (attributive)
(24) medici leviter aegrotantes leniter curant
doctors:NOM;M lightly being.ill:ACC;M/F;PL mildly cure:PRES;3P
‘Doctors cure the lightly ill mildly.’ (nominalized, Cic.
de Off 1.83) (25) rosa florens est rose:NOM ; F blooming:NOM ; M / F / N ; SG is ‘The rose is blooming.’ (subject complement) (26) vidi puerum currentem saw boy:ACC ; M running:ACC ; M / F ; SG ‘I saw the boy running.’ (object complement) (27) puer amatus est boy:NOM ; M loved:NOM ; M ; SG is ‘He was/has been loved.’ (periphrastic perfect) (28) rosa florens pulchra est rose:NOM ; F blooming:NOM ; M / F / N ; SG beautiful:NOM ; F ; SG is ‘A rose is beautiful when it blooms.’ (free predicative) (29) his pugnantibus illum in equum quidam ex them:ABL ; M / F / N ; PL fighting:ABL ; M / F / N ; PL him:ACC in horse:ACC ; M someone:NOM ; M from suis intulit his own:ABL ; M / F / N ; PL mount:PRF.3 S ‘while they were fighting, one from his [attendants] mounted him on a horse.’ (absolute construction, Caes. Gal. 6.30) In all cases except nominalizations, the clause contains an NP that agrees in CASE, NUMBER and GENDER with the participle. Moreover, Haug and Nikitina (2012) argue for an analysis in which the agreeing NP is always the subject of the participle. This analysis is obvious in the periphrastic case (27), and in the subject complement case (25) it follows on a standard analysis of the copula as either a raising verb or an auxiliary. Similarly, in the ablative absolute (29), although the exact structure of the construction and its relation to the matrix clause may be disputed, it is clear that the NP his must be the subject of pugnantibus. In the object complement case (26), the subjecthood of the NP follows automatically from an analysis in terms of exceptional case marking (ECM, the NP is assigned case by the matrix verb but is structurally the subject of the participle) or raising to object (the NP is the thematic subject of the participle but ‘raises’ to the matrix clause and receives case there). If instead we analyze (26) as object control, it is not clear whether puerum or a controlled PRO is the subject of the participle. 
But Latin in fact offers good evidence in favor of identity theories of control, where the controller has two theta roles (Cecchetto and Oniga, 2004). This can be implemented as in the movement theory of control (Boeckx et al., 2010) or via LFG’s functional control mechanism (Bresnan, 1982), but the result is the same: puerum would be the subject of the participle, even on a control analysis. The same holds for the free predicative use in (28). In attributive structures such as (23) it is less obvious that the NP is the participle’s subject. Indeed it is sometimes assumed that attributive elements have no subjects at all. However, reflexives governed by attributive or nominalized participles can be bound by the participle’s semantic subject as in (30)–(31).
(30) testificor autem rursum omni homini circumcidenti se quoniam
testify:PRES;1S but again every:DAT;M/F/N;SG man:DAT;M circumcising:DAT;M/F/N;SG REFL;ACC that
‘I declare to every man who circumcises himself that. . . ’ (Vulgate, Gal. 5:3)13
(31) notavi etiam in porticu gregem cursorum cum magistro se exercentem
observe:PRF;1S also in gallery:DAT troop:ACC;M runners:GEN;M with coach:ABL REFL;ACC practicing:ACC;M/F;SG
‘I also noticed a troop of runners practicing in the gallery with their coach.’14 (Petronius, Satyricon 29)
13 This example is a translation from Greek, but the Greek original has a different structure without a reflexive pronoun, showing that the binding is real Latin.
14 It is crucial here that the Latin verb noto, unlike its English counterpart notice, is only constructed with an NP object, not with a (small) clause.
Latin reflexives must be bound by subjects unless they are used as logophors, which is not the case here. Matrix subjects can bind into participle clauses, but this is clearly not what is happening in (30)–(31). We must conclude that these participles do have subject positions.
The subject in (30) must be either the NP omni homini, or a position controlled by it. Again, the interpretation of such a structure depends on the theory of control that one adopts, but on identity theories of control, which are supported by the Latin data, the conclusion must again be that the NP is the subject of the participle. A similar analysis can be given for “nominalized” participles, implying that they are null head modifiers (Devine and Stephens, 2000, 228-246). It follows that the agreement facts that we see in (23)-(29) can all be stated in very simple terms as in (32). (32) Participles agree in NUMBER, GENDER and CASE with their subject. Moreover, the principle in (32) is just a general instance of predicate-subject agreement in Latin, which can be stated as in (33). (33) If a predicate bears morphological exponents of PERSON, NUMBER, GENDER and/or CASE, then there is obligatory agreement in these features between the predicate and its subject. This principle holds for participles, and also for adjectives in primary and secondary predication (agreeing in NUMBER , GENDER and CASE ), and for finite verbs (which agree in PERSON and NUMBER ). Agreement in PERSON is often (see e.g. Wechsler and Zlatic, 2003) taken as indicative of a different kind of agreement (INDEX-agreement) from agreement in CASE (CONCORD-agreement), with agreement in GENDER and NUMBER being possible in both types. We will return to this distinction in section 6.2, but note that the principle in (33) covers both types of agreement. 3 The dominant participle construction There is one participle construction that we have not yet considered, but which displays very interesting agreement properties. 
This is the so-called dominant participle construction, also known as the ‘ab urbe condita’-construction after the famous instance used in the Roman dating system.15 This construction is a nexus of a participle and a noun that can appear in all nominal contexts, but with clausal semantics, as shown in (34). (The participle is bold-faced and its agreeing NP in italics, as in examples (23)–(29).)
(34) ab urbe condita
from city:ABL;F;SG founded:ABL;F;SG
‘from the city’s founding’
15 The Roman calendar counted the years from the foundation of Rome in (allegedly) 753 BCE.
As indicated in the translation, this construction has a very close analogue in the English nominal gerund construction. We will argue that the structure is essentially the same – that of a nominalized small clause – but that the Latin agreement system obscures this deeper similarity. The structure and history of the dominant participle construction are discussed extensively in Nikitina and Haug (2015). We refer the reader to that paper for a detailed account, including previous scholarship. Here we only survey the most important facts.
3.1 A mixed-category analysis of dominant participles
As far as we are aware, all previous analyses of the dominant participle construction assume that the noun is the syntactic head and the participle is an attributive or sometimes predicative adjunct. The main argument quoted in favor of this analysis seems to be that the participle agrees with the noun. However, we have seen that agreement is not restricted to attributive constructions but is general across all uses of the participle. In fact there are at least three reasons to consider the participle the head of the dominant construction. First, the dominant construction is commonly attested with a pronoun in the nominal slot, as in (35), where the subject is the relative pronoun quibus.
(35) Quibus latis gloriabatur
which:ABL;M/F/N;PL carried:ABL;M/F/N;PL glory:IMPF;PASS;3S
‘[the laws] in the passing of which he gloried.’ (Cic. Phil. 1.10)
Pronouns cannot normally be modified in Latin, so this construction cannot be attributive. Second, the meaning of the construction is clause-like, and (36) allows for a number of clausal paraphrases (37), as noted by Pinkster (1990, 133):16
(36) occisus dictator Caesar aliis pessimum aliis pulcherrimum facinus videretur
killed:NOM;M;SG dictator:NOM;M C.:NOM;M others:DAT;M/F/N worst:NOM;N;SG others:DAT;M/F/N most.beautiful:NOM;N;SG deed:NOM;N perceive:IMPF;SBJV;PASS;3SG
‘the slaying of Dictator Caesar seemed to some the worst, and to others, the most glorious deed.’ (Tac. Ann. 1.8)
(37) a. quod dictator occisus erat pulcherrimum facinus videbatur
that dictator:NOM;M killed:NOM;M;SG be:IMPF;3S most.beautiful:NOM;N;SG deed:NOM;N perceive:IMPF;PASS;3S
‘That the dictator had been killed seemed the most glorious deed.’
b. dictatorem occisum esse pulcherrimum facinus videbatur
dictator:ACC;M killed:ACC;M/N;SG be:PRES;INF most.beautiful:NOM;N;SG deed:NOM;N perceive:IMPF;PASS;3S
‘That the dictator had been killed seemed the most glorious deed.’
The semantics of dictator occisus is propositional, i.e. it denotes the proposition that there was an event in which Caesar was killed. This makes it different from constructions such as ‘the young Isaac Newton’ or ‘a more resolute Roosevelt’, which are often taken as referring to a stage or a manifestation of the head noun (von Heusinger and Wespel, 2006). In a sentence like The dead Caesar frightened everyone, the dead Caesar could be argued to refer to Caesar’s manifestation as dead. On an analysis where stages and manifestations are inherent in the semantics of nouns it would then be possible to preserve the noun’s status as the semantic (and syntactic) head.
But in (36) (and its paraphrases in (37)), the reference is clearly to a proposition, which cannot plausibly be inherent in the nominal semantics. If it were, we would expect the noun alone to be as good a subject as the noun plus participle combination, but it is not, as shown in (38). (38) #dictator pulcherrimum facinus videbatur dictator:NOM ; M most.beautiful:NOM ; N ; SG deed:NOM ; N perceive:IMPF ; PASS ;3 S #‘The dictator seemed a beautiful deed.’ Finally, while the participle is not omissible, the noun can be left out if the verb is impersonal, as in (39). (39) in libris Sibyllinis propter crebrius eo anno de caelo in books:ABL Sibylline:ABL on.account.of more.frequently that:ABL year:ABL from sky:ABL lapidatum inspectis rained.stones:ACC ; M / N ; SG examined:ABL ; M / F / N ; PL ‘. . . in the Sibylline books, which were consulted on account of the fact that it rained stones more frequently from the sky that year.’ (Liv. 29.10) The participle lapidatum is from the impersonal verb lapidare ‘to rain stones’ and consequently, no noun occurs and the dominant construction consists of the participle alone. It is also possible to leave out the NP when it is easily recoverable in the context, as in (40). (40) (For if no one had passed this way since I went indoors, the casket would be lying here. Why say “here?” It’s lost, I guess; it’s done for. It’s all over with unhappy and unlucky me! It’s nowhere, and nowhere am I.) perdita perdidit me lost:NOM ; F ; SG lose:PRF ;3 S me:ACC ‘Its being lost proved my loss.’ (Plautus, Cist. 686) If the participle is the head and the NP its subject, then this is just normal null anaphora (prodrop) of an easily recoverable subject. Similar examples are found in later Latin too. 
16 The two variants in (37) differ in that the first uses a finite complement clause introduced by the complementizer quod, whereas the second uses a nonfinite accusative with infinitive structure (literally, ‘For the dictator to have been killed seemed the most glorious deed’). Both are rendered most naturally in English with a that-clause.
But while there is evidence that the participle is the head of the construction, it is also clear that the external syntax of the construction is nominal, as laid out in detail in Haug and Nikitina (2012) and Nikitina and Haug (2015). For example, the dominant construction can be coordinated with NPs, and it can occur in all nominal positions, including subject, object (of verb and preposition) and adnominal genitive. These facts motivate an analysis of the construction as an NP whose ultimate head is the participle V, which takes the embedded NP as its subject. To keep the two occurrences of NP separate we will use NPc and NPe (mnemonic for their having respectively clausal and entity-type semantics). There could be several intermediate projections between V and NPc, but schematically we can represent the structure as in (41).
(41) [NPc NPe . . . Vhead ]
(41) implies that the dominant participle is a mixed category construction, where the verbal head ultimately projects a noun phrase. The English gerund is a prototypical example of this, e.g. on the analysis of Pullum (1991) illustrated in (42). (We have added the subscripts c and e to clarify the correspondence to (41).)
(42) [NPc [NPe[POSS+] your ] [VP[VFORM:ptcp] [V[VFORM:ptcp] breaking ] [NP [Det the ] [N’ [N record ]]]]]
There is a large literature on mixed categories, which we cannot do justice to here. The fundamental question is what licenses the category mismatch. On Pullum’s account it is the morphology (i.e. the [VFORM:ptcp] feature); in the approach of Bresnan (2001, 100-101, 291-292) the featural decomposition of syntactic categories does the work.
In non-lexicalist treatments the V head typically moves to an abstract nominal head to join the nominalization suffix, schematically as in (43) (which ignores the position of NPe, perhaps in spec,NPc), cf. Bresnan (1997).
(43) [NPc [VP . . . t ] [N [stem V ] suffix ]]
A non-lexicalist, movement-based account may be problematic for Latin (Lapointe, 1999), but we ignore the details here. For more on mixed categories we refer to Nikitina (2008). A more detailed syntactic analysis of the dominant participle construction is given in Haug and Nikitina (2012); Nikitina and Haug (2015).
3.2 Agreement in dominant participles
Whatever the exact analysis of the dominant participle construction, any structure compatible with the schema in (41) will have the same implications for how agreement and information flow through the structure. All linguistic frameworks incorporate a mechanism for passing features from heads to their maximal projections, so all the information at V is in principle available at the position of NPc. Symmetric theories of agreement also provide a mechanism for passing information from the agreeing head V to its subject NPe. But no theory without feature sharing provides a mechanism for passing information in the other direction, from the controller to the target, i.e. in our case from the subject NPe to its agreeing head V. This is the “wrong direction” for information passing via agreement, since in theories without feature sharing (whether asymmetric or symmetric) the information is collected in the locus of the controller (NPe) only, not that of the target (V). And yet the Latin data requires the information to be available at V so that it can in turn flow up to NPc. Consider (44).
(44) ne eum Lentulus et Cethegus . . .
deprehensi terrerent
lest him:ACC L.:NOM;M and C.:NOM;M captured:NOM;M;PL frighten:IMPF;SBJV;3PL
‘lest the capture of Lentulus and Cethegus should frighten him.’ (Sall., Cat 48.4)
The coordinated subject NPe Lentulus et Cethegus induces plural agreement on the dominant participle deprehensi.17 Together they form the nominalized clause Lentulus et Cethegus . . . deprehensi which is the subject of the matrix verb terrerent. As a subject, the clause induces agreement on its predicate, terrerent, in this case third person and plural number. Crucially, this is syntactic agreement, not semantic. One could imagine that Lentulus et Cethegus deprehensi meant ‘the captures of Lentulus and Cethegus’, which would be semantically plural and trigger semantic agreement on the verb terrerent. But there are numerous difficulties with this. First, observe that if a dominant participle is morphologically singular but semantically plural, it does not trigger plural agreement.
(45) ea res saepe temptata . . . tardabat
this:NOM;F;SG thing:NOM;F;SG often attempted:NOM;F;SG delay:IMPF;3S
‘This thing often being tried delayed (his undertakings)’ (Caes. Civ. 1.26.2)
In (45), it is likely that it is the totality of the repeated trying which causes the delay, so the interpretation is collective, or ‘pluractional’. But such a reading is not enough to force a singular in cases where the NPe is plural:
(46) HS xl milia in singulos iudices distributa eum numerum sententiarum conficere debebant
forty thousand sesterces:NOM;N;PL to each judge:ACC;M;PL distributed:NOM;PL;N this number:ACC;M votes:GEN;F;PL make.up:INF ought:IMPF;3P
‘Giving forty thousand sesterces to each judge ought to make up that number of votes’ (Cicero, Pro Cluentio 74)
This suggests that it is the syntactic NUMBER feature of the dominant construction as exposed morphologically on its compound phrases, which governs agreement in these cases, not the semantics.
Moreover, in the material collected in Heick (1936) there is no instance of a mismatch between the NUMBER of the dominant participle construction and the agreeing number morphology on the matrix verb in cases where the dominant participle is a subject. This is unexpected on a semantic agreement theory. It is in fact unlikely that dominant participle constructions like Lentulus et Cethegus deprehensi are semantically plural at all. Latin participles are consistently predicates – unlike e.g. English gerunds they do not have a second existence as event nominals. The semantic result of combining a predicate and its subject is a single proposition, no matter whether the subject is singular or plural. And in fact, when other propositions occur in agreement-triggering environments in Latin, they consistently trigger singular agreement, as in (47).
(47) discordias versari esset necesse
dissensions:ACC;F;PL exist:INF would.be:IMPF;SBJV;3S necessary:NOM;N;SG
‘that dissensions exist would be necessary’ (Cic. Att. 2.1.6)
We conclude that there is every reason to believe that, although the dominant construction in (44) is syntactically plural and triggers plural agreement, it is semantically a proposition and not in any sense a semantic plural. Consider now (48)–(49).
17 Notice that, at least on LFG assumptions, the resolution of the coordinate NUMBER value to plural is entirely internal to the coordination and therefore orthogonal to the question of how agreement should be modelled. Single conjunct agreement does not seem to be attested with dominant participles, but could in principle be captured in exactly the same way as other instances of single conjunct predicate-argument agreement.
(48) unus annus additus labori tuo multorum one:NOM ; M year:NOM ; M added:NOM ; M ; SG labour:DAT; M your:DAT; M / N ; SG many:GEN ; M / N ; PL annorum laetitiam nobis . . .
adferret year:GEN ; M ; PL joy.ACC us.DAT bring:SBJV; IMPF ;3 S ‘Adding one year to your labour would add many years of joy to us.’ (Cic., Q. fr. 1.1.3)
(49) adiecit decus natus eo anno divus Augustus
add:PRF;3S glory:ACC;N;SG born:NOM;M;SG that year:ABL;M;SG divine:NOM;M;SG Augustus:NOM;M
‘The birth of the divine Augustus that year added glory [to Cicero’s consulate]’ (Vell. Paterc. 2.36.1)
In both these examples, as well as in (45) above, we see that the noun and the participle in a dominant participle construction agree with each other, but are also both available to further agreement processes at each end of the chain. In the following we focus on (49), but the reasoning applies to (45), (48) and numerous other examples that can be found in the Latin data. In (49), the phrase natus Augustus, literally ‘Augustus born’, i.e. ‘the birth of Augustus’, forms a dominant participle construction and its two parts agree in CASE, NUMBER and GENDER. The crucial point is that at each end of this agreement chain, the features are available to further agreement processes: hence, Augustus and its adjective divus agree in CASE, NUMBER and GENDER, while natus (as head of the matrix subject constituent) and the matrix verb adiecit agree in NUMBER (and PERSON). For this to be possible, it is clear that the CASE, NUMBER and GENDER on natus and Augustus must be real (in some theory-specific sense) at each end of the chain. This is the challenge that a theory of agreement must overcome in order to account for the Latin data, and as we will see, all non-feature sharing theories of agreement, asymmetric and symmetric, local and non-local, fail to account for this.
4 Formalizing dominant participle agreement
4.1 Local agreement without feature sharing fails
Let us first consider a traditional derivational analysis.
It could look like (50), where we have assumed for concreteness that the clause in the dominant participle construction projects to AspP (given that Latin participles have aspect), which is embedded inside an NPc with an abstract head N (to which the verb would move covertly on an analysis like (43)). For now we ignore CASE, since it is often treated differently from standard φ-features in derivational frameworks, and the standard features are enough to prove our present point; we will return to CASE below.
(50) [TP [VP adiecit_u:φ[NUM=s] decus ] [NPc [AspP [VP natus_u:φ[GEND=m,NUM=s] ] [NPe [AP divus_u:φ[GEND=m,NUM=s] ] [N Augustus_φ[GEND=m,NUM=s] ]]] [N_φ:? ]]]
In this configuration, standard assumptions would make it possible for agreement to match the uninterpretable features on the lower VP with the interpretable features on the lower NPe, ensuring correct subject-predicate agreement and making sure that the uninterpretable features are deleted before they reach Logical Form. Moreover, since the lower NPe has interpretable φ-features, it can be the controller of several agreement processes, including one with its adjective divus. However, there is no way for the lower NPe to pass its features up to the higher NPc, which would be necessary for the construction as a whole to correctly induce agreement with its verb. Alternatively, we could assume that it is the higher NPc that is the bearer of the interpretable φ-features, and the features on the lower NPe and the V are uninterpretable. In this case, we correctly allow external agreement processes (i.e. the main predicate adiecit) to target these features, but we fail to allow agreement between the lower NPe and its adjective, since the lower NPe has no interpretable φ-features. To achieve this, we would need to posit that the lower NPe has both an interpretable and an uninterpretable set of features, which would blur the asymmetric nature of agreement.
There also would not seem to be a way of preventing NPe from agreeing with itself, as it were, if it carries matching sets of both interpretable and uninterpretable φ-features. We conclude that a standard derivational approach is unable to account for the Latin data. In fact, if we uphold the distinction between interpretable and uninterpretable features, and assume there is only one set of the former features, then it seems we cannot explain the data without violating syntactic locality. And we will see below that even non-local analyses break down once we consider CASE. Standard LFG agreement theory does not fare better. It yields the analysis in (51) for (49).
(51)
[ ‘ADD GLORY’
  SUBJ [ ‘BE BORN’
         AGR [ CASE nom, NUMBER s ]
         SUBJ [ ‘AUGUSTUS’
                AGR [ NUMBER s, GENDER m, CASE nom ]
                ADJ [ ‘DIVINE’ ] ] ] ]
There is an AGR attribute in the feature structure of the embedded subject Augustus, which is available for agreement with the adjunct divine and the predicate be born. But we fail to predict that the lower subject’s AGR is available for agreement with the matrix predicate add glory, since it is too deep in the structure: the matrix verb can only agree with its subject, not with its subject’s subject. Therefore, there is a NUMBER feature in the higher subject’s AGR, but this is falsely predicted to be independent of the NUMBER feature in the lower subject’s AGR. We also fail to predict that the matrix predicate assigns case to the construction, again because the relevant CASE feature is too deep in the structure. Instead, we get two independent CASE values, which are wrongly predicted to be able to diverge. We see that the standard agreement theories of both derivational approaches and of LFG have the same problem. The absence of feature sharing makes it impossible to pass the agreement features up the structure and make them available to subsequent agreement relations, assuming we maintain the locality of agreement.
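The problem of independent feature values can be made concrete with a small Python sketch. It is only a toy model under our own assumptions, not a fragment of LFG or of any existing implementation: f-structures are encoded as nested dictionaries, the attribute names AGR, SUBJ, ADJ, NUMBER, GENDER and CASE follow (51), and the `PRED` key, the values `sg`/`pl` and all function names are hypothetical. Feature sharing is modelled as two attributes pointing to one and the same bundle.

```python
def build(shared: bool) -> dict:
    """Build a toy f-structure for (49), with or without feature sharing."""
    inner_agr = {"NUMBER": "sg", "GENDER": "m", "CASE": "nom"}
    # Without sharing, the nominalized clause has its own independent bundle.
    # With sharing, both SUBJ levels point at one and the same bundle.
    outer_agr = inner_agr if shared else {"NUMBER": "sg", "CASE": "nom"}
    augustus = {"PRED": "Augustus", "AGR": inner_agr, "ADJ": [{"PRED": "divine"}]}
    born = {"PRED": "be born", "AGR": outer_agr, "SUBJ": augustus}
    return {"PRED": "add glory", "SUBJ": born}

def matrix_agreement_ok(f: dict, verb_number: str, verb_case: str = "nom") -> bool:
    """Strictly local check: the matrix verb inspects only its own SUBJ's AGR."""
    agr = f["SUBJ"]["AGR"]
    return agr["NUMBER"] == verb_number and agr["CASE"] == verb_case

def make_embedded_subject_plural(f: dict) -> None:
    """Change NUMBER on the embedded subject (cf. the coordinated NPe in (44))."""
    f["SUBJ"]["SUBJ"]["AGR"]["NUMBER"] = "pl"

no_sharing = build(shared=False)
make_embedded_subject_plural(no_sharing)
# A singular matrix verb is still accepted: the two NUMBER values have diverged.
overgenerates = matrix_agreement_ok(no_sharing, "sg")

sharing = build(shared=True)
make_embedded_subject_plural(sharing)
# The change is visible at the clause level, so only a plural verb is accepted.
blocks_singular = not matrix_agreement_ok(sharing, "sg")
accepts_plural = matrix_agreement_ok(sharing, "pl")

print(overgenerates, blocks_singular, accepts_plural)
```

In the first structure the local check wrongly admits a singular matrix verb with a plural embedded subject; in the second, the same local check yields the attested pattern because the two positions hold a single bundle.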
4.2 Long distance agreement without feature sharing fails
On the other hand, it may seem that we can get the facts right if we give up locality, along the lines that were discussed in section 1.3. For example, we could allow the matrix verb adiecit to agree with the subject NPe inside the nominalized clause, as ‘long distance agreement’. This could be implemented either with non-standard agreement equations in LFG or via a revised Agree mechanism in Minimalism (following e.g. Polinsky and Potsdam, 2001) – in fact, agreement in non-standard configurations was one of the main motivations for introducing the operation Agree in Minimalism. However, while this approach will work for number and gender, it will not do for case. We already mentioned that (51) wrongly contains two independent case features. Similarly, in (50), standard assumptions about case assignment would lead us to expect that the higher verb adiecit can only assign nominative case to NPc, not to the embedded NPe. Yet recall that the case marking on NPe in (50) depends on the function of the nominalized clause as a whole within the matrix structure. In (50) we get nominative because that is the case that the verb adiecit assigns to its subject; if the function of the nominalized clause requires another case, this is reflected on NPe (as well as on the participle), as illustrated in (34)–(35). We will see below that in a feature sharing system we can handle this by having adiecit assign case to NPc and letting agreement propagate the case feature through the structure. On the other hand, without feature sharing there is no obvious way to propagate a case value on NPc down the structure. For that we would need long distance case assignment directly to NPe. The usual mechanism for long distance case assignment is Exceptional Case Marking (ECM): adiecit would case-mark its subject’s subject just like English believe case-marks its complement’s subject on an ECM analysis.
However, ECM involves case-marking without thematic role assignment. Therefore, on standard Minimalist assumptions (Pesetsky and Torrego, 2011, 63), ECM can only involve structural case, not quirky or inherent case. This is borne out by Icelandic ECM data, but crucially the Latin dominant participle construction behaves differently: it can appear in any case, including a semantically motivated adjunct case, or inherent case, such as the ablative case assigned by the verb gloriabatur ‘glory in’ in example (35). For that reason, an ECM analysis is not viable and we must conclude that NPc gets its case from agreement with NPe; but this is incompatible with the non-feature sharing approach of Polinsky and Potsdam (2001).

4.3 A symmetric feature sharing analysis

Feature sharing theories fare much better with the Latin data. We first illustrate this with the symmetric theory developed in a Minimalist setting by Ackema and Neeleman (2013).18 This approach borrows ideas from autosegmental phonology and assumes that φ-features (represented as trees rather than attribute-value matrices but that is irrelevant here) are generated independently on verbs and nouns and represented on a different tier from their corresponding lexical items.19 Returning to (3), this could start out as in (52).20

(52)  [NP φ ]   ...   [V φ ]
         |               |
       GEND f          PERSON 1
       NUM pl          NUM pl

They further assume that φ-features unify in agreement, thus giving a symmetric account. Moreover, the unified φ-features are associated with both syntactic loci, so we also have feature sharing. The result is as in (53).

(53)  [NP φ ]   ...   [V φ ]
            \          /
             GEND f
             NUM pl
             PERSON 1

At this stage of the derivation, then, the set of φ-features is available both to the DP and the V. However, Ackema and Neeleman assume that this structure can be affected by ‘dissociation’, which basically deletes a link between a φ-node and a feature bundle. From (53), dissociation could produce the structure in (54).

(54)  [NP φ ]   ...   [V φ ]
         |
       GEND f
       NUM pl
       PERSON 1

(54) is the input to Logical Form (LF), so the features are only interpreted on the NP. Note that the application of ‘dissociation’ is optional and can be repeated, so we could derive various other representations from (53), e.g. by deleting the link between the NP and the feature bundle, or both links. However, these structures would be ill-formed because of the principle of φ-feature licensing given in (55) (= Ackema and Neeleman, 2013, ex. 10), which essentially constrains the application of dissociation.

(55)  At LF, each φ-feature F must be licensed in each position L with which it is associated. F is licensed in L iff
      (i) F is inherent in L’s lexical specification, or
      (ii) F receives a semantic interpretation in L.

This deletion happens in the LF branch of the grammar in the minimalist Y-model, so it is outside ‘core syntax’. In the core syntax itself, then, the agreement features are tied to all the lexical items that carry overt agreement features. This means that the model can directly account for the Latin dominant participle construction. We simply generalize the approach by having the subject NPe, its AP, the dominant participle predicate, the outer NPc and the matrix predicate all share the same set of agreement features, as in (56). Again, we ignore CASE because it is normally treated differently in Minimalism.

18 As already mentioned, there are several other variants of agreement with feature sharing in the minimalist literature. We cannot go through them all here, and choose to focus on Ackema and Neeleman (2013) as a recent and well worked-out, representative theory.
19 In LFG too, it has been proposed to model agreement in a separate structure (Falk, 2006).
20 We have replaced their DP with NP. Nothing hinges on this.

(56)  [NPc φ [NPe φ [AP φ ] ] [V φ ] ]   ...   [V φ ]
      (every φ-node is linked to the same feature bundle)
             GEND m
             NUM s

The same effect can be achieved in other symmetric theories of agreement if they are augmented with feature sharing. In LFG, for example, we get an identical account if we think of the φ-nodes in the upper tier as AGR attributes, and the lower tier as the value of those attributes. This gives an analysis where we enforce agreement not by having a single AGR attribute as in (51), but by having multiple AGR attributes that take the same value through structure sharing.

Let us see how this works in more detail. If Haug and Nikitina (2012) are correct in arguing that in the functional structure, heads are subjects of their modifiers, the generalization in (33) can be captured via the equation in (57) inherent in the lexical entry of any predicate.21

(57)  (↑ AGR) = (↑ SUBJ AGR)

(57) ensures that there is a single value for the two different AGR attributes: technically this is ensured by unification, meaning that information flows in both directions between the two positions. This is opposed to the traditional LFG approach to agreement, which has a single AGR attribute (no feature sharing), whose value can be specified by both the controller and target (symmetry).

(58) gives the constituent structure of (49). For concreteness we have chosen a flat structure, where each clause is an S node dominating the verb and all of its arguments. Nothing hinges on this, for in LFG the relevant agreement relations are established based on function, as in (57), rather than constituent structure.

(58)  S
        NPc[CASE=nom]
          S (↑=↓)
            NPe
              AP          divus[GEND=m,NUM=s,CASE=nom]
              N (↑=↓)     Augustus[GEND=m,NUM=s,CASE=nom]
            Vp (↑=↓)      natus[GEND=m,NUM=s,CASE=nom]
        NP                decus
        Vm (↑=↓)          adiecit[NUM=s]

The terminal nodes are marked with their features: these are part of the lexical entries of the relevant forms and bold-faced where they are inherent. NPc is also marked with a CASE=nom feature that it gets from the verb adiecit, which assigns nominative case to its subject.
This feature is also bold-faced, since NPc is the position that gets case from the matrix verb. Headedness relations are marked with the standard LFG notation ↑ = ↓, which means that features flow between a head daughter and its mother node. Recall that on our analysis of the dominant participle construction, it is a mixed category, concretely an NP whose head is an S.

21 Nothing hinges on the analysis in Haug and Nikitina (2012): it would be possible to dissociate adjunct structures from ordinary subject-predicate structures and treat them with an equation (↑ AGR) = ((ADJ ∈ ↑) AGR). This would be a structure-specific rule associated with adjunction structures rather than with particular lexical entries.

As we have seen, the relevant agreement configurations in Latin are predicate–subject and modifier–head (which we tentatively reduce to predicate–subject following the analysis in Haug and Nikitina (2012)), i.e. the AdjP agrees with NPe, NPe agrees with Vp, and NPc agrees with Vm. Under standard LFG assumptions about agreement, the features in an agreement configuration are only syntactically present on subjects (in predicate–subject agreement). Therefore, the features on natus are only registered in the feature structure of its subject NPe and not available to be passed up the head chain to NPc. This means we get the two independent feature bundles in (51). If on the other hand we revise our assumptions about agreement and assume feature sharing, as in (57), the features on natus – although agreeing with NPe – are still present in the Vp position and can be passed up the chain to NPc. The features are represented syntactically in every locus where they appear on lexical items. The agreement operations link these feature bundles so that instead of the two unconnected AGR attributes we see in (51), we get four connected AGR attributes, as in (59).
(59)  [ “ADD GLORY”
        AGR   [ ]
        SUBJ  [ “BE BORN”
                AGR   [ ]
                SUBJ  [ “AUGUSTUS”
                        AGR  [ NUMBER s, GENDER m, CASE nom ]
                        ADJ  [ “DIVINE”, AGR [ ] ] ] ] ]

In (59), the shared value of AGR is only represented in the embedded subject position, but this is purely graphical: the interpretation of (59) is that the values of the different AGR attributes are token identical. Similar accounts of this type of agreement can be devised within other symmetric theories, e.g. in HPSG, following the theory of Kathol (1999).

We have seen that the Latin case requires feature sharing. Does it also require symmetry? Once we consider where the agreement features originate, it becomes clear that the answer is yes. As indicated by the bold-facing in (58), the GENDER and NUMBER features are controlled by NPe, while the CASE feature is controlled by NPc, since this is the position that is assigned case by the verb. If agreement is asymmetric, we cannot capture this in a single agreement operation, even if we assume feature sharing. We would have to assume two agreement processes: one in which V agrees with its subject NPe in GENDER and NUMBER; and another, non-standard process whereby the subject NPe agrees in CASE with its verb. These two processes would therefore go in opposite directions, making it impossible to define constraints on the directionality of Agree and thereby also undermining asymmetry. By contrast, a feature sharing and symmetric approach directly accounts for the Latin data.

Finally, it is worth noting that although it is the special properties of the dominant participle construction that require a symmetric, feature sharing approach to agreement in Latin, this approach (both on the version in Ackema and Neeleman (2013) and the LFG version) generalizes directly to all other types of agreement, i.e.
we can (but need not, see section 6.1) analyze all Latin agreement as involving feature sharing, as we have done in (59); we do not have to assume that dominant participles require a special agreement mechanism.22 “Worst case” agreement generalizes to the normal case. This is likely to be important in the acquisition of a feature sharing agreement system: while nothing in normal agreement requires that the co-specified feature structure is available in both syntactic positions, there is also nothing that restricts the co-specified feature structure to a single position.

22 Note however that finite subordinate clauses in subject position behave differently from dominant participles. We return briefly to this in section 6.1.

5 Preserving the locality of agreement

We have seen that the Latin dominant participle construction requires feature sharing agreement. Now observe that the configuration that let us establish this is similar to the configurations which have often been analyzed as long-distance agreement: the participle is the target of one agreement process with an argument inside its own clause, but at the same time it is the head of a clause that is in an argument position and controls agreement with a structurally higher verb. This gives the impression that the higher verb agrees with the subject of the embedded clause, i.e. there is long distance agreement. But crucially the agreement is mediated by the intermediate verb (the participle), so there is no violation of locality if we assume feature sharing. We also showed that a long distance agreement analysis does not work for Latin because it cannot account for the case facts. We will now see that the converse is not true: the feature sharing analysis we developed for Latin will carry over to other reported cases of long distance agreement.
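The Latin configuration just recapitulated can be sketched computationally. What follows is a minimal illustration, assuming that f-structures are plain Python dictionaries and that token identity of AGR values corresponds to several keys pointing at one and the same dictionary object. The names `Clash`, `contribute`, `dominant_participle`, `matrix_case` and `marked_case` are hypothetical and introduced only for this sketch. Each lexical item unifies its own features into the shared bundle, so GENDER and NUMBER can come from the noun while CASE comes from the matrix verb, and the order of contributions does not matter.

```python
class Clash(Exception):
    """Raised when two positions specify incompatible values for a feature."""


def contribute(agr, **features):
    """Unify `features` into the shared bundle `agr` (symmetric: any
    position may add information, and conflicting values are rejected)."""
    for attr, value in features.items():
        if agr.setdefault(attr, value) != value:
            raise Clash(f"{attr}: {agr[attr]} vs {value}")


def dominant_participle(matrix_case="nom", marked_case="nom"):
    """Toy version of (59): four AGR attributes with one shared value.

    matrix_case: case the matrix verb assigns to its subject (NPc).
    marked_case: case morphologically marked on Augustus, divus and natus.
    """
    shared = {}  # the single, token-identical AGR value
    divine = {"PRED": "DIVINE", "AGR": shared}
    augustus = {"PRED": "AUGUSTUS", "AGR": shared, "ADJ": [divine]}
    be_born = {"PRED": "BE BORN", "AGR": shared, "SUBJ": augustus}
    add_glory = {"PRED": "ADD GLORY", "AGR": shared, "SUBJ": be_born}

    # The matrix verb contributes CASE (and NUMBER) from the top ...
    contribute(add_glory["AGR"], NUMBER="s", CASE=matrix_case)
    # ... while the noun, adjective and participle contribute from below.
    for item in (augustus, divine, be_born):
        contribute(item["AGR"], GENDER="m", NUMBER="s", CASE=marked_case)
    return add_glory


f = dominant_participle()
print(f["AGR"] is f["SUBJ"]["SUBJ"]["AGR"])  # True: one value, many loci
```

On this sketch, a call such as `dominant_participle(matrix_case="abl", marked_case="nom")` raises `Clash`, which mirrors the observation that the case marked inside the nominalized clause must match the case required by the matrix predicate.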
We cannot here hope to cover the whole issue of long distance agreement, which has engendered extensive theoretical discussion, in particular within Minimalism (see e.g. Koopman, 2006; Boeckx, 2008, 2009, and many others), nor can we deal with all the reported cases of long distance agreement. Instead, in section 5.1, we focus on the three different types of long distance agreement identified by Polinsky (2003). As Polinsky shows, only one of these, which she dubs ‘clause-periphery agreement’, is actually problematic for standard locality assumptions. Accordingly, in section 5.2, we discuss the best-documented cases of clause-periphery agreement (found in Tsez, Polinsky and Potsdam 2001, and in some Algonquian languages, Bruening 2001; Branigan and MacKenzie 2002) and show that, if we assume feature sharing in agreement, the problem simply ceases to exist. (A related point is made in Legate 2005.) Finally, in section 5.3 we show that although these instances of apparent cross-clausal agreement provide the best evidence for feature sharing agreement, there are also some purely clause-internal agreement phenomena that lend themselves naturally to a feature-sharing account.

5.1 Types of long distance agreement

According to Polinsky (2003), (apparent) long distance agreement comes in three types: agreement through mediation, agreement through argument sharing and clause-periphery agreement. In mediated agreement, the controller is represented by an unpronounced coindexed ‘proxy’ in the clause containing a target, as represented schematically in (60) (= Polinsky, 2003, ex 8).

(60)  [IP Subject V+Agr_i NP_i [CP/IP . . . NP_i . . . ] ]

In this structure, which is found in several Algonquian languages (Polinsky, 2003, 284), the apparent long-distance agreement is actually local. (61) shows an example from Blackfoot (= Polinsky 2003, ex (10b)).

(61)  nitsíksstatawa            noxkówa      máxka’po’takssi
      nit-wikixtatwaa-wa        [n-oxko-wa   m-áxk-a’po’takixsi]
      1.SUBJ-want;TRANS-3.OBJ   1-son-3      3.SUBJ-might-work
      ‘I want my son to work.’

(61) has the proxy structure seen in (60), as shown in (62) (= Polinsky 2003, ex (10b)).

(62)  pro-1SG want pro-3SG_i [my.son-3SG_i work ]

That is, the matrix verb does not directly agree with the downstairs subject, but rather with a coindexed ‘proxy’ which has argument status in the higher clause. Polinsky (2003, 285–288) surveys the arguments for this analysis, involving binding facts, indirect referential relationships, split antecedence and more. There is, then, no violation of locality here.

The second type of apparent long distance agreement, agreement through argument sharing, can come about when the controller of the apparent long distance agreement appears in the clause containing the target at some level of syntactic representation. There are two common processes that can give rise to this: raising and clause union (the formation of a monoclausal ‘complex predicate’). Neither of these processes is relevant for the Latin dominant participles, but some of the examples discussed in the literature on Hindi/Urdu (Butt, 1995; Bhatt, 2005; Butt, 2014) are superficially very similar. Consider (63).

(63)  a.  Ram-ne    [rotii     khaa-nii]   chaah-ii
          Ram-ERG   bread:F    eat-INF;F   want-PFV;F;SG
          ‘Ram wanted to eat bread’
      b.  Ram-ne    [rotii     khaa-naa]   chaah-aa
          Ram-ERG   bread:F    eat-INF;M   want-PFV;M;SG
          ‘Ram wanted to eat bread’

In (63-a), the embedded object rotii, the infinitive khaa-nii and the matrix verb chaah-ii all agree in feminine gender. This agreement is optional, as shown in (63-b). The agreement pattern in (63-a) is explained in Butt (1995, 2014) as a series of local agreement relations: the infinitive agrees with its object, and the finite verb in turn agrees with the infinitive. Although Butt does not spell this out, this analysis obviously entails a feature sharing approach to agreement, since the infinitive is valued for gender because of agreement with its object.
However, Butt (1995) and Bhatt (2005) both agree that the infinitive and the matrix verb in (63) form a monoclausal structure, and so the agreement of the “embedded” object and the matrix predicate is in fact entirely local. What makes the Hindi/Urdu case look like the Latin one is the fact that the intermediate verbal head – the infinitive in Hindi/Urdu and the participle in Latin – bears morphological exponents of the agreement features. However, in Hindi/Urdu, the infinitive’s agreement is parasitic: the infinitive cannot agree with its object unless the matrix verb also agrees (although there is dialectal variation, see Bhatt, 2005, 785). In Latin, by contrast, the participle and its subject can and must agree under all circumstances, providing independent evidence for a local agreement process inside the dominant participle construction.

We now turn to clause-periphery agreement, which according to Polinsky (2003) is the only case that actually violates the clause-mate assumption. These examples are also much closer to the Latin case. Clause-periphery agreement is illustrated by the Tsez example (22) that was discussed in section 1.3, repeated here as (64-a). The matrix verb agrees in noun class (glossed with Roman numerals) with the absolutive argument of an embedded clause that is itself in an absolutive position. (64-b) (= Polinsky and Potsdam, 2001, ex 47a) shows that this kind of agreement is optional: the sentential complement can also trigger class IV (abstract nominal) agreement. Notice that the structure of these examples is very similar to what we proposed for Latin. The complement clause is in fact a participle that has been nominalized, just like in our analysis of the Latin dominant participles, with the difference that in Tsez, the nominalization is signalled by the morphology, whereas in Latin it is purely syntactic.
The same complementation strategy is found in the related language Hinuq, where it can also trigger long distance agreement (Forker, 2011, 558–561, 574–585).

(64)  a.  eni-r        [už-ā      magalu          b-āc’-ru-łi]             b-iy-xo
          mother-DAT   boy-ERG    bread:III;ABS   III-eat-PST;PTCP-NMLZ    III-know-PRES
          ‘The mother knows the boy ate the bread.’
      b.  eni-r        [už-ā      magalu          b-āc’-ru-łi]             r-iy-xo
          mother-DAT   boy-ERG    bread:III;ABS   III-eat-PST;PTCP-NMLZ    IV-know-PRES
          ‘The mother knows the boy ate the bread.’

Polinsky and Potsdam (2001, 610) argue that the alternation in (64) is governed by the topicality of the embedded absolutive: long distance agreement (64-a) is only possible when the absolutive is a topic; otherwise the complement clause triggers class IV (abstract nominal) agreement (64-b). They further argue that the topic undergoes covert movement to a left-peripheral position (TopP) in the clause. Finally, they propose that locality in agreement is determined by government, which is looser than the clause-mate assumption and yields the same predictions as the Phase-Impenetrability Condition: a head governs (and hence, following Polinsky and Potsdam (2001, 627), can agree with) its specifier, its complement, elements adjoined to its complement, and the specifier of its complement. The latter type of agreement, Polinsky and Potsdam (2001) argue, is instantiated in Tsez: whenever there is no CP above TopP, the verb can agree with an absolutive in TopP. The structure is shown in (65).

(65)  [IP . . . V+Agr_i [TopicP NP_i [IP . . . t_i . . . ] ] ]

Similar structures (though allowing agreement with both topics and foci) have been proposed for the Algonquian languages Passamaquoddy (Bruening, 2001) and Innu-Aimûn (Branigan and MacKenzie, 2002). (Note that Bruening argues that Passamaquoddy has both clause-periphery agreement and ‘proxy agreement’ as discussed above, in different configurations.)
We now turn to a closer discussion of the data from Tsez, Passamaquoddy and Innu-Aimûn, which on the analysis in (65) clearly violate standard locality assumptions. We will see that we can maintain strict locality (i.e. the clause-mate assumption) if we assume feature-sharing agreement.

5.2 Clause-periphery agreement as local, feature-sharing agreement

In Tsez, as we just saw, long distance agreement is contingent on topicality, and the same holds in Innu-Aimûn, as we will see. In Passamaquoddy, on the other hand, there is evidence that foci can also trigger long distance agreement (Bruening, 2001, 282–283). In none of these languages, however, is the long distance agreement controller necessarily surface peripheral in the embedded clause. For example, we saw in (64-a) that the controller is in situ in the embedded clause. In derivational accounts this is typically captured as covert topicalization. This resort to covert movement may seem suspicious and, although Polinsky and Potsdam (2001, 629–633) provide theory-internal justification, Bošković (2003, 2007) argues that it is problematic on Minimalist assumptions. Also, it is often hard to identify topics in the absence of overt syntactic (or morphological) marking, so that the analysis runs the risk of circularity. Nevertheless, to stay close to the interpretation of the data in Polinsky and Potsdam (2001), Bruening (2001) and Branigan and MacKenzie (2002), we will assume that this description is basically correct and that the overarching generalization is that long distance agreement requires the controller argument to be in an operator position. In LFG terms this means that the agreement controller is functionally identified with an operator function even when it appears in situ.
To capture the Passamaquoddy data we make use of the generalized operator function UDF (Alsina, 2008; Asudeh, 2012), which is not associated with any specific discourse function, rather than the more commonly used TOPIC function.23

Assuming, then, that agreement in Tsez involves feature sharing when the controller is topical, the agreement facts follow directly without any need to tinker with syntactic locality. In (64-a), then, the embedded clause will actually take on the class III feature of its topic and hence induce class III agreement on the matrix verb. In LFG terms we get the following structure:

(66)  [ PRED  ‘know ⟨SUBJ, OBJ⟩’
        SUBJ  [ “MOTHER” ]
        OBJ   [ PRED  ‘eat ⟨SUBJ, OBJ⟩’
                AGR   [ CLASS III ]
                UDF   [ AGR [ ] ]
                SUBJ  [ “BOY” ]
                OBJ   [ “BREAD”, AGR [ CLASS III ] ] ] ]

The embedded object bread is functionally identified with the embedded clause’s operator function UDF, which in turn agrees in a feature-sharing way with its local predicate eat. In this way, the CLASS III feature gets passed up to the clausal level where in turn it agrees with the governing predicate know. Observe also that the morphological facts are similar to the Swahili case (18) brought up by Kathol (1999) in defense of feature sharing agreement: in the upper part of the agreement chain, both the matrix and the embedded verb bear the same morphological exponent b, indicating noun class III. Unlike the Swahili case, the same marker is not present on the lower element magalu, but this is not surprising since the relevant agreement feature here is the noun class, which is obviously inherent in the noun.

A similar analysis of the Tsez facts (though couched in different, derivational terms) is in fact briefly discussed and rejected by Polinsky and Potsdam (2001, fn. 17), based on evidence from pronominalization. The embedded clause retains its class IV specification in a situation of sentential anaphora:

(67)  a.  enir         [už-ā      magalu          b-āc’-ru-łi]             b-iy-xo
          mother:DAT   boy:ERG    bread:III.ABS   III:eat:PST;PTCP.NMLZ    III:know:PRES
          ‘The mother knows the boy ate the bread.’
      b.  nełā      [ža     r-igu/*b-igu       yoł-ƛin]   eƛis
          she:ERG   this    IV:good/III:good   is:COMP    said
          ‘She says it (=that the boy ate the bread) is good.’

23 A reviewer points out it is unclear what exactly is the connection between operator positions and long distance agreement. This is true, but it is equally problematic for a functional and a configurational approach to operators: the configurational approach has the advantage that the operator position is closer in tree-geometric terms to the agreement target, but the disadvantage that the controller does not in fact always appear in this position. The functional approach to operators has the advantage of not predicting contrary to the surface facts that the controller must be in the periphery of the embedded clause, but the corresponding disadvantage that the controller is not closer to its target. So far feature sharing agreement has not been particularly well studied empirically. In the light of the evidence for interaction between information structure and (object) agreement amassed by Dalrymple and Nikolaeva (2011) we would not find it surprising if some of these interactions involve feature sharing.

(67-b) shows that the pronominal reference to the embedded clause from (67-a) has class IV, even if the matrix clause in (67-a) agrees in class III. Polinsky and Potsdam conclude that the embedded clause in (67-a) remains class IV and the matrix verb agrees with the embedded absolutive topic, not the embedded clause. However, there is no reason to expect all features to survive in pronominalization, and in any case, it is well known that pronoun-antecedent agreement tends towards semantic resolution (see for example Corbett, 1979). So this argument is invalid,24 and we think it is better to analyze Tsez in terms of feature sharing agreement, which does not require a locality violation.
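The Tsez alternation in (64) can be sketched along the same lines. This is a toy model under stated assumptions: f-structures are Python dictionaries, functional identification of UDF with OBJ is modelled as two keys pointing at the same object, and the function names `embedded_clause` and `matrix_prefix` are hypothetical. Only the AGR value visible at the clausal level is modelled; the embedded verb's own class III agreement with its absolutive is left aside. The class prefixes b- (III) and r- (IV) are taken from the matrix verb forms in (64).

```python
def embedded_clause(absolutive_is_topic):
    """Build a toy f-structure for 'the boy ate the bread'."""
    bread = {"PRED": "BREAD", "AGR": {"CLASS": "III"}}
    clause = {"PRED": "eat<SUBJ,OBJ>",
              "SUBJ": {"PRED": "BOY"},
              "OBJ": bread}
    if absolutive_is_topic:
        # Functional identification: UDF and OBJ are one f-structure.
        clause["UDF"] = bread
        # Feature sharing between the operator and its local predicate:
        # the clause's AGR is token identical with the UDF's AGR.
        clause["AGR"] = clause["UDF"]["AGR"]
    else:
        # Otherwise the nominalized clause keeps its own class IV
        # (abstract nominal) specification.
        clause["AGR"] = {"CLASS": "IV"}
    return clause


def matrix_prefix(clause):
    """The matrix verb inspects only its own OBJ's AGR: strictly local."""
    return {"III": "b-", "IV": "r-"}[clause["AGR"]["CLASS"]]


print(matrix_prefix(embedded_clause(True)))   # b-  (as in b-iy-xo, 64-a)
print(matrix_prefix(embedded_clause(False)))  # r-  (as in r-iy-xo, 64-b)
```

Note that `matrix_prefix` never looks inside the embedded clause; the apparent long distance effect arises solely because the clause-level AGR is shared with that of the topical absolutive.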
The same analysis can be extended to reported instances of long distance agreement in Algonquian, although we also need to capture the fact that the Algonquian languages have an inverse agreement system where the verb agrees with two arguments. Consider (68) (= Bruening 2001, ex 679b, Bruening 2009, ex (22)) from Passamaquoddy.

(68)  Ma=te      n-wewitaham-a-wiy-ik    mahtoqehsuw-ok   tama    al          n-toli-putoma-n-ok        kcihku-k
      NEG=EMPH   1-remember-DIR-NEG-3P   rabbit-3P        where   UNCERTAIN   1-there-lose-SECOBJ-3P    forest-LOC
      ‘I don’t remember where in the forest I lost the rabbits’

In (68), the matrix verb ‘remember’ agrees with its own first person argument as well as with a downstairs third person plural argument. (Notice that the two agreement slots are not directly associated with specific grammatical functions – that is the job of the DIR morpheme. For our purposes we can ignore this aspect of inverse agreement systems and simply take the verb form as a whole as specifying features of its subject and object, although it is not glossed that way. When the verb form is marked as direct, the first agreement slot indicates subject features and the second slot object features.) The downstairs argument has been fronted to the position before the wh-word of the embedded clause, but Bruening argues that it is still part of the lower clause. Therefore, the agreement between this argument and the higher verb is apparently cross-clausal.

In Passamaquoddy as in Tsez, the controller of the apparent long distance agreement need not be fronted in the surface structure but can remain in situ. It does need to have some special discourse function (Bruening, 2001, 282–283), but unlike in Tsez, it does not have to be a topic. In (69) (shortened from Bruening 2001, ex (697a)), the controller is the wh-word in an embedded question, which is presumably a focus.
(69)  N-kosiciy-a     wen   elomi-ya-t
      1-know.TA-DIR   who   IC.away-go-3CONJ
      ‘I know who left’

The matrix verb is marked as transitive animate in agreement with the wh-word wen. This is one of the important arguments against a ‘proxy agreement’ analysis, since that would require a structure that could be paraphrased ‘I know him_i [who_i left]?’, which has no sensible interpretation.

For our purposes it is important to note that as in Tsez the fronted argument also induces agreement on its local verb. Therefore, the feature-sharing analysis works for Passamaquoddy too. Let us illustrate this with a sketch analysis of (68). First, the agreement morphology on the embedded verb specifies that it takes a first person subject and a third person plural object. In LFG terms, its f-structure looks like (70).

(70)  [ PRED  ‘lose ⟨SUBJ, OBJ⟩’
        SUBJ  [ AGR [ PERSON 1 ] ]
        OBJ   [ AGR [ PERSON 3, NUMBER pl ] ] ]

As in Tsez, we analyze agreement with an argument in the higher operator position as feature sharing. Schematically we then get the structure in (71) for the embedded clause. (We ignore all the material other than the verb and the two arguments it agrees with. Presumably the wh-word is in a distinct operator position, as in Bruening’s analysis.)

(71)  [ PRED  ‘lose ⟨SUBJ, OBJ⟩’
        AGR   [ PERSON 3, NUMBER pl ]
        UDF   [ “RABBITS”, AGR [ ] ]
        SUBJ  [ “I”, AGR [ PERSON 1 ] ]
        OBJ   [ AGR [ PERSON 3, NUMBER pl ] ] ]

24 The same conclusion is reached by Koopman (2006, 174) and Boeckx (2009, 15).

Thus, feature sharing passes up the agreement features from the operator UDF to the verb which heads the complement clause; therefore the matrix verb’s agreement in these features is entirely local.

The same analysis can be extended to the other Algonquian language that has been claimed to have optional long distance agreement, Innu-Aimûn (Montagnais), as seen in (72) (= Branigan and MacKenzie 2002, ex (4)).

(72)  a.  Ni-tshissît-en       kâ-uîtshi-shk       Pûn    utâuia.
          1-remember-TI        PST-helped-3/2PL    Paul   father
          ‘I remember that Paul’s father helped you.’
      b.  Ni-tshissît-âtin     kâ-uîtshi-shk       Pûn    utâuia.
          1-remember-1/2PL     PST-helped-3/2PL    Paul   father
          ‘I remember that Paul’s father helped you.’

In (72-a), the matrix verb is marked as having a first person prefix and is marked as transitive inanimate (TI), reflecting either agreement with the complement clause or some kind of default agreement. The verb of the complement clause is marked as taking a third person subject and a second person plural object. In (72-b) the matrix verb is in the transitive animate class and agrees with the downstairs second plural object (as well as its first person subject), but the agreement in the complement clause stays the same. Therefore such structures can be analyzed as in (71).

The crucial point, then, is that the argument that participates in long distance agreement also participates in local agreement within its own clause. If the local agreement process is feature sharing, the features will be passed up to the verb of the embedded clause, where they can participate in local agreement with the matrix clause, giving the impression of long distance agreement. Nevertheless, it must be admitted that the feature sharing is not as transparently encoded in Algonquian as is the case in Tsez: we do not see the same morphological exponent realized twice in the agreement chain; instead, a complex inverse agreement system is at work, where both the higher and the lower verb agree with two arguments. The two agreement relations are to some extent fused in the morphological system, but at the syntactic level, the fact remains that there is one argument that both the higher and the lower verbs interact with. In section 6.2 we suggest that the difference between the type of agreement that we see in Latin and Tsez versus what we see in Algonquian may reflect different historical origins.
In conclusion, we have strong empirical evidence for the clause-mate assumption: while the extant long distance agreement accounts cannot account for the Latin data presented here, as we saw in section 4.2, the feature sharing, strictly local analysis proposed here for the Latin data does generalize to previously reported data on long distance agreement.

5.3 Clause-internal feature sharing

Finally, we would like to note that although long distance agreement provides the most compelling case for feature sharing, it may also be found in local agreement. A case in point is Archi (Lezgic, Northeast Caucasian). In this language, verbs that can agree always agree with their absolutive argument. This is shown in (73).25 Agreement is in gender class, which is glossed with Roman numerals I - IV as in the Tsez examples above.

25 The Archi examples come from the website of the project From competing theories to fieldwork, http://fahs-wiki.soh.surrey.ac.uk/groups/fromcompetingtheoriestofieldworkarchi/. See also Chumakina and Corbett (2008).

(73)  to-w-mi-s                   Ajša                d-ak:u
      that.one-I.SG-OBL.SG-DAT    Aisha(II)[SG.ABS]   II.SG-see.PFV
      ‘He has seen Aisha (female).’

More surprisingly, other elements in the clause, both arguments and adjuncts, show agreement if they have a morphological slot for agreement. Interestingly, even some adverbs and one postposition have such a slot. (74) shows an agreeing ergative argument (boldfaced).

(74)  nena‹b›u                 hanžugur   Qummar              b-a‹r›ča-r?
      ‹III.SG›1PL.INCL.ERG     how        life(III)[ABS.SG]   III.SG-‹IMPF›carry.out-IMPF
      ‘. . . how (should) we spend our life?’

Clearly it is possible to analyze this as the ergative argument nena‹b›u simply agreeing with the absolutive argument Qummar. However, there is no syntactic dependency between the ergative and the absolutive argument, so this is a puzzling type of agreement.
A better approach may be to assume (following Kibrik 2003, 563–564 but recasting the analysis in our terms; see also Kibrik 1994, 349) that the predicate-absolutive agreement is feature sharing so that the clausal head actually comes to bear a syntactic gender III feature by virtue of agreement with the absolutive. The ergative (and other agreeing elements) can then be analyzed as agreeing with their clausal head. Although this is still an unusual type of agreement since the verb is the controller rather than the target, it would be a less surprising configuration, since there is at least a syntactic dependency between the ergative and its clausal head.

A related phenomenon is found in Warlpiri (Ngarrka, Pama-Nyungan). As discussed by Simpson (1991, 202–214), adverbials in this language agree in case with the event participants they ‘modify’: for example, a manner adjunct modifies the event, but may also have a special relationship to the subject and therefore agree with it in (ergative or absolutive) case. Similarly, location adjuncts may be understood as attributing location to an event participant, and this can induce case agreement. Finally, even time adjuncts may have case agreement, as in (75) (= Simpson, 1991, p. 208, ex (189)).

(75) Jalangu-rlu ka-lu-jana puluku turnu-ma-ni yapa-ngku.
today-ERG PRES-3P.SUBJ-3P.OBJ bullock muster-CAUS man-ERG
‘The people are mustering the cattle today.’

It is optional for the time adjunct to agree in case, but if it does agree, it can only agree with the subject and not with other event participants. However, it is hard to see how time adjuncts can bear a special relationship to the subject. Simpson (1991, 209) provides a tentative analysis which would amount to feature sharing in our terminology: “the subject’s CASE provides a case-feature for the event”. This in turn makes the case available for further agreement processes and the adverbial, then, agrees with the event (i.e.
its governing verb) rather than the subject.

6 Consequences

6.1 Types of agreement

We have seen that the Latin data require symmetric agreement, because NPe controls GENDER and NUMBER while V/NPc controls CASE, so that agreement must work in both directions. They also require feature sharing, since all three features are available for further agreement at both NPe and NPc. In sum, the Latin dominant participle construction requires an approach with both symmetry and feature sharing, either along the lines of Ackema and Neeleman (2013), or the LFG theory outlined above. Moreover, the feature sharing approach enables us to deal with reported cases of long distance agreement and hence avoid violations of syntactic locality.

But we cannot conclude, as seems to be implied in Ackema and Neeleman (2013) (and other work arguing for feature sharing in agreement), that agreement always involves feature sharing.26 While the Latin dominant participle construction shows that feature sharing is needed in some cases, it does not show that it is needed in all cases. In fact, it is not clear that feature sharing is general inside Latin itself. For example, an alternative to the analysis in (59) could assume that agreement involves feature sharing only in the dominant participle case, while other instances of predicate-argument and head-modifier agreement are symmetric in the ordinary way. This yields (76).

26 Their only explicit claim is that agreement is symmetric, but their implementation of symmetry also incorporates feature sharing in the core syntax.

(76)
[ “ADD GLORY”
  SUBJ [ “BE BORN”
         AGR  [1]
         SUBJ [ “AUGUSTUS”
                AGR [1] [ NUMBER s
                          GENDER m
                          CASE   nom ]
                ADJ [ “DIVINE” ] ] ] ]

(76) is a simpler structure than (59). On the other hand, (76) implies that the intuitive surface generalization in (33) does not actually correspond to a unified theoretical treatment of agreement in Latin.
As far as we can tell, there is no empirical evidence in Latin that would let us decide between the two in a clear way.27 It is true that in Latin, finite CPs in subject position behave differently from dominant participles and always induce third singular masculine agreement irrespective of the properties of the subject, just like in English (77).

(77) That I was elected is/*am surprising.

However, we can assume that the complementizer introduces a separate layer of structure so that the verb features do not percolate upwards. Nevertheless, it is possible that a detailed investigation will show that both feature sharing and ordinary agreement are needed for Latin. We leave this matter here.

Although it is possible that all Latin agreement is feature sharing, there is clearly evidence from other languages of agreement that cannot be feature sharing. For example, there are languages where predicates agree with more than one argument, as we saw for Passamaquoddy and Innu-Aimûn above, and as illustrated by (78) from Ostyak (= Dalrymple and Nikolaeva, 2011, 142, ex 2d).

(78) (ma) tam kalaŋ-ət we:l-sə-l-am
I these reindeer-PL kill-PAST-PL.OBJ-1SG.SUBJ
‘I killed these reindeer.’

The predicate agrees with both its subject and its object, which have incompatible NUMBER features. Hence, if there were feature sharing in both these agreement processes, the verb would have an inconsistent feature bundle: feature sharing means that the agreement features are represented in the syntactic locus of the verb itself, so we cannot appeal to distinct agreement slots. (Recall that feature sharing means that the verb takes on the features of its agreeing dependents and is able to act as a controller of those features on another target.) It is still possible that one of the two agreement processes in (78) uses feature sharing, however, so that the verb takes on the features of, say, the subject, while it agrees “normally” (without feature sharing) with the object.
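The inconsistency argument can be made concrete with a small sketch. The Python fragment below is illustrative only: the function `unify`, the exception `Clash`, and the encoding of feature bundles as flat dictionaries are hypothetical choices made for the example and do not correspond to any LFG or HPSG implementation. Feature sharing is modelled as merging a controller's features into the verb's own bundle, while ordinary agreement is modelled as a check that leaves the verb's bundle untouched.

```python
class Clash(Exception):
    """Two sources assign different values to the same feature."""


def unify(bundle, contribution):
    """Add a contribution to a feature bundle, failing on conflicting values."""
    for feature, value in contribution.items():
        if bundle.setdefault(feature, value) != value:
            raise Clash(f"{feature}: {bundle[feature]} vs. {value}")
    return bundle


# Agreement features of the two arguments in (78), reduced to what matters.
subject = {"PERSON": 1, "NUMBER": "sg"}
obj = {"NUMBER": "pl"}


def share_with_both():
    """Feature sharing in both relations: one bundle on the verb itself."""
    verb_agr = {}
    unify(verb_agr, subject)
    unify(verb_agr, obj)  # NUMBER sg meets NUMBER pl
    return verb_agr


def share_with_subject_only():
    """Sharing with the subject; the object is only checked, not absorbed."""
    verb_agr = unify({}, subject)
    object_marker = {"NUMBER": "pl"}  # what the object suffix requires
    unify(dict(obj), object_marker)   # check on a copy of the object's bundle
    return verb_agr


try:
    share_with_both()
except Clash as clash:
    print("inconsistent:", clash)

print(share_with_subject_only())
```

Run as is, the first call fails on NUMBER, whereas the second returns a consistent bundle containing only the subject's features. This mirrors the mixed option mentioned above, in which only one of the two agreement relations involves feature sharing.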
Another example comes from Portuguese inflected infinitives, as in (79) (= Raposo, 1987, ex 27a).

(79) [Eles aprovarem a proposta] será difícil.
they approve-INF.3PL the proposal be.FUT.3SG difficult
‘For them to approve the proposal will be difficult.’

The infinitive agrees in PERSON and NUMBER with its subject, but induces singular agreement on the matrix verb, so it cannot itself be PLURAL, as feature sharing would predict.

27 A reviewer suggests that NPs with possessive pronouns (i) show that agreement cannot always be feature sharing.
(i) filius noster
son.NOM;M;SG our[1.PL]:NOM;M;SG
‘our son’
The features PERSON 1 and NUMBER pl are INDEX features related to the reference of noster ‘our’. But there is no agreement in INDEX features in (i), so the question of feature sharing does not arise. What (i) shows is simply that INDEX and CONCORD features can diverge.

Agreement type     CONCORD                 INDEX
Typical domain     NP                      clause
Typical features   CASE, GENDER, NUMBER    PERSON, NUMBER, GENDER
Table 2 Types of agreement

It is not the case, then, that a feature sharing agreement theory is strictly more expressive (less restrictive) than an ordinary symmetric theory. There are phenomena that can be dealt with within a symmetric theory but not in a feature sharing one (at least not without ancillary hypotheses). This means that we cannot pin down a single agree mechanism. Some agreement phenomena require feature sharing, others do not, and yet others are incompatible with feature sharing. Along the other axis of theoretical variation, symmetry, it is clear that some phenomena require symmetry and others do not; but it is an open question whether there are cases which are incompatible with symmetry. The answer may well be no, for the arguments in favor of asymmetry typically revolve around methodological considerations and the desire to limit the expressivity of the formal theory of grammar.
Laudable as this goal is in principle, it makes for arguments for asymmetry that fall apart once there is a single case that clearly requires the formal theory to be more expressive. It is therefore possible that the symmetry/asymmetry distinction, unlike feature sharing, classifies theories of agreement rather than agreement phenomena. But at this stage, we leave open whether an empirical argument in favour of asymmetric agreement can be made for some agreement phenomena.

While it is possible within e.g. Minimalism to devise several agreement mechanisms, we take the diversity of agreement phenomena to speak against an architecture where the agreement mechanisms are theoretical primitives (like Minimalism’s Agree), because this leads to a multiplication of these primitives. The LFG architecture, by contrast, handles agreement by mechanisms that also appear elsewhere in the theory. Feature sharing is a general mechanism for handling cases where a syntactic entity is present in more than one position in the structure, including topicalization, raising and – as we have seen here – agreement. Standard symmetric agreement is handled by functional descriptions, the general mechanism by which lexical items provide information about themselves and their environment (subject to locality considerations): they are also used e.g. for case assignment, subcategorization and binding. LFG also has a mechanism that could be used for asymmetric agreement, if that should turn out to be necessary: it would be handled by constraining equations, a general mechanism for handling feature checking in LFG.
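The difference between the two kinds of equations can be illustrated with a toy solver. This is a sketch under strong simplifications, not an implementation of LFG: the function `solve`, the flat attribute paths, and the sample equations are hypothetical. Defining equations contribute values to the f-structure, whereas constraining equations only check that some defining equation has already supplied the value, which is what an asymmetric agreement target would need.

```python
def solve(defining, constraining):
    """Return the f-structure described by the equations, or None.

    defining:     (path, value) pairs that add information
    constraining: (path, value) pairs that must already hold
    """
    fstructure = {}
    for path, value in defining:
        if fstructure.setdefault(path, value) != value:
            return None  # two defining equations conflict
    for path, value in constraining:
        if fstructure.get(path) != value:
            return None  # value not supplied by any defining equation
    return fstructure


# Symmetric agreement: controller and target both state NUMBER = sg.
both_define = solve(
    defining=[("SUBJ NUMBER", "sg"), ("SUBJ NUMBER", "sg")],
    constraining=[],
)

# Symmetry also lets the target supply a value the controller leaves open.
target_defines = solve(defining=[("SUBJ NUMBER", "sg")], constraining=[])

# Asymmetric agreement: the target only checks, so an unspecified
# controller leaves the requirement unsatisfied.
target_checks_alone = solve(defining=[], constraining=[("SUBJ NUMBER", "sg")])

# The check succeeds once the controller itself defines the value.
target_checks = solve(
    defining=[("SUBJ NUMBER", "sg")],
    constraining=[("SUBJ NUMBER", "sg")],
)
```

In this toy setting the only difference between symmetric and asymmetric agreement is whether the target's equation is placed in the defining or the constraining list, which is the sense in which no agreement-specific primitive is required.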
In sum, the more abstract nature of LFG’s primitives, which correspond closely to mathematical operations, makes it easier to generalize across grammatical phenomena.28

6.2 The origins of feature sharing and long distance agreement

Across frameworks, there is a widely shared assumption that there are theoretically interesting differences between predicate-argument agreement and agreement inside the noun phrase. In traditional terms, a distinction is often drawn between concord (inside the noun phrase) and agreement (in predicate-argument structures), although these terms are not consistently used (Corbett, 2006, 5–7). Similar distinctions are drawn in Minimalism (Carstens, 2000), HPSG (Wechsler and Zlatic, 2003) and LFG (Dalrymple and King, 2004). Here we follow the HPSG/LFG tradition and distinguish between CONCORD and INDEX, which are prototypically (but not exclusively) associated with agreement inside the noun phrase and in the clause (predicate-argument), respectively. The two types of agreement also prototypically involve different features, possibly for historical reasons which we will only sketch here; see Wechsler (2011, section 5) for a more thorough treatment. Table 2 sums up the differences between CONCORD and INDEX agreement.

HPSG develops an interesting view of INDEX agreement. INDEX features are thought of as properties of discourse referents, partially specifying their referential index. This is why they include PERSON, which may identify the referent, but not CASE, which has no connection with reference. There is a historical explanation for this: predicate-argument agreement typically originates in pronouns via a path from full coreferent pronoun to cliticization to incorporation (Givón, 1976; Bynon, 1992; Corbett, 1995). Hence, coindexation is part of predicate-argument agreement from the outset.
CONCORD agreement has a different origin, most likely in incorporated nominal classifiers (Greenberg, 1978; Corbett, 1991, 2006; Grinevald and Seifart, 2004), so there is no coindexation involved. We therefore predict that CONCORD agreement can involve feature sharing without leading to coreference.

The stronger hypothesis would be that CONCORD always involves feature sharing. This view is implicit in much HPSG work (e.g. Wechsler and Zlatic, 2003) and also in more typologically oriented work such as that of Corbett (2006, 133–137), who argues that NP-internal agreement in case and definiteness results from these features being imposed on the noun phrase as a whole. It may not be possible to decide whether CONCORD is always feature sharing. Crucially, as long as we have CONCORD in its prototypical domain – the NP – we are unlikely to see visible effects of feature sharing, such as long-distance agreement, as in Latin and Tsez, or apparent agreement between coarguments in a clause, as in Archi and Warlpiri. On the other hand, we do observe the frequent identity of the morphological exponents on targets and controllers, which Kathol (1999) used as a conceptual argument for feature sharing.

28 HPSG is similar in this respect, so these observations hold for that framework too.

The effects of feature sharing, then, only become visible if CONCORD agreement appears in non-prototypical environments, such as predicate-argument structures, as in the Latin dominant participle structure. This gives rise to a new question, however: Why do we find CONCORD agreement in predicate-argument structures at all? The Latin case is not unique: Wechsler and Zlatic (2003, 84) argue that secondary (but not primary) predicates in Serbo-Croat agree in CONCORD. (Conversely, Dalrymple and King (2004, 83–87) argue that there are instances of noun-determiner agreement in INDEX.)
If the hypotheses on the origin of CONCORD and INDEX agreement are correct, there must be a mechanism by which CONCORD agreement can spread outside its original, NP-internal domain. The history of the dominant participle construction offers one possible mechanism, namely headedness reversal. As argued in Nikitina and Haug (2015), the dominant participle construction ultimately arose as a result of reanalysis of headedness relations within a temporal expression. Expressions in which a noun with a temporal meaning was originally modified by a participle became reanalyzed as a clausal adjunct in which the participle agrees with its subject:

(80) a. attributive construction “[with] winter [which was] ending”
[NP [NP [N winter]] [AP [A ending]]]
⇒
b. clausal adjunct “[with] winter ending”
[S [NP [N winter]] [V ending]]

The construction’s origin is also what sets it apart from an otherwise similar construction, the English nominal gerund (e.g. my going away). The English nominal gerund is also semantically clausal and has the distribution of an NP. However, the verbal element of the nominal gerund is originally a verbal noun rather than a verbal adjective like the Latin participle. For this reason the subject-predicate relation is established via ordinary case assignment, at first the expected adnominal genitive/possessive case, then accusative (me going away).

To the extent that long-distance agreement (and more generally, feature sharing agreement outside the NP) originates in such headedness reversals of an original NP CONCORD structure, this gives us a number of predictions about long distance agreement, all of which are borne out by the Latin and Tsez data, but contradicted by Passamaquoddy and Innu-Aimûn. First, there should be no long distance agreement in PERSON, both because PERSON is a prototypical INDEX feature and because PERSON is typically not found in NP-internal agreement (Stassen, 1997).
Second, more generally, we expect long distance agreement to involve the same features as in noun-modifier agreement in any given language. Prototypically, these are CASE, NUMBER and GENDER (also known as noun class in Tsez and other languages). Again, we do not expect to find long distance agreement in PERSON, since PERSON is typically absent from noun-modifier agreement. Third, we expect to see the same morphological exponents on both the higher and the lower verb, in line with the observation of Kathol (1999) for NP-internal agreement. Note that we do not necessarily see this exponent on the lower NP which ‘sets off’ the long distance agreement: if the relevant feature is inherent to that NP, it may not have a morphological exponent at all.

Being based on diachrony, these are expectations rather than absolute predictions. First, subsequent evolution could disturb the pattern. Second, there may be alternative origins for long distance agreement. The Latin and Tsez constructions are structurally very similar and wear their nominal origin on their sleeves in that the lower predicate is even synchronically a nominalized form of a participle. The Algonquian constructions discussed in section 5.1 are in this respect very different. The lower verb is finite with no nominal properties and the features relevant to long distance agreement are PERSON and NUMBER. This may point to a different origin of the phenomenon. As we saw in section 5.1, several Algonquian languages have proxy agreement structures that only look like long distance agreement. Bruening (2001, chapter 5) even argues that Passamaquoddy has both proxy agreement and real long distance agreement in the form of clause-periphery agreement.
It is tempting, then, to conclude that Passamaquoddy and Innu-Aimûn clause-periphery agreement arose from proxy agreement structures through the loss of the invisible proxy in the matrix clause, leading to a structure that can be captured synchronically by the same mechanism (feature sharing) as the Latin and Tsez data.

7 Conclusions

In the introduction, we set out the analytic options that we have in dealing with agreement. This yielded a four-way typology with two cross-cutting features, symmetry and feature sharing. Throughout the paper, we have populated the symmetric slots of this typology: symmetric, feature sharing agreement is needed to deal with the Latin case, but – we argued – is also a more general phenomenon, and yields a better explanation of so-called long distance agreement than analyses which postulate violations of syntactic locality. So the locality of agreement is vindicated. However, there are also instances of agreement without feature sharing, as illustrated in (78)–(79). It follows that agreement is not a uniform phenomenon, which suggests it should not be a primitive of syntactic theory.

We also saw that the Latin construction originated in headedness reversal. If this is also true for other cases of long distance agreement, it yields some interesting predictions about its behaviour.

Finally, we would like to stress the implications for syntactic locality, in particular in agreement relations. As discussed in section 1.3, locality is a long-standing desideratum of formal syntactic theories, but one that scholars have recently tended to give up because of apparent long distance agreement in languages such as Tsez, Passamaquoddy and Innu-Aimûn. Instead they have assumed various non-local agreement mechanisms.
In this paper, we have argued that the cyclic feature sharing approach to agreement is not only conceptually superior to long distance agreement, but also empirically more successful since it can account for the Latin dominant participle construction, which is incompatible with current long distance agreement theories, while also generalizing to the previously reported data.

Acknowledgements We thank Oleg Belyaev, Grev Corbett, Mary Dalrymple, Michael Daniel, Marius Jøhndal, Louisa Sadler, associate editor Ad Neeleman and NLLT reviewers for valuable feedback on the research reported in this paper.

References

Ackema, Peter, and Ad Neeleman. 2013. Subset controllers in agreement relations. Morphology 23 (2): 291–323.
Alsina, Alex. 2008. A theory of structure-sharing. In Proceedings of LFG08, eds. Miriam Butt and Tracy Holloway King, 5–25. Stanford: CSLI Publications.
Andrews, Avery. 1982. Long distance agreement in Modern Icelandic. In The Nature of Syntactic Representation, eds. Pauline Jacobson and Geoffrey Pullum, 1–33. Dordrecht: D. Reidel.
Asudeh, Ash. 2012. The Logic of Pronominal Resumption. Oxford: Oxford University Press.
Barlow, Michael. 1992. A Situated Theory of Agreement. New York: Garland.
Bhatt, Rajesh. 2005. Long distance agreement in Hindi-Urdu. Natural Language & Linguistic Theory 23 (4): 757–807.
Bobaljik, Jonathan. 2008. Where’s phi? Agreement as a post-syntactic operation. In Phi-theory: Phi-features across Interfaces and Modules, eds. Daniel Harbour, David Adger, and Susana Béjar, 295–328. Oxford: Oxford University Press.
Boeckx, Cedric. 2008. Aspects of the Syntax of Agreement. London: Routledge.
Boeckx, Cedric. 2009. On long-distance Agree. Iberia: An International Journal of Theoretical Linguistics 1: 1–32.
Boeckx, Cedric, Norbert Hornstein, and Jairo Nunes. 2010. Control as Movement. Cambridge: Cambridge University Press.
Bošković, Željko. 2003. Agree, phases, and intervention effects. Linguistic Analysis 33 (1–2): 54–96.
Bošković, Željko.
2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38 (4): 589–644.
Branigan, Phil, and Marguerite MacKenzie. 2002. Altruism, A-movement, and object agreement in Innu-aimûn. Linguistic Inquiry 33 (3): 385–407.
Bresnan, Joan. 1982. Control and complementation. In The Mental Representation of Grammatical Relations, ed. Joan Bresnan, 282–390. Cambridge, MA: MIT Press.
Bresnan, Joan. 1997. Mixed categories as head sharing constructions. In Proceedings of LFG97, eds. Miriam Butt and Tracy Holloway King. Stanford: CSLI Publications.
Bresnan, Joan. 2001. Lexical-Functional Syntax. Oxford: Blackwell.
Bruening, Benjamin. 2001. Syntax at the edge: Cross-clausal phenomena and the syntax of Passamaquoddy. PhD diss, Massachusetts Institute of Technology.
Bruening, Benjamin. 2009. Algonquian languages have A-movement and A-agreement. Linguistic Inquiry 40 (3): 427–445.
Butt, Miriam. 1995. The Structure of Complex Predicates in Urdu. Stanford: CSLI Publications.
Butt, Miriam. 2014. Control vs. complex predication: Identifying non-finite complements. Natural Language & Linguistic Theory 32 (1): 165–190.
Bynon, Theodora. 1992. Pronominal attrition, clitic doubling and typological change. Folia Linguistica Historica 13: 27–63.
Carstens, Vicki. 2000. Concord in Minimalist theory. Linguistic Inquiry 31 (2): 319–355.
Cecchetto, Carlo, and Renato Oniga. 2004. A challenge to null case theory. Linguistic Inquiry 35 (1): 141–149.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 2000. Minimalist inquiries: The framework. In Step by Step. Essays on Minimalist Syntax in Honor of Howard Lasnik, eds. Roger Martin, David Michaels, and Juan Uriagereka, 89–155. Cambridge, MA: MIT Press.
Chumakina, Marina, and Greville Corbett. 2008. Archi: The challenge of an extreme agreement system. In Fonetika i nefonetika. K 70-letiju Sandro V. Kodzasova, ed.
Aleksandr Arxipov et al., 184–194. Moscow: Jazyki slavjanskix kul’tur.
Corbett, Greville. 1979. The agreement hierarchy. Journal of Linguistics 15 (2): 203–224.
Corbett, Greville. 1991. Gender. Cambridge: Cambridge University Press.
Corbett, Greville. 1995. Agreement (research into syntactic change). In Syntax: An International Handbook of Contemporary Research, eds. Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann, Vol. II, 1235–1244. Berlin: de Gruyter.
Corbett, Greville. 2006. Agreement. Cambridge: Cambridge University Press.
Dalrymple, Mary, and Ronald M. Kaplan. 2000. Feature indeterminacy and feature resolution. Language 76 (4): 759–798.
Dalrymple, Mary, and Tracy Holloway King. 2004. Determiner agreement and noun conjunction. Journal of Linguistics 40 (1): 69–104.
Dalrymple, Mary, and Irina Nikolaeva. 2011. Objects and Information Structure. Cambridge: Cambridge University Press.
Devine, A. M., and L. D. Stephens. 2000. Discontinuous Syntax – Hyperbaton in Greek. Oxford: Oxford University Press.
Falk, Yehuda. 2006. On the representation of case and agreement. In Proceedings of LFG06, eds. Miriam Butt and Tracy Holloway King. Stanford: CSLI Publications.
Forker, Diana. 2011. A Grammar of Hinuq. PhD diss, Universität Leipzig.
Frampton, Jon, and Sam Gutmann. 2000. Agreement is feature sharing. Ms. Northeastern University.
Givón, Talmy. 1976. Topic, pronoun and grammatical agreement. In Subject and Topic, ed. Charles N. Li, 149–188. New York: Academic Press.
Greenberg, Joseph H. 1978. How does a language acquire gender markers? In Universals of Human Language, Vol. 3: Word Structure, eds. J. Greenberg, C. A. Ferguson, and E. A. Moravcsik, 47–82. Stanford: Stanford University Press.
Grinevald, Colette, and Frank Seifart. 2004. Noun classes in African and Amazonian languages: Towards a comparison. Linguistic Typology 8: 34–48.
Harmer, Lewis, and Frederick John Norton. 1957. A Manual of Modern Spanish. London: University Tutorial Press.
Haug, Dag Trygve Truslew, and Tanya Nikitina. 2012.
The many cases of non-finite subjects: The challenge of "dominant" participles. In Proceedings of LFG12, eds. Miriam Butt and Tracy Holloway King, 292–311. CSLI Publications.
Heick, Otto William. 1936. The ab urbe condita Construction in Latin. PhD diss, University of Nebraska.
Kajita, Masaru. 1968. A Generative-Transformational Study of Semi-Auxiliaries in Present-Day American English. Tokyo: Sanseido.
Kathol, Andreas. 1999. Agreement and the syntax-morphology interface in HPSG. In Studies in Contemporary Phrase Structure Grammar, eds. Robert Levine and Georgia Green, 223–274. Cambridge: Cambridge University Press.
Kibrik, Aleksandr E. 1994. Archi. In The Indigenous Languages of the Caucasus. vol. 4 North-East Caucasian languages, part 2, ed. Rieks Smeets, 297–366. Delmar, NY: Caravan Books.
Kibrik, Aleksandr E. 2003. Konstanty i peremennye jazyka. Sankt-Peterburg: Aleteia.
Koopman, Hilda. 2006. Agreement configurations. In defense of "spec head". In Agreement Systems, ed. Cedric Boeckx, 159–199. Amsterdam: John Benjamins.
Lapointe, Steven G. 1999. Dual lexical categories vs. phrasal conversion in the analysis of gerund phrases. In UMOP 24: Papers from the 25th anniversary, eds. Paul de Lacy and Anita Nowak, 157–189. Amherst: University of Massachusetts Graduate Linguistic Student Association.
Legate, Julie Anne. 2005. Phases and cyclic agreement. MIT Working Papers in Linguistics 49: 147–156.
Nikitina, Tatiana. 2008. The mixing of syntactic properties and language change. PhD diss, Stanford.
Nikitina, Tatiana, and Dag Trygve Truslew Haug. 2015. Syntactic nominalization in Latin: a case of non-canonical subject agreement. Transactions of the Philological Society 113 (1): 1–26.
Pesetsky, David, and Esther Torrego. 2007. The syntax of valuation and the interpretability of features. In Phrasal and Clausal Architecture: Syntactic Derivation and Interpretation, eds. Simin Karimi, Visa Samiian, and Wendy K. Wilkins, 262–294.
Amsterdam: Benjamins.
Pesetsky, David, and Esther Torrego. 2011. Case. In The Oxford Handbook of Linguistic Minimalism, ed. Cedric Boeckx, 52–72. Oxford: Oxford University Press.
Pinkster, Harm. 1990. Latin Syntax and Semantics. London: Routledge.
Polinsky, Maria. 2003. Non-canonical agreement is canonical. Transactions of the Philological Society 101 (2): 279–312.
Polinsky, Maria, and Eric Potsdam. 2001. Long-distance agreement and topic in Tsez. Natural Language & Linguistic Theory 19 (3): 583–646.
Pollard, Carl, and Ivan Sag. 1994. Head-driven Phrase Structure Grammar. Chicago: Chicago University Press.
Preminger, Omer. 2013. That’s not how you agree: A reply to Zeijlstra. The Linguistic Review 30 (3): 491–500.
Pullum, Geoffrey. 1991. English nominal gerund phrases as noun phrases with verb phrase heads. Linguistics 29 (5): 763–799.
Raposo, Eduardo. 1987. Case Theory and Infl-to-Comp: The Inflected Infinitive in European Portuguese. Linguistic Inquiry 18 (1): 85–109.
Sag, Ivan. 2010. Feature geometry and predictions of locality. In Features. Perspectives on a Key Notion in Linguistics, eds. Greville Corbett and Anna Kibort, 236–271. Oxford: Oxford University Press.
Sag, Ivan A., Thomas Wasow, and Emily M. Bender. 2003. Syntactic Theory. A Formal Introduction. Stanford: CSLI Publications.
Simpson, Jane. 1991. Warlpiri Morpho-Syntax: A Lexicalist Approach. Dordrecht: Kluwer.
Stassen, Leon. 1997. Intransitive predication. Oxford: Oxford University Press.
von Heusinger, Klaus, and Johannes Wespel. 2006. Indefinite proper names and quantification over manifestations. In Proceedings of Sinn und Bedeutung 11, ed. E. Puig-Waldmüller, 332–345. Barcelona: Universitat Pompeu Fabra.
Wechsler, Stephen. 2011. Mixed agreement, the person feature, and the index/concord distinction. Natural Language & Linguistic Theory 29 (4): 999–1021.
Wechsler, Stephen, and Larisa Zlatic. 2003. The Many Faces of Agreement. Stanford: CSLI Publications.
Welmers, William E. 1973.
African Language Structures. Berkeley: University of California Press.
Zeijlstra, Hedde. 2012. There is only one way to agree. The Linguistic Review 29 (3): 491–539.