THE CROSS-LINGUISTIC RELEVANCE OF HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR CONSTRAINTS IN REGARD TO BULGARIAN¹

Assistant professor Tzvetomira Venkova, PhD
Sofia University St. Kliment Ohridski

Abstract: The paper discusses the cross-linguistic relevance of HPSG constraints as a relation between the universal and the language-specific. Some basic aspects of linguistic universality within HPSG are investigated and problematic issues are identified. Some Bulgarian-specific aspects are presented with regard to their licensing by the universal HPSG constraints.

Key words: HPSG theory, contrastive studies of Bulgarian, formal syntax

The application of the Head-Driven Phrase Structure Grammar (HPSG) framework to the study of languages other than English has been an active research area in recent years. Such comprehensive HPSG studies as Slavic in HPSG (Borsley and Przepiórkowski 1999), German in HPSG (Nerbonne et al 1994) and Romance in HPSG (Balari and Dini 1998) have gained broad popularity in the linguistic community. The investigation of formal-syntactic phenomena in the Bulgarian language within HPSG is a development which deserves further attention and which motivates the need for a systematic investigation of the cross-linguistic relevance of HPSG constraints.

1. Universal and language-specific in HPSG theory

The development of a conception of universal grammar which is rich enough to permit simpler descriptions of particular languages is considered to be a central goal in modern syntactic theory, and in HPSG in particular, cf. Sag et al 2003:296.

Innateness of universal grammar

HPSG follows the generativist tradition of paying special attention to the issues of universal grammar and its formal-theoretic representation. As is well known, the issues of linguistic universals in syntactic theory were brought into focus by N.
Chomsky, who discussed the problem of the ‘condition of generality’ posed by syntactic theory (Chomsky 1957:50) and since the 1960s has developed this line by analyzing the issues of Universal Grammar, defined by him as a ‘set of parameters, conditions, or whatever that constitute the initial state of the language learner, hence the basis on which knowledge of language develops’ (Chomsky 1980:69). The distinction between universal and language-specific determines the Principles and Parameters distinction in Chomsky’s Government and Binding (GB) theory (Chomsky 1981), where differences among grammatical subsystems result from different parameter settings for the various languages. Chomsky’s treatment of linguistic universality is predominantly motivated by biological necessity, i.e. universal grammar is taken to be innate (Newmeyer 1986:72).

¹ The research work in this paper has been carried out with the assistance of the Alexander von Humboldt Foundation.

In general, HPSG follows Chomsky’s innateness claim. C. Pollard and I. Sag define universal grammar as ‘what every linguistically mature human being knows by virtue of being a linguistic creature’ and argue that characterizing universal grammar is the central goal of linguistic theory (Pollard and Sag 1994:14). They distinguish it from a theory of a particular language, a grammar, which characterizes ‘what linguistic knowledge (beyond universal grammar) is shared by the community of speakers of the language’ (ibid).

Universal character of grammatical relations and their hierarchy

Furthermore, HPSG shares with the relationally-based and lexicalist frameworks a different perspective on the issues of universality in language. In contrast to Chomsky’s treatment of grammatical relations as derivative relations between categories, Relational Grammar posited the universal character of grammatical relations, taken as primitives, and produced a universal rule inventory, cf. Perlmutter and Postal 1977.
The accessibility hierarchy of grammatical relations, postulated by Keenan and Comrie 1977, was later adopted and developed by C. Pollard and I. Sag for HPSG, cf. Pollard and Sag 1987, Chapt. 7. HPSG shares the idea of the universality of grammatical relations, referred to as ‘grammatical functions’, with Lexical-Functional Grammar (LFG), cf. Bresnan and Kaplan 1982:i-iii, who postulated that functional structure attributes are universal. The notion of ‘nonderivationality’, the important role of the lexicon and the predicate-argument structures were to a great extent adopted by HPSG in its system of universal constraints.

Formal precision of metalanguage

In HPSG, as in Generalized Phrase Structure Grammar (GPSG), many universals of language follow from the basic architecture of the grammar, in contrast to other frameworks, which have to stipulate them separately (cf. Gazdar et al 1985 and Newmeyer 1986:214 for a discussion of this aspect in GPSG). The universals are stated within the metalanguage, due to the very structure of the theory. HPSG further develops GPSG’s insistence on mathematical precision of formal expression. Universal Grammar is viewed as a ‘mental computational system shared by members of a linguistic community’ (Pollard and Sag 1994:58).

Reference to the task-specificity controversy

Sag et al 2003 argue that HPSG provides an explicit semantic and syntactic analysis that makes it possible to formalize more precisely what is at issue in the debate over so-called task-specificity, i.e. whether the human capacity for language is specialized and distinct in its organization and functioning from other cognitive abilities, or is a side effect of our general intelligence or other abilities, cf. Sag et al 2003:296. They argue that the grammatical constructs they develop are well suited to a theory of Universal Grammar, whether or not that theory turns out to be highly task-specific.
By providing formal representations of data structures and their interactions, HPSG makes it possible to see analogues in other cognitive domains. To justify the claim that the explicitness of HPSG proposals can be helpful in resolving the task-specificity question, Sag et al 2003 consider various components of HPSG theory and find that most of them have elements that are likely to be universal, cf. Sag et al 2003:297. They believe that the increasingly precise linguistic descriptions developed by HPSG theory will help to clarify the nature of the task-specificity controversy. In this respect, they treat learnability as a criterion for theory evaluation.

Clear-cut distinction between universal and language-specific

A very positive aspect of HPSG is that it attempts a clear-cut distinction between universal and language-particular constraints. In Pollard and Sag 1994, Universal Grammar is divided into linguistic ontology, schemata and universal constraints, while Particular Grammar comprises the lexicon, linguistic ontology (a selection from and further articulation of the universal ontology) and schemata (a selection from and further specification of the universal schemata), cf. Pollard and Sag 1994:58. The licensing of expressions in a particular language depends on the interaction among a complex system of universal and language-specific constraints. Both types of constraints must ultimately be realized in a computable form in the minds of the speakers of that language.

In Sag et al 2003, the distinction between universal and language-specific is reflected in the analysis of the various components of the theory, namely the phrase-structure rules, the features and their values, the type hierarchy with its feature declarations and constraints, the definition of phrasal satisfaction, the binding theory and the lexical rules. The authors argue that these components contain many elements, e.g.
the grammar rules, the definition of Well-Formed Structure, the features and their types, that are plausible candidates for playing a role in the theory of universal grammar, cf. Sag et al 2003:298. With regard to the system of constraints, HPSG posits the criterion of decidability, i.e. its formalized version is used as a hypothesis about the structure of human linguistic knowledge in order to develop integrated models of language processing.

2. Problematic aspects of HPSG in regard to cross-linguistic relevance

The issue of linguistic universality and cross-linguistic relevance within the HPSG framework poses problems that need further investigation. Firstly, the authors of HPSG do not provide a comprehensive discussion of the formal relations between universal and language-specific constraints. Their attitude towards treating language-specific principles as parameterized versions of universal ones is not sufficiently consistent. On the one hand, parameter-based accounts of cross-linguistic variation, termed ‘highly speculative’, cf. Pollard and Sag 1994:31, are rejected, since they do not meet a number of requirements, such as a list of parameters with their range of settings, worked-out fragments of languages with specifications of settings etc. (ibid). On the other hand, ‘from time to time’, as the authors put it, they justify the proposal of ‘certain variants of universal principles which might be regarded as parameterized forms of universal principles, in response to the empirical demands of particular languages’. These somewhat controversial theses need to be further specified, as they pose theoretical problems for the HPSG analysis of languages other than English. The HPSG analysis of Bulgarian should be based on a mechanism for enriching the theory with new language-specific constraints without falsifying or weakening the posited universal ones, and on a consistent account of disjunctive language-specific constraints within an integrated HPSG grammar.
Secondly, most HPSG constraints are in general English-based. An advantage of the basic HPSG books, such as Pollard and Sag 1987, 1994, Ginzburg and Sag 2000 and Sag et al 2003, is that they provide examples from other languages and suggest variants for their licensing. However, the inclusion of other languages does not seem sufficient with respect to systematicity, range and effect when universal principles are posited.

3. Other methodological sources for the cross-linguistic application of HPSG

Promising directions for the solution of the above-mentioned problems can be sought in the employment of a shared grammar, cf. Avgustinova, Skut and Uszkoreit 1999, Avgustinova and Uszkoreit 2000, and in the theory of grammatical archetypes, cf. Ackerman and Webelhuth 1998. As methodological sources, these two theoretical approaches are especially suitable for cross-linguistic applications of HPSG, including the licensing of Bulgarian phrases.

Shared grammar - Avgustinova, Skut and Uszkoreit 1999, Avgustinova and Uszkoreit 2000

The notion of shared grammar, introduced in Avgustinova, Skut and Uszkoreit 1999, suggests a formal basis for the analysis of the cross-linguistic relevance of the HPSG description. The authors argue that HPSG is well suited for the formal description of differences and commonalities among sets of languages and explore the opportunities to apply HPSG to cross-linguistic studies. The notion of shared grammar is defined as a formal specification of the shared components of two grammars. The authors claim that this aim is attainable in HPSG because the universal and language-specific principles are presented declaratively in a uniform formalism and can be encoded as principles of typed-feature logic. Their approach is based on two principles: abstracting over language-specific morphological realization, and breaking down grammatical constructions into a number of primitives common to the languages in question.
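The definition of shared grammar as the shared components of two grammars can be pictured with a deliberately simplified sketch. It assumes, for illustration only, that each grammar is a flat table of named constraints and that two constraints are shared when their descriptions are identical; real typed feature logic is far richer. The grammar names, constraint names and attribute values below are hypothetical and are not taken from Avgustinova, Skut and Uszkoreit.

```python
# Toy illustration only: a grammar is a mapping from a constraint name to a
# flat description (attribute -> value). All names and values are hypothetical.
Grammar = dict[str, dict[str, str]]


def shared_grammar(g1: Grammar, g2: Grammar) -> Grammar:
    """Return the constraints that both grammars state identically."""
    return {name: desc for name, desc in g1.items() if g2.get(name) == desc}


def specific_part(g: Grammar, shared: Grammar) -> Grammar:
    """Return what remains of a grammar once the shared part is factored out."""
    return {name: desc for name, desc in g.items() if name not in shared}


# Two made-up grammars that agree on one constraint and differ on another.
grammar_a: Grammar = {
    "head-complement": {"HEAD-DTR": "word", "COMPS": "saturated"},
    "argument-doubling": {"DOUBLER": "short-pronoun"},
}
grammar_b: Grammar = {
    "head-complement": {"HEAD-DTR": "word", "COMPS": "saturated"},
    "argument-doubling": {"DOUBLER": "none"},
}

shared = shared_grammar(grammar_a, grammar_b)
```

Here `shared` holds only the `head-complement` entry, while `specific_part(grammar_a, shared)` returns the doubling constraint that the two toy grammars state differently.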
Moreover, the fact that Avgustinova, Skut and Uszkoreit develop the idea of a shared HPSG grammar with regard to the variation of the verb diathesis system across Slavic languages deserves attention with regard to the formal analysis of Bulgarian.

Theory of predicates based on grammatical archetypes - Ackerman and Webelhuth 1998

Another solution, particularly concerning the VP, is the notion of grammatical archetypes, proposed within a theory of predicates in Ackerman and Webelhuth 1998. One of the main hypotheses they formalize in the book is that Universal Grammar defines predicate types in a type hierarchy which the grammars of individual languages can import as whole chunks. The archetypes are universal elementary types that are available for incorporation into individual grammars. The individual grammars benefit from importing them, as this keeps the overall degree of markedness low. Two sorts of archetypes, content-theoretic and form-theoretic, are introduced, together with the major arrangement types that can be found in languages. The thorough discussion of the interaction between archetypes and arrangement types suggests a consistent basis for cross-linguistic research.

4. Bulgarian-specific aspects and the relevance of HPSG constraints

The application of HPSG to Bulgarian presupposes an investigation of the constraints developed for English and hypothesized as universal. Some methodological difficulties are discussed below on the basis of a particular Bulgarian-specific formal-syntactic feature of the verb phrase (VP). In principle, the existing feature-structure descriptions of the English VP can be taken as a source for the development of corresponding descriptions for the Bulgarian VP. The basic Bulgarian-specific VP constraints have to be distinguished, and ways to interpret them formally within HPSG should be sought. Thus, the application of HPSG to Bulgarian should focus on the balance between universal and language-specific constraints and their interaction.
Special attention is to be paid to tracing the idea of universality despite the difference of grammatical traditions. The results reported for languages other than English, especially for the Slavic ones, can also be adopted as a methodological basis for research on the Bulgarian VP, particularly with respect to those features that they have in common with Bulgarian. In turn, the results of the investigation of the Bulgarian-specific features and their formal representations will provide further insights into the VP structure within HPSG in general. The relevance of Bulgarian VP feature structures can be tested and verified on existing treebanks and corpora, such as BulTreeBank (Simov et al 2003), the Tuebingen VERBMOBIL Treebank of English and German (cf. Hinrichs et al 1999) and PSDB, the Parallel Syntactic Database of English and Bulgarian (cf. Venkova 2002).

In order to exemplify some specific problems, a discussion of a Bulgarian-specific aspect, doubled verb complements², is presented below, focusing mainly on the specific problems they pose to HPSG theory and the direction for their methodological solution.

4.1. Doubled verb complements - examples

In Bulgarian, verb arguments can be expressed twice in the sentence. In these cases, the subject or object position is represented by two non-coordinated NPs. The doubling element is always a short personal pronoun³, cf. (1)-(3):

² Similar phenomena occur in some other languages, e.g. some Romance languages.
³ The doubled possessive phrases are not discussed here as they concern NP structure.

Subject
(1a) Toj chicho trygna.
     He uncle left-3p,sg.
(1b) Toj trygna chicho.
     He left-3p,sg uncle.
     ‘Uncle left.’

Direct object
(2a) Az Petar go pitah veche.
     I Petar him asked-1p,sg already.
(2b) Pitah go az veche Petar.
     Asked-1p,sg him I already Petar.
     ‘I asked Peter already.’

Indirect object
(3a) Na tebe utre shte ti kazha kakvo da pravish.
     To you tomorrow will you tell-1p,sg what to do.
(3b) Utre shte ti kazha na tebe kakvo da pravish.
     Tomorrow will you tell-1p,sg to you what to do.
     ‘I will tell you tomorrow what to do.’

Argument doubling is an element of the syntactic inventory of social markers, as discussed in Videnov 1998:181. Its syntactic interpretation concerns some basic principles of formal-theoretic description.

4.2. Doubled verb complements as problems for an HPSG grammar

Doubled verb complements are problematic for the HPSG formalization. For example, the indirect object argument can be expressed in three possible ways: by a short dative pronoun, by a PP, or by both combined, cf. 4a-c.

(4a) Kazhi mu istinata.
     Tell him the truth.
(4b) Kazhi istinata na choveka.
     Tell the truth to the man.
(4c) Kazhi mu istinata na choveka.
     Tell him the truth to the man.

The main issue to solve is how to reflect in the VALENCE features of the verb the fact that in 4a-c it combines with one or with two indirect objects and in both cases forms grammatical sentences. A further issue with respect to the VALENCE features is to analyze the formal differences between subject doubling and object doubling and to decide whether they should follow identical or different language-specific constraints within HPSG theory, so as not to undermine the universal constraints concerning the verb complementation mechanism. The distinction between doubling and dislocation, suggested within GB in Bojadzhiev et al 1998:565, and the expression of this distinction in HPSG terms also have to be considered. Moreover, the formal distinction between doubling and apposition, as well as the distinction of structural obligatoriness, cf. 5a-c below, from optionality, cf. 4a-c, closely concern the relevance of the HPSG architecture of the VP to Bulgarian:

(5a) Na deteto mu se spi
     To the child him refl sleep
     ‘The child feels like sleeping’
(5b) * Na deteto se spi
     To the child refl sleep

4.3. Directions for solving the problems

Different approaches to solving the formal-structural problems presented by doubled verb complements are possible. Firstly, mu ‘him’ and na choveka ‘to the man’ in 4c can be considered as representing two separate complements. This, however, if not further constrained, would entail that in 4a and 4b there is still one unrealized complement, which is not true.

An alternative approach, suggested in the GB analysis of J. Penchev (Penchev 1993:122), is to consider the pronominal object mu ‘him’ as a clitic specifier of the verb in 4c, cf. Fig. 1 below. In a more recent analysis (Bojadzhiev et al 1998:564-568) the doubling pronoun in 4c is defined as a pronominal adjunct which is a VP sister, dominated by an upper VP segment, cf. Fig. 2:

[Figure 1: tree diagram for (4c) kazhi mu istinata na choveka ‘tell him the truth to the man’, with mu ‘him’ as a dative clitic in a clitic phrase (ClP) inside the VP; node labels: VP, ClP, Cl dat, V, NP, PP]
[Figure 2: tree diagram for the same sentence, with mu ‘him’ attached as sister to a VP under an upper VP segment; node labels: VP, V, NP, PP]

These generativist contributions to the analysis of doubled verb complements in Bulgarian deserve serious consideration and re-interpretation into a feature structure consistent with the HPSG framework. When exploring the cross-linguistic relevance of the structural aspects of argument doubling, the analyses of similar structures in other Balkan languages (Barbu 2001, Mladenov 1968) and in French and Spanish (Simeonov 1971 and Kynchev 1972) should also be taken into account.

The analysis of the usage of pronominal arguments that do not satisfy the illocution conditions, as presented by M. Videnov (Videnov 1990:383), deserves special attention with regard to the verb complementation mechanism in Bulgarian. These cases refer to con-situational semantics and pose problems to the logical-semantic grounds of the HPSG theory.
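The valence problem raised by (4a-c) can be made concrete with a small sketch. It assumes a strongly simplified picture in which a verb's complement requirements are a plain list of slots and each dependent in the sentence cancels exactly one slot; the role labels, the two-slot entry for kazhi ‘tell’ and the one-slot alternative are all hypothetical illustrations, not analyses proposed in the works cited here.

```python
from collections import Counter

# Dependents of 'kazhi' in examples (4a-c), with hypothetical role labels.
SENTENCES = {
    "4a": ["clitic-dat", "np-acc"],            # Kazhi mu istinata.
    "4b": ["np-acc", "pp-dat"],                # Kazhi istinata na choveka.
    "4c": ["clitic-dat", "np-acc", "pp-dat"],  # Kazhi mu istinata na choveka.
}


def unrealized(comps, dependents):
    """Strict cancellation: each dependent discharges exactly one slot.

    Returns the slots of the complement list that no dependent discharged.
    """
    remaining = Counter(comps)
    remaining.subtract(Counter(dependents))
    return sorted(slot for slot, n in remaining.items() if n > 0)


# First approach: the clitic and the PP are two separate complements.
TWO_SLOT_COMPS = ["np-acc", "clitic-dat", "pp-dat"]

# Alternative assumed purely for illustration: a single dative slot that the
# clitic, the PP, or both together may discharge.
DATIVE_FORMS = {"clitic-dat", "pp-dat"}


def licensed_one_slot(dependents):
    """True if there is one accusative NP and one or two distinct dative forms."""
    datives = [d for d in dependents if d in DATIVE_FORMS]
    others = [d for d in dependents if d not in DATIVE_FORMS]
    return (others == ["np-acc"]
            and len(datives) >= 1
            and len(set(datives)) == len(datives))
```

Under the two-slot entry, `unrealized(TWO_SLOT_COMPS, SENTENCES["4a"])` still reports an open `pp-dat` slot and (4b) an open `clitic-dat` slot, which is the unwanted consequence noted above, whereas the one-slot check accepts all three sentences. Whether such a relaxed slot can be stated without weakening the universal constraints on complementation is exactly the open question.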
The interpretations of verb argument doubling in Bulgarian in terms of theme-rheme relations, as discussed in Ro Hauge 1999:207-210, provide interesting pragmatic insights, and some traditional analyses, such as Popov 1979, treat them in their historical and empirical aspect.

5. Basic methodological principles of seeking Bulgarian-specific relevance in HPSG

By providing empirical cross-language analyses based on certain hypothesized universal constraints of HPSG theory, the application of HPSG to the Bulgarian language tests their universal relevance. It provides feedback, based on bringing HPSG constraints into action cross-linguistically, that facilitates theory evaluation. The systematic verification of the universality hypothesis in particular syntactic domains provides useful data and supports the basic principles of the theory.

The structure of the results of the HPSG analysis of Bulgarian should be based on consistent logical principles that will make possible their further implementation in HPSG-based computational systems, such as Trale and ConTroll. The formal analysis structures can be tested and evaluated on parallel syntactic databases.

The HPSG analysis of Bulgarian can provide useful comments on, and eventual revisions of, some universal and parochial principles of English, enriching them with a cross-language perspective. However, the evaluation of the HPSG constraints should not aim at simply rejecting or confirming their relevance to Bulgarian in contrast to English and other languages, but should rather seek to refine the claims of the constraints and to find ways to incorporate language-specific variations. Thus, the analysis is to be based on the investigation of the formal syntactic mechanism of sharing common components and the addition (or deletion) of language-specific ones, consistent with a logic formalism.

Bibliography:

Ackerman, F. and Webelhuth, G. 1998: A Theory of Predicates, CSLI Publications.
Avgustinova, T., Skut, W. and Uszkoreit, H.
1999: Typological Similarities in HPSG. In: Borsley, R. and Przepiórkowski, A. (eds.) Slavic in Head-Driven Phrase Structure Grammar, CSLI Publications.
Avgustinova, T. and Uszkoreit, H. 2000: An ontology of systematic relations for a shared grammar of Slavic. Proceedings of COLING'2000, Saarbrücken, Vol. 1, pp. 28-34.
Balari, S. and Dini, L. 1998: (eds.) Romance in Head-Driven Phrase Structure Grammar, CSLI Publications.
Bojadzhiev et al 1998: Бояджиев, Т., Куцаров, И., Пенчев, Й. Съвременен български език, Издателска къща “Петър Берон”, София.
Barbu, V. 2001: Double Subject Constructions in Romanian. An HPSG Approach. Empirical Linguistics and Natural Language Processing Fall school: Sozopol, September 2002.
Borsley, R. and Przepiórkowski, A. 1999: (eds.) Slavic in Head-Driven Phrase Structure Grammar, CSLI Publications.
Bresnan, J. and Kaplan, R. 1982: Introduction. In: Mental Representation of Grammatical Relations, Bresnan, J. (ed.), Cambridge, MA: MIT Press, pp. i-iii.
Chomsky, N. 1957: Syntactic Structures. The Hague: Mouton.
Chomsky, N. 1980: Rules and Representations. New York: Columbia University Press.
Chomsky, N. 1981: Lectures on Government and Binding, Dordrecht: Foris.
Gazdar et al 1985: Gazdar, G., Klein, E., Pullum, G., and Sag, I. Generalized Phrase Structure Grammar. Cambridge, MA: Harvard University Press.
Ginzburg, J. and Sag, I. 2000: Interrogative Investigations. CSLI Publications, Stanford.
Hinrichs et al 1999: Hinrichs, E., Kordoni, V., and Shulz, H. The Tuebingen VERBMOBIL treebank of English and German. 21st Annual Meeting of the DFG, Konstanz, Germany.
Kynchev 1972: Кънчев, Ив. Някои наблюдения върху употребата на удвоеното допълнение в испанския и българския език, Език и литература, кн. 1, стр. 52-58.
Mladenov 1968: Младенов, Ц. Балканизъм ли е удвояването на обекта в българския език, В: Известия на Института за български език, т. XVI, 1968.
Nerbonne et al 1994: Nerbonne, J., Netter, K. and Pollard, C. (eds.) German in Head-Driven Phrase Structure Grammar, CSLI Publications.
Newmeyer, F. 1986: Linguistic Theory in America, Academic Press, Inc., San Diego.
Penchev 1993: Пенчев, Й. Български синтаксис. Управление и свързване. Пловдивско университетско издателство, Пловдив.
Perlmutter, D. and Postal, P. 1977: Toward a Universal Characterization of Passive. Papers from the Third Annual Meeting of the Berkeley Linguistics Society, pp. 394-417.
Pollard, C. and Sag, I. 1987: Information-Based Syntax and Semantics, Vol. 1, Fundamentals. CSLI Publications.
Pollard, C. and Sag, I. 1994: Head-Driven Phrase Structure Grammar, CSLI, Stanford; University of Chicago Press.
Popov 1979: Попов, К. Стилно-граматическа употреба на удвоеното допълнение в българския книжовен език. В: Помагало по български синтаксис, Издателство “Наука и изкуство”, София, стр. 92-107.
Ro Hauge, K. 1999: A Short Grammar of Contemporary Bulgarian, Slavica Publishers, Bloomington, Indiana.
Sag et al 2003: Sag, I., Wasow, T., Bender, E. Syntactic Theory: A Formal Introduction, 2nd edition, CSLI Publications.
Simeonov 1971: Симеонов, Й. Дублиране на определени части от състава на френското изречение в сравнение с българския език, Език и литература, кн. 4, стр. 62.
Simov et al 2002: Simov, K. et al., HPSG-based syntactic treebank of Bulgarian (BulTreeBank). In: "A Rainbow of Corpora: Corpus Linguistics and the Languages of the World", Munich 2002, pp. 135-142.
Venkova, Tz. 2002: Bilingual corpora as a platform for cross-linguistic treebank development. In: Proceedings of the First Workshop on Treebanks and Linguistic Theories (TLT 2002), Sept. 20-21, 2002, Sozopol, pp. 264-274.
Videnov 1990: Виденов, М. Съвременната българска градска езикова ситуация. Университетско издателство “Св. Климент Охридски”, София.
Videnov 1998: Виденов, М. Социолингвистическият маркер, Делфи издат, София.