Corpus-based Semantics of Concession

Livio Robaldo, Department of Computer Science, University of Turin, robaldo@di.unito.it
Eleni Miltsakaki, Institute for Research in Cognitive Science, UPenn, elenimi@seas.upenn.edu
Alessia Bianchini, Department of Computer Science, University of Turin, alessia.bianchini84@gmail.com

The PDTB: Discourse relations

• They are conveyed by lexical items connecting two textual spans.

Examples: On the one hand, John loves Barolo. So he went and ordered three cases. On the other hand, he didn’t have much money. So then he had to cancel the order.

[Figure: discourse relations in the example above, with "On the one hand – On the other hand" and the two occurrences of "So" shown together with the text spans they connect.]

PDTB

The Penn Discourse Treebank is, to date, the largest annotation effort at the discourse level (more than 40000 annotations).

• It encodes discourse relations associated with discourse connectives.
• It includes implicit connectives. E.g. [John broke his arm]arg1. (implicit=so) [Now, he can’t cycle to work]arg2
• Annotations are made on the Penn Treebank, which comprises approximately 1 million words taken from the Wall Street Journal.

Sense annotation in the PDTB

Some discourse connectives are ambiguous and have more than one meaning. E.g. since:

Temporal: Since [the first fraud was discovered in July 1986 at an office of the People’s Bank of China]arg2, [15 major cases have been found]arg1.

Causal: [It was a far safer deal for lenders]arg1 since [NWA had a healthier cash flow and more collateral on hand]arg2.

Aim of the sense annotation

The aim of the sense annotation is to provide sense tags that flag the intended interpretation of the connectives.

Identifying senses has proved to be a challenging task. Fine-grained or coarse-grained distinctions?
Extensive sense annotation studies have been carried out to disambiguate the meaning of verbs (see, for example, Propbank: http://verbs.colorado.edu/~mpalmer/projects/ace.html). Much less has been done for discourse connectives…

Semantic CLASSes

TEMPORAL
CONTINGENCY
COMPARISON
EXPANSION

Comparison CLASS Types

Comparison
  Contrast: juxtaposition, opposition
  Concession: expectation, contra-expectation

Concession

The Type Concession applies when:

Arg2 event/state A implies an event/state C but Arg1 event/state B states or implies ~C

OR

Arg1 event/state A implies an event/state C but Arg2 event/state B states or implies ~C

The former is tagged as ‘expectation’, the latter as ‘contra-expectation’.

Literature on Concession

In the literature [Winter & Rimon, 94], [Lagerwerf, 98], [Korbayova & Webber, 07], two cases have been distinguished, according to how the expectation is denied: directly or indirectly.

Direct Contrast: Although [Greta Garbo was considered the yardstick of beauty]Argc, [she never married]Argd.

• A general defeasible rule “Beautiful women usually marry” holds in the context.
• This rule creates the expectation that Greta Garbo married.
• Argd directly denies this expectation, by asserting exactly the opposite.

Indirect Contrast: Although [John does not have a car]Argc, [he has a bike]Argd.

• The concessive relation involves an intermediate proposition, called the Tertium Comparationis (TC) [Lagerwerf, 98], defeasibly implied by Argc and whose negation is (non-defeasibly) implied by Argd.
• In this example, the TC is “John is not mobile”: Argc defeasibly implies that John is not mobile, while Argd implies that he is mobile.
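The Direct/Indirect Contrast distinction above can be illustrated with a small sketch. This is a toy encoding rather than anything taken from the slides: propositions are plain strings, negation is a "not " prefix, and the names `Concession`, `negate` and `contrast_type` are hypothetical. The expectation and the implied proposition are supplied by hand, since the slides do not describe a procedure for deriving them.

```python
from dataclasses import dataclass
from typing import Optional


def negate(p: str) -> str:
    """Toy negation (assumption): add or strip a leading 'not '."""
    return p[4:] if p.startswith("not ") else "not " + p


@dataclass
class Concession:
    argc_expects: str                   # what Argc defeasibly implies
    argd_asserts: str                   # what Argd states
    argd_implies: Optional[str] = None  # what Argd non-defeasibly implies, if given


def contrast_type(c: Concession) -> Optional[str]:
    """Return 'direct', 'indirect', or None under this toy encoding."""
    denied = negate(c.argc_expects)
    if c.argd_asserts == denied:
        # Argd asserts exactly the opposite of the expectation.
        return "direct"
    if c.argd_implies == denied:
        # Argd only implies the negation of the Tertium Comparationis.
        return "indirect"
    return None


# Hand-encoded versions of the two examples above.
garbo = Concession(
    argc_expects="Greta Garbo married",
    argd_asserts="not Greta Garbo married",
)
john = Concession(
    argc_expects="not John is mobile",  # the Tertium Comparationis
    argd_asserts="John has a bike",
    argd_implies="John is mobile",
)

print(contrast_type(garbo), contrast_type(john))
```

The sketch only records how the expectation is denied. It is silent about where the expectation comes from, which matches the limitation of previous accounts discussed next.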
Logical accounts of Concession

Previous logical accounts of Concession mirror the distinction between Direct and Indirect Contrast.

Direct Contrast: (p ∧ q) ∧ (p → ¬q)

Indirect Contrast: (p ∧ q) ∧ ∃r[(p → ¬r) ∧ (q → r)]

Previous accounts of Concession

Previous approaches to Concession mainly focus on how the expectation is denied (either directly or indirectly).

They are almost silent about how the expectation is created, i.e. on the relation between Argc and the expectation that must be presupposed.

Some authors ([Lagerwerf, 98], [Sanders et al., 92]) state, in general terms, that this relation is causal in nature.

We think that characterizing how the expectation is created is crucial for defining the semantics of concessive relations.

We tried to take a first step, starting from an empirical analysis of PDTB data.

Four subcases of Concession

Toy examples:

Causality: Although [John studied hard]argc, [he did not pass the exam]argd.

Implication: [Penguins are birds]argc. Nevertheless [they do not fly]argd.

Correlation: [John will write the report]argc but [he'll finish it at home]argd.

Implicature: Although [John ate a lot of pizza]argc, [he did not eat it all]argd.

From the PDTB:

Causality: Although [they represent only 2% of the population]argc, [they control nearly one-third of discretionary income]argd.

Implication: [The prime minister]argd [whose hair is thinning and gray and whose face has a perpetual pallor]argc nonetheless [continues to display an energy, a precision of thought and a willingness to say publicly what most other Asian leaders dare say only privately]argd.

Correlation: [The Treasury will raise 10 billion in fresh cash by selling 30 billion of securities...]argc. But [rather than sell new 30-year bonds, the Treasury will issue 10 billion of 29-year, nine-month bonds]argd.

Implicature: Although [it is not the first company to produce the thinner drives]argc, [it is the first with an 80-megabyte drive]argd.
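A double annotation with these four labels could be encoded and compared along the following lines. This is a minimal sketch under stated assumptions: the `Source` enum and the `cohen_kappa` function are hypothetical names, the two label lists are invented for illustration and are not PDTB data, and Cohen's kappa is assumed as the agreement measure because the slides do not say which statistic was used.

```python
from collections import Counter
from enum import Enum


class Source(Enum):
    """The four sources of expectation (labels taken from the slides)."""
    CAUSALITY = "causality"
    IMPLICATION = "implication"
    CORRELATION = "correlation"
    IMPLICATURE = "implicature"


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences (assumed measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    # Proportion of tokens on which the two annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each annotator's label distribution.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Invented labels for four tokens, for illustration only.
annotator_1 = [Source.CAUSALITY, Source.CAUSALITY,
               Source.IMPLICATION, Source.IMPLICATION]
annotator_2 = [Source.CAUSALITY, Source.CAUSALITY,
               Source.IMPLICATION, Source.CORRELATION]

print(round(cohen_kappa(annotator_1, annotator_2), 2))
```

Kappa corrects raw agreement for the agreement two annotators would reach by chance given how often each uses every label, which is why it is commonly preferred to a plain percentage for categorical annotation tasks.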
Empirical (double) annotation on 1000 PDTB tokens

Inter-annotator agreement k = 0.8

The analysis provides some evidence that the source of expectation is not always a causal relation.

Implication

Necessary conditions, rather than causal effects, inherited from some kind of prototype.

Although [working for USA intelligence]argc, [Mr. Noriega was hardly helping the U.S. exclusively]argd.

Although [insider trading has long been criminal]argc, [for example it has never been statutorily defined]argd.

[You can do all this]argc even if [you’re not a reporter or a researcher or a scholar or a member of Congress]argd.

Correlation

Divergence from a contextually relevant trend

Although [the notes held at a price of 92 to 93 immediately after the reset]argc, [they started falling soon afterward]argd.

[The LDP won by a landslide in the last election, in July 1986]argc. But [less than two years later, the LDP started to crumble, and dissent rose to unprecedented heights]argd.

[Yet the rowing machine hasn't been touched since]argd even though [he has moved it across the country with him twice]argc.

Correlation

Events that “surprisingly” occur together

Although [started in 1965]argc, [Wedtech didn’t really get rolling until 1975]argd.

[The favorite remains Fernando Collor de Mello, a 40-year-old former governor of the state of Alagoas]argc. But […Mr. Collor has slipped to about 30% in the polls from a high of about 43% only a few weeks ago]argd.

[The Journal listed the creation of the money fund as one of the 10 most significant events in the world of finance in the 20th century]argc. But [the Reserve Fund, America’s first money fund, was not named, nor were the creators of the money-fund concept, Harry Brown and myself]argd.

Implicature

Violation of a Gricean Maxim: Argc is insufficient or irrelevant with respect to the speaker’s intentions.

Although [John ate a lot of pizza]argc, [he did not eat it all]argd.

[Open the computer case]argc but [do not touch the wires]argd.

Although [it is not the first company to produce the thinner drives]argc, [it is the first with an 80-megabyte drive]argd.

Thank you!