An Investigation of Implicatures in Chinese
Lingjia Deng, Janyce Wiebe
Intelligent Systems Program
Department of Computer Science
University of Pittsburgh

Outline
Introduction
Implicature in Chinese
Inference in Chinese
Extracting Chinese GoodFor/BadFor
  Chinese GoodFor/BadFor Words
  Syntax of Chinese Agents/Objects
  Chinese Sentiment Analysis
Conclusions

Introduction
Scenario: The government proposes the Affordable Care Act bill. We want to analyze everyone’s opinion of it.
We can collect opinions through surveys, questionnaires, etc.
We can also collect writers’ stances by analyzing their posts online.

Introduction
“The bill will lower the skyrocketing healthcare costs.”
Explicit (Direct) Sentiment:
  writer negative toward the skyrocketing healthcare costs
    The healthcare cost is too high. I cannot afford it.
Implicit (Inferred) Sentiment:
  writer positive toward “the bill will lower costs”
    There is a chance that the costs could be decreased! I love it!
  writer positive toward the bill
    The bill is able to do this! I’ll vote for it!

GoodFor/BadFor Event
“The bill will lower the skyrocketing healthcare costs.”
<bill, lower, healthcare costs>
Benefactive/Malefactive Event
GoodFor/BadFor Event (Deng et al., ACL 2013 short):
  goodFor event: help, increase, etc.
  badFor event: lower, destroy, decrease, etc.
  <agent, goodFor/badFor event, object>
GoodFor/BadFor Corpus (Deng et al., ACL 2013 short):
  134 political editorials
  e.g. <bill, lower, healthcare costs>
  e.g. <positive, badFor, negative>
  almost 20% of sentences have clear goodFor/badFor events
  available at mpqa.cs.pitt.edu

Related Work
Words/Phrases directly imply implicit opinions. (Zhang and Liu, 2011; Feng et al., 2013)
Infer an overall polarity of a sentence by compositional semantics.
(Choi and Cardie, 2008; Moilanen et al., 2010)
Identify classes of goodFor/badFor terms, and carry out studies involving artificially constructed goodFor/badFor triples and corpus examples matching fixed linguistic templates. (Anand and Reschke 2010; 2011)
Generate a lexicon of patient polarity verbs, which correspond to goodFor/badFor events whose spans are verbs. (Goyal et al., 2012)
Investigate sarcasm where the writer holds a positive sentiment toward a negative situation. (Riloff et al., 2013)

Our Work of GoodFor/BadFor
An annotated goodFor/badFor corpus. (Deng et al., ACL 2013 short)
A sense-level goodFor/badFor lexicon. (Choi et al., WASSA 2014)
Four inference rule schemas and a graph-based model for sentiment propagation. (Deng and Wiebe, EACL 2014)
An optimization framework for joint sentiment inference and disambiguating goodFor/badFor components. (Deng et al., Coling 2014)
A rule-based framework for representing and analyzing opinion implicatures. (Wiebe and Deng, arXiv 2014; WASSA 2014)

Motivation For This Work
This work is an investigation of implicatures in Chinese.
People speaking different languages may express their opinions in different ways.
Before directly applying the English goodFor/badFor implicature framework to Chinese, we want to investigate:
  whether such implicature also exists in Chinese;
  whether the sentiment inference rules also apply to Chinese implicit opinions;
  whether it is feasible to extract goodFor/badFor events and the corresponding components in Chinese.

Outline
Motivation
Implicature in Chinese
  Agreement Study
Inference in Chinese
Extracting Chinese GoodFor/BadFor
  Chinese GoodFor/BadFor Words
  Syntax of Chinese Agents/Objects
  Chinese Sentiment Analysis
Conclusions

Implicature in Chinese: Agreement Study
An opinion-oriented, paragraph-parallel corpus: the Chinese version of the New York Times (http://cn.nytimes.com/).
Select the English paragraphs containing English goodFor/badFor words.
Present the parallel Chinese paragraphs.

Implicature in Chinese: Agreement Study
All three annotators, including me, are Chinese graduate students at the University of Pittsburgh.
Annotate 60 paragraphs, 253 sentences.
Conduct the agreement study in the same manner as (Deng et al., 2013).

Implicature in Chinese: Agreement Study
Train with the English manual (Deng et al., 2013) and several annotated Chinese examples.
Annotate:
  (A) spans of the goodFor/badFor events
  (B) spans of the agents and objects of the events
  (C) polarities of the events: goodFor or badFor
  (D) writer’s sentiments toward the agents and objects: positive, negative, neutral
Evaluate by the same metrics as (Deng et al., 2013):
  for (A) & (B): percentage of spans that both annotators mark
  for (C) & (D): kappa

Implicature in Chinese: Agreement Study
All the scores are good: trained with the English manual, the annotators are able to detect similar implicature in Chinese.
Scores of (A) and (D) are lower than those in the English goodFor/badFor agreement study (Deng et al., 2014).

overlap(a,b):
            (A) goodFor/badFor span   (B) agent span   (B) object span
Anno 1&2    0.7929                    0.9091           0.9091
Anno 1&3    0.7044                    0.9524           1.0

kappa:
            (C) goodFor/badFor polarity   (D) sentiment toward agent   (D) sentiment toward object
Anno 1&2    0.9385                        0.7830                       0.7238
Anno 1&3    0.8966                        0.5913                       0.8478

Implicature in Chinese: Agreement Study
For annotating (D) writer’s sentiments, the main disagreement comes from:
  Anno 1 annotated as positive or negative
  Anno 2 annotated as neutral
We conduct a phase-II agreement study on 10 editorials from the English corpus (Deng et al., 2013).
Three scores:
  I. agreement scores in Chinese by three annotators
  II. agreement scores in English by three annotators
  III. previous agreement scores (Deng et al., 2013)
score I = score II; score I < score III; score II < score III
The annotators have a similar understanding of implicatures in the two languages.
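
The two kinds of agreement scores reported above (span overlap for (A) and (B), kappa for (C) and (D)) can be illustrated with a minimal Python sketch. It rests on assumptions that the slides do not spell out: spans are half-open character offsets, overlap(a,b) is taken to be the fraction of annotator a's spans that overlap at least one of annotator b's spans, and kappa is Cohen's kappa over the items both annotators labeled. The function names and the toy data are hypothetical; the exact definitions are in (Deng et al., 2013).

```python
from collections import Counter


def spans_overlap(s, t):
    """True if two half-open (start, end) spans share at least one position."""
    return s[0] < t[1] and t[0] < s[1]


def overlap_agreement(spans_a, spans_b):
    """Assumed overlap(a,b): fraction of a's spans that overlap some span of b."""
    if not spans_a:
        return 0.0
    matched = sum(any(spans_overlap(s, t) for t in spans_b) for s in spans_a)
    return matched / len(spans_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators always use the same single label
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical, not from the annotated corpus).
toy_spans_a = [(0, 5), (10, 15), (20, 25)]
toy_spans_b = [(3, 8), (30, 35)]
toy_polarity_a = ["goodFor", "goodFor", "badFor", "badFor"]
toy_polarity_b = ["goodFor", "badFor", "badFor", "badFor"]

span_score = overlap_agreement(toy_spans_a, toy_spans_b)
kappa_score = cohen_kappa(toy_polarity_a, toy_polarity_b)
```

Note that the assumed overlap measure is directional: overlap(a,b) and overlap(b,a) can differ when the annotators mark different numbers of spans.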

Implicature in Chinese: Agreement Study
For annotating (A) goodFor/badFor events, the major disagreement comes from:
  Anno 1 marks a goodFor/badFor span
  Anno 2 doesn’t mark it because he thinks it violates the syntax rules we specified in the English manual.
Syntax rules are specified in the English manual to guide the annotators to focus on clear cases of goodFor/badFor events, e.g.
  The object should be the major semantic object.
  The goodFor/badFor polarity should be perceived within the triple.

GoodFor/BadFor Cases Evoked by Chinese Syntax
The goodFor/badFor polarity should be perceived within the triple.
It will put the reform to die.
In English: this is NOT annotated as a goodFor/badFor event.
  put is the verb
  <it, put, reform>
  put X to die: badFor X
  put X to revive: goodFor X

GoodFor/BadFor Cases Evoked by Chinese Syntax
The goodFor/badFor polarity should be perceived within the triple.
It will put the reform to die.
这将把改革置于死地。
In Chinese: this can be represented as a clear goodFor/badFor case.
  “put” is not a verb in the Chinese sentence
  BA structure (Chao, 1968; Li and Thompson, 1989; Sybesma, 1992)
  subject, BA, object, verb
  it will BA kill the reform

Implicature in Chinese: Conclusion
Such syntax is commonly seen in Chinese.
The goodFor/badFor events that arise from this Chinese syntax are clear enough in Chinese.
  It will kill the reform.
To fully study Chinese goodFor/badFor events, the manual should be revised to provide guidance for annotating them.
Overall, similar implicatures can be perceived in English and in Chinese.
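
Once the BA sentence is represented as the triple <it, kill, reform>, it can feed the same kind of sentiment inference as the English example <bill, lower, healthcare costs>. The Python sketch below assumes a simplified sign rule: the writer's sentiment toward the event is the event polarity combined with the sentiment toward the object, and the sentiment toward the agent follows the event. This matches the introduction example (negative toward the costs, badFor, so positive toward the bill), but it is only an illustration, not the graph model of (Deng and Wiebe, 2014). All names are hypothetical, and the writer's positive sentiment toward the reform is an assumed input.

```python
# Hypothetical encodings: +1 / -1 for event polarity and writer sentiment.
POLARITY = {"goodFor": 1, "badFor": -1}
SENTIMENT = {"positive": 1, "negative": -1}
SIGN_NAME = {1: "positive", -1: "negative"}


def infer_from_object(event_polarity, sentiment_toward_object):
    """Infer the writer's sentiment toward the event and the agent.

    Simplified rule (an assumption for illustration):
    sign(event) = sign(event polarity) * sign(sentiment toward object),
    and the agent inherits the sentiment toward the event.
    """
    sign = POLARITY[event_polarity] * SENTIMENT[sentiment_toward_object]
    return {"event": SIGN_NAME[sign], "agent": SIGN_NAME[sign]}


# English example from the introduction: <bill, lower, healthcare costs>,
# lower is badFor, writer is negative toward the skyrocketing costs.
bill_inference = infer_from_object("badFor", "negative")

# BA-structure example: <it, kill, reform>, kill is badFor.
# Assumed input: the writer is positive toward the reform.
reform_inference = infer_from_object("badFor", "positive")
```

Under these assumptions the first call yields a positive sentiment toward the bill, and the second yields a negative sentiment toward "it".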

Outline
Motivation
Implicature in Chinese
Inference in Chinese
  Graph Model for Sentiment Propagation (Deng and Wiebe, 2014)
Extracting Chinese GoodFor/BadFor
  Chinese GoodFor/BadFor Words
  Syntax of Chinese Agents/Objects
  Chinese Sentiment Analysis
Conclusions

Graph Model (Deng and Wiebe, 2014)
[Figure: graph with agent/object nodes linked by goodFor/badFor edges, encoding the inference rules]

Inference in Chinese: Graph Model Performance
We run an isolated evaluation of the graph model itself (Deng and Wiebe, 2014).
For each node, calculate how often its sentiment is propagated correctly when a neighbor node is assigned the correct sentiment label.

Dataset                # subgraphs   correctness
all subgraph           136           0.7058
multi-node subgraph    61            0.8251

The scores in Chinese are lower than those in English (89% in (Deng and Wiebe, 2014)).

Blocked Inference: In Chinese and English
…a misreading which estimated the law would “reduce the amount of labor …”
<law, reduce, labor>
The writer doesn’t believe <law, reduce, labor>.
“misreading” believes so.
The writer is negative toward “misreading”.
For events that the writer doesn’t believe to be true, the inference should be blocked.
  The event is not in the writer’s belief space (Wiebe and Deng, 2014).

Inference in Chinese: Conclusion
Though there are cases where the inference rules are blocked, such cases appear both in Chinese and in English.
We didn’t find evidence showing that the blocked inference only occurs in English.
Apart from the blocked inferences, the good correctness scores provide evidence that the inference rules also apply to Chinese.

Outline
Motivation
Implicature in Chinese
Inference in Chinese
Extract Chinese GoodFor/BadFor
  Chinese GoodFor/BadFor Words
  Syntax of Chinese Agents/Objects
  Chinese Sentiment Analysis
  English + Parallel Corpus?
Conclusions

Chinese GoodFor/BadFor Words
Given that we have an English goodFor/badFor lexicon (Choi et al., 2014), is it feasible to derive a bilingual goodFor/badFor lexicon from a parallel corpus?
We manually find the parallel spans in English corresponding to the annotated goodFor/badFor spans in Chinese.
76.25% of the annotated Chinese goodFor/badFor spans have parallel goodFor/badFor spans in English.
For the other annotated Chinese goodFor/badFor spans, there is no corresponding goodFor/badFor span in English, due to:
  Chinese syntax;
  paraphrasing.

Chinese Agent/Object
We use the Stanford dependency parser to extract the agent/object in English (Deng et al., 2014).
  nsubj-(event, agent)
  dobj-(event, object)
Can we use the same dependency labels to extract agent/object in Chinese?
We choose the Chinese Stanford dependency parser.
Some dependency labels exist both in Chinese and English.
  There are more nsubj and dobj in Chinese data than in English data.
Some labels are especially designed for Chinese (Chang et al., 2009).
  19.57% in agents, 25.82% in objects.
  They are similar to some labels in English.

Chinese Sentiment Analysis
Sentiment Lexicon:
  HowNet
  NTU Sentiment Dictionary (Ku and Chen, 2007)
  A sentiment lexicon from Tsinghua University (Li and Sun, 2007)
Bilingual and Multilingual Chinese Sentiment Analysis Research
  Wan, 2008; Wan, 2009; Boyd-Graber and Resnik, 2010; Lu et al., 2011; etc.
Chinese Sentiment Analysis Tools
  LingPipe http://alias-i.com/lingpipe/
  Semantria https://semantria.com/

Outline
Motivation
Implicature in Chinese
Inference in Chinese
Extracting Chinese GoodFor/BadFor
Conclusions

Conclusions
The implicatures that arise from explicit sentiment toward goodFor/badFor events exist in the Chinese language, and they are similar to those in English.
The inference rules we developed for English apply to Chinese.
There are several cases where the inferences are blocked, and such cases exist both in Chinese and English.
It is promising to develop systems that automatically extract Chinese goodFor/badFor events using the existing methods for English and leveraging the parallel corpus.

Questions?
Thanks to Fan Zhang and Changsheng Liu for annotations.

Part of References:
Jordan Boyd-Graber and Philip Resnik. 2010. Holistic sentiment analysis across languages: Multilingual supervised latent Dirichlet allocation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.
Lingjia Deng and Janyce Wiebe. 2014. Sentiment propagation via implicature constraints. In Meeting of the European Chapter of the Association for Computational Linguistics.
Lingjia Deng, Yoonjung Choi, and Janyce Wiebe. 2013. Benefactive/malefactive event and writer attitude annotation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics.
Lun-Wei Ku and Hsin-Hsi Chen. 2007. Mining opinions from the web: Beyond relevance retrieval. Journal of the American Society for Information Science and Technology.
Jun Li and Maosong Sun. 2007. Experimental study on sentiment classification of Chinese review using machine learning techniques. In Natural Language Processing and Knowledge Engineering, 2007.
Bin Lu, Chenhao Tan, Claire Cardie, and Benjamin K Tsou. 2011. Joint bilingual sentiment classification with unlabeled parallel corpora. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
Xiaojun Wan. 2008. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing.
Xiaojun Wan. 2009. Co-training for cross-lingual sentiment classification. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP.
Theresa Wilson and Janyce Wiebe. 2003. Annotating opinions in the world press. In Proceedings of the 4th ACL SIGdial Workshop on Discourse and Dialogue.