CLARIFY: Human-Powered Training of SMT Models
D. Scott Appling & Ellen Yi-Luen Do
darren.scott.appling@gatech.edu; ellendo@cc.gatech.edu
School of Interactive Computing
Tuesday, March 24, 2009

Overview
•Statistical Machine Translation?
•CLARIFY Framework
•Human-Powered Computing
•Developing Visual Data
•Games for Translation

Statistical Machine Translation (SMT)
•Produce the best semantic translation
•Advantages
  •Cheaper < Human
  •Cross-Cultural Communication
  •Mobility

SMT Approach
[…] models can be seen as the combination of several components (language model, reordering model, translation steps, generation steps). These components define one or more feature functions that are combined in a log-linear model:

$$p(e|f) = \frac{1}{Z} \exp \sum_{i=1}^{n} \lambda_i h_i(e, f) \tag{1}$$

$Z$ is a normalization constant that is ignored in practice. To compute the probability of a translation $e$ given an input sentence $f$, we have to evaluate each feature function $h_i$. For instance, the feature func[…] (Koehn et al., 2007)

[Slide annotations: target sentence (e); source sentence (f); feature functions, e.g. language model; Z not calculated in practice]

What is CLARIFY?
•Data Acquisition Framework
•Corpus Annotation Tools
•Interactive Games
•Paraphrase Validation

CLARIFY System Process
[Diagram labels: Bilingual Corpus; Word Alignments; Phrase Table; Results; Online Model Updating; Direct Edit Mode; Game Play Mode; Updated Phrase Table]

Human-Powered Computing
•Learning from Humans
•Training Data

[…] solve. We don’t expect volunteers to label all images on the Web for us: we expect all images to be labeled because people want to play our game.

[…] which the two players agree is typically a good label for the image, as we will discuss in our evaluation section.

GENERAL DESCRIPTION OF THE SYSTEM
We call our system “the ESP game” for reasons that will become apparent as the description progresses.
The game is played by two partners and is meant to be played online by a large number of pairs at once. Partners are randomly assigned from among all the people playing the game. Players are not told whom their partners are, nor are they allowed to communicate with their partners. The only thing partners have in common is an image they can both see.

From the player’s perspective, the goal of the ESP game is to guess what their partner is typing for each image. Once both players have typed the same string, they move on to the next image (both players don’t have to type the string at the same time, but each must type the same string at some point while the image is on the screen). We call the process of typing the same string “agreeing on an image” (see Figure 1).

•The ESP Game (von Ahn & Dabbish, 2004)

Figure 2. The ESP Game. Players try to “agree” on as many images as they can in 2.5 minutes. The thermometer at the bottom measures how many images partners have agreed on.

Taboo Words
A key element of the game is the use of taboo words associated with each image, or words that the players are not allowed to enter as a guess (see Figure 2). These words […]

Visual Data Editors
Improved Statistical Translation Through Editing (Callison-Burch et al., 2004)

CLARIFY Knowledge Editors
•Visually Editing Machine Translation Model Information
•Word Alignments
•Phrase Alignments
•Paraphrasing

Word Alignment Editor
The cat liked eating yellow bugs .
猫 は 黄色い 虫 が 食べる事 が 好き だった 。

Phrase Alignment Editor
The cat liked eating yellow bugs .
猫 は 黄色い 虫 が 食べる事 が 好き だった 。

Paraphrasing Editor
The cat liked eating yellow bugs .
System Given Information
猫 は 黄色い 虫 が 食べる事 が 好き だった 。
User Segmented Input
The cat liked to chow on yellow insects .
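
The sentence pair on the editor slides can be used to sketch how word alignment links might be stored and edited. This is a minimal Python sketch, not CLARIFY's implementation: the links are hand-picked for illustration (they are not shown on the slides), and `aligned_targets` and `toggle_link` are hypothetical helper names.

```python
# Illustrative only: the alignment links below are hand-picked for this sketch,
# not taken from the slides or from a trained model.
source = "The cat liked eating yellow bugs .".split()
target = "猫 は 黄色い 虫 が 食べる事 が 好き だった 。".split()

# A word alignment stored as a set of (source_index, target_index) links.
links = {
    (1, 0),          # cat    <-> 猫
    (4, 2),          # yellow <-> 黄色い
    (5, 3),          # bugs   <-> 虫
    (3, 5),          # eating <-> 食べる事
    (2, 7), (2, 8),  # liked  <-> 好き だった
    (6, 9),          # .      <-> 。
}


def aligned_targets(src_index, links, target):
    """Return the target tokens linked to one source token, in target order."""
    ordered = sorted(links, key=lambda pair: pair[1])
    return [target[j] for i, j in ordered if i == src_index]


def toggle_link(links, src_index, tgt_index):
    """Model one edit in a hypothetical alignment editor: add or remove a link."""
    pair = (src_index, tgt_index)
    return links - {pair} if pair in links else links | {pair}


if __name__ == "__main__":
    print(aligned_targets(2, links, target))    # tokens linked to "liked"
    edited = toggle_link(links, 0, 0)           # user links "The" to 猫
    print((0, 0) in edited)
```

Storing links as index pairs keeps a single edit (one click in an editor) equivalent to adding or removing one pair, which is one simple way a direct edit mode could be modeled.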
CLARIFY Translation Games
•Employ Three Previous Editors
•Constraints on Games
•Developing Intuitive Game Designs

Paraphrase Bootstrap Process
[Diagram labels: Paraphrase Result; Automatic Similarity Comparison; Fails Threshold; Discard Paraphrase Result; Satisfies Threshold; Contains New Knowledge; Contains Previously Confirmed Knowledge; Comparison to Other Responses; Found Similar Response; Found No Similar Response; Save for Later Comparisons; Online Model Updating]

WordNet-based Similarity
•Deciding on new Paraphrase Acquisition

concept
  1 tangible thing
    2 food
      3 flour-based pastas
        4 buckwheat flour
          5 soba
        4 durum flour
          5 spaghetti
  1 intangible thing
    2 emotion
      3 love
    2 philosophy

Accounting for User’s Experience
•Users have Different Levels of Knowledge
•Want Smoother Modeling
•Identify User Language Ability
•Knowledge Source Score Component

CLARIFY Review
•Direct Editors
•Interactive Games
•Language Ability Discrepancies
•Paraphrase Knowledge Validation

CLARIFY: Human-Powered Training of SMT Models
Thank You!
D. Scott Appling & Ellen Yi-Luen Do
darren.scott.appling@gatech.edu; ellendo@cc.gatech.edu
School of Interactive Computing
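
The Paraphrase Bootstrap Process and WordNet-based Similarity slides can be sketched together as follows. This is a hedged Python illustration under stated assumptions: the hierarchy is the toy concept tree from the slide (not real WordNet data), the path-based score is one common WordNet-style measure and the slides do not say which measure CLARIFY uses, and the order of decisions in `triage` is one reading of the diagram labels. The names `PARENT`, `path_similarity`, `triage`, and the threshold value are hypothetical.

```python
# Toy concept hierarchy copied from the slide; child -> parent.
PARENT = {
    "tangible thing": "concept",
    "intangible thing": "concept",
    "food": "tangible thing",
    "flour-based pastas": "food",
    "buckwheat flour": "flour-based pastas",
    "durum flour": "flour-based pastas",
    "soba": "buckwheat flour",
    "spaghetti": "durum flour",
    "emotion": "intangible thing",
    "philosophy": "intangible thing",
    "love": "emotion",
}


def path_to_root(node):
    """List the node and all of its ancestors, ending at the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def path_similarity(a, b):
    """Assumed measure: 1 / (1 + edges on the shortest path between a and b)."""
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b))}
    for steps_from_a, n in enumerate(path_to_root(a)):
        if n in steps_from_b:  # first shared ancestor
            return 1.0 / (1 + steps_from_a + steps_from_b[n])
    return 0.0


def triage(similarity, threshold, previously_confirmed, similar_response_found):
    """One reading of the bootstrap diagram; the branch order is an assumption."""
    if similarity < threshold:
        return "discard paraphrase result"
    if previously_confirmed or similar_response_found:
        return "online model updating"
    return "save for later comparisons"


if __name__ == "__main__":
    print(path_similarity("soba", "spaghetti"))  # 0.2
    score = path_similarity("soba", "love")
    # The threshold 0.15 is an arbitrary value chosen for this example.
    print(triage(score, 0.15, False, False))
```

In this sketch, soba and spaghetti share the ancestor "flour-based pastas" two steps up from each, so they score higher than soba and love, which only meet at the root "concept".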