CLARIFY: Human-Powered Training of SMT Models D. Scott Appling & Ellen Yi-Luen Do

advertisement
CLARIFY:
Human-Powered Training
of SMT Models
D. Scott Appling & Ellen Yi-Luen Do
darren.scott.appling@gatech.edu; ellendo@cc.gatech.edu
School of Interactive Computing
Tuesday, March 24, 2009
Overview
•Statistical Machine Translation?
•CLARIFY Framework
•Human-Powered Computing
•Developing Visual Data
•Games for Translation
Tuesday, March 24, 2009
Statistical Machine Translation (SMT)
•Produce the best semantic translation
•Advantages
•Cheaper < Human
•Cross-Cultural
Communication
•Mobility
Tuesday, March 24, 2009
ls can be seen as the combination of several
onents (language model, reordering model,
SMT Approach
ation steps, generation steps). These compoe.g fea
el
d
o
m
define
one
or
more
feature
functions
. la tur that are
s
t
h
for ng e fu
g
target sentence
i
e
wa uag
w
nc
rd
e
tra mo tion
ined in a log-linear model:
s
ns
d
e
la
l
1
p(e|f ) = exp
Z
source sentence
n
!
e
s
tio
n s core
co
,
re
λi hi (e, f )
(1)
i=1
s a normalization
constant
that
is
ignored
in
n
i
d
te
a
l
u
lc
a
c
t
o
n
ce. To compute the
probability
of a translation
(Koehn
et. al., 2007)
n an input sentence f, we have to evaluate each
e function hi . For instance, the feature funcTuesday, March 24, 2009
ic
t
c
a
pr
What is CLARIFY?
•Data Acquisition Framework
•Corpus Annotation Tools
•Interactive Games
•Paraphrase Validation
Tuesday, March 24, 2009
CLARIFY
System
Process
Bilingual
Corpus
Results
Phrase Table
Word
Alignments
Tuesday, March 24, 2009
Online Model
Updating
Direct Edit
Mode
Game Play
Mode
Updated
Phrase Table
Human-Powered Computing
•Learning from Humans
olve. We don’t expect volunteers to label all images on
Web for us: we expect all images to be labeled because
ple want to play our game.
which the two players agree is typically a good label for
the image, as we will discuss in our evaluation section.
•Training Data
NERAL DESCRIPTION OF THE SYSTEM
call our system “the ESP game” for reasons that will
ome apparent as the description progresses. The game is
yed by two partners and is meant to be played online by
arge number of pairs at once. Partners are randomly
igned from among all the people playing the game.
yers are not told whom their partners are, nor are they
owed to communicate with their partners. The only thing
tners have in common is an image they can both see.
•The ESP Game
(von Ahn & Dabbish,
2004)
m the player’s perspective, the goal of the ESP game is
guess what their partner is typing for each image. Once
h players have typed the same string, they move on to
next image (both player’s don’t have to type the string
the same time, but each must type the same string at
me point while the image is on the screen). We call the
cess of typing the same string “agreeing on an image”
e Figure 1).
Figure 2. The ESP Game. Players try to “agree” on as many
images as they can in 2.5 minutes. The thermometer at the
bottom measures how many images partners have agreed on.
Taboo Words
Tuesday, March 24, 2009
A key element of the game is the use of taboo words
associated with each image, or words that the players are
not allowed to enter as a guess (see Figure 2). These words
Visual Data Editors
Improved Statistical Translation Through Editing
(Callison-Burch et. al., 2004)
Tuesday, March 24, 2009
CLARIFY Knowledge Editors
•Visually Editing Machine Translation Model
Information
•Word Alignments
•Phrase Alignments
•Paraphrasing
Tuesday, March 24, 2009
Word Alignment Editor
The cat liked eating yellow bugs .
猫 は 黄色い 虫 が 食べる事 が 好き だった 。
Tuesday, March 24, 2009
Tuesday, March 24, 2009
Phrase Alignment Editor
The cat liked eating yellow bugs .
猫 は 黄色い 虫 が 食べる事 が 好き だった 。
Tuesday, March 24, 2009
Tuesday, March 24, 2009
Paraphrasing Editor
The cat liked eating yellow bugs .
System Given
Information
猫 は 黄色い 虫 が 食べる事 が 好き だった 。
User Segmented
Input
The cat liked to chow on yellow insects .
Tuesday, March 24, 2009
15
Tuesday, March 24, 2009
CLARIFY Translation Games
•Employ Three Previous Editors
•Constraints on Games
•Developing Intuitive Game Designs
Tuesday, March 24, 2009
Paraphrase
Bootstrap
Process
Automatic
Similarity
Comparison
Discard
Paraphrase
Result
Fails
Threshold
Satisfies
Threshold
Contains New Knowledge
Paraphrase
Result
Tuesday, March 24, 2009
Save for Later
Comparisons
Found No
Similar Response
Comparison to
Other Responses
Found
Similar Response
Contains Previously
Confirmed Knowledge
Online Model
Updating
WordNet-based Similarity
•Deciding on new Paraphrase Acquisition
concept
1
1
tangible thing
intangible thing
2
2
food
emotion
3
3
love
flour-based pastas
4
buckwheat
flour
5
soba
Tuesday, March 24, 2009
4
durum flour
5
spaghetti
2
philosophy
Accounting for User’s Experience
•Users have Different Levels of Knowledge
•Want Smoother Modeling
•Identify User Language Ability
•Knowledge Source Score Component
Tuesday, March 24, 2009
CLARIFY Review
•Direct Editors
•Interactive Games
•Language Ability Discrepancies
•Paraphrase Knowledge Validation
Tuesday, March 24, 2009
CLARIFY:
Human-Powered Training
of SMT Models
Thank You!
D. Scott Appling & Ellen Yi-Luen Do
darren.scott.appling@gatech.edu; ellendo@cc.gatech.edu
School of Interactive Computing
Tuesday, March 24, 2009
Download