Error typology and refinement revisited (from SS03)

Legend:
A = Ariadna
E = Erik
K = Kathrin
MR = manual rules
RL = Rule Learning Module
RR = Rule Refinement Module
TE = Transfer engine

For each error type: the reason or source of the error, who's responsible for the fix, and the refinement or fix itself.

A. No translation
   Reason or source of error: lack of lexical coverage.
   Responsible: RR.
   Refinement or fix: add a lexical entry.
   - Use the user's feedback to determine the translation, i.e. the y-side of it.
   - Need to determine the POS (maybe I can do this separately, using the known POS sequence and MLE).

B. Wrong agreement[1] (num, pers, gen, tense)
   Reason or source of error:
   1. The right word is not in the lexicon. Right now, the TE outputs the first lexical entry with the same form.
   2. An agreement constraint is missing in one or more rules (and probably in the relevant lexical entries).
   Responsible: RR.
   Refinement or fix: in either case, we need to pinpoint the incorrect word[2] (the user will correct it and give us the right form of the word).
   1. Need to build a new lexical entry with the same form but with the right constraints (this will depend on the case; if it's a verb, need to add the constraints of the subject).
   2. Need to find the feature which would have triggered the correct word to come up[3] (hopefully given by the user).
   In both cases, we need to duplicate the appropriate grammar rule(s) and refine them to include the missing agreement constraints.[4]

C. Wrong sense of word
   Reason or source of error:
   1. The right sense of the word is missing in the lexicon.
   2. Lexical entries with different senses for the same src word are underspecified wrt. their application.
   Responsible: RR.
   Refinement or fix:
   1. Add a new lexical entry with that sense as the tgt word. All the features that made the rule apply are added to the entry, but there might be some other features that also need to be added to that entry and maybe to other entries in the sentence (-> refinement).[5]
   2. Need to find the feature which would have triggered the correct sense of the word to come up.[6]

D. Incorrect form of word
   Reason or source of error:
   1. Due to context: phonetic[7], syntactic, semantic[8], selectional restrictions[9] (figure out boundaries).
   2. Lexicalized expression, figure of speech (idiom).[10]
   3. Morphology (overregularization[11]).
   Responsible: RR.
   Refinement or fix:
   1. Need to determine what kind of context is affecting the form of the word (hard to determine). Hypothesize selectional restrictions with the head and generate examples that test all the hypotheses (active learning).[12] Need to refine the grammar rule to include the new constraints.
   2. Need to detect the words that constitute the idiom and add them as an entry to the lexicon (hopefully, the user will have aligned things correctly).
   3. Need to add the missing irregular noun and verb forms to the lexicon; later: get morphological info from the morphological analyzer.

E. Wrong word order
   Reason or source of error: 1. local WWO; 2. constituent-level WWO.
   Responsible: 1. RR; 2. RL.
   Refinement or fix:
   1. Modify the local ordering of POS in a rule.[13]
   2. Harder; experiment with it, probably there is a missing rule.

F. Information missing
   Reason or source of error: 1. a lexical entry might be missing; 2. might indicate a problem with a transfer rule.
   Responsible: RL or RR?
   Refinement or fix: will depend on user feedback (need to see examples).

G. Extra word(s)
   Reason or source of error: the TL uses fewer words to express the concept.
   Responsible: RR.
   Refinement or fix: need to refine the grammar rule to not generate that word at the y-side.

H. Translation error
   Reason or source of error: 1. a rule (syntactic pattern) is missing; 2. other.
   Responsible: RL or RR.[14]
   Refinement or fix: 1. need to add a new rule, and might need to create a POS; 2. RR needs to pass it to the RL as a new training example.

I. Impossible to say
   Reason or source of error: beyond help, too many errors.
   Responsible: RL.

[1] It is not uncommon that agreement constraints change from language to language. An example of this is the agreement constraint on gender (in addition to number and person) between the subject and the verb in Hebrew, which only transfers to Spanish for the past participle forms, and which does not transfer in any case for English. Another example is that adjectives in Semitic languages (Hebrew and Arabic) have to be marked for definiteness (the same way determiners do), whereas in Romance languages adjectives are marked for gender and number (and in English adjectives have no agreement constraints).
[2] Sometimes the user won't know which is the incorrect one. For example, given the English "the white sheep", what's the incorrect word in "las ovejas blanca"?
[3] In general, feature agreement is preferable over feature value specifications, since feature values tend to overspecify. However, I still need to figure out which feature agreement is correct: if we had the source German sentence "Das Haus des Vaters" and the English "the father's houses", the agreement constraint needed would be an XY constraint (X noun NUM = Y noun NUM), as opposed to an agreement constraint on the number of the father and the house (Y noun num != Y gen num).
[4] I'll need to generate and retain more than one hypothesis (at a later stage I will generate discriminatory examples, active learning), edit the rule, and cross-validate with attached examples, if any. I would be editing both learned rules (which have a history and a cross-validation set attached) and manual rules (which might have a cross-validation set attached).
[5] For example "me senté en el banco" -> "I sat on the bank": bank -> bench, which will have the same features (POS = N, num = sg, etc.) so that the same rule (NP -> Det N_SG) can apply. But the entry for "sat" needs to be refined to prefer physical objects as its direct object, and "bench" needs to be refined as being of type "physical object". Arguably, "me senté en el banco" could mean "I sat in a bank", but it is a less preferred reading.
[6] The same considerations as in footnote [3] apply: feature agreement is preferable over feature value specifications, since feature values tend to overspecify; I still need to figure out which feature agreement is correct.
[7] Example: "We had an sixty meter rope and two ninety meter ones" should be "We had a sixty meter rope and two ninety meter ones".
[8] Example: "all climber was patiently waiting for her turn" should be "every climber was patiently waiting for her turn". Another example of this would be the mass vs. countable N distinction (need to refine the lexicon and add a mass/count feature so that the appropriate determiners and quantifiers can be used with each word).
[9] Example: "It was very freezing" should be "It was freezing" or "It was very cold".
[10] Example: "John kicked the bucket" -> "Juan se murió", and there is no "cubo" anywhere in the source sentence.
[11] This is just one reason, among many possible other ones.
[12] If we have 3 hypotheses, we can generate 3 distinct semantic features automatically (XXX, YYY, and ZZZ) and test them by generating new examples and having the user tell us which are right/wrong. We need part of the lexicon to carry semantic features, so that we can generate these examples automatically. If the semantic constraint that seems to hold is ZZZ, say, the grammar writer can look at the examples later and give it a more mnemonic name (if "llevar" is the head, the N in the object NP has to be a physical object, as opposed to an institution or a geographic place, and thus ZZZ -> physical object).

Should I map D to the TCTool error choices?
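The typology above could be mirrored in code as a small routing table that sends each user-flagged error to the module responsible for the fix. A hypothetical sketch: the type codes and responsibilities come from the table, but the data structure and function names are mine.

```python
# Hypothetical routing table for the error typology above.
# RR = Rule Refinement Module, RL = Rule Learning Module (see legend).
ERROR_TYPES = {
    "A": ("no translation (lack of lexical coverage)", {"RR"}),
    "B": ("wrong agreement (num, pers, gen, tense)", {"RR"}),
    "C": ("wrong sense of word", {"RR"}),
    "D": ("incorrect form of word", {"RR"}),
    "E": ("wrong word order", {"RR", "RL"}),  # E.1 -> RR, E.2 -> RL
    "F": ("information missing", {"RR", "RL"}),
    "G": ("extra word(s)", {"RR"}),
    "H": ("translation error", {"RR", "RL"}),
    "I": ("impossible to say (too many errors)", {"RL"}),
}

def responsible(error_code: str) -> set:
    """Return the module(s) that should handle a user-flagged error."""
    base = error_code.split(".")[0]  # "B.2" -> "B"
    return ERROR_TYPES[base][1]

print(responsible("B.2"))  # {'RR'}
```

Such a table would also be a natural place to hang the mapping onto the TCTool error choices once that question is settled.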
[13] In cases like "el amarillo banco", we just need to change DET ADJ N to DET N ADJ. However, when we get more examples, we might run into "el gran banco", which is correct and which should cause the RR to backtrack, recover the original rule (the last rule in the history of the DET N ADJ rule), and specify some constraints. To determine which type of adjectives should be further specified, we can again present the user with a few automatically generated cases and then see what the most common order is (the default order) and mark the "marked" adjectives; in this case, "bueno" should be marked with a "pre-mod" feature.
[14] Depending on whether it's a modification of an existing pattern or a completely new pattern.
[15] Next to each refinement, the numbers of the sentences in input-xfer.out.debug that would be affected by it are given.

NECESSARY REFINEMENTS[15]

GRAMMAR

Not fixed in grammar2.trf:
+ need to create a rule for clitic pronouns as obj, where they move in front of the verb in Spanish: 6,8,9,13,20,21,22,25,26,31
type of error: C.1 (tu->te (+tonic)) + E.1
+ prep + tonic clitic pronoun: 7,23
type of error: C.1 (tu->te (+tonic)) + H.1
+ duplicate clitic in front of verb: 25
type of error: C.1 (him->lo (+clitic)) + E.1 (or maybe H.1)
+ agreement between subj and adj predicate (create new rule w. appropriate
constraint): 4,29,30
type of error: B.2 (missing agreement constraint)
+ add obj agreement for "gustar": 5 [+lexicon refinement]
type of error: B.2.
+ refine vp(aux v) rule to deal with future: 25
type of error: B.2.
+ add governed prepositions: 16,31(play), 25(help) [+lexicon refinement]
?add lexical entry for "jugar a" + det N (al, a la). Need to do some kind of
postprocessing to get a+el->al
type of error: D.1.
+ wh-questions coverage: 19
type of error: E.1. and/or H.1.
+ modal questions (inverted subject): 26
type of error: E.1. and/or H.1.
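The a+el->al postprocessing mentioned for the "jugar a" entry can be a trivial string pass over the generated target side. A sketch (the list covers the two standard Spanish contractions, a+el->al and de+el->del; the function name is mine):

```python
import re

# Obligatory Spanish contractions: "a el" -> "al", "de el" -> "del".
# Lowercase patterns deliberately skip capitalized "El" in proper
# nouns (e.g. "de El Salvador" must not contract).
CONTRACTIONS = [(r"\ba el\b", "al"), (r"\bde el\b", "del")]

def contract(sentence: str) -> str:
    """Postprocess generated Spanish to apply obligatory contractions."""
    for pattern, repl in CONTRACTIONS:
        sentence = re.sub(pattern, repl, sentence)
    return sentence

print(contract("Juan juega a el futbol"))  # Juan juega al futbol
```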
Fixed in grammar2.trf:
+ "it" can only be translated into "lo/la" when it's accusative
added: {S,1} ((y1 case) = nom)
; subj it !-> la/lo
LEXICON
+ the right form of the verb is missing from the lexicon: (1,2),10,13,17,18,21,27
-> ! need to detect that it's a lexical gap as opposed to agr-unification failure
type of error: B.1.
+ OOVW (out-of-vocabulary word): 23(girl; -> Det-N agreement error)
type of error: A
+ add a different sense(translation) for an existing entry:
15(to->para +que,a)
23(in->en +dentro)
30(look->parecer +mirar)
type of error: C.1.
+ refine lexical entry: 24(would like TO),29(lexicalized entry "there is/are" -> hay)
type of error: D.2.
+ ser/estar difference (lexical selection?): (4),11,19,29
type of error: B.2. and/or D.1.
+ lexical selection: 30( shining brightly ->! brillaba brillantemente -> vivamente,
mucho)
type of error: D.1.
We can try to group these sources of errors into larger problem classes. Here is a first sketch of the different cases that users will encounter, and of the strategy I might need to adopt to refine the rules:
1. Detection (simplest case)
When there is more than one translation, have the user pick the ones that are correct (for each translation, the user can assign a binary label: wrong/correct), or have them set preferences (this one is the best, these two are ok but not great, etc.).
If the difference between a correct and an incorrect translation is that one or more
different words were used, ultimately, I need to determine whether they have the same
root, and the morpheme was wrong (conjugation problem), or whether the root itself
was wrong. For now, while there is no morphology module, I can do a letter
comparison, measure the distance, or see if they have the same affixes (prefix/suffix).
When we have morphology incorporated, need to detect if a word is a morphological
variant or a different root (hard for irregular verbs).
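The letter comparison described above can be approximated, while there is no morphology module, by edit distance plus a shared-prefix check. A rough sketch; the thresholds are ad hoc assumptions of mine and would need tuning, and irregular forms will defeat it, as noted:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def common_prefix_len(a: str, b: str) -> int:
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def same_root_guess(w1: str, w2: str) -> bool:
    """Heuristic: morphological variants share a long prefix and are
    close in edit distance; different roots usually are not."""
    prefix = common_prefix_len(w1, w2)
    dist = edit_distance(w1, w2)
    return prefix >= 3 and dist <= max(len(w1), len(w2)) // 2

print(same_root_guess("blanca", "blancas"))  # True  (same root)
print(same_root_guess("blanca", "negra"))    # False (different root)
```

Suffixing languages favor the prefix check; a prefixing language would want the mirror-image suffix check instead.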
2. Lexical problems
- lexical ambiguity
- wrong sense of the word has been translated (banco -> bench, bank)
a. if missing in the lexicon, add the other sense to the dictionary
b. set up a strategy to determine which one should be the default
- I could interact with the user to determine which one should
be the default, or
- I could look at head word (V or N) and introduce semantic
constraints (selectional restrictions). This is harder, since I need
to elicit the semantic constraints from user, and Erik might
need to modify the Transfer engine to be able to deal with
semantic constraints.
- poor coverage (OOV word -> need to augment bilingual lexicon)
- wrong register (add a formality constraint?)
- etc.
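The head-word strategy for picking a default sense could look like this sketch, built on footnote [5]'s banco example. The lexicon fragment, feature names, and selectional-restriction table are all my own illustrative assumptions, not the system's actual format:

```python
# Hypothetical lexicon fragment: one source word, several target
# senses, each tagged with a semantic type (feature names are mine).
LEXICON = {
    "banco": [
        {"tgt": "bank",  "sem": "institution"},
        {"tgt": "bench", "sem": "physical-object"},
    ],
}

# Selectional restrictions of head verbs on their object (also mine).
HEAD_PREFERS = {"sit": "physical-object", "rob": "institution"}

def pick_sense(src_word: str, head_verb: str) -> str:
    """Pick the target sense whose semantic type satisfies the head's
    selectional restriction; fall back to the first (default) sense."""
    wanted = HEAD_PREFERS.get(head_verb)
    for sense in LEXICON[src_word]:
        if sense["sem"] == wanted:
            return sense["tgt"]
    return LEXICON[src_word][0]["tgt"]

print(pick_sense("banco", "sit"))  # bench
print(pick_sense("banco", "rob"))  # bank
```

This is exactly where the Transfer engine might need modifying, since the semantic constraints have to be elicited from the user and then enforced at rule-application time.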
3. Structural problems
The source of the problem can be in the automatically learned rules or in the manually
written rules, the different kinds of rules need to be tagged in the Transfer engine.
Either way, problems that originate in the rules of the MTS should result in at least 2
different cases of learning.
In the situation where Kathrin’s RL module generalizes from S1-T1 and S2-T2 and
obtains a rule R, given a new sentence S3, that we think is similar to S1 and S2, we
run it through R and obtain T3’ (instead of T3), we anticipate the learning to happen
in one of these 2 ways (note that this is also valid if R is manually written):
1. Refining the rule R, so that it successfully translates S1, S2 and S3.
Examples:
a. “la casa rojo -> la casa roja”, and probably the permanent vs temporary
Quechua example given at the beginning of this section.
RR needs to add a new constraint (gender{masc,fem,neut}, perm{+,-})
b. different grammatical relations (GR) might be translated differently (in
German object and subject have different cases)
2. Bifurcating or splitting the rule R, so that R translates S1 and S2 (as before)
and a new rule R’ translates S3. We might need to restrict R so that it doesn’t apply
to S3. Example:
a. “casa grande/azul”, but “*casa gran” -> “gran casa” need a new rule
(R’) that applies only to pre-modifier adjectives (premod +). Note that
we also need to restrict the application of R to (premod -) adjectives
only, so that it doesn’t apply to S3.
When we encounter a new adjective with the same behaviour, we
know we need to tag it with the feature (premod +).
The 2nd case will probably apply to all exception rules [check Marisa’s LREC 02
paper on how to deal with diminutives in Spanish].
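The bifurcation case can be sketched concretely: R keeps the N ADJ order but is restricted to (premod -) adjectives, while R' produces ADJ N for (premod +) ones like "gran". The rule encoding below is my own toy stand-in, not the transfer formalism:

```python
# Hypothetical sketch of rule bifurcation: the original NP rule R is
# restricted to post-modifier adjectives (premod -), and the new rule
# R' handles pre-modifier ones like "gran" (premod +).
ADJ_FEATURES = {"grande": {"premod": False},
                "azul":   {"premod": False},
                "gran":   {"premod": True}}

def rule_R(noun, adj):            # N ADJ order; applies iff premod -
    if not ADJ_FEATURES[adj]["premod"]:
        return f"{noun} {adj}"
    return None

def rule_R_prime(noun, adj):      # ADJ N order; applies iff premod +
    if ADJ_FEATURES[adj]["premod"]:
        return f"{adj} {noun}"
    return None

def translate_np(noun, adj):
    """Try each rule in turn; the feature constraints make them
    mutually exclusive, so R never fires on a (premod +) adjective."""
    for rule in (rule_R, rule_R_prime):
        out = rule(noun, adj)
        if out is not None:
            return out

print(translate_np("casa", "azul"))  # casa azul
print(translate_np("casa", "gran"))  # gran casa
```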
On the other hand, every time that a rule is applied correctly to successfully translate a
sentence (once we pick the correct translation from all the alternatives), we should
probably increase its weight.
What I need to do now is find a way to know which one of the 2 cases needs to apply
based on the feedback (correction) the bilingual informant gives (through the
TCTool); and whether there are any other learning cases that do not fall in any of the
two previously mentioned categories.
To be able to refine rules when there is a structural problem, I need Kathrin to provide
me with the history of the generalization of the rules. For the manual rules, and
refined rules, maybe we can put them into the same hierarchical lattice, as if they
were seed rules.
Kathrin’s Rule Learning module makes safe generalizations, and when it is not sure, it
makes a hypothesis which gets validated or not through the TCTool.
At run time, Kathrin needs to pass me all the rules applied in the translation, so that I
can interact with the user throughout the process.
And when I change a rule, I need to keep track of the history of the rule (as a linked
list), so we need to modify Kathrin's data structures to support linked lists.
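The linked-list history could be as simple as each rule version pointing back at the version it refined, which is also what lets the RR backtrack and recover the original rule when a refinement turns out to be wrong. A minimal sketch (class and field names are mine):

```python
# Hypothetical sketch: each refined rule keeps a link to the version
# it replaced, so the RR module can backtrack through the history.
class RuleVersion:
    def __init__(self, body, previous=None):
        self.body = body          # the rule itself (e.g. its constraints)
        self.previous = previous  # link to the version this one refined

    def refine(self, new_body):
        """Create a new version whose history points back to self."""
        return RuleVersion(new_body, previous=self)

    def history(self):
        """All versions, newest first."""
        node, versions = self, []
        while node is not None:
            versions.append(node.body)
            node = node.previous
        return versions

r0 = RuleVersion("NP -> DET N ADJ")
r1 = r0.refine("NP -> DET ADJ N")  # a refinement step
print(r1.history())      # ['NP -> DET ADJ N', 'NP -> DET N ADJ']
print(r1.previous.body)  # backtracking recovers the original rule
```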
In sum, what I need to do is a diagnosis of what kind of errors there are for each kind
of rules (overgeneralization, missing feature, etc.). If a feature is missing from our
learned grammar rules, hopefully we can learn it through the RR module, and this
should effectively place the refined rule below the seeded rule space.