Non-deterministic Tree Automata Models for Statistical Machine

advertisement
Chiara Moraglia



Branch of computational linguistics
The study of mathematical structures and
methods that pertain to linguistics.
Combines aspects of computer science,
mathematics and linguistics.


Words: anchor
Sentences : Cleaning fluid can be dangerous.
Claire kicked the bucket.
Machine translation that keeps in mind the
problem of ambiguity.
 A sequence of reordering decisions and word
translation decisions, each with a probability
assigned based upon linguistic data.
 2 main reordering models:
1) phrase-based models: re-align phrases
(strings of words)
2) syntax-based models: can use tree
transducers to permute trees (syntactic
structure) with words as leaves

http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf

Generalize the work on tree automata and
tree transductions to non-deterministic
models and explore the equivalence
properties that were proven to hold in the
deterministic case.

A hierarchical collection of labeled nodes
connected by edges, starting at a root node
https://upload.wikimedia.org/wikipedia/commons/f/f7/Binary_tree.svg
A tree transducer is a 5-tuple <F,H,Q, qin,R> where
i) F is a functional signature of input symbols
ii) H is a functional signature of output symbols
iii) Q is a finite set of states
iv) qin∈Q is the initial state
v) R is a finite set of rules <q, φ> ζ where ζ is
1) <q’, ψ>
2) h(< q1, ψ1>,…,< qk, ψk>)
Φ gives the conditions the current node must satisfy,
Ψ says which node to go to from the current node

(Courcelle & Engelfriet, 2012)


A functional signature is a set of function
symbols, each with an associated arity ρ(f)
(the number of arguments the function takes
on)
E.g. f(x), ρ(f)=1
h(x,y,z), ρ(h)=3
(Courcelle & Engelfriet, 2012)
i) F={f,a,b} where ρ(f)=2, ρ(a)=ρ(b)=0
ii) H= {a,b,ε} where ρ(a)=ρ(b)= 1, ρ(ε)=0
iii) Q={qin,q1,q2}
iv) qin∈Q is the initial state
v) R=
1) <qin, labf(x1)>
<qin, down1>
2) <qin, labx(x1)^bri(x1)>
x(< qi, up>)
3) <q1, True>
< qin, down2 >
4) <q2, bri(x1)>
< qi, up >
5) <q2, rt(x1)>
ε
6) <qin, labx(x1)^rt(x1)>
x(< q2, stay>)
(Courcelle & Engelfriet, 2012)
a or b(<a or b &1st child, >)
q1
<f,
>
<True, >
qin
<1st child, >
<a or b & root, stay>
<2nd child, >
a or b (<a or b & 2nd child, >)
q2
ε
<root, stay>
input tree
f
a
output tree
a
b
b
ε



A tree transducer is deterministic if the state
and the position in the tree uniquely
determine what rule should be applied
Otherwise, it is non-deterministic
E.g. <qin, labf(x1)> <qin, down>
<qin, labf(x1)> <q1, up>
g(<f, >)
qin
<a, stay>
a
h(<f, >)
Modified from (Fülöp, 1981)
input
possible outputs
f
g
h
g
h
f
g
h
h
g
a
a
a
a
a


The possible output trees would be assigned
probabilities
Then the words would be translated into the
target language
Courcelle, B., & Engelfriet, J. (2012). Graph Structure and
Monadic Second-Order Logic. Cambridge: University Press.
Fülöp. (1981). On attributed tree transducers. Acta Cybernetica,
5, p.261-279.
Knight, K., & Koehn, P. What’s new in statistical machine
translation [PDF document]. Retrieved from
http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Tree (data structure). (n.d.). Retrieved from
http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Download