>> Sumit Gulwani: Hello, everyone. So thanks for coming to the talk. It is
my great pleasure to introduce Mehdi Manshadi, who is finishing up his Ph.D. at
University of Rochester. Before that, Mehdi went to the University of
Technology in Tehran, in Iran, and his primary area of interest is deep
understanding of natural language. And these are precisely the kinds of
techniques which we feel we need for building good natural language front-end
interfaces for end-user programming in intelligent computing systems.
So Mehdi is also interested in machine learning and human computer interaction
and today, he's going to talk about dealing with quantifier scope ambiguity in
computational linguistics. And I think this is -- this should be of joint
interest to both people in programming languages and natural language
processing groups.
So over to you, Mehdi.
>> Mehdi Hafezi Manshadi: Thank you very much. Hello, everybody. So today,
I'm going to talk about dealing with quantifier scope ambiguity in natural
language processing. I'm going to talk about three different topics in
dealing with quantifier scope ambiguity: scope underspecification, building a
corpus, and finally doing automatic scope disambiguation using that corpus.
So here is an example of quantifier scope ambiguity. Actually, it happens
that quantifier scope ambiguity is one of the most challenging problems in
natural language understanding. Even when there was a lot of optimism about,
you know, deep language understanding, like two decades ago when actually
people started doing semantics, they soon realized that quantifier scope
ambiguity is really hard. And as I'll discuss later, they actually decided
not to do it in the first place.
So as you see these two sentences, I have them here and this is slide two. So
each of them has two readings, right. And the joke comes from the fact that
the person interprets the sentence in the less plausible reading of the
scoping, the less preferred reading. So in one of the readings of the first
sentence, every has wide scope; in the other one, one has wide scope. And the
same for the second sentence: one has wide scope or not has wide scope.
I prefer to represent, you know, two readings in terms of tree structures
because we're going to use the tree structures a lot during the rest of my
talk. So, you know, these are the first two readings, the two readings for
the first sentence. As you can see, in one of them, every out-scopes one, and
in the other, one out-scopes every. And the same for not: you have one
out-scoping not, and not out-scoping one.
So, but quantifiers and negation are not the only sources of scope ambiguity.
There are a lot of other sources of scope ambiguity in natural language. For
example, plurals carry a lot of scope ambiguity, and actually, they're one of
the most challenging parts of dealing with scope ambiguity in natural
language. Like if you have, you know, all the students met the faculty
candidate, the question is whether there was just one meeting or there were,
you know, many meetings, and every person individually met the candidate. We
have modal operators, we have sentential adverbials, and we have frequency
adverbials. So these are some of the things that actually carry scope
ambiguity when you deal with natural language.
So how do we deal with quantifier scope ambiguity in computational linguistics?
The problem is that, you know, as I said, from the very first point, everybody
realized that quantifier scope ambiguity is really hard. So basically, they
decided not to do it, as I'll explain how they actually got around it later.
But another good point is that for many tasks, actually, you can, you know,
have some rough representation and work with that, and it's not that critical
to get the right scoping, you know. Maybe an ambiguous interpretation would
be enough to do the task.
>>: Legal context is what you --
>> Mehdi Hafezi Manshadi: Yes, I'll get to that. So the other reason that
there hasn't been much work on quantifier scope ambiguity is that the focus
in the last two decades, as everybody knows, has been on shallow text
processing. But another reason is that only a few restricted scope-annotated
corpora are available, and that's why the statistical community wasn't very
excited to work on this problem.
But finally, and most importantly, quantifier scope ambiguity has been
misunderstood in computational linguistics, as I'm going to discuss it in my
next slide.
>>: [indiscernible] English than in other languages?
>> Mehdi Hafezi Manshadi: No, it's almost the same in all languages, yes.
Yes. Yes, it's pretty much universal. So because I work on quantifier scope
ambiguity and stuff and I go to conferences, I often hear these kinds of
sentences from people. Like, you know, we searched through the whole Brown
corpus and we found only two cases of ambiguity. So that's exactly the
misunderstanding I'm talking about. And let's see what I mean by
misunderstanding.
The point is that most people, actually, when you talk about quantifier scope
ambiguity, they, you know, they think about every NP, some NP, explicit
quantification. But there are many other noun phrases that actually carry
scope ambiguity. For example, even definite NPs carry scope ambiguity. Most
people believe that definites always have the widest scope. But that's not
true. For example, if you look at the first sentence, we paid 2,000 dollars
to the father of every family. Now, you know, the 2,000 there is in the scope
of every family, right. Even though you may find contexts where actually it's
the other way around. If you Google the father of every family, you will find
a lot of really [indiscernible] people talk about the father of every family
when they mean God.
So basically, you know, definites can carry a scope ambiguity. Even bare
nouns, like order in sort the names in alphabetical order. Or more
importantly, conjunctions, which happen a lot. Like if you have Canada and
Australia have a universal healthcare system. Here you have an ambiguity over
whether it's the same universal healthcare system or different ones: you have
a out-scoping the conjunction, or the conjunction out-scoping a.
>>: Where is the ambiguity in the order, for example?
>> Mehdi Hafezi Manshadi: Sort the names in alphabetical order. So basically,
think about what it is; like, you know, you can have it in all different
orders, okay. So what I'm saying here is that there may not be a true
ambiguity in this sentence. That's exactly what I want to talk about. There
may not be a true ambiguity, but for the machine, there is an ambiguity,
because order can have ambiguity.
For example, think about sort the names in every alphabetical order. You
still have an ambiguity whether order out-scopes names or names out-scope
order.
That's not a true ambiguity, because as human beings, we know that there is
one order for all the names. But when it comes to machine understanding,
because we don't have that world knowledge, okay, then there is this
ambiguity whether order out-scopes names or names out-scopes order. But I
will get to that very shortly.
So theoretical semanticists are mainly interested in examples with true
ambiguity. That's exactly my point. You know, like everybody likes two songs
from this album: whether the two songs are the same or different. Or three
men carried two desks. These kinds of examples actually throw off a lot of
computational linguists, and people say we don't care, actually, whether
three men carried the same two desks or different ones. Usually, in the
context, you know what you're talking about; probably there are just two
desks and all you care about is that the two desks have been carried, and you
don't care whether the three men all helped or everybody individually did
that.
So that's why sometimes this problem has thrown people off. But actually,
when you are talking about computational linguistics, every sentence that has
more than one scope-bearing element carries a scope ambiguity. And that's
exactly my point.
>>: The whole point of natural language is that you don't have to specify
that carefully what this person would carry. So why is this a difficult
problem?
>> Mehdi Hafezi Manshadi: Why is it an important problem?
>>: Some of these people, when they say three men carried two desks, that's
what they mean.
>> Mehdi Hafezi Manshadi: Exactly. So that's one of the reasons that we --
>>: People care about the --
>> Mehdi Hafezi Manshadi: It depends on the task, okay? Here is exactly the
answer to your question and your previous question. So quantifier scope
ambiguity -- how critical quantifier scope ambiguity is depends on the domain
that you're working in. And that's exactly another point that I want to make.
So if you are talking about legal documents, as you said before, if you are
talking about scientific text, physics, math and this kind of stuff, or if
you're talking about natural language programming and descriptions of
programs in natural language, then you have to be precise. No ambiguity is
tolerated, for example, in natural language programming, because you are
converting this natural language into a formal description, programming code,
and no ambiguity can be tolerated there.
So yes, it definitely depends on the domain, but also on the task too. For
example, you know, maybe you are able to do some tasks even if the domain has
a lot of scope interaction; you can do them with that ambiguous
representation. But some tasks may actually require quantifier scope
disambiguation, like entailment or natural language programming, or,
depending on the type of questions, question answering may also be one of the
tasks that actually requires that understanding, that disambiguation.
>>: [inaudible] some of us are looking into it.
>>: I think we [inaudible].
>>: Oh, I see.
>>: And also very interested in allowing people [indiscernible] natural
language.
>>: [indiscernible].
>>: So I'm not really sure about all that, but there's a new line of work that
is coming up that is called programs [indiscernible] for Windows specifications
and examples of natural language are [indiscernible] that can actually be
programmed. And [indiscernible] specifications.
>> Mehdi Hafezi Manshadi: So since you asked Sumit whether people are working
on this, I assumed you meant at Microsoft. If you mean whether people are
working on it in general, the answer is yes. I'm actually one of those people
who are working on this. From quantifier scope ambiguity, I got into this
domain. And once I got into the domain, I said okay, let's go one step
further and see if we can actually do natural language programming.
>>: Are there groups of people doing natural --
>> Mehdi Hafezi Manshadi: There aren't many people. I'll get there when I
talk about using natural language, or using programming by example, to do
quantifier scope disambiguation and this kind of stuff; I will talk about
natural language programming and why it has not attracted much attention.
So what do we do with quantifier scope ambiguity in natural language? Well,
as I said, what many people actually decided to do, like [indiscernible], was
not to do it, because it was really hard, and they decided to actually leave
the scope underspecified. And in answer to your question, maybe for many
tasks you don't even need to do scope disambiguation.
For example, if you have there is one soulmate for every person, you
represent the semantics using something like this, where actually, as you can
see, the body of the quantifiers is left underspecified. So you don't talk
about, basically, which quantifier has wider scope; you just leave the
scoping underspecified. So either reading can be interpreted from this
sentence.
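To make the idea concrete, here is a minimal Python sketch of such an underspecified form for "there is one soulmate for every person"; the class and predicate names are purely illustrative, not from any particular system:

```python
# Illustrative sketch of an underspecified logical form: each
# quantifier keeps a "hole" for its body, so neither scope order
# is committed to yet. All names here are invented for the example.

from dataclasses import dataclass

@dataclass
class Quantifier:
    name: str            # e.g. "every", "one"
    var: str             # the bound variable
    restriction: str     # predicate filling the restriction
    body: object = None  # hole: left unfilled (underspecified)

every = Quantifier("every", "x", "person(x)")
one = Quantifier("one", "y", "soulmate(y)")
core = "soulmate_of(x, y)"  # the scope-neutral core predication

# Both readings are obtained later by plugging the core (and the
# other quantifier) into the holes, in either order.
underspecified = {"quantifiers": [every, one], "core": core}
```

Either reading is produced only when the holes are filled; until then the representation commits to nothing.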
But also, some people do scope disambiguation. There are two ways, actually,
to do scope disambiguation: a heuristic-based approach and a corpus-based
approach. I'll talk about both of them very soon.
Scope underspecification. So actually, if you look at current deep natural
language understanding systems, most of them use this underspecification
thing. For example, TRIPS, which we have in Rochester; the Boxer system from
Bos at the University of Rome in Italy; and the ERG, which is one of the
biggest resources for English grammar, which is basically HPSG-based,
head-driven phrase structure grammar based, and uses Minimal Recursion
Semantics. They all use underspecified semantic representations. So
basically, that's the most popular way to deal with quantifier scope
ambiguity, because actually, you know, in many tasks you can do some stuff
without actually specifying the scope precisely.
>>: So basically, if you use this, you list all the possible ambiguous
interpretation?
>> Mehdi Hafezi Manshadi: Yes, there are ways to do that. First of all, you
know, there is work on actually doing inference with underspecified semantic
representations. But one of the things that people do is use the weakest
reading, okay. There's this concept, you know, of the weakest reading of a
sentence, when you have many possible readings, okay. So you actually take
the readings that are weakest, that can be entailed from some reading but
cannot entail any other reading.
So if you entail something from that weakest reading, that can be entailed from
the other readings, definitely, but not vice versa. So any entailment that you
make from the weakest reading is sound. But definitely, it's not complete
because you're missing some information. So there are ways, actually, to do
entailment without doing the scope disambiguation.
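The weakest-reading idea can be sketched as follows: among candidate readings, keep those entailed by every other reading, so that any inference drawn from them is sound, though not complete. The entailment relation here is supplied as data, purely for illustration:

```python
# Sketch of selecting "weakest" readings: a reading is weakest if
# every other reading entails it. The entailment pairs below are
# hand-supplied for this toy example, not computed by a theorem prover.

def weakest_readings(readings, entails):
    """entails is a set of (a, b) pairs meaning 'a entails b'."""
    return [r for r in readings
            if all((other, r) in entails
                   for other in readings if other != r)]

readings = ["every>one", "one>every"]
# "one wide scope" (one soulmate for everybody) entails the weaker
# "every wide scope" reading (a possibly different soulmate each):
entails = {("one>every", "every>one")}

weakest = weakest_readings(readings, entails)  # -> ["every>one"]
```

Any conclusion drawn from `"every>one"` here follows from both readings, which is exactly the soundness property described above.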
So early frameworks were very simple, something like this. But recent
formalisms actually use constraint-based frameworks. What does it mean,
constraint-based frameworks? I'm going to show you exactly how it works. So
let's have a quick look again at the kind of readings that we had before for
the two sentences that we were discussing.
So we actually show quantifiers and scope operators in this way. For every
quantifier, you have the restriction of the quantifier, and you don't specify
what predicate is used in the restriction and in the body.
Now, you have these predicates as tree nodes, which we call labels, and you
have holes, and you just plug these labels into holes to build one reading.
But obviously, not every plugging of labels into holes is a valid reading.
For example, you cannot plug person of X into the body of every, because
person of X is actually the restriction of every, right. So there are some
constraints. And this is exactly what we're talking about when we say
constraint-based frameworks.
So you have some constraints. These are dominance constraints. So, for
example, they say that in the final solution, whatever label is plugged into
this hole must out-scope person of X. It must be above person of X; it must
dominate person of X. And the same thing for soulmate. And these two
actually are binding constraints, because every variable has to be in the
scope of its quantifier. So this predicate over X and Y has to be in the
scope of both every and one.
And the same thing, a very similar thing, for not. So now, what happens? For
example, this is one reading that you can build satisfying all the
constraints. As you can see, we have plugged person into this hole, soulmate
into this hole, the predicate over Y into the body of every, and finally the
core predicate into the body of one. And that's going to be this reading,
where every has the wider scope over one. And, you know, there is one more
way of plugging labels into holes where you satisfy all the constraints, and
that is the other reading.
So now we have the mathematical model, right. This is exactly how we model it
in our framework. So now you have these tree structures where some of the
leaf nodes are holes, and you want to plug the root of some tree into the
hole of another tree, and at the end you want to build one tree structure, a
single tree structure which has no hole in it, all the holes are filled, and
which satisfies all the constraints that are represented in the
representation. So now this is a mathematical model, and we want to solve
that.
So there are two algorithmic problems that need to be solved. One of them is
the satisfiability problem: whether there actually is any reading that
satisfies all these constraints. And if there is, let's enumerate all the
possible readings, for example. Both problems are NP-complete. It's very
intuitive why they are NP-complete.
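As a rough illustration of the enumeration problem (not the tractable algorithm the talk develops later), one can treat each candidate reading as a linear order of quantifiers and each dominance constraint as "a must out-scope b"; brute-force filtering of permutations is exponential in the number of quantifiers, which is why tractable subsets matter. The quantifier names and constraints below are illustrative:

```python
# Brute-force sketch: enumerate linear orders of quantifiers and keep
# those consistent with the out-scoping constraints. Exponential in
# the number of quantifiers -- the point the talk is making.

from itertools import permutations

def readings(quantifiers, outscopes):
    """Yield every linear order consistent with the constraints."""
    for order in permutations(quantifiers):
        pos = {q: i for i, q in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in outscopes):
            yield order

quants = ["every", "one", "not"]
constraints = [("not", "one")]  # e.g. force "not" to out-scope "one"

valid = list(readings(quants, constraints))
# One constraint over three quantifiers leaves 3 of the 6 orders.
```

With ten noun phrases, as in the long sentences mentioned later, the unconstrained space is already 10! orders, so filtering permutations is hopeless in general.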
So people tried to find a tractable subset of them. The first tractable
subset was dominance nets. But it has some limitations. They claimed that it
covers the semantic representation of every coherent sentence, but that was
not true. Later, sentences were found that are not covered by this tractable
subset. So that's why we started working on that, and we tried to extend that
framework so that it covers those sentences.
So -- sure, go ahead.
>>: Understanding the complexity, NP in the size of what?
>> Mehdi Hafezi Manshadi: Let's say in the number of quantifiers.
>>: The number of quantifiers?
>> Mehdi Hafezi Manshadi: Well, actually, the number of quantifiers means the
number of noun phrases in the sentence.
>>: This is a very small number. So why is this a problem, I guess? I don't
understand.
>> Mehdi Hafezi Manshadi: No, it's not a small number, necessarily. If you
look at, let's say, sentences in the [indiscernible], we can easily have like
ten noun phrases. Remember that you don't have to have explicit
quantification: every noun phrase, definite, indefinite, bare noun, can carry
a scope ambiguity. And actually, not every noun phrase introduces only one
element. You have noun phrases that introduce more than one element in the
domain of discourse.
Like plurals, for example. Why do you have collectivity versus
distributivity? Because plurals, when you're talking about plurals, you're
talking about a set. Let's say all the students, okay. You're talking about
a set. But at the same time, you have a universally quantified variable over
the elements in the set. So that's why, when you say all the students met
the teacher, there are two possible readings: whether, you know, the whole
set as an entity is the argument, or actually there's a universal
quantification over every element in the set and there is an individual
meeting for every element.
So there are noun phrases that introduce more than one element in the domain
of discourse. So there actually is a problem in general.
>>: It's still a small number, even counting noun phrases, and it seems like
not everything is going to have a scope problem. Are there some statistics
about, on average, how many things need to be resolved in a sentence?
>> Mehdi Hafezi Manshadi: Well, obviously, it depends on the corpus. But
remember that the underspecified representation is -- this is not the only
problem that people have looked into, okay. And it has merit and it has
value from a linguistic point of view, and that's exactly what I want to get
into.
Even if you don't care how long this takes, there are still satisfiability
problems and enumeration problems that you want to solve. The other thing,
for example, that I want to get into is that these underspecification
frameworks, and the study of how you can solve these underspecified
representations in polynomial time, have resulted in defining a notion of
coherence for us.
So basically, this has helped us to develop a mathematical notion for the
coherence of a sentence. We've actually converted the problem of, you know,
predicting the scope ambiguity to predicting a linear order of quantifiers.
In what you have seen so far, you have these tree structures for scope
disambiguation that you have to predict. But actually, using a notion of
coherence, you can reduce this problem to finding the linear order of
quantifiers.
So basically, my focus here is not that this NP-complete problem is a big
thing. Maybe you're right, maybe actually who cares for, you know, two to
the ten, you know, readings, and actually we can do that and we don't care.
But that's not the point I want to make in this talk.
>>: So let me see if I got it, and I'll do it by analogy. So with something
like the traveling salesman problem. That's an NP-complete problem.
>> Mehdi Hafezi Manshadi: Yes.
>>: And there, there are lots and lots of possible solutions to enumerate.
But if you have a particular solution, it's very easy to evaluate how good it
is.
>> Mehdi Hafezi Manshadi: Yes.
>>: Just adding up the distances is trivial and we understand how to do it.
Here, apriori, it seems to me that the hard part is not enumerating the
possibilities, but evaluating particular possibility, how good it is.
>> Mehdi Hafezi Manshadi: Yes.
>>: And what you're saying is that, in fact, those two problems are linked
together and you're going to have a clever method for evaluating how good
something is that, in fact, ends up constraining the search spaces; is that
right?
>> Mehdi Hafezi Manshadi: Yes, exactly. And that's what I'm going to get at.
And if you still have question after I go over this, I would be more than happy
to answer, okay?
So okay. This is our formulation. Let's just look at sentences with only
noun phrase quantifiers, no other scopal operators, okay? So if you have two
quantifiers, it's easy to see there are four possible tree structures. Not
readings, I mean tree structures. Some of them are valid, some of them are
not. So this is a notion that we define, actually.
Heart-connected underspecification graphs, okay.
So here, let's look at this sentence: every professor, whom somebody knew a
child of, showed up. And this is the underspecified representation. It's
easy to see why. All you have is restriction predicates, the [indiscernible]
hole of the quantifier, the constraints, and also binding constraints.
That's all you have. Nothing magic here. Now, what we do is we collapse
every quantifier and restriction into one node, okay, as you see in this
figure. So only the interconnecting edges remain, and you get this graph,
which we call a dependency graph.
So now, the node which corresponds to the heart formula of the sentence, the
main predicate of the sentence, we call the heart, and that's where the
notion of heart-connectedness comes from. If you look at every node in this
graph, the node can reach the heart using a directed path. So that's what we
call the heart-connectedness property: every node can reach the heart using
a directed path.
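The heart-connectedness test itself is cheap once the dependency graph is built: a single reverse reachability search from the heart. A sketch, with an illustrative graph encoding:

```python
# Sketch of the heart-connectedness check: a dependency graph is
# heart-connected iff every node can reach the designated "heart"
# node along directed edges. We search backwards from the heart,
# so one BFS over the reversed edges suffices. Node names invented.

from collections import deque

def heart_connected(edges, nodes, heart):
    """True iff every node reaches `heart` via a directed path."""
    rev = {n: [] for n in nodes}      # reverse adjacency lists
    for u, v in edges:
        rev[v].append(u)
    seen, queue = {heart}, deque([heart])
    while queue:
        for u in rev[queue.popleft()]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen == set(nodes)

nodes = ["every", "some", "child", "heart"]
edges = [("every", "heart"), ("some", "child"), ("child", "every")]
# every -> heart, child -> every -> heart, some -> child -> ...:
# every node reaches the heart, so this graph is heart-connected.
```

The check is linear in the size of the graph, so it costs nothing compared to the scope enumeration itself.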
Now, let's see why it helps to solve the problem in polynomial time.
Basically, for those of you who are familiar with vectorization algorithms,
the idea of solving this problem is very similar to the idea of the
vectorization algorithm, where you actually have dependencies between for
loops. So you have a for-loop and you want to see whether you can actually
do it using a vector operation.
So actually, I got the idea from that algorithm. You recognize the strongly
connected components, and you collapse each strongly connected component into
one node. So now you have a directed acyclic graph, and a directed acyclic
graph you can solve easily. For example, any topological order of the nodes
builds a reading of the sentence.
Now, once you do that, then you come back and you go inside every strongly
connected component. For example, you go to this strongly connected
component, and then you try to solve that. Within this small graph, the node
that connects to the outside world we call the head, and it actually
corresponds to the head of the noun phrase, the semantic head of the noun
phrase. The head now plays the role that the heart played in the main
dependency graph. So you have another dependency graph and you want to solve
that. Again you have a directed acyclic graph, and if it has strongly
connected components, you do the same thing recursively, and then you solve
it. And once you solve it, you replace this node with the actual solution.
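The collapse-and-order step just described can be sketched with a standard strongly-connected-components algorithm (Kosaraju's, here); the recursive descent into each component is omitted, and the example graph is illustrative:

```python
# Sketch of the first step of the recursive strategy: find the
# strongly connected components (SCCs), each of which would then be
# collapsed to one node and solved recursively. Kosaraju's algorithm:
# one DFS pass to get finishing order, one pass on the reversed graph.

def sccs(nodes, edges):
    """Return the strongly connected components as lists of nodes."""
    adj = {n: [] for n in nodes}
    radj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    order, seen = [], set()
    def dfs1(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                dfs1(v)
        order.append(u)  # record finishing time
    for n in nodes:
        if n not in seen:
            dfs1(n)

    comps, assigned = [], set()
    def dfs2(u, comp):
        assigned.add(u)
        comp.append(u)
        for v in radj[u]:
            if v not in assigned:
                dfs2(v, comp)
    for n in reversed(order):  # components come out in topological order
        if n not in assigned:
            comp = []
            dfs2(n, comp)
            comps.append(comp)
    return comps

nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")]
components = sccs(nodes, edges)  # {a, b} form one SCC; c and d are singletons
```

Once the SCCs are collapsed, any topological order of the resulting DAG yields a reading skeleton, and each component is solved the same way with its head in the role of the heart.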
So it's very similar to the idea of the vectorization algorithm, okay. The
interesting thing is that once a sentence is heart-connected, if you pull out
a quantifier and you want to put that quantifier on top, the rest of the
quantifiers are divided into two sets. One set is definitely in the
restriction of the quantifier on top, and the other set has to be in the
body. So that's exactly where the concept of heart-connectedness helps.
Once you have a node on top, the rest of the graph is deterministically
divided into two sets that go into the restriction and the body. And that's
exactly the point that I want to make about heart-connected graphs: it
actually helps to reduce quantifier scope disambiguation to finding a linear
order of quantifiers.
>>: So I'm curious, what's the [indiscernible] evaluate which one. It's not
[indiscernible].
>> Mehdi Hafezi Manshadi: No, not yet. We'll get there.
>>: Otherwise, you --
>> Mehdi Hafezi Manshadi: This is just about satisfiability. So you have some
hard constraints and you have to satisfy all of the constraints, okay. And you
want to know whether there is a reading or not.
>>: But one single sentence, sometimes you cannot do that. It's inherently
ambiguous. If you have another sentence or maybe in the whole paragraph, then
you apply the constraint, it's more powerful.
>> Mehdi Hafezi Manshadi: Well, that's true.
>>: Here, for one single sentence --
>> Mehdi Hafezi Manshadi: Even for one single sentence, you know, the idea of
constraint-based underspecification is that as you go deeper in linguistic
processing, you add new constraints. For example, you go to the discourse
level. Now you have some new information and you add that constraint. Now
you go to the pragmatics level, and you have some other information and you
add this constraint. So as you go deeper and deeper, you add new
constraints, and that's exactly the point of making --
>>: [indiscernible].
>> Mehdi Hafezi Manshadi: Yes. But, you know, that doesn't matter. You can
actually have all the sentences in the domain of discourse, and they all then
contribute to the meaning, and you do that at the same time with this big,
huge underspecified representation.
>>: So that constraint is just [indiscernible] otherwise, because it's
binary.
>> Mehdi Hafezi Manshadi: Satisfiable or not, at this point.
>>: Okay.
>> Mehdi Hafezi Manshadi: So why is it important? Because once you want to
add a constraint, you want to make sure that your underspecified
representation is still satisfiable. If it's not satisfiable, there is
something wrong with that constraint, or something wrong earlier in the
underspecified representation. That's why it's important, okay?
Now, the beauty of heart-connectedness is that we can prove that the
dependency graph of every coherent sentence is heart-connected. We can
linguistically justify that: the strongly connected components actually
represent noun phrases in the sentence. And if a sentence is coherent, this
means that each such noun phrase, if it introduces a variable in the domain
of discourse, contributes to the meaning of the whole sentence somehow.
So there must be a predicate outside this noun phrase that has this noun
phrase as its argument. So there must be an outgoing edge from this strongly
connected component to something outside it.
The same here. And what does that mean? That means that in the end, you are
going to end up at the heart. So there must be a directed path from every
node to the heart. And that's what we call heart-connectedness. So that's
the notion of coherence that we defined. That's the mathematical notion of
coherence, and we showed that it's actually linguistically justified. And it
defines a tractable subset of underspecified representations: whether you
care or not, you can solve it in polynomial time.
But more importantly, that notion of coherence actually helps to reduce the
problem of quantifier scope disambiguation to finding a linear order of
quantifiers. Why? Because once you have a linear order, that linear order
imposes exactly one tree. Why? Because if you have one quantifier on top of
another quantifier, the rest of the quantifiers are deterministically divided
into two sets: some of them are in the restriction and some of them in the
body. So it's enough to find the linear order of the quantifiers, and that
was exactly the point that I wanted to make. So that's the consequence of
heart-connectedness that we basically get from this. And that helps us in
the rest of the work; it connects, actually, to what we have done before.
The problem actually has attracted a lot of attention in theoretical
semantics and theoretical computer science. However, another answer to your
question about why you may actually care is that, you know, you may have a
whole document, and every sentence is used within that document. And usually
when you want to do scope disambiguation, you have discourse, and that
discourse helps to find what the preferred scoping is. For example, let's
talk about a natural language description of a series of instructions which
in the end are going to perform a task, a programming task.
So each sentence actually introduces some variables in the domain of
discourse. But at the end of the day, you want to have the semantic
representation for the whole set of sentences, if, let's say, it's pseudocode
in natural language. So that thing is going to be big, and that's why you
care: because once you want to add a constraint, you want to know whether it
basically remains satisfiable or not.
>>: Just to clarify it, are you saying coherence?
>> Mehdi Hafezi Manshadi: Yes.
>>: Is that related to the linguistic notion of coherence, or do you mean
something else?
>> Mehdi Hafezi Manshadi: No, this is a mathematical notion. That's a
mathematical model that we define for coherence, which is linguistically
justified and helps here.
>>: But it's different from the linguistic notion of coherence?
>> Mehdi Hafezi Manshadi: Yes. But this is, you know, from a particular
point of view, one which, in this representation, helps us reduce the problem
of quantifier scope ambiguity to predicting a linear order of quantifiers.
So in general, you know, if you are looking at the problem from different
angles, you may have different definitions of, you know, coherence. But this
is the part that helps. Let's say this is a necessary condition for
coherence. Maybe it's not sufficient, but it's a necessary condition. So we
take that necessary condition, and we get to these heart-connected directed
graphs, okay?
Okay. So now, you know, we have this -- so we have translated quantifier
scope disambiguation to predicting a linear order of quantifiers, right.
So we want to do automatic scope disambiguation. In the classic approach,
people use, you know, heuristic-based approaches. What do they do? If you
look at, you know, works from the '80s and '90s, you see people use these
kinds of heuristics, like lexical heuristics. For example, the quantifier
each tends to have the widest scope. Or, for example, the subject tends to
have wide scope over the direct object. So they use these kinds of
heuristics to find a reading.
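The conflicting-heuristics scheme can be sketched as a weighted vote; the heuristics and hand-set weights below are invented for illustration, not taken from any cited system:

```python
# Sketch of the classic heuristic approach: each heuristic that fires
# adds its hand-assigned weight to a quantifier's "wide scope" score,
# and the higher-scoring quantifier is given wide scope. The weights
# here are made up for the example, as were the original intuitions.

def wide_scope_score(q, weights):
    """Sum the weights of the heuristics that fire for quantifier q."""
    score = 0.0
    if q["lexeme"] == "each":      # "each" tends to take widest scope
        score += weights["each_wide"]
    if q["role"] == "subject":     # subjects tend to out-scope objects
        score += weights["subject_wide"]
    return score

weights = {"each_wide": 2.0, "subject_wide": 1.0}

subj = {"lexeme": "a", "role": "subject"}
obj = {"lexeme": "each", "role": "object"}

# The two heuristics conflict here; the weights break the tie in
# favor of "each" taking wide scope from object position.
obj_wins = wide_scope_score(obj, weights) > wide_scope_score(subj, weights)
```

The manual weight assignment is exactly the weak point the talk turns to next: without a labeled corpus there is no way to tell whether these numbers are any good.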
>>: So how do these approaches evaluate what they're doing? I mean, how do
you know that they're doing well?
>> Mehdi Hafezi Manshadi: Exactly. That's another point that I'm going to
get into very quickly, okay. The point is, when you don't have a corpus that
has been labeled, how are you evaluating? So basically, people relied on
their linguistic intuition, okay, to decide what these heuristics were going
to be. But there was no way of evaluating them. And I'll get to that,
actually, in a couple of slides.
So now you have the heuristics, but sometimes they conflict, right. Like you
have each, but it's in the direct object. So one heuristic says it has wide
scope, another heuristic says the opposite. So even here, they actually
manually assigned weights based on their linguistic intuition.
But a very natural extension of this is using a corpus-based method. First of
all, you can define the heuristics as features. You can incorporate a lot of
other features, like lexical features -- the words -- which helps to encode
domain-specific knowledge, which is very important. Instead of manually
assigning weights using your intuition, you can train your model on a corpus
and actually learn the weights, and you can move it to other domains using
domain adaptation techniques without having to manually adjust the weights.
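The move from hand-set weights to learned weights is easy to picture. Here is a minimal sketch of the heuristic scoring he describes; the feature names and weights are invented for illustration, not taken from any of the cited systems:

```python
# Score the claim "quantifier A out-scopes quantifier B" as a weighted sum
# of the heuristics that fired for this pair. In the pre-corpus era these
# weights were set by hand; a labeled corpus lets you learn them instead.

def scope_score(fired_heuristics, weights):
    """Sum the weights of the heuristics that fired for this pair."""
    return sum(weights.get(h, 0.0) for h in fired_heuristics)

# Hypothetical heuristics and hand-set weights:
weights = {
    "A_is_each": 2.0,      # "each" tends to take the widest scope
    "A_is_subject": 1.0,   # subjects tend to out-scope direct objects
    "A_precedes_B": 0.5,   # left-to-right surface order
}

# Conflicting heuristics simply trade off through their weights:
print(scope_score({"A_is_subject", "A_precedes_B"}, weights))  # 1.5
```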
And actually, the other thing is that once you have a corpus, you have an
evaluation, basically. You can know how well you're doing if you have labeled
corpora, right.
So the natural extension is corpus-based methods. Okay, so let's get to a
corpus-based method. But we have to build a corpus. Interestingly enough,
there weren't many scope disambiguation corpora available. These are the
previous corpora available.
As you can see, they are very restricted. First of all, they're all about
just having two quantifiers in the sentence, and they have to be explicit
quantifications. They don't, you know, handle definites or bare nouns or bare
plurals or conjunctions -- nothing. Just two explicit quantifiers: some and a,
some and every. So even with this restriction, still, this one, which was one
of the best corpora available before we actually built our own, has an
inter-annotator agreement of 52%, which is very, very low on this binary
decision of whether the first quantifier out-scopes the second one or vice
versa.
>>: Is there always -- sometimes, don't you have to read the whole paragraph
to understand --
>> Mehdi Hafezi Manshadi: The point is that this is actually
one of the open problems in cognitive science. Even though we understand the
meaning, okay, when we want to spell out the scoping, it's really hard for us
to do that. So that's why --
>>: [inaudible].
>> Mehdi Hafezi Manshadi: Right, so that's why even hand annotation of
scoping is hard, and that's exactly why there were no corpora available before.
So people who did that just, you know, relied on their linguistic intuition.
So the first corpora became available in 2003, and they were actually very
restricted. And the ones after were actually even more restricted. For
example, this one only looks at every and a -- only these two quantifiers --
and the sentence has an exact syntax, direct object. That's it.
So that shows how hard the problem is, that people had to narrow down the
problem so much to actually be able to solve it.
>>:
[indiscernible].
>> Mehdi Hafezi Manshadi: Sure, and that's going to be my next slide, or the
one after, okay? So we decided --
>>: I'm sorry. I just have one more question on the previous one.
>> Mehdi Hafezi Manshadi:
Sure.
>>: When the inter-annotator agreement is 52%, how many different
interpretations on average are there?
>> Mehdi Hafezi Manshadi: So 33% is the baseline, because there were three
labels. You have two quantifications; you have wide scope, narrow scope, or
no scope interaction. So basically, because there were two quantifiers, you
have one classification task with three classes -- 33% if you pick at random.
But in their corpus, actually, the baseline is higher, because usually the
first one has widest scope; as I remember, it's something like 40 percent.
So actually, 52% is pretty low, and that shows amazingly how hard this
quantifier scope disambiguation is, even for [indiscernible] who want to spell
it out. And we get to why this is the case in later slides.
So now, because these were the corpora available, we wanted to build a corpus
which is much richer. We want to have all noun phrases incorporated, actually
assume that they all have a scope interaction, and we want to have scopal
operators like negation and logical operators. More importantly, we want to
cover plurals, distributive versus collective, and everything. We want to have
this unrestricted, full, comprehensive scope disambiguation, okay?
So now we are looking for a domain. So that comes to your point. So ordinary
language is sometimes really hard, because it's not intuitive for us what the
scoping is. But if you go to some domains where scoping is critical, like the
legal documents that you mentioned, or natural language programming, which I'm
interested in, you actually have a very intuitive notion of scoping, because
basically, to write the code, you are consciously doing this scope
disambiguation. So it becomes much more intuitive.
So basically, that's why we picked this domain: a natural language interface
for editing plain text files. And these are the kinds of examples that we are
looking into.
Now, so this is our corpus. We found 500 sentences from online resources,
like online tutorials or Linux command references, or from some graduate
students who use regular expressions for editing text files. We asked them to
give us language descriptions of the tasks that they do. So now we have these
500 sentences. We chunked them into noun phrases and scopal terms
automatically, with manual supervision, so that we have gold standard
chunking. And then we did full scope annotation.
But that's not easy, even though the domain seems so intuitive in terms of
scope disambiguation. So here are some statistics of the corpus, just so you
have some idea of the distribution of the elements. We have about four noun
phrases per sentence in this domain, and these are the distributions of actual
explicit quantification versus definites versus bare nouns, and so on and so
forth.
Now, we actually present a notation for the first time, because before,
nobody built this kind of corpus. They built corpora, but there were basically
just two noun phrases and it was a single classification, so there were
labels: wide scope, narrow scope, no interaction. But now we have several noun
phrases and scopal operators, so we actually define this notation for scope
annotation. We have plurals, so for plurals we have two elements -- in answer
to your question -- the element that introduces the set, and the universal
quantification over the set. So this is basically how we annotate distributive
versus collective readings. So we define this notation. We also annotate
coreference and [indiscernible], because we believe that they actually help a
lot when you want to do scope disambiguation, though that's not the main task
that we are looking into.
So we defined this notation. But, you know, as I said, it's really hard, even
in this domain. Why is this challenging? First of all, a lot of the time we
have logical equivalence, because you have, like, two existentials, or an
existential and an indefinite, and it doesn't really matter. It turns out that
in these cases, people rely on their intuition, and their intuition is not
reliable. Even one person's intuition is not reliable, because sometimes they
say, okay, this one has wider scope, but later, for a very similar structure,
a very similar sentence, they believe the other one has wider scope. Why?
Because the two readings are logically equivalent, okay. So that's one of the
problems. So you may suggest, why don't you just label those with, okay,
they're the same? But because you don't even have the interpretation of the
quantifier, it's not easy to say whether this quantifier is existential or
not. For example, you have a: a can be existential, a can be indefinite, a can
be referential. So if you don't have the label of the quantifier, it's even
hard to find these equivalence cases.
>>:
Who did you have annotate this?
>> Mehdi Hafezi Manshadi: So I had two undergrad students, linguistics
students. They were pretty smart, actually.
>>: One of the things -- I came out of linguistics many years ago,
[indiscernible] quantifiers, quantification, things of that sort. One of the
things that we noticed in working with linguistics is that our intuition
changed as we saw more examples of phenomena and more discussions of what the
phenomena and the possible analyses were. We started to see more. And so I was
wondering whether they were training [indiscernible].
>>: Multiple iterations, for example.
>> Mehdi Hafezi Manshadi: Yes.
>>: Use different results.
>> Mehdi Hafezi Manshadi: Sure. Exactly. That's exactly one of the points I
want to
make in this slide, actually. So first, let me tell you what the challenges
are. Quantifier scope is really deep semantics, okay, and the point is that
you have to resolve a lot of semantic phenomena before you actually get to the
quantifier scoping.
One of the problems that we had, for example, was type-token distinction.
What is type-token distinction? Let's say I say, you know, every line ends
with a comma. So what is your intuition? We are talking about text files. Is
there a different comma for every line, or the same one? A comma is a comma,
all right. So somebody may put comma out-scoping every, because some people
interpret a comma as the abstract entity, the type, as philosophers call it.
And some people interpret it as a token, as a physical realization of the
comma. So here, actually, if you look at semanticists, very many semanticists
say every has wider scope over a. But exactly the same people, if you give
them this sentence, they say [indiscernible] has wider scope. Why? Because
this is a definite. But it's, you know, a comma; it's actually the same
concept.
So these are the kinds of phenomena you deal with. You see inter-annotator
disagreement, but actually the disagreement is not in the scoping. It's
somewhere else. So that's exactly the point. That's why intuitions change. So
I spent like two years just with these two annotators. We iterated through the
corpus, and we tried to find places where you have scope disagreement but the
disagreement is not because of scope ambiguity; it comes from somewhere else.
And we tried to come up with an annotation scheme that minimizes this. If you
refer to my very recent paper, you can see some of the problems and how we
solved them. You know, that's a conference paper, so it's very short, but you
can at least see some of them. Hopefully, very soon the journal version is
going to come out.
>>: One of the reasons I asked the question about language [indiscernible] is
because in Chinese, there's no [indiscernible] -- it becomes more ambiguous.
>> Mehdi Hafezi Manshadi: It's the same, okay. But it doesn't change the
phenomenon. So you don't have articles -- it's a bare noun -- but it still has
the same, you know, the same concept, right?
>>:
Except [indiscernible].
>> Mehdi Hafezi Manshadi: Yeah, and sometimes you have more -- exactly, sure.
Exactly. Sure. So now, this is the inter-annotator agreement that we got as of
last summer. The last iteration, with the final annotation scheme, is actually
still being done, so we probably have better results than this, but this is
what we had last summer. So the kappa score for inter-annotator agreement
between the two annotators at the constraint level, for each pair, was 75%. At
the sentence level, meaning I count a sentence as correct if every pair of
quantifiers has been scoped correctly, it was 66%.
For the easy ones -- because, as you know, we also have a no-scope-interaction
label for every pair -- I just didn't care about no scope interaction. I only
considered a pair incorrect if both annotators assigned a scope preference but
they disagreed. So definitely that's an easier thing, but maybe in practice,
actually, that makes more sense.
So you definitely have a higher --
>>: Could you just explain what inter-annotator agreement is?
>> Mehdi Hafezi Manshadi: Sure. So you have two annotators, okay. Let's say
you have a classification task: you want to assign two labels -- binary
classification -- to some points in the space. So you have two annotators
label those points, and then you compare how much they agree. So what do you
do first? You subtract the chance agreement. For example, if you have binary
classification, there's a 50 percent chance they agree by chance. So if they
agree on 50 percent of the points, basically the inter-annotator agreement is
zero. So you subtract the chance part, and then you have some notion of what
portion of the points they actually agree on in the labeling. So that's
basically, roughly, the idea of inter-annotator agreement.
>>: For the [indiscernible] comma, what do you ask the annotator to annotate?
>> Mehdi Hafezi Manshadi: So basically, we defined this problem of type-token
distinction in the scheme and said, okay, that's a noun phrase that actually
introduces more than one entity into the domain of discourse. So when you have
a comma, you have two entities: one entity is the type, and one entity is the
realization of the type, the physical realization.
So basically, we have two entities, and we ask them to annotate both
entities. However, we have default rules so that the annotation doesn't become
cumbersome.
So it becomes very intuitive, actually, the way we developed it.
>>: [inaudible].
>> Mehdi Hafezi Manshadi: It depends if the noun is a type-token noun or not.
Not every noun is a type-token noun, okay?
>>: It has to be very complicated.
>> Mehdi Hafezi Manshadi: It's very complex. Very, very complex, yes.
>>: So one more follow-up question on the IAA. So the higher the value of
IAA, the better the quality of the --
>> Mehdi Hafezi Manshadi: The annotation, and the easier, you would say, the
task.
Because even humans, here, don't agree that much on what the annotation is.
So if you have question answering with, you know, very common-sense knowledge,
you have very high agreement among annotators, okay. But in a case like this,
it's hard; even with all the things that we consider and this large annotation
scheme, still we only have 75% agreement between annotators.
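The chance-corrected agreement described above is essentially Cohen's kappa. A minimal sketch of the computation, with invented toy labels (wide / narrow / none) standing in for the scope judgments:

```python
# Cohen's kappa: observed agreement minus chance agreement, normalized by
# the maximum possible improvement over chance. The labels below are toy
# data, not from the actual corpus.

from collections import Counter

def cohen_kappa(labels_a, labels_b):
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both annotators independently pick the same label.
    expected = sum(ca[l] * cb[l] for l in ca.keys() | cb.keys()) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["wide", "narrow", "none", "wide", "wide", "none"]
b = ["wide", "narrow", "wide", "wide", "none", "none"]
print(round(cohen_kappa(a, b), 3))  # 0.455: well above zero, far from 1
```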
>>: Since you do annotate the data, you could just use standard machine
learning techniques.
>> Mehdi Hafezi Manshadi:
Sure, yes.
>>: [indiscernible] as your structure base, I suppose you are using all these
constraints [indiscernible].
>> Mehdi Hafezi Manshadi: Right. But this is just for the hand annotations.
That's the annotation scheme for humans. So once we have the corpus, we just
use machine learning techniques to learn it.
>>: I see. So do you use the structure as --
>> Mehdi Hafezi Manshadi: Yeah, of course. Those are features. They help.
Some features may help, some may help better. Sure.
>>: So another question. So what's the kappa for the easy ones?
>> Mehdi Hafezi Manshadi: So this is basically if you have no scope
interaction between the two elements of a pair.
>>:
I see.
>> Mehdi Hafezi Manshadi: And one person actually says no scope interaction,
and the other person, you know, actually places an order; we say that's okay.
>>:
Okay.
>> Mehdi Hafezi Manshadi: Because, you know in practice it doesn't matter.
They know it's captured, but the order doesn't matter. That's the easy one.
>>: Are these two annotators different than the people who originally labeled
the corpus, or is it the same people?
>> Mehdi Hafezi Manshadi: Well, actually, there are two different people. One
of them is one of the two people who were actually involved in developing the
annotation scheme, and one of them was a new person; we just gave them the
annotation scheme and said, okay, you should annotate this.
>>: So these annotations, there's a question about their agreement with the
other annotations?
>> Mehdi Hafezi Manshadi: Right.
>>: Because they're sort of a true annotation baseline, presumably.
>> Mehdi Hafezi Manshadi: Yes.
>>: And you're going to use your machine.
>> Mehdi Hafezi Manshadi: Exactly, sure, sure.
>>: How did they do with respect to that?
>> Mehdi Hafezi Manshadi: So actually, I'll get to that in just a couple of
minutes.
Okay. So we also tried to actually get unlabeled data, because it's always
hard to get data, especially in this domain. So we did this: we had the
natural language descriptions of editing plain text, and we annotated them
with two examples of input/output text. We put those examples on Mechanical
Turk -- just the input/output -- and we asked random people to give us a
description of a task that, if you apply it to the input, generates the
output. Why give two? Because if you give one, there's always a trivial, you
know, function that does that. So we give two so that we rule out that trivial
one. But still, you know, you may have a lot of tasks. That's okay. We don't
care if it's exactly the same thing.
>>: I'm confused. 5,000, 3,000, 4,000?
>> Mehdi Hafezi Manshadi: This is just the text file that you apply the task
to, but this is the task: sort all the lines by their second field, okay. So
we give them this input/output and we ask them to give some description. And
these are, like, the real descriptions that they have given us.
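The task on the slide, "sort all the lines by their second field," might look like this in code; a sketch assuming whitespace-separated fields (note it sorts fields as strings, and lines with fewer than two fields would raise an error):

```python
# Sort the lines of a text by their second whitespace-separated field.
# A stand-in for the kind of editing task the workers had to describe.

def sort_by_second_field(text):
    lines = text.splitlines()
    return "\n".join(sorted(lines, key=lambda line: line.split()[1]))

text = "alice 30 NY\nbob 12 LA\ncarol 25 SF"
print(sort_by_second_field(text))
# bob 12 LA
# carol 25 SF
# alice 30 NY
```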
>>:
They have to detect the pattern.
>> Mehdi Hafezi Manshadi: Exactly, they have to detect the pattern. And
actually, they did a pretty good job of that. And, you know, with this, we
have a really large corpus. We enlarged our corpus from 500 to 2,000 using
just crowdsourcing. And we have a variety of syntactic structures, because now
these are ordinary people -- the people that we'd actually like to work with
in the end if we want to do natural language programming -- not just
programming experts who know regular expressions, who know programming, with
their specific grammar. So we have a variety of syntactic structures, a
variety of vocabulary.
>>:
Were they just sorting problems?
>> Mehdi Hafezi Manshadi: No. Editing plain text files. If you remember, we
had 500 tasks to begin with. We gathered those tasks from the internet. Now we
annotated those 500 tasks with examples. There was all sorts of regular
expression processing, any kind of editing of plain text files that you could
do with any tool, like Linux sed.
>>: So given the naive crowd, how well were they able to identify what was
intended by the different inputs and outputs? I've found with very simple
mathematical tasks, if you give people a set of four numbers and say, okay,
what's the change between the left-hand side and the right-hand side, they
often can't figure it out.
>> Mehdi Hafezi Manshadi: Can or cannot?
>>: Cannot. Just by inspecting the numbers.
>> Mehdi Hafezi Manshadi: Right. So the point is that the original task may
be about something, but the example may have more than one solution. We are
absolutely fine with that. All we care about is getting more sentences. We
don't care whether they guess our intended task or not. We just want to
enlarge our corpus. So that's not a factor here; that's not important for us
at all. Do you see my point?
But it really depends on how good the examples are and, you know, how many
actual tasks you can find right away. So it's a very complex measurement if
you want to measure how well they find it. It hasn't been designed for this
purpose, basically.
Okay. So let's have a quick look at a comparison with the existing corpora.
These are actually the three corpora that were available. People did automatic
scope disambiguation: they trained a statistical model and measured accuracy.
So these are basically the accuracies. I just want to give you some idea of
the level of accuracy when you have just one classification: you have two, you
know, noun phrases and you have three labels -- A out-scopes B, B out-scopes
A, no scope interaction. So this is the level of accuracy they get.
So now let's see what we did.
>>:
Is there a difference in baseline?
>> Mehdi Hafezi Manshadi: So the baseline, basically -- for example, for this
one -- is the majority label. For most of them, no interaction is the majority
label. For example, if no scope interaction is the most common, most frequent
label, then that's the prediction, because it's a single-class classification,
basically.
Okay. So now, because nobody has previously done quantifier scope
disambiguation automatically when you have more than two quantifiers, we had
to actually define a model for quantifier scope disambiguation. So basically,
for every pair you have three kinds of labels. That's the same as people
previously did, okay.
But now, for every sentence, if you have N noun phrases, N scoping elements,
you have N choose 2 pairs, and for every pair you have one of these three
labels. What is this? This is actually a partial order. So that's how we
define quantifier scope disambiguation: for us, quantifier scope
disambiguation is defined as learning a partial order, which is equivalent to
a directed acyclic graph, right.
So this is our model of quantifier scope disambiguation. It's definitely a
simplified model; it doesn't take care of all the phenomena that exist, but
it's a fair model. It's reasonable. So it's interesting: okay, I understand
why people haven't worked on quantifier scope ambiguity, because it's hard,
you know, and very often it doesn't matter. But even in the machine learning
community, you don't see much work on learning partial orders. There is lots
of work on learning total orders -- ranking, because of ranking web pages and
that kind of stuff -- but there is no work on learning partial orders, where
you don't have an order for every pair. So, for example, here, there is no
order between these two.
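The partial-order view can be sketched as a DAG over scope-bearing elements, where an edge means "out-scopes" and some pairs are simply incomparable. The element ids and the graph below are invented:

```python
# A scope reading as a DAG stored as an adjacency dict: dag[a] lists the
# elements that a directly out-scopes. "a out-scopes b" holds if b is
# reachable from a; two elements with no path either way are incomparable.

def reachable(dag, a, b):
    """True if a out-scopes b, directly or transitively."""
    stack, seen = [a], set()
    while stack:
        node = stack.pop()
        if node == b:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, []))
    return False

# Hypothetical reading with four scopal elements:
dag = {1: [2, 3], 2: [4], 3: [4]}
print(reachable(dag, 1, 4))                          # True, via 2 or 3
print(reachable(dag, 2, 3) or reachable(dag, 3, 2))  # False: incomparable
```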
>>:
Oh, I see.
>>:
I see.
So this means partial order.
>> Mehdi Hafezi Manshadi: There is actually work where people learn partial
orders, but just because they're not confident, you know, what the order is --
not because the items are not comparable.
>>:
So pretty much by limiting the tree structure?
>> Mehdi Hafezi Manshadi: Exactly correct. That's basically what it is. Yes.
>>: It's hard.
>> Mehdi Hafezi Manshadi: Yes, it is hard. So that's a structured learning
problem in general. Now, let's define evaluation metrics; there were no
evaluation metrics before. So we have these directed acyclic graphs, and we
define evaluation metrics -- basically precision and recall on the edges that
you recover correctly. We don't care that much about the pairs that have no
scope interaction, so we care about, out of the pairs that have a scope
preference, what portion you get right, and that kind of thing.
And what does this plus mean? This is the transitive closure. If you look
here, for example, at five and three: five out-scopes three, right, but one
annotator may actually have that edge, while the other annotator says, okay,
you can entail that from the others. So to actually have a correct evaluation
method, we build the transitive closure. Transitive closure means that if
there is a path from A to B, you connect them with one edge.
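The closure-based evaluation he describes can be sketched like this; the gold and predicted graphs are invented:

```python
# Take the transitive closure of a scoping DAG (if there is a path from a
# to b, add the edge (a, b)), then score precision/recall over edges.

from itertools import product

def transitive_closure(edges):
    closed = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closed), list(closed)):
            if b == c and (a, d) not in closed:
                closed.add((a, d))
                changed = True
    return closed

# Gold annotation: 5 out-scopes 3, 3 out-scopes 1 (so 5 > 1 is entailed).
gold = transitive_closure({(5, 3), (3, 1)})
pred = transitive_closure({(5, 3)})        # system recovers only one edge

precision = len(gold & pred) / len(pred)   # 1.0: every predicted edge is right
recall = len(gold & pred) / len(gold)      # 1/3: (3, 1) and (5, 1) are missed
```

Scoring on the closure rather than the raw edges keeps two annotators from disagreeing merely because one wrote down an entailed edge explicitly and the other did not.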
So you define this transitive closure, and these are the metrics, based on
the transitive closure. And our machine learning model is pretty basic; we
basically wanted to define a baseline on this corpus for scope disambiguation.
So what is this baseline model that we use? It's a naive model. We basically
treat every pair of elements as a single classification task, where you have
either A out-scopes B, or B out-scopes A, or no scope interaction. And we do
these individual classification tasks. So basically, you know, now it becomes
a classification task, and we use a multiclass SVM to actually do the
classification.
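The shape of the per-pair baseline can be sketched as follows. The talk's model is a multiclass SVM over learned features; this stand-in scores each of the three labels with a hand-specified linear model just to show the structure of the task (feature names and weights are invented):

```python
# Each pair gets one of three labels; a sentence's reading is assembled
# from the N-choose-2 independent pair decisions.

from itertools import combinations

LABELS = ["A>B", "B>A", "none"]

def classify_pair(features, weights):
    """Pick the label whose weight vector scores the feature set highest."""
    def score(label):
        return sum(weights[label].get(f, 0.0) for f in features)
    return max(LABELS, key=score)

def predict_dag(elements, pair_features, weights):
    """Classify every pair independently and collect out-scoping edges."""
    edges = set()
    for a, b in combinations(elements, 2):
        label = classify_pair(pair_features[(a, b)], weights)
        if label == "A>B":
            edges.add((a, b))
        elif label == "B>A":
            edges.add((b, a))
    return edges

weights = {
    "A>B":  {"A_is_subject": 1.0, "A_is_each": 2.0},
    "B>A":  {"B_is_each": 2.0},
    "none": {"both_indefinite": 1.5},
}

pair_features = {(1, 2): {"A_is_subject"}, (1, 3): {"both_indefinite"},
                 (2, 3): {"B_is_each"}}
print(predict_dag([1, 2, 3], pair_features, weights))  # {(1, 2), (3, 2)}
```

Since the pairs are classified independently, nothing in this naive setup guarantees the collected edges form a consistent partial order; that is one sense in which it is only a baseline.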
>>:
So how about where those words are in the sentence, the position?
>> Mehdi Hafezi Manshadi: Okay. Sure. Sure. Here. So your question is what
kind of features, basically, we use. So, you know, these are some of the
features, just to give you some idea. So we have this sentence, and these are
the chunks, the scope-bearing elements that you have. And we want to predict a
directed acyclic graph over these nodes -- for example, here we have four
nodes.
So, a dependency parser. We use a dependency parser to get the syntactic
structure of the sentence, and we use those dependencies as features. We feed
that into our model. So that's basically where the syntactic --
>>: You could have [indiscernible].
>> Mehdi Hafezi Manshadi: Sure, of course. I mean, we also have lexical
features of the domain. This is actually the most important part. You know,
for scope disambiguation, we use a lot of domain knowledge. In the text
editing domain, usually every file has some lines, every line has some words,
every word has some characters, right. So this is the kind of knowledge that
our model tries to learn from data, okay. And because the number of concepts
is small, these 500 sentences are enough to have reliable statistics for this
kind of domain knowledge.
But at the same time -- so these are the lexical features that help. Also
lexical features of the quantifiers, not just the nouns: the concepts, and
also the quantifiers, plus the syntactic structure. So these are the kinds of
features that we use. And this is our experiment. We used a hundred sentences
for development, and on the other 400 sentences we used five-fold
cross-validation. The average number of noun phrases per sentence is 3.6, and
that gives the number of constraints -- remember that for every pair, there is
one point in the space, one classification task, right.
And this is the performance of our model, which gives you some idea of how
the model does with respect to the inter-annotator agreement -- I think that
was your question. The baseline for us is left-to-right order; that's the best
baseline we found. So this is the baseline, and this is our model, which has
78%. So if you compare with the inter-annotator agreement, which was 75%, it
actually goes above inter-annotator agreement. And that's exactly your point,
because inter-annotator agreement is what you get when you bring in another
person; when two people have worked together, the annotations are much more
consistent.
So we can actually go much above this number. But this is a very preliminary
model; we're just defining a baseline on this corpus, so that if people want
to work on quantifier scope [indiscernible] later on this corpus, they have
some baseline.
So these are the results that we got.
So here you can actually have a
comparison with respect to other work. This is not really a good comparison,
because these are different kinds of tasks, right. Here, you have very, very
complex tasks, but there, basically, it's just two quantifiers and one single
classification task. Also, the data is not the same. There you have the Wall
Street Journal; this domain is very simple, this domain is very complex, and
our domain is very different.
>>: Wall Street Journal -- what kind of task? Do they talk about noun scope
for the Wall Street Journal?
>> Mehdi Hafezi Manshadi: Yes. So you have two noun phrases which have
explicit quantification. They're restricted noun phrases, just restricted
quantification, and they just decide whether A out-scopes B, or B out-scopes
A, or there's no interaction. That's it.
>>: So the 70 percent -- that's some metric. The question is, in terms of the
task: if you were to take the interpretation of the task from that sentence,
how often does it get the right interpretation?
>> Mehdi Hafezi Manshadi: Well, that's another question and I'm going to get
to that in the next slide.
>>:
[indiscernible].
>> Mehdi Hafezi Manshadi: Well, I'll get to that in my next slide. But not in
this domain, because the data is not available. But the data for this work is
available, yes, and I did. This is basically how we did on their corpus. We
actually did more poorly, but that's very natural, because, you know, we
developed our model on our corpus, and they developed their model on their own
corpus. We just applied the model without any modification, with no domain
adaptation, and it does pretty competitively with respect to what they did
with their own model.
>>: It looks like [indiscernible], so what if you look at some structured
domains like [indiscernible]? Do you think this technique would work much
better?
>> Mehdi Hafezi Manshadi: Well, natural language programming is already, I
think, a very intuitive domain, and that's why we picked it. So basically, I
don't think you can get easier than that, because, as I said, it's very
intuitive: once you have the sentence, you have to have conscious knowledge of
the scope disambiguation if you want to do this task.
>>:
But then people are probably saying things [indiscernible].
>> Mehdi Hafezi Manshadi: Right.
>>: So is it that your text editing tasks are [indiscernible]?
>> Mehdi Hafezi Manshadi: Well, I mean, sure. As I said, there are three
domains that I'm interested in working on, as I previously mentioned: legal
documents, because they have to be precise; scientific texts; and natural
language programming. So for now, we have used the natural language
programming domain. But definitely, that's one of the domains that I want to
work on, and it's on my future work list to move to these domains, like
scientific texts, and see how different they are.
I really don't make any conjecture whether it's going to be better or not.
But we'll see.
Okay. So now I want to get to another point that you made, okay. Even if we
get the quantifier scope ambiguity right, are we actually able to do the
actual task correctly or not? Because it's not the main topic of my talk, and
because this is work in progress, I'm not going to go into the details. But if
you are interested, I'm more than happy to talk about it after the talk.
So here is where quantifier scope disambiguation meets programming by
example. This uses a concept that has been introduced in the last couple of
years in statistical NLP: indirect supervision. So what does indirect
supervision mean? Let's say we have a question answering system, okay. When
you have full supervision, you have, let's say, the syntactic parses of the
sentences, you have the semantic representations of the parses, and you train
your model on that. But let's say you don't have any of those. You just have
the question and you have the answer. That's all you have. So it is sort of
supervised, because you have something -- some label -- but it's indirect
supervision, because you don't have the actual semantic labels, the semantic
representations. So that's pretty much what we want to do here, or what we are
doing here.
So what we did for these tasks: we annotated the tasks with
examples. We used crowdsourcing so that we have many examples, let's say
five, six, eight examples for each task. And then we
used a programming-by-example system to learn the program. And now,
you don't need human annotation anymore. You basically have the natural
language description, and the output in terms of the program that has been
learned from these examples.
And then you can actually see whether your model is able to find that
program code or not. So, you know, it doesn't matter. Maybe you even get
the quantification wrong. But as long as you're able to predict the code,
you're good.
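The setup described above can be sketched in a few lines. This is a toy illustration only: `learn_program_from_examples` stands in for a real programming-by-example engine, and `predict_program` for a real semantic parser; both are hypothetical placeholders, and the task data is made up.

```python
# Indirect-supervision sketch: compare programs by their behavior on the
# crowd-sourced examples, not by comparing scope annotations directly.

def learn_program_from_examples(examples):
    """PBE stand-in: 'learn' a program consistent with the I/O examples.
    Faked here as a lookup table built from the example pairs."""
    table = dict(examples)
    return lambda x: table.get(x)

def predict_program(description, examples):
    """Semantic-parser stand-in: map the NL description (with its implicit
    quantifier scoping) to a program. Faked the same way for this sketch."""
    return learn_program_from_examples(examples)

def programs_agree(p, q, inputs):
    """The indirect-supervision check: do the two programs behave the same?"""
    return all(p(x) == q(x) for x in inputs)

# One toy 'task': a description plus crowd-sourced input/output examples.
task = {
    "description": "return every input string in upper case",
    "examples": [("ab", "AB"), ("xy", "XY")],
}

gold = learn_program_from_examples(task["examples"])
pred = predict_program(task["description"], task["examples"])
print(programs_agree(pred, gold, [x for x, _ in task["examples"]]))  # True
```

The point of the design is exactly what the talk says: even if the predicted scoping differs from a hand annotation, the model is counted correct as long as the program it yields behaves correctly on the examples.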
So that's basically what we are working on right now. And that's what I
believe is the right way to go. Because that's what humans do. We don't do
this quantifier scope disambiguation in the abstract. We have a model of the
world, and we do it with respect to that model. And that's why we do it so
easily. Because once you have the model, it's much easier to do the scope
disambiguation, when you have the final goal. So that's basically what my
current research is on: doing these tasks using indirect supervision,
without manual annotation of the scoping by a human.
Okay. So let me just summarize. We had contributions in three different
areas when it comes to quantifier scope ambiguity. In scope underspecification,
we found the largest tractable subset based on an underspecification framework
that has been found so far. And we showed that every
coherent natural language sentence belongs to that subset. So basically, I
believe that we have solved that problem, because every
coherent natural language sentence is in our subset. So we are done with a
problem that people have been working on for two, three decades.
In scope annotation, we developed the first comprehensive scope disambiguation
corpus, which considers all the scope-bearing elements: quantifiers,
bare nouns, plurals, and scopal operators.
And finally, we have the first, very naive model that does this
unrestricted quantifier scope disambiguation, where you have more than two
scope-bearing elements in the sentence, and we developed the evaluation
metrics and the model and everything.
So once again, this is the summary of our contributions to scope
underspecification. The other thing is our notion of coherence; remember, we
showed that if you have a linear order of quantifiers, that's enough, which
helps a lot because you don't have to predict tree structures.
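The idea of treating a scope reading as a linear order, and scoring a predicted order against a gold one pair by pair, can be sketched as follows. The element names and both orders are made up for illustration, and this pairwise precision/recall is just one plausible metric of the kind mentioned, not necessarily the exact one used in the work.

```python
# A scope reading as a linear order of scope-bearing elements, scored by
# pairwise outscoping relations.

def outscoping_pairs(order):
    """All (a, b) pairs where a takes wider scope than b in this order."""
    return {(a, b) for i, a in enumerate(order) for b in order[i + 1:]}

def pairwise_scores(gold_order, pred_order):
    """Precision/recall over outscoping pairs of predicted vs. gold order."""
    gold = outscoping_pairs(gold_order)
    pred = outscoping_pairs(pred_order)
    correct = len(gold & pred)
    return correct / len(pred), correct / len(gold)

# Hypothetical sentence with three scope-bearing elements.
gold = ["every_station", "every_city", "a_train"]
pred = ["every_station", "a_train", "every_city"]
p, r = pairwise_scores(gold, pred)
print(round(p, 2), round(r, 2))  # 0.67 0.67
```

Because a reading is just a permutation, the model only has to predict an ordering of the elements, which is a much smaller search space than arbitrary tree structures.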
And this is our contribution in scope annotation. And finally, the scope
disambiguation.
So what is the future work? Improving the annotation scheme; we can still do
that if we want to do hand annotation. In every round of annotation we have
done, we have improved, and we still can.
Expanding the corpus: more natural, less restricted domains, or even stricter
domains, like legal documents or scientific texts. I'm very interested
to apply this to other domains. And then using the
model without hand annotation, just on a new domain, using domain
adaptation techniques, and seeing how well it does. That's another interesting
problem to look at.
And finally, and more importantly, learning to do scope disambiguation with
indirect supervision, where we don't have the actual annotation of scoping.
That's what I think matters and what's actually important in practice.
So that was my talk, and I would be more than happy to answer any questions.
>>: So which [indiscernible] disambiguation versus just [indiscernible]?
>> Mehdi Hafezi Manshadi: This is much harder than [indiscernible]
disambiguation. I told you before, our first corpus had only 52%
inter-annotator agreement. But for word sense disambiguation, which is
crazy -- we have these fine-grained senses that are really hard even for a
human being -- still, inter-annotator agreement is in the seventy-something
percent. So scope annotation is really hard, because it's a very deep semantic
phenomenon, and you have to resolve a lot of other issues before --
>>: [indiscernible].
>> Mehdi Hafezi Manshadi: Exactly, you cannot grab Jack on the street and say
do a scope annotation for us. It doesn't work.
>>: So you're interested in legal and scientific domains. So what's the
application of this to those domains? If you had this and you could do it, how
would it help you?
>> Mehdi Hafezi Manshadi: Well, I mean, basically, there are a lot of
things you can do. For example, you can do [indiscernible]. You have the
laws expressed in natural language sentences, and you want to tell
whether something is a crime or not, for example. So you have the
full semantic representation and you can do the entailment on it. So
that's one thing.
But the other thing is, when you are working on scientific texts, you can do
problem solving. Like, you have a physics question and you want to see whether
somebody's answer is correct or not -- automatic evaluation of answers, and so
forth. Tutoring systems, for example.
So yeah, there actually are.
>>: [indiscernible].
>> Mehdi Hafezi Manshadi: Actually, no. You know why? Because in machine
translation, people don't care. Because you just translate a noun
phrase to a noun phrase, and you don't care whether this noun phrase
has wider scope. There might be cases where it actually matters,
depending on what your preferred scope is, but as --
>>: But think about just [indiscernible] there's not this type of --
>> Mehdi Hafezi Manshadi: It doesn't matter.
>>: [indiscernible].
>> Mehdi Hafezi Manshadi: But, you know, the point is that in machine
translation, we don't even do semantics. It's at a very, very shallow sense.
So it will take a long time to get to the point where this actually matters.
>>: I would imagine it being used as a sort of final-stage re-ranking, maybe
checking whether the original has been preserved or maybe [indiscernible]
preserves the ranking order. But it's a lot of ifs for very little gain.
>> Mehdi Hafezi Manshadi: For example, in Persian there is word order
variation. So the scoping you actually mean to convey may depend on the order
of the constituents in the sentence. So that may actually help at that point.
But, you know, that's looking long into the future. Right now --
>>: In Japanese, scope order is generally read off the sentence, the
[indiscernible] of phrases. It introduces ambiguity.
>> Mehdi Hafezi Manshadi: Right.
>>: If you want to keep that ambiguity, then maybe you need to
[indiscernible].
>> Mehdi Hafezi Manshadi: Okay.
>>: So think of it as less ambiguous.
>> Mehdi Hafezi Manshadi: Well, because English has articles all the
time, it may help. Like in Persian, all the nouns -- most of the nouns -- are
bare nouns, so you don't have much of a clue; at least the
lexical item of the quantifier may help. And actually, it does help in
English. We've used that as a feature, but you often don't have that in
Persian. So it makes it a little harder. But I don't think that much.
>>: So there's been a fair amount of work on understanding or doing natural
language interfaces to databases, natural language queries. And the question,
I guess, is: given the investment in that, and there is at some level some
commercial interest in that capability, how does your work relate to it, and
why is it not available right now? In terms of success?
>> Mehdi Hafezi Manshadi: Sure, okay. First, let me answer your first
question. Actually, a very, very, very good point. So as I
said before, people really weren't very interested, especially
statistical people, in scope ambiguity. But recently, people have tried to go
deep, right, like natural language queries to databases. Like we have
this geography domain, where you have the geography of the United States and
there are questions like what is the state that borders this and this.
And actually, the best model known so far, the state-of-the-art model, which
was proposed by -- what was his name? I forgot. [indiscernible] last
summer at ACL -- actually incorporates the scope ambiguity into the model.
However, because there you have a lot of proper names, like state
names and that kind of thing, you don't have many scope interactions. So they
were able to have this quantifier scope ambiguity modeled in the
system, and it helped them to have a better performance.
But they didn't model it as a separate -- they didn't have it as a separate
model. So yes, that's definitely going to help in natural language queries to
databases. That's actually one of the applications I was looking at when I
started. Then I got into natural language programming and I got stuck there.
And why is it not available? It's actually going to be available very soon,
hopefully at the end of the summer. I mean, before I defend my dissertation,
which hopefully is in the summer, that data will be available on my web page.
And you can get the data, the classifier trained on the corpus, and
everything.
And the PDFs of all the publications I have had are available, though a couple
of them are very recent, so you may want to wait a couple of weeks so that
they are actually available on the internet. But you can email me and I can
send you my publications.
>> Sumit Gulwani:
So let's thank our speaker now.