>> Sumit Gulwani: Hello, everyone. So thanks for coming to the talk. It is my great pleasure to introduce Mehdi Manshadi, who is finishing up his Ph.D. at the University of Rochester. Before that, Mehdi went to the University of Technology in Tehran, in Iran, and his primary area of interest is in deep understanding of natural language. And this is precisely the kind of technique which we feel we need for building good natural language front-end interfaces for end-user programming in intelligent computing systems. Mehdi is also interested in machine learning and human-computer interaction, and today he's going to talk about dealing with quantifier scope ambiguity in computational linguistics. And I think this should be of joint interest to both people in programming languages and natural language processing groups. So over to you, Mehdi. >> Mehdi Hafezi Manshadi: Thank you very much. Hello, everybody. So today, I'm going to talk about dealing with quantifier scope ambiguity in natural language processing. I'm going to talk about three different topics in dealing with quantifier scope ambiguity: scope underspecification, building a corpus, and finally doing automatic scope disambiguation using that corpus. So here is an example of quantifier scope ambiguity. It happens that quantifier scope ambiguity is one of the most challenging problems in natural language understanding. Even when there was a lot of optimism about deep language understanding, like two decades ago when people actually started doing semantics, they soon realized that quantifier scope ambiguity is really hard. And as I discuss later, they actually decided not to do it in the first place. So as you see, these two sentences I have here on slide two. Each of them has two readings, right.
And the joke comes from the fact that the person interprets the sentence with the less plausible scoping, the less preferred reading. So in one of the readings of the first sentence, every has wide scope; in the other one, one has wide scope. And the same for the second sentence: one has wide scope or not has wide scope. I prefer to represent the two readings in terms of tree structures, because we're going to use the tree structures a lot during the rest of my talk. So these are the two readings for the first sentence. As you can see, in one of them, every out-scopes one, and in the other one, one out-scopes every. And the same for not: you have one out-scopes not, and not out-scopes one. But quantifiers and negation are not the only sources of scope ambiguity. There are a lot of other sources of scope ambiguity in natural language. For example, plurals carry a lot of scope ambiguity, and they're actually one of the most challenging parts of dealing with scope ambiguity in natural language. Like if you have all the students met the faculty candidate, the question is whether there was just one meeting, or there were many meetings and every person individually met the candidate. We have modal operators, we have sentential adverbials, and we have frequency adverbials. So these are some of the things that carry scope ambiguity when you deal with natural language. So how do we deal with quantifier scope ambiguity in computational linguistics? The problem is that, as I said, from the very first point, everybody realized that quantifier scope ambiguity is really hard. So basically, they decided not to do it; I'll explain how they actually get around it later. But another good point is that for many tasks, you can actually have some rough representation and work with that, and it's not that critical to get the right scoping.
Maybe an ambiguous interpretation would be enough to do the task. >>: Legal context is what you -- >> Mehdi Hafezi Manshadi: Yes, I'll get to that. So the other reason that there hasn't been much work on quantifier scope ambiguity is that the focus in the last two decades, as everybody knows, has been on shallow text processing. But the other reason is that only a few restricted scope-annotated corpora are available, and that's why the statistical community wasn't very excited to work on this problem. But finally, and most importantly, quantifier scope ambiguity has been misunderstood in computational linguistics, as I'm going to discuss in my next slide. >>: [indiscernible] English than in other languages? >> Mehdi Hafezi Manshadi: No, it's almost the same in all languages, yes. Yes. Yes, it's pretty much universal. So because I work on quantifier scope ambiguity and I go to conferences, I often hear these kinds of sentences from people. Like, we searched through the whole Brown corpus and we found only two cases of ambiguity. So that's exactly the misunderstanding I'm talking about. And let's see what I mean by misunderstanding. The point is that when you talk about quantifier scope ambiguity, most people think about every N, some N, explicit quantification. But there are many other noun phrases that carry scope ambiguity. For example, even definite NPs carry scope ambiguity. Most people believe that definites always have the widest scope. But that's not true. For example, if you look at the first sentence, we paid $2,000 to the father of every family. Now, the $2,000 is in the scope of every family, right. Even though you may find contexts where it's actually the other way around. If you Google the father of every family, you will find a lot of really [indiscernible] where people talk about the father of every family when they mean God.
So basically, definites can carry a scope ambiguity. Even bare nouns, like order in sort the names in alphabetical order. Or, more importantly, conjunctions, which happen a lot. Like if you have Canada and Australia have a universal healthcare system. Here you have an ambiguity whether it's the same universal healthcare system or different ones, where you have a out-scoping the conjunction, or the conjunction out-scoping a. >>: Where is the ambiguity in the order, for example? >> Mehdi Hafezi Manshadi: Sort the names in alphabetical order. So basically, think about what it is; you can have all different orders, okay. So what I'm saying here is that there may not be a true ambiguity in this sentence. That's exactly what I want to talk about. There may not be a true ambiguity, but for the machine, there is an ambiguity, because order can have ambiguity. For example, think about sort the names in every alphabetical order. You still have an ambiguity whether order out-scopes names or names out-scopes order. That's not a true ambiguity, because as human beings, we know that there is one order for all the names. But when it comes to machine understanding, because we don't have that world knowledge, there is this ambiguity whether order out-scopes names or names out-scopes order. But I will get to that very shortly. So theoretical semanticists are mainly interested in examples with true ambiguity. That's exactly my point. Like, everybody likes two songs from this album: whether the two songs are the same or different. Or three men carried two desks. These kinds of examples actually throw off a lot of computational linguists, and people say we don't care whether the three men carried the same two desks or different ones.
Usually, in the context, you know what you're talking about; probably there are just two desks, and all you care about is that the two desks have been carried, and you don't care whether the three men all helped or everybody individually did that. So that's why this problem has sometimes thrown people off. But actually, when you are talking about computational linguistics, every sentence that has more than one scope-bearing element carries a scope ambiguity. And that's exactly my point. >>: The whole point of natural language is that you don't have to specify that carefully what this person would carry. So why is this a difficult problem? >> Mehdi Hafezi Manshadi: Why is it an important problem? >>: Some of these people, when they say three men carried two desks, that's what they mean. >> Mehdi Hafezi Manshadi: Exactly. So that's one of the reasons that we -- >>: People care about the -- >> Mehdi Hafezi Manshadi: It depends on the task, okay? Here is exactly the answer to your question and your previous question. How critical quantifier scope ambiguity is depends on the domain that you're working in. And that's exactly another point that I want to make. So if you are talking about legal documents, as you said before, if you are talking about scientific text, physics, math and that kind of stuff, or if you're talking about natural language programming and descriptions in natural language, then you have to be precise. No ambiguity is tolerated, for example, in natural language programming, because you are converting the natural language into a formal description, programming code, and no ambiguity can be tolerated there. So yes, it definitely depends on the domain, but also on the task. For example, maybe you are able to do some tasks, even if the domain has a lot of scope interaction, with that ambiguous representation.
So some tasks may actually require quantifier scope disambiguation, like entailment or natural language programming, or, depending on the type of questions, question answering may also be one of the tasks that requires that understanding, that disambiguation. >>: [inaudible] some of us are looking into it. >>: I think we [inaudible]. >>: Oh, I see. >>: And also very interested in allowing people [indiscernible] natural language. >>: [indiscernible]. >>: So I'm not really sure about all that, but there's a new line of work that is coming up that is called programs [indiscernible] for Windows specifications and examples of natural language are [indiscernible] that can actually be programmed. And [indiscernible] specifications. >> Mehdi Hafezi Manshadi: So since you asked Sumit whether people are working on this, I answered as if you meant at Microsoft. If you mean whether people are working on it in general, the answer is yes. I'm actually one of those people who are working on this; from quantifier scope ambiguity, I got into this domain. And once I got into the domain, I said okay, let's go one step further and see if we can actually do natural language programming. >>: Are there groups of people doing natural -- >> Mehdi Hafezi Manshadi: There aren't many people. I'll get there when I talk about using natural language, or programming by example, to do quantifier scope disambiguation and that kind of stuff; I will talk about natural language programming and why it has not attracted much attention. So what do we do with quantifier scope ambiguity in natural language? Well, as I said, what many people actually decided to do, like [indiscernible], was not to do it, because it was really hard, and they decided to leave the scope underspecified. And in answer to your question, for many tasks maybe you don't even need to do scope disambiguation. For example, if you have there is one soulmate for every person, you represent the semantics using something like this.
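For concreteness, the two fully scoped readings that an underspecified representation of there is one soulmate for every person covers can be sketched in first-order notation (the predicate names here, such as has, are illustrative, not taken from the slides):

```latex
% every > one: each person may have a different soulmate
\forall x\, \big(\mathit{person}(x) \rightarrow \exists y\, (\mathit{soulmate}(y) \land \mathit{has}(x, y))\big)
% one > every: a single soulmate shared by everybody
\exists y\, \big(\mathit{soulmate}(y) \land \forall x\, (\mathit{person}(x) \rightarrow \mathit{has}(x, y))\big)
```

The underspecified form keeps the shared material (person, soulmate, has) and leaves open which quantifier ends up on top.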
Where, as you can see, the body of the quantifiers is left underspecified, so you don't say which quantifier has wider scope; you just leave the scoping underspecified. So either reading can be interpreted from this sentence. But also some people do scope disambiguation. There are two ways to do scope disambiguation: heuristic-based approaches and corpus-based approaches. I'll talk about both of them very soon. Scope underspecification. So actually, if you look at current deep natural language understanding systems, most of them use this underspecification thing. For example, TRIPS, which we have in Rochester; the Boxer system from Bos at the University of Rome in Italy; and ERG, which is one of the biggest resources for English grammar, which is basically HPSG-based, Head-driven Phrase Structure Grammar-based, and it uses Minimal Recursion Semantics. They all use underspecified semantic representations. So basically, that's the most popular way to deal with quantifier scope ambiguity, because in many tasks you can do some stuff without actually specifying the scope precisely. >>: So basically, if you use this, you list all the possible ambiguous interpretations? >> Mehdi Hafezi Manshadi: Yes. There are ways to do that. First of all, there is work on actually doing inference with underspecified semantic representations. One of the things that people do is use the weakest reading, okay. There's this concept of the weakest reading of a sentence, when you have many possible readings. So you actually get the ones that are weakest, that can be entailed from some reading but cannot entail any other reading. So if you entail something from that weakest reading, it can definitely be entailed from the other readings, but not vice versa. So any entailment that you make from the weakest reading is sound.
But definitely, it's not complete, because you're missing some information. So there are ways to do entailment without doing the scope disambiguation. So early frameworks were very simple, something like this. But recent formalisms actually use constraint-based frameworks. What does it mean, constraint-based frameworks? I'm going to show you exactly how it works. So let's have a quick look again at the kinds of readings that we had before for the two sentences that we were discussing. We actually show quantifiers and scope operators in this way. For every quantifier, you have the restriction and the body of the quantifier, and you don't specify which predicate goes into the restriction and which into the body. And you have these predicates as tree nodes, which we call labels, and you have holes, and you just plug these labels into holes to build one reading. But obviously, not every plugging of labels into holes is a valid reading. For example, you cannot plug person of X into the body of every, because person of X is actually the restriction of every, right. So there are some constraints. This is exactly what we're talking about when we say constraint-based frameworks. So you have some constraints. These are dominance constraints. For example, they say that in the final solution, whatever label is plugged into this hole out-scopes person of X: it must be above person of X, it must dominate person of X. And the same thing for soulmate. And these two are actually binding constraints, because every variable has to be in the scope of its quantifier. So this predicate, which exists for X and Y, has to be in the scope of both every and one. And a very similar thing for not. So now, for example, this is one reading that you can build, satisfying all the constraints. As you can see, we have plugged person into this hole, soulmate into this hole, one of Y into the body of every X, and finally this predicate into the body of one.
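The label-and-hole plugging just described can be sketched as a brute-force enumerator. This is a toy reconstruction, not the talk's actual formalism: the fragment names and the has predicate are my own, and the dominance and binding constraints are hard-coded for this one sentence.

```python
from itertools import permutations

# Toy fragments for "Every person has one soulmate" (hypothetical encoding):
# quantifiers have a restriction hole and a body hole; labels have no holes.
HOLES = {"every": ["r1", "b1"], "one": ["r2", "b2"],
         "person": [], "soulmate": [], "has": []}
ALL_HOLES = ["r1", "b1", "r2", "b2"]

def subtree(plug, start_frags):
    """All fragments reachable from start_frags through plugged holes."""
    seen, stack = set(), list(start_frags)
    while stack:
        frag = stack.pop()
        if frag in seen:
            continue
        seen.add(frag)
        stack.extend(plug[h] for h in HOLES[frag])
    return seen

def valid(root, plug):
    # Structural check: the plugging must form one tree rooted at `root`.
    if len(subtree(plug, [root])) != len(HOLES):
        return False
    # Dominance constraints: each restriction hole dominates its predicate.
    if "person" not in subtree(plug, [plug["r1"]]):
        return False
    if "soulmate" not in subtree(plug, [plug["r2"]]):
        return False
    # Binding constraints: has(x, y) is in the scope of both quantifiers.
    return all("has" in subtree(plug, [plug[h] for h in HOLES[q]])
               for q in ("every", "one"))

def readings():
    out = []
    for root in ("every", "one"):             # the fragment left unplugged
        rest = [f for f in HOLES if f != root]
        for perm in permutations(rest):
            plug = dict(zip(ALL_HOLES, perm))
            if valid(root, plug):
                out.append((root, plug))
    return out

print(len(readings()))   # 2: every > one, and one > every
```

Of the 48 candidate pluggings, only the two scopings the talk describes satisfy all the constraints; this brute force is exponential in general, which is why tractable subsets matter.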
And that's going to be this reading, where every has wide scope over one. So there is one more way of plugging labels into holes where you satisfy all the constraints, and that is the other reading. So now we have the mathematical model, right. This is exactly how we model the underspecification framework. So you have these tree structures where some of the leaf nodes are holes, and you want to plug the root of some tree into the hole of another tree, and at the end you want to build a single tree structure which has no hole in it, all the holes are filled, and satisfies all the constraints that have been represented in the representation. So now this is a mathematical model and we want to solve it. There are two algorithmic problems that need to be solved. One of them is the satisfiability problem: whether there actually is any reading that satisfies all these constraints. And if there is, let's enumerate all the possible readings. Both problems are NP-complete. It's very intuitive why they are NP-complete. So people tried to find a tractable subset of them. The first tractable subset was dominance nets. But it has some limitations. They claimed that it covers the semantic representation of every coherent sentence, but that was not true. Later, sentences were found that are not covered by this tractable subset. So that's why we started working on that, and we tried to extend that framework so that it covers those sentences. So -- sure, go ahead. >>: To understand the complexity, NP-complete in the size of what? >> Mehdi Hafezi Manshadi: Let's say in the number of quantifiers. >>: The number of quantifiers? >> Mehdi Hafezi Manshadi: Well, actually, the number of quantifiers means the number of noun phrases in the sentence. >>: This is a very small number. So why is this a problem, I guess? I don't understand. >> Mehdi Hafezi Manshadi: No, it's not a small number, necessarily.
If you look at, let's say, sentences in the [indiscernible], we can easily have like ten noun phrases. Remember that a noun phrase doesn't have to have explicit quantification. Every noun phrase, definite, indefinite, bare noun, can carry a scope ambiguity. And actually, not every noun phrase introduces only one element. You have noun phrases that introduce more than one element in the domain of discourse, like plurals, for example. Why do you have collectivity versus distributivity? Because when you're talking about plurals, you're talking about a set. Let's say all the students, okay. You're talking about a set. But at the same time, you have a universally quantified variable over the elements in the set. So that's why when you say all the students met the teacher, there are two possible readings: whether the whole set as an entity is the argument, or there's a universal quantification over every element in the set and there is an individual meeting for every element. So there are noun phrases that introduce more than one element in the domain of discourse. So it is actually a problem in general. >>: Still a small number, even in noun phrases, and it seems like not everything is going to have a scope problem. Are there some statistics about, on average, how many things have to be resolved in a sentence? >> Mehdi Hafezi Manshadi: Well, obviously, it depends on the corpus. But remember that this underspecified representation is not a problem that only we have looked into, okay. And it has merit and value from a linguistic point of view, and that's exactly what I want to get into. Even if you don't care how much time this takes, there are still the satisfiability problem and the enumeration problem that you want to solve.
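The collective versus distributive contrast for plurals can be sketched schematically, with S the set of students and c the teacher (a sketch; the predicate names are mine, not from the slides):

```latex
% collective: one meeting, the set acts as a single entity
\mathit{meet}(S, c)
% distributive: a universal quantifier over the members, one meeting each
\forall x\, (x \in S \rightarrow \mathit{meet}(x, c))
```

This is why a single plural NP introduces more than one scope-bearing element: the set itself, plus the implicit universal over its members.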
The other thing, which is what I want to get into, is that these underspecification frameworks, and the study of how you can solve these underspecified representations in polynomial time, have resulted in defining a notion of coherence for us. So basically, this has helped us to develop a mathematical notion of the coherence of a sentence. We've actually converted the problem of predicting the scope ambiguity to predicting a linear order of quantifiers. So while in what you have seen so far you have these tree structures for scope disambiguation that you have to predict, actually, using the notion of coherence, you can reduce this problem to finding a linear order of quantifiers. So basically, my focus here is not that this NP-complete problem is a big thing. Maybe you're right; maybe it's ten readings, and who cares, for two to the ten we can actually do that and we don't care. But that's not the point I want to make in this talk. >>: So let me see if I got it, and I'll do it by analogy, with something like the traveling salesman problem. That's an NP-complete problem. >> Mehdi Hafezi Manshadi: Yes. >>: And there, there are lots and lots of possible solutions to enumerate. But if you have a particular solution, it's very easy to evaluate how good it is. >> Mehdi Hafezi Manshadi: Yes. >>: Just adding up the distances is trivial, and we understand how to do it. Here, a priori, it seems to me that the hard part is not enumerating the possibilities, but evaluating a particular possibility, how good it is. >> Mehdi Hafezi Manshadi: Yes. >>: And what you're saying is that, in fact, those two problems are linked together, and you're going to have a clever method for evaluating how good something is that, in fact, ends up constraining the search space; is that right? >> Mehdi Hafezi Manshadi: Yes, exactly. And that's what I'm going to get at.
And if you still have questions after I go over this, I would be more than happy to answer, okay? So okay. This is our formulation. Let's look at sentences with just noun-phrase quantifiers, no other scopal operators, okay? So if you have two quantifiers, it's easy to see there are four possible tree structures; not readings, I mean tree structures. Some of them are valid, some of them are not valid. So this is a notion that we define: heart-connected specification graphs, okay. So here, let's look at this sentence. Every professor, whom somebody knew a child of, showed up. And this is the underspecified representation. It's easy to see why. All you have is restriction predicates and the [indiscernible] hole of the quantifier, the dominance constraints, and also binding constraints. That's all you have. Nothing magic here. Now, what we do is we collapse every quantifier and its restriction into one node, okay, as you see in this figure. So only the interconnecting edges remain. And you get this graph, which we call a dependency graph. Now, the node which corresponds to the heart formula of the sentence, the main predicate of the sentence, we call the heart, and that's where the notion of heart-connectedness comes from. If you look at every node in this graph, the node can reach the heart using a directed path. So that's what we call the heart-connectedness property. It means every node can reach the heart using a directed path. Now, let's see why it helps to solve the problem in polynomial time. For those of you who are familiar with the vectorization algorithm, the idea of solving this problem is very similar to the idea of that vectorization algorithm, where you have dependencies between for-loops. So you have a for-loop and you want to see whether you can actually do it using a vector operation. So actually, I got the idea from that algorithm. So you recognize the strongly connected components.
And you collapse the strongly connected components each into one node. So now, you have a directed acyclic graph. And this directed acyclic graph you can solve easily. For example, if you take a topological order of the nodes, this builds a reading of the sentence. Now, once you do that, then you come back and you go inside every strongly connected component. For example, you go to this strongly connected component, and then you try to solve that. So this node was the heart of the main dependency graph. Now, within this small thing, the node that connects to the outside world we call the head, and it actually corresponds to the head of the noun phrase, the semantic head of the noun phrase. Now, that becomes the head of this small dependency graph, and you have another dependency graph, and now you want to solve that. Again you have a directed acyclic graph, and again if you have strongly connected components, you do the same thing recursively, and then you solve that. And once you solve it, you replace this node with the actual solution. So it's very similar to the idea of the vectorization algorithm, okay. The interesting thing is that once a sentence is heart-connected, if you pull out a quantifier and you want to put that quantifier on top, the rest of the quantifiers are divided into two sets. One set is definitely in the restriction of the quantifier on top, and the other set has to be in the body. So that's exactly where the concept of heart-connectedness helps. Once you have a node on top, the rest of the graph is deterministically divided into two sets that go into the restriction and the body. And that's exactly the point that I want to make about heart-connected graphs: it helps to reduce quantifier scope disambiguation to finding a linear order of quantifiers. >>: So I'm curious, what's the [indiscernible] evaluate which one. [indiscernible]. >> Mehdi Hafezi Manshadi: No, not yet. We'll get there.
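The heart-connectedness property itself is easy to check mechanically: every node must reach the heart via a directed path, which is the same as every node being reachable from the heart in the reversed graph. The sketch below is my own reconstruction of just that check (the node names are hypothetical; the recursive SCC-collapsing solver from the talk is not reproduced here):

```python
from collections import defaultdict

def heart_connected(edges, nodes, heart):
    """edges: list of (u, v) meaning a directed edge u -> v.
    Returns True iff every node has a directed path to `heart`."""
    rev = defaultdict(list)
    for u, v in edges:
        rev[v].append(u)                  # reverse each edge
    seen, stack = {heart}, [heart]
    while stack:                          # DFS from the heart over reversed edges
        for u in rev[stack.pop()]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == set(nodes)

# Toy dependency graph for "Every professor, whom somebody knew a child of,
# showed up", with showed_up as the heart (edge choices are illustrative).
nodes = ["every_prof", "somebody", "child", "showed_up"]
edges = [("every_prof", "showed_up"),     # the professor NP feeds the heart
         ("child", "every_prof"),         # "a child of" hangs off that NP
         ("somebody", "child")]           # somebody knew a child
print(heart_connected(edges, nodes, "showed_up"))   # True: coherent
```

If some NP node had no outgoing path to the heart, it would contribute nothing to the sentence's meaning, which is exactly the incoherence the talk's mathematical notion rules out.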
>>: Otherwise, you -- >> Mehdi Hafezi Manshadi: This is just about satisfiability. So you have some hard constraints and you have to satisfy all of the constraints, okay. And you want to know whether there is a reading or not. >>: But for one single sentence, sometimes you cannot do that. It's inherently ambiguous. If you have another sentence, or maybe the whole paragraph, then you apply the constraints; it's more powerful. >> Mehdi Hafezi Manshadi: Well, that's true. >>: Here, for one single sentence -- >> Mehdi Hafezi Manshadi: Even for one single sentence, the idea of constraint-based underspecification is that as you go deeper in linguistic processing, you add new constraints. For example, you go to the discourse level. Now you have some new information and you add that constraint. Now you go to the pragmatics level and you have some other information and you add that constraint. So as you go deeper and deeper, you add new constraints, and that's exactly the point of making -- >>: [indiscernible]. >> Mehdi Hafezi Manshadi: Yes. But, you know, that doesn't matter. You can actually have all the sentences in the domain of discourse contribute to the meaning of the sentence, and you do that at the same time with this big, huge underspecified representation. >>: So that constraint is just [indiscernible] otherwise, because it's binary. >> Mehdi Hafezi Manshadi: Satisfiable or not, at this point. >>: Okay. >> Mehdi Hafezi Manshadi: So why is it important? Because once you want to add a constraint, you want to make sure that your underspecified representation is still satisfiable. If it's not satisfiable, there is something wrong with that constraint, or something wrong earlier in the underspecified representation. That's why it's important, okay? Now, the beauty of heart-connectedness is that we can actually prove that the dependency graph of every coherent sentence is heart-connected.
We can linguistically justify it: strongly connected components actually represent noun phrases in the sentence. And if a sentence is coherent, this means that if this noun phrase introduces a variable in the domain of discourse, it contributes to the meaning of the whole sentence somehow. So there must be a predicate outside this noun phrase that has this noun phrase as its argument. So there must be an outgoing edge from this strongly connected component to something outside it. The same here. And what does that mean? That means at the end, finally, you are going to end up in the heart. So there must be a directed path from every node to the heart. And that's what we call heart-connectedness. So that's the notion of coherence that we defined. That's the mathematical notion of coherence, and we showed that it's actually linguistically justified. And that defines a tractable subset of underspecified representations: whether you care or not, you can solve it in polynomial time. But more importantly, that notion of coherence helps to reduce the problem of quantifier scope disambiguation to finding a linear order of quantifiers. Why? Because once you have a linear order, that linear order imposes exactly one tree. Why? Because if you have one quantifier on top of another quantifier, the rest of the quantifiers are deterministically divided into two sets: some of them are in the restriction and some of them in the body. So it's enough to find the linear order of the quantifiers, and that was exactly the point that I wanted to make. So that's the consequence of heart-connectedness that we basically get from this. And that helps us in the rest of the work we have. So it connects to what we have done before. But the problem has actually attracted a lot of attention in theoretical semantics and theoretical computer science.
However, in another answer to your question about why you may actually care: if you have a whole document, every sentence is used in a discourse. And usually when you want to do scope disambiguation, you have a discourse, and that discourse helps to find what the preferred scoping is. For example, let's talk about a natural language description of a series of instructions which at the end are going to perform a task, a programming task. So each sentence introduces some variables in the domain of discourse. But at the end of the day, you want to have the semantic representation for the whole set of sentences, if, let's say, it's like pseudo-code in natural language. So that thing is going to be big, and that's why you care, because once you want to add a constraint, you want to know whether it remains satisfiable or not. >>: Just to clarify, you are saying coherence? >> Mehdi Hafezi Manshadi: Yes. >>: Is that related to the linguistic notion of coherence, or do you mean something else? >> Mehdi Hafezi Manshadi: No, this is a mathematical notion. That's a mathematical notion. That's a mathematical model that we define for coherence, which is linguistically justified and helps here. >>: But it's different from the linguistic notion of coherence? >> Mehdi Hafezi Manshadi: Yes. But this is from a particular point of view, which helps on this representation to reduce the problem of quantifier scope ambiguity to predicting a linear order of quantifiers. So in general, if you are looking at the problem from different angles, you may have different definitions of coherence. But this is the part that helps. Let's say this is a necessary condition for coherence. So maybe it's not sufficient, but it's a necessary condition. So we get that necessary condition, and we get to these heart-connected directed graphs, okay? Okay.
So now we have translated quantifier scope disambiguation to predicting a linear order of quantifiers, right. So we want to do automatic scope disambiguation. In the classic approach, people used heuristic-based approaches. What did they do? If you look at works in the '80s and '90s, you see people use these kinds of heuristics, like lexical heuristics. For example, the quantifier each tends to have the widest scope. Or, for example, the subject tends to have wide scope over the direct object. So they used these kinds of heuristics to find a reading. >>: So how do these approaches evaluate what they're doing? I mean, how do you know that they're doing well? >> Mehdi Hafezi Manshadi: Exactly. That's another point that I'm going to get to very quickly, okay. The point is that when you don't have a corpus that has been labeled, how are you evaluating? So basically, people relied on their linguistic intuition, okay, to decide what the heuristics were going to be. But there was no way of evaluating them. And I get to that in a couple of slides. So now you have the heuristics, but sometimes they conflict, right. Like you have each in the direct object position: one heuristic says it has wide scope, another heuristic says it has narrow scope. So even here, they actually manually assigned weights based on their linguistic intuition. But a very natural extension of this is using a corpus-based method. First of all, you can define the heuristics as features. You can incorporate a lot of other features, like lexical features, like the words, which helps to encode domain-specific knowledge, which is very important. Instead of manually assigning weights using your intuition, you can train your model on a corpus and actually learn the weights, and you can move it to other domains using domain adaptation techniques, and you don't have to manually adjust the weights.
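The weighted-heuristic idea can be sketched as follows. The feature set and the hand-set weights are hypothetical, in the spirit of the '80s/'90s systems the talk describes; a corpus-based system would learn the weights from labeled data instead:

```python
# Hand-set weights for a few toy scope heuristics (illustrative values).
WEIGHTS = {
    "lex_each": 2.0,       # "each" tends to take the widest scope
    "is_subject": 1.0,     # subjects tend to out-scope direct objects
    "lex_a": -0.5,         # indefinite "a" tends to take narrow scope
}

def wide_scope_score(features):
    """Sum the weights of the heuristics that fire for this NP."""
    return sum(WEIGHTS.get(f, 0.0) for f in features)

def prefer_order(np1, np2):
    """Return the two NPs ordered by predicted scope, widest first."""
    return sorted([np1, np2], key=lambda np: -wide_scope_score(np["features"]))

# Conflicting heuristics: subjecthood favors "a teacher", but the lexical
# preference of "each" is weighted more heavily and wins.
subj = {"text": "a teacher", "features": ["lex_a", "is_subject"]}
obj = {"text": "each student", "features": ["lex_each"]}
print([np["text"] for np in prefer_order(subj, obj)])
# ['each student', 'a teacher']
```

The conflict-resolution problem the talk mentions is visible here: the outcome depends entirely on how the intuition-set weights trade off, which is exactly what a labeled corpus lets you learn and evaluate instead.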
And the other thing is that once you have a corpus, you have an evaluation. You can know how well you're doing if you have labeled corpora, right? So the natural extension is corpus-based methods. Okay, so let's get to a corpus-based method. But we have to build a corpus. Interestingly enough, there weren't many scope disambiguation corpora available. These are the previous corpora, and as you can see, they are very restricted. First of all, they all allow just two quantifiers in the sentence, and those have to be explicit quantifications. They don't handle definites, bare nouns, bare plurals, conjunctions — nothing. Just two explicit quantifiers: "some" and "a", "some" and "every". Even with this restriction, for this one, which was one of the best corpora available before we built our own, the inter-annotator agreement is 52%, which is very, very low on this basically binary decision of whether the first quantifier out-scopes the second one or vice versa.

>>: Is there always -- sometimes, don't you have to read the whole paragraph to understand --

>> Mehdi Hafezi Manshadi: The point is that this is actually one of the open problems in cognitive science. Even though we understand the meaning, when we want to spell out the scoping, it's really hard for us to do that. So that's why --

>>: [inaudible].

>> Mehdi Hafezi Manshadi: Right, so that's why even hand annotation of scoping is hard, and that's exactly why there were no corpora available before. People who did this just relied on their linguistic intuition. The first corpus became available in 2003, and it was very restricted. The ones after were even more restricted. For example, this one only looks at "every" and "a" — only these two quantifiers — and the sentences have an exact syntax, with a direct object. That's it.
So that shows how hard the problem is — people had to narrow it down that much to actually be able to solve it.

>>: [indiscernible].

>> Mehdi Hafezi Manshadi: Sure, and that's going to be my next slide, or the one after, okay? So we decided --

>>: I'm sorry. I just have one more question on the previous one.

>> Mehdi Hafezi Manshadi: Sure.

>>: When the inter-annotator agreement is 52%, how many different interpretations on average are there?

>> Mehdi Hafezi Manshadi: So 33% is the random baseline, because there were three labels. You have two quantifiers: wide scope, narrow scope, or no scope interaction. So because there were two quantifiers, you have one classification task with three classes — 33% if you pick at random. But in their corpus the baseline is actually higher, because usually the first quantifier takes widest scope, as I remember, so it's something like 40%. So 52% is pretty low, and that shows how amazingly hard this quantifier scope disambiguation is, even for [indiscernible] who want to spell it out. We'll get to why this is the case in later slides.

So now, given the corpora that were available, we wanted to build a corpus which is much richer. We want to incorporate all noun phrases and assume they can have scope interactions, and we want scopal operators like negation and logical operators. More importantly, we want to cover plurals — distributive versus collective — and everything. We want unrestricted, full, comprehensive scope disambiguation, okay? So now we are looking for a domain. And that comes to your point: ordinary language is sometimes really hard because the scoping is not intuitive for us.
But if you go to some domains where scoping is critical — like the legal documents you mentioned, or natural language programming, which I'm interested in — you actually have a very intuitive notion of scoping, because to write the code, you are consciously doing this scope disambiguation. So it becomes much more intuitive. That's why we picked this domain: a natural language interface for editing plain text files. These are the kinds of examples we are looking into.

So this is our corpus. We found 500 sentences from online resources, like one-liner tutorials for Linux commands, or from graduate students who use regular expressions for editing text files — we asked them to give us natural language descriptions of the tasks they do. So now we have these 500 sentences. We chunked them into noun phrases and scopal terms automatically, with manual supervision, so that we have gold-standard chunking. And then we did full scope annotation. But that's not easy, even though the domain seems so intuitive in terms of scope disambiguation.

Here are some statistics of the corpus, just so you have some idea of the distribution of the elements. We have about four noun phrases per sentence in this domain, and these are the distributions of explicit quantification versus definites versus bare nouns, and so on. Now, we actually present a notation for the first time, because the corpora built before had just two noun phrases and a single classification — the labels were wide scope, narrow scope, no interaction. But now we have several noun phrases and scopal operators, so we defined this notation for scope annotation. We have plurals, and for plurals we have two elements: the set itself, and quantification over the set.
In answer to your question, this element introduces two elements: the set, and the universal quantification over the set. This is basically how we annotate distributive versus collective readings. So we defined this notation. We also annotate coreference and [indiscernible], because we believe they help a lot when you want to do scope disambiguation, although that's not the main task we are looking into.

So we defined this notation. But, as I said, it's really hard, even in this domain. Why is this challenging? First of all, a lot of the time we have logical equivalence, because you have, say, two existentials, or an existential and an indefinite, and the order doesn't really matter. It turns out that in these cases, people rely on their intuition, and their intuition is not reliable. Even one person's intuition is not reliable: sometimes they say this one has wider scope, but later, given a very similar structure, a very similar sentence, they believe the other one has wider scope. Why? Because the two readings are logically equivalent, okay? That's one of the problems. You may suggest: why don't you just label those as the same? But because you don't even have the interpretation of the quantifier, it's not easy to say whether a quantifier is existential or not. For example, "a" can be existential, "a" can be indefinite, "a" can be referential. So if you don't have the label of the quantifier, it's even hard to find these equivalence cases.

>>: Who did you have annotate this?

>> Mehdi Hafezi Manshadi: I had two undergraduate linguistics students. They were pretty smart, actually.

>>: I came out of linguistics many years ago, [indiscernible] quantifiers, quantification, things of that sort.
One of the things we noticed in working with linguists is that our intuitions changed as we saw more examples of the phenomena and more discussions of what the possible analyses were. We started to see more. And so I was wondering whether they were training [indiscernible].

>> Mehdi Hafezi Manshadi: Yes.

>>: Multiple iterations, for example.

>> Mehdi Hafezi Manshadi: Sure.

>>: Use different results.

>> Mehdi Hafezi Manshadi: Exactly. That's exactly one of the points I want to make in this slide, actually. But first, let me tell you what the challenges are. Quantifier scope is really deep semantics, and the point is that you have to resolve a lot of semantic phenomena before you actually get to the quantifier scoping. One of the problems we had, for example, is the type-token distinction. What is the type-token distinction? Let's say I say, "Every line ends with a comma." What is your intuition? We are talking about text files: is there a different comma for every line, or the same comma — a comma is a comma, right? So somebody may put "comma" out-scoping "every," because some people interpret "a comma" as the abstract entity — the type, as philosophers call it — and some people interpret it as a token, a physical realization of a comma. Here, if you ask semanticists, very many of them say "every" takes wider scope over "a." But exactly the same people, if you give them this other sentence, say [indiscernible] takes wider scope. Why? Because this one is a definite — but it's the same concept as "a comma." These are the kinds of phenomena you deal with. You see inter-annotator disagreement, but the disagreement is not in the scoping; it's somewhere else. That's exactly the point, and that's why intuitions change.
So I spent something like two years with these two annotators. We iterated through the corpus, and we tried to find places where you have scope disagreement but the disagreement doesn't actually come from scope ambiguity — it comes from somewhere else — and to come up with an annotation scheme that tries to minimize this. If you refer to my very recent paper, you can see some of the problems and how we solved them; it's a conference paper, so it's very short, but hopefully the journal version will come out very soon.

>>: One of the reasons I asked the question about language [indiscernible] is because in Chinese, there's no [indiscernible] — it becomes more ambiguous.

>> Mehdi Hafezi Manshadi: It's the same, okay, but it doesn't change the phenomenon. You don't have articles — it's a bare noun — but it still carries the same concept, right?

>>: Except [indiscernible].

>> Mehdi Hafezi Manshadi: Yeah, and sometimes you have more — exactly, sure. Sure.

So this is the inter-annotator agreement that we got as of last summer. We are actually still doing the last iteration with the final annotation scheme, so we probably have better results than this, but this is what we had last summer. The inter-annotator agreement between the two annotators at the constraint level — for each pair of elements — was 75%. At the sentence level, where I count a sentence as correct only if every pair of quantifiers has been scoped correctly, it was 66%. There is also an easy version: since we also have a no-scope-interaction label for pairs, I ignored the no-interaction cases and only considered a pair incorrect if both annotators assigned a scope preference and they disagreed. That's definitely an easier metric, but maybe in practice it makes more sense.
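The chance-corrected agreement idea behind these numbers can be sketched as Cohen's kappa. This is a minimal illustration, not the corpus's actual evaluation code; the three scope labels and the toy annotations are invented for the example.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected inter-annotator agreement (Cohen's kappa)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's marginal distribution.
    ca, cb = Counter(labels_a), Counter(labels_b)
    p_e = sum(ca[l] * cb[l] for l in ca) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two annotators assigning one of three scope labels to quantifier pairs:
# 'wide' (first out-scopes second), 'narrow', or 'none' (no interaction).
a = ['wide', 'wide', 'none', 'narrow', 'wide', 'none']
b = ['wide', 'narrow', 'none', 'narrow', 'wide', 'wide']
print(round(cohens_kappa(a, b), 3))  # → 0.478
```

Raw agreement here is 4/6, but after subtracting the chance level the kappa is noticeably lower — which is why raw percentages and chance-corrected figures can look so different.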
So you definitely get a higher --

>>: Could you just explain what inter-annotator agreement is?

>> Mehdi Hafezi Manshadi: Sure. So you have two annotators, and let's say you have a classification task: you want to assign one of two labels — binary classification — to some points in the space. The two annotators label those points, and then you compare how much they agree. What do you do first? You subtract the agreement by chance. For example, in a binary classification there's a 50% chance they agree anyway, so if they agree on 50% of the points, the inter-annotator agreement is basically zero. So you subtract the chance part, and then you have some notion of the portion of the points where they genuinely agree on the labeling. That's roughly the idea of inter-annotator --

>>: [indiscernible] comma — what do you ask the annotator to annotate?

>> Mehdi Hafezi Manshadi: So we defined this problem of type-token distinction in the scheme and said: that noun phrase actually introduces more than one entity into the domain of discourse. When you have "a comma," you have two entities: one entity is the type, and one entity is the physical realization of the type. So we have two entities, and we ask them to annotate both. However, we have default rules so that the annotation doesn't become cumbersome. It becomes quite intuitive, the way we developed it.

>>: [inaudible].

>> Mehdi Hafezi Manshadi: It depends on whether the noun is a type-token noun or not. Not every noun is a type-token noun, okay?

>>: It has to be very complicated.

>> Mehdi Hafezi Manshadi: It's very complex. Very, very complex, yes.

>>: So one more follow-up question on the IAA.
So the higher the value of IAA, the better the quality of the --

>> Mehdi Hafezi Manshadi: Annotation, and the easier, you would say, the task. Because here even humans don't agree that much on what the annotation is. If you have, say, question answering over very common-sense knowledge, you have very high agreement among annotators. But in a case like this it's hard: even with all the things we considered and this large annotation scheme, we still only have 75% agreement between annotators.

>>: Since you do annotate the data, you could just use standard machine learning techniques.

>> Mehdi Hafezi Manshadi: Sure, yes.

>>: [indiscernible] as your structure base, I suppose you are using all these constraints [indiscernible].

>> Mehdi Hafezi Manshadi: Right. But this is just for the hand annotation — that's the annotation scheme for humans. Once we have the corpus, we just use machine learning techniques to learn it.

>>: I see. So do you use the structure as --

>> Mehdi Hafezi Manshadi: Yeah, of course. Those are features. Some features may help, some may help more. They help. Sure.

>>: So another question. What's the Kappa-easy?

>> Mehdi Hafezi Manshadi: That's basically when there is no scope interaction between the two elements of a pair.

>>: I see.

>> Mehdi Hafezi Manshadi: If one person says no scope interaction, and the other person actually places an order, we say that's okay.

>>: Okay.

>> Mehdi Hafezi Manshadi: Because in practice it doesn't matter: they know the pair is captured, but the order doesn't matter. That's the easy one.

>>: Are these two annotators different from the people who originally labeled the corpus, or are they the same people?

>> Mehdi Hafezi Manshadi: Well, there are two different people. One of them is from the two-person team that was involved in the annotation.
And the other was a new person — we just gave them the annotation scheme and said, okay, you should annotate with that.

>>: So for these annotations, there's a question about their agreement with the other annotations?

>> Mehdi Hafezi Manshadi: Right.

>>: Because they're sort of a true annotation baseline, presumably.

>> Mehdi Hafezi Manshadi: Yes.

>>: And you're going to use your machine.

>> Mehdi Hafezi Manshadi: Exactly, sure, sure.

>>: How did they do with respect to that?

>> Mehdi Hafezi Manshadi: I'll get to that in just a couple of minutes.

Okay. So we also tried to get unlabeled data, because it's always hard to get data, especially in this domain. Here's what we did. We had the natural language descriptions of editing plain text, and we annotated them with two examples of input/output text. We put that on Mechanical Turk — just the input/output — and we asked random people: give us a description of a task that, applied to the input, generates the output. Why give two? Because if you give one, there's always a trivial function that does it; with two, we rule out that trivial one. But still, you may have a lot of tasks that fit — that's okay. We don't care if it's exactly the same task.

>>: I'm confused. 5,000, 3,000, 4,000.

>> Mehdi Hafezi Manshadi: This is just the text file that you apply the task to; this is the task: sort all the lines by their second field, okay. So we gave them this input/output and asked them for a description, and these are the real descriptions they gave us.

>>: They have to detect the pattern.

>> Mehdi Hafezi Manshadi: Exactly, they have to detect the pattern, and they actually did a pretty good job. So now we have a really large corpus — we enlarged our corpus from 500 to 2,000 sentences using just crowdsourcing.
And we have a variety of syntactic structures, because now these are ordinary people — the people we'd actually like to work with in the end if we want to do natural language programming — not just programming experts who know regular expressions and have their own specific grammar. So we have a variety of syntactic structures and a variety of vocabulary.

>>: Were they just sorting problems?

>> Mehdi Hafezi Manshadi: No — editing plain text files. If you remember, we had 500 tasks to begin with, gathered from the internet, and now we annotated those 500 tasks with examples. They covered all sorts of regular expression processing — any kind of plain-text editing you could do with a tool like Linux sed.

>>: So given the naive crowd, how well were they able to identify what was intended by the input/output? I've found with very simple mathematical tasks — you give people a set of four numbers and say, okay, what's the change between the left-hand side and the right-hand side — they often can't figure it out.

>> Mehdi Hafezi Manshadi: Can or cannot?

>>: Cannot. Just by inspecting the numbers.

>> Mehdi Hafezi Manshadi: Right. So the point is that the original task may be about one thing, but the example may have more than one solution. We are absolutely fine with that. All we care about is getting more sentences; we don't care whether they hit our intended task or not. We just want to enlarge our corpus, so that's not a factor here — it's not important for us at all. Do you see my point? But how well they identify the task really depends on how good the examples are and how many plausible tasks there are, so it would be a very complex measurement. The setup just hasn't been designed for that purpose.

Okay. So let's have a quick look at a comparison with the existing corpora.
These are the three corpora that were available. People did automatic scope disambiguation: they trained statistical models and measured accuracy. These are the accuracies, just to give you some idea of the level of accuracy when you have a single classification — two noun phrases and three labels: A out-scopes B, B out-scopes A, no scope interaction. So this is the level of accuracy they get. Now let's see what we did.

>>: Is there a difference in baseline?

>> Mehdi Hafezi Manshadi: The baseline here is the majority label. For example, if no scope interaction is the most frequent label, then that's the baseline prediction, because it's a single classification, basically.

Okay. So now, because nobody has previously done quantifier scope disambiguation automatically with more than two quantifiers, we had to define a model for it. Basically, for every pair you have three kinds of labels — the same as people did previously. But now, for every sentence with N scope-bearing elements, you have N-choose-2 pairs, and for every pair, one of these three labels. What is this? It's a partial order. That's how we define quantifier scope disambiguation: for us, it is learning a partial order, which is equivalent to a directed acyclic graph, right. That's definitely a simplified model — it doesn't take care of all the phenomena that exist — but it's a fair, reasonable model. And it's interesting: okay, I understand why people haven't worked on quantifier scope ambiguity — it's hard, and very often it doesn't matter.
But even in the machine learning community, you don't see much work on learning partial orders. There is lots of work on learning total orders — ranking, because of ranking web pages and that kind of thing — but there is no work on learning partial orders, where not every pair of items has an order. For example, here, there is no order between these two.

>>: Oh, I see.

>>: I see. So this means partial order.

>> Mehdi Hafezi Manshadi: There are actually people who learn partial orders, but only in the sense that they're not confident what the order is — not because the items are genuinely incomparable.

>>: So pretty much by limiting the tree structure?

>> Mehdi Hafezi Manshadi: Exactly correct. That's basically what it is.

>>: It's hard.

>> Mehdi Hafezi Manshadi: Yes, it is hard. It's a structured learning problem in general. Now let's define evaluation metrics — there were no evaluation metrics before. We have the directed acyclic graphs, and we define metrics like precision and recall on the edges you recover correctly. We don't care that much about the pairs with no scope interaction; out of the pairs that have a scope preference, what portion do you get right — that kind of thing. And what does this plus sign mean? Transitive closure. Look here, for example: five out-scopes three. One annotator may have that edge explicitly, while the other annotator says you can entail it from the other edges. So to have a correct evaluation method, we build the transitive closure. Transitive closure means that if there is a path from A to B, you connect them with a single edge. The metrics are defined on this transitive closure, and our machine learning model is pretty basic.
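The transitive-closure metric just described can be sketched in a few lines. This is an illustrative reconstruction, not the talk's actual evaluation code; the edge sets are invented for the example.

```python
def transitive_closure(edges, nodes):
    """All (a, b) pairs such that a out-scopes b via some path (Warshall-style)."""
    reach = set(edges)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if (i, k) in reach and (k, j) in reach:
                    reach.add((i, j))
    return reach

def scope_precision_recall(gold_edges, pred_edges, nodes):
    # Compare closures, so an edge entailed by a path counts as recovered.
    gold = transitive_closure(gold_edges, nodes)
    pred = transitive_closure(pred_edges, nodes)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 1.0
    recall = correct / len(gold) if gold else 1.0
    return precision, recall

# Gold: 1 > 2 > 3 (so 1 > 3 is entailed). The prediction states 1 > 2 and
# 1 > 3 directly, but misses 2 > 3.
nodes = [1, 2, 3]
gold = {(1, 2), (2, 3)}
pred = {(1, 2), (1, 3)}
print(scope_precision_recall(gold, pred, nodes))  # → (1.0, 0.666...)
```

Note how the predicted edge (1, 3) counts as correct even though the gold annotation never wrote it explicitly — exactly the entailment case mentioned above.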
We basically wanted to define a baseline on this corpus for scope disambiguation. So what is this baseline model? It's a naive model: we treat every pair of elements as a single classification task, where you have either A out-scopes B, or B out-scopes A, or no scope interaction, and we do these classification tasks individually. So now it becomes a classification task, and we use a multiclass SVM to do the classification.

>>: How about where those words are in the sentence — the position?

>> Mehdi Hafezi Manshadi: Okay, sure. So your question is what kind of features we use. These are some of the features, just to give you some idea. We have this sentence, and these are the chunks, the scope-bearing elements. We want to predict a directed acyclic graph over these nodes — here we have, say, four nodes. We use a dependency parser to get the syntactic structure of the sentence, and we feed dependency features into our model. So that's where the syntactic --

>>: You could have [indiscernible].

>> Mehdi Hafezi Manshadi: Sure, of course. We also have lexical features of the domain — this is actually the most important part. In scope disambiguation, we use a lot of domain knowledge. In the text editing domain, usually every file has some lines, every line has some words, every word has some characters, right. That's the kind of knowledge our model tries to learn from data. And because the number of concepts is small, these 500 sentences are enough to get reliable statistics for this kind of domain knowledge. So these are the lexical features that help — also lexical features of the quantifiers, not just the nouns; the concepts and the quantifiers, plus the syntactic structure.
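The pairwise reduction just described can be sketched as follows. The `classify_pair` function here is a hypothetical rule-based stand-in for the multiclass SVM (it looks only at the determiner, echoing the "each tends to take wide scope" style of heuristic); the chunk format and the example sentence are invented for illustration.

```python
from itertools import combinations

def classify_pair(chunk_i, chunk_j):
    """Toy stand-in for the multiclass SVM: label one pair of chunks."""
    wide = {'every', 'each', 'all'}
    if chunk_i['det'] in wide and chunk_j['det'] not in wide:
        return 'i>j'          # i out-scopes j
    if chunk_j['det'] in wide and chunk_i['det'] not in wide:
        return 'j>i'          # j out-scopes i
    return 'none'             # no scope interaction

def predict_scope_dag(chunks):
    """Classify each of the N-choose-2 pairs independently; collect edges."""
    edges = set()
    for i, j in combinations(range(len(chunks)), 2):
        label = classify_pair(chunks[i], chunks[j])
        if label == 'i>j':
            edges.add((i, j))
        elif label == 'j>i':
            edges.add((j, i))
    return edges

# "Replace every occurrence of a comma in the file" -> three scope-bearing chunks.
chunks = [{'det': 'every', 'head': 'occurrence'},
          {'det': 'a', 'head': 'comma'},
          {'det': 'the', 'head': 'file'}]
print(predict_scope_dag(chunks))  # edges 0 -> 1 and 0 -> 2
```

Because each pair is classified independently, nothing in this naive baseline prevents the predicted edges from forming a cycle — one reason the talk calls it a preliminary model rather than a full structured learner.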
So these are the kinds of features we use. And this is our experiment. We used a hundred sentences for development, and on the other 400 sentences we did five-fold cross-validation. This is the average number of noun phrases per sentence, 3.6, and this gives the number of constraints — remember that every pair is one point in the space, one classification task, right. And this is the performance of our model, which gives you some idea of how the model does with respect to the inter-annotator agreement — I think that was your question. The baseline for us is left-to-right order; that's the best baseline we found. So this is the baseline, and this is our model, which gets 78%. If you compare with the inter-annotator agreement, which was 75%, the model actually goes above inter-annotator agreement — and that's exactly your point, because inter-annotator agreement is measured when you bring in another person; when two people have worked together, the annotations are much more consistent, so we can go well above this number. But this is a very preliminary model — we're just establishing a baseline on this corpus, so if people want to work on quantifier [indiscernible] later on this corpus, they have a baseline.

So these are the results we got, and here you can see a comparison with respect to other work. It's not really a fair comparison, because these are different kinds of tasks: here, you have very complex tasks; there, it's just two quantifiers and one single classification task. Also, the data is not the same — there you have the Wall Street Journal, this domain is very simple, this domain is very complex, and our domain is very different.

>>: Wall Street Journal — what kind of task? Do they annotate noun scope on the Wall Street Journal?

>> Mehdi Hafezi Manshadi: Yes.
You have two noun phrases with explicit quantification — restricted noun phrases, just restricted quantification — and they just decide whether A out-scopes B, B out-scopes A, or there's no interaction. That's it.

>>: So the 70-something percent is one metric. The question is, in terms of the task: if you were to take the interpretation of that sentence, how often does it get the right interpretation?

>> Mehdi Hafezi Manshadi: Well, that's another question, and I'm going to get to that in the next slide.

>>: [indiscernible].

>> Mehdi Hafezi Manshadi: I'll get to that in my next slide. But not in this domain, because the data is not available here. The data for that work is available, though, and yes, I did try it — this is how we did on their corpus. We did somewhat more poorly, but that's very natural, because we developed our model on our corpus and they developed their model on their own corpus. We just applied our model without any modification, with no domain adaptation, and it's pretty competitive with what they achieved with their own model.

>>: It looks like [indiscernible], so what if you look at some structured domains like [indiscernible]? Do you think this technique would work much better?

>> Mehdi Hafezi Manshadi: Well, natural language programming is already, I think, a very intuitive domain, and that's why we picked it. I don't think you can get easier than that, because, as I said, you have to have conscious knowledge of scope disambiguation to do this task at all.

>>: But then people are probably saying things [indiscernible].

>> Mehdi Hafezi Manshadi: Right.

>>: So is that your text editing tasks are [indiscernible]?

>> Mehdi Hafezi Manshadi: Well, I mean, sure.
That's one — as I said, there are three domains I'm interested in working on, as I previously mentioned: legal documents, because they have to be precise; scientific texts; and natural language programming. For now, we have used the natural language programming domain, but moving to domains like scientific texts and seeing how different they are is definitely on my future work list. I really don't make any conjecture about whether it's going to be better or not — we'll see.

Okay. Now I want to get to another point that you made. Say we don't get the quantifier scope disambiguation right — are we still able to do the actual task correctly or not? Because it's not the main topic of my talk and this is work in progress, I'm not going to go into detail, but if you are interested, I'm more than happy to talk about it after the talk. This is where quantifier scope disambiguation meets programming by example. There's a concept that has been introduced in the last couple of years in statistical NLP: indirect supervision. What does indirect supervision mean? Let's say you have a question answering system. With full supervision, you have, say, the syntactic parses of the sentences and the semantic representations of those parses, and you train your model on them. But suppose you don't have any of those — you just have the question and the answer. That's all. So it is sort of supervised, because you have some label, but it's indirect supervision, because you don't have the actual semantic representations. That's pretty much what we are doing here.
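The indirect-supervision idea — judging a system by whether its predicted program reproduces the input/output examples, rather than by hand-labeled scopings — can be sketched like this. The task description, the examples, and the `candidate` function are all hypothetical, made up for illustration.

```python
# Indirect supervision: each natural language description comes with
# input/output text examples instead of annotated scopings. A predicted
# program counts as correct if it reproduces every example.
def consistent_with_examples(program, examples):
    return all(program(inp) == out for inp, out in examples)

# Hypothetical task: "Add a period to the end of every line."
# Two examples rule out the trivial single-example function, as in the talk.
examples = [
    ("one\ntwo", "one.\ntwo."),
    ("a\nb\nc", "a.\nb.\nc."),
]

def candidate(text):
    return "\n".join(line + "." for line in text.split("\n"))

print(consistent_with_examples(candidate, examples))  # → True
```

Under this criterion the scoping itself is never checked directly: a model that mislabels the quantifier order but still yields a program consistent with the examples is counted as correct, which is exactly the point made in the talk.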
So what we did for these tasks: we annotated them with examples. We used crowdsourcing so that we have many examples — say five, six, eight examples per task. Then we used a programming-by-example system to learn the program. Now you don't need human annotation anymore: you have the natural language description, and you have the program that was learned from the examples. Then you can see whether your model is able to find that program or not. And it doesn't matter — maybe you even get the quantification wrong, but as long as you're able to predict the code, you're good. That's basically what we are working on right now, and I believe it's the right way to go, because that's what humans do. We don't do quantifier scope disambiguation in the abstract; we have a model of the world, and we disambiguate with respect to that model. That's why we do it so easily: once you have the model and the final goal, the scope disambiguation is much easier. So my current research is on doing these tasks using this indirect supervision, without manual annotation of the scoping by humans.

Okay, let me summarize. We had contributions in three different areas of quantifier scope ambiguity. In scope underspecification, we found the largest tractable subset in the underspecification framework found so far, and we showed that every coherent natural language sentence belongs to that subset. So I believe we have solved that problem — the one people had been working on for two, three decades — because every coherent natural language sentence is in our subset.
In scope annotation, we developed the first comprehensive scope disambiguation corpus, in which we consider all the scope-bearing elements: quantifiers, bare nouns, plurals, and scopal operators. And finally, we have the first, very naive model that does unrestricted quantifier scope disambiguation, where you can have more than two scopal elements in the sentence, and we developed the evaluation metrics, the model, and everything. So once again, this is the summary of our contributions to scope underspecification -- the other thing there is our notion of coherence; remember, we showed that if you have a linear order of quantifiers, that's enough, which helped a lot, so you don't have to predict tree structures -- and this is our contribution in scope annotation, and finally the scope disambiguation. So what is the future work? Improving the annotation scheme: we can still do that if we want to do hand annotation; in every round of annotation we have done, we have improved, and we still can. Expanding the corpus to more natural, less restricted domains, or even strict domains like legal documents or scientific texts -- I'm very interested in applying this to other domains. And then using the model without hand annotation on a new domain, with domain adaptation techniques, to see how well it does -- that's another interesting problem to look at. And finally, and more importantly, learning to do scope disambiguation with indirect supervision, where you don't have the annotation of scoping, which is what I think matters and what is actually important in practice. So that was my talk, and I would be more than happy to answer any questions.

>>: So which [indiscernible] disambiguation versus just [indiscernible]?

>> Mehdi Hafezi Manshadi: This is much harder than [indiscernible] disambiguation.
As I told you before, our first corpus had only 52% inter-annotator agreement. Even for word sense disambiguation, with those crazy fine-grained senses that are really hard even for human beings, inter-annotator agreement is still in the seventy-something percent range. So scope annotation is really hard, because it's a very deep semantic phenomenon, and you have to resolve a lot of other issues before --

>>: [indiscernible].

>> Mehdi Hafezi Manshadi: Exactly, you cannot grab Jack on the street and say, do a scope annotation for us. It doesn't work.

>>: So you're interested in legal and scientific domains. What's the application of this to those domains? If you had this and you could do it, how would it help you?

>> Mehdi Hafezi Manshadi: Well, there are a lot of things you can do. For example, you can do [indiscernible]. You have the laws expressed in natural language sentences, and you want to tell whether something is a crime or not, for example. You have the full semantic representation and you can do the entailment on it. So that's one thing. The other thing is, when you are working on scientific texts, you can do problem solving: you have a physics question and you want to see whether somebody's answer is correct or not -- automatic evaluation of answers, and so forth. Tutoring systems, for example.

>>: [indiscernible].

>> Mehdi Hafezi Manshadi: Actually, no. You know why? Because in machine translation, people don't care. You just translate a noun phrase to a noun phrase, and you don't care whether this noun phrase has wider scope. There might be cases where it actually matters, depending on what your preferred scope is, but as --

>>: But think about just [indiscernible], there's not this type of --

>> Mehdi Hafezi Manshadi: It doesn't matter.

>>: [indiscernible].
>> Mehdi Hafezi Manshadi: But the point is that in machine translation, we don't even do semantics -- it's very, very shallow. So it will take a long time to get to the point where this actually matters.

>>: I would imagine it being used as a sort of final-stage re-ranking, maybe checking whether the original has been preserved, or maybe [indiscernible] preserves the ranking order. But it's a lot of ifs for very little gain.

>> Mehdi Hafezi Manshadi: For example, in Persian there is word order variation, so the scoping you actually mean to convey may depend on the order of the constituents in the sentence. That may actually help at that point. But that's looking long into the future. Right now --

>>: In Japanese, scope order is generally read off the sentence, the [indiscernible] of phrases. Introduces ambiguity.

>> Mehdi Hafezi Manshadi: Right.

>>: If you want to keep that ambiguity, then maybe you need to [indiscernible].

>> Mehdi Hafezi Manshadi: Okay.

>>: So think of it as less ambiguous.

>> Mehdi Hafezi Manshadi: Well, because English has articles all the time, it may help. In Persian, most of the nouns are bare nouns, so you don't have much of a clue -- at least the lexical item of the quantifier may help, and it actually does help in English; we've used that as a feature -- but you often don't have that in Persian. So it makes it a little harder, but I don't think that much.

>>: So there's been a fair amount of work on understanding or doing natural language interfaces to databases, natural language queries. And the question, I guess, is: given the investment in that, and there is at some level some commercial interest in that capability, how does your work relate to it, and why is it not available right now, in terms of success?

>> Mehdi Hafezi Manshadi: Sure, okay. First, let me answer your first question. That's a very, very good point.
So as I said before, people really weren't very interested in scope ambiguity, especially statistical people. But recently, people have tried to go deep, for example with natural language queries to databases. We have this geography domain, where you have the geography of the United States and there are questions like, what is the state that borders this and this. And actually, the best model known so far, the state-of-the-art model, which was proposed by -- what was his name? I forgot -- [indiscernible] last summer at ACL, actually incorporates scope ambiguity into the model. However, because there you have a lot of proper names, like state names, you don't have many scoping interactions. So they were able to have quantifier scope ambiguity modeled in the system, and it actually helped them get better performance, but they didn't have it as a separate model. So yes, that's definitely going to help in natural language queries to databases. That's actually one of the applications I was looking at when I started; then I got into natural language programming and I stuck there. And why is it not available? It's actually going to be available very soon, hopefully at the end of the summer -- before I defend my dissertation, which hopefully is this summer. The data will be available on my web page, and you can get the data, the classifier trained on the corpus, and everything. The PDFs of all my publications are available, though a couple of them are very recent, so you may want to wait a couple of weeks for them to be on the internet. But you can email me and I can send you my publications.

>> Sumit Gulwani: So let's thank our speaker now.