>> Yuri Gurevich: It's my pleasure to introduce Tamar Aizikowitz. And she will speak
on conjunctive grammars and synchronized alternating pushdown automata. Please.
>> Tamar Aizikowitz: Thank you. So today I'm going to be presenting joint work with
Michael Kaminski. Both of us are from the Technion - Israel. And hopefully by the end
of the lecture, we'll all understand all of the words in the title.
Okay. So context-free languages combine expressiveness with polynomial parsing,
which makes them very appealing for practical applications. And in fact they're possibly
the most widely used language class in computer science. They form much of the
theoretical basis for fields such as programming languages, computational linguistics,
formal verification and more.
Now, in recent years the goal of several people in formal language theory has
been to find models of computation that generate a slightly stronger language class
without sacrificing polynomial parsing.
And the reason is that those types of models seem to have great potential: every
applicative field that's based on context-free languages could potentially do more
things if we plug in a stronger model without sacrificing efficiency.
And, in fact, several fields -- for example, computational linguistics have actually already
voiced the need for a stronger language class.
So one of these models is called conjunctive grammars. They were introduced by
Alexander Okhotin in 2001. And they're an extension of context-free grammars. And
they're an extension in the sense that they add explicit intersection rules.
So, for example, here you can see that from the variable S we derive a conjunction of two
variables, in this case, A and B -- and, again, we'll get into the specifics of this
later -- but let's say we continue the derivation and at some point each of them has derived
a word. And then if they both agree on the same word, the conjunction can be collapsed
and we have a derivation of W.
So essentially what we have here is the language of A and B of that conjunction is the
intersection of the language of A and the language of B. They both have to be able to
derive the same word for the conjunction to derive that word.
Now, for those of you who might not remember from their undergraduate studies,
context-free languages are not closed under intersection, so this is a stronger language
class.
Yet, we do have polynomial parsing for this language class, so, again, it is a candidate for
practical applications.
>>: And it has all the other -- it retains all the other nice properties that context-free
languages have?
>> Tamar Aizikowitz: A lot of them.
>>: [inaudible] all the other operations?
>> Tamar Aizikowitz: We'll actually get to closure properties towards the end of the
lecture, but, yes, a lot of the properties are kind of the same, which is what I think makes
these grammars pretty cool.
So our major contribution has been to introduce the automaton model for this class of
languages, which we originally showed in a 2008 paper. It's an extension of classical
pushdown automata. This is a synchronized alternating -- et cetera; we'll just call it
SAPDA, because it's a bit shorter.
Now -- whoops, I went backwards instead of forward. There we go.
Okay. So in an SAPDA, the stack is modeled as a tree, as we can see here. So instead of
just having one stack, we have something more along the lines of a tree of stacks. And
throughout the computation we require that all of these branches that we open up, they all
need to accept.
So because we have the first one needs to accept and the second one needs to accept,
that's where we get the -- again, the intersective quality.
The model uses a limited form of synchronization to create localized parallel
computations. So, for example, these two can do their computation in parallel. But at
some point they're going to have to synchronize and collapse back to their parents, so it's
not a complete separate computation for the two of those.
This is the first automaton counterpart that has been shown for conjunctive grammars, so
it's kind of closing a gap that existed in the theory for these grammars.
And I'm also going to touch on a second work that we recently showed in the formal
grammars conference that was just held where we show that a subfamily, one-turn
SAPDA, is equivalent to the linear subfamily of conjunctive grammars. This mirrors
the classical equivalence between one-turn pushdown automata and linear grammars, so it
kind of strengthens the claim that these are in fact a natural counterpart.
>>: So your machines accept exactly the same --
>> Tamar Aizikowitz: Exactly the same.
>>: -- process as --
>> Tamar Aizikowitz: Yes. Conjunctive grammars. Exactly.
Okay. So in our talk today we'll start out with a slightly more rigorous definition of the
computation models themselves, then we'll do a very high-level look at the equivalence
results, and we'll talk a bit more about the subfamilies that I mentioned before.
Then we'll talk a bit about the language class, the characterizations, the closure
properties. We'll also talk a bit about mildly context sensitive languages, which for those
of you who might not be familiar with them, that's from the field of computational
linguistics. And we'll wrap up with a short summary and some future directions.
>>: So one of those things at the end is [inaudible] examples of really useful things that
are not context-free?
>> Tamar Aizikowitz: Yeah. I'm going to try and show some of the types of languages
that we can generate using these models and hopefully convince you that they're
interesting.
Okay. So let's start out with the model definitions, and first the grammars. So a
conjunctive grammar has four sets: a set of non-terminals or variables I sometimes call
them, a set of terminal symbols, derivation rules, and the start symbol. And this is
exactly like a regular context-free grammar. The difference is in the derivation rules.
So now from a non-terminal A we can derive a conjunction of any number of strings over
terminals and non-terminals. So instead of just having one like in a regular grammar,
here we have a conjunction.
If we do happen to have exactly one, that's just a standard context-free grammar rule, so
it is a conservative extension.
For example, here we can see a rule with three conjuncts, so A derives little aAB and Bc,
et cetera, or we can also have the regular types of rules. That's fine.
Okay. Now, in terms of the way our derivation looks, we have two types of derivation
steps. The first is the standard rewriting step. So I could take any variable in my
derivation and replace it with the right-hand side of one of its rules. Except in our case
that right-hand side could potentially be a conjunction of more than one string.
But now once we've introduced these conjunctions we need a method to kind of get rid of
them. And that's where the second type of step comes in. If at some point we have a
conjunction where all the conjuncts are the same terminal word, we can take that
conjunction and just collapse it back to the word itself.
In terms of the language, we're talking about all of the terminal words derivable from the
start symbol. And this is the important part. Because the parentheses symbols and the
ampersand sign, they're not part of our terminal alphabet, we have to get rid of them.
If we were trying to derive a word in the language, any conjunction that we open up
during the course of the derivation will have to be collapsed. Okay. So anything we
create we have to -- they have to ultimately agree with each other, all of the conjuncts,
derive the same word, and then they can be collapsed back.
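Just to pin down the two step types in symbols -- this is one way to write them, not
necessarily the notation from the slides:

    rewriting:   s1 A s2  =>  s1 (alpha1 & alpha2 & ... & alphak) s2,  using a rule A -> alpha1 & ... & alphak
    collapsing:  s1 (w & w & ... & w) s2  =>  s1 w s2,                 where w is a single terminal word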
>>: [inaudible]
>> Tamar Aizikowitz: Sure.
>>: The last clause, is it really important to require the Ws?
>> Tamar Aizikowitz: Is the terminal word? Actually, it is. It changes the power of the
model. It's significantly stronger if we allow it to be any string.
It may be a bit hard to see why at this point -- I was confused by this at first as well. It
was the first question I asked. But this way we're forcing them to agree on the end result,
what is the word that we derived. And that's the word that's going to be in the
intersection of our languages. If they agree on kind of an intermediate stage, then from
that intermediate stage we can still go in lots of different directions.
So it gives an added power to the model. It is critical that it be like that. And actually in
the automaton, we'll see how -- there it's much clearer that it has to work in that way.
Okay. So, again, like I said, every conjunction that we open up will have to collapse
back.
And that's -- again, that's actually what gives us that intersective quality. If we have a
word derived from a conjunction, that can happen if and only if it is derivable from each
and every one of the conjuncts independently. And, again, that's what gives us that the
language of A and B is the intersection of the languages. Okay.
Let's see a quick example. So we're going to look at a conjunctive grammar for the
multiple agreement language, a^n b^n c^n, which is one of the simplest, most classic
examples of a noncontext-free language.
And assume that we have a set of, in this case, four standard context-free rules that add
up to the fact that the language of the non-terminal C is a^n b^n c^m. So coordination of
the first two sets.
And in the same way we can have another set of standard context-free rules where the
language of A will be a^m b^n c^n, so coordination of the last two.
And now very easily we can add a single conjunctive rule from the start symbol, which
will give us that the language of S is the intersection of the languages of C and of A,
which I'm sure you all agree is the multiple agreement language. So if it's a finite
intersection of context-free languages, we can very easily show a grammar for it.
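One standard way to fill in those rules -- this is a reconstruction in the spirit of
Okhotin's examples, not necessarily the exact rules on the slide -- is:

    S -> C & A
    C -> Cc | D      D -> aDb | epsilon      (so the language of C is a^n b^n c^m)
    A -> aA | E      E -> bEc | epsilon      (so the language of A is a^m b^n c^n)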
And this is what the derivation looks like. We start out of course with this conjunctive
rule. And then from this point on, both of these derivations are independent until we
collapse them back.
Okay. So let's just say we start with C, so we develop C in the regular way using the
rules that we have, and ultimately here we've reached abc, the word that we were trying
to derive. Now same thing from A. And once I have that word on both sides, I can
collapse it back and that's a derivation of abc.
>>: [inaudible] if you have ANBNCN, BN also, BN, you can keep doing this, right?
>> Tamar Aizikowitz: Exactly. Any questions at this point? Okay. So those are the
grammars.
And now let's take a look at our automaton model. So, again, SAPDA are an extension
of classical pushdown automaton. The interesting thing about them is the transitions are
made to conjunctions of state and stack-word pairs.
So, for example, let's say I'm in some state Q and the next input symbol is sigma and the
top stack symbol is X. So now I have two optional transitions. The first I can transition
to state P1 while writing XX to the stack, and to P2 while writing Y, or the second option
is to transition to P3 while writing Z to the stack.
So, first of all, notice that we have more than one option. This is still a nondeterministic
model, so there's many possible transitions. And also we can notice that if we only have
one of these -- if the conjunction is only of one pair, like in this case, then that's a regular
PDA transition, so again we have a conservative extension of the classical model.
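Written out as a value of the transition function -- just my rendering of the example
being read off the slide:

    delta(q, sigma, X) = { (p1, XX) ∧ (p2, Y),   (p3, Z) }

The first element is a conjunctive transition with two conjuncts; the second is an
ordinary PDA-style transition with a single conjunct.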
Okay. Now, the question is how in a pushdown automaton can we make two transitions
at once. Okay. Obviously a regular automaton can't cope with that. So for that reason,
the stack of an SAPDA is a tree. Okay. And if I have a transition to n pairs, that splits
the current branch that I'm in right now into n branches.
So here you can see in the example, we had one branch with A at the top and the state
was Q. We apply an appropriate transition, which in this case goes to -- has two conjuncts,
so we open up two branches; one with the state and the contents from here and one with
the state and the contents from there.
Okay. Now, once those are opened, the branches are processed independently. So for a
portion of the computation, that's like having two separate automatons doing whatever
they want. The question is what happens if the stack's empty. Okay. A pushdown
automaton can't continue its computation if the stack is empty. So this is where the
synchronization part comes in.
If I have sibling branches that are all emptied, I can collapse them back to the parent
branch if they're all synchronized.
Okay. Now, what do I mean by synchronized. They have to agree on what the next step
of the computation is going to be. So that means they have to be in the same state and
they have to have read the same portion of the input. Okay. So if that's the case, for
example, as in here, I could collapse them back and now the computation continues -- just a moment -- from the parent branch with the state that they agreed on and with the
next input character being whatever they were both about to read. Yes.
>>: So is this about the [inaudible] the same portion of the input? I thought that the
input reading was always synchronized [inaudible] computation proceeding along these
[inaudible] branches, you give the same input for all of them, they all make transitions
[inaudible].
>> Tamar Aizikowitz: You could model the automaton in that way. The question was
why don't they all read the input synchronously. So it would be equivalent if they did. It
actually would make our proofs and showing the correlation with the grammar a bit more
difficult. It's like adding another constraint that we don't really need.
As far as I'm concerned, when they're like this, they're each doing their own thing. Now,
if one of them read more of the input, then obviously they can't be collapsed back and
continued because they're not synchronized.
>>: [inaudible]
>> Tamar Aizikowitz: It's the same expressive power. It's the exact same thing. Okay.
Any other questions? Okay.
So the automaton looks exactly like a regular one. It has states and input symbols and
stack symbols, an initial state, an initial stack symbol, and the transition function. And,
again, like before, the difference is in the transition function, which, as we saw,
transitions to a set of conjunctions of these state and stack-word pairs.
And in terms of how we describe the configuration of the automaton, we do that using a
labeled tree, so each branch has a node in the tree. If it's an internal branch, then the label
is just the contents of that stack. And if it's one of the leaves, then we also remember the
state, the remaining input and the contents of the stack. So it's kind of an extension of a
regular configuration for an automaton.
Okay. So like we said, each step of transition is applied to one of the stack branches. We
could equivalently have done all of them together, but it's actually simpler to look at it
this way.
If a stack branch is empty, it cannot be selected. And if we have synchronous empty
sibling branches, we can collapse them. And, again, synchronous means that they agree
on the state and on where they are in the input.
So an initial configuration is as we would expect, the initial symbol in the stack and the
initial state. In terms of acceptance, throughout the talk today we're going to be
considering acceptance by empty stack. So we don't care what state we're in, we just
need to empty the stack.
The model does have an equivalent mode of acceptance using states, so exactly like with
a classical pushdown automaton. And the language just -- even ignore this. The
language is all of the words that have an accepting computation. Okay?
And like we said before, because -- think of it this way: We're going through a
computation, we're opening up all of these branches, but ultimately to accept the word,
we want the stack to be completely empty.
So any branch that we've opened we need to collapse. In other words, to collapse them
they need to be empty. And empty, our way of looking at it is kind of like acceptance.
So these -- all these little subcomputations that we open up, they all have to agree with
each other and they all have to accept. And, again, that's where we get the intersective
quality.
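For readers who like data structures, here is a minimal Python sketch of how a
configuration tree and the split and collapse moves could be represented. The class and
method names are invented for illustration; this is not an implementation from the paper.

    class Branch:
        # one node of the stack tree; leaves also carry the state and the unread input
        def __init__(self, stack, state=None, remaining=None):
            self.stack = stack          # this branch's stack contents, top at index 0
            self.state = state          # only meaningful for leaves
            self.remaining = remaining  # only meaningful for leaves
            self.children = []

        def split(self, conjuncts, remaining_after_read):
            # apply a conjunctive transition: the top stack symbol is consumed and each
            # conjunct (state, pushed_word) opens its own child branch
            self.stack = self.stack[1:]
            self.children = [Branch(pushed, state, remaining_after_read)
                             for state, pushed in conjuncts]
            self.state = self.remaining = None   # this node is now internal

        def can_collapse(self):
            # sibling branches are collapsible only when synchronized: all empty,
            # all in the same state, all at the same position in the input
            return (self.children
                    and all(c.stack == "" and not c.children for c in self.children)
                    and len({(c.state, c.remaining) for c in self.children}) == 1)

        def collapse(self):
            # fold the synchronized children back in; this branch becomes a leaf again
            self.state, self.remaining = self.children[0].state, self.children[0].remaining
            self.children = []

A word is accepted once the whole tree has collapsed back to a single empty branch with
no input left.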
>>: What's the bottom symbol in the --
>> Tamar Aizikowitz: This? That's the initial symbol in the stack. So when I start out
my computation, that's what the stack looks like. It has just one root branch with that
symbol in it. It could be any symbol.
>>: It gets eliminated somehow before you [inaudible]?
>> Tamar Aizikowitz: That's up to how we define the automaton. But, yes, ultimately
we're going to have to eliminate it because we have to have a completely empty stack.
Okay?
Okay. Let's see an example. And it's actually a bit more sophisticated than the previous
one. Here we're going to look at the reduplication with the center marker language,
which is all the words of the form W$W. So a repetition of a series of symbols with a
dollar sign delimiting the two repetitions.
Okay. Now, this is actually a pretty interesting language. You were asking before if we
would show what type of languages we can accept here. So actually a lot of patterns that
we see in those applicative fields that are currently using context-free languages to model
what they're doing, they actually find all these phenomena that aren't context-free, and
a lot of those phenomena have that form of a repetition of symbols.
So, first of all, we have copying phenomena in natural languages. For example, is she
beautiful or is she beautiful. So we have repetition of the same phrase with a connector
as a delimiter in this case.
Okay. We also have all sorts of patterns in DNA and all sorts of microRNA patterns.
Don't ask me what they mean. But they're pretty cool, from what I understand. And,
again, they have a repetition of symbols with something delimiting in the middle.
So this is a language that kind of describes something that we tend to see. Actually I'll
have another example that is similar to this having to do with programming languages
further on in the talk.
>>: [inaudible] context-free?
>> Tamar Aizikowitz: No.
>>: It's not?
>> Tamar Aizikowitz: It's not. WW reverse is context-free. But WW isn't.
>>: This one belongs to your --
>> Tamar Aizikowitz: Yes.
>>: All right.
>> Tamar Aizikowitz: Not only -- actually, that's a good question. Not only is it not
context-free, it is also not the finite intersection of context-free languages.
So as opposed to the multiple agreement language that we saw before, which was just
take two context-free languages and intersection them, here we can't take any finite
number. We're going to have to do something a bit more sophisticated in order to be able
to accept this language. Okay? Yes.
>>: Is a general XML document also an instance of this where you have matching tags
nested?
>> Tamar Aizikowitz: No, because that's more like parentheses, and matching balanced
parentheses is probably the classic context-free language example.
Here we're looking for --
>>: [inaudible] any word in XML tag it has to --
>> Tamar Aizikowitz: Oh, in terms of the tags themselves, yeah, you -- okay. If you --
okay. Depends at what level you look at the XML. If you kind of look at it maybe
abstracted a bit to what it isn't, then, yes, that would just be balanced parentheses, and
then that would be context-free.
If you actually look at the data that you're putting in and the tags that you're putting in,
then that could go from something like this to infinite alphabets and all sorts of
interesting directions.
So it depends at what level of abstraction you're looking at the XML. It's true that if you
are looking to see the same string of characters for that opening and closing tag, then,
yeah, that would be very similar to this. Yes.
>>: If it belongs to conjunctive grammar and conjunctive grammars are [inaudible] the
languages of conjunctive grammars obtained of context-free languages by means of
intersection, you say this is --
>> Tamar Aizikowitz: Right. So what we're going to have to do here is actually --
>>: How do you build it?
>> Tamar Aizikowitz: Right. So we -- give me one more slide. Okay? It's a good
question; give me one more slide.
Okay. So we're going to construct an automaton for this. Of course we could do a
grammar as well, but I want this to be an automaton example.
Kind of food for thought: It's not known whether reduplication without a center marker
can be derived by a conjunctive grammar. So if I take out -- we're going to really use that
dollar sign to be able to build our automaton, or if we were building a grammar, then the
grammar.
The question is if we take out that dollar sign can we still do that with these models, yes
or no. That's actually a question that Okhotin posed already in 2001. So it's not a new
question. But it is a question.
>>: That was my next question.
>> Tamar Aizikowitz: Okay. So especially here it really -- it is an open question. It --
we don't know whether it's true or not.
Again, in most applications it's actually not critical because we usually do have some sort
of delimiter. So it's still very useful, but we don't know if we could do without it or not.
If anyone has any ideas, you're welcome to --
>>: I'm interested in this XML question.
>> Tamar Aizikowitz: Okay.
>>: Sorry. So XML you have arbitrary other, you know, tag pairs in the middle. So
would this be -- does this have to be a constant -- [inaudible] have to be a constant string
in order for this to work out, or can it be something more general?
>> Tamar Aizikowitz: I think as long as you are able to 100 percent know that
the delimiter has started at some point and ended at another point, as long as I know I can
see that that has happened, then it doesn't have to be a specific string.
But I have to be able to recognize it. There can't be any guesswork in terms of the
delimiter. It can be a really long string, it can be a lot of different options, but I have to
be able to recognize it.
Let's take a look at the construction. Hopefully you'll see a bit more how we use it.
So I know I said we're going to do the reduplication language. But we're going to do
something close just so that it's not too complicated.
We're going to do the same language except you see that U part? Maybe we have some
extra characters after the dollar sign. And the reason that I'm going to do it that way is
because that is really easily modified to be an automaton for reduplication, because all we
have to do is add another branch that makes sure that the number of characters before and
after the dollar is the same number.
So that's not the interesting part. That's a context-free criterion, just counting up and then
counting down. So we're going to focus on the other part, and that's going to be the
language that we define.
Now to Yuri's question. So it's not a finite intersection, how are we going to do this?
And the answer is going to be recursion. Okay. We're going to open up these branches,
these conjunctive branches based on the number of characters we have in our input word.
And that's how we're going to be able to derive this language. So it's a pretty nice
technique.
Okay. So what's the idea going to be. For every letter in the first part of the word, okay,
let's say this letter sigma is the Nth letter from the dollar sign. Okay. What we're going
to check is whether the Nth letter from the end of the word is also sigma.
Okay. If we do that for all of the letters here, then we will in fact check that the suffix of
the input word is that W that we had before the dollar and we've done our job.
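As a sanity check on what each branch is responsible for, here is a tiny Python sketch of
the criterion itself rather than of the automaton; the function name is invented:

    def branch_criteria_hold(word):
        # assumes words of the shape  w $ u w  over {a, b}, with a single '$'
        if word.count('$') != 1:
            return False
        w, rest = word.split('$')
        if len(rest) < len(w):
            return False
        # one check per letter of w: the n-th letter before '$' must equal the n-th
        # letter from the end of the whole word; together these checks say that the
        # suffix of length |w| equals w
        return all(w[-n] == rest[-n] for n in range(1, len(w) + 1))

For the full reduplication language w$w, one extra branch would additionally check that
the part after the dollar has exactly the length of w, which is the easy context-free
counting part mentioned above.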
So just to scare you a bit, this is the construction. We're not going to talk about it.
Because it's a bit too technical for a slide. But just so that you can see, this is actually the
recursive transition.
We have the status Q0, and the symbol I have in the stack is that bottom symbol. And
now I have a conjunction where the right conjunct brings me to the exact same state. I'm
again in Q0, and I have that same symbol in the stack.
So this is the recursion. And now let's just look at an example to see how this works. So
this is our input word. Our automaton is in the initial state. And now we come to read the
first letter, A in this case. And let's see what happens.
What we can see is that two branches have opened. The left branch is going to check that
criterion that we talked about in the previous slide in regards to this letter A. So it's going
to count how many characters until it sees the dollar sign. And then it's going to look for
an A in the second part of the word and make sure that this number of letters after it is the
same.
Okay. And it remembers that the letter that it's looking for is A with this little subscript,
and we'll talk about the superscript 1 a little bit later.
And then the right branch, that's the recursive one, so you'll see it looks exactly like we
had the initial state, and it's going to continue opening up a branch like that to check for
each and every letter in the first part of the input.
So now when we read the letter B, first of all, this guy needs to count symbols. How
does a pushdown automaton count symbols? It pushes symbols into the stack. So this
means that it counted 1. Here we have the new branch for B. And here again the branch
perpetuating the recursion.
And with the third B, the same thing happens. Now, the first branch has counted two
symbols, the middle one, one symbol, a new branch for the third B. And, again, the
recursive branch.
Okay. So this would continue for as many characters as we had in this first part of the
input word.
Now this is the part where we use the delimiter. And this is why I said we have to be
able to recognize it. It can be a bunch of characters, but we have to be able to recognize
it.
Why? Once we read the dollar sign, first of all, the branch that was perpetuating the
recursion knows that it's done. It doesn't need to open up any more new branches. It's
finished its job, so it emptied its stack and transitioned to this state QE which is kind of
going to be an end state that all the branches ultimately are going to reach. So from now
on nothing is going to happen to that rightmost branch.
Now, the rest of the branches know that they have to kind of switch gears, right? Up
until now they were counting how many characters until the dollar sign. Now they need
to look for that letter that they remember and make sure that it's the right number of
characters from the end.
So that's why now instead of superscript 1 they have superscript 2. That means they're in
phase 2 of their computation and they're looking for their respective letters.
Okay. So this is that extra letter that we said we might have after the dollar. And so
thankfully all of our branches correctly guessed that they don't need to do anything when
they read it.
But once we read the letter A, this branch correctly guessed that that is the A that it was
looking for. So it transitions to QE. And now it's going to start counting backwards.
Okay. And the other two branches know they're not looking for an A, so they just ignore
it and we continue on.
>>: [inaudible] this final state QE, so it's just got a loop that says sort of ignore --
>> Tamar Aizikowitz: Actually, it doesn't even need a loop. Because, see, this branch is
empty. So it can't make any more transitions. What it's -- it's essentially stuck. And
what it's doing is it's waiting.
If at some point this one is empty as well, maybe they'll be able to collapse and we'll get
rid of the branch. But we can't make any more transitions because it has no more
symbols in the stack. That's just how a pushdown automaton works.
>>: Right. I'm just thinking about the criteria, and you said that they all have to absorb
the same number of symbols. So this one has to be like they have to -- the thing that we
discussed in the beginning where --
>> Tamar Aizikowitz: I see what you're saying. Yeah. Okay. That's a good point. It
would have to kind of continue ignoring -- right. Right. It would have to loop until the
end of the input. That's right.
Okay. So let's just go back one. So we were at this point, like this. And then we read the
B, so this one has counted backwards by one, so it empties a symbol from the stack. And
this one has now transitioned to QE as well. And when we read the third B and those last
two symbols are emptied, they're all in QE.
And at this point what we need to do is just -- they empty their bottom symbols. And
now notice that they're all in the same state, their stacks are empty, they finished reading
the input and now we can just have all of these collapsing moves which bring it to the
accepting state.
Okay. So you can imagine that the longer this part was the more of this big tree we
would have opened. But then assuming everyone made the correct choices along the
way, we would be able to collapse it back to this -- yes.
>>: Question. There is this thing called alternating finite automata, and then there's this
alternating pushdown automata also. Can they recognize this thing?
>> Tamar Aizikowitz: That's a good question also. Alternating pushdown automata is
actually pretty -- it's not a new model. It's been around I think since 1981. Don't hold me
to it. And what happens there is they also have these types of conjunctive transitions but
instead of it just being a local split, the entire automaton is duplicated.
So now I have two completely separate automatons. And they also have the duplicate of
what they already had in their stack. The bottom part of their stack is duplicated twice.
So they can do different things on it.
Now, actually that model can accept the exponential time languages, so it's a stronger --
>>: Stronger than this.
>> Tamar Aizikowitz: Stronger than this. This is kind of a very delicate extension,
because it's just localized, these splits, but then they have to agree and go back. Also
with the grammars. You open up the conjunction, but you have to get rid of it as opposed
to the full alternating model which gives you two completely separate machines all the
way through the end of the computation. And that's why it's stronger.
>>: Is the reason you don't want to go there is because probably all the questions that
you might be interested in asking about those guys are too hard?
>> Tamar Aizikowitz: Right. It would -- you get a model which is too strong, so it's not
good for practical usage.
Okay. So now let's -- I'll try to sort of convince you that there are -- they accept the same
languages, although I think it is to a degree pretty intuitive. The technical proof itself is
quite involved, but if you believe that context-free grammars and pushdown automata
accept the same class of languages, then you can see that we have made sort of the same
type of extension on both models and it would be pretty believable that maybe we got the
same result.
But let's take a quick look at it anyway. So our first theorem is exactly that, that a
language is generated by a conjunctive grammar if and only if it is accepted by an
SAPDA.
The equivalence is very similar to the classical equivalence. And in fact the proofs are
extended versions of the classical proofs. Okay. So it's not a new proof technique, it's
just an extension of those classical proofs so that they can deal with the more
sophisticated models.
So let's just quickly look at the first direction. Given a grammar, we want to construct,
and in this case we're going to construct a single-state automaton, so it's not going to use
states in order to decide what to do.
Again, this is an extension of the classical construction. We're going to simulate the
derivation of the grammar in the stack of the automaton where if the top symbol of the
stack is a variable or a non-terminal, we replace it with the right-hand side of one of its
rules.
And if we have a terminal symbol at the top of the stack, we empty it while reading the
same symbol from the input. And in this way we have a correlation between the stack
contents and the derivation of the grammar. And through that correlation we can show
that they derive the same language.
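In summary form -- my condensation of the two simulation moves, not a quote from the
slides:

    top of the branch is a variable A:   replace A by (alpha1 & ... & alphak) for some rule
                                         A -> alpha1 & ... & alphak (more than one conjunct splits the branch)
    top of the branch is a terminal a:   pop a while reading a from the input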
Let's see a quick example of what that looks like. Okay. So let's say we have some
derivation here on the left. And we're looking at some point in the middle of the
derivation where we have a prefix of only terminal symbols. That's W. Then we have
some variable, the leftmost variable, A, and then whatever is left of that sentential form.
And in terms of the automaton for the correlation to hold, we assume that it's already read
those characters from W out of the input. And the contents of the stack is what remains
from what we've seen in the derivation.
So now if the top symbol in my stack is a variable, we need to switch it with the
right-hand side of the rule. So we look at what happens in the derivation.
Let's say that here A is replaced with a conjunction where the leftmost conjunct is UB
beta. So in the automaton we remove A, open up the conjunction, and then we put the
right-hand side of the rule into the stack.
Okay. Now we have a terminal symbol at the top. So we empty these symbols while we
read the same symbols from the input. So we empty U while reading U.
Okay. And we continue in the same way. Now we have a variable, so we need to replace
it with a rule. We go with the same rule that we had in the grammar, so again it's
replaced by that right-hand side. If we have terminal symbols, then we empty them.
Okay. So it's just running the derivation of the grammar in the stack of the automaton.
Okay. Let's actually go to the second direction.
So in the other direction given an automaton we want to construct a grammar. Again,
we're using an extension of the classical proof. Now, in the classical case this is the easy
direction. It's very, very simple to do this side, but here it's a bit more complex and
therefore we're not going to get into it. Okay. So you're just going to have to take my
word for it.
Now, an interesting thing to notice: the "if" direction, the proof that you just took my word
for, translates a general SAPDA into a conjunctive grammar. Whereas the proof that we kind
of looked at before, the "only if" direction, translates a conjunctive grammar into a single-state
automaton.
So that means that single-state and multistate SAPDA are equivalent, which, by the way,
is a characteristic of classical pushdown automata as well. So, again, a lot of what we're
used to seeing in the classical case translates more or less as is to these extended models.
Okay. So let's talk a little bit about the subfamilies. Linear conjunctive grammars were
introduced in that same original paper from 2001. They're an interesting subclass, first of
all, because they have especially efficient parsing. So the parsing algorithms for them are
better.
Now, Okhotin at one point showed that they are equivalent to a type of automata called
trellis automata. They kind of have like a pyramid of states where in each level two
states make some sort of joint decision as to what the state in the level above them is
going to be, and ultimately the state at the top of the pyramid is
going to decide whether it accepts or doesn't accept the word.
So it's kind of an exotic type of automata. He showed that they are equivalent just to the
subclass, but that equivalence isn't something that can be extended to the general class of
grammars.
>>: [inaudible]
>> Tamar Aizikowitz: No. No, no. That's not one of his models.
>>: [inaudible]
>> Tamar Aizikowitz: No -[multiple people speaking at once]
>> Tamar Aizikowitz: I think it's also trellis automata and also sometimes it's called
systolic automata. Don't ask me who defined it. There's a couple of names there. But I
don't -- what?
>>: It's an old model.
>> Tamar Aizikowitz: It's not as old as the alternating pushdown automata, which I do
know who invented. But it's older than these models. So he did take it and show an
equivalence to an existing automaton model. And he got some interesting results through
that. It's okay.
And when we say that a grammar is linear, that means that in all of the derivation rules,
there is at most one variable. Okay. So let's say I have a rule for some variable A, it can
derive a string or a conjunction of strings of whatever characters. But each string like
that can only contain at most one variable. Okay. That's the same definition for a
classical linear grammar.
>>: [inaudible]
>> Tamar Aizikowitz: No. So each conjunct can only contain at most one variable. So you
can't have A derives BC, two variables.
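As a concrete point of reference -- and assuming the reconstruction of the
multiple-agreement grammar sketched earlier -- that grammar is already linear in this
sense:

    S -> C & A      C -> Cc | D      D -> aDb | epsilon      A -> aA | E      E -> bEc | epsilon

Every conjunct on every right-hand side contains at most one variable.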
Okay. So what we're going to do is define a subfamily of our automata and show they're
equivalent to these linear grammars. Okay. So that will have -- give us an equivalence
for the general case and also an equivalence for the subfamilies.
Now, what's our motivation for this. Linear conjunctive grammars as a subfamily of
conjunctive grammars are defined in exactly the same way as linear grammars are a
subfamily of context-free grammar. So it's an analogous definition.
Now, it's a well-known classical result, not very new, from 1966, Ginsburg and Spanier,
that linear grammars are equivalent to one-turn pushdown automata. So that's why we
were thinking, okay, if this is the classical result, let's see if we can define a one-turn
version of our automata and show that that would maintain the same type of equivalence.
Because, again, we want to show that these new models kind of interact in the same way.
So what is a one-turn automaton. So a turn is a computation step where the stack height
changes from increasing to decreasing. Okay. So if I was going up and now I empty
something from the stack, then that would be a turn.
A one-turn automaton is one where all accepting computations have just one turn.
So the stack height goes up, up, up, up, up to a certain point, and then from that point on
it only decreases.
Okay. So along the same lines, we want to define a one-turn SAPDA. And essentially
what it is is we say, okay, the tree can grow in whatever direction it's going, but at some
point it's going to collapse back. So the same criteria but on a branch level. Okay.
By the way, note that the fact that we're requiring at least one turn isn't limiting. Because
we are looking at acceptance by empty stack. We start out with one symbol in the stack.
We need to end up with no symbols in the stack. So there has to be at least one turn in
every accepting computation. What we're demanding is that there be at most one turn.
Let's actually just look at the animation I have here.
So the way we're going to define what a turn is, is for every branch we're going to expect
to see three phases. Okay. The first phase is going to be increasing transitions applied to
that branch. The second phase, maybe we have a conjunctive transition, opens up some
branches, does whatever it does in that part and in the end collapses it back. And then the
third phase, decreasing transitions on that branch.
Okay. So first we go up, we open branches if we do that, collapse them back and then
decrease.
Now, notice if this is a classical pushdown automaton and there's only one branch, there's
no second phase because we can't open up those conjunctions, and then all we have left is
phase one and phase three. So increasing and then decreasing, which is exactly the
classical definition. So we've just taken a straightforward extension of that.
And happily we did get the equivalence that we wanted, so just like in the classical case
we have that a language is generated by a linear conjunctive grammar if and only if it's
accepted by a one-turn SAPDA, and, again, the fact that this mirrors the same type of
equivalence as we have in the classical case strengthens the claim of SAPDA as a natural
counterpart.
And, by the way, we also get that they're equivalent to trellis automata. Okay.
Yes.
>>: Where do these one-turn branching automata sit with respect to ordinary context-free
grammars?
>> Tamar Aizikowitz: We'll get to that too.
>>: Oh, sorry.
>> Tamar Aizikowitz: Actually, in the next slide. That's a very good question.
Okay. So let's talk a bit about the generative power of these grammars. So, first of all,
conjunctive grammars, as we saw, they can derive any finite intersection of context-free
languages and also some additional languages, like the reduplication with the center
marker, so it's larger than that. In the same way, linear conjunctive grammars can
derive any finite intersection of linear languages and also some additional languages
[inaudible] reduplication with a center marker is actually a linear conjunctive language,
so it's the same. The same example works in both cases.
But, and I think this was your question, if I understood it correctly, there are some
context-free languages, not even intersections, just plain context-free languages,
that cannot be expressed using a linear conjunctive grammar. So these two classes are
kind of -- it's not that one is stronger than the other, but the general conjunctive grammars
have all the -- yes.
>>: [inaudible] for parsing, do you have a slide on that?
>> Tamar Aizikowitz: Yes. You guys are really good. But it's not the next one, it's the
one after.
Okay. So closure properties first, which I owe you from the beginning of the lecture,
right? So union, concatenation, intersection, Kleene star: very, very easy to do that just
using the grammars. You can just -- as you would in the classical case almost, and we
saw an example of intersection, so that's pretty easy.
Not closed under homomorphism, but, interestingly enough, closed under inverse
homomorphism, and we're actually going to -- I'm not going to prove it, don't worry, but
we're going to touch on the proof of that one in the next slide.
>>: Can you tell me what closure and homomorphism means? I'm not familiar with that
term.
>> Tamar Aizikowitz: Sure. A homomorphism is just a function which translates a letter
in the alphabet of the original language into a word over possibly a different alphabet. So
it's kind of like a transformation that you apply to the language.
For example, if you had words of AB and you translate A to C, then you would have
words over CB.
Okay. So it's just a kind of a transformation, which, interestingly enough, it's not closed
under, because --
>>: Kind of surprising.
>> Tamar Aizikowitz: Yeah. But, again, it's just because -- for example, okay, this isn't
a completely rigorous claim, but intuition. Let's look at the W$W. Okay. If we had a
homomorphism that took that dollar sign, you can also -- since I'm copying it to a word,
translating it to a word, that could be to the empty word. Okay. And that way I could
just get rid of that dollar sign, and we said, hey, that's an open question. Okay. So that
would already be kind of problematic.
So here it's actually not an open question. There is a proof that that is not closed under
homomorphism. But because when I translate things sometimes I lose the boundaries
that I use to do the intersections, and then I might get a language which is too strong, that
I can't have gotten in a different way. Okay?
Now, linear conjunctive languages are --
>>: Sorry. Can I ask one more question?
>> Tamar Aizikowitz: Yeah, sure.
>>: For example, finite state languages, they're probably closed under homomorphism,
right?
>> Tamar Aizikowitz: Yes. All of these are -- regular languages are closed under --
>>: Everything.
>> Tamar Aizikowitz: -- everything. Everything's standard. If you invent like a really
strange function, then maybe not, but --
>>: [inaudible] context-free languages, are they closed under homomorphism?
>> Tamar Aizikowitz: Yes. Context-free languages are closed under all of these except
intersection and complement.
>>: Yes. That's right.
>> Tamar Aizikowitz: Okay. So the linear conjunctive languages are closed under
complement. So this is actually kind of an interesting
result. But it's an open question whether the general grammar class is or isn't closed
under complement.
By the way, if we answer this question, then we also know the reduplication without a
center marker question, because the complement of WW is context-free.
So if they're closed under complement, then we also know that they can derive WW. If
they're not closed under complement, we're still in the dark.
Okay. Now, the inverse homomorphism. This is just kind of me trying to convince you
that it's nice to have an automaton model, okay?
So Okhotin originally proved this closure property using his grammar model. He built
quite an elaborate proof to show this. The proof itself is 13 pages long as part of a
25-page technical report that he wrote on this subject. And it requires a separate proof
for the linear subfamily. So actually that is proven through their equivalence to the trellis
automata that we talked about before. So it's kind of involved.
Whereas, using our automaton model, classically inverse homomorphism is proven using
pushdown automata, because it's easier. And so that same easiness carries through to the
extended models as well. It's just a very intuitive extension of the classical proof. It's not
even an extension; it's just take the exact same construction, apply it to the stronger
model, you get the proof.
One page is even an overestimation. And the same proof works as is for whatever
subfamily you're looking at if you apply it to a one-turn automaton, the resulting
automaton is also one turn, so automatically you get that proof as well.
>>: [inaudible] given for standard [inaudible]?
>> Tamar Aizikowitz: How do you show that? For that we need to explain what an
inverse homomorphism is. Because I didn't -- yeah, you forgot to ask that one.
>>: [inaudible] ask that one.
>> Tamar Aizikowitz: Okay. Good thing I teach automata and formal languages, so I
know all of these very well.
Inverse homomorphism is kind of sort of backwards from what I said before, but it's not
completely backwards, because the translation of a letter to a word is not necessarily
one-one. Okay. So we can't just say it's -- you can't just say it's the inverse function.
Okay.
So what you do is it looks at a word and the inverse homomorphism of that word is all of
the words that could have been translated to that word. So it might be more than one.
Okay. Let's say the letter A is translated to the letter C --
>>: [inaudible]
>> Tamar Aizikowitz: Right. I have to define a homomorphism and then the inverse is
all of the words that would have been translated to that word. So, again, let's just not
look at words; let's look at letters. So let's just look at letters for a second. Let's say that
my homomorphism says that both the letter A and the letter B are translated to the letter
C. So the inverse homomorphism of the letter C would be the language {A, B}, A
comma B. Okay?
So it's kind of a more sophisticated or complicated closure property to explain, but
once -- if you're familiar with it from the classical case, then, again, it's the exact same
thing, it works in the exact same thing, the construction is very similar. The automaton
essentially just takes the input, translates it under the homomorphism and checks whether
the original automaton would have accepted it or not. It's not that difficult. But I just
don't want to get into it at this point because we won't have time to finish the lecture.
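To make both definitions concrete, here is a minimal Python sketch; the homomorphism h
and the language L below are made up purely for illustration:

    # a hypothetical homomorphism: each letter maps to a (possibly empty) word
    h = {'a': 'cd', 'b': 'c', '$': ''}

    def apply_h(word):
        # the image of a word is the concatenation of the images of its letters
        return ''.join(h[ch] for ch in word)

    def in_inverse_image(word, accepts):
        # w is in h^{-1}(L) exactly when h(w) is in L; this mirrors the closure
        # construction: translate the input under h, then check whether the
        # original automaton would have accepted the translated word
        return accepts(apply_h(word))

    # usage with a made-up language L = { c^n d^n }
    in_L = lambda s: len(s) % 2 == 0 and s == 'c' * (len(s) // 2) + 'd' * (len(s) // 2)
    print(in_inverse_image('a', in_L))    # h('a') = 'cd', which is in L  -> True
    print(in_inverse_image('ab', in_L))   # h('ab') = 'cdc', not in L     -> False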
Okay. So this is an example, though, that shows how having the automaton model apart
from just theoretically it's nice to have an automaton to go along with the grammar, it
also gives some added intuition and in certain cases it can simplify proofs.
So okay. Decidability problems. Membership first, which is maybe the most important one.
So for linear conjunctive grammars, the best algorithm has quadratic time and linear space.
And for general grammars, it's cubic time and quadratic space, which is the same as in
the context-free case. So again we have -- we've gained some power, but we haven't had
to pay in efficiency, at least not in terms of the order of the efficiency.
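For the curious, here is a hedged sketch of that cubic-time tabular recognition, assuming
the grammar has been brought into a binary normal form where every rule is either A -> a
or A -> B1C1 & ... & BmCm; the rule encoding is invented for the sketch:

    def recognize(word, terminal_rules, conjunctive_rules, start='S'):
        # terminal_rules:    iterable of (A, a) pairs encoding rules A -> a
        # conjunctive_rules: iterable of (A, [(B1, C1), ..., (Bm, Cm)]) pairs encoding
        #                    rules A -> B1 C1 & ... & Bm Cm
        n = len(word)
        # T[i][j] holds the nonterminals that derive word[i:j]
        T = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i in range(n):
            for A, a in terminal_rules:
                if a == word[i]:
                    T[i][i + 1].add(A)
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                for A, conjuncts in conjunctive_rules:
                    # every conjunct B C must be witnessed by some split point;
                    # different conjuncts may use different split points
                    if all(any(B in T[i][k] and C in T[k][j] for k in range(i + 1, j))
                           for B, C in conjuncts):
                        T[i][j].add(A)
        return start in T[0][n]

The nested all/any over conjuncts and split points is what keeps the cost at cubic time
in the input length, the same order as classical CKY.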
The sort of bad news is that emptiness, finiteness, equivalence, inclusion, regularity, all
of these properties are undecidable. I know that for people in formal verification, usually
this is the one that makes them the saddest because lots of techniques do some sort of
whatever, and then they check if the automaton is empty.
So we can't check if it's empty. But, again, the membership is decidable efficiently, so
that would be the major redeeming feature of these models.
Okay. I want to show you a really quick example of what could be done with these
grammars. This is by no means an actual application; it's just kind of a proof of concept
or food for the thought or something like that. Don't take it too seriously.
We have a very silly programming language called PrintVars. It has three parts. Let's
just look at an example. The first one we have -- we define a set of variables. In the
second part we assign them values in whatever order. And in the third part we just write
them back out and then the output just prints to the screen the values of those variables.
So extremely simple programming language. Not very useful, but okay.
Now, a PrintVars program is well formed, at least let's show parts of the specification.
First of all, it has the correct structure according to what we saw in the previous slide.
And also we want to say that every variable that we used we actually defined. Every
variable that we defined we ended up using. We assigned a value to every variable that
we defined and we didn't assign a value to a variable that doesn't exist.
Okay. So sort of making sure that everything is working properly. Okay.
So item 1 we can easily define a context-free grammar that spits out the correctly
structured programs. That's the classical context-free language. But if we look at items 2
through 5, that actually amounts to something which is kind of like reduplication with the
center marker. Because, for example, we have that list of variables in the beginning
that we're defining, and then we have to sort of have the same list exactly, maybe a
permutation with values assigned.
So it's sort of similar to that copying of -- or in this case, a permutation, and it's not
context-free. Okay. We can't build a grammar that will actually encompass all of these
five --
>>: [inaudible] kind of surprising because at least, No. 2, all used variables are defined?
>> Tamar Aizikowitz: You might be able to do one of them, but not --
>>: I was just saying that, you know, this is the standard thing that --
>> Tamar Aizikowitz: No, I think all of them, though.
>>: But doing dataflow analysis, right?
>> Tamar Aizikowitz: No. You would do that, but you wouldn't --
>>: [inaudible] analysis can be reduced to see [inaudible] am I right?
>>: I can't guarantee that you're right.
>>: Oh, okay.
>> Tamar Aizikowitz: I think in --
>>: Okay. I don't know. So keep going. Yeah.
>> Tamar Aizikowitz: I don't want to --
[multiple people speaking at once]
>> Tamar Aizikowitz: Usually you would have something -- an additional process that
would check that. It wouldn't be in the lexical analysis of the program. You would have
another level on top that would be checking that criteria. Because it's just too
complicated for a context-free grammar.
Now, I'm --
>>: [inaudible] the same -- that it's the same identifier [inaudible] twice come across the
same variable [inaudible].
>> Tamar Aizikowitz: So in real life we always say that programming languages are
context-free and they are in terms of structure, but when we start talking about the
correctness in more complicated terms, that's not a context-free language anymore.
And so I'm not necessarily saying that the right way to do this is with the grammar, but
it's interesting to think that we have a grammar formalism that can encompass all of these
criteria.
>>: [inaudible] Pascal, is it --
>> Tamar Aizikowitz: I think a lot of the -- I tried to think of some of the different types
of correctness checks that we would do in a programming language. Many of them seem
to be something that we could do with these grammars, but I definitely don't want to
vouch for Pascal or Java or anything like that.
But, again, it's an interesting direction maybe for people who that would be their
expertise, to look at, okay, do these grammars give us any added value.
Okay. So let's just really quickly look at what this would look like.
From my start variable, I'm actually deriving a conjunction of five variables, one for each
of the items on the previous slide. So each one of those variables is going to be in charge
of deriving only the strings that meet that specific condition.
And then the conjunction of all of those will give us all the well-formed programs.
So the structure is just regular context-free rules. We won't get into that. That's what
we're used to seeing. And let's look at, for example, define_used. So we want to make
sure that all the variables that we had in the definition section are also going to appear in
the print section, we're going to use them. Okay?
So define_used, first make sure that we have this Vars token for the beginning of that
section, and then we have another variable which is actually going to do that check,
check define_used.
So for that we're going to need a variable that derives all strings, X. Okay. We're going
to use that. And now let's look at this first option.
So assume for a moment that the name of the first variable is A. Okay. So in that case
what we would want to see is that A is the first variable, then we have whatever other
characters. This would be the other variables that we were defining. Then we have a
valid vals section, the middle section where we put the values.
Then, again, any characters, then that A showing up a second time, and then any characters. So this
means that the A that we saw first in the definition section will appear somewhere in the
usage section.
Okay. And then we also want to make sure that this holds and also that all the variables
appearing after that first A will continue to have the same type of criteria met for them.
Okay. Now, it's not necessarily that all of them are called A, so we also have to have one
of these rules for B and for C, and here I'm making a simplistic assumption that variable
names are one letter. Again, I don't want to make something too complicated at this
point.
But you do see that we can check this kind of more sophisticated requirement at the
grammar level, at the parsing level, okay, and not as a second layer of checking over the
structure.
Okay. So one -- yeah.
>>: [inaudible]
>> Tamar Aizikowitz: Excuse me?
>>: So this is an example of a nonlinear conjunctive?
>> Tamar Aizikowitz: This is a -- yes, a nonlinear conjunctive. I think it might be
expressible as -- I'm just saying W -- because this is a little bit similar to the W$W
language, that one is linear, it could be that this could somehow be reworked to be linear,
but I'm not sure.
And I'm not sure if the structure that I defined before is even a linear context-free
language or not. So maybe yes, but I'm not sure. It would definitely make it a little bit
more complicated to write out.
>>: [inaudible] they have like these extended regular expressions which are much richer
than actual [inaudible] expressions. Do you know if conjunctive grammars, are they
more general than these extended regular expressions?
>> Tamar Aizikowitz: Can you give me an example of what type of extension? Because
regular expressions generate regular languages, so that would be even weaker than
context-free languages.
>>: Yes. But so [inaudible] extended regular expressions you can usually have like
constraints that [inaudible] two variables in a sense have the same value. So you have
[inaudible].
>> Tamar Aizikowitz: Okay.
>>: But they're not really regular expressions [inaudible].
>>: I think the answer is no; that there are things in those regular expressions that can't be
expressed here, because I believe that someone has proved that [inaudible] polynomial
time parsing for solving [inaudible].
>> Tamar Aizikowitz: Yeah.
>>: [inaudible]
>>: Okay.
>> Tamar Aizikowitz: Actually, it's -- the question, by the way, of, okay, given a language, is it expressible using a conjunctive grammar, yes or no, is not an easy question, as opposed to context-free languages, where we have a pumping lemma, so we can kind of say, okay, that can't be, we've made sure that that isn't possible.
Here we don't have a pumping lemma, so all we have is that kind of very rough argument: if recognizing it takes exponential time, then, no, we can't do it. But for all the polynomial-time languages, there is no kind of standard methodology for proving that a language is not a conjunctive language. Okay. That's another sort of open question.
>>: [inaudible]
>> Tamar Aizikowitz: Okay. So negation would already be -- negation would be -- again, that holds for the linear subfamily, but we don't know if that's true for all of them.
Yeah, Okhotin actually also invented Boolean grammars that have the same type of conjunction as here, but they also have explicit negation in them. So there's also that type of --
>>: [inaudible]
>> Tamar Aizikowitz: Conjunction and negation. Another extension.
>>: [inaudible]
>> Tamar Aizikowitz: Yes. No, it's not -- he doesn't know because if these happen to be
closed under complementation, then they would be the same.
>>: So at least as rich.
>> Tamar Aizikowitz: At least as rich. Exactly.
>>: And also in polynomial time?
>> Tamar Aizikowitz: Yes, I believe so. Yes.
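Just to give a sense of the shape -- this is only the general form of a rule, not a grammar from the talk, and the names here are made up for illustration -- a Boolean grammar rule can mix conjuncts with negated conjuncts, something like

   A -> B & ~C

which is read as: A derives exactly those strings that B derives and C does not.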
Okay. So the last thing I want to talk about, actually, in this lecture is mildly context sensitive languages. They come from the field of computational linguistics. Now,
computational linguistics, as a field, what they want, their Holy Grail is a computational
model which exactly describes natural languages.
Which of course is a bit impossible because we can't agree amongst ourselves what
exactly is a natural language, but that's what they're trying to do.
And originally, actually, what they considered was context-free models. Context-free grammars were suggested as a -- and introduced as a natural language grammar initially. They ended up not being too good for that and being really good for programming languages. But a lot of other natural language formalisms are also essentially context-free.
But they found out that there are lots of noncontext-free structures in natural language. For example, that copying phenomenon that we saw earlier. So it led to an interest, again, in a slightly extended class of languages, which came to be known as mildly context sensitive languages. So something between context-free and context sensitive.
And there was also a paper from '94 which showed that several existing formalisms that
they had been looking at all converged to the same group of languages that met the
criteria for being mildly context sensitive, so that kind of made them all a little bit more
sure that maybe this was the right way to go.
Okay. Now this kind of [inaudible] slightly extended class, that kind of sounds like what
we've been doing, so let's take a look at how conjunctive languages relate to mildly
context sensitive languages.
So mildly context sensitive languages are only loosely characterized. It's not a precise definition, but there are a couple of criteria that they need to meet. So we'll just go criterion by criterion and see what happens.
So the first one is that they contain the context-free languages. Okay. We do that.
Right? We contain the context-free languages.
The second, they contain multiple-agreement, cross-agreement, and reduplication. Also
good.
>>: Sorry [inaudible].
>> Tamar Aizikowitz: Multiple-agreement is the original language that we saw, the a^n b^n c^n. Reduplication was the WW language, and cross-agreement is like this, a^n b^m c^n d^m. So these are just classical examples of noncontext-free languages. Which are noncontext-free, but not too much. It's not like the language of all primes or something like that. Okay?
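Just to make cross-agreement concrete -- this is not necessarily the grammar from the slides, just one way it can be done, with made-up variable names -- a conjunctive grammar can get a^n b^m c^n d^m by intersecting two ordinary context-free pieces:

   S  -> C1 & C2
   C1 -> A1 D     A1 -> a A1 c | B     B -> b B | eps     D -> d D | eps
   C2 -> A U      A  -> a A | eps      U -> b U d | C     C -> c C | eps

C1 generates a^n b^* c^n d^* (the a's matched with the c's), C2 generates a^* b^m c^* d^m (the b's matched with the d's), and the conjunction forces a word to satisfy both, which is exactly a^n b^m c^n d^m.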
They're polynomially parsable. Check. They're semilinear. This one we're not -- we don't hold to that one. Because we have a conjunctive grammar for --
>>: [inaudible]
>> Tamar Aizikowitz: I'll explain. We have a conjunctive grammar for this language, which has exponential growth. That means that it's not semilinear. What does that mean? If I look at all of the words in the language and I order them according to their length, then in a semilinear language the lengths essentially fall into a finite union of arithmetic progressions -- more or less, they grow at a constant, linear pace.
Okay. In a language with exponential growth, the gaps between consecutive lengths can grow exponentially.
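As a concrete illustration -- these are just standard examples, not necessarily the language on the slide:

   { a^n b^n : n >= 0 }       lengths 0, 2, 4, 6, ...       gaps stay constant     -> semilinear
   { a^(2^n) : n >= 0 }       lengths 1, 2, 4, 8, 16, ...   gaps keep doubling     -> not semilinear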
Okay. Now, it makes sense that natural language wouldn't be one that had exponential
growth, because it sounds reasonable that that wouldn't be something that we would be
looking for, and here we can derive these types of exponential languages with a
conjunctive grammar.
So on the one hand, that means that maybe we're not an exact characterization of natural
languages. But if we're looking at this from an applicative point of view, we still have
the applicative potential, because we do hit on all of these things that they say that they
need, and especially the polynomial parsing. But maybe we could do some extra things
that we don't need.
So if you want to use these types of models for an actual natural language application, it's
possible. But they wouldn't be a good candidate for a formal description of natural
languages perhaps.
Okay. Let's wrap up.
>>: It seems like from a usefulness perspective [inaudible].
>> Tamar Aizikowitz: Right. Okay. This is something --
>>: [inaudible] characterize more languages, that's good.
>> Tamar Aizikowitz: It took me a while to understand this as well. Because I'm not -- I don't consider myself a computational linguist, but we do tend to be at the same conferences and stuff. We travel in the same circles. So --
>>: [inaudible] anybody? Because the problem is completely [inaudible] because if you take a natural language, a set of phrases -- not a set, it's just not a set, because when -- there always will be phrases where we disagree [inaudible] English or not English.
>> Tamar Aizikowitz: Right.
>>: So there is no formalism in the world [inaudible].
>> Tamar Aizikowitz: So it's kind of -- it's not -- it's a weird combination between sort of
ultra theoretical mathematical computer science formal language theory on the one side,
and then sort of philosophy, psychology, cognition on the other side. Like there's lots of
people who do work on having grammars that evolve and kind of modify themselves in
the same way that a child acquires speech.
So lots of different people look at it from different directions. They don't even all agree
to the claim that natural languages are not context-free. You can literally look at paper
titles. One is natural languages are not context-free, and then the next one is natural
languages are context-free, and then the next one is, no, they're not, and then the next
one, yes, they are.
But what they're trying to do is not do -- not build something that has to do with natural
languages but rather to describe what that is, which is difficult. And I don't think any of
them expect to succeed, but it's the quest that drives them.
So again, in terms -- if you were a computational linguist, you would say, hey, that's not an exact characterization. But this is what I'm saying: it's still useful if you want to actually build something using these grammars.
Okay. So let's -- just a quick wrap-up. I hope I've convinced you that conjunctive languages are interesting. The two main characteristics that we've talked about are that they are a strong, rich class of languages that can describe structures we can't describe using context-free languages, and yet they are polynomially parsable, so we can consider them for actual practical applications.
They also have, in my opinion, a third characteristic which is very, very important: their models of computation are very intuitive relative to what else is out there. They're easy to understand because they highly resemble the classical case.
So if you're not interested in the theory, you don't have to go into all of the ins and outs of
the proofs, but if you're familiar with context-free languages, context-free grammars and
pushdown automata, you look at these models and you say, you know what, that's quite
familiar.
Okay. And that's not true of all of the -- there's a bunch of other slightly extended models
out there. I find a couple of new ones every day. But most of them are pretty exotic. So
it takes a while to figure out what they do.
And I think the fact that these are very intuitive and familiar would make them appealing
for possibly a wider audience of people and maybe would make them easier to kind of
plug into some sort of application.
In terms of specifically our contribution with the automaton model: again, it is the first automaton model that's been presented for these grammars, and it's a natural extension. And, again, it was really important for us to make sure that it did work in the same way that the classical models did.
And, as we saw, it gives some additional intuition. It can help simplify proofs, like in the
inverse homomorphism example. So it kind of helps round out the theoretical basis for
these languages.
In terms of future directions that we're looking at right now, there are two main
directions. The first is broadening the theory. The next thing that we've already started
looking at is a deterministic version of these automata. And, again, that would probably
tie in with LR conjunctive grammars, possibly taking this in a direction which would be
better for compilation theory. So that's something that we're looking at already.
And also we're considering possible applications for these languages, maybe to find a
real-life problem that could be solved or a process that could be made better using these
languages.
We have started looking at some promising directions in formal verification, but it's very,
very early stages. So I don't really have anything to say on that yet.
And of course I'm always happy to hear about other people's ideas. So if any of you think
that these types of languages could tie into something that you're doing, then that's
something that I'm always happy to hear about.
Thank you very much for bearing with me to the end. These are the references that I've
talked about. Also if you need any additional references or anyone wants to read up on
any of these things, you can contact me and I would be happy to give those to you.
Because I'm sure you can't read them from where you're sitting.
>>: Thank you very much.
[applause]
>> Yuri Gurevich: Any more questions?
>>: So it is true that in, like, most-used modern programming languages there are a bunch of features that are not context-free, that are [inaudible] people [inaudible] various different hacks, or to parse them it would be nice to be able to clean up a little --
>> Tamar Aizikowitz: I think almost all -- I think almost all actual applications which theoretically are taught as being based on context-free languages, when it comes down to it, they are dirty around the edges.
>>: So I can give you -- if you're interested, I can give you some references, but just to --
>> Tamar Aizikowitz: I'd be happy to hear it.
>>: I don't know -- I don't know if these things are going to fall into your class or not, but --
>> Yuri Gurevich: [inaudible] thanks again.
>> Tamar Aizikowitz: Thank you.