>> Eric Horvitz: Good morning, I am Eric Horvitz. Welcome to tonight's AAAI
northwest event. Before beginning, just a bit of news from the AAAI, speaking as
the president. I wanted to announce that we've met one of my goals during my
presidency, which was taking a serious look at the issue of access to technical
publications, and the challenges and opportunities that come with opening up
access to technical journals and the digital library more generally.
I have to say that after a year of deliberations the AAAI moved boldly on this. I'm
proud to say that as of this week all AAAI conference proceedings and technical
reports are now available freely to the international research community. A lot of
issues were balanced in doing that. We hope other scholarly and professional
societies will follow suit. It's the trend, and we think it's actually important even
for the success of AAAI over the long term. The AAAI has been holding these
kinds of evening gatherings around the country over the past year. At the last
event, in the San Francisco Bay area, we brought together Sebastian Thrun and
Red Whittaker for the first time since they dueled for the DARPA Grand
Challenge prize, and each provided their perspectives on autonomous driving.
Tonight we'll focus on what I think are even more important and exciting issues
than cars piloted by machine intelligence. The concepts we'll be addressing
tonight cut to the core of all computer science: they provide us with the
foundations of computability, where the whole notion of an algorithm is
introduced and linked to more abstract mathematics. We're honored tonight to
have Yuri Gurevich here to tell us about the history of efforts leading up to the
famous work on the Entscheidungsproblem, and how Church and Turing pursued
key concepts in the ability of mechanical procedures to generate numbers. The
work led to what we now call the Turing machine, a construct that provides us
with insights about the nature and limitations of computing. And beyond history,
and probably the reason we're here tonight, we'll hear about some exciting new
analysis of limitations in the classical work: new results on proofs of computability
that Yuri has developed with his co-author Nachum Dershowitz.
I'm very honored to talk about Yuri. Yuri is a principal researcher at Microsoft
Research, here in this building where his office is housed; it's nice to do one of
these events locally, having been doing them around the country. He's also
Professor Emeritus at the University of Michigan. He's been honored as an
ACM Fellow and a Guggenheim Fellow, among many other awards and
recognitions. Yuri was born and educated in Russia, in the former Soviet Union,
and taught in Israel before coming to the United States.
He was a professor at Michigan before coming to Microsoft Research. He's well
known for his work in finite model theory and for the theory of abstract state
machines, which he originated. He's also made contributions to the very
challenging area of average-case complexity theory. He actually taught me quite
a bit about that several years ago, as someone interested in what we might know
about the real-world performance of algorithms, beyond the worst-case analysis
we usually learn about in more standard computer science courses.
Beyond theory, Yuri has been passionate about computation and computer
science in practice, in the spirit of, I guess, Boltzmann's statement that there's
nothing more practical than a good theory. He's made numerous contributions to
Microsoft, where his work has included the use of abstract state machines to build
model-based testing tools, including some which are in daily use for conformance
testing of Microsoft protocols. He helped design efficient algorithms for
distributed file replication, including methods that are now used in Windows
Server and instant messenger, and recently he has started to investigate new
authorization logics for dealing with distributed knowledge. Here's Yuri to tell us
more about the Church-Turing thesis.
[applause].
>> Yuri Gurevich: Thank you very much, Eric. And thank you for coming. I have
a story to tell, and I wanted to start it from Antiquity, but then I realized that I
would never finish. So we'll start it from, okay, the agenda. The thesis is the
Church-Turing thesis, so first I'll speak about the prehistory. (Once I was told that
this habit of giving the agenda at the beginning and a summary at the end comes
from the American army.) Then a little critique, and then new developments.
We'll start from prehistory, but not from Antiquity; from the middle of the 19th
century.
George Boole formalized propositional logic. And then Gottlob Frege took up a
much harder challenge: to account for what we now call quantifiers, statements
like 'for all' and 'there exists.' That was quite a difficult task. In the middle of
this he receives a letter from Bertrand Russell, dated June 16th, 1902, and
Russell writes: Herr Professor Dr. Frege, I enjoyed reading the first volume of
yours, and I am waiting for the second volume; and by the way, here is a little
problem. Russell had discovered a contradiction in Frege's system, and Frege
never recovered from it. I have a hidden slide here; the contradiction is only two
lines, but to do justice to the argument would take somewhat longer. So Russell
and Whitehead took on a similar task. The idea, in Frege's work and also in
theirs, was to derive all of mathematics from pure logic. There was a kind of
philosophical ambition there, called logicism: that logic is the primary thing.
And in order to avoid contradictions of the kind Russell found in Frege's work,
they stratified sets. So this set can speak about elements. That set can speak
about sets of elements, but not about sets of sets; if you want to speak about
sets of sets, you go one level up. The very first level, where you only speak
about elements, is called first order. Then there is second order, third order, and
so on. That stratification kept them consistent, as far as they thought, and as far
as we know to this day.
So the result was the book called Principia Mathematica, one of the most
famous books of the 20th century, and safely one of the most boring. One
question is whether Principia is consistent. Another is whether their ambition
was justified: whether indeed every true mathematical statement can be derived
from pure logic. Principia succeeded in the following way. They only derived a
small part of mathematics, in particular arithmetic. For the experts, maybe I
should mention an often quoted fact: they announce the theorem that one plus
one equals two at the end of the first volume and prove it by the end of the
second volume, because they had to develop all this logical apparatus first. But
by the end of the third volume they were quite advanced. Of course they didn't
cover all of mathematics; it is too big. But for those experts who had the patience
to arrive at the end of the third volume, it was clear that you could go on and
derive almost any mathematics, maybe any mathematics. So, building on this,
Hilbert posed what is known by its German name, the Entscheidungsproblem,
the decision problem.
So the problem was this. Take a logic, for example Principia Mathematica, or
first order logic. In general you take an axiomatic system where statements have
a well defined notion of truth or falsity. So every statement is either true or false,
and this is supposed to be well defined.
Take any such logic, and take any statement in this logic. It has to be true or
false. Can you algorithmically decide which? So the input is a logic and a
statement; the output is true or false. Now, these days we often pose the
question as 'is there an algorithm?', but at that time, as far as I could see -- I'm
probably older than many here, but I wasn't there at that time --
as far as I could dig out, the question wasn't whether there is an algorithm; the
question was: find the algorithm. Now, it isn't that Hilbert claimed the algorithm
could be found quickly. Maybe it never would be found; maybe it would take us
centuries to find. But the idea was: find it. That was the problem. And, as is
typical in mathematics, you can consider special cases.
And the most interesting special cases are Principia Mathematica and first order
logic. And then Kurt Gödel appeared, very young at the time. (It took me a long
time to find a picture of Gödel where he's young.) He proves that first order logic
is complete in the following sense: every true statement can be derived from the
axioms. Notice this does not mean that you have a decision procedure. Why?
You start from the axioms; you imagine a certain engine, and it derives truths. So
if the statement is true, then you will eventually derive it. But if it's false, you will
never derive it. So his result solves only, sort of, half of the problem.
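To see why completeness gives only half of a decision procedure, here is a schematic sketch (mine, not from the talk; `is_proof_of` stands for a hypothetical mechanical proof checker, which is the part everyone agrees is decidable):

```python
from itertools import count

def provable(statement, is_proof_of):
    """Semi-decision by enumeration: try every candidate proof in turn.

    If `statement` is derivable, some proof is eventually found and we
    return True. If it is not derivable, the loop runs forever, and we
    never get the answer 'false'. That is the missing half.
    """
    for p in count():  # candidate proofs coded as 0, 1, 2, ...
        if is_proof_of(p, statement):
            return True
```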
This is Gödel's famous completeness result. He was 23 at the time. His second
result, two years later, was even more famous. He proved that, contrary to what
Whitehead and Russell believed, and with them many mathematicians, it is not
the case that every true mathematical statement can be derived from logic.
There are statements which can be formulated in Principia, but from the axioms
of Principia you can derive neither their truth nor their falsity. So Principia is
incomplete. And his result was much stronger than that: it's not just that Principia
is incomplete, but that a tiny portion of Principia suffices. If you just go beyond
first order logic, if you have arithmetic, then any system which accounts for that
will be incomplete.
So it was a devastating result, especially devastating to Hilbert, because he had
developed a certain logical foundation he called finitism, which Gödel proved
could not succeed. But I don't want to dwell on that famous result; it would take
us away. What I wanted to note is that even though Principia was famous for, at
least, the ambition to express most or at least much of mathematics, first order
logic is of that kind as well. Much of mathematics can be formulated in such a
way that it becomes pure logic. And let me say, there is no mystery there.
Suppose you have a certain mathematical theorem in a certain mathematical
theory. Now, this theory has certain assumptions, so you write them as axioms,
and then you say: these axioms imply the statement.
So you have to axiomatize the whole theory, and this implication gives you an
adequate translation into pure logic. Now, there are problems; sometimes there
are infinitely many axioms. But in most cases, at least in sufficiently many cases,
it works.
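To spell the translation out (my notation, not the speaker's slide): if the theory has axioms $A_1, \dots, A_n$ and $\varphi$ is the theorem in question, then $\varphi$ follows from the theory exactly when the single statement of pure logic

$$(A_1 \wedge A_2 \wedge \cdots \wedge A_n) \rightarrow \varphi$$

is logically valid.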
After Gödel's result, Hilbert's decision problem was rethought, and by now it is
mostly known in that form, as Hilbert's decision problem for first order logic. So
given a statement in first order logic, you want to decide: is it true or false?
Hilbert asked: find an algorithm. But by that time people had begun to suspect,
especially after Gödel's result, that it was not impossible to have a negative
result. By that time some people tended to think that maybe the right question
was not to find an algorithm but to ask whether one exists. Okay, but I'm running
ahead of myself. It is that problem that attracted the attention of Church and
Turing, and the famous work that they did was in order to solve that problem.
And if there were an algorithm for it, there wouldn't have been much need for so
many mathematicians, because some of their work could be done mechanically.
Okay. Now I am moving to the story itself. Let me give away the plot, so you'll
see what it's all about. As I said, after Gödel's negative result, people started to
think in terms of negative results. And both Church and Turing, but not only
them, also [inaudible] and some other great mathematicians and logicians of that
period, started to suspect that maybe there is no algorithm of the kind that Hilbert
wanted.
Now, how can you possibly prove that? How can you prove there is no
algorithm? One way is to define what an algorithm is. But there is an easier way:
to find a system of algorithms which is sufficiently universal, in the sense that if
no algorithm in that system solves the problem, then no algorithm at all will solve
the problem. But, you see, there is a gap between general algorithms and the
algorithms in the system. Okay? So now, with this background, I come to the
story.
At that time Church was a professor at Princeton, and Gödel was in Princeton as
well. Let me see; now I'm not so sure it was Princeton at that time. But in any
case, at that time maybe they exchanged letters; I'm not sure.
So in any case, Church was working on lambda notation. What is lambda
notation? He didn't invent it, but he was fond of it, and he pushed it very far. So
let me say what lambda notation is. Suppose you have this: A plus X squared, as
you would write it, you know, in high school. What does A plus X squared mean?
Typically people mean that A is a constant and X is a variable. But that's a
convention. Can you make it precise? The answer is yes. You write lambda X in
front, and lambda X turns the expression into a function, a function of X. So now
the lambda expression is a function, and you can apply this function. Suppose
you apply this function to two: two goes in place of X, and you get A plus four.
That's lambda notation.
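To make the convention concrete, here is a minimal sketch in Python, whose `lambda` keyword is a direct descendant of Church's notation (the names `a` and `f` are mine, not the speaker's):

```python
a = 3                     # A plays the role of a constant here

f = lambda x: a + x ** 2  # lambda x turns the expression into a function of x

print(f(2))               # 2 goes in place of x: a + 4 = 7
```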
So Church wanted to compete with Principia and, probably more important, with
set theory. Whitehead and Russell were philosophers, and then there were
mathematicians who didn't care much about philosophy and probably didn't read
Frege; they developed set theory from mathematical considerations, starting
from Cantor.
By 1930, to say that set theory was gaining acceptance would be an
understatement; it was a well recognized theory. And Church wanted to work
with functions instead of sets. He had two great students, Kleene and Rosser,
and he put them to work on his system. Sometimes it's very dangerous to have
good students: they demolished his system. They proved that it's inconsistent.
So here is a very partial list of logicians who published systems later proven to
be inconsistent. The list looks like an honor roll; Russell himself, as you see, is in
the same company. But Church was not like Frege; he didn't collapse. He was a
very sane man.
He took a safe portion of his system, in fact a provably safe portion, and created
what is now known as the lambda calculus. Many people work on lambda
calculus now. I should say that the story of logic is very much related to modern
computer science. I mentioned the type theory that Whitehead and Russell
invented; of course type theory plays an enormous role in functional languages
and in programming languages generally. Similarly, lambda calculus is quite
popular in some, mostly theoretical, computer science today, and Church is
considered to be the father of lambda calculus.
Now his goal was different: to disprove Hilbert's, let us say, conjecture, the way
Hilbert formulated it; to prove that there is no algorithm for deciding first order
logic. It took him and his students a few years, and they developed elaborate
machinery. It's a very strange calculus. They were able to express a lot of
functions, but there was a stumbling block: they couldn't prove that N minus one
is expressible in lambda calculus. There's a story that Kleene came up with the
idea while in a barber shop, and he ran out because he suddenly understood
how to express it. Eventually Church convinced himself that any computable
function can be expressed in lambda calculus.
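For flavor, here is a rough Python rendering (mine, not from the talk) of the kind of trick involved: numbers encoded as iterated application, with Kleene's pair idea giving N minus one.

```python
# Church numerals: the numeral n applies a function f exactly n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

def to_int(n):                 # decode a numeral, for inspection only
    return n(lambda k: k + 1)(0)

# Kleene's idea: step through pairs (i+1, i); after n steps the second
# component is n - 1. Pairs are themselves encoded as functions.
pair = lambda a: lambda b: lambda sel: sel(a)(b)
fst = lambda p: p(lambda a: lambda b: a)
snd = lambda p: p(lambda a: lambda b: b)

shift = lambda p: pair(succ(fst(p)))(fst(p))  # (i, _) -> (i+1, i)
pred = lambda n: snd(n(shift)(pair(zero)(zero)))

three = succ(succ(succ(zero)))
print(to_int(pred(three)))     # 2
```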
By that time Gödel was in Princeton, though I'm not sure when he relocated
there permanently. But in 1934 he was there, because he gave very famous
lectures in the winter semester of 1934. So Gödel was at the Institute and
Church was at the university. So Church comes to Gödel with this thesis. He
actually called it a hypothesis; it was Kleene who later renamed the hypothesis a
thesis. Church wasn't --
>>: [inaudible] in '34.
>> Yuri Gurevich: Say it again?
>>: Was Gödel in Princeton in --
>> Yuri Gurevich: Yes.
>>: He couldn't have been at the Institute. The Institute was formed after the
war.
>> Yuri Gurevich: Thank you. So he probably was at the university at the time.
Okay.
So Gödel finds this idea thoroughly unsatisfactory: why should one believe you?
Now, let us go back a few years. There are many ways to write algorithms. One
of them is recursion, and one of the most common recursions is illustrated here:
you define a function, in this case the factorial. You define F of N plus one using
the value of the function at N. Okay? These days this kind of recursion is called
primitive. At the time, it was the only known recursion. And it occurred to Hilbert
that maybe it is completely general. So he asked his students: is there a
computable function which cannot be defined using this primitive recursion? He
had two students who worked on this, in fact three, but the more famous result is
this one: Ackermann (oops, it should be spelled with two N's there) came up with
the famous counterexample that is in textbooks today.
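Here is the contrast in Python (a sketch, not the talk's slides): the factorial fits the primitive recursive scheme, while Ackermann's function, though perfectly computable, grows too fast to be primitive recursive.

```python
def factorial(n):
    if n == 0:
        return 1                 # base case F(0)
    return n * factorial(n - 1)  # F(n) defined from F(n - 1): primitive recursion

def ackermann(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))  # nested recursion on two parameters

print(factorial(5))     # 120
print(ackermann(2, 3))  # 9; but ackermann(4, 2) already has 19729 digits
```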
Now, Rózsa Péter actually did much more fundamental work. What she dug out
is that in addition to this primitive recursion, where you recurse on one
parameter, there is a kind of two dimensional recursion, and three dimensional,
and there are other complications. So she wrote a book. As a student I by
chance bought the Russian translation of her book, and it's a great book. It was a
whole zoo of all kinds of recursions, more and more powerful.
So why do I tell this story? Because in the winter semester of 1934 Gödel gave a
course of lectures, and he asked this question in the course: whether every
computable function can be defined by some kind of recursion. And his own
answer was: it seems to be so, but you have to use not only primitive recursion
but all kinds of recursions. By the way, I'm telling this story using a fascinating
article by Martin Davis called 'Why Gödel Didn't Have Church's Thesis.' And
Davis says that these two events, what I called act one, when Church came to
Gödel, and when Gödel asked this in his lectures, are difficult to place in order;
they were roughly simultaneous.
By the end of the course, Church and Gödel had worked out a recursion calculus
of great generality. All the recursions of Rózsa Péter were there, and any other
kind of recursion on numbers that was known; it was only about integers, and on
integers everything seemed to be there. And Martin Davis says that at that point
Gödel could have pronounced a thesis. But he didn't.
So, act 3. Church and Kleene prove that Gödel's recursion calculus and their
lambda calculus are equivalent. Take any expression in lambda calculus, which
you can think of as a program for computing a function from integers to integers;
there is a systematic way to transform it into an expression in Gödel's recursion
calculus, and the other way around.
At this point Church, without consulting Gödel, announced his thesis at a meeting
of the American Mathematical Society. So Church's thesis became public in this
form. It's interesting that he used that form: he used Gödel's recursion calculus
rather than his own lambda calculus. Apparently he thought it somehow more
appealing.
Soon after, Church publishes his thesis and settles Hilbert's decision problem.
What he actually proves is that there is no function expressible in Gödel's
recursion calculus which gives a decision procedure for first order logic. Okay?
So let me recall what the problem was: we want an algorithm which takes a
statement in first order logic and decides, is it true or is it false? He proves
mathematically that no expression in Gödel's recursion calculus can program
such a decision procedure. And then he uses his thesis to say there is no
algorithm at all. Because if there were any kind of algorithm, then there would be
a computable function which gives the decision procedure. So his thesis was the
bridge. By the way, I don't mind being interrupted if you feel like it.
And then there was Turing, back in England, working alone as far as we know.
And he wrote a stunning paper. It was an analysis of a computer. Now, what kind
of computer? This is the 1930s; the computer was a mathematician. Turing calls
the computer a man. So what Turing did: he said, okay, suppose we have a
computable function, any computable function from integers to integers (in his
case, actually, from strings to strings). Then there is some kind of algorithm that
computes it. Remember, this algorithm is a man. So he asks: how does this man
compute? How does a mathematician compute? On a piece of paper. He writes
something, he goes back, he looks up, he looks forward. So, without loss of
generality, he says, the paper is like an arithmetic notebook for children; without
loss of generality, in every square only one symbol is written; and so on. After, I
don't remember, a dozen or so of these steps, each without loss of generality, he
arrives at a Turing machine, and arrives at what today is called Turing's thesis.
Now, he didn't call it a Turing machine; he was a modest man. And he didn't call
it a thesis. But it's known now as Turing's thesis: every computable function is
computable by a Turing machine.
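To fix ideas, here is a minimal sketch (my rendering, not Turing's notation) of the construct he arrived at: a finite table of states, a tape of squares with one symbol each, and a head that reads, writes, and moves.

```python
def run(table, tape, state, halt="halt"):
    """Run a Turing machine. `table` maps (state, symbol) to
    (symbol_to_write, move, next_state), with move in {-1, +1}."""
    cells = dict(enumerate(tape))  # the tape as a sparse dict; blank is "_"
    pos = 0
    while state != halt:
        write, move, state = table[(state, cells.get(pos, "_"))]
        cells[pos] = write
        pos += move
    return "".join(cells[i] for i in sorted(cells))

# A tiny example machine: flip every bit, halt at the first blank.
flip = {
    ("s", "0"): ("1", +1, "s"),
    ("s", "1"): ("0", +1, "s"),
    ("s", "_"): ("_", +1, "halt"),
}
print(run(flip, "10110", "s"))  # 01001_
```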
And he also derives the negative result; that's what prompted him in the first
place: to solve Hilbert's decision problem. When he submits his work, he learns
that Church has already published this. Nevertheless the work is accepted, with
an appendix. In the appendix he proves that his thesis and that of Church are
equivalent: if you take one for granted, you can prove the other one, and the
other way around. Or, put simply: take a Turing machine which computes some
function from natural numbers to natural numbers; there is a program in Gödel's
recursion calculus which gives you the same function, and the other way around.
Church wrote a review, and he was a gentleman. He starts his review by stating
clearly that it was an independent investigation. He also acknowledges that
Turing's approach is more convincing. And then he says that his approach also
has some advantages. And everybody accepts the Church-Turing thesis,
including Gödel, the great skeptic. Okay. That concludes the story.
Now, critique. The thesis became accepted universally. There are some crazies
here and there, or at least only the crazies seem to publish something about
their doubts. But if you ask people who think about such things -- of course most
people don't think about them, and there is nothing wrong with that; in a sense
it's even a compliment. Sometimes people who work in foundations complain
that people, you know, don't give enough thought to foundations. But in fact it's a
compliment, because people trust you; you know, you have a good foundation.
In any case, if you ask people who do think about such things why they believe
Church's thesis or Turing's thesis, the most common answer is: there are so
many different models. And indeed, after Church and Turing published their
papers, a great many models of computability appeared, and they were all
proven equivalent. In fact, the very first such model, by Emil Post of the City
College of New York, appeared right after Church but before Turing, just a few
months in between.
I have a problem with this argument. First of all, there are not that many
independent models: there was lambda calculus and recursion calculus and
Turing machines, and that's about it. The others were derived, at least
conceptually; they were different in form but not truly different. But even if there
were hundreds of different models and they all coincided, that would only prove
that the notion captured by those models is robust. It does not prove that the
notion is right, because there could be a systematic mistake that all of them
make.
Another argument is that there are so many years of experience, and of course
with each additional year there are more years of experience. This argument is
also not so strong. Karl Popper writes in one of his books: imagine you live in a
low country, low like the Netherlands, not in the mountains, and water boils at a
hundred degrees centigrade, okay? These people may live there for centuries,
and there is a lot of experience -- until they climb a mountain. So really the most
convincing argument for the thesis is still Turing's speculative analysis.
And this analysis is very powerful. Turing was far from pedantic, so his analysis
in this paper, as in his other papers, is a kind of broad sweep. When you read it,
it's as when you go to the opera and are overtaken by the music: somehow you
are overtaken by the grand sweep of his vision. But if you take a magnifying
glass and start to look closer, then certain questions may arise. But let's see; I'm
getting ahead of myself. Before I go there, let me mention that there seems to
have been one additional attempt to analyze computation. There is a talk by
Kolmogorov in 1953, and there is a paper with a student of his in 1958, where
they define so-called Kolmogorov machines. Very interesting machines. There is
no philosophy there; it's not clear where they were coming from. But a favorite
student of Kolmogorov's, Leonid Levin, who teaches at Boston University, tells
me that what Kolmogorov had in mind was a different kind of analysis.
Turing analyzed a person doing a computation, performing a computation. What
Kolmogorov seemed to be thinking of was a computation developing in time and
space, and especially in space. Every element is somewhere in this space, and
in its close vicinity there can be only so many other elements relevant to the
computation.
Quite an interesting idea. There is a little flaw: a Kolmogorov machine doesn't fit
into three dimensional space, or 17 dimensional space, or any finite dimensional
space; you have to go to a very generalized kind of space. Still, it was a very
interesting advance. Let me mention some other aspects. The Turing machine,
contrary to Church's calculus, allowed us to count steps, and therefore it was the
beginning of the modern theory of complexity. The Kolmogorov machine gave
birth to another kind of complexity. Kolmogorov asked a question, possibly a
rhetorical question, in his seminar: can you multiply numbers quicker than the
normal method? You cannot ask such questions of Turing machines; they're so
clumsy, it doesn't make sense. But Kolmogorov machines were much more
advanced, much closer to the computers of today. And surprisingly, the answer
was yes, you can multiply faster. Okay. But that's a different story.
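The surprising 'yes' was Karatsuba's algorithm, found in response to that seminar question. Here is a rough Python sketch (mine, not from the talk): three half-size multiplications replace four, giving roughly n to the 1.58 digit operations instead of n squared.

```python
def karatsuba(x, y):
    if x < 10 or y < 10:
        return x * y                          # small numbers: multiply directly
    m = max(len(str(x)), len(str(y))) // 2
    base = 10 ** m
    a, b = divmod(x, base)                    # x = a * base + b
    c, d = divmod(y, base)                    # y = c * base + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    mid = karatsuba(a + b, c + d) - ac - bd   # equals a*d + b*c: one product saved
    return ac * base ** 2 + mid * base + bd

print(karatsuba(1234, 5678))  # 7006652, same as 1234 * 5678
```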
So the question is: what exactly did Turing assume? And that is not so obvious.
If you go through his argument, you know, he speaks about a person. At a
certain point he says that there are only finitely many states of mind, finitely
many states of mind as far as performing the algorithm is concerned. Suppose
you perform a certain algorithm: you can stop, go for lunch, return, and resume.
You can even leave a note and ask somebody else to continue. So it seems
convincing. But it's hard to put it as a mathematical axiom.
So Gödel thought about another approach: that it may be possible, you know,
instead of analyzing anything in particular, to just start from axioms. That was
one reason I wanted to start with Antiquity and Euclid, where you would see the
birth of the axiomatic method. So you start from axioms: you postulate certain
general properties of computability, hopefully acceptable to all. And then maybe
you establish that indeed every computable function is Turing computable. Now,
another part of the critique.
The thesis succeeded so much that people often identify algorithms with Turing
machines. There are books by very respectable authors where they just say
that, by definition, an algorithm is a Turing machine. And this is wrong. There is
much more to an algorithm than the function it computes. In an algorithm you
have a certain idea; there is a level of abstraction, data structures, a certain
complexity. Many things would be erased. I'll give you a very simple example, at
least if you remember the Euclidean algorithm: how do you compute the greatest
common divisor? Euclid himself did it via differences. Today, we do it via
division. Now, the two versions are quite different, but if you try to write Turing
machines for both of them, first of all you have to do something about division.
What will you do with division? Well, you'll implement it as repeated subtraction.
So the distinction between the two algorithms will be erased by the time you
arrive at the Turing implementation.
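As a concrete rendering of the example (my code, not a slide from the talk): both programs below compute the same function, but as algorithms they are quite different, and only the source code keeps that distinction visible.

```python
def gcd_euclid(a, b):
    """Euclid's original algorithm: repeated subtraction."""
    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a

def gcd_modern(a, b):
    """Today's version: repeated division with remainder."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd_euclid(1071, 462), gcd_modern(1071, 462))  # 21 21
```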
Let me note this: in this lecture I speak about algorithms in the classical sense.
Kolmogorov said it quite well: algorithms compute in steps of bounded
complexity. In steps of bounded complexity -- each step is small, and you
compute step after step. Today we also speak about distributed algorithms,
real-time algorithms; here we speak about the classical kind of algorithms.
So there are algorithms that compute in steps of bounded complexity and yet
cannot be implemented at all on Turing machines. Here is one example,
courtesy of Professor Reisig from Humboldt University, who is somewhere here.
It's a geometric algorithm from the times of Euclid. You have a circle with a
center P: here is the circle, and the center P. And you have a point Q outside.
And the problem is to construct the tangent, like that. That's the problem. And
they had quite a nice algorithm for this. So here it is. First you connect P and Q
with a line. You know, they did this with a ruler and -- what is the standard
expression?
>>: [inaudible].
>> Yuri Gurevich: Compass. Ruler and compass. By the way, sometimes people
wonder: why ruler and compass? Couldn't they understand, you know, that there
is also the ellipse and other curves? They could; they understood very well. But
ruler and compass was their computer. It was their true computer; they really
computed that way. And that's why it was so important for them to do it with a
ruler and compass. And all of these operations can be done with a ruler and
compass. So you put the line through P and Q, you find the middle point, you
draw the circle around it, and you take the point where the two circles cross;
there is another such point. And then you connect, and that's all there is. Can
you do it on a Turing machine? Or in C? So not every algorithm is even Turing
computable, not even [inaudible] relation. Whoops. Let's see, did I miss another
one? I will show another algorithm of that sort later on.
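For the curious, here is the construction replayed numerically in Python (my sketch; the classical algorithm works on exact points of the plane, while a digital machine can only approximate the reals involved, which is exactly the speaker's point):

```python
from math import hypot, sqrt

P = (0.0, 0.0)  # center of the given circle
r = 1.0         # its radius
Q = (3.0, 0.0)  # the external point

# Midpoint M of PQ, and the auxiliary circle around M through P and Q.
M = ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)
R = hypot(Q[0] - M[0], Q[1] - M[1])

# Intersect the two circles. By Thales, the intersection T sees PQ at a
# right angle, so the line QT is tangent to the original circle.
d = hypot(M[0] - P[0], M[1] - P[1])
a = (r * r - R * R + d * d) / (2 * d)
h = sqrt(r * r - a * a)
ux, uy = (M[0] - P[0]) / d, (M[1] - P[1]) / d
T = (P[0] + a * ux - h * uy, P[1] + a * uy + h * ux)

print(T)  # the tangent point, approximately (0.333, 0.943)
```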
So now we come to new developments. This part is a bit more technical. The
bottom line: we implement what Gödel was proposing. We propose certain
principles, hopefully acceptable to people, and derive the thesis from those
principles. Okay, that's the bottom line. Now, the analysis.
When I started, in the early 1980s, quite a while ago, when I came to Michigan, I
moved from mathematics to computer science. And I wanted to understand:
what do they do in computer science? And I arrived at the conclusion that what
they do is algorithms. An operating system is an algorithm. A compiler is an
algorithm. It seems to be all about algorithms. Now, what is the mathematical
apparatus to deal with algorithms? In physics you use differential equations to
speak about physical processes. What are the differential equations of computer
science? We had the Turing model. But it seemed utterly inadequate.
This is not to take away from the glory of Turing; he made an enormous
advance. But the model was inadequate for our days. So I was thinking: maybe
we can have a much more powerful model. What is the problem with the Turing
machine? The problem is that you deal with a very low level of abstraction; you
deal with single bits. Now, at the time there were very sophisticated
programming languages like PL/1. You can take two matrices of arbitrary
dimensions, the same dimensions but arbitrary, and multiply them at once. A
Turing machine will walk painfully for a long time to accomplish the same result.
So can you have a computation model that lives on the same level of abstraction
as the reality you deal with? It seems too good to be true. Now, if it works, then
there are good uses for it. For example, you write software specs: you think
about what your software is supposed to do, and you write something on that
level. So here is, as they say in German, a Gedankenexperiment. Suppose there
is such a model; how would it look? This is a long story again, a separate story,
but I arrived at a versatile model called abstract state machines. So here is an
example, and it's another example of something the Turing machine cannot
compute at all, again courtesy of Professor Reisig.
So what is the problem? This is not from Antiquity; it's from the 19th century. You
have a continuous function, and you want to find a point close to a zero of this
function, to the root. (A point where the function is zero is called a zero of the
function.) You don't necessarily find the point itself, that is much too complicated,
but you want to arrive in the vicinity, in an epsilon vicinity of that point. And the
algorithm is very simple. You start from points A and B such that the function is
negative at A and positive at B. Then you halve: you take the midpoint between
A and B and you check what happens. If you are already close enough, fine. If
not, you look at the sign of the function there. If it's positive, okay, the midpoint
will be the new B. You do another step, evaluate the function again; okay, this
will be the new A. And you keep going. Of course your interval shrinks
exponentially fast, the function is continuous, and by classical theorems you
solve the problem.
And here are the initial conditions, and here is the program in the language of
abstract state machines. It's a kind of theoretical language. The whole program
is one step, and you repeat the step, as with a Turing machine, until you finish. It
more or less says what I just explained, as you see: you compute the midpoint; if
you are close enough, you output the midpoint and halt; otherwise you check
where you are, and you redefine A or B.
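Here is that one-step program rendered in Python (my rendering; the talk showed it in ASM notation, and the function f and the numbers below are placeholders):

```python
def f(x):
    return x * x - 2          # a sample continuous function; root at sqrt(2)

def step(state):
    """One ASM step: evaluate the guards, make one bounded update."""
    a, b, eps = state["a"], state["b"], state["eps"]
    m = (a + b) / 2           # the midpoint
    if b - a < eps:
        state["output"] = m   # close enough: output the midpoint and halt
    elif f(m) > 0:
        state["b"] = m        # positive sign: the midpoint is the new B
    else:
        state["a"] = m        # otherwise: the midpoint is the new A
    return state

# Repeat the step until it halts, as with a Turing machine.
state = {"a": 0.0, "b": 2.0, "eps": 1e-6, "output": None}
while state["output"] is None:
    state = step(state)
print(state["output"])        # approximately 1.4142135
```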
Later, in fact, when I joined Microsoft ten and a half years ago, I was able to
write axioms for the notion of algorithm. And that's the next part of the story.
There will be three axioms, or postulates. One of them is: an algorithm computes
in steps. It's a kind of reality, but you have to say it. Axioms often look very trivial;
in fact, the more trivial the better. Axioms are supposed to be obvious. And
'obvious' can be pushed. There is a story of a math professor who gives a
lecture and says: this equation is obvious. Okay. Most people write down 'it's
obvious.' And one clever fellow raises his hand: is it really obvious? The
professor starts to pace the floor, goes out to the corridor, paces the floor,
comes back, and says: yes, it's obvious. And continues. [laughter].
So with this postulate, just two problems remain. When we say it's a transition
system, you go from one state to another, and a transition is going from one
state to another; so, two simple problems: what are the states, and what are the
transitions? And if you allow me another joke, that one is attributed to
[inaudible], presumably; I haven't heard the joke directly from him. He said: there
are two main problems in AI, and they are: what is A, and what is I? [laughter].
>>: The previous story [inaudible].
>> Yuri Gurevich: Thank you.
>>: So they say.
>> Yuri Gurevich: You're a wealth of information. Okay. Now, what are the
states? A state, if you think about the computation, what is a state? A state is
certain information which, together with the program, completely determines all
the future, as far as the computation is concerned. So this is a very natural
definition of state, and it's very different from the one given in books. Why?
Typically in books, if you go to, say, a textbook on C, they would say that the
state is given by the values of the variables. And that's a very different notion of
state: the values of the variables do not determine the future computation. Why?
First of all, you have to know where you are in the program. Then you may have
a stack with frames, incomplete computations which eventually come into play.
And so on. So our definition of state is of a completely transparent state: a state
is such that it determines the future of the computation completely. Nothing is
hidden. By the way, that's why the semantics of programming languages is so
hard: they want to present a simplified point of view to you. So when you
program in C you just see a few variables, and you don't think about all the
unfinished computations that you put on the stack, or, in C# or Java, the objects
on the heap.
>>: Would you [inaudible] a simulation for random variables?
>> Yuri Gurevich: Say again?
>>: Would you admit a simulation, a step that calls a random -- a [inaudible].
>> Yuri Gurevich: Yes. I'm speaking about deterministic computations, but if you
speak about nondeterministic computations, then it will be not the future of the
computation but the full variety of possible computations.
So here is the second postulate. I want to be very clear that the copyright is
okay: this is a picture of one of my daughters, and she gave me her permission.
[laughter].
Okay. So the second postulate says that states are what logicians call
structures. If you don't know the term, it's very easy to define: a set with
operations and relations. But to truly appreciate the notion of a structure you
have to work with it a little. Mathematicians don't speak about this, but I think it's
pretty commonly accepted that any static reality can be described as a structure.
If you go to a math department and fetch the first mathematician and ask what
he's working on, it will be some kind of structures: maybe graphs, maybe Hilbert
spaces, but some kind of structures. And I don't necessarily mean finite
structures; whatever they are.
This is conceptually the deepest postulate. The abstract part of an abstract
structure is that you don't look at what the elements are made from. Suppose
you have a graph, and you ask: but what are the elements of the graph? What
are their IDs? There are no IDs. For those who know Lisp: when you start in
Lisp, you begin with quoted atoms. So it's very clean: there are these atoms,
they have no identity, they are just atoms. Okay, so that's the kind of structure.
So the only information is given by the operations and relations. The elements
themselves carry zero information; they are just points. And that's the abstract
part. That's exactly what allows us to go to any level of abstraction where we
want to be.
And the final postulate solves the problem that Kolmogorov sort of informally
posed: steps of bounded complexity. How do you say that a step is of bounded
complexity? It's a very different kind of complexity from polynomial or
exponential complexity, because those count the number of steps; here you
have one step. So the intuition is: imagine a huge state, and you are little, and
you make a transformation, one step, and you see just a little bit around you.
You don't see the whole state. That postulate took me something like fifteen
years to arrive at.
>>: [inaudible] Does this mean that beta reduction in lambda calculus, or
unification, would they be considered bounded exploration as well?
>> Yuri Gurevich: No.
>>: Okay.
>> Yuri Gurevich: No. Because beta reduction can take a term of arbitrary width,
so this is not --
>>: It's not just termination, it has to be absolutely bounded by --
>> Yuri Gurevich: Bounded, yes, every step. Much depends on how you look at
it. If we go to a very high level and we say terms are elements, having no
structure, then it may be different. Much depends on how you look at it; it's all
relative. You take a certain level of abstraction and you see what happens on
that level.
>>: Well, maybe you would admit a kind of more detailed beta reduction that
would --
>> Yuri Gurevich: Yes.
>>: -- change a little bit every step, as long as it's the same behavior [inaudible].
>> Yuri Gurevich: If it's doable. I'm not sure. Yes, but you are on the right track.
So the last postulate says that there are certain expressions in the language.
Here is where I'm coming from. We have the state, and the state is a structure.
In a structure there are the operations (relations are also operations, they're just
Boolean-valued operations), so we only have the operations. Suppose I want to
say 'give me that element,' or just to point to some element. How can I point to
an element? I cannot say 'the element whose ID is 17,' because they don't have
any IDs. So the only way for me to point to an element is to have an expression.
For example, in arithmetic I have the expression zero, which already points to
one element, and I can say successor of successor of zero; with that I can point
to the particular element that we call two. So there is, for a given algorithm, a
fixed number of such expressions, and during one step the algorithm only
evaluates these expressions, nothing else. It doesn't see anything else. So how
can you say that it doesn't see anything else? The idea is that the change that
you make depends only on these expressions.
I'll give you an example, coming back to this program. What are the critical
terms, or expressions, here? Let's see. First of all, this expression, B minus A
over two, which we just call M; but the expression is what matters. Then there is
this Boolean expression, the whole thing; it may be true, it may be false. Then
this expression. And I think that's it.
Now imagine you have two functions which coincide on these, but may be very
different outside. The process described here will go exactly the same way: we
don't care what is outside, we never see it. In fact, you may have a function
which has another zero out there, but we will miss it. Okay, I'm more or less on
time. So that allows us to define an algorithm as any transition system subject to
these two postulates, abstract state and bounded exploration.
And that allows us to prove the representation theorem: if you take an algorithm
in that sense, anything satisfying the three postulates, then there is an abstract
state machine which, as a transition system, is exactly the same: same states,
same transitions. In behavior, there is zero difference.
Okay. At that point Nachum Dershowitz was visiting here. Back to Church. He
had already been working for a while on these abstract state machines, and he
suggested: maybe we can actually use these axioms. Now, in my mind, you
know, I came to Microsoft; I am not a mathematician, I'm a computer guy, and
my purpose was to do something useful. And indeed we used this abstract state
machine theory, and we have a tool called Spec Explorer that was shipped
internally and used in Windows. So my whole direction, this part of my life, was
engineering. In fact, I hadn't even thought about this problem, never heard of it.
And we started to think. And it turns out that these three postulates are sufficient
to define algorithms, but they are not quite sufficient to derive the thesis. We
need a little bit more. And that is what we did in this paper. So let me remind you
what Church's thesis is. And why Church's? We had the choice of which thesis
to prove in a very pedantic way, and Church deals with arithmetic, where things
are much more standard. Everybody knows what arithmetic is: you know, zero,
one, addition, multiplication. With strings, different groups have their own basic
operations. So we went with the easier case of arithmetic.
So what do we want to prove? That every computable numeric function is
expressible in Gödel's recursion calculus. And how do we want to prove it? Now
we have a notion of algorithm, so we want to prove it directly. Here is a
clarification of what the numerical functions are. I said from integers to integers;
historically they didn't want to deal with negative numbers, for no particular
reason, just tradition, so it is natural numbers. And the functions may be partial.
So that's Gödel's program -- or, Gödel's approach. And so one additional
postulate was needed. You see, in a sense the postulates are so powerful: you
can define these geometric algorithms, and algorithms on continuous functions,
or on functions that are not necessarily continuous; you can work with Hilbert
spaces, you know. But Church wants to work on integers, with integers. So you
have to start from something very specific. And our additional postulate has this
form, okay? We put in something which everybody agrees is computable. In fact,
zero and successor suffice. Or you can put in zero, successor, addition and
multiplication.
We then call an algorithm satisfying the four postulates arithmetical. And then it
becomes a theorem: Church's thesis becomes a theorem in this framework.
Similarly, we can prove Turing's thesis from scratch, but we need to modify this
postulate: instead of arithmetical algorithms, we need string algorithms. And this
can be done, and you get that theorem too. And that's it. Thank you.
[applause].
>> Eric Horvitz: Thank you very much, Yuri. Any questions?
>>: Certainly tantalizing enough to read the paper.
>>: [inaudible] So you said [inaudible], for instance the [inaudible] testing
algorithms. Are these covered by this framework, or are they beyond the scope
of this framework?
>> Yuri Gurevich: They are covered by this framework. So let me explain how.
[inaudible] nondeterminism, but it's the same issue. So the question is whether a
nondeterministic algorithm is an algorithm. And if you think about it, it's a
contradiction in terms: how can something nondeterministic be an algorithm?
Let's see. Yogi Berra -- one of his sayings is: when you come to an
intersection --
>>: To a fork.
>> Yuri Gurevich: -- to a fork, take it. [laughter]. Here is an algorithm, a very
clear algorithm, nondeterministic; obviously there is something wrong with it.
So indeed an algorithm cannot possibly be nondeterministic. Then how is it that
we have nondeterministic algorithms? The algorithm needs help. For example in
C, when you say E1 gets E2, according to the manual you can evaluate E1 first
or E2 first. And it matters. In good programming it shouldn't, but it may. So which
one do we take? The C program itself doesn't know; a compiler should help it,
and the compiler does help. Okay. Or, for example, in C# or Java, you declare
an object and an object appears, kind of very nondeterministically. How does it
appear? Who made it? A lot of work goes into creating an object: you need to
allocate space, you need to prove certain kinds of things. So in fact, when the
algorithm says, you know, new object of such and such class, it's really an SOS,
and the operating system comes and helps. Okay? So in that sense you also
incorporate the helpers. There are two ways to deal with this, and we've dealt
with them both. One way is to consider the more complicated system together
with the helpers, okay; and another approach is to consider interactive
algorithms, which openly, you know, send a query and wait for the result, and so
on. Any questions?
>> Eric Horvitz: Well, thank you very much for your [inaudible].
[applause]