Past, Present and Future of Set Theory

Jaakko Hintikka
What one can say about the past, present and future of set theory depends on what one
expects or at least hopes set theory will accomplish.
In order to gauge the early
expectations, I begin with a quote from the inaugural lecture in 1903 of my mathematical
grandfather, the internationally known Finnish mathematician Ernst Lindelöf.
The
subject of his lecture was – guess what – Cantor’s set theory. In his conclusion, Lindelöf
says of Cantor’s results:
For mathematics they have lent new tools and opened up new fields of
research, they have thrown entirely new light on the foundations of
analysis and brought clarity and order where there was only disorder and
contradictions. Thus they have greatly contributed to the harmony that is
the essence of mathematics, a harmony a grasp of which is the reward of
mathematical research.
(Quoted in Olli Lehto, Tieteen aatelia, Otava, Helsinki, 2008, p. 263)
We can all agree with the compliments Lindelöf pays to set theory as an
impressive specimen of mathematical research, including the theory of infinite cardinals
and ordinals. But as far as the foundational role of set theory is concerned, in the
perspective of the subsequent century his words read as an example of supreme historical
irony. Far from bringing harmony into the foundations of mathematics, problems arising
from set theory led to a schism between different schools of thought.
Few
mathematicians think of set theory as a tool for reaching new results outside set theory.
On the contrary, an interesting rich tradition called reverse mathematics takes significant
mathematical results and asks what set-theoretical assumptions are needed to prove them.
Set-theoretical paradoxes have greatly increased mathematicians’ concerns about
contradictions instead of assuaging them. Many foundationalists would blandly deny that
we have even now, more than a hundred years later, reached “clarity and order” about the
foundations of analysis.
What Lindelöf took to be the results of set theory thus were in reality so many
hopes that set theory was expected to fulfill. But precisely what prompted these hopes in
the first place? What can we objectively speaking expect of set theory? In the hands of
Georg Cantor, the centerpiece of set theory was his theory of infinite numbers,
hierarchies of infinite cardinals and of infinite ordinals. But in this direction frustration
reigns. Already in Cantor’s lifetime, his theory came to a screeching halt in its main
direction. Neither Cantor nor anyone else could relate the two hierarchies to each other.
The main symptom of this syndrome is known as the continuum problem, the question
whether the cardinality of the continuum is the first nondenumerable cardinal, that is,
the cardinality of the first nondenumerable ordinal. The positive answer to this question is called the continuum hypothesis, in short
CH. To add demonstrative insult to aspirational injury, Kurt Gödel and Paul Cohen
proved that the continuum problem cannot be solved in the most commonly used
approach to the study of set theory, viz. the Zermelo-Fraenkel (ZF) axiomatization of set
theory. Their results have been hailed as major achievements in the foundations of
mathematics. They may be feats of clever reasoning, but for reasons to be spelled out
later they have absolutely no constructive significance. They do not have any relevance to
the question whether CH is actually true. All they do is to demonstrate the shortcomings
of the ZF set theory.
There have been other major developments in the past of set theory. Gradually the job
description of set theory has come to include another, even more important task. Set
theory has come to be considered as the repository of all the modes of reasoning needed
in mathematics. Our generally accepted working logic, the received first-order logic, is
too poor to capture all the patterns of inference used by mathematicians. They apparently
need higher-order modes of inference, that is, modes of reasoning that involve such
higher-order entities as sets or concepts. A representative case in point is the so-called
axiom of choice (AC). What AC says is that for any set of nonempty sets there exists a
function picking out precisely one member of each. Formally, AC can be thought of as
being captured by the schema
(1) (x)(y) F[x,y]  (f)(x) F[x,f(x)].
Here (f) is a second-order quantifier, and hence, apparently goes beyond first-order
logic.
The problems that manifested themselves in the form of the so-called paradoxes
of set theory prompted mathematicians to flee to the safe-looking heaven of
axiomatization.
Such an axiomatization was carried out by Ernst Zermelo beginning in 1908. His axiom
system developed into the currently most commonly used approach to set theory, the ZF
first-order axiomatization of set theory. It has some variants, but the differences between the
different first-order axiomatizations are not important.
What is there to be said of ZF set theory? Let us look at the history. Zermelo
did
not propose to axiomatize set theory out of the disinterested goodness of his theoretical
heart. He had a specific personal reason. Zermelo had published his proof of the well-ordering theorem a couple of years earlier. The proof had run into some heavy weather
of criticism. The criticism focused on his use of the axiom of choice. Hence Zermelo
wanted to vindicate it and proposed to do so in the way Hilbert had made popular, that is
to say, by building it into a natural set of axioms. In this way, AC came to be considered
not a logical truth, but a specifically mathematical assumption.
Zermelo’s enterprise was seriously misguided, however. Not only has it led set
theorists in a wrong direction. It is something still worse: it is unnecessary for his own
main purpose. Contrary to the virtually universally accepted view of AC as a special
mathematical assumption, I will show that AC not only can somehow be thought of as
being a logical principle, but is in fact a first-order logical truth. Whether this fulfills
Hilbert’s expectation that from a suitable point of view AC can be seen to be as
unproblematically obvious as 2+2=4, I will leave for my audience to judge.
I have to show the character of AC as a logical principle also to defend myself
against a plausible-looking objection. Otherwise I might seem to be off base when I
denied that set theory has “lent new tools” for mathematical practice. For by anybody’s
token AC is a bread-and-butter assumption in almost all mathematics.
This result is furthermore representative enough of the general situation to be
worth spelling out. To see the lay of the land, consider the familiar rule of inference of
ordinary first-order (FO) logic called existential instantiation. What it says is that in a
sentence (x)F[x] with a sentence-initial existential quantifier, you may omit the
quantifier if you replace the variable bound to it by a new individual constant. This
constant is sometimes referred to as a dummy name. The intuitive meaning of this rule
can be explained to an algebraist by saying that a dummy name is just like a symbol for
an unknown solution of a solvable equation, and explained to a judge by saying that a
dummy name is like the pseudo-names “John Doe”, “Jane Roe” etc. in the legal jargon
for perpetrators or litigants whose identity is not known or is kept confidential. A dummy
name can also be thought of as representing some arbitrarily chosen representative of a
nonempty class.
But why must the instantiated existential quantifier be sentence-initial? Wherever
an existential quantifier occurs, it asserts the existence of a certain kind of individuals.
Why cannot we always choose one of them to represent all of them so that we can argue about
them? Students who have learned the rule of existential instantiation often try to apply it
also inside larger formulas. Yet such applications are fallacious.
Why so? The intuitive reason is obvious. The choice of a John Doe individual
may depend on the values of those quantifiers further out on which it depends. For
instance, in (∀x)(∃y)(x admires y), in words “everyone admires someone”, we cannot
choose a value b of y that would satisfy (∀x)(x admires b). For this would say that one
and the same idol is admired by everyone.
But as soon as we see the nature of the difficulty, a solution becomes obvious.
The variable in a dependent existential quantifier cannot be replaced by a single dummy
name, but it can be replaced by a function term which spells out the relevant
dependencies. Thus in (∀x)(∃y) F[x,y] we can replace y by a term of the form f(x). The
result is (∀x)F[x,f(x)]. Intuitively speaking, it says that there is a way of finding an
admiree for any given person. It is still a first-order sentence, but one where
instead of a new dummy name we have a new dummy function.
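The difference between a dummy name and a dummy function can be checked on a small finite model. The following is only a sketch under stated assumptions: the three people and the particular "admires" relation are hypothetical and chosen purely for illustration. It enumerates every candidate function f and keeps those for which everyone admires f(x).

```python
from itertools import product

# Hypothetical finite model: three people and an illustrative "admires" relation.
people = ["ann", "bob", "cat"]
admires = {("ann", "bob"), ("bob", "cat"), ("cat", "ann")}


def everyone_admires_someone():
    """Evaluate (∀x)(∃y)(x admires y) in the model."""
    return all(any((x, y) in admires for y in people) for x in people)


def someone_admired_by_everyone():
    """Evaluate (∃y)(∀x)(x admires y), the fallacious single-dummy-name reading."""
    return any(all((x, y) in admires for x in people) for y in people)


def dummy_functions():
    """List every function f from people to people with (∀x)(x admires f(x))."""
    found = []
    for values in product(people, repeat=len(people)):
        f = dict(zip(people, values))
        if all((x, f[x]) in admires for x in people):
            found.append(f)
    return found


print(everyone_admires_someone())     # True
print(someone_admired_by_everyone())  # False
print(dummy_functions())              # [{'ann': 'bob', 'bob': 'cat', 'cat': 'ann'}]
```

In this model no single idol b exists, yet a function picking an admiree for each person does, which is exactly the dependence that the function term f(x) records.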
This can be generalized. We can formulate a generalized but obviously valid
instantiation rule which allows us to eliminate an existential quantifier (∃x) prefixed to a
formula F[x] in a given context if we replace the variable x by a function term
f(y1, y2, …). Here f is a new function constant and (∀y1), (∀y2), … are all the universal
quantifiers on which (∃x) depends in its given context. I am here assuming that the given
sentence in which (∃x) occurs is in the negation normal form. The result is a first-order
sentence with no higher-order quantifiers. Instead of introducing a “John Doe” like
individual, we are now introducing an “arbitrarily chosen” “John Doe” function. I will
call the generalized rule the rule of functional instantiation.
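As a rough illustration, the rule of functional instantiation can be sketched as a recursive pass over a sentence in negation normal form. Everything about the representation is an assumption made for this sketch and is not taken from the text: formulas are nested tuples, variables are strings, new function constants are named f1, f2, …, and all bound variables are assumed to have distinct names.

```python
from itertools import count


def substitute(term, var, replacement):
    """Replace the variable `var` by `replacement` inside a term."""
    if isinstance(term, str):
        return replacement if term == var else term
    _, name, args = term  # a function term ("fn", name, args)
    return ("fn", name, tuple(substitute(a, var, replacement) for a in args))


def subst_formula(formula, var, replacement):
    """Replace free occurrences of `var` in a formula by `replacement`."""
    tag = formula[0]
    if tag == "atom":
        _, pred, args = formula
        return ("atom", pred, tuple(substitute(a, var, replacement) for a in args))
    if tag == "not":  # in negation normal form, "not" wraps atoms only
        return ("not", subst_formula(formula[1], var, replacement))
    if tag in ("and", "or"):
        return (tag,
                subst_formula(formula[1], var, replacement),
                subst_formula(formula[2], var, replacement))
    _, bound, body = formula  # a quantifier
    if bound == var:
        return formula
    return (tag, bound, subst_formula(body, var, replacement))


def functional_instantiation(formula, universals=(), fresh=None):
    """Eliminate every existential quantifier in favor of a new function term
    whose arguments are the universally bound variables it depends on."""
    fresh = fresh if fresh is not None else count(1)
    tag = formula[0]
    if tag in ("atom", "not"):
        return formula
    if tag in ("and", "or"):
        return (tag,
                functional_instantiation(formula[1], universals, fresh),
                functional_instantiation(formula[2], universals, fresh))
    _, var, body = formula
    if tag == "forall":
        return ("forall", var,
                functional_instantiation(body, universals + (var,), fresh))
    # tag == "exists": introduce a new "John Doe" function
    term = ("fn", f"f{next(fresh)}", universals)
    return functional_instantiation(subst_formula(body, var, term), universals, fresh)


# (∀x)(∃y) F[x,y] becomes (∀x) F[x, f1(x)]
sentence = ("forall", "x", ("exists", "y", ("atom", "F", ("x", "y"))))
print(functional_instantiation(sentence))
# ('forall', 'x', ('atom', 'F', ('x', ('fn', 'f1', ('x',)))))
```

When the existential quantifier is sentence-initial, the list of universal variables is empty and the new term is a function constant with no arguments, so ordinary existential instantiation with a dummy name falls out as the special case.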
Several comments are in order here. First, this reformulation yields a first-order
logic that is typically more convenient and elegant than the conventional formulation. It
allows us to replace in any one application all predicates by their characteristic functions,
eliminate existential quantifiers, and by so doing reduce all first-order reasoning to a
function calculus where the only quantificational rule of inference is the replacement of
(universally quantified) variables by function terms (over and above rules for identity). I
will call this logic the (first-order) function calculus.
This calculus can be further generalized. In the rule of functional instantiation,
the term f(y1, y2, …) replacing x has as its arguments all the variables yi bound to universal
quantifiers (∀yi) on which (∃x) depends. This dependence is in the conventional Frege-Russell first-order logic expressed by the fact that (∃x) occurs in the syntactical scope of
(∀yi). This is an inflexible way of expressing dependence. We can obtain a more
flexible first-order logic by allowing (∃x) to be independent of some universal quantifier
(∀z) within whose formal scope it occurs. This will be expressed by writing (∃x) as
(∃x/∀z). This independence will manifest itself in the fact that the variable z does not
occur among the arguments of the term f(y1, y2, …) replacing x in the rule of functional
instantiation.
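A worked comparison may make the slash notation concrete. The sentence below is an illustrative assumption, not an example from the text, and f and g stand for hypothetical new function constants. In the conventional reading the instantiating term takes both universally bound variables as arguments. In the independence-friendly reading the slashed variable is dropped from the argument list.

```latex
% Conventional first-order reading: y may depend on both x and z.
\[
(\forall x)(\forall z)(\exists y)\, F[x,y,z]
\;\longmapsto\;
(\forall x)(\forall z)\, F[x, f(x,z), z]
\]

% Independence-friendly reading: y is declared independent of z.
\[
(\forall x)(\forall z)(\exists y/\forall z)\, F[x,y,z]
\;\longmapsto\;
(\forall x)(\forall z)\, F[x, g(x), z]
\]
```

The second result is the stronger claim, since the choice of y must work for every z without being told which z is at hand.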
This modification of first-order logic results in what I have called independence-friendly (IF) first-order logic.
The correlated function calculus will be called, not
surprisingly, IF function calculus. This calculus is so rich that arbitrary computations can
be expressed in it. For instance, Kleene’s analysis of computability can be expressed in
it. This opens an interesting possibility of approaching problems in computation theory
in terms of first-order logic. For instance, it is known that if we can solve the famous P
vs. NP problem for consistency questions in IF first-order logic, we can in effect solve
the P vs. NP problem in general.
Our result can be generalized. Not only is the axiom of choice dispensable as a
special mathematical assumption. In a sense, the entire foundational function of set
theory can be shown to be dispensable in principle. All that is needed is a suitable first-order logic.
This dispensability claim might at first sight seem unrealistic. Many, perhaps
most mathematicians would agree that it would be marvelous if we could avoid the use of
higher-order conceptualizations altogether. Hilbert blamed them for all the difficulties in the
foundations of mathematics. Dispensing with them was Hilbert’s real ambition. The so-called Hilbert program was only one way of trying to do so. It was for the purpose of
turning the axiom of choice into a first-order principle that Hilbert developed his epsilon
calculus.
Hilbert’s attempt did not succeed. However, he was on the right track. Indeed,
the treatment of the “axiom” of choice earlier in this essay can serve as a paradigm case
of what can be done in general. What was shown there was how the axiom of choice
could be turned into a first-order logical truth by improving the received first-order logic.
In general, we have already seen one way of formulating a richer logic than the
received first-order logic. It is to allow more flexible ways of expressing relations of
dependence and independence between quantifiers, which in effect are relations of
dependence and independence of the variables occurring in the quantifiers. The result is
what I have called IF first-order logic.
However, IF logic is not yet strong enough to capture set-theoretical or other
higher-order reasoning. It can be enriched, however, while remaining on the first-order
level, by introducing the contradictory negation ¬ over and above the negation ~ of IF
first-order logic, which does not obey the law of excluded middle. Elsewhere, I have
shown that in this way we can obtain a first-order logic that is equivalent to the entire
second-order logic. Since second-order logic is all we need to capture all the usual
modes of reasoning in mathematics, this fully extended IF logic makes both higher-order
logic and set theory dispensable in the codification of the principles of mathematical
reasoning.
The rule of functional instantiation is rooted deeply in the semantics of first-order
logic. The natural truth-condition of a first-order (quantificational) sentence S is the
existence of suitable “witness individuals” that show (almost in the sense of displaying)
its truth. Many such witness individuals depend on other individuals. Hence their
existence in effect means the existence of the functions that yield them as their values.
These functions are known as the Skolem functions of S. Now the “dummy functions”
introduced in the rule of functional instantiation are essentially arbitrarily chosen
representatives of Skolem functions. The natural truth condition for S is therefore the
existence of the Skolem functions of S, in other words, the statement that all such dummy
functions for different existential quantifiers in S exist. Hence the rule of functional
instantiation is but a corollary to the natural truth conditions of first-order sentences.
But there are other objections to ZF set theory than redundancy. One can go so
far as to say that it represents a misuse or at best a very dicey use of the axiomatic
method. The basic reason lies in the nature of axiomatization. The purpose of an
axiomatization of a theory is not to facilitate the discovery of new truths, nor is the
purpose to enhance the credibility of the theorems of the system. The purpose of an axiom
system is to facilitate the study of a class of structures, viz. the models of the axioms.
Establishing the deductive links between axioms and theorems is only one way of doing
so. Often metatheoretical results serve the same purpose, such as showing the
independence of certain theorems of some specific axioms or establishing representation
theorems. In mathematical practice, an axiom system typically comprises its own model
theory. And this model theory always comes with an implicit extra assumption, which is
its own consistency, without which its model theory cannot be discussed.
But no matter what Zermelo’s original intentions were, ZF set theory has come to
be understood as a first-order theory, that is to say, a theory whose logic is the received
FO logic. Hence the models of a first-order theory like ZF set theory are structures of
individuals (particulars), not structures of sets. Hence any use of first-order logic as the
medium of axiomatization is an indirect way of approaching set-theoretical structures. It
seems seriously misguided, and in any case it is at best a tricky and precarious enterprise.
We are supposed to find out about structures of sets by studying certain structures of
particular objects. If this is what it means to deal with sets as if they were individuals,
such an enterprise is more easily said than done, at best.
It is known that there are difficulties in trying to derive information from the first-order models of ZF set theory concerning actual structures of sets. The so-called axioms
of set theory are attempts to force the structures of particular objects that constitute the
domains of the models of a first-order set theory to resemble structures of sets. It is a
disconcerting fact that this situation is not always brought out of the closet and discussed
systematically. It appears that initially the problems were thought to manifest themselves
only among very large sets. In reality, they are much more endemic than that. Among
other things, these problems make it virtually impossible to study the model theory of set
theory in set theory itself, even though set theory is often thought of as the medium of
choice of all model theory.
For example, not all expressions of truth conditions for set-theoretical sentences
can be true in any model of first-order set theory, although those expressions are
themselves set-theoretical statements. For instance, if they were, and the relevant set
theory were rich enough to allow the formulation of its own syntax (as ZF set theory in
fact is), then we could formulate a truth predicate for a set-theoretical language in the
same language. Because we are dealing with first-order set theory, this is impossible in
virtue of Tarski’s impossibility theorem. But without a truth definition of a theory it is
not possible to discuss its model theory. Hence it is hopeless to try to do model theory
for first-order axiomatic set theory in terms of that set theory.
Moreover, the natural truth condition of a set-theoretical proposition was seen to
be the existence of its Skolem functions. Now in its most general form AC guarantees the
existence of those functions. Hence the unrestricted form of AC would make possible a
truth definition of systems like ZF. Since this is impossible, adding the unrestricted form
of AC to ZF would make it inconsistent.
There is yet another respect in which a first-order axiomatic set theory does not
operate in the same way as a normal mathematical axiomatization. Normally, an axiom
system is calculated to capture a class of structures as its models with an eye on
investigating their features. But set theory is “a theory of everything”. The set-theoretical
universe is supposed to include all sets. What sense does it make to try to find the
characteristic features of everything?
What this result means is something much more important than the impossibility
of capturing the full force of the idea behind the axiom of choice in ZF set theory. Since
the concept of truth is the key to all model theory, what has been found shows that we
cannot study the model theory of ZF set theory simply in terms of set theory itself.
Instead of being the universal medium of model theory, as some logicians and
philosophers seem to think, ZF set theory cannot ever deal with its own model theory.
What does all this suggest about the future of set theory? Two recommendations
seem reasonably clear. First, it should be acknowledged that the two parts of the current
job description of set theory cannot reasonably be combined. A first-order axiomatic
theory of a set-theoretical universe just cannot also capture the principles of reasoning
used in logic.
This immediately leads to a second suggestion. It is that set theory should return
to Cantor’s success and become a model theory of set-theoretical structures, prominently
including infinite cardinals and infinite ordinals. Now this recommendation might seem
extremely difficult to implement. I have myself repeatedly faced the skeptical objection:
Where can you find valid set-theoretical assumptions that go beyond current
axiomatizations of set theory? What would such exotic assumptions look like? The
answer is that the assumptions needed need not be exotic at all. For one thing, in all
model theory you assume as a matter of course the existence of the models you are
studying. This means assuming the consistency of the theory whose models they are. And
since axiom systems like ZF set theory use conventional first-order logic, their
consistency cannot be proved in the theory itself.
It is here that IF logic and its extensions offer essential advantages over the
received first-order logic, while still remaining on the first-order level with all of its
theoretical advantages. For one thing it is no longer impossible to prove the consistency
of a first-order theory in the same theory. Indeed, if elementary arithmetic uses IF logic
instead of the received first-order logic, its consistency can easily be proved in the same
arithmetic in an elementary fashion. In principle, the situation should be similar in set
theory.
In general, the use of IF logic gives opportunities of cultivating the model theory
of set theory without any new set-theoretical or other mathematical assumptions. For
instance, there are no reasons to think that the continuum problem should be impossible
to solve by means of this approach. I suggested that fully extended IF logic could in
principle replace set theory. Perhaps a better formulation would be to say that set theory
can be assimilated into fully extended IF logic. This logic hence seems to offer definite
advantages.
But can we really make progress in this way? Ultimately, the future will have to
answer this question. However, in any case, the natural formulation of many set-theoretical problems is not deductive but model-theoretical. A case in point is the
continuum problem. It is a model-theoretical question concerning the set-theoretical
structure known as the second number class, that is, the class of all countably infinite
ordinals. Its cardinality is known to be the second infinite cardinal. Hence the continuum problem is
equivalent to the model-theoretical question whether the cardinality of the second number
class equals that of the continuum.
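In standard modern notation, which is an addition here and not the notation used in the text, the model-theoretical form of the question can be written out as follows. The symbol ω₁ denotes the first uncountable ordinal, whose predecessors include all countable ordinals, and ℵ₁ denotes the second infinite cardinal.

```latex
% Cardinality of the second number class (a standard fact):
\[
|\omega_1| = \aleph_1
\]

% Cardinality of the continuum:
\[
|\mathbb{R}| = 2^{\aleph_0}
\]

% The continuum hypothesis (CH) asserts that the two coincide:
\[
2^{\aleph_0} = \aleph_1
\]
```

Read this way, CH is a claim about the size of one definite structure compared with another, which is why it lends itself to a model-theoretical rather than a purely deductive formulation.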
I have been thinking about the continuum problem in this form. I am not in a
position to announce any definitive results, but I have become convinced that a positive
solution can be found in this direction. I can in fact indicate the intuitive reasons for this
belief. It is not hard to show that the overall structure of the second number class allows
for an everywhere branching classification of ordinals.
This is because any two
conditions on ordinals that do not define a specific ordinal can be satisfied by a pair of
ordinals in either order. (This is because definable ordinals form an initial sequence of
countable ordinals.) Hence any finite number of conditions on ordinals can be realized in
any order. The cardinality of different possible kinds of ordinals is hence that of the
continuum. They are all satisfied, basically because the second number class comprises
all countable ordinals.
Should this line of thought succeed, it would provide a powerful example of what
can be done in the core ideas of set theory once it is liberated from the fetters of the first-order axiomatic approach. But even without this particular result, I believe that I have
shown to you where the future of set theory is to be found. Maybe in the long run
Lindelöf turns out to have been right.