Jaakko Hintikka

PAST, PRESENT AND FUTURE OF SET THEORY

What one can say about the past, present and future of set theory depends on what one expects or at least hopes set theory will accomplish. In order to gauge the early expectations, I begin with a quote from the inaugural lecture in 1903 of my mathematical grandfather, the internationally known Finnish mathematician Ernst Lindelöf. The subject of his lecture was, guess what, Cantor’s set theory. In his conclusion, Lindelöf says of Cantor’s results:

For mathematics they have lent new tools and opened up new fields of research, they have thrown entirely new light on the foundations of analysis and brought clarity and order where there was only disorder and contradictions. Thus they have greatly contributed to the harmony that is the essence of mathematics, a harmony a grasp of which is the reward of mathematical research. (Quoted in Olli Lehto, Tieteen aatelia, Otava, Helsinki, 2008, p. 263)

We can all agree with the compliments Lindelöf pays to set theory as an impressive specimen of mathematical research, including the theory of infinite cardinals and ordinals. But as far as the foundational role of set theory is concerned, in the perspective of the subsequent century his words read as an example of supreme historical irony. Far from bringing harmony into the foundations of mathematics, problems arising from set theory led to a schism between different schools of thought. Few mathematicians think of set theory as a tool for reaching new results outside set theory. On the contrary, an interesting rich tradition called reverse mathematics takes significant mathematical results and asks what set-theoretical assumptions are needed to prove them. Set-theoretical paradoxes have greatly increased mathematicians’ concerns about contradictions instead of assuaging them.
Many foundationalists would blandly deny that we have even now, more than a hundred years later, reached “clarity and order” about the foundations of analysis. What Lindelöf took to be the results of set theory thus were in reality so many hopes that set theory was expected to fulfill.

But precisely what prompted these hopes in the first place? What can we objectively speaking expect of set theory? In the hands of Georg Cantor, the centerpiece of set theory was his theory of infinite numbers, hierarchies of infinite cardinals and of infinite ordinals. But in this direction frustration reigns. Already in Cantor’s lifetime, his theory came to a screeching halt in its main direction. Neither Cantor nor anyone else could relate the two hierarchies to each other. The main symptom of this syndrome is known as the continuum problem, the question whether the cardinality of the continuum is that of the first nondenumerable ordinal, that is, the first nondenumerable cardinal. The positive answer to this question is called the continuum hypothesis, in short CH.

To add demonstrative insult to aspirational injury, Kurt Gödel and Paul Cohen proved that the continuum problem cannot be solved in the most commonly used approach to the study of set theory, viz. the Zermelo-Fraenkel (ZF) axiomatization of set theory. Their results have been hailed as major achievements in the foundations of mathematics. They may be feats of clever reasoning, but for reasons to be spelled out later they have absolutely no constructive significance. They do not have any relevance to the question whether CH is actually true. All they do is to demonstrate the shortcomings of the ZF set theory.

There have been other major developments in the past of set theory. Gradually the job description of set theory has come to include another, even more important task. Set theory has come to be considered as the repository of all the modes of reasoning needed in mathematics.
Our generally accepted working logic, the received first-order logic, is too poor to capture all the patterns of inference used by mathematicians. They apparently need higher-order modes of inference, that is, modes of reasoning that involve such higher-order entities as sets or concepts. A representative case in point is the so-called axiom of choice (AC). What AC says is that for any set of nonempty sets there exists a function picking out precisely one member of each. Formally, AC can be thought of as being captured by the schema

(1) (∀x)(∃y) F[x,y] ⊃ (∃f)(∀x) F[x,f(x)].

Here (∃f) is a second-order quantifier, and hence apparently goes beyond first-order logic.

The problems that manifested themselves in the form of the so-called paradoxes of set theory prompted mathematicians to flee to the safe-looking haven of axiomatization. Such an axiomatization was carried out by Ernst Zermelo beginning in 1908. His axiom system developed into the currently most commonly used approach to set theory, the ZF first-order axiomatization of set theory. It has some ???, but the differences between the different first-order axiomatizations are not important.

What is there to be said of ZF set theory? Let us look at the history. Zermelo did not propose to axiomatize set theory out of the disinterested goodness of his theoretical heart. He had a specific personal reason. Zermelo had published his proof of the well-ordering theorem a couple of years earlier. The proof had run into some heavy weather of criticism. The criticism focused on his use of the axiom of choice. Hence Zermelo wanted to vindicate it and proposed to do so in the way Hilbert had made popular, that is to say, by building it into a natural set of axioms. In this way, AC came to be considered not a logical truth, but a specifically mathematical assumption.

Zermelo’s enterprise was seriously misguided, however. Not only has it led set theorists in a wrong direction.
It is something still worse: it is unnecessary for his own main purpose. Contrary to the virtually universally accepted view of AC as a special mathematical assumption, I will show that AC not only can somehow be thought of as being a logical principle, but is in fact a first-order logical truth. Whether this fulfills Hilbert’s expectation that from a suitable point of view AC can be seen to be as unproblematically obvious as 2+2=4, I will leave for my audience to judge.

I have to show the character of AC as a logical principle also to defend myself against a plausible-looking objection. Otherwise I might seem to be off base when I denied that set theory has “lent new tools” for mathematical practice. For by anybody’s token AC is a bread-and-butter assumption in almost all mathematics. This result is furthermore representative enough of the general situation to be worth spelling out.

To see the lay of the land, consider the familiar rule of inference of ordinary first-order (FO) logic called existential instantiation. What it says is that in a sentence (∃x)F[x] with a sentence-initial existential quantifier, you may omit the quantifier if you replace the variable bound to it by a new individual constant. This constant is sometimes referred to as a dummy name. The intuitive meaning of this rule can be explained to an algebraist by saying that a dummy name is just like a symbol for an unknown solution of a solvable equation, and explained to a judge by saying that a dummy name is like the pseudo-names “John Doe”, “Jane Roe” etc. in the legal jargon for perpetrators or litigants whose identity is not known or is kept confidential. A dummy name can also be thought of as representing some arbitrarily chosen representative of a nonempty class.

But why must the instantiated existential quantifier be sentence-initial? Wherever an existential quantifier occurs, it asserts the existence of a certain kind of individuals.
Why cannot we always choose one of them to represent all of them so that we can argue about them? Students who have learned the rule of existential instantiation often try to apply it also inside larger formulas. Yet such applications are fallacious. Why so? The intuitive reason is obvious. The choice of a John Doe individual may depend on the values of those quantifiers further out on which it depends. For instance, in (∀x)(∃y)(x admires y), in words “everyone admires someone”, we cannot choose a value b of y that would satisfy (∀x)(x admires b). For this would say that one and the same idol is admired by everyone.

But as soon as we see the nature of the difficulty, a solution becomes obvious. The variable in a dependent existential quantifier cannot be replaced by a single dummy name, but it can be replaced by a function term which spells out the relevant dependencies. Thus in (∀x)(∃y) F[x,y] we can replace y by a term of the form f(x). The result is (∀x)F[x,f(x)]. Intuitively speaking, it says that there is a way of finding an admiree for any given person. It is still a first-order sentence, but one where instead of a new dummy name we have a new dummy function.

This can be generalized. We can formulate a generalized but obviously valid instantiation rule which allows us to eliminate an existential quantifier (∃x) prefixed to a formula F[x] in a given context if we replace the variable x by a function term f(y1, y2, …). Here f is a new function constant and (∀y1), (∀y2), … are all the universal quantifiers on which (∃x) depends in its given context. I am here assuming that the given sentence in which (∃x) occurs is in the negation normal form. The result is a first-order sentence with no higher-order quantifiers. Instead of introducing a “John Doe” like individual, we are now introducing an “arbitrarily chosen” “John Doe” function. I will call the generalized rule the rule of functional instantiation.

Several comments are in order here.
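Before turning to them, the rule can be illustrated mechanically. What follows is only a minimal sketch under stated assumptions, not part of the argument itself: formulas in negation normal form are encoded as nested tuples, bound variables are assumed to be pairwise distinct, and the names functional_instantiation, substitute, f1, f2 are hypothetical, chosen purely for illustration. Each existential variable is replaced by a term built from a new function constant applied to the universally quantified variables in whose scope it occurs.

```python
from itertools import count

# Assumed encoding of formulas in negation normal form:
#   ("forall", var, body), ("exists", var, body),
#   ("and", left, right), ("or", left, right),
#   ("not", atom), ("atom", predicate_name, term, ...)
# A term is a variable name (str) or a tuple (function_name, arg, ...).

def substitute(term, env):
    """Replace the variables of a term by the terms assigned to them in env."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(substitute(t, env) for t in term[1:])

def functional_instantiation(formula, universals=(), env=None, fresh=None):
    """Eliminate every existential quantifier in favour of a dummy function term.

    universals: universally quantified variables governing the current position.
    env: maps each eliminated existential variable to its function term.
    fresh: shared counter that supplies new function constants f1, f2, ...
    """
    env = {} if env is None else env
    fresh = count(1) if fresh is None else fresh
    tag = formula[0]
    if tag == "forall":
        _, var, body = formula
        inner = functional_instantiation(body, universals + (var,), env, fresh)
        return ("forall", var, inner)
    if tag == "exists":
        _, var, body = formula
        # With no governing universals this is a zero-place function,
        # i.e. an ordinary dummy name as in existential instantiation.
        term = (f"f{next(fresh)}",) + universals
        return functional_instantiation(body, universals, {**env, var: term}, fresh)
    if tag in ("and", "or"):
        left = functional_instantiation(formula[1], universals, env, fresh)
        right = functional_instantiation(formula[2], universals, env, fresh)
        return (tag, left, right)
    if tag == "not":
        return ("not", functional_instantiation(formula[1], universals, env, fresh))
    if tag == "atom":
        return ("atom", formula[1]) + tuple(substitute(t, env) for t in formula[2:])
    raise ValueError(f"unknown formula tag: {tag!r}")

# "Everyone admires someone" becomes "everyone x admires f1(x)".
everyone_admires_someone = ("forall", "x", ("exists", "y", ("atom", "admires", "x", "y")))
instantiated = functional_instantiation(everyone_admires_someone)
```

In this sketch the sentence-initial case needs no separate treatment: when no universal quantifier governs the existential one, the function term has no arguments and behaves like a dummy name.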
First, this reformulation yields a first-order logic that is typically more convenient and elegant than the conventional formulation. It allows us to replace in any one application all predicates by their characteristic functions, eliminate existential quantifiers, and by so doing reduce all first-order reasoning to a function calculus where the only quantificational rule of inference is the replacement of (universally quantified) variables by function terms (over and above rules for identity). I will call this logic the (first-order) function calculus.

This calculus can be further generalized. In the rule of functional instantiation, the term f(y1, y2, …) replacing x has as its arguments all the variables yi bound to universal quantifiers (∀yi) on which (∃x) depends. This dependence is in the conventional Frege-Russell first-order logic expressed by the fact that (∃x) occurs in the syntactical scope of (∀yi). This is an inflexible way of expressing dependence. We can obtain a more flexible first-order logic by allowing (∃x) to be independent of some universal quantifier (∀z) within whose formal scope it occurs. This will be expressed by writing (∃x) as (∃x/∀z). This independence will manifest itself in the fact that the variable z does not occur among the arguments of the term f(y1, y2, …) replacing x in the rule of functional instantiation.

This modification of first-order logic results in what I have called independence-friendly (IF) first-order logic. The correlated function calculus will be called, not surprisingly, IF function calculus. This calculus is so rich that arbitrary computations can be expressed in it. For instance, Kleene’s analysis of computability can be expressed in it. This opens an interesting possibility of approaching problems in computation theory in terms of first-order logic. For instance, it is known that if we can solve the famous P vs. NP problem for consistency questions in IF first-order logic, we can in effect solve the P vs.
NP problem in general.

Our result can be generalized. Not only is the axiom of choice dispensable as a special mathematical assumption. In a sense, the entire foundational function of set theory can be shown to be dispensable in principle. All that is needed is a suitable first-order logic.

This dispensability claim might at first sight seem unrealistic. Many, perhaps most mathematicians would agree that it would be marvelous if we could avoid the use of higher-order conceptualizations altogether. Hilbert blamed all the difficulties in the foundations of mathematics on them. Dispensing with them was Hilbert’s real ambition. The so-called Hilbert program was only one way of trying to do so. It was for the purpose of turning the axiom of choice into a first-order principle that Hilbert developed his epsilon calculus.

Hilbert’s attempt did not succeed. However, he was on the right track. Indeed, the treatment of the “axiom” of choice earlier in this essay can serve as a paradigm case of what can be done in general. What was shown there was how the axiom of choice could be turned into a first-order logical truth by improving the received first-order logic.

In general, we have already seen one way of formulating a richer logic than the received first-order logic. It is to allow more flexible ways of expressing relations of dependence and independence between quantifiers, which in effect are relations of dependence and independence of the variables occurring in the quantifiers. The result is what I have called IF first-order logic. However, IF logic is not yet strong enough to capture set-theoretical or other higher-order reasoning. It can be enriched, however, while remaining on the first-order level, by introducing the contradictory negation ¬ over and above the negation ~ of IF first-order logic, which does not obey the law of excluded middle. Elsewhere, I have shown that in this way we can obtain a first-order logic that is equivalent with the entire second-order logic.
Since second-order logic is all we need to capture all the usual modes of reasoning in mathematics, this fully extended IF logic makes both higher-order logic and set theory dispensable in the codification of the principles of mathematical reasoning.

The rule of functional instantiation is rooted deeply in the semantics of first-order logic. The natural truth-condition of a first-order (quantificational) sentence S is the existence of suitable “witness individuals” that show (almost in the sense of displaying) its truth. Many such witness individuals depend on other individuals. Hence their existence in effect means the existence of the functions that yield them as their values. These functions are known as the Skolem functions of S. Now the “dummy functions” introduced in the rule of functional instantiation are essentially arbitrarily chosen representatives of Skolem functions. The natural truth condition for S is therefore the existence of the Skolem functions of S, in other words, the statement that all such dummy functions for different existential quantifiers in S exist. Hence the rule of functional instantiation is but a corollary to the natural truth conditions of first-order sentences.

But there are other objections to ZF set theory than redundancy. One can go so far as to say that it represents a misuse or at best a very dicey use of the axiomatic method. The basic reason lies in the nature of axiomatization. The purpose of an axiomatization of a theory is not to facilitate the discovery of new truths, nor is the purpose to enhance the credibility of the theorems of the system. The purpose of an axiom system is to facilitate the study of a class of structures, viz. the models of the axioms. Establishing the deductive links between axioms and theorems is only one way of doing so. Often metatheoretical results serve the same purpose, such as showing the independence of certain theorems of some specific axioms or establishing representation theorems.
In mathematical practice, an axiom system typically comprises its own model theory. And this model theory always comes with an implicit extra assumption, which is its own consistency, without which its model theory cannot be discussed.

But no matter what Zermelo’s original intentions were, ZF set theory has come to be understood as a first-order theory, that is to say, a theory whose logic is the received FO logic. Hence the models of a first-order theory like ZF set theory are structures of individuals (particulars), not structures of sets. Hence all use of first-order logic as the medium of axiomatization, being an indirect way of approaching set-theoretical structures, seems seriously misguided, and in any case it is at best a tricky and precarious enterprise. We are supposed to find out about structures of sets by studying certain structures of particular objects. If this is what it means to deal with sets as if they were individuals, such an enterprise is more easily said than done, at best.

It is known that there are difficulties in trying to derive information from the first-order models of ZF set theory concerning actual structures of sets. The so-called axioms of set theory are attempts to force the structures of particular objects that constitute the domains of the models of a first-order set theory to resemble structures of sets. It is a disconcerting fact that this situation is not always brought out of the closet and discussed systematically. It appears that initially the problems were thought to manifest themselves only among very large sets. In reality, they are much more endemic than that.

Among other things, these problems make it virtually impossible to study the model theory of set theory in set theory itself, even though set theory is often thought of as the medium of choice of all model theory.
For example, not all expressions of truth conditions for set-theoretical sentences can be true in any model of first-order set theory, although those expressions are themselves set-theoretical statements. For instance, if they were, and the relevant set theory were rich enough to allow the formulation of its own syntax (as ZF set theory in fact is), then we could formulate a truth predicate for a set-theoretical language in the same language. Because we are dealing with first-order set theory, this is impossible in virtue of Tarski’s impossibility theorem. But without a truth definition of a theory it is not possible to discuss its model theory. Hence it is hopeless to try to do model theory for first-order axiomatic set theory in terms of that set theory.

Moreover, the natural truth condition of a set-theoretical proposition was seen to be the existence of its Skolem functions. Now in its most general form AC guarantees the existence of those functions. Hence the unrestricted form of AC would make possible a truth definition of systems like ZF. Since this is impossible, adding the unrestricted form of AC to ZF would make it inconsistent.

There is yet another respect in which a first-order axiomatic set theory does not operate in the same way as a normal mathematical axiomatization. Normally, an axiom system is calculated to capture a class of structures as its models with an eye on investigating their features. But set theory is “a theory of everything”. The set-theoretical universe is supposed to include all sets. What sense does it make to try to find the characteristic features of everything?

What this result means is something much more important than the impossibility of capturing the full force of the idea behind the axiom of choice in ZF set theory. Since the concept of truth is the key of all model theory, what has been found shows that we cannot study the model theory of ZF set theory simply in terms of set theory itself.
Instead of being the universal medium of model theory, as some logicians and philosophers seem to think, ZF set theory cannot ever deal with its own model theory.

What does all this suggest about the future of set theory? Two recommendations seem reasonably clear. First, it should be acknowledged that the two parts of the current job description of set theory cannot reasonably be combined. A first-order axiomatic theory of a set-theoretical universe just cannot also capture the principles of reasoning used in logic.

This immediately leads to a second suggestion. It is that set theory should return to Cantor’s success and become a model theory of set-theoretical structures, prominently including infinite cardinals and infinite ordinals. Now this recommendation might seem extremely difficult to implement. I have myself repeatedly faced the skeptical objection: Where can you find valid set-theoretical assumptions that go beyond current axiomatizations of set theory? What would such exotic assumptions look like?

The answer is that the assumptions needed need not be exotic at all. For one thing, in all model theory you assume as a matter of course the existence of the models you are studying. This means assuming the consistency of the theory whose models they are. And since axiom systems like ZF set theory use conventional first-order logic, their consistency cannot be proved in the theory itself.

It is here that IF logic and its extensions offer essential advantages over the received first-order logic, while still remaining on the first-order level with all of its theoretical advantages. For one thing, it is no longer impossible to prove the consistency of a first-order theory in the same theory. Indeed, if elementary arithmetic uses IF logic instead of the received first-order logic, its consistency can easily be proved in the same arithmetic in an elementary fashion. In principle, the situation should be similar in set theory.
In general, the use of IF logic gives opportunities of cultivating the model theory of set theory without any new set-theoretical or other mathematical assumptions. For instance, there are no reasons to think that the continuum problem should be impossible to solve by means of this approach.

I suggested that fully extended IF logic could in principle replace set theory. Perhaps a better formulation would be to say that set theory can be assimilated into fully extended IF logic. This logic hence seems to offer definite advantages. But can we really make progress in this way? Ultimately, the future will have to answer this question. However, in any case, the natural formulation of many set-theoretical problems is not a deductive one, but model-theoretical. A case in point is the continuum problem. It is a model-theoretical question concerning the set-theoretical structure known as the second number class, that is, the class of all countably infinite ordinals. Its cardinality is known to be the second infinite cardinal. Hence the continuum problem is equivalent to the model-theoretical question whether the cardinality of the second number class equals that of the continuum.

I have been thinking about the continuum problem in this form. I am not in a position to announce any definitive results, but I have become convinced that a positive solution can be found in this direction. I can in fact indicate the intuitive reasons for this belief. It is not hard to show that the overall structure of the second number class allows for an everywhere branching classification of ordinals. This is because any two conditions on ordinals that do not define a specific ordinal can be satisfied by a pair of ordinals in either order. (This is because definable ordinals form an initial sequence of countable ordinals.) Hence any finite number of conditions on ordinals can be realized in any order.
The cardinality of different possible kinds of ordinals is hence that of the continuum. They are all satisfied, basically because the second number class comprises all countable ordinals.

Should this line of thought succeed, it would provide a powerful example of what can be done in the core ideas of set theory once it is liberated from the fetters of the first-order axiomatic approach. But even without this particular result, I believe that I have shown to you where the future of set theory is to be found. Maybe in the long run Lindelöf turns out to have been right.