The Magical Origin of the Natural Numbers

The Magical Origin of the Natural Numbers Paul Corazza Department of Mathematics Maharishi University of Management Presented at the Annual Faculty Research Seminar April 26, 2013, Maharishi University of Management, Fairfield, IA DRAFT Version 6: September 12, 2015 Abstract: The usual treatment of the set N of natural numbers in foundational theories, particularly ZFC, is simply to assert axiomatically that “N exists,” but not to provide a perspective about how the sequence of natural numbers might originate. We review the perspective of the ancient wisdom, particularly Maharishi Vedic Science, which considers knowledge to be fundamentally incomplete when discrete values are understood only in terms of their separateness and not in terms of a unified source. We argue that the usual—incomplete—way of treating the natural numbers in set theory, in the form of the Axiom of Infinity, has made certain modern-day problems about the infinite in mathematics more difficult to solve, particularly the well-known Problem of Large Cardinals. We suggest in this paper a way to reformulate the Axiom of Infinity so that the natural numbers are seen to emerge as precipitations or side-effects of more fundamental self-referral dynamics of an unbounded field. We observe that this New Axiom of Infinity has characteristics that naturally generalize to stronger forms, which can be used—and, in fact, have been used—to solve the Problem of Large Cardinals. 1 Contents 1 The Role of N in Mathematics 4 2 The Maharishi Vedic Science Perspective 4 3 The Origin of N According to Modern Mathematics 6 4 A Plan for a New Axiom of Infinity 8 5 Dedekind-Infinite Sets and Dedekind Self-Maps 9 6 A New Axiom of Infinity 15 7 Obtaining the Well-Ordered Set W 16 8 Generating the Natural Numbers with the Mostowski Collapse 24 9 The Self-Map s : ω → ω As the Smallest Among All Dedekind Self-Maps 31 10 Co-Dedekind Self-Maps, Dedekind Maps, and co-Dedekind Maps 46 10.1 Dedekind Maps and co-Dedekind Maps . . . . . . . . . . . . . . . . . . . . . . . . 50 11 Blueprints Generated by a Dedekind Self-Map 56 12 A First Look at Dedekind Self-Maps of the Universe 59 13 The Classes ON and V 62 14 The Class of Natural Numbers 65 15 Strengthening Proper Class Dedekind Self-Maps 68 16 The Relationship Between the Different Notions of Critical Point 87 17 Class Dedekind Self-Maps and Functors 90 18 Emergence of the Infinite in the Universe 105 19 The “Naturalness” of the New Axiom of Infinity 109 20 Tools for Generalizing to a Context for Large Cardinals 118 20.1 The structure of a Dedekind self-map . . . . . . . . . . . . . . . . . . . . . . . . . . 119 21 Introduction to Large Cardinals 124 22 Some Strengthenings of Dedekind Self-Maps to Account for Large Cardinals 127 2 23 The Theory ZFC + BTEE + MUA 133 24 The Theory ZFC + WA 24.1 Restrictions of a WA-Embedding and Its Critical Sequence . . . . . . . . . . . . 24.2 A Connection Between the Critical Sequence of a WA-Embedding and Mathematical Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24.3 The WA Blueprint and the Ancient View of Blueprints . . . . . . . . . . . . . . . 146 . 150 . 154 . 170 25 Conclusion 178 26 Appendix I: A First Look at How 1 Emerges from 0 185 27 Appendix II: Unfoldment of the Universe Within the Unmanifest 190 3 The set N = {0, 1, 2, 3, . . .} of natural numbers has always been recognized as playing a vital role in the development of mathematics. Historically, though, mathematics has not felt the need to examine the more philosophical question of the origin of the natural numbers—what might be their source?—though, of course, the fundamental axioms (the ZFC axioms) do provide some information about it since the natural numbers, like the rest of mathematics, are derived from ZFC. In this paper we will offer an alternative formulation of the ZFC axiom that declares the existence of an infinite set, the Axiom of Infinity. Our new Axiom of Infinity, though mathematically equivalent to the original version, views the set of natural numbers not as simply “given,” as is usually assumed, but rather as emerging from self-referral dynamics of an unbounded field. We will argue that this alternative perspective provides advantages and insight in addressing issues having to do with the mathematical infinite—most importantly, the century-old Problem of Large Cardinals. 1 The Role of N in Mathematics Prior to the emergence of the ZFC axioms as the standard foundation for mathematics, mathematicians, at least in an intuitive sense, viewed the natural numbers as the source of all of mathematics. Nineteenth century German mathematician Leopold Kronecker made the oftenquoted statement [37], “God made natural numbers; all else is the work of man.” In an attempt to identify a “starting point” for establishing results about the natural numbers, Italian mathematician Giuseppe Peano, building upon earlier work of Peirce and Dedekind, formulated a set of axioms for the natural numbers and their fundamental properties; his axioms are known today as Peano Arithmetic, or PA. With the advent of ZFC, the prominent role of N in mathematics appeared to fade somewhat, since now both N and each particular natural number were to be seen as special kinds of sets, built from the ZFC axioms. However, it was observed in subsequent decades that ZFC is equivalent to Peano Arithmetic together with the statement that all the natural numbers can be collected together into a single completed set: ZFC ⇔ PA + the natural numbers 1, 2, 3, . . . form a set.1 One may therefore conclude that, even from the perspective of modern mathematics, all of mathematics can be derived from the natural numbers and the knowledge that the natural numbers form an infinite set. 2 The Maharishi Vedic Science Perspective In 1994–95, Maharishi gave a series of lectures on his Vedic Mathematics and Absolute Number. In one such lecture (which the author had the privilege of attending), he made the following 1 Recent work [15] in the model theory of PA suggests that a more accurate formulation of the equivalence here is that PA is equivalent to the theory ZFC − Infinity + ¬Infinity + Trans, where Trans asserts that every set is included in a transitive set. (A set X is transitive if, for all y, z, if y ∈ X and z ∈ y, then z ∈ X.) 4 remarks:2 All of creation comes from the natural numbers, and, in the same lecture: To view the natural numbers only as separate quantities, unconnected to their source, is the beginning of ignorance.3 Certainly the view that Maharishi warns against in the second of these quotes—that the natural numbers 0, 1, 2, . . . are nothing more than conceptual devices, without a common source, to serve the practical need of counting items in the world around us—is the common view, the view that one learns in school, and a view that one would have little occasion to reflect upon or call into question. To remedy this incomplete view of the natural numbers, Maharishi introduced his Absolute Number, the ultimate wholeness underlying all numbers and all creation [25, pp. 625–626]—the field of “infinite dynamism on the ground of infinite silence” [25, p. 620]. Appreciating each number in terms of its source in the Absolute Number, as Maharishi explains, restores the full dignity of each number as both an individual expression with its own unique characteristics, and as an expression of infinite wholeness. He says [25, pp. 614-15]: The ever-expanding value of the universe, in terms of an infinity of numbers, is the natural characteristic feature of the Absolute Number, which enables all numbers to function from their common basis. It is this effect of the Absolute Number on all numbers that actually initiates and maintains order in the ever-evolving infinite diversity of the universe. He also explains [25, p. 633] Without the support of the Absolute Number the study of modern mathematics can never account for the total reality—which is infinite—because it [mathematics] proceeds in steps. This is the time for the mathematics of finite number, of finite steps, to peep into the reality of the Absolute, and this Absolute Number represents complete knowledge, the Veda. When the natural numbers are able to realize their “full potential” on the ground of the Absolute Number, one may wonder, in what way would they be different from the natural numbers as they are commonly understood today? Maharishi has elaborated on the significance of several natural numbers, indicating that their true nature is more of a universal principle than mere distinct quantities. He describes the number 1 as unity, an eternal continuum [25, p. 613], from 2 These statement are quotes taken from my notes from the lecture and do not, as far as we know, appear in any of the Maharishi’s published works; however, similar statements do appear in his Absolute Theory of Defence [25]. 3 Elaborating further on the loss to life that occurs when individuals are disconnected from their source, Maharishi remarks, “This segregation of the individual from the cosmos is very unnatural. . . When the connectedness of individual life with Cosmic Life is damaged, indvidual intelligence remains disconnected from its own cosmic value. It remains like a bud without flowering” [28, pp. 200–201]. 5 which all the other natural numbers emerge. He has described the number 2 as the view of wholeness in which wholeness assumes the role of subject and object, infinite silence and infinite dynamism, intelligence and existence, Purusha and Prakriti [25, p. 630]. He has described the number 3 as the fundamental structure of unity, in terms of Rishi, Devata, and Chhandas [25, p. 630]. In these examples we see that each number has its own character but at the same time gives expression and remains connected to fullness. It is this more expanded view of the nature of the natural numbers, it would seem, that makes it possible for them to truly give rise to everything in the universe. The view that the natural numbers have a deeper, universal significance has been expressed in many ancient traditions of knowledge. In the West, Pythagoras and his school maintained that all things in the universe are, fundamentally, natural numbers. “All is number” is an expression attributed to this school.4 Pythagoras also maintained [13, p. 137] that at the basis of all natural numbers is a “Number of numbers,” an ultimate source of all numbers, something Divine in nature. Likewise, the Tao Te Ching, one of the deepest treasures of China’s ancient wisdom, sees unity, one, giving rise to two, two giving rise to three, and three giving rise to all things [16, v. 42]); and the source of all number is Tao, wholeness. There is even an admonition in the Tao Te Ching to appreciate all individual expressions in terms of their source—that the role of Tao in relation to the natural numbers, and to all particular expressions in the universe, is similar to that of Maharishi’s Absolute Number.5 3 The Origin of N According to Modern Mathematics It is natural to wonder to what extent the common view of the natural numbers is truly the mathematical view. All mathematics arises from set theory, so we can look to the axioms of set theory to see to what extent the commonly held restricted view of the natural numbers is present even in the foundation of mathematics. Among the axioms of ZFC6 (Zermelo-Fraenkel set theory with the Axiom of Choice), there 4 5 A discussion of the Pythagorean school can be found in [5, Part I]. In verse 52 [16], we read: The beginning of the universe is the mother of all things. Knowing the mother, one also knows the sons. Knowing the sons, yet remaining in touch with the mother, brings freedom from the fear of death. The passage emphasizes the importance of individual expressions remaining connected to—and being appreciated in terms of—their source. 6 In this paper, we will often switch between the theories ZFC and ZFC without the Axiom of Infinity: ZFC − Infinity. In the latter case, the theory that actually is needed (in light of footnote on p. 4) is ZFC − Infinity + Trans. To avoid becoming too involved in fine points, in this paper we simply assume that Trans is one of the standard axioms of ZFC. With this assumption, ZFC − Infinity automatically also includes Trans. For reference, we list our version of the ZFC axioms here. A functional formula is a formula φ(x, y, z1 , . . . , zk ) with the property that, for any sets a1 , . . . , ak , whenever t, u1 , u2 are sets and both φ(t, u1 , a1 , . . . , ak ) and φ(t, u2 , a1 , . . . , ak ) hold, then u1 = u2 . (Empty Set) There is a set with no element. (Axiom of Infinity) There is an inductive set. (Axiom of Extensionality) Two sets are equal if and only if they have the same elements. (Pairing Axiom) For any sets x, y, the collection {x, y} is also a set. More precisely, for all x, y, there 6 is just one axiom that talks about infinite sets; this axiom is called the Axiom of Infinity. Historically, this axiom was considered to be essential because of the work of Cantor, who showed that a rigorous formulation of a number system as fundamental as the real number line would not be possible without the notion of infinite sets—in particular, it is necessary to conceive of the natural numbers 0, 1, 2, 3, . . . as elements of a single, completed set, which we denote, in modern times, by N [17]. In coming up with a precise statement of the Axiom of Infinity, the early founders of set theory had to decide how to express the idea that “an infinite set exists” or “there is a set whose elements are the natural numbers” in the language of set theory. In the language of set theory, everything is taken to be a set, but at the time the axioms were being formulated, the natural numbers themselves were not usually thought of in this way. To meet the need of representing natural numbers as sets and asserting that there is a set that contains all natural numbers, the early crafters of the axioms settled on the notion of an inductive set [19, 18]. Therefore, the Axiom of Infinity as it is known today asserts the existence of an inductive set. A set S is inductive if it satisfies two properties: (1) S contains the empty set ∅, and (2) for any set x belonging to S, the set s(x) = x ∪ {x} also belongs to S. The natural numbers are then defined to be precisely those sets that belong to all inductive sets; the set of all natural numbers is denoted by the Greek letter ω (“omega”). This definition established ω, the set of natural numbers, as the smallest inductive set. This approach to the natural numbers provides is z such that the only elements of z are x and y. (Union Axiom) The union of any set of sets is again a set. More precisely, for all x, there is a z such that z = ∪x. (Power Set Axiom) The collection of all subsets of a set is again a set. More precisely, for all x, there is z such that, for all u, u ∈ z if and only if u ⊆ x. (Foundation) Every set has an ∈-minimal element. In other words, for every x, there is y such that for all z ∈ y, z 6∈ x. (Separation) For every formula φ(x, z1 , . . . , zk ), every set A, and all sets a1 , . . . , ak the collection {y ∈ A | φ(y, a1 , . . . , ak )} is a set. (Replacement) For any functional formula φ(x, y, z1 , . . . , zk ), any set A, and any sets a1 , . . . , ak , the collection {v | ∃u ∈ A φ(u, v, a1 , . . . , ak )} is a set. (Choice) For any set X of nonempty sets, there is another set Y containing an element of each of the elements of X. (Trans) For every set X there is a transitive set Y such that X ⊆ Y . 7 a formal way of declaring that the natural numbers have the following definition: 0 = ∅ 1 = {0} 2 = {0, 1} 3 = {0, 1, 2} · = · · = · · = · n + 1 = s(n) = n ∪ {n} = {0, 1, 2, . . ., n} · = · · · = = · · The function s is fundamental to the definition of the natural numbers; it tells us, given any number n, which number follows n in sequence. The usual way of defining s is by s(n) = n + 1, but in the context of pure sets, this definition becomes s(x) = x ∪ {x}.7 Viewing the natural numbers in this way—as simply the sets ∅, {∅}, {∅, {∅}}, . . ., or even as the “smallest inductive set”—provides no clue that there might be, even within the field of mathematics itself, an origin to the natural numbers; that the natural numbers might reasonably be seen as emerging from some source—a source that could be considered a mathematical analogue to Maharishi’s Absolute Number. 4 A Plan for a New Axiom of Infinity We will argue that failure to recognize such an origin has resulted in an unnecessarily limited view of the nature of the mathematical infinite. One of the ongoing problems in foundations of mathematics has been to discover axioms that could be added to the ZFC axioms to provide an axiomatic foundation for certain extremely large sets that have arisen in mathematical practice, known as large cardinals.8 A natural place to look for answers to questions about the “infinite” 7 Note that s, acting on any input x, has the effect of producing a new set x ∪ {x} that includes all elements of x together with one additional element, x itself. It should be observed that the set x ∪ {x} always consists of one more set than x itself because x and {x} are disjoint. They are disjoint because, in the universe of sets, no set is a member of itself—it is never the case that x ∈ x. 8 Cantor showed, at the end of the 19th century, that there are different sizes of infinite sets; every infinite set has one of these sizes. But shortly after his discovery, certain types of infinite set emerged that were so enormous, they were very difficult to classify, and, years later, it was shown that such infinities could not actually be proven to exist at all from ZFC. These notions of infinity are known today as large cardinals. What is surprising about these large cardinals is that they have appeared as key elements in the solutions of a wide range of research problems in mathematics; moreover, despite concerted efforts by many early set theorists, no one has ever proved that large cardinals do not exist. The so-called “Problem of Large Cardinals” is the problem of adding to the standard axioms of ZFC one or more axioms that could be used to derive the known large cardinals. The need here is to 8 in mathematics (such as “Which, if any, large cardinals exist—that is, which should be taken as valid objects in the universe of mathematics?”) is the Axiom of Infinity. As we have seen, the Axiom of Infinity tells us little more than that an infinite set exists, and that, in particular, there is a set ω consisting precisely of the set-versions of the natural numbers. Surprisingly, the properties of ω do already suggest the naturalness of some of the smaller large cardinals, but these properties are not nearly deep enough to motivate or justify the existence of the stronger large cardinal notions. Our aim in this paper is to take seriously the proposal that the natural numbers do indeed have a source that can be represented naturally in the axiomatic foundation of the natural numbers. We will suggest an alternative to the Axiom of Infinity that asserts, in the language of set theory, the existence of “transformational self-referral dynamics of an unbounded field”; we will see that a consequence of these dynamics is the sequential emergence of the natural numbers, but the Axiom of Infinity itself will be concerned simply with the existence of the underlying transformational dynamics. We will suggest that this new viewpoint concerning the formulation of the Axiom of Infinity provides strong motivation for a solution to the Problem of Large Cardinals. 5 Dedekind-Infinite Sets and Dedekind Self-Maps Among the many early definitions of “infinite set” that were considered, as the axioms of set theory were being formulated, a notion of infinity that did not rely on the sequence of natural numbers was Dedekind-Infinite sets, named after the mathematician, Richard Dedekind, who proposed the idea. We state the formal definition here: Definition 1 (Dedekind-Infinite Sets). A set A is Dedekind-Infinite if it can be put in 1-1 correspondence with a proper subset9 of itself. In other words, A is Dedekind-Infinite if there is a 1-1 and onto function f : A → B where B is a proper subset of A. An easy example of a Dedekind-Infinite set is A = {1, 2, 3, . . .}. Here, an example of a proper subset B of A that can be put in 1-1 correspondence with A is given by B = {2, 3, 4, . . .}, with correspondence given by f (n) = n + 1. find naturally motivated axioms that could be truly considered foundational axioms for mathematics, in the same spirit as the ZFC axioms themselves. See [6, 7, 9]. 9 A subset B of a set A is a proper subset if B 6= A. 9 Associated with every Dedekind-Infinite set A is a corresponding self-map j : A → A. For instance, if A is Dedekind-Infinite and B ⊆ A is a proper subset, and f : A → B is a 1-1 correspondence, the associated self-map j : A → A is defined just to be f itself, except the codomain10 has changed from B to A. Now, since j is a function from A to A, j is not itself a 1-1 correspondence: It is 1-1 but not onto.11 Since j is not onto, there is at least one element a in the codomain of j that is not in the range.12 Any such element a that is in the codomain but not the range will be called a critical point of j. We state a formal definition: Definition 2 (Dedekind Self-Maps). A Dedekind self-map is a 1-1 function j : A → A that has a critical point. The diagram shows an example of a Dedekind self-map with critical point 1. Here, the function f : A → B of the previous example has been replaced by j : A → A, but acts on elements of A in exactly the same way: For each n, j(n) = n + 1. Notice that the critical point of j has been circled. 10 The codomain of a function h : C → D is D; one writes cod h = D. A function h : C → D is 1-1 if, whenever x 6= y are distinct elements of C, h(x) 6= h(y) are distinct elements of D. Also, h is onto if, for every d ∈ D there is an x ∈ C such that h(x) = d. 12 The range of a function h : C → D is the set of all outputs of h: ran h = {h(x) | x ∈ C}. 11 10 A critical point a of a Dedekind self-map j : A → A plays an important role in the “internal dynamics” of A because it guarantees that the range of j, which we denote B (as in the examples), is itself a Dedekind-Infinite set, and that its restriction13 to B is in fact a Dedekind self-map with critical point j(a). We spend a moment to verify these details: Let B = j[A] and let i = j B. We show ran i ⊆ B: If b ∈ B, let x ∈ A be such that b = j(x). Since j(x) ∈ A, we have j(b) = j(j(x)) ∈ j[A] = B. It follows that i : B → B. Since i is a restriction of j, i is 1-1. We verify that j(a) is a critical point of i: If i(b) = j(a) for some b ∈ B, let x ∈ A with b = j(x). Then i(j(x)) = j(a) implies j(x) = a, which is impossible since a 6∈ ran j. It follows, therefore that i = j B : B → B is a Dedekind self-map with critical point j(a). Also, since i is 1-1, it follows that i : B → C = i[B] is a bijection; since j(a) 6∈ C, the existence of C shows that B itself is Dedekind-infinite. It is not difficult to carry the reasoning further and show that C itself is Dedekind infinite and j C : C → C is another Dedekind self-map with critical point j(j(a)). This reasoning leads to an infinite chain of Dedekind-infinite sets, Dedekind self-maps, and a sequence of critical points a, j(a), j(j(a)), . . .. This perspective is represented in the following diagram: The restriction i of a function h : S → T to a subset R of its domain S, denoted i = h R, has domain R and is defined by i(x) = h(x) for all x ∈ R. 13 11 The reader will notice that the sequence a, j(a), j(j(a)), . . . closely resembles the sequence of natural numbers. As we will show, it is natural to regard this sequence as a precursor to the “real” natural numbers. Another sequence, that represents the natural numbers more abstractly, is idA , j, j ◦ j, j ◦ j ◦ j, . . ., where idA : A → A is the identity function (that is, idA (x) = x for all x ∈ A). From this perspective—which, at this point in the discussion, has not yet been fully or rigorously developed—“natural numbers” a, j(a), j(j(a)), . . ., seem to arise as concrete “precipitations” from a self-referral flow (j : A → A), originating from the interaction between j and its critical point a. And, more abstractly, a subtler version of the natural numbers, idA , j, j ◦ j, . . ., arises simply from the interaction of j with itself; from this point of view, each natural number is seen as a self-referral loop. In addition, the dynamics expressed by j have an important characteristic: While j does in fact transform A (in the sense that j is not simply the identity function; values of A are moved), j nevertheless preserves the essential character of A, namely, that of being Dedekind-infinite, since, as we showed earlier, by virtue of the critical point of j, the image B of A under j also turns out to be Dedekind-infinite. These characteristics of a Dedekind self-map j : A → A are reminiscent of the transformational dynamics of wholeness, as described in Maharishi Vedic Science, from which the Veda, the blueprint of creation, emerges. In these dynamics, the first move of wholeness, of “A,” gives rise to a sequence of transformations from “AK” to “Agni” to “Agnimile” to the first pada, first richa, first mandala, to the entire Rik Veda [25, p. 636]. In these transformations, what is preserved is the essential fullness of “A”, even in its collapse to “K” (which Maharishi describes as “fullness of emptiness”). The particulars of manifest existence, like the natural numbers themselves, emerge as a side-effect of the unmanifest dynamics of wholeness. Moreover, Maharishi has described the ultimate nature of these particulars as “self-referral loops” within pure consciousness: “The evolution of consciousness into its object-referral expressions . . . progresses in self-referral 12 loops—every step of progress is in terms of a self-referral loop” [28, pp. 63–64]. When the “natural numbers” are seen to be represented by idA , j, j ◦ j, j ◦ j ◦ j, . . ., it is possible to see that each successive term of the sequence is an elaboration of the previous term. This viewpoint is expressed in the following diagrams, showing how the move to each successive term elaborates the previous term. Corresponding to the number 0, and by analogy, the first letter A of Rik Veda, we have the identity map idA : A → A that does nothing; it represents complete silence: id A A −→ A. Corresponding to the number 1, and by analogy, the first syllable AK of Rik Veda, we have the fundamental transformation given by a Dedekind self-map j : A → A, including the “collapse” of A to the critical point a. j A - @ A f@ inc R @ B In the diagram, f : A → B is a 1-1 correspondence from A to the proper subset B of A, and inc : B → A is a function that behaves like the identity map in that inc(x) = x for all x ∈ B; the only the difference between the identity map on B and the function inc is that the codomain of inc is A rather than B. The function inc is called the inclusion map from B to A; when necessary, its domain and codomains are indicated as well: inc = incB,A. The diagram as a whole—known as a commutative diagram—indicates that, following the behavior of the map across the top, namely, j, produces the same result as following the two lower arrows: f followed by inc, denoted inc ◦ f . Thus, for any x ∈ A, the value of j(x) is the same as inc(f (x)), and it is easy to verify the truth of this. The diagram shows that j is formed in two steps: The first step is the 1-1 correspondence between A and the set B, given by f ; the second step is the assertion that B itself is in fact a proper subset, expressed by the fact that inc(x) = x for all x ∈ B and that inc is not onto. In this diagram, the self-map does something; there is a critical point; elements of A are moved; in this way, the realm of pure silence has been enlivened to include considerable activity. Corresponding to the number 2, and by analogy, to the fuller elaboration of AK given by the first word of Rik Veda, Agnim, is the composition map j ◦ j. A j ◦j - 6 f ? B A inc j B - B Recall that if j : A → A is a Dedekind self-map with critical point a and range B, then j B : B → B is a Dedekind self-map with critical point j(a). The diagram shows that the 13 self-map j ◦ j : A → A is obtained by first applying f , then j B, and finally the inclusion map inc from B to A. One can view this suite of maps as a further elaboration of j itself. The visual impact of the diagram itself suggests that the first diagram is being expanded: The identity map idB : B → B that is only implicit in the first diagram is replaced in the second diagram by the Dedekind self-map j B. As our last example, the number 3, analogous to the next packet of expression in the Veda— the first pada, consisting of 8 syllables—corresponds to j ◦ j ◦ j. j ◦ j ◦ j- A 6 inc f ? B B,A (j B) ◦ (j -B) C B 6 inc f B ? A C,B j C - C Here, incB,A : B → A is the inclusion map from B to A, while incC,B is the inclusion map from C to B. Again, the diagram illustrates the more fully elaborated transformational dynamics as j ◦ j moves to j ◦ j ◦ j. The dynamics seen in the diagram for j ◦ j are now recapitulated in the lower square of the diagram for j ◦ j ◦ j, but the diagram as a whole tells a fuller story about how j acts on A in various ways. In this way, we see that the view that takes idA , j, j ◦ j, . . . as the blueprint of the natural numbers illustrates how the emergence of the natural numbers, at its foundation, consists of successive elaborations of the dynamics inherent in j : A → A. One final observation about this view of the natural numbers as arising from a sequence of Dedekind self-maps is that this sequence itself originates from a higher-order Dedekind self-map in the following way: Let j : A → A be a Dedekind self-map with critical point a, and let AA = {h | h is a self-map with domain A}.14 Note that the identity map idA : A → A is one of the elements of AA . Define a self-map Jj : AA → AA by Jj (h) = j ◦ h. Notice that idA is not in the range of Jj and that Jj is 1-1: For h, k ∈ AA we have Jj (h) 6= Jj (k) ⇒ j ◦ h = j ◦ k ⇒ h = k, since 1-1 functions are left-cancellable. We have shown that Jj : AA → AA is a Dedekind self-map with critical point idA . Now we can observe that, in the same way as a, j(a), j(j(a)), . . . are obtained by repeated applications of j to its critical point a, so likewise idA , j, j ◦ j, . . . is obtained by repeated applications of Jj to its critical point idA . We will establish in a moment the way in which both of these sequences can be considered blueprints for the natural numbers. 14 In general, if X, Y are sets, Y X denotes the set of all maps from X to Y . 14 6 A New Axiom of Infinity We have seen that the existence of a Dedekind self-map gives rise to the natural numbers by considering iterations of the map; in fact, as we have also seen, there are several ways to generate natural numbers by iteration of such a map. Each of these approaches presents the natural numbers as emerging from self-referral dynamics of an “unbounded” field—the degree of “unboundedness” of the underlying field increases as we consider more and more abstract versions of the self-map. And we have suggested that deriving the natural numbers in this way has the advantage that it provides an insight that can be applied to solve problems that require deeper intuition about the “true nature” of the mathematical infinite—our primary example of this phenomenon is the Problem of Large Cardinals. With these points in mind, we propose a new version of the Axiom of Infinity: New Axiom of Infinity. There is a Dedekind self-map. Stating the Axiom of Infinity in this way really causes a shift in viewpoint—a shift away from the idea that “infinite” means a vast collection of discrete objects and a shift in the direction of the insight, mentioned above, that the reality of the “infinite” is self-referral dynamics of an unbounded field, and that large collections of discrete object should be thought of as emerging from these dynamics, as a consequence or effect of the dynamics of the field. It is well-known to set theorists that, from the ZFC axioms minus the usual Axiom of Infinity, one can prove that the usual Axiom of Infinity and the New Axiom of Infinity are equivalent; using this known equivalence, we could then use the existence of an inductive set to demonstrate that the sequences a, j(a), j(j(a)), . . . and idA , j, j ◦ j, . . ., described in the previous section, do indeed form sets that each represent—in a sense that can be made precise—the set ω of natural numbers. We do not wish to rely on this standard result, however. The usual proof involves an auxiliary class of “natural numbers,” which does not necessarily form a set; the usual notions of induction and inductive definition can be shown to hold relative to this class even without the Axiom of Infinity. Roughly speaking, these “natural numbers” exist because of the fact that ZFC without the Axiom of Infinity15 is essentially the same as PA, and PA gives us the natural numbers as a class.16 Then, one can, given a Dedekind self-map j : A → A, define the class W = {a, j(a), j(j(a)), . . .} by proceeding as follows: Define a class function F on the class of natural numbers by F(0) = a F(n + 1) = j(F(n)). One may then use the Axiom of Replacement to define the set W by W = ran F ∩ A. 15 More precisely, the theory ZFC − Infinity + ¬Infinity. After we have verified that the natural numbers can be derived from a Dedekind self-map without the help of the natural numbers in any form, we will make use of this class of natural numbers. This subject is treated in Section 12. 16 15 Therefore the collection W = {a, j(a), j(j(a)), . . .} does indeed form a set, and one may derive the canonical set ω of natural numbers from W (we will carry out these last steps in detail in our approach as well). This approach to defining W , though mathematically sound, is somewhat unsatisfying since we make use of a class version of the natural numbers to obtain a set version of them. It would be more revealing to see how the dynamics of j give rise to the set of natural numbers without any reliance on the notion of a natural number at the beginning. We will therefore take a somewhat longer journey17 than is usually done to arrive at the natural numbers. We will use the following outline to guide our steps of reasoning: (A) We will define the notion of a j-inductive subset of A and we will let W denote the smallest j-inductive set. (B) We will define an order relation E on W . Roughly speaking, we will say that x E y if and only if one can obtain y from x by applying j at most fintely many times to x: y = j(j . . . (j(x)) . . .). This definition must not rely on any notion of natural number. (C) We will show that E is a well-founded partial order. (D) Using the Axiom of Choice, we will show that is a well-ordering of W . E is a total order as well, so that in fact, E These four points will set the stage for the final step, in which we derive the set of natural numbers, and the canonical successor function, by way of the Mostowski Collapsing Map; this map will produce the set of natural numbers because E is a well-ordering (and because the collapsing map outputs a transitive set), and will produce the standard successor function because of the way E is defined in terms of j; in other words, the successor s : ω → ω will be seen to be the “collapse” of the Dedekind self-map j restricted to W . 7 Obtaining the Well-Ordered Set W For this section we fix once and for all a Dedekind self-map j : A → A with critical point a. Establishing (A). We define the notion of a j-inductive set: A set B ⊆ A will be called j-inductive if (1) a ∈ B (2) whenever x ∈ B, j(x) is also in B. Notice that A itself T is j-inductive. Therefore, the set I = {B ⊆ A | B is j-inductive} is nonempty. Let W = I. Remark 1. For later reference, we observe here that W is definable from j and a. 17 The reader who does not want to take this longer journey and is satisfied with the definition of W just described, which relies on a class of natural numbers, can skip to Section 8 where the set of natural numbers is computed. 16 Lemma 1. W is a j-inductive subset of A. Proof. For (1), since a belongs to every j-inductive subset of A, a ∈ W . For (2), assume x ∈ W . Then x belongs to every j-inductive subset of A. For each such j-inductive subset B, since x ∈ B, j(x) ∈ B. Therefore j(x) ∈ W . Establishing (B). As mentioned earlier, we wish to define an order relation E on W by declaring that, for all x, y ∈ W , x E y if and only if y can be obtained by finitely many applications of j to x: y = j(j(. . .(j(x)) . . .)). The difficulty with this definition in our present context is the word “finite,” since a set is ordinarily defined to be finite if it can be put in 1-1 correspondence with one of the elements n of ω (recall that each such n is equal to its set {0, 1, 2, . . ., n − 1} of predecessors). To get around this difficulty, we will reword our requirements on E so that no mention of “finiteness” is necessary; the properties of the underlying structure W will ensure in the background that the number of applications of j that actually occur to reach from x to y whenever x E y will always be finite, but we will not need to prove this or make this fact explicit at any point in this early stage of development. In order to define E properly, we will first define a more elementary relation E0 on W : We shall say that, for all x, y ∈ W , x E0 y if and only if y = j(x). We prove some basic facts about j and E0 . For any relation R on a set X, an R-minimal element of X is an x ∈ X such that for all y ∈ X, it is not the case that y R x. Moreover, R is said to be wellfounded if every nonempty subset Y of X has an R-minimal element. A familiar example is the set N = {1, 2, 3, . . .} of natural numbers with its less than relation < (so here, R is simply <); < is wellfounded as every nonempty subset of N has a smallest element. We will not make use of this example here, of course, since the natural numbers have not yet formally been defined; the example is intended just to give some concrete sense for the new definitions. Claim 1. j has no fixed points in W ; that is, j(x) 6= x for all x ∈ W . Consequently, E0 is irreflexive. Proof. The last part follows immediately from the fact that j has no fixed points. We prove the first part: Suppose B ⊆ A is j-inductive and x ∈ B is a fixed point of j. We observe that B − {x} is also j-inductive: First, notice a 6= x since a 6∈ ran j but x = j(x) ∈ ran j. Therefore, a ∈ B − {x}. Suppose y ∈ B − {x}; we show j(y) ∈ B − {x}. Certainly j(y) ∈ B since B is j-inductive. If j(y) = x, then since j(x) = x too, we have j(x) = j(y), and by 1-1-ness of j, x = y, which is impossible. Therefore j(y) ∈ B − {x}. We have shown B − {x} is j-inductive. To prove the claim, suppose B ⊆ A is a j-inductive set that contains a fixed point x of j. As we have seen, B − {x} is j-inductive, and by leastness of W , W ⊆ B − {x}. Therefore B 6= W . This proves the claim. Claim 2. E0 is wellfounded. 17 Proof. Suppose X ⊆ W is a subset having no E0 -minimal element. Let B ⊆ W be defined by B = {x ∈ W | x 6∈ X}. We show that B is j-inductive and therefore equal to W ; the conclusion will then be that X itself is empty, as required. First we show that a ∈ B. If not, a ∈ X, then there is a b ∈ X such that b E0 a; that is, j(b) = a. But this is impossible since a 6∈ ran j. Next, assume x ∈ B; we show j(x) ∈ B. Assume for a contradiction that j(x) 6∈ B, so j(x) ∈ X. Since X has no E0 -minimal element, there is u ∈ X with j(u) = j(x). Since j is 1-1, u = x. But this is impossible because u ∈ X but x is not. This completes the proof that B is j-inductive and that W is empty. Claim 3. E0 is antisymmetric. Proof. Suppose x, y ∈ W and x E0 y. If y E0 x also, then the set {x, y} is a nonempty subset of W with no E0 -minimal element. The relation E0 is a first try at defining a prototype of the less than relation on the natural numbers. It has many of the characteristics that are needed, but it is not transitive: It is not the case that if z = j(y) and y = j(x), then z = j(x). To obtain a transitive order relation, we will expand E0 to a relation E as follows: For all x, y ∈ W , x E y if and only if there is a subset F of W satisfying the following: (i) x, y, ∈ F (ii) for some v ∈ F , y = j(v) (iii) there is no u ∈ F for which x = j(u) (iv) if u ∈ F and u 6= y, then j(u) ∈ F (v) if v ∈ F and v 6= x, there is u ∈ F such that v = j(u) The set F is said to join x to y, and is called a joining set. Intuitively, F is finite, but we do not state this in the definition, nor try to prove it (in fact, at this point, we do not even have a definition of “finite”).18 Establishing (C). We begin by proving some facts about Claim 4. 18 E0 E and joining sets. ⊆ E . That is, whenever x, y ∈ W and x E0 y, then x E y. We come back to this point later. The fact that F must always be finite is established in Theorem 13. 18 Proof. Suppose x E0 y, so y = j(x). Let F = {x, y}. We show F joins x to y. Parts (i),(ii),(iv),(v) in the definition of E are obviously true for F . We prove that (iii) also holds: We must verify that neither of the following holds: (a) j(x) = x, (b) j(y) = x. (a) fails because j has no fixed points; (b) fails because E0 is antisymmetric. We have established (iii); the result follows. Claim 5. E is irreflexive. Proof. Suppose x ∈ W . If x E x, let F join x to x. By (ii), there is u ∈ F such that x = j(u), but by (iii), there is no u ∈ F for which x = j(u). This contradiction shows that it is never the case that x E x. Claim 6. E is wellfounded. Proof. We use the same strategy as was used in the proof of Claim 2. Suppose X ⊆ W is a set with no E-minimal element. Let B = {x ∈ W | x 6∈ X}. We show B is j-inductive. We first prove that a ∈ B. Suppose, for a contradiction, that a ∈ X. Since X has no E -minimal element, there is x ∈ X with x E a. Let F ⊂ X join x to a. Then there is u ∈ F such that j(u) = a. But this is impossible since a 6∈ ran j. Next, suppose x ∈ B; we show j(x) ∈ B. Suppose not; then j(x) ∈ X and so for some u ∈ X, u E j(x). Let F join u to j(x). By property (ii) of the definition of E, there is some v ∈ F ⊆ X such that v E0 j(x), that is, j(v) = j(x). Since j is 1-1, v = x. But this is impossible since v ∈ X but x is not. We extract an observation made in the proof: Claim 7. For all x ∈ W , x E 6 a. Claim 8. E is anti-symmetric; that is, for all x, y ∈ W , if x E y then it is not the case that y E x. Proof. Suppose x, y ∈ W are such that x E y and also y E x. Then {x, y} has no element, contradicting Claim 6. E -minimal Claim 9. For all x ∈ W , if a 6= x, then a E x. Proof. Let B = {z ∈ W | a = z or a E z}. It suffices to show B is a j-inductive set. Since it is obvious that a ∈ B, it suffices to assume z ∈ B and prove j(z) ∈ B. If a = z, then certainly a E j(a), and so j(z) ∈ B. If a 6= z, then, since z ∈ B, we have a E z. Let F join a to z. Let G = F ∪ {j(z)}. We wish to show that G joins a to j(z). We verify that G satisfies (i)–(v) above. Properties (i), (ii), (iv), and (v) are immediate. For (iii), certainly no u ∈ F has the property that u E a, since F joins a to z (note (iii) already holds for F ). But j(z) 6 E a either because of Claim 7. We have shown B is j-inductive, as required. Claim 10. Suppose x, y, z ∈ W . 19 (1) If x E y, then j(x) E j(y). (2) x E j(j(x)). (3) Suppose x E y and y 6= j(x). Then j(x) E y. (4) Suppose x E j(y) and x 6= y. Then x E y. (5) Suppose x E y. Then j(y) E 6 x. (6) If x E y, then x E j(y). (7) If j(x) E y, then x E y. (8) Suppose F joins x to y, u ∈ F , and u 6= x. Then x E u. (9) Suppose F joins x to y and u ∈ F . Then y E 6 u. (10) Suppose F joins x to y, u ∈ F , and u 6= y. Then u E y. (11) If x 6= a, there is u ∈ W such that j(u) = x. Moreover, there is no v ∈ W such that u E v E x. Regarding part (11), for a given x ∈ W different from a, we will call an element u ∈ W for which j(u) = x an E -immediate predecessor of x, or, when the context makes it clear, an immediate predecessor of x. Since j is 1-1, if x has an immediate predecessor, it is unique. Part (11) says that every element of W other than a has an immediate predecessor. The notion “immediate predecessor” is to be distinguished from the unqualified “predecessor”: Whenever x E y, x is called a predecessor of y, but x is not necessarily the immediate predecessor of y. Proof of (1). Suppose F joins x to y. It is straightforward to verify that if G = {j(u) | u ∈ F }, then G joins j(x) to j(y). Proof of (2). Let B = {x ∈ W | x E j(j(x)). We show B is j-inductive. By Claim 9, a ∈ B. Assume x ∈ B; we show j(x) ∈ B. Since x ∈ B, x E j(j(x)). By (1), j(x) E j(j(j(x))). It follows that j(x) ∈ B, as required. Proof of (3). Suppose x E y and y 6= j(x). Let F join x to y. Let G = F − {x}. It is straightforward to verify that G joins j(x) to y. Proof of (4). Suppose x E j(y) and x 6= y. Let F join x to j(y). Let G = F − {j(y)}. It is straightforward to verify that G joins x to y. Proof of (5). Fix x ∈ W . Let B = {y ∈ W | if x E y then j(y) E 6 x}. We show B is j-inductive. Vacuously, a ∈ B. Assume y ∈ B; we show j(y) ∈ B. Assume x E j(y). Suppose first that x = y. By (2), y E j(j(y)). Since E is antisymmetric, then j(j(y)) E 6 y. Therefore, in this case, j(y) ∈ B. Now assume x 6= y. By (4), x E y. Since y ∈ B, it follows that j(y) E 6 x. But now if j(j(y)) E x, by (3) we would have j(y) E x, which is a contradiction. Therefore, j(j(y)) E 6 x and so j(y) ∈ B 20 in this case as well. This completes the proof that B is j-inductive. We have shown therefore that for all x, y ∈ W , if x E y, then j(y) E 6 x. Proof of (6). Suppose x E y. We show x E j(y). Let F join x to y. Let G = F ∪ {j(y)}. We show G joins x to j(y). Verification of (i),(ii), (iv), (v) in the definition of E is straightforward. We check that (iii) holds: Suppose u ∈ G; we must show j(u) 6= x. Certainly j(u) 6= x if u ∈ F (since F joins x to y). We consider the case in which u = j(y). If j(j(y)) = x, then since j(y) E j(j(y)), it follows that j(y) E x, and this contradicts (5). Therefore, in this case also, j(u) 6= x, and (iii) is established. We have shown G joins x to j(y), and so x E j(y). Proof of (7). Suppose j(x) E y. We show x E y. Let F join j(x) to y and let G = F ∪ {x}. We show G joins x to y. Verification of (i),(ii), (iv), (v) in the definition of E is straightforward. We check that (iii) holds: We must show that j(u) 6= x for any x ∈ G. First we observe that if u = j(x), j(u) 6= x: If j(j(x)) = x, then j(x) E x, violating antisymmetric property of E . Next, suppose u ∈ F is such that j(u) E x. By (6), j(u) E j(x), but this contradicts the fact that F joins j(x) to y (in particular, (iii) is violated). Therefore, no such u exists, and (iii) holds for G as well. We have shown G joins x to y, and so x E y. Proof of (8). Suppose F joins x to y, u ∈ F , and u 6= x. We show x E u. Let B = {u ∈ W | if u ∈ F and x 6= u, then x E u}. We show B is j-inductive. By definition of F , a 6∈ F , so a ∈ B. Assume u ∈ B and assume j(u) ∈ F and x 6= j(u). If x = u, then x E j(u), as required. On the other hand, if x 6= u, then since u ∈ B, x E u. It follows by (6) that x E j(u). In each case we have shown that j(u) ∈ B, and so B is j-inductive. The result follows. Proof of (9). Suppose F joins x to y and u ∈ F . We show y 6 E u. Let B = {u ∈ W | if u ∈ F , then y E 6 u}. Since for no z ∈ W is it true that z E a, we have that a ∈ B. Assume u ∈ B and j(u) ∈ F . y 6 E j(u). If u 6∈ F , then (reasoning as we did in (8)), it must be that j(u) = x. But now by antisymmetry of E , y E 6 x. On the other hand, suppose u ∈ F . Then since u ∈ B, y E 6 u. If y E j(u), then y E u (by (4)), giving a contradiction. Therefore, y E 6 j(u). We have shown j(u) ∈ B. Therefore, B is j-inductive, as required. Proof of (10). Suppose F joins x to y, u ∈ F , and u 6= y. We show u E y. Let B = {u ∈ W | if u ∈ F and y 6= u, then u E y}. We show B is j-inductive. Vacuously, a ∈ B. Assume u ∈ B and j(u) ∈ F with j(u) 6= y. We show j(u) E y. The first case is that u 6∈ F ; we claim j(u) = x. If not, for some v ∈ F , j(v) = j(u) (by property (v)), and since j is 1-1, v = u; but this is impossible since u 6∈ F but v ∈ F . Therefore, j(u) = x. But now F witnesses that j(u) E y. The other case is that u ∈ F . I claim y 6= u: If y = u, then y E j(u), but, since j(u) ∈ F , this contradicts the result established in (9). Therefore y 6= u. Since u ∈ B, u E y. By (3), j(u) E y. Therefore, j(u) ∈ B. We have shown B is j-inductive, as required. Proof of (11). Since a E x, let F join a to x. By definition of “join”, there must be a u ∈ F for which j(u) = x. If there is a v ∈ W such that u E v E x, let G join v to x. By definition of “join” again, there is a z ∈ G such that j(z) = x. By the antisymmetric property of E0 , z 6= x. Clearly, either v = z or v E z. It follows that u E z. By irreflexivity, u 6= z. However, j(u) = x = j(z), and 21 so, by 1-1-ness of j, we must have u = z. This is a contradiction and shows that no such v exists. Claim 11. Suppose x, y, z ∈ W are distinct. Then if F joins x to y and G joins y to z, then F ∪ G joins x to z. In particular, E is transitive. Proof. Verification of properties (i), (ii), (iv), (v) for F ∪ G is straightforward. For (iii), suppose u ∈ F ∪ G is such that j(u) = x. Since F joins x to y, it follows that u ∈ G and u 6∈ F . In particular, u 6= y. By Claim 10(8), y E u. By Claim 10(6), y E j(u), and so y E x. But this violates the antisymmetric property of E (since x E y by assumption). Therefore, for no u ∈ F ∪ G is it the case that j(u) = x. Therefore, we have established that all properties (i)–(v) hold for F ∪ G, as required. Theorem 2. (W, E) is a wellfounded partial order. Proof. This follows from Claims 5, 6, 8, and 11. Establishing (D). We are now ready to show that E is a well-ordering. For this purpose we will use the Axiom of Choice in its following equivalent form: Hausdorff Maximal Principle. Every partially ordered set contains a maximal totally ordered subset. Theorem 3. W is well-ordered under E . Proof. It suffices to show that E is a total order on W. We have shown that E is a partial order. Applying the Hausdorff Maximal Principle, let W0 ⊆ W be a maximal totally ordered subset of W , with respect to E . We show that W0 is j-inductive and therefore equal to W . By Claim 9, a E x for all x ∈ W for which a 6= x. Therefore, if a 6∈ W0 , W0 ∪ {a} is still totally ordered but W0 is a proper subset of W0 ∪ {a}, and this violates the maximality of W0 as a totally ordered subset of W . Thus, a ∈ W0 . Next, suppose x ∈ W0 ; we show j(x) ∈ W0 . Notice x is not an upper bound of W0 since, if it were, W0 ∪ {j(x)} would also be a total order, properly containing W0 . Therefore, there is y ∈ W0 such that x E y. Since W0 is totally ordered and wellfounded, it is well-ordered and the set {z ∈ W0 | x E z} has a least element z0 . Let F join x to z0 . If z0 6= j(x), if follows from Claims 4 and 10(3) that x E j(x) E z0 . I claim that W0 ∪ {j(x)} is a totally ordered subset of W that properly contains W0 : We must show that j(x) is related to every element of W0 . Let z ∈ W0 . Then because W0 is totally ordered and because of leastness of z0 , either z E x, z = x, z = z0 , or z0 E z. If z E x, by transitivity of E , z E j(x). If z = x, clearly again, z E j(x). If z = z0 , as we have seen, j(x) E z0 . And if z0 E z, then by transitivity of E, j(x) E z. We have shown that W0 ∪ {j(x)} is a totally ordered subset of W that properly contains W0 , contradicting maximality of W0 . Therefore, the least element z0 in W0 for which x E z0 must be j(x); in particular, j(x) ∈ W0 . This completes the proof that W0 is j-inductive, and that W0 = W . Hence (W, E) is a well-ordered set. The proof shows one additional useful fact, which we extract in the form of a Corollary: 22 Corollary 4. Let x ∈ W . Then j(x) is the E -least element of {z ∈ W | x E z}. Therefore, we may list the elements of W by a E j(a) E j(j(a)) E . . .. After we define the natural numbers, we will be able to establish more formally that W = {a, j(a), j(j(a)), . . .}.19 For future use, for each x ∈ W , let Wx = {z ∈ W | z E x} and let W x = {z ∈ W | x E z}. Thus, for each x ∈ W , W = Wx ∪ {x} ∪ W x . Corollary 5. Suppose x, y ∈ W and x 6= y. Then Wx 6= Wy . Proof. Since x ∈ Wy − Wx . E is a total order, we may assume, without loss of generality, then x E y. Then Because E has the property given by Corollary 5, one says that use, we compute Wx for a few values x ∈ W : E is extensional. For later Corollary 6. (1) Wa = ∅. (2) Wj(a) = {a}. (3) Wj(j(a)) = {a, j(a)}. (4) Wj(j(j(a))) = {a, j(a), j(j(a))}. Proof of (1). This follows from Claim 7. Proof of (2). Recall that by Claim 10(4), if x E j(a), either x E a or x = a. Thus, Wj(a) = {x ∈ W | x E j(a)} = {a}. Proof of (3). By Claim 10(4) again, if x E j(j(a)), then either x E j(a) or x = j(a), and the only x satisfying the first of these is a itself. Therefore Wj(j(a)) = {x ∈ W | x E j(j(a))} = {a, j(a)}. Proof of (4). By Claim 10(4) again, if x E j(j(j(a))), then either x E j(j(a)) or x = j(j(a)). We have seen in (3) that the x for which x E j(j(a)) are a and j(a). Therefore, Wj(j(j(a))) = {x ∈ W | x E j(j(j(a)))} = {a, j(a), j(j(a))}. 19 This is done in Theorem 16. 23 8 Generating the Natural Numbers with the Mostowski Collapse Before describing the Mostowski Collapsing Map, we need to introduce one other concept. A set T is said to be transitive if, whenever z ∈ T and y ∈ z, we have y ∈ T . It is not hard to verify that ω = {0, 1, 2, . . .} together with each of its elements, is transitive. Transitive sets are the “well-behaved” sets in the universe. We define a function π on W , which is known as the Mostowski Collapsing Map and which can be defined more generally on any well-founded, extensional relation defined on a set. The function π, defined on W , will turn out to be a bijection and has the effect of identifying its domain with a transitive set N whose membership relation ∈ behaves exactly like the relation E on W . The effect of the Mostowski Collapsing Map is to transform a set and its internal relationships into a “concrete, well-behaved” set that resides at the very bottom of the universe but that still exhibits the same relationships as before. As we will see, π will collapse the elements of W to corresponding elements of the set ω = {0, 1, 2, 3, . . .}. From our initial assessment of W , we would expect that a corresponds to 0, j(a) to 1, j(j(a)) to 2, and so forth. Theorem 7 (Mostowski Collapsing Theorem for W ). There is a unique function π defined on W that satisfies the following relation, for every x ∈ W : π(x) = {π(y) | y E x}. (∗) Proof. Let B ⊆ W be defined by putting z ∈ B if and only if the formula ψ(z) ≡ ∃!g φ(z, g) holds, where “∃!” means “there exists exactly one” and φ(z, g) ⇔ g is defined on Wz ∪ {z} so that it satisfies, for all x ∈ Wz ∪ {z} g(x) = {g(y) | y E x} Whenever there exists a g such that φ(z, g), we say that g is a witness for ψ(z). When such a g defined on Wz ∪ {z} exists, it will typically be denoted πz . We will show that B is j-inductive, and then, from B, obtain the Mostowski Collapsing map. We first observe that a ∈ B: Since there is no y ∈ W for which y E a, {πa (y) | y E a} must be empty, no matter how πa is defined on “predecessors” of a. Therefore, there is one and only one function πa with domain Wa ∪ {a} = {a} that satisfies πa (a) = {πa(y) | y E a}, and that is the function for which πa(a) = ∅. We have shown that ψ(a) holds, so a ∈ B. Now assume z ∈ B and let πz be the unique map defined on Wz ∪ {z} that is a witness for ψ(z). We prove j(z) ∈ B. We define πj(z) on Wj(z) ∪ {j(z)} by ( πz (x) if x E j(z) πj(z) (x) = {πz (y) | y E j(z)} if x = j(z) Notice that if y E j(z), then either y = z or y E z (by Claim 10(4)). Therefore, defining πj(z) (x) to be {πz (y) | y E j(z)} when x = j(z) makes sense. We verify that πj(z) is a witness for ψ(j(z)): If x E j(z), then πj(z) (x) = πz (x) = {πz (y) | y E x} = {πj(z) (y) | y E x}. 24 The last line follows because, by definition of πj(z) , πj(z) agrees with πz on all y for which y E x since in that case y E j(z). On the other hand, if x = j(z), then πj(z) (x) = {πz (y) | y E j(z)} = {πj(z) (y) | y E j(z)} Once again, by definition of πj(z) , πj(z) agrees with πz on y for which y E j(z), so the second equality in the display follows from the first. This shows that a witness πj(z) for ψ(j(z)) exists; we need to show it is unique. Assume f is defined on Wj(z) ∪ {j(z)} is also a witness for ψ(j(z)); in particular, that f satisfies f (x) = {f (y) | y E x}. It is not hard to check that f Wz is a witness for ψ(z), so by uniqueness of πz as a witness for ψ(z), f Wz = πz . By this observation, we have f (j(z)) = {f (y) | y E j(z)} = {πz (y) | y E j(z)} = {πj(z) (y) | y E j(z)} = πj(z) (j(z)). Hence f = πj(z) and we have established uniqueness. It follows that j(z) ∈ B. We have shown B is j-inductive, and so W = B. We now define the Mostowski Collapsing map π on W as follows: For each x ∈ W , π(x) = πx (x). Claim. For all y ∈ W , (+) if x E y, then πx (x) = πy (x). Proof. Let B = {y ∈ W | if x E y then πx (x) = πy (x)}. We show B is j-inductive. Vacuously, a ∈ B. Suppose x ∈ B; we show j(x) ∈ B. Let y be such that j(x) E y. Then, using the fact (twice) that x ∈ B, we have πy x = {πy (u) | u E x} = {πx (u) | u E x} = {πj(x)(u) | u E x} = πj(x)(x). Therefore j(x) ∈ B. Since B is j-inductive, B = W and the result follows. We show that π satisfies (∗) for each x ∈ W . Using (+), we have: π(x) = πx (x) = {πx (y) | y E x} = {π(y) | y E x}, as required. Finally, we show that π is the unique f satisfying, for all x ∈ W , f (x) = {f (y) | y E x}. Given any such f , we show f = π. Let B = {x ∈ W | for all y such that y = x or y E x, f (y) = π(y)}. We show B is j-inductive. The fact that a ∈ B is immediate; in particular, π(a) = ∅ = f (a). Assume x ∈ B, so that f (y) = π(y) for all y for which y = x or y E x. Then since y E j(x) implies y = x or y E x (as we observed earlier), f (j(x)) = {f (y) | y E j(x)} = {π(y) | y E j(x)} = π(j(x)). 25 To show j(x) ∈ B, we must also show that for u E j(x), f (u) = π(u), but this follows from the fact that x ∈ B. We have shown j(x) ∈ B. Therefore B is j-inductive and B = W . It follows that f = π, as required. As we establish properties of the collapsing map π, we will denote its range by N . We establish some important properties of π and N : Theorem 8 (Properties of π and N ). (1) N is a transitive set. (2) π is 1-1. (3) For all x, y ∈ W , x E y if and only if π(x) ∈ π(y). (4) (N, ∈) is a well-ordered set. In particular, 0 = ∅ is the ∈-least element of N . (5) Each n ∈ N is a transitive set. Moreover, n = {m ∈ N | m ∈ n}. Remark 2. Part (3) of the theorem tells us that the usual membership relation ∈ on the range N of π exactly parallels the relation E on W : Whenever x is an E-predecessor of y in W , in the sense of the relation E , the images π(x), π(y) have the same relationship to each other, namely, that π(x) is an ∈-predecessor of π(y); the converse statement also holds. This property, together with the fact that π is a bijection from W to N , tells us that π is an order isomorphism from W to N ; this means that (N, ∈) is an exact reproduction of (W, E) Intuitively, we may think of this relationship as indicating that the “blueprint” W has been faithfully reproduced as a concrete (transitive) object in the bottom portion of the universe. As we shall show in a moment, N turns out to be the concrete set ω of natural numbers. Part (5) tells us that each n ∈ N consists precisely of the elements of N that precede it in the well-ordering ∈. Thus, 1 = {0}, 2 = {0, 1}, and so forth. Proof of (1). Suppose π(x) ∈ N and u ∈ π(x). We must show u ∈ N . Since π(x) = {π(y) | y E x}, it follows that u = π(y) for some y ∈ W . Thus u ∈ N . Proof of (2). Suppose π(x) = π(z) but x 6= z. Recall that Wx 6= Wz . Without loss of generality, assume there is u ∈ Wx − Wz , so u E x and u 6E z. Then π(u) ∈ π(x). Since π(z) = π(x), then π(u) ∈ π(z), whence u E z, and this is a contradiction. We have shown π is 1-1. Proof of (3). This follows immediately from the definition of π. Proof of (4). It is easy to see that the first part follows from (2) and (3); we verify a few of the required points. For the irreflexive property, notice that for x ∈ W , x E x if and only if π(x) ∈ π(x); since the former is false, the latter is also false. Similar reasoning shows (N, ∈) is antisymmetric, transitive, and a total order. We verify that ∈ well-orders N: Suppose C ⊆ N is nonempty. Let B denote the preimages of C in W ; that is, b ∈ B if and only if π(b) ∈ C. Let 26 b0 be the least element of B. Let c0 = π(b0). We show c0 is ∈-least in N : Let π(b) ∈ C. Then b0 E b, and so π(b0) ∈ π(b). Since b was arbitrary, we have shown C has an ∈-least element. For the second clause, let m ∈ N . In defining π, we have already seen that π(a) = 0. Let x ∈ W be such that π(x) = m. Since a E x, then by (3), 0 = π(a) ∈ π(x) = m. Proof of (5). The fact that each n ∈ N is a transitive set follows from the fact that ∈ is transitive as an order relation: For all m, n, r ∈ N , if m ∈ n and n ∈ r, then m ∈ r. To show that n = {m ∈ N | m ∈ n}, we perform a computation: Let x ∈ W be such that n = π(x). n = π(x) = {π(y) | y E x} = {π(y) | π(y) ∈ π(x)} (because π is an order isomorphism) = {m ∈ N | m ∈ n}. Having defined the Mostowski Collapsing Map π on W , we can now demonstrate how π “collapses” the blueprint W to the concrete set ω of natural numbers. We begin with the computation of the first few elements of ω. We will make use of Corollary 6 in which a few values of Wx were computed. Computation of 0. Here, 0 arises as the collapse of a in W : π(a) = {π(y) | y E a} = ∅ = 0. Computation of 1. Here, 1 arises as the collapse of j(a) in W . Recall from Corollary 6 that Wj(a) = {a}. π(j(a)) = {π(x) | x E j(a)} = {π(x) | x ∈ Wj(a)} = {π(x) | x ∈ {a}} = {π(a)} = {0} = 1. Computation of 2. Here, 2 arises as the collapse of j(j(a)) in W . Recall from Corollary 6 that Wj(j(a)) = {a, j(a)}. π(j(j(a))) = {π(x) | x E j(j(a))} = {π(x) | x ∈ Wj(j(a))} = {π(x) | x ∈ {a, j(a)}} = {π(a), π(j(a))} = {0, 1} = 2. 27 Computation of 3. Here, 3 arises as the collapse of j(j(j(a))) in W . Recall from Corollary 6 that Wj(j(j(a))) = {a, j(a), j(j(a))}. π(j(j(j(a)))) = {π(x) | x E j(j(j(a)))} = {π(x) | x ∈ Wj(j(j(a)))} = {π(x) | x ∈ {a, j(a), j(j(a))}} = {π(a), π(j(a)), π(j(j(a))} = {0, 1, 2} = 3. These observations suggest that, at least in some sense, π “gives rise” to the natural numbers. We must be careful not to claim too much here, though, because the numbers 0, 1, 2, 3, . . ., with their usual definition as sets ∅, {∅}, {∅, {∅}}, . . ., are already present in the universe, even without the Axiom of Infinity, as part of the way the first stages of the universe are built; these come into being on the basis of the other axioms of ZFC. What the other axioms do not tell us, however, is how to form the set of natural numbers, and, in fact, without the Axiom of Infinity, it is not possible to prove that such a set even exists. Therefore, the real content of our claim that π “gives rise” to the natural numbers is that it gives rise to the sequence 0, 1, 2, . . . in its entirety, as a completed set—seen as emerging from the self-referral dynamics of the original Dedekind self-map j : A → A. What we wish to do then is to show how to derive from π the successor function s, which, once defined, specifies the full sequence of natural numbers: 0, s(0), s(s(0)), . . .. Recall that the set-based version of the successor function has this form: s(x) = x ∪ {x}. For instance, s(0) = s(∅) = ∅ ∪ {∅} = {∅} = {0} = 1. What we will see is that the values of the successor function s are obtained simultaneously from a certain type of interaction between π and j; to say it another way, s turns out to be the unique map that makes the following diagram commutative: W j W - π π ? N W s - ? N Notice that, in order for any map s : N → N to make the diagram commutative, it must be true that s = π ◦ (j W ) ◦ π −1 . (Note that this computation makes sense because π is a bijection.) We can therefore take this to be the definition of s and then verify that s is truly the successor function on the usual set of natural numbers. Theorem 9. Define s = π ◦ (j W ) ◦ π −1 : N → N . Then, for all n ∈ N , s(n) = n ∪ {n}. Proof. By the definition of s, the diagram must be commutative. Let B = {x ∈ W | s(π(x)) = π(x) ∪ {π(x)}}. We show B is j-inductive. This is enough because every n ∈ N is π(x) for some x ∈ W , so, assuming B = W , we have s(n) = s(π(x)) = π(x) ∪ {π(x)} = n ∪ {n}. 28 To prove B is j-inductive, first we show a ∈ B: By commutativity, s(π(a)) = π(j(a)) = {π(u) | u E j(a)} = {π(a)} = {0} = 0 ∪ {0} = π(a) ∪ {π(a)}. Next, assume x ∈ B, so s(π(x)) = π(x) ∪ {π(x)}. We show j(x) ∈ B, that is, s(π(j(x))) = π(j(x)) ∪ {π(j(x))}. But s(π(j(x))) = π(j(j(x))) = {π(y) | y E j(j(x))} = {π(y) | y E j(x) or y = j(x)} = {π(y) | y E j(x)} ∪ {π(y) | y = j(x)} = π(j(x)) ∪ {π(j(x))}. We can now verify in several theorems below that N and s have the expected properties. We will say that a set S is inductive if ∅ ∈ S and whenever x ∈ S, we have x ∪ {x} is in S. T Theorem 10. N is an inductive set. Indeed, N = {I | I is inductive}. Proof. We have seen already that ∅ ∈ N . Suppose n ∈ N . Then for some x ∈ W , n = π(x). But now n ∪ {n} = s(n) = s(π(x)) = π(j(x)) ∈ N. For the second clause, it is sufficient to show that N ⊆ I for every inductive set I. Let I be any inductive set. Let B = {x ∈ W | π(x) ∈ I}. We show B is j-inductive. By definition π(a) = ∅ ∈ I, so a ∈ B. If x ∈ B, then n = π(x) ∈ I. But now j(x) ∈ B ⇔ π(j(x)) ∈ I ⇔ s(π(x)) ∈ I ⇔ s(n) ∈ I ⇔ n ∪ {n} ∈ I, and the last of these statements is true by definition of “inductive.” Hence B is j-inductive, and so for every n ∈ N , n ∈ I, as required. The property that N has of being an inductive set leads to the Principle of Mathematical Induction: Principle of Mathematical Induction. Suppose A ⊆ N has the following two properties: (1) ∅ ∈ A (2) whenever n ∈ A, s(n) ∈ A. Then A = N . We can now prove the Principle of Mathematical Induction. Before doing so, we state a weaker version that is often also used, based on formulas. A formula is simply an expression involving set parameters. Examples include statements like “x is an even natural number,” “x has at least two elements,” “y = P(x)” (where P is the power set operation, which outputs the 29 set of all subsets of a set x). We have the following weak principle of induction: Weak Principle of Mathematical Induction. Suppose φ(x) is a formula20 whose parameter x represents an element of N . Suppose further that (1) φ(∅) holds true (2) whenever φ(n) holds, φ(s(n)) also holds. Then φ(n) holds for every n ∈ N . This priniciple is called “weak” because the Principle of Induction implies it: Given any formula φ(x), where x represents elements of N , suppose (1) and (2) of the Weak Principle hold. The Axiom of Separation (a ZFC axiom) implies that the collection A = {n ∈ N | φ(n)} is a set, and so (1) and (2) of the Principle also hold. Since the Principle of Induction holds, the conclusion of the Weak Principle now follows. The Weak Principle is useful because it generalizes to classes—a topic we will take up in Section 11. Theorem 11. The Principle of Mathematical Induction is correct. Proof. Let A ⊆ N satisfy properties (1) and (2). Properties (1) and (2) assert that A is in fact an inductive set. By Theorem 10, A ⊆ N. But since A ⊆ N also, we conclude that A = N . An alternative proof is obtained by using our earlier observation that (W, E) and (N, ∈) are order-isomorphic, so that, in particular, N is well-ordered under ∈. Thus, suppose A ⊆ N satisfies (1) and (2). We will use the fact that N is well-ordered to show A = N . Certainly 0 ∈ A. Assume A 6= N . Let S = {m ∈ N | m 6∈ A}. S 6= ∅ by assumption. Let n be the ∈-least element of S, using the well-ordering; we have seen that n 6= 0. Let y ∈ W with π(y) = n. Let x be the E -immediate predecessor of y. Let m = π(x). The order isomorphism between (W, E ) and (N, ∈) guarantees that m is the ∈-immediate predecessor of n (so that s(m) = n, and, moreover, m ∈ n, but for all r ∈ N , it is not true that m ∈ r ∈ n). Now m 6∈ S by leastness of n, so m ∈ A, and it follows that s(m) ∈ A. But this contradicts the fact that n 6∈ A. Notation. A small part of the proof shows that, as we would expect, every nonzero element of N has a unique immediate predecessor. For n 6= 0 in N , we denote the immediate predecessor of n by n − 1. Similarly, we will adopt the usual convention of writing n + 1 for s(n), whenever n ∈ N . Finally, it will be convenient, when emphasizing the use of ∈ on N as an order relation, to write < in place of ∈, as is customarily done. Moreover, we shall write m ≤ n if and only if either m ∈ n or m = n. Theorem 12. 0 6∈ ran s and s is 1-1. Moreover, N = {0} ∪ ran s. 20 Formally, formulas are formed from variables and the membership relation ∈; notions like “natural number” and “function” are defined in terms of these. Formulas are defined inductively (on the length of the expression). If x, y are variables, both x = y and x ∈ y are formulas. If φ, ψ are formulas and x is a variable, so are φ∧ψ, φ∨ψ, φ → ψ, ¬φ, ∃x φ, and ∀x φ. Formulas can have any finite number of parameters, so, to be more formal, we could write φ(x, y1 , . . . , yk ), where y1 , . . . , yk are variables standing for arbitrary sets; for readability, we just display x. 30 Proof. If s(m) = 0 for some m ∈ N , let x ∈ W be such that π(x) = m. Then 0 = π(a) = s(π(x)) = π(j(x)), and so, by 1-1-ness of π, a = j(x), which is impossible by Claim 7. To prove s is 1-1, suppose s(m) = s(n), where m, n ∈ N . Let x, y ∈ W with π(x) = m, π(y) = n. Then s(m) = s(n) ⇔ π(j(x)) = π(j(y)) ⇔ j(x) = j(y) ⇔ x = y ⇔ m = n. Finally, suppose n 6= 0 is in N . Let m be its unique immediate predecessor, so s(m) = n. We have shown that ran s = N − {0}. It follows that N = {0} ∪ ran s. Now we can define the concrete notion “natural number.” A set n is a natural number if and only if n ∈ N , if and only if n belongs to every inductive set. This concrete definition is the one that allows us to see each number explicitly rendered as a set: Theorem 13. For each n ∈ N , if n 6= 0, then n = {0, 1, 2, . . ., n − 1}. Proof. By Claim 10(11) and the fact that π is an isomorphism, for every m ∈ n, m ≤ n − 1. Conversely, if m ≤ n − 1, then either m ∈ n − 1—so by transitivity, m ∈ n—or m = n − 1, in which case also m ∈ n. Thus n consists precisely of the elements of N that precede n, the largest of these being n − 1. In light of Theorem 10, it is clear that N and ω (defined at the beginning of this paper) are the same set. For the rest of the paper, we will use “ω” as the name of this set. 9 The Self-Map s : ω → ω As the Smallest Among All Dedekind Self-Maps Starting from an arbitrary Dedekind self-map, we have derived the set ω of natural numbers together with its usual successor function s : ω → ω, which is itself a Dedekind self-map. Intuitively, one considers the set of natural numbers as the smallest type of infinite set. In the usual development of ZFC set theory, for example, one proves that the size ℵ0 of ω is the smallest infinite cardinal, and one can also show that ω is faithfully embedded in every infinite set; that is, for any infinite S, there is a 1-1 function f : ω → S. Here, we will show something that is apparently even stronger: That the successor function is “embedded” in every Dedekind self-map, in a unique way. This result will have a number of practical consequences that we will mention at the end of this section. To begin, we return to the diagram that showed how j W : W → W is collapsed to the successor function s : ω → ω: W j W - π π ? ω W s 31 - ? ω The Mostowski Collapsing Theorem tells us that π is the unique function defined on W satisfying π(x) = {π(y) | y E x}. This result allowed us to demonstrate that the set ω and the successor function s exist (as a function from ω to ω). We wish now to look at the diagram in a slightly different way. Initially, we viewed the diagram as starting with the maps j W and π, which occupied the top and two vertical sides of the square determined by the diagram: W j W - π W π ? ? ω ω The bottom edge was filled in by defining s to be a composition of the other maps (and their inverses): s = π ◦ (j W ) ◦ π −1 . Having established the existence of ω and s, we can view the diagram in another way. We begin as before with j W : W → W and now also take as given the map s : ω → ω, defined now by the formula s(n) = n ∪ {n}.21 For the moment, we leave the two vertical edges unspecified. W ω j W - W s - ω We now define π on W as before, this time as a candidate to fill in the vertical edges of the diagram: π(x) = {π(y) | y E x}. As we argued earlier, π is uniquely determined by this relation. Let A ⊆ ω be defined by A = {n ∈ ω | for some x ∈ W , π(x) = n}. Showing that A is inductive will establish that ran π = ω. Since π(a) = 0, as we argued before, 0 ∈ A. If n ∈ A, then let x ∈ W with π(x) = n. We show s(n) ∈ A by showing π(j(x)) = s(n) = n ∪ {n}. But π(j(x)) = {π(y) | y E j(x)} = {π(y) | y E x} ∪ {π(x)} = π(x) ∪ {π(x)} = n ∪ {n}, as required. We wish to show now that π : W → ω is the unique map for which the following two conditions hold: Technically, this amounts to defining s on ω to be s̄ ω; recall that s̄ is defined on all sets by s̄(x) = x ∪ {x}. One needs to verify that the range of s̄ ω is a subset of ω, so that we may write, as usual, s : ω → ω. We do this quick verification here: Let A ⊆ ω be the set {n ∈ ω | s(n) ∈ ω}. Certainly 0 ∈ A and if n ∈ A, s(n) ∈ ω, since ω is inductive. Therefore, A is inductive, so ω ⊆ A, yielding that A = ω. 21 32 (i) π(a) = 0. (ii) The diagram is commutative; that is, s ◦ π = π ◦ (j W ) (∗∗) We showed in Theorem 9 that (∗∗) holds true (we established this using a different definition of s, but we also showed that for each n, s(n) = n ∪ {n}). We verify uniqueness: Suppose h : W → ω satisfies (i) and (ii) above; that is, h(a) = 0 and s ◦ h = h ◦ (j W ). j W W - W h h ? s ω ? - ω We establish uniqueness by j-induction: Let B ⊆ W be defined by B = {x ∈ W | h(x) = π(x)}. Since (i) holds for both functions, a ∈ B. Assume x ∈ B. Then since both functions make the diagram commutative, we have π(j(x)) = j(s(x)) = h(j(x)), and so j(x) ∈ B, and B is j-inductive, as required. We observed earlier that π is an order isomorphism. In the present context, π is another kind of isomorphism, called a Dedekind self-map isomorphism. We define this notion now. Suppose g : B → B is a Dedekind self-map with critical point b and h : C → C is a Dedekind self-map with critical point c. A Dedekind self-map morphism from g to h is a map β : B → C satisfying: (1) β(b) = c (2) The diagram is commutative; that is, β ◦ g = h ◦ β. g B - β B β ? h C - ? C Moreover, if β is a bijection, β is a Dedekind-self map isomorphism.22 The intuition behind a Dedekind self-map morphism β from g : B → B to h : C → C is that the structural relationships As an interesting sideline, we point out that the concept of a Dedekind-self map isomorphism from j W to s : ω → ω is a slight weakening of “E -order isomorphism” to “E 0 -order isomorphism.” We can explain this point in the following way. Given g and h as in the definition of Dedekind self-map morphisms. Suppose Eg is defined on B by x Eg y if and only if y = g(x) and Eh is defined on C by x Eh y if and only if y = h(x), then β is a Dedekind self-map isomorphism if and only if β is an (Eg , Eh )-isomorphism (that is, β is a bijection and x Eg y if and only if β(x) Eh β(y)). In particular, to say that π is a Dedekind self-map isomorphism from j W to s is the same as saying that it is an order isomorphism, relative to the relation E 0 . 22 33 given by g are reflected in h: if g takes x to y in B, then h takes β(x) to β(y) in C. Moreover, if β is also an isomorphism, then the relationships between the two maps are structurally identical: For all x, y ∈ C, g takes x to y if and only if h takes β(x) to β(y). We introduce some notation for Dedekind self-map morphisms: Notice that every Dedekind self-map g : B → B with critical point b specifies three parameters: g, B, and b. We therefore may describe the triple (B, g, b) as being the Dedekind self-map. Then, we can indicate that β is a Dedekind self-map morphism from g to h taking b to c by simply saying that β : (B, g, b) → (C, h, c) is a Dedekind self-map morphism. Our reasoning in the previous paragraphs has demonstrated the following: Theorem 14. Let π : W → ω be defined by π(x) = {π(y) | y E x}. Then π is the unique Dedekind self-map morphism from (W, j W, a) to (ω, s, 0). Moreover, π is in fact a Dedekind self-map isomorphism. Theorem 14 tells us that, structurally, W and ω are identical, with “successor” functions that behave in exactly the same way; the structures (W, j W, a) and (ω, s, 0) are identical except for notational differences. It is natural to expect that the isomorphism π is invertible—that π −1 is also a Dedekind self-map isomorphism, and that it is the unique morphism from s to j W . We verify this now. Henceforth, we let τ denote π −1 . ω s - τ ? W ω τ j W - ? W Theorem 15. Let τ : ω → W be π −1 . Then τ is a bijection and is the unique Dedekind self-map morphism from (ω, s, 0) to (W, j W, a). Proof. Since τ = π −1 , τ is a bijection and τ (0) = a. Then τ ◦ s = j ◦ τ ⇔ s = π ◦ j ◦ τ ⇔ s ◦ π = π ◦ j. Since π is a Dedekind self-map morphism, the last of these equations holds true, and so the first one does as well. For uniqueness, we first observe that if g is a Dedekind self-map morphism from (ω, s, 0) to (W, j W, a), then g must be 1-1 and onto: ω s - g ? W ω g j W 34 - ? W Let A = {n ∈ ω | g(n) 6∈ {g(0), g(1), . . . , g(n − 1)}}. We show A is inductive; this will establish that g is 1-1. Clearly 0 ∈ A. If n ∈ A and g(s(n)) = g(i) for some i, 0 ≤ i ≤ n − 1, notice first that i 6= 0 since we would have in that case a = g(0) = g(s(n)) = j(g(n)), which is impossible since, for no x ∈ W is it true that x E a. Therefore, i = s(k) for some k, 0 ≤ k < n − 1, and we have j(g(n)) = g(s(n)) = g(s(k)) = j(g(k)). Since j is 1-1, g(n) = g(k) which contradicts the fact that n ∈ A. Therefore, s(n) ∈ A as required. To see that g is also onto, let B ⊆ W be defined by B = {x ∈ W | for some n ∈ ω, g(n) = x}. Clearly a ∈ B. If x ∈ B, let n ∈ ω with g(n) = x. We show j(x) ∈ B. But j(x) = j(g(n)) = g(s(n)), as required. Since B is j-inductive, B = W , and g is onto. To complete the proof, we must show that τ = g. But notice now that g −1 makes the following diagram commutative: W j W - g −1 g −1 ? ω W s - ? ω By uniqueness of π, g −1 = π, and so g = (g −1 )−1 = π −1 = τ. We can now use Theorem 15 to demonstrate several facts about W and the relation E that we have not been able to establish until now. First, we give some definitions. A set X is finite if there is n ∈ ω for which there is a bijection from n to X. Let i = j W : W → W be defined as above. For each n > 1, let in be the n-fold composition of i with itself: in = i ◦ i ◦ · · · ◦ i, where the right-hand side has n factors. We let i0 = idW and i1 = i.23 ω s - τ τ ? W ω i - ? W Theorem 16. 23 At this point, we do not yet claim that the values i0 , i1 , i2 , . . . form an actual set, but we will use a result like Theorem 15, to be described shortly, that will allow us to make (and prove) such a claim. 35 (1) For each n ∈ ω, in : W → W and in+1 = i ◦ in . Moreover, τ (n) = in (a) = j n (a). (2) W = {a, j(a), j 2(a), . . .} = {a, i(a), i2(a), . . .} = {in (a) | n ∈ ω} (3) Suppose F joins x to y in W . Then F is finite. Proof of (1). Define A by A = {n ∈ ω | in+1 = i ◦ in and ran in+1 ⊆ W }. To prove the first half of (1), it suffices to show that A = ω. Since i0 = idW , 0 ∈ A. Assume n ∈ A. Both in+1 and i ◦ in are n + 1-fold compositions of i with itself, so they are equal. Also, if x ∈ W , in+1 (x) = i(in (x)). Since in (x) ∈ W (by the induction hypothesis), it follows that in+1 (x) ∈ W as well. For the “moreover” clause, we perform a separate induction. Let B = {n ∈ ω | τ (n) = in (a) = j n (a)}. It is clear that 0 ∈ B. Assuming n ∈ B, we have τ (n + 1) = τ (s(n)) = i(τ (n)) = i(in(a)) = in+1 (a), and similarly if we replace i by j. Proof of (2). By (1), for all n ∈ ω, in (a) ∈ ω. This shows that W contains all the terms in (a). We show that these are the only elements of W . Let B ⊆ W be defined by B = {x ∈ W | for some n ∈ ω, x = in (a)}. Notice a ∈ B since a = i0 (a). Assume x ∈ B, so x = in (a) for some n ∈ ω. Then i(x) = i(in (a)) = in+1 (a). Therefore, i(x) ∈ B and B = W . Therefore, every element of W is one of the terms in (a). This completes the proof of (2). Proof of (3). Define E n on W by x En y if there are x1 , . . . , xn ∈ W so that x = x1 , y = xn , and for each k, 1 ≤ k ≤ n − 1, xk+1 = j(xk ). We show that whenever x E y, there is n ∈ ω so that x En y. Recall that E is a well-ordering of W , isomorphic under π to ω. Given x, y such that x E y, let m, n be such that π(x) = m and π(y) = n, and let F join x to y. Then the chain m < m + 1 < . . . < n has ` = n − m + 1 terms. It follows that x E ` y, as required. We show that F consists of ` elements, F = {y1 , . . . , y` }, where, for each k, 1 ≤ k < `, j(yk) = yk+1 . Let Y = {y1 , . . . , y` }. A simple induction shows each element of Y is in F . Suppose some element of F is not in Y . Let y be the E-least such. The E -least element of F must be y1 . Therefore, by properties of F , there is some z ∈ F such that y = j(z). Since z E y, y = yk for some k. But now y = j(yk ) = yk+1 , contradicting the assumption that y is not in Y . We have shown F = Y = {y1 , . . . , y` } as required. We can now show the special relationship the successor function has with all other Dedekind self-maps. 36 Theorem 17. (1) Suppose j : A → A is a Dedekind self-map with critical point a. Then there is a unique Dedekind self-map morphism τ from (ω, s, 0) to (A, j, a). s ω - ω τ τ ? j A ? - A (2) Suppose j : A → A is a Dedekind self-map with critical point a and suppose k : U → U is any Dedekind self map, with critical point u such that (U, k, u) is Dedekind self-map isomorphic to (ω, s, 0). Then there is also in this case a unique Dedekind self-map morphism σ from (U, k, u) to (A, j, a). k U - U σ σ ? j A ? - A Proof. We have already completed the main steps of the proof; we review these now. We obtain W ⊆ A as the smallest j-inductive set. We have seen that π : W → ω, as defined earlier, is unique such that π(a) = 0 and π ◦ j W = s ◦ π and that π is a bijection. It follows, as we have shown, that if τ = π −1 , then τ is a bijection and the unique Dedekind self-map from (ω, s, 0) to (W, j W, a). The existence of τ requires one small additional step. Consider the following diagram: s ω - τ τ ? i W - incW,A ? A ω ? W incW,A j - ? A We now define τ = incW,A ◦ τ . Clearly τ (0) = a and, for all n ∈ ω, (+) j ◦ incW,A ◦ τ = incW,A ◦ i ◦ τ = incW,A ◦ τ ◦ s, so the diagram is commutative. It follows that j ◦ τ = τ ◦ s, 37 as required. For uniqueness, suppose h : ω → A satisfies the same conditions: h(0) = a and h ◦ s = j ◦ h. We show h = τ by proving by induction that h(n) = τ (n) for all n ∈ ω. Certainly h(0) = a = τ (0) by assumption. Assuming h(n) = τ (n) we show h(s(n)) = τ (s(n)). But h(s(n)) = j(h(n)) = j(τ(n)) = τ (s(n)), as required. This completes the induction and shows that h = τ . Proof of (2). The proof is similar to the one given for (1). Let β be a Dedekind self-map isomorphism from (U, k, u) to (ω, s, 0) (we have already seen that if there is such an isomorphism from s to k, the inverse is also such a map, now from k to s). Let τ be the Dedekind self-map morphism from (ω, s, 0) to (A, j, a), as defined in (1). U k - β β ? ω s - τ ? A U ? ω τ j - ? A Define σ from k to j by σ = τ ◦ β. The proof that σ(u) = a and j ◦ σ = σ ◦ k is essentially identical to (+). The proof of uniqueness is also essentially the same as for (1). Theorem 17(1) allows us to perform inductive definitions of sequences, indexed by the elements of ω, in the usual way.24 For instance, suppose we wish to formally define the sequence of powers of 2: 20 , 21 , 22 , . . . = 1, 2, 4, . . .. Theorem 17(1) guarantees that a single function or sequence exists that will produce precisely these values. ω s - τ ? A ω τ j - ? A To use the theorem, we need to specify a function j (as in the bottom map of the diagram). The map j declares, as we move through the sequence, how the next value is computed from the current value. For this purpose, we define j : ω → ω by j(m) = 2m. The critical point of j that we will use is 1. Now the theorem guarantees there is a unique τ : ω → ω that takes 0 to 1 and 24 Some familiar examples of inductive definitions actually require a slight variation of Theorem 17(1) in which j : A → A is replaced by a proper class Dedekind self-map j : V → V ; we will introduce this variant later in this discussion and develop it further in Sections 10–12. 38 makes the diagram commutative. Now we can see that τ is precisely the exponential function n 7→ 2n (this is actually the definition of this exponential function). Notice by commutativity of the diagram that τ (1) = τ (s(0)) = j(τ(0)) = j(1) = 2. and τ (2) = τ (s(1)) = j(τ(1)) = j(2) = 2 · 2 = 4. A slightly more complicated example is to compute the factorial function, where n! (“n factorial”) is defined to be the product 1 · 2 · · · · · n.25 The first terms of the sequence are 1!, 2!, 3!, 4! . . . = 1, 2, 6, 24, . . .. Once again, to use the theorem, it is necessary to define a function j : A → A, as in the bottom row of the diagram. For factorial, we will need to use two arguments, and our final “factorial” function will be defined to be a sequence26 of pairs f = h(0, 0!), (1, 1!), (2, 2!), (3, 3!), . . .i = h(0, 1), (1, 1), (2, 2), (3, 6), . . .i. We define j : ω × ω → ω × ω by j(m, n) = (m + 1, (m + 1) · n), and we will use as critical point the pair (0, 1). s ω - τ ω τ ? j ω×ω ? - ω×ω Now there is a unique τ : ω → ω × ω that takes 0 to (0, 1) and makes the diagram commutative. We can argue by induction to show that τ (n) = (n, n!) for each n: By assumption, τ (0) = (0, 1) = 25 This function is more elegantly defined using the variant of Theorem 17(1) mentioned in the previous footnote. Using a similar and more traditional approach, we can derive the factorial function directly, but the maps in the diagram change and look a bit more complicated. We give an idea about how it works in this note. What can be shown is that, given any f : ω × ω → ω and value t0 , there is a unique function t : ω → ω so that t(0) = t0 , and for all n, t(s(n)) = f (n, t(n)). (A statement of a more general result is given in Theorem 21; the proof is like the one given for Theorem 15.) To express this in the form of a commutative diagram, one uses the diagonal map ∆ : ω → ω × ω defined by ∆(n) = (n, n), and another map 1ω × t defined by (1ω × t)(m, n) = (m, t(n)). Thus, (1ω × t) ((∆(n)) = (n, t(n)). 26 s ω - (1ω ×t)◦∆ t ? ω×ω ω f - ? ω Now, the diagram commutes if and only if, for every n, t(s(n)) = f ((1ω × t) (∆(n))) = f (n, t(n)), as expected. In this example, defining f by f (m, n) = (m + 1)n and t0 = 1, the function t is the factorial function, as one may prove by induction: t(n + 1) = f (n, t(n)) = (n + 1) · n! = (n + 1)! Thus, t(n) = n! for every n. 39 (0, 0!). Assuming the result for n, we have τ (s(n)) = j(τ(n)) = j(n, n!) = (n + 1, (n + 1) · n!) = (n + 1, (n + 1)!). As a final observation, note that in this case, the map τ has the property that ran τ is another function, but in this case its range is precisely the factorial function. The most useful application of the theorem, for our purposes, is the definition of the first ω stages of the universe of sets. The stages V0 , V1 , V2, . . . of the universe V are obtained by taking repeated power sets, starting with the empty set ∅, where, by definition, the power set P(X) of a set X is the set whose elements are the subsets of X. To define this sequence formally, we define a class version of Theorem 17(1). We discuss classes in more detail in Section 11. For our purposes, a class is essentially a formula27, but we think of it as a collection of objects defined by the formula. After we have given a definition of the universe V , we will think of classes as enormous subcollections of V . For now, a class will simply be viewed as a collection C = {x | φ(x)} for some formula φ (having possibly finitely many other set parameters not displayed). An example of a class is B = {{y} | y is a set}. Formally, the formula φ(x) that defines B states “there is z such that for all u, u ∈ x if and only if u = z”: B = {x | φ(x)}. (Here, x represents a set of the form {z} and the condition states that the only element of x is z.) Intuitively, a collection like B spans the universe and so is too big to be a set. We wish to establish a class version of Theorem 17(1): Theorem 18. Suppose C is a class, c ∈ C, and j : C → C is 1-1 and has critical point c. Then there is a unique class map τ : ω → C such that τ (0) = c and the following is commutative: ω s - τ ? C ω τ j - ? C Proof. The proof follows the same steps as the proof of the Mostowski Collapsing Theorem (p. 24), so we just give an outline. Let φ(x, u) be the formula “u ∈ ω and x is a function with domain u so that x(0) = c and for all i, 0 ≤ i < u − 1, x(s(i)) = j(x(i).” We show that A = ω, where A = {n ∈ ω | there is a unique function tn such that φ(tn , n) holds}, by showing that A is inductive. ω s - tn ? C 27 ω tn j See Section 11 for a discussion. 40 - ? C Certainly 0 ∈ A; the unique map t0 is the empty function. Assume n ∈ A, so that a unique tn is defined as in the definition of A. Define tn+1 = tn ∪ {(n, j(tn(s(n − 2))))}. Note that j(tn (s(n − 2))) ∈ C since tn (s(n − 2)) ∈ C. The pair added to tn in the definition ensures that tn+1 (s(n − 1)) = j(tn (s(n − 2))) = j(tn+1 (s(n − 2))) = j(tn+1 (n − 1), and so we have tn+1 (0) = c and tn+1 ◦ s = j ◦ tn+1 . Uniqueness of tn+1 is straightforward to verify since, whenever r is defined on n + 1, r(0) = c, and r ◦ s = j ◦ r, we must have r n = tn (by uniqueness of tn ) and r(n) = j(tn (s(n − 2))) = tn+1 (n). We have shown A = ω, so for each n we have a uniquely defined tn as described in the definition of A. We define τ on ω by: τ (n) = tn+1 (n). As in the proof of the Mostowski Collapsing Theorem (in particular, (+)), it follows that τ (n) = tm (n) for all m ≥ n + 1. Verification that τ has the required properties follows easily. Now we define a particular class HF (the hereditarily finite sets) as follows: We place x ∈ HF if the smallest transitive set that contains x as a subset is finite.28 We define j : HF → HF by j(x) = P(x), where P denotes the power set operator: P(x) is the set of all subsets of x. We verify that ran j ⊆ HF: Notice that if y is finite and transitive, so is P(y), so if x ∈ HF and y is the smallest transitive set containing x (which, in particular, must be finite), then, since P(y) is finite and transitive and P(x) ⊆ P(y), it follows P(x) ∈ HF. Therefore, ran j ⊆ HF. Notice that ∅ is a critical point for j and that j is 1-1. Therefore, j is a Dedekind self-map on the class HF.29 Using Theorem 18, we obtain a unique τ : ω → HF satisfying: (a) τ (0) = ∅ (b) for all n ∈ ω, τ (s(n)) = j(τ (n)). If we write Vn = τ (n), then these clauses become (a0 ) V0 = ∅ (b0 ) for all n ∈ ω, Vn+1 = P(Vn ). The Axiom of Replacement guarantees that we have a sequence30 hV0 , V1 , V2, . . .i. The Union Axiom allows us now to form the union, which we denote Vω : [ (++) Vω = Vn . n∈ω 28 Recall that the axiom Trans (which asserts that every set is included in a transitive set) has been included in ZFC; see the footnote on p. 6. The “smallest” such transitive set is found by forming the intersection of them all. 29 Reasoning of this kind leads to a proof from the theory ZFC − Infinity + ¬Infinity that V = HF: Given any set x, let y be the smallest transitive set that contains x. Since ¬Infinity holds, y is finite, and so x ∈ HF. 30 In other words, the sequence is a set, not a proper class. 41 It can be shown that, in fact, Vω = HF.31 We complete the construction of the stages of V in Section 11. In the absence of the Axiom of Infinity—in particular, when we assume that no infinite set exists—HF is still defined in the same way, but in that context, it is a class that is not a set. One may also use a variation of Theorem 18 to define the class sequence hV0 , V1, V2, . . .i, exactly as above; in this case, every “set” belongs to one of these stages. Next, we continue with a converse for Theorem 17(2): Corollary 19. Suppose k : U → U is a Dedekind self-map with critical point u. Suppose (U, k, u) has the property that for every Dedekind self-map j : A → A with critical point a, there is a unique Dedekind self-map morphism σ from (U, k, u) to (A, j, a). Then there is a Dedekind self-map isomorphism from (ω, s, 0) to (U, k, u). Remark 3. Corollary 19 is the converse to Theorem 17(2). These two results tell us that the structure of the Dedekind self-map (ω, s, 0)—and therefore the fundamental structure of the set of natural numbers—is completely characterized by the universal property stated in the hypothesis of Corollary 19. Proof. The idea here is that both (ω, s, 0) and (U, k, u) have the universal property described in the hypothesis of Corollary 16. The following diagram captures the relationships involved: U k - β β ? ω s - γ k - β 31 ω ? U β ? ω ? γ ? U U s - ? ω We prove this fact here. First we prove the following claim: Claim. If X is finite and X ⊆ Vω , then X ∈ Vω . Proof of Claim. Let X = {x1 , x2 , . . . , xk } and let n1 , n2 , . . . , nk be such that for 1 ≤ i ≤ k, xi ∈ Vni . Let n = max{n1 , . . . , nk }. Then X ⊆ Vn , so X ∈ Vn+1 , and hence, X ∈ Vω . To prove that Vω = HF, it is enough to prove HF ⊆ Vω . To prove this, we will make use of the Foundation Axiom, which states that every set in the universe has an ∈-minimal element; that is, for every set x, there is a z ∈ x such that no element of x belongs to z. Given X ∈ HF, let Y ⊇ X be a finite, transitive set containing X; certainly, Y ∈ HF. Let Z = Y − Vω . We wish to show that Z = ∅; then by the Claim we can conclude Y ∈ Vω , and so HF ⊆ Vω , as required. To prove Z = ∅, we give an indirect argument, assuming Z 6= ∅. Let u ∈ Z be ∈-minimal in Z. Since u ∈ Y − Vω , u 6= ∅; let v ∈ u. Since Y is transitive, v ∈ Y . Since u is ∈-minimal, v 6∈ Z, so v ∈ Vω . Since v was arbitrary, we have shown u ⊆ Vω . By the Claim, it follows that u ∈ Vω , but this contradicts the original choice of u. Therefore, the assumption that Z is nonempty is incorrect; we have shown Z = ∅, as required. 42 We let β : (U, k, u) → (ω, s, 0) be the unique Dedekind self-map guaranteed to exist because of the hypothesis, and we let γ : (ω, s, 0) → (U, k, u) be the unique Dedekind self-map guaranteed to exist by Theorem 17(1). Notice γ ◦ β takes u to u and makes the following diagram commutative: k U - γ◦β γ◦β ? k U U - ? U but idU : U → U does the same: idU (u) = u and the following is commutative: k U - idU idU ? k U U - ? U By the uniqueness guaranteed by the hypothesis of Corollary 19, idU = γ ◦ β. (#) Likewise, we have β(γ(0)) = 0 and the following diagram is commutative: s ω - β◦γ β◦γ ? s ω ω - ? ω but idω : ω → ω does the same: idω (0) = 0 and the following is commutative: s ω - idω idω ? ω ω s - ? ω Therefore, we also have (##) idω = β ◦ γ. It is straightforward to verify that (#) and (##) together imply that both β and γ are bijections. Hence, in particular, γ : (ω, s, 0) → (U, k, u) is a Dedekind self-map isomorphism. 43 The theorem tells us that the Dedekind self-map (ω, s, 0) is “less than or equal to” all other Dedekind self-maps, in the same way that 0 is less than or equal to every natural number. This idea can be made precise with the concept of a category. A category is a pair (O, M), where O is a collection of objects and M is a collection of morphisms, satisfying the following: (1) Each morphism f ∈ M has a domain and a codomain, written dom f, cod f, respectively, both belonging to O; in the familiar way, if A = dom f and B = cod f , we write f : A → B. (2) Morphisms can be composed: If f : A → B and g : B → C both belong to M, then there is another morphism g ◦ f : A → C that also belongs to M; moreover, composition is associative: h ◦ (g ◦ f ) = (h ◦ g) ◦ f . (3) For each object A ∈ O, there is an identity morphism 1A : A → A, which has the following two properties: For all f : X → Y in M, 1Y ◦ f = f and f ◦ 1X = f . We sometimes denote 1A by idA . A simple example of a category is Set, which has all sets as its objects, and all functions between sets as its morphisms. If we denote the collection of all functions defined on a set in V by V <V , then32 Set = (V, V <V ∩ V ). Another very different example is the category Nat, whose objects are the natural numbers 0, 1, 2, . . . and whose morphisms are pairs (m, n) of natural numbers for which m ≤ n. If S = {(m, n) ∈ ω × ω | m ≤ n}, then Nat = (ω, S). Notice that in this case, “morphisms” are nothing like the usual notion of functions, but they do satisfy the requirements mentioned in the definition of categories.33 Finally, a category of primary interest to us in this paper is SelfMap, whose objects are Dedekind self-maps and whose morphisms are Dedekind self-map morphisms. Checking that the requirements for a category have been met for each of our examples is straightforward, and we shall assume that this verification has been done. A notion that we have already defined in particular cases, but which is best formulated in the language of category theory, is isomorphism. In any category C = (O, M), if A, B ∈ O, an isomorphism from A to B is a morphism f ∈ M with the property that there is another g ∈ M, g : B → A, with f ◦ g = idB and g ◦ f = idA . In Set, the isomorphisms are the bijections. In Nat, two objects are isomorphic if and only if they are equal. And in SelfMap, the isomorphisms are precisely the Dedekind self-map isomorphisms. The concept we wish to introduce at this point is that of an initial object. An initial object in a category is the “smallest” object of the category. We give the definition: An object I in a category C = (O, M) is initial if there is exactly one morphism from I to each object X in O. 32 The curious reader may wonder why the collection of morphisms for Set is V <V ∩ V rather than simply V <V . The reason is that it is possible to devise a function f defined on some set X ∈ V that is not itself a member of V . We will see an example at the end of this paper. Such functions are necessarily not definable in the universe, nor derivable from the axioms of set theory, and hence do not properly belong to Set, which must be viewed as a (definable) class. Moreover, the range of such functions is not itself a set, so it does not make sense to include them as morphisms. Nevertheless, such undefinable functions play an important role in the theory of sets (but not so much in the category of sets); see the footnote on p. 147 for an example. 33 It is helpful to think of a morphism f : A → B in a category as a directed edge in a directed graph. One could have directed graphs in which such edges are indeed functions, and other graphs in which they are not. It can be shown that any category is a directed graph with additional properties. 44 In Set, ∅ is initial because there is just one map (the empty map) from ∅ to any other set. In Nat, 0 is initial because 0 ≤ m for all m ∈ ω. Now, with this new notion in hand, it is easy to see that (ω, s, 0) is an initial object for SelfMap. In fact, Theorem 17 and Corollary 19 can be summarized as follows, using this new concept: Theorem 20. (1) (ω, s, 0) is initial in the category SelfMap. (2) Any object (U, k, u) of SelfMap that is initial is Dedekind self-map isomorphic (hence SelfMap-isomorphic) to (ω, s, 0). (3) Any object (U, k, u) of SelfMap that is Dedekind self-map isomorphic (hence SelfMapisomorphic) to (ω, s, 0) is initial. Proof. This follows from Theorem 17 and Corollary 19. Our reasoning so far has shown that, given a Dedekind self-map j : A → A with critical point a, if we form the smallest j-inductive set W , it turns out that W = {j n (a) | n ∈ ω}. We have also just seen that (W, j W, a) is initial—a structural duplicate of (ω, s, 0). We will call W , defined as the smallest j-inductive set, the initial object generated by j. We have also shown that τ = π −1 : ω → W is a map that lists the elements of W : For all n ∈ ω, τ (n) = j n (a) = j(j(. . .((a)) . . .)) (where the right hand side of the expression consists of n applications of j to a). We will call τ the canonical enumeration of W . Recall that τ is itself a Dedekind self-map isomorphism from (ω, s, 0) to (W, j W, a). We can now verify that, whenever j : A → A is a Dedekind self-map with critical point a, the sequence of maps idA , j, j ◦ j, . . . forms a set that is, like W = {a, j(a), j(j(a)), . . .}, a structure isomorphic to ω. Moreover, this sequence is defined by a Dedekind self-map Jj : AA → AA defined by Jj (h) = j ◦ h, with critical point idA . Theorem 21. (1) The map Jj : AA → AA is a Dedekind self-map with critical point idA . (2) There is a set W whose elements are precisely idA , j, j ◦ j, . . .. Indeed, the initial object generated by Jj has this property, and if τ denotes the canonical enumeration of W , then for each n ∈ ω, τ (n) = j n. Proof of (1). The only point that needs to be checked is that Jj is 1-1. Suppose f, g ∈ AA and j ◦ f = j ◦ g. Then f = g because j is 1-1. Proof of (2). Let W denote the initial object generated by Jj and let τ : ω → W be its canonical enumeration. We show by induction that for all n ∈ ω, τ (n) = j n . ω s τ ? AA - ω τ Jj 45 ? - AA Let B ⊆ ω be the set {n ∈ ω | τ (n) = j n }. Since idA is the critical point of Jj , τ (0) = idA = j 0 , so 0 ∈ B. Suppose n ∈ B, so τ (n) = j n . Then τ (s(n)) = Jj (τ (n)) = Jj (j n ) = j ◦ j n = j n+1 . Thus, s(n) ∈ B, and so the claim is proved. As was proved in Theorem 16(2), W = {idA , Jj (idA ), Jj (Jj (idA )), . . .} = {idA , j, j ◦ j, j ◦ j ◦ j . . .}. Our results in this section lead to important characterizations of the notions of “infinite set” and “set of natural numbers.” We began our discussion with an effort to show that the Axiom of Infinity could be formulated as the assertion that a Dedekind self-map exists. We showed that, without reliance on the natural numbers (defined in the usual way), the set ω of natural numbers could be derived, using just a Dedekind self-map in conjunction with the other axioms of ZFC. These efforts led not only to a derivation of ω but also a characterization of those Dedekind self-maps that are in every respect, up to notational differences, equivalent to ω together with its successor function. Speaking philosophically, we have given evidence for the following two conclusions: (1) The “underlying reality” of infinite sets is the notion of a Dedekind self-map. (2) The “underlying reality” of the set of natural numbers is the notion of an initial Dedekind self-map. In particular, every Dedekind self-map j : A → A with critical point a generates a blueprint for the set of natural numbers, in the form W = {a, j(a), j(j(a)), . . .} ⊆ A; indeed, j W : W → W is itself an initial Dedekind self-map, isomorphic to s : ω → ω and naturally embedded in every other Dedekind self-map. We consider next another equivalent form of Dedekind self-map that reveals another characteristic underlying the notion of the infinite. 10 Co-Dedekind Self-Maps, Dedekind Maps, and co-Dedekind Maps In our effort to capture a deeper meaning of the notion of the infinite in our Axiom of Infinity, we isolated the concept of a Dedekind self-map. This notion gives mathematical expression to one end of a polarity that characterizes the Infinite, according to the ancient wisdom, namely, the dynamics of expansion to the infinite from a singularity. We have seen that this expansion takes place, for a given Dedekind self-map j : A → A with critical point a, by repeated applications of j to its critical point, producing the infinite sequence a, j(a), j(j(a)), . . .. The other half of this polarity, is, according to the ancient view, the dynamics of collapse, by which the vast diversity returns to the point value from which it originates. In a rather natural way, these dynamics are expressed mathematically using the dual notion of a co-Dedekind selfmap. Let us say that an onto self-map h : A → A is co-Dedekind if some preimage h−1 (a), for some a ∈ A, has two or more elements. Whenever a ∈ A is such that |h−1 (a)| ≥ 2, any element of h−1 (a) will be called a co-critical point of h. 46 Recall that, whenever h : A → A is onto, there is, by the Axiom of Choice, a function s : A → A whose range contains exactly one element from each preimage h−1 (a), for a ∈ A; such an s is called a section of h. Clearly, any section of an onto map must be 1-1. Moreover, we have the following: Theorem 22. (ZFC-Infinity). The following are equivalent: (1) There is a Dedekind self-map. (2) There is a co-Dedekind self-map. Proof. Suppose there is a Dedekind self-map j : A → A with critical point a. Let a0 ∈ ran j. Define h : A → A by ( a0 if x 6∈ ran j h(x) = y otherwise, where y ∈ A is unique such that j(y) = x. Under the definition, we have that h(a) = a0 since a is a critical point of j. It is obvious that h : A → A is onto since even h ran j is onto. It follows that some b = j(x) ∈ ran j is also mapped by h to a0 , since h ran j is onto. Certainly b 6= a since a 6∈ ran j. Therefore, |h−1 (a0 )| ≥ 2, so h is co-Dedekind. Conversely, suppose h : A → A is co-Dedekind with co-critical point a and suppose s : A → A is a section. We show s itself is Dedekind. We have already observed that s is 1-1. Let x ∈ A be such that |h−1 (x)| ≥ 2, and let u 6= v ∈ A be elements of h−1 (x). Then one of u, v does not belong to the range of s; in particular, one of u, v is a critical point of s. The argument shows that any section s of a co-Dedekind self-map h : A → A is itself a Dedekind self-map; moreover at least one co-critical point of h is a critical point of s. We introduce a somewhat specialized context to bring to light the “collapsing” effect of a co-Dedekind self-map. Let us say that a set A is closed under singletons if, for all x ∈ A, we have {x} ∈ A; more generally, A is closed under pairs if, whenever x, y ∈ A, {x, y} ∈ A. It turns out that, in studying co-Dedekind self-maps A → A, there is nothing lost if we assume A is a transitive set closed under pairs: Proposition 23. There is a co-Dedekind self-map on a set if and only if there is a co-Dedekind self-map on a transitive set that is closed under pairs. Moreover, any set that is closed under pairs admits a co-Dedekind self-map. Proof. Suppose h : A → A is a co-Dedekind self-map on A. Let t : A → A be a section of h that is a Dedekind self-map with critical point a. Then the successor s : ω → ω is, by Theorem 17(1), uniquely embedded in h. Let S : V → V be defined by S(x) = {x}. By Theorem 18, there is a unique τ̄ : ω → V such that τ̄ (0) = a and τ̄ ◦ S = s ◦ τ̄ . In particular, there is a set W whose elements are precisely a, {a}, {{a}}, . . .—that is, W = {a, {a}, {{a}}, . . .}, 47 and S W : W → W is initial. In particular, (ω, s, 0) and (W, S W, a) are Dedekind self-map isomorphic. Claim. There is a set B that is both transitive and closed under pairs that admits a Dedekind self-map. Indeed, there is such a set B that includes A. Proof of Claim. We first observe that for any set C, we can obtain a set T (C) ⊇ C that is closed under pairs: Define C = C0 ⊆ C1 ⊆ · · · as follows: Let C00 = [C0 ]2 (where, for any set 0 34 D, [D]2 denotes the set of all unordered pairs S from D), and let C1 = C0 ∪ C0 . In general , 0 0 2 Cn+1 = Cn where Cn = [Cn ] . Let T (C) = n∈ω Cn . If u, v ∈ C, then for some n, u, v ∈ Cn , and so {u, v} ∈ Cn+1 ⊆ T (C). Now we build a set B as the union of the following chain: A = A0 ⊆ B0 ⊆ A1 ⊆ B1 ⊆ · · · , where, for each i ∈ ω, Bi = T (Ai ) and Ai+1 is a transitive set that contains Bi . If u, v ∈ B, then u, v ∈ Bi for some i, and so, because Bi = T (Ai ), {u, v} ∈ Bi ⊆ B, and so B is closed under pairs. Also, if u ∈ v ∈ B, then v ∈ Ai for some i > 0, and so, by transitivity of Ai , u ∈ Ai ⊆ B; this shows B is transitive, as required. We obtain a Dedekind self-map t̂ : B → B as follows: ( b if b 6∈ A t̂(b) = t(b) if b ∈ A It is easy to see that t̂ is a Dedekind self-map. To complete the proof of the Proposition, we simply recall that by the proof of Theorem 22, whenever there is a Dedekind self-map B → B, there is also a co-Dedekind self-map B → B. Finally, for the “moreover” clause, notice that if B is closed under pairs, it is closed under singletons. If x ∈ B, then W = {x, {x}, {{x}} . . .} is a copy of ω in B, and so B is infinite, and therefore must admit a co-Dedekind self-map (by the first half of the Proposition). Example 1 (Generate/Collapse Duality). Let A be a transitive set that is closed under pairs. We assume that A is a set “in the universe” in the sense that it is governed by the usual axioms of set theory. (In particular, no element of A is an element of itself.) Consider the self-map F : A → A, defined by F (x) = any ∈-minimal element of x.35 34 Formally, we are using Theorem 18 here. Define j : V → V by j(x) = {{u, v} | u ∈ x and v ∈ x}; j is a Dedekind self-map. As in the Theorem, there is a unique τ : ω → V satisfying jτ = τ s, where s : ω → ω is the successor function. Then define Cn = τ (n) for each n ∈ ω. This gives us the sequence hC0 , C1 , C2 , . . .i as in the main text, and one may then form the union, as described there. This formal treatment makes it clear that one obtains a set T (A) for any set A only if ω already exists. 35 F (∅) is arbitrary. The notion of an ∈-minimal element of a set is defined on p. 17. 48 Because A is transitive, ran F ⊆ A. We also observe that F is onto: Suppose y ∈ A. Then {y} ∈ A, and clearly F ({y}) = y. Finally, suppose z ∈ A and consider the sets x = {z} and y = {z, {z}}. The fact that no set is an element of itself ensures that z, {z} are disjoint, and so F (y) = z = F (x). Thus, |F −1 (z)| > 1. We have shown F is a co-Dedekind self-map. Let SA = S A : A → A, where, we recall, S(x) = {x} for all x ∈ A. We show that SA is a section of F by showing F ◦ S = idA : F (S(x)) = F ({x}) = x. The dual notions of Dedekind self-map and co-Dedekind self-map are expressed in the selfmaps SA and F . Certainly, SA plays the role of generating a blueprint for the natural numbers: 2 (a), . . .} is an initial Dedekind self-map. We wish to show that, conGiven a, SA {a, SA(a), SA versely, F plays the role of collapsing the values of A to their point of origin. We show that for every x ∈ A, there is n ∈ ω such that F n (x) = ∅. Suppose not. Then for each n ∈ ω, F n (x) 6= ∅. It follows that the following is an infinite descending ∈-chain: · · · F n (x) ∈ F n−1 (x) ∈ · · · ∈ F (x) ∈ x. Such chains cannot exist in the presence of the Axiom of Foundation. Let E = {in | n ∈ ω}, where, for each n, in : AA → AA is defined by in (f ) = f n , the nth iterate of f . We have the following: Proposition 24. Suppose A is a transitive set that is closed under pairs. Let W = {∅, {∅}, {{∅}}, . . .} ⊆ A. (1) For every x ∈ A—in particular, for every x ∈ W —there is i ∈ E such that i(F )(x) = ∅. (2) For every x ∈ W , there is i ∈ E such that i(SA)(∅) = x. The Proposition indicates how every element of A is “returned to its source” via the interplay of F and a naturally occurring set E of functionals defined over A. Likewise, through the interplay of SA and E, we also see once again in the present context how a blueprint for the whole numbers is generated. In summary, the dual self-maps SA and F play the roles, respectively, of “generating a blueprint” and “returning elements to their source.” The concept of a “blueprint” appears naturally in the context of large cardinals, as we shall see in later sections. In the next section, we make the notion of a blueprint, suggested by our results here, more precise. The portions of our work here that we wish to emphasize, and that will generalize nicely, is that any Dedekind self-map A → A produces a kind of blueprint for the natural numbers, and also that in the special case in which A is transitive and closed under pairs, there are dual maps SA W : W → W and F W : W → W (where W is the set of iterates ∅, SA(∅), SA(SA (∅)), . . .) which play the roles of generating the elements of W from ∅ and returning elements of W to ∅. 49 10.1 Dedekind Maps and co-Dedekind Maps We close this section by considering two slight generalizations of Dedekind self-maps and coDedekind self-maps. Let us say that a function f : A → B is a Dedekind map if (1) |A| = |B| (2) f is 1-1 but not onto. We will call any element b ∈ B not in the range of f a critical point of f . Dually, we define a function g : A → B to be a co-Dedekind map if (1) |A| = |B| (2) g is onto but not 1-1. We will call any element a ∈ A that belongs to some g −1 (b), for some b ∈ B, a co-critical point of g if |g −1 (b)| > 1. As we now show, Dedekind maps factor as a composition of a bijection and a Dedekind self-map. Proposition 25. Suppose f : A → B and b ∈ B. Then f is a Dedekind map with critical point b if and only if there exist functions π, j so that f = π ◦ j, where π : A → B is a bijection and j : A → A is a Dedekind self-map with critical point π −1 (b). f A @ - B j@ π R @ A Proof. For one direction, suppose A, B are sets, b ∈ B, and f : A → B can be factored as f = π ◦ j, where π : A → B is a bijection and j : A → A is a Dedekind self-map with critical point π −1 (b). We show f is a Dedekind map. Certainly, f is 1-1. We show b is a critical point of f . Since π is onto, there is a ∈ A with π(a) = b. Note that a = π −1 (b) is, by assumption, a critical point of j. If b ∈ ran f , then let x ∈ A be such that f (x) = b. Then by commutativity of the diagram, π(j(x)) = b, and so j(x) = π −1 (b) = a, which is impossible since a 6∈ ran j. We have shown b is a critical point of f , f is 1-1, and |A| = |B| (by way of π), as required. For the other direction, suppose f : A → B is a Dedekind map with critical point b. Since |A| = |B|, there is a bijection π : A → B. Let B0 = f [A] ⊆ B and let A0 = π −1 [B0 ] ⊆ A. Let π0 = π A0 . We show ran π0 = B0 : Suppose y ∈ A0 . Then y = π −1 (x) for some x ∈ B0 ; that is, for some x ∈ B0 , π(y) = x. This shows ran π0 ⊆ B0 . If x ∈ B0 , then let y ∈ A0 with π −1 (x) = y. Then x = π(y) ∈ ran π. We have shown therefore that π0 is onto. Since π0 is a restriction of the bijection π, π0 is also 1-1. Therefore, π0 : A0 → B0 is a bijection. 50 Note that f : A → B0 is a bijection. Let j = π0−1 ◦ f : A → A0 . Then j is a bijection, and so, viewed as a map j : A → A, j is a Dedekind self-map. The following diagram is commutative: f A - B0 @ j@ R @ π0−1 A0 We claim that π −1 (b) is a critical point for j : A → A: Suppose j(x) = π −1 (b) for some x ∈ A. Then b = π(j(x)) = π(π −1 (f (x))) = f (x), which is impossible because b is a critical point of f . Now notice that, for any x ∈ A, since j(x) = π0−1 (f (x)), then π(j(x)) = π0 (j(x)) = f (x). We have shown that f = π ◦ j, π is a bijection, and j : A → A is a Dedekind self-map with critical point π −1 (b), as required. Any decomposition of a Dedekind map f by f = π ◦ j as in Proposition 25 will be called a Dedekind factorization of f . The proof shows that for each bijection π : A → B we can obtain a Dedekind self-map j : A → A so that f = π ◦ j. If π and π 0 are bijections from A to B whose inverse images of f [A] are distinct, then the resulting Dedekind self-maps j = π −1 ◦ f and j 0 = (π 0)−1 ◦ f are also distinct. It follows that every Dedekind map has infinitely many distinct Dedekind factorizations, as the following Lemma shows: Lemma 26. Suppose A, B are infinite sets with |A| = |B|, b ∈ B, and B0 ⊆ B − {b} also infinite. Then for k ∈ ω, there exist bijections πk : A → B with the property that, whenever k1 6= k2 are finite ordinals, πk−1 (B0 ) 6= πk−1 (B0 ). 1 2 Proof. Let B1 = {b1 , . . . , bn, . . .} ⊆ B0 be a countably infinite subset of B0 . Let π0 : A → B be any bijection. For n = 1, 2, 3, . . ., let an ∈ A be such that π0 (an ) = bn , and let a be such that π0 (a) = b. For each n > 0, we define another bijection πn : A → B as follows:   π0 (u) if u 6= a and u 6= an πn (u) = b if u = an   bn if u = a Now if k 6= n, then ak 6∈ πk−1 (B0 ) but ak ∈ πn−1 (B0 ), so πk−1 (B0 ) 6= πn−1 (B0 ), as required. For co-Dedekind maps, we have the following dual to Proposition 25. Proposition 27. Suppose g : A → B and a ∈ A. Then g is a co-Dedekind map with co-critical point a if and only if there exist functions π, h so that g = h ◦ π, where π : A → B is a bijection and h : A → A is a co-Dedekind self-map with co-critical point π(a). 51 g A @ - B π@ h R @ B Proof. First, suppose g : A → B is co-Dedekind with co-critical point a ∈ A. Since |A| = |B|, we let π : A → B be any bijection. Then define h : B → B by (+) h(π(x)) = g(x). This definition guarantees that the diagram above is commutative. We show h is a co-Dedekind self-map with co-critical point π(a). Clearly h = g ◦ π −1 is onto. Let b ∈ B be such that g(a) = b, and let a0 be such that g(a0) = b as well (since |g −1 (b)| > 1). It follows from (+) that (++) h(π(a)) = b = h(π(a0)), and so |h−1 (b)| > 1, and h is a co-Dedekind self-map. Moreover, by (++), π(a) ∈ h−1 (b), so π(a) is a co-critical point of h. Conversely, suppose g : A → B, a ∈ A, and g = h ◦ π, where π : A → B is a bijection and h : A → A is a co-Dedekind self-map with co-critical point π(a). We show g is a co-Dedekind map with co-critical point a. Since both h and π are onto, it follows that g is onto. Since π(a) is co-critical for h, there is a0 ∈ A so that h(π(a)) = b = h(π(a0 )). It follows that g(a) = b = g(a0 ), so |g −1 (b)| > 1 and a ∈ g −1 (b). We have shown g is a co-Dedekind map with co-critical point a. The next proposition shows the relationship between Dedekind maps, co-Dedekind maps, and infinite sets. Proposition 28. (ZFC – Infinity) The following are equivalent for any set A: (1) A is infinite. (2) A is the domain of a Dedekind map. (3) A is the codomain of a Dedekind map. (4) A is the codomain of a co-Dedekind map. Proof of (1) ⇔ (2). Suppose f : A → B is a Dedekind map. Then, as in the proposition, there is a Dedekind self-map j : A → A, and so j : A → j[A] is a bijection between A and one of its proper subsets. Therefore, A is Dedekind-infinite (hence, infinite). Conversely, suppose A is Dedekind-infinite and A0 ⊆ A is a proper subset for which there is a bijection j : A → A0 . Then, viewed as a map A → A, j is a Dedekind self-map. Trivially, then, we have a Dedekind map 52 (letting B = A, π = idA , and f = j). Proof of (1) ⇔ (3). If A is infinite, let x ∈ A; then the inclusion map A−{x} ,→ A is a Dedekind map. Conversely, if g : B → A is a Dedekind map, then by (2) ⇒ (1), A is infinite. Proof of (4) ⇒ (3). Suppose g : B → A is a co-Dedekind map with co-critical point in g −1 (a). Let s : A → B be a section of g. Then s is 1-1 but not onto since at least one element of g −1 (a) is not in the range of s. It follows that s is a Dedekind map, and A is the domain of a Dedekind map. By (2) ⇒ (3), A is also the codomain of a Dedekind map. Proof of (2) ⇒ (4). Suppose f : A → B is a Dedekind map with critical point b ∈ B. Let b0 ∈ f [A], and suppose a ∈ A with f (a) = b0 . Define g : B → A as follows: ( x if f (x) = y g(y) = a if y 6∈ ran f Given u ∈ A, let v = f (u) for some v ∈ B; then g(v) = u. Therefore g is onto. Since g(b0) = a = g(b), |g −1 (a)| > 1, so g is a co-Dedekind map and A is the codomain of a coDedekind map. The next Proposition highlights the close relationship between Dedekind maps and arbitrary functions that are 1-1 and not onto, and also between co-Dedekind maps and arbitrary functions that are onto but not 1-1. Proposition 29. (1) Suppose h : A → D is 1-1 but not onto. Then there is B ⊆ D such that ran h ⊆ B and h : A → B is a Dedekind map. In particular, h factors as i ◦ h where h is h : A → B and i : B ,→ D is the inclusion map. h A @ h@ R @ - D i B (2) Suppose k : D → A is onto but not 1-1. Then there is B ⊆ D such that k B : B → A is a co-Dedekind map. In particular, k factors as k = (k B) ◦ t where t = sB ◦ k and sB : A → B is a section of k B. k D @ t=sB ◦k@ A kB R @ B 53 - Remark 4. In part (2), the definition of t need not be quite so precise. In fact, if we define t by ( u if u ∈ B t(u) = arbitrary otherwise commutativity of the diagram is still guaranteed. Proof of (1). Since h is 1-1 but not onto, A and D must both be infinite. Let d ∈ D lie outside the range of h. Let B = h[A] ∪ {d}. Clearly, |A| = |B| and h : A → B is 1-1 and not onto. Therefore, h : A → B is Dedekind. Proof of (2). Let s : A → D be a section of k and let B = ran s ∪ {y}, where y ∈ D − ran s. Let x ∈ A be such that k(y) = x. Since s : A → ran s is a bijection, there is y 0 ∈ ran s such that k(y 0 ) = x. It follows that k B : B → A is a co-Dedekind map. Let sB be the map i ◦ s : A → B, where i : ran s ,→ B is inclusion. Letting t = sB ◦ k, it follows that k = (k B) ◦ t; that is, the diagram displayed in part (2) of the Proposition is commutative. Remark 5. We observe that Dedekind maps have essentially the same preservation properties as Dedekind self-maps: Given a Dedekind map f : A → B, in a sense we will now specify, the “image” of f is again the domain of a Dedekind map and is itself a Dedekind infinite set. Moreover, iterating the process of obtaining new Dedekind maps leads to a sequence of critical points that provides a blueprint for the set ω of finite ordinals, just as in the case of Dedekind self-maps. Suppose f : A → B is a Dedekind map with critical point b and let f = π◦j be a factorization. f A @ - B j@ π R @ A As before, let B0 = f [A], A0 = π −1 [B0 ] ⊆ A, π0 = π A0 : A0 → B0 , a = π −1 (b), and we have crit j = a. Notice A0 = π −1 [B0 ] = π −1 [f [A]] = j[A]. Therefore, since a 6∈ ran j, a 6∈ A0 . Let j0 = j A0 . We verify that j(a) is a critical point of j0 : If not, let z ∈ A0 with j0 (z) = j(a). Then j(z) = j(a) and z = a, but this is impossible since a 6∈ A0 . Let f0 = f A0 . We check that ran f0 ⊆ B0 : Given x ∈ A0 , f (x) ∈ f [A] = B0 . We note that f (a) = π(j(a)) is a critical point for f0 : If f0 (x) = π(j(a)) = f (a), then x = a, which is impossible since a 6∈ A0 . Since |A0 | = |B0 | (via π0 ) and f0 : A0 → B0 is 1-1 and not onto, it follows that f0 is a Dedekind map. And once again, A0 is Dedekind-infinite because j0 : A0 → A0 is a Dedekind self-map. Summarizing the discussion so far, Dedekind maps have essentially the same preservation properties as a Dedekind self-map: Given a Dedekind map f : A → B with critical point a and witnessing bijection π : A → B, A0 = π −1 [ran f ] is, like A, the domain of another Dedekind map 54 f A0 , with critical point π(j(a)). And, since there is a Dedekind self-map (obtained from f0 , π0 ) from A0 to itself, A0 is itself, like A, Dedekind-infinite. f0 A0 - B0 @ j0@ π0 R @ A0 Moreover, similar logic leads to a sequence of Dedekind maps, f : A → B, f0 : A0 → B0 , f1 : A1 → B1 , . . ., with critical points b, π(j(a)), π(j(j(a))), . . ., which is the sequence (∗) (π ◦ j ◦ π −1 )0 (b), (π ◦ j ◦ π −1 )1 (b), (π ◦ j ◦ π −1 )2 (b), . . ., together with witnessing bijections π, π0, π1 , . . ., where each of A, A0 , A1 , . . . is Dedekind-infinite. From previous work, it is not hard to show that (∗) determines a sequence that is Dedekind selfmap isomorphic to s : ω → ω. This can be shown by first letting W 0 be the smallest j 0 -inductive subset of B, where j 0 = π ◦ j ◦ π −1 . Then one shows that there is a unique Dedekind self-map isomorphism τ 0 : ω → W 0 taking 0 to b and making the following commute: ω s - τ0 ? W0 ω τ0 j0 - ? W0 Indeed, τ 0 (n) = (j 0)n (b) is the required map. Therefore, in the same sense in which a Dedekind self-map j : A → A gives rise to a kind of blueprint W ⊆ A for ω, so likewise any Dedekind map f : A → B gives rise to a blueprint W 0 ⊆ B for ω. To close this section, we show that Dedekind (self)-maps and co-Dedekind (self)-maps, together with bijections, are the building blocks of functions whose domain, codomain, and range all have the same infinite size. For any such function g : A → B, we show below that exactly one of the following must be true: (1) g is a bijection; (2) g is a Dedekind map; (3) g is a co-Dedekind map; (4) g factors as g = i ◦ f , where i is Dedekind map and f is a co-Dedekind map. Remark 6. Part (4) is a more informative variant—in the special case in which g is neither 1-1 nor onto and its domain, codomain and range all have the same infinite size—of the fact that every function g : A → B can be factored as g = i ◦ f where f is onto and i is 1-1 (known as an epi-monic factorization). 55 Proposition 30. Suppose g : A → B is a function that is not a bijection. Let C = ran g. Suppose also that |A| = |C| = |B|. Then: (A) If g is 1-1, then g : A → B is a Dedekind map. (B) If g is onto, then g : A → B is a co-Dedekind map. (C) If g is neither 1-1 nor onto, then g factors as g = i ◦ f where f = g : A → C is co-Dedekind and i : C → B is Dedekind. Proof of (A) and (B). If g is 1-1 then since g is not a bijection, it is not onto; since |A| = |B| as well, it follows that g is Dedekind. If g is onto, then g is not 1-1, and, by similar reasoning, g is co-Dedekind. Proof of (C). Since g is not 1-1, it follows that g : A → C is co-Dedekind. Let i : C ,→ B be the inclusion map. Since C 6= B (g is onto C but not onto B, so C 6= B), i is a Dedekind map. The result follows. 11 Blueprints Generated by a Dedekind Self-Map In this section, we give a more formal treatment of the concept of a blueprint, introduced in the last section. This approach will generalize well to broader contexts and will provide further evidence of the rich foreshadowing of large cardinals suggested by the notion of a Dedekind self-map. Starting with a Dedekind self-map j : A → A with critical point a, and a set X ⊆ A, we describe in a reasonably precise way what we mean by a blueprint for X. Because the intended meaning can vary considerably depending on the context, our treatment will leave the notion of a blueprint somewhat underspecified. The intention is that we have a class E of structurepreserving maps—preserving structure in the same way that j preserves structure—definable from j. Through the interaction of j, a, and E, a self-map f (either Dedekind or co-Dedekind) is obtained, which encodes the set X. The map f is the blueprint for X, and E plays the role of encoding and decoding the blueprint. In particular, it should be the case that for every x ∈ X, there is i ∈ E such that i(f )(a) = x. We will also consider the special case in which there is also a dual g to f (so that either f is a section of g or g is a section of f ), playing the role of “collapsing” X to a, with the property that for every x ∈ X, there is i ∈ E such that i(g)(x) = a. Here are the details: Definition 3 (Blueprints and Strong Blueprints). Suppose j : A → A is a Dedekind self-map with critical point a and X is a set. A j-blueprint (or simply a blueprint) for X is a triple (f, a, E) having the following properties: (1) For some set B, f is a map B → B, either a Dedekind self-map or a co-Dedekind self-map, called the blueprint map. (2) a is a critical point of j and is called the blueprint seed. 56 (3) E is a set of “structure-preserving” functionals—having essentially the same structurepreserving characteristics as j itself—“compatible with” j, such that, for each i ∈ E: (a) dom i ⊇ B B . (b) i(f ) is a function and dom i(f ) ⊇ B. (c) a ∈ dom i(f ). The collection E is called a blueprint coder. (4) (Encoding) The self-map f is defined from E, j, a. (5) (Decoding) For every x ∈ X, there is i ∈ E such that i(f )(a) = x. Moreover, a strong j-blueprint (or simply a strong blueprint) for X is a quadruple (f, g, a, E) such that (f, a, E) is a blueprint for X and (A) dom g = dom f (B) Either f is a section of g or g is a section of f (C) For each i ∈ E, dom i(g) = dom i(f ) (D) For every x ∈ X, there is i ∈ E such that i(g)(x) = a. We have under-specified E in our requirement that its members are “structure-preserving” and “compatible with j.” Presumably, with additional work and a study of a wider range of examples, we can make this requirement more precise. For the examples we know about so far, when our starting point is a Dedekind self-map j : A → A, as described in the previous section, a member of E = {i0 , i1 , . . . , in, . . .} is “structure-preserving” if it is a Dedekind map (with exceptions in the base cases n = 0, 1), and is “compatible with j” if it is definable from j and its critical point, without extra parameters. When we expand to the context of Dedekind self-maps j : V → V , we will require the elements of E to be elementary embeddings, and compatibility with j will have the technical meaning discussed in [6] of compatibility of a class of set elementary embeddings relative to an ambient embedding j : V → V . Intuitively, if (f, a, E) is a blueprint for X, E will have been defined in some way from j, a, and has been used to encode X in the map f , and E may be used to “read” f , recovering the elements of X. As we shall show in a moment, the notion that the iterates a, j(a), j(j(a)), . . . obtained from a Dedekind self-map j : A → A form a “blueprint” of the set ω of finite ordinals can be expressed in this framework. In later sections, we will show that analogous blueprints arise with various large cardinals, providing evidence for our hypothesis that properties of Dedekind self-maps naturally generalize to the context of large cardinals. The work in the previous section provides us with an example of both a blueprint and a strong blueprint, as we now show. 57 Example 2 (Blueprint for ω). Suppose j : A → A is a Dedekind self-map with critical point a. Let W denote the smallest j-inductive set in A. (W corresponds to the set B mentioned in the definition of blueprint.) Let π : W → ω be the Mostowski collapsing isomorphism. We define E = {i0 , i1, i2 , . . .} as follows: For each n, in : W W → ω W is defined by ( π if n = 0 in (h) = π ◦ j n−1 ◦ h if n > 0 Note that when n = 0, in is the constant function h 7→ π and when n = 1, in (h) = π ◦ h. E is the blueprint coder. Define f = j W : W → W ; f is the blueprint map. We verify that (f, a, E) is a blueprint for ω. We verify points (1)–(5) from the definition. Parts (1) and (2) are obvious. For (4), as f = j W , we see f is definable from j and W . However, W itself (see Remark 1) is definable from j and a. For (3), parts (a)–(c) are easy to check. For the main statement of (3), first it is clear (by inspecting the proof of the Mostowski Collapsing Theorem) that π itself is definable from j and each in is definable from j and π. For n > 1, in is a Dedekind map: It is easy to see in is 1-1, and, for each such n, the map r : W → ω, defined by r(j n (a)) = sn (0), fails to be in the range of in , and so each in , for n > 1, is a Dedekind map; in particular, each such in has the same “kind” of preservation properties as j itself, as discussed in Remark 5. For n = 0, 1, in is not a Dedekind map; these are the exceptional base cases. One can easily check that, in fact, i1 is a bijection. For (5), certainly i0 (f )(a) = π(a) = 0. Let n ∈ ω, n ≥ 1. Then in (f )(a) = π(j n−1 (f (a))) = π(j n(a)) = sn (π(a)) = sn (0) = n. We have shown that (f, a, E) is a blueprint for ω. In Example 2, we see how j : A → A provides a blueprint for ω and how the blueprint coder E extracts the finite ordinals from the blueprint. The next example exhibits a strong blueprint for W . Example 3 (Strong Blueprint for W ). Suppose A is a transitive set closed under pairs. Define SA : A → A and F : A → A as in Example 1. We treat SA as a Dedekind self-map with critical point ∅. Let W ⊆ A be the smallest SA -inductive set in A and let π : W → ω be the Mostowski collapsing isomorphism. Let f = SA W : W → W and let g = F W : W → W . Define a collection E = {i0 , i1, . . .} of functionals as follows: For each i ∈ E, dom i = W W . For each h ∈ WW: i0 (h) = idW in (h) = hn for n ≥ 1. Then (f, g, ∅, E) is a strong blueprint for W . We verify the properties of a blueprint. For (1), the set B is W = {∅, {∅}, {{∅}}, . . .}, and, as shown in the previous section, f is a Dedekind self-map with critical point ∅ and g is a co-Dedekind self-map. Parts (2) and (4) are clear. 58 For (3), parts (a)–(c) are immediate; for the main part of (3), as in the previous propostion, when n = 0, in is constant; when n = 1, in is a bijection; and for n > 1, in is a Dedekind self-map, with critical point f . Also note that each the elements of E are definable from ω, which, as previous work shows, is definable from SA and ∅. For (5), certainly i0 (f )(a) = a. For n ≥ 1, we have in (f )(a) = f n (a) = j n (a), as required. The additional properties (A)–(D) for strong blueprints were established in the previous section. In particular, g is a co-Dedekind self-map and for each n ∈ ω, in (g)(j n(a)) = F n (j n (a)) = ∅. 12 A First Look at Dedekind Self-Maps of the Universe One of our motivations for introducing the New Axiom of Infinity is to wrap within the formulation of the set theory axiom “there exists an infinite set” some intuition about the nature of “the infinite”—the intention is that, by including an intuition of this kind, the axiom can suggest a direction for answering questions about the mathematical infinite that cannot be resolved by the ZFC axioms alone. The intuition that we have wished to include is the notion that the “essence” of infinite collections consists of underlying self-referral dynamics of an unbounded field, represented mathematically as a Dedekind self-map. To express Cantor’s original insight that “infinity exists” by saying “a Dedekind self-map exists” is to say that it is enough that the axiom should assert the existence of the source of infinite collections; then one derives the existence of concrete infinite collections. One fundamental question that has been of interest to set theorists for many decades, is the Problem of Large Cardinals: Is there a natural axiom that can be added to the axioms of ZFC on the basis of which the known large cardinals can be derived?36 Since this is a question about the mathematical infinite, it is reasonable to consult the Axiom of Infinity to get some hint about how we might strengthen the axiom in a natural way to produce a new axiom that could account for large cardinals. But the usual Axiom of Infinity tells us very little. On the other hand, our New Axiom of Infinity suggests a direction that one would be unlikely to consider on the basis of the usual Axiom of Infinity. We put together two clues as we seek new axioms to justify large cardinals: (1) Many of the known large cardinals have a global character, asserting things about the universe V as a whole. (2) The New Axiom of Infinity (which has “wrapped within it” some intuition about the origin of infinite sets) asserts the existence of a Dedekind self-map. Based on these clues, we conjecture that the axiom for large cardinals that we are seeking is of the form “There is a Dedekind self-map defined on the universe of all sets.” We therefore are interested in investigating Dedekind self-maps of the form j : V → V . 36 Large cardinals will be introduced more systematically in Section 21. 59 Based on our work so far, we can describe properties we would expect such a j : V → V to have. Properties of a Dedekind self-map j : A → A with critical point a that we have discovered so far are: (A) The self-map j preserves essential properties of its domain (A is Dedekind infinite, and so is the image j[A] of j) and of itself (the property of being a Dedekind self-map propagates to the restriction j j[A]). (B) The definition of j entails a critical point that plays a key role in its dynamics. One aspect of those dynamics is that repeated restrictions of j to successive images give rise to a critical sequence W = {a, j(a), j(j(a)), . . .} which is a precursor to the set ω of finite ordinals; the emergence of this critical sequence provides a strong analogy to the ancient and quantum field theoretic perspective that “particles arise from the dynamics of an unbounded field,” where, in this context, particular finite ordinals are viewed as “particles.” The sequence of restrictions j0 , j1, j2 , . . . of j that produce the critical sequence is defined by the clauses: A0 = A j0 = j : A → A crit(j0 ) = a A1 = j[A0 ] j1 = j A1 crit(j1 ) = j(a) An+1 = j[An] jn+1 = j An+1 crit(jn+1 ) = j n+1 (a). Therefore, crit(jn ) = j n (a) is a critical point of jn . (C) Through the interplay between j and its critical point, a blueprint (j W, a, E) of the set ω of finite ordinals arises. In particular, j W is a Dedekind self-map, and for every n ∈ ω, there is i ∈ E such that i(j W )(a) = n. (D) In special cases (starting with a transitive set A that is closed under pairs), there also arises a strong blueprint (S W 0 , F W 0 , ∅, E 0), where S is the singleton map x 7→ {x} (which is a Dedekind self-map), W 0 = {∅, {∅}, {{∅}} . . .}, F takes a set to a ∈-minimal element of the set (F is a co-Dedekind self-map and S A is a section of F ), and E has the property that for all x ∈ W 0 there are i, k ∈ E such that i(S W )(∅) = x and k(F W )(x) = ∅. In this case, the strong blueprint gives expression to both the generating effect of a Dedekind self-map and the collapsing effect of the dual map, a co-Dedekind self-map. Using (A)–(D) to guide intuition about the “nature of the infinite,” as we expand our consideration from local to global, it is natural to search for Dedekind self-maps j : V → V that exhibit similar properties. We, in a sense, expect to find Dedekind self-maps that: (A) Preserve essential properties of their domain 60 (B) Have a critical point that plays a key role in the dynamics of j. Moreover, restrictions of j should give rise to a sequence of critical points and reveal further dynamics of j. (C) Give rise to a blueprint for some set related to the critical point. (D) Give rise to a strong blueprint for some other set related to the critical point. As we begin studying Dedekind self-maps defined on the universe V , an issue comes into view that needs to be resolved. Consider the global successor function s defined by s(x) = x ∪ {x}. Certainly this is an example of a Dedekind self-map defined on the universe of sets, having critical point ∅ (just like the successor function on ω). However, the definition of this function does not imply the existence of an infinite set. Suppose we attempt to carry out the proof given above, which showed that from a set Dedekind self-map, we can derive s and the set ω of natural numbers, but now using s in place of the set Dedekind self-map. The first thing to notice is that the proof in Theorem 10, showing that N is the smallest s̄-inductive set, doesn’t work. The problem is that, in order for there to be a smallest inductive set, there must be at least one inductive set in the first place. Certainly V itself is inductive (but it’s not a set, as we shall discuss in the next section), and certainly we can form the sequence ∅, s(∅), s(s(∅)), . . . by iterating s̄ (and, intuitively, this collection should form an inductive set), but we have no guarantee that all of these values belong to any single set; therefore, in this case, forming the intersection of all inductive sets may result in the empty set, and forming the intersection of all inductive classes may result in a proper class. Thus, N does not emerge from the dynamics of s̄ : V → V as it does from any Dedekind self-map defined on a set. The fact is that the definition of s is just as valid37 in ZFC − Infinity as it is in ZFC.38 This means that the existence of a Dedekind self-map on V cannot properly be viewed as an axiom of infinity even though existence of a set Dedekind self-map can. Nevertheless, it is reasonable to expect that one could introduce slight strengthenings of the properties of a self-map ̄ : V → V that would imply the Axiom of Infinity (or the New Axiom of Infinity). Our next step, therefore, is to formulate several strengthenings of this kind, which we can then use for further intuition concerning how to proceed in the direction of justifying not just infinite sets, but large cardinals as well. To take this step, we will need a deeper understanding of the universe V and of the notion of a class. 37 In fact, s is definable on any transitive class that is closed under formation of pairs and unions; in other words, s is definable in the theory {Pairing,Union}, which is just a tiny fragment of ZFC. 38 One consequence of this observation is the fact that the existence of a Dedekind self-map on a set is stronger than the existence of such a map on the universe V . We mention briefly here the precise sense in which this is true (the reader unfamiliar with consistency results concerning ZFC can skip this note). Using only ZFC-Infinity, we can demonstrate the existence of a Dedekind self-map from the universe to itself (the generalized successor s̄ : V → V is an example). However, the existence of a Dedekind self-map on a set, in conjunction with the other axioms of ZFC, is sufficient to construct a model of the theory ZFC-Infinity (for instance, Vω ). This shows that the existence of a Dedekind self-map on a set bears the same relationship to the theory ZFC-Infinity as the existence of large cardinals bears to the theory ZFC: One in principle cannot prove from ZFC-Infinity that a Dedekind self-map (or any kind of infinite set) exists; likewise, one cannot prove from ZFC, for the same reason, that any type of large cardinal exists. 61 13 The Classes ON and V The collection ON of ordinal numbers is to the universe V as the set of natural numbers is to the “universe” Vω . They allow us to measure the sizes of sets and arrange elements in (long) sequences. Both ON and V are examples of proper classes—collections that are too big to be sets, but that we can speak about because they are definable by a formula. In this section, we give precise defintions of these ideas. We begin with the construction of V . The universe V is built in stages, and, as we have already seen, the first few stages of V can be defined by the following inductive definition: V0 = ∅ Vn+1 = P(Vn). We also observed before that, assuming the Axiom of Infinity, we can obtain the set [ Vω = Vn . n∈ω To continue the construction further, we need to define the ordinal numbers. Definition 4 (Ordinal Numbers). An ordinal number is any transitive set X with the property that (X, ∈) is a well-ordered set.39 The collection of ordinal numbers is denoted ON. The list of all ordinals begins with the finite ordinals 0, 1, 2, . . . (defined, as we described earlier, as sets, with the property that each is the set comprised of all finite ordinals that precede it). The first infinite ordinal is ω, which, like the finite ordinals, is the set consisting of all the ordinals that precede it. Adopting the convention that, for any ordinal α, α + 1 = α ∪ {α}, the next few ordinals can be listed as follows: ω, ω + 1, ω + 2, ω + 3, . . . (note that by ω + 2 we mean (ω + 1) + 1, and so forth). The order relation (which is simply the membership relation ∈) on the ordinal numbers closely resembles that defined on the natural numbers because of the following fact, which we do not prove here: Proposition 31. Every nonempty subset of ON is well-ordered (by ∈). Because the ordinals are well-ordered, they can be used for inductively defining long sequences in exactly the same way as is done with the natural numbers. The theorem that makes this assertion is given below, Theorem 21. We begin with two more definitions. A successor ordinal is an ordinal β for which, for some ordinal α, β = α + 1 (recall α + 1 is shorthand for α ∪ {α}). All the positive natural numbers are successor ordinals: For instance, 3 = 2 + 1. A limit ordinal is an ordinal that is not a successor ordinal; equivalently, an ordinal that has no immediate predecessor. The easiest example is 0. Whether or not any other limit 39 Technically, we should say (X, ∈ X × X), since ∈ is a relation defined on all of V . 62 ordinals exist depends on whether the Axiom of Infinity holds. If not, then 0 is the only limit ordinal. If it does hold, then ω exists and it is the smallest nonzero limit ordinal. The ordinals can be used as indices to continue the construction of the stages of the universe. The proof that this is so—as was the case for inductive definitions over ω—requires a principle of induction. We state this principle first, and then show how inductive definitions are done over ON. Theorem 32 (Transfinite Induction). (ZFC-Infinity) Suppose φ(x) is a formula with ordinal parameter x and possibly other set parameters. Suppose the following three conditions hold: (1) (Basis) φ(0) holds. (2) (Successor) If α is a successor ordinal, α = β + 1, and φ(β) holds, then φ(α) also holds. (3) (Limit) If α is a limit ordinal and φ(β) holds for every β < α, then φ(α) holds. Then φ(α) holds for every α ∈ ON. For transfinite induction, the induction step involves two sub-steps—the Successor step and the Limit step. Likewise, inductive definitions across ON take into account successor and limit ordinals separately. We give the inductive definition to define the sequence hV0 , V1, V2 , . . . , Vα, . . .i of stages of the universe, using a compact format, and then follow this by the more rigorous underpinnings of this simpler format. V0 = ∅ Vα = P(Vβ ) if α = β + 1 [ Vα = Vβ if α is a limit ordinal β<α V = [ Vα α∈ON The definition of the long sequence hV0 , V1, . . . , Vα, . . .iα∈ON involves two “induction” steps, both of which are concerned with how to define the αth stage on the basis of stages Vβ for β < α. When α is a successor β + 1, Vα is computed to be the power set of the previous stage; when α is a limit, Vα is computed to be the union of all previous stages. 63 Formally, the mechanism for justifying the existence of such a long sequence is a generalization of the mechanism given in Theorem 18. We state the result without directly using Dedekind self-maps on a class, as was done in that theorem, but our result could be reworded in the language of Dedekind self-maps. Theorem 33 (Transfinite Recursion). . (ZFC − Infinity) Suppose C is a class, well-ordered by some relation <. For each x ∈ C, let Cx = {y ∈ C | y < x}. Suppose F : C × V → V is a class function. Then there is a unique class map T : C → V satisfying, for all x ∈ C, T(x) = F (x, T Cx ) . An important example of C is ON. theorem, let F : ON × V be defined by   z    P(S(ran(z))) F(α, z) = S  (ran(z))    ∅ To see how the stages of the universe emerge from the if α = 0 if z is a function on α and α is a successor if z is a function on α and α is a limit otherwise Let T : ON → V be as in the theorem. For each α, write Vα = T(α). We show by transfinite induction that the Vα satisfy the relations given in the compact definition given earlier. First, V0 = T(0) = F(0, T 0) = ∅ = 0. Next, assume α ∈ ON is a successor, so α = β + 1, and let z = T α = hVγ | γ < αi. Then [ Vα = T(β + 1) = F(β + 1, z) = P( (ran(z))) = P(Vβ ). If α is a limit Vα = T(α) = F(α, z) = [ [ (ran(z)) = Vβ . β<α Now the result follows by transfinite induction. Notice that in the theorem, we make reference to V even before we have defined the stages of V . This is because V is, first and foremost, a class, defined by a formula (namely, the formula “x = x,” so V = {x | x = x}); the representation of V as a union of stages is made possible by the Foundation Axiom (one of the ZFC axioms).40 A set is understood to be any element of V . A class is any subcollection of V that one can specify using a formula with set parameters. Any set x is, trivially, a class since x is specified by the formula (with set parameters x, y) “y ∈ x”: x = {y | y ∈ x}. A proper class is a class that is not a set. An example we have just seen is the class ON of ordinal numbers. Another example is K = {x | x is finite}. It is easy to see that a collection C defined by a formula is a proper class if and only if, for every ordinal α, there is an element of C that does not belong to Vα; in other words, there are elements of C arbitrarily high up in V , so C 6⊆ Vα for any ordinal α. So, our 40 More formally, the Foundation Axiom is used to prove that for every set x, there is an ordinal α such that x ∈ Vα . 64 two examples ON and K are proper classes because, for every α, neither α nor {α} belong to Vα (and α is an ordinal so belongs to ON, and {α} is finite, so belongs to K). A map F : A → B is a set if both A and B are sets. In some contexts, even if B is a proper class, F may be considered to be a set as long the collection of ordered pairs that determine F forms a set. (In that case, the range R of F certainly is a set, so one could, for example, consider F to be F : A → R, now presented as a set function.) On the other hand, if the collection of ordered pairs that determine F are definable from some formula, F is a class map. If, in addition, these ordered pairs form a proper class, then F is a proper class map. We will see later that it is (meaningfully) possible for F to be neither a set nor a class.41 We now state a slight strengthening of Proposition 31:42 Proposition 34. (ZFC − Infinity) Every nonempty subclass of ON has an ∈-least element. 14 The Class of Natural Numbers We mentioned at the beginning of this paper43 that one could define, without any form of the Axiom of Infinity, the class of natural numbers. Our proof that the set ω of natural numbers can be derived from a Dedekind self-map could have been simplified greatly by making use of this class, but we chose not to do so in order to emphasize that it is possible to view the natural numbers as emerging from the dynamics of a Dedekind self-map without any “help” from the natural numbers themselves. At this point, since our discipline has already born the intended fruit, we will no longer attempt to avoid using this class. We define ω as follows: ( {0} ∪ {s(α) | α ∈ ON and α < γ} if γ is the least nonzero limit ordinal ω= {0} ∪ {s(α) | α ∈ ON} if ON has no nonzero limit ordinal Theorem 35. (ZFC − Infinity) The following are equivalent: (1) The Axiom of Infinity (there is an inductive set). (2) ω = ω. (3) There is a nonzero limit ordinal. Proof. For (1) ⇒ (2), assuming there is an inductive set, we can form (in the usual way) the intersection ω of all inductive sets. It is easy to verify that ω is an inductive set, so ω ⊆ ω. The usual principle of induction and definition by induction theorems follow. By induction, one verifies that s = s ω : ω → ω and that ω = {0} ∪ {s(n) | n ∈ ω}. Since ω is defined using the first clause in the definition in this case, the result follows. For (2) ⇒ (3), we can use ω. For 41 See the footnote on p.147 for an example of such a map. This follows from the other ZFC axioms in conjunction with Proposition 31: Given a class C of ordinals, let α ∈ C. By Separation, E = {β ∈ α | β ∈ C} is a set, as is {α}; therefore, their union is a set of ordinals having a least element γ. Now, for any δ ∈ C, either δ < α or δ ≥ α. In the first case, δ ∈ E, so γ ≤ δ. In the second case, γ ≤ α ≤ δ. 43 See p.15. 42 65 (3) ⇒ (1), let γ be a nonzero limit ordinal. It is easy to see that γ is inductive, so the Axiom of Infinity holds. Whether or not any form of the Axiom of Infinity holds, ω still satisfies a form of the Principle of Induction. Theorem 36 (Class Induction Over ω). (ZFC − Infinity) Suppose C is a subclass of ω with the following properties: (1) 0 ∈ C (2) whenever α ∈ C, s(α) ∈ C Then C = ω. Proof. Let B = ω − C. Assume B 6= ∅, and let α be its least element. Since 0 ∈ C, α > 0. By the definition of ω, α is not a limit ordinal, and so has an immediate predecessor β, which must be in C. By (2), the successor of β must therefore be in C, contradicting the fact that α ∈ C. Therefore, B is empty, and the result follows. A special case of Theorem 33 allows us to recursively define sequences with indices in ω. Theorem 37 (Class Recursion Over ω). (ZFC − Infinity) Suppose F : ω × V → V is a class function and t0 is a set. Then there is a unique class function T : ω → V satisfying: (1) T(0) = t0 . (2) For all n ∈ ω, T(s(n)) = F(n, T(n)). The theorem shows that, in order to specify a sequence hxn | n ∈ ωi of sets, it is enough to specify x0 and to declare how xn+1 should be computed from xn —this latter declaration takes the form of a class function F. In practice, one uses a simpler but equivalent form of Theorem 37 to define a sequence inductively (we encountered this format before in our definition of the first stages of the universe V , p.41). To obtain a sequence hxn | n ∈ ωi, it suffices to specify a value t0 and a function t so that (1) x0 = t0 ; (2) xn+1 = t(xn ). In the context of the theory ZFC − Infinity a set X is said to be finite if |X| = n for some n ∈ ω.44 When ω = ω, of course, this new definition of “finite” coincides with the one given earlier,45 but this slightly more general definition gives meaning to the term “finite” even in the absence of infinite sets. This is the definition we will use for the rest of the paper. We can now define a set to be infinite if it is not finite. We can verify that this notion of “infinite” matches the equivalent forms given in Theorem 35. Since we have already shown that existence of a Dedekind self-map is equivalent to the Axiom of Infinity, the following suffices. We make use of the easily proven fact that a finite union of finite sets is finite. 44 45 The expression “|X| = n” means there is a bijection f : X → n. See p. 35. 66 Theorem 38. (ZFC − Infinity) The following are equivalent: (1) There is a Dedekind self-map. (2) There is an infinite set. Proof of (1) ⇒ (2). It suffices to show that no finite set admits a Dedekind self-map, and we do this by induction over ω. Certainly the empty set is not Dedekind-infinite since the empty map has no critical point. Assuming the result holds for all finite sets of size n, let X = {x1 , . . . , xn , xn+1 } be a set of size n + 1 and assume t : X → X is a 1-1 function with critical point xi . Let k 6= i and let ` be such that t(x` ) = xk . Let Y = X − {x` }, let Z = X − {xk }, and let j = t Y : Y → Z. Certainly j is 1-1 but not onto since xi is not in the range of j. Since both Y and Z have size n, there is a bijection α : Z → Y . Now j ◦α : Z → Z is 1-1 but not onto, since if xi = j(z) = j(α(z)), then, letting xr = α(z) we would have j(xr ) = xi , which is impossible since xi 6∈ ran j. We have shown that j ◦ α is a Dedekind self-map on a set of size n, which contradicts the induction hypothesis. Thus, the assumption that t is a Dedekind self-map is incorrect. We have completed the induction step and therefore the proof that no finite set admits a Dedekind self-map. Proof of (2) ⇒ (1). Suppose X is infinite. We prove by induction that for any n ∈ ω, there is a 1-1 function f : n → X: This is clear for n = 0, 1. Assuming f : n → X is 1-1, since |ranf | = n, there must be an x ∈ X not in the range of f . Define g : n+1 → X by g = f ∪{(n, x)}. Clearly g is also 1-1. This completes the induction. We next inductively define a sequence S = hSn | n ∈ ωi of disjoint, nonempty subsets of X so that, for each n > 0, |Sn | = n. Let x1 ∈ X; we let S1 = {x1 }. Assume we have defined S1 , S2 , . . . , Sn. An easy induction argument shows A = S1 ∪ S2 ∪ . . .∪ Sn is finite. The set X − A cannot be finite—otherwise X = (X − A) ∪ A would be finite. By our earlier claim, we can obtain f : n + 1 → X − A that is 1-1. Let Sn+1 = ran f . This completes the inductive defintion and gives us a sequence (or function) h : ω → X. To complete the proof of S S (2) ⇒ (1), define Y ⊆ X by Y = ran h; that is, Y = n Sn . Obtain a set W using the Axiom of Choice by picking one element sn from each Sn . Define w : W → W by w(sn ) = sn+1 . Clearly w is a Dedekind self-map. Next, we wish to show that ω consists precisely of its elements of the form sn (0). We first define by induction on ω the sequence of iterates of s: s0 , s1 , s2 , . . . , sn , . . .. Let t0 = idω . Define F on ω by ( s ◦ x if x : ω → ω is a function F(n, x) = ∅ otherwise Using Theorem 37, we obtain a unique class function T on ω satisfying T(0) = t0 = s0 and T(s(n)) = F(n, T(n)) = s ◦ T(n). Letting sn = T(n) for each n ∈ ω, one shows by induction on ω that s0 = idω sn+1 = s ◦ sn . Alternatively, we can view these clauses as an application of the simpler form of definition by induction given above. 67 Theorem 39. (ZFC − Infinity) ω = {sn (0) | n ∈ ω}. Moreover, for each n ∈ ω, sn (0) = n. Proof. An easy induction shows that each sn (0) belongs to ω. To complete the proof, it suffices to prove the “moreover” clause. We proceed by induction on n. If n = 0, then n = s0 (0). Assume the result for n ≥ 0. Then sn+1 (0) = s(sn (0)) = s(n) = n + 1. We can now show that (ω, s, 0) has the property of being initial in the category of class Dedekind self-maps. Theorem 40. (ZFC − Infinity) Suppose j : C → C is a class Dedekind self-map with critical point c. Then there is a unique class map F, definable from j and s, for which F(0) = c and the following diagram is commutative: ω s - F ? C ω F j - ? C Proof. By the proof of Theorem 17(1), a class function F on ω is specified uniquely by the following data: F(0) = c F(s(n)) = j(F(n)). This uniquely defined F satisfies the requirements of the Theorem. The work in this section has been largely unnecessary because it is redundant. We have already proven the induction and recursion theorems for ω. On the other hand, if ω does not exist, it is easy to see that ω = ON, since the only ordinals that exist in that case are the finite ordinals; for that case, we have also provided an induction and recursion theorem (in Section 13). Nevertheless, we do have in hand now a definition of the class of natural numbers, and this notion will prove useful. 15 Strengthening Proper Class Dedekind Self-Maps We turn now to several ways of strengthening a Dedekind self-map of the form j : V → V in the theory ZFC − Infinity so that some form of the Axiom of Infinity is derivable. Given such a j, we say that a subset (or more generally, a subclass) A of V is j-closed if, for all x ∈ A, j(x) ∈ A. Theorem 41. Suppose j : V → V is a Dedekind self-map with critical point a. Suppose A ∈ V is a set such that a ∈ A and A is j-closed. Then j A : A → A is a Dedekind self-map with critical point a. 68 Proof. This is a straightforward verification. We point out that, in the theory ZFC − Infinity, it is impossible to prove the existence of a Dedekind self-map j : V → V and a set A containing the critical point of j for which A is j-closed, since existence of an infinite set is not provable from that theory. On the other hand, the global successor function s, together with Vω , provides an example in ZFC. Properties of the global successor s suggest another related way to generate Dedekind selfmaps: Theorem 42. There is a Dedekind self-map if and only if there is a set A such that |A| = |s(A)|. Proof. An easy but rather unenlightening proof can be given as follows: If there is a Dedekind self-map, an infinite set must exist, so, in particular, ω exists. Since ω = |ω + 1|, we can let A = ω in this case. For the converse, suppose |A| = |s(A)|. Let κ = |A|. Then κ = |s(κ)| = |κ + 1|. But for each finite ordinal n, n < n + 1 = |n + 1|. Therefore, κ (hence, A as well) is infinite. We give an alternative proof46 using the tools of category theory. This approach will have the advantage that it will reveal an alternative characterization of initial Dedekind self-maps. Recall that, by the Axiom of Foundation, for any set A, {A} must be disjoint from A, so s(A) is the disjoint union of A and {A}. In category theoretic notation, we may view s(A) as a coproduct of A and {A}: s(A) = A q {A}. Since all singleton sets are isomorphic in the category of sets, we may equally well write s(A) = A q 1, 47 where, as usual, 1 = {0}. We take a moment to define the notion of coproducts (in any category) more carefully: Given objects X, Y , the coproduct of X and Y consists of an object X q Y and morphisms iX : X → X q Y and iY : Y → X q Y with the following universal property: Whenever h : X → Z and k : Y → Z are morphisms, there is a unique morphism denoted hh, ki which makes the following diagram commute: iX X H H -X q Y hHH j H iY hh,ki k ? Y Z i i X Y The diagram X −→ X q Y ←− Y is called a coproduct diagram. In any model of ZFC − Infinity, the axioms guarantee the existence of coproducts: Given sets X, Y , X q Y is the disjoint union of X, Y ; that is: X q Y = {(0, x) | x ∈ X} ∪ {(1, y) | y ∈ Y }, 46 This is a special case of a much deeper result due to Freyd. In this context, it is even more usual to write s(A) = A + 1, but to avoid confusion we will adopt the notational convention mentioned in the text. 47 69 together with the maps iX , iY defined by iX (x) = (0, x) iY (y) = (1, y). Then, given h : X → Z and k : Y → Z, we define hh, ki by hh, ki((0, x)) = h(x) and hh, ki((1, y)) = k(y). One verifies commutativity of the diagram and uniqueness of hh, ki. The main lemma for our second proof of the theorem is the following. Let us call a Dedekind self-map singular if it has only one critical point. i i a A 1 Lemma 43. Suppose 1 −→ A q 1 ←− 1 is a coproduct diagram. Suppose j : A → A and 1 −→ A a (identifying an element a of A with 1 −→ A) are functions. Then (A) If j is a Dedekind self-map with critical point a, then hj, ai is 1-1, from which it follows that |A| = |A q 1|. If, in addition, j is singular, then hj, ai is itself a bijection. (B) If hj, ai is 1-1, then j is a Dedekind self-map with critical point a. Moreover, if hj, ai is a bijection, j is a singular Dedekind self-map. A iA H H -Aq1 jHH j H hj,ai ? i1 a 1 A Proof of Lemma. For (A), we start by showing hj, ai is 1-1. Suppose hj, ai((0, x)) = hj, ai((0, y)). Using the set-based definition of hj, ai, it follows that j(x) = j(y), so, by 1-1-ness of j, x = y. On the other hand, if hj, ai((0, x)) = hj, ai((1, 0)) (note that (1, 0) is the only element of A q 1 having first component 1), it follows that j(x) = a. But this is impossible since a 6∈ ran j. We have shown hj, ai is 1-1, whence |A q 1| ≤ |A|; by the Schröder-Bernstein Theorem, since |A| ≤ |A q 1| as well, it follows that |A| = |A q 1|. For the “moreover” clause of (A), suppose j is a singular Dedekind self-map. Let x ∈ A. If x 6= a, then x ∈ ran j, and so hj, ai((0, x)) = x. On the other hand, hj, ai((1, 0)) = a. It follows that hj, ai is onto, as required. For (B), suppose hj, ai is 1-1. Observing that iA is 1-1, it follows that j, being a composite of 1-1 functions, is itself 1-1. We show a 6∈ ran j: Suppose, for a contradiction, that j(x) = a. It follows that h(j, a)i((0, x)) = a = hj, ai((1, 0)), contradicting the 1-1-ness of hj, ai. For the “moreover” clause, assume now that hj, ai is a bijection. It suffices to show that hj, ai (Aq1−{(1, 0)}) is a bijection from (A q 1) − {(1, 0)} to A − {a}. Let b ∈ A − {a}. Then there is x ∈ A with j(x) = b, and so hj, ai((0, x)) = b. Finally, we prove the theorem. If there is a Dedekind self-map, then by part (A) of the lemma, |A| = |A q 1| = |s(A)|. Conversely, if |A| = |s(A)|, let q : A q 1 → A be a bijection and let a = q((1, 0)). Define j : A → A by j(x) = q((0, x)). With these definitions, the following 70 diagram is commutative: A iA HH -Aq1 jHH j H q ? i1 a 1 A But, by uniqueness of the morphism hj, ai (by the universal property of coproducts), it follows that q = hj, ai. Since q is a bijection, it follows from the lemma that j : A → A is a Dedekind self-map. The theorem, together with the easy proof given in the first paragraph, can be generalized somewhat, using a reasonable Dedekind self-map j : V → V in place of s: Corollary 44. There is a set Dedekind self-map if and only if there is a Dedekind self-map j : V → V having the following properties: (A) ran j A $ j(A) (B) |A| = |j(A)|. Proof. If there is a set Dedekind self-map, then ω exists. Letting A = ω in the theorem, the global successor s : V → V has the required properties, as the previous theorem shows. For the other direction, let j : V → V be a Dedekind self-map satisfying (A) and (B). Let f : j(A) → A be a bijection. Let i = f ◦ (j A) : A → A. Since j is 1-1, so is j A, and so i must also be 1-1. Since j A is not onto, let y ∈ j(A) − ran (j A). Then f (y) is not in the range of i. We have shown that i : A → A is a Dedekind self-map. The reader will recall that we went to the trouble of giving a second proof of the previous theorem because it leads to an alternative characterization of initial Dedekind self-maps. To state the result, we need to define one other universal construction from category theory. Suppose f : A → B and g : A → B are functions, with the same domain and codomain. A coequalizer for f, g is a function q : B → C (for some C) satisfying q ◦ f = q ◦ g, and having the following universal property: For any h : B → D satisfying h ◦ f = h ◦ g, there is a unique k : D → C making the following diagram commutative: f A g q B h C k D Theorem 45. Suppose j : A → A and a ∈ A. (A) If (A, j, a) is an initial Dedekind self-map, then, i i A 1 (i) for the coproduct diagram A −→ A q 1 ←− 1, hj, ai is an isomorphism, and 71 (ii) if u : A → 1 is the unique function from A to 1, then u is a coequalizer for j, idA . (B) Suppose (i) and (ii) above hold true. Then (A, j, a) is an initial Dedekind self-map. Proof of (A). For (i), the fact that every initial Dedekind self-map (A, j, a) is singular follows from the fact that it must be uniquely isomorphic to (ω, s, 0). By the previous lemma, it follows that hj, ai is an isomorphism. We prove (ii). Let h : A → B so that (∗) h ◦ j = h ◦ idA = h. Certainly h(j(a)) = h(a). Moreover, by (∗), h(j n+1 (a)) = h(j n (a)). It follows that h(j n(a)) = h(a) for all n ≥ 0. Since A = {a, j(a), j 2(a), . . .}, it follows that, for all x ∈ A, h(x) = h(a). Consider the function k : 1 → B : 0 7→ h(a). Certainly k is the unique function that makes the diagram commutative. j A u A idA 1 k h B We have shown that u coequalizes j, idA . Proof of (B). Assuming conditions (i) and (ii), we show j is an initial Dedekind self-map with critical point a. By the Lemma, because (i) holds, j is a singular Dedekind self-map with critical point a. To see j is initial, letting W = {a, j(a), j 2(a), . . .}, it suffices by previous work to show W = A. Suppose there is t ∈ A − W . It follows that j(t) 6∈ W : j(t) 6= a since a 6∈ ran j, and j(t) 6= j n+1 (a) since t 6∈ W . Define h : A → 2 by ( 0 if x ∈ W h(x) = 1 if x 6∈ W We show h(j(x)) = h(x) for all x ∈ A, establishing that h ◦ j = h ◦ idA . If x ∈ W , this is obvious. If x 6∈ W , as we have shown, j(x) 6∈ W , and so h(j(x)) = 1 = h(x). To complete the proof, we show there is no k : 1 → 2 that makes the following diagram commutative: j A u A idA h 1 k B There are two possible choices for k: Either k(0) = 0 or k(0) = 1. If k(0) = 0, then we have k(u(t)) = 0 but h(t) = 1. If k(0) = 1, then we have k(u(a)) = 1 but h(a) = 0. Either way, we have a contradiction. This shows that there is no k that makes the diagram commute, and this contradicts the fact that u : A → 1 is a coequalizer for j, idA . Therefore, W = A, as required. Another way that a class Dedekind self-map j : V → V could give rise to a set Dedekind self-map is if j(x) is itself Dedekind-infinite. From a philosophical point of view, this kind of 72 situation matches the intuition that a critical point of a Dedekind self-map is directly related to the existence of an infinite set. We will discuss a somewhat more elaborate example of this type in Section 17. Two other ways a class map j : V → V could be strengthened to give rise to a Dedekind selfmap involve strengthening the preservation properties of j. Preservation properties turn out to be a key element in strengthening maps like j in such a way that stronger forms of “infinity” emerge. We introduce several properties that could be preserved by such a j and give two examples that lead to a derivation of an infinite set from a class Dedekind self-map. Recall that if C = (O, M) is a category, an initial object I in C is an object with the property that for every X ∈ O there is a unique m ∈ M such that m : I → X. We saw that in Set, the empty set is the unique initial object. The “dual” notion is a terminal object. An object T ∈ O is terminal if, for every X ∈ O there is a unique t ∈ M such that t : X → T . In Set, every singleton set {x} is terminal since, for any set X, the only function f : X → {x} is the one defined by f (z) = x for all z ∈ X. Suppose j : V → V is a 1-1 class function. A set X ∈ V is said to be a strong critical point if |X| = 6 |j(X)|. It is possible that even with a strong critical point, j may not be a Dedekind self-map. The relationship between strong critical points and critical points is addressed in Proposition 32 and Theorem 34, below. Definition 5. Suppose j : V → V is any function. (1) j is said to preserve disjoint unions if, whenever X, Y ∈ V are disjoint, j(X), j(Y ) are also disjoint and j(X ∪ Y ) = j(X) ∪ j(Y ). (2) j preserves singletons if, for any X, j({X}) = {j(X)}. (3) j is said to preserve coproducts if, whenever X, Y are disjoint, |j(X ∪ Y )| = |j(X) ∪ j(Y )|. Moreover, j preserves finite coproducts if, whenever n ∈ ω and X1 , X2 , . . . , Xn are disjoint, |j(X1 ∪ X2 ∪ · · · ∪ Xn )| = |j(X1) ∪ j(X2) ∪ · · · ∪ j(Xn)|. (4) j preserves terminal objects if, whenever T is terminal, j(T ) is terminal. In particular, j maps singleton sets to singleton sets since, in Set, the terminal objects are precisely the singleton sets. (5) j preserves initial objects if, whenever I is initial, j(I) is also initial. In particular, j(∅) = ∅, since ∅ is the unique initial object in Set. (6) j is cofinal if, for every a ∈ V there is A ∈ V with a ∈ j(A). (7) j preserves subsets if, whenever X ⊆ Y , we have j(X) ⊆ j(Y ). Remark 7. Note that preservation of disjoint unions is different from the property that every 1-1 function f has, namely, that whenever A, B are disjoint, the images f [A], f [B] are disjoint, and f [A ∪ B] = f [A] ∪ f [B]. The example f : V → V defined by f (x) = {x} highlights the distinction since f ({1} ∪ {2}) = f ({1, 2}) = {{1, 2}} = 6 {{1}, {2}} = f ({1}) ∪ f ({2}), 73 and yet f [{1} ∪ {2}] = f [{1, 2}] = {f (1), f (2)} = f [{1}] ∪ f [{2}]. We also observe here that whenever j preserves disjoint unions, j preserves subsets: Given sets A, B with A ⊆ B, note that B = A ∪ (B − A), which is a disjoint union, and so we may write j(B) = j(A) ∪ j(B − A), so that j(A) ⊆ j(B). Finally, we remark that if j has critical point a and j preserves singletons and disjoint unions, then {a} is also a critical point of j: Suppose {a} = j(z). If z is not a singleton, let z0 ∈ z and, because {z0 }, z − {z0 } is a partition of z, by preservation of singletons and disjoint unions, j(z) = {j(z0 )} ∪ j(z − {z0 }), and so |j(z)| > 1, which is impossible. Therefore, z is a singleton set {y}. We have {a} = j(z) = j({y}) = {j(y)}, and so a = j(y), contradicting the fact that a is a critical point of j. Theorem 46. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map. Suppose also that j preserves disjoint unions, the empty set, and singletons. Then there is an infinite set. Proof. We show by induction on n that j preserves finite disjoint unions. The result is clear for n = 2 since j preserves disjoint unions. Assume that whenever {X1 , . . . , Xn } is a collection of n disjoint sets, then the sets j(X1), . . . , j(Xn) are also disjoint, and j(X1 ∪ · · · ∪ Xn ) = j(X1 ) ∪ · · · ∪ j(Xn). Given disjoint sets Y1 , . . . , Yn , Yn+1 , certainly j(Y1 ), . . . , j(Yn) are disjoint. Let Y = Y1 ∪ · · · ∪ Yn . By induction hypothesis, j(Y ) = j(Y1 ) ∪ · · · ∪ j(Yn ). Since Y and Yn+1 are disjoint, so are j(Y ) and j(Yn+1 ), and j(Y1 ∪ · · · ∪ Yn ∪ Yn+1 ) = j(Y ∪ Yn+1 ) = j(Y ) ∪ j(Yn+1 ) = j(Y1 ) ∪ · · · ∪ j(Yn ) ∪ j(Yn+1 ), as required. This completes the induction and proves the claim. We proceed by induction to show that j(n) = n for all n ∈ ω. By assumption on j, j(∅) = ∅. Assume j(n) = n. Note that for all n ∈ ω, n, {n} are disjoint because (n, ∈) is a well-ordering (so, in particular, n 6∈ n). We complete the induction step and the proof that j(n) = n for all n ∈ ω with the following: j(n + 1) = j(n ∪ {n}) = j(n) ∪ j({n}) = j(n) ∪ {j(n)} = n ∪ {n} = n + 1. S Next we show that, for all x ∈ HF, j(x) = x. As we observed earlier,48 HF = n∈ω Vn . For each x ∈ HF, let rank x denote the least n ∈ ω for which x ⊆ Vn .49 We proceed by induction on n ∈ ω to show that, for all x of rank ≤ n, j(x) = x. This is clear for n = 0, since j(∅) = ∅. Assume the result for n. For the induction step, it suffices to show that if rank x = n + 1, then j(x) = x. Given such an x, certainly x is finite; write x = {y1 , . . . , yk }. For each i, yi ∈ x ⊆ Vn+1 . 48 A proof is given in a footnote on page p. 42. The proof that this is true when ω does not exist is essentially the same even though, in that case, the union of the Vn is a proper class. 49 More generally, for any x ∈ V , rank x is defined to be the least ordinal α for which x ⊆ Vα . 74 By the induction hypothesis, for each i, j(yi) = yi . We have j(x) = j ({y1 , . . . , yk })   [ = j {yi } 1≤i≤k = [ j({yi }) 1≤i≤k = [ {j(yi)} 1≤i≤k = [ {yi } 1≤i≤k = {y1 , . . . , yk } = x. This completes the induction and proves that, for all x ∈ HF, j(x) = x. Finally, we complete the proof of the theorem: Let a be a critical point of j. Since j HF is just the identity map on HF, which is onto HF, a 6∈ HF. Therefore, every transitive set that contains a is infinite; since the Trans axiom holds, some transitive set must include a. Therefore, there is an infinite set. A filter F on a set A is a collection of subsets of A that is closed under intersections and supersets, and for which A ∈ F and ∅ 6∈ F (sometimes such an F is called a proper filter). If there is X ⊆ A such that F = {Y | X ⊆ Y }, F is a principal filter with generator X. If there is x ∈ A such that {x} ∈ F , then F is a trivial filter. Every trivial filter is principal. A filter F is an ultrafilter if, for every X ⊆ A, either X or A − X belongs to F . If F is an ultrafilter, F is nontrivial if and only if F is nonprincipal. If F is a nonprincipal ultrafilter on A, F contains no finite sets: If Y ⊆ A is finite, Y = {y1 , . . . , yk }, then by closure under intersections, since {yi } 6∈ F for 1 ≤ i ≤ k, A − {yi } ∈ F , and so \ A−F = (A − {yi }) ∈ F. 1≤i≤k It is possible to construct nontrivial filters on a finite set, and also to construct an ultrafilter on a finite set. But it is not possible to construct a nontrivial (nonprincipal) ultrafilter on a finite set, as is shown in the previous paragraph. These observations give us another equivalent form of the Axiom of Infinity: Ultrafilter Criterion for Infinite Sets. There is an infinite set if and only if there is a nonprincipal ultrafilter. Using the Axiom of Choice, one can construct a nonprincipal ultrafilter on any infinite set. Ultrafilters (including nonprincipal ultrfilters) also arise from Dedekind self-maps j : V → V , equipped with adequate preservation properties, as we discuss next in Theorem 47. 75 Theorem 47. Suppose j : V → V is a class Dedekind self-map with critical point a and there is a set A such that a ∈ j(A). Let D = {X ⊆ A | a ∈ j(X)}. (1) If j preserves subsets, intersections, and the empty set, then D is a filter. (2) If, in addition, to (1), one of the following conditions holds, then D is a nontrivial filter: (A) j preserves singletons. (B) j preserves terminals and {a} is a second critical point of j. (3) If, in addition to (1), j preserves disjoint unions, then D is an ultrafilter. Moreover, if one of the conditions in (2) also holds, D is a nonprincipal ultrafilter (and A is infinite). Motivation for the conditions in part (2) are given in Remark 7. Remark 8. In the absence of (3), properties (1) and (2) are rather weak. We give an example to illustrate this point, but also to motivate a more complex example for which (3) does hold. Suppose A is a set with two or more elements, and define j : V → V by j(X) = X A = {f | f : A → X}. Let idA : A → A denote the identity map. Then idA ∈ j(A). Clearly idA and {idA } are critical points of j. Also, for any sets X, Y , X 6= Y ⇒ X A 6= Y A , and so j is 1-1; indeed, j is a Dedekind self-map. Since ∅A = ∅, j preserves the emptyset. For any set X, j({X}) = {X}A = {t}, where t : A → {X} is the unique function from A to {X}; therefore, j preserves terminals. Finally, notice j preserves intersections: It suffices to show that (X ∩ Y )A = X A ∩ Y A . f ∈ (X ∩ Y )A , it follows easily that f ∈ X A ∩ Y A . Now suppose f ∈ X A ∩ Y A , so that ran f ⊆ X and ran f ⊆ Y . It follows that ran f ⊆ X ∩ Y , and so f ∈ (X ∩ Y )A . We have shown that this particular j : V → V satisfies properties (1) and (2) of Theorem 47, and so we conclude that D is a nontrivial filter. However, j does not satisfy property (3): Given two nonempty disjoint subsets B, C of A with B ∪ C = A, while it is true that B A and C A are disjoint, it is not the case that B A ∪ C A = (B ∪ C)A ; indeed, idA belongs to the set on the right-hand side, but not to the set on the left-hand side. In fact, D is rather uninteresting: Although it is true that A ∈ D (since 1A ∈ j(A)), D contains no other set! If X $ A, then it is not the case that 1A ∈ X A . To make D more interesting, and to ensure that j preserves disjoint unions, it is necessary to “blur” the sharp differences between the sets of the form Z A , and this can be done with an appropriate choice of equivalence relation. Proof of (1). A ∈ D by the definition of D and ∅ 6∈ D since j(∅) = ∅. D is closed under intersections since j preserves intersections. D is also closed under supersets since j preserves subsets. We have shown that D is a filter. Proof of (2). We verify D is nontrivial assuming either (2)(A) or (2)(B). For this, it suffices to show that D has no element of the form {z}. Suppose for some z ∈ A, {z} ∈ D. If (2)(A) holds, then a ∈ j({z}) = {j(z)}, since j preserves singletons in this case. It follows a = j(z), which is impossible since a is a critical point of j. 76 If, instead, (2)(B) holds, then a ∈ j({z}) = {y} for some y ∈ j(A), since j preserves terminals. It follows that a = y, and so {a} = j({z}); in other words, {a} ∈ ran j contradicting the assumption that {a} is a second critical point for j. We have shown that D is a nontrivial filter. Proof of (3). Assuming j preserves disjoint unions, we show that D is an ultrafilter: Suppose X ⊆ A and X 6∈ D. Let Y = A − X. We show a ∈ j(Y ). Since j preserves disjoint unions, j(A) = j(X) ∪ j(Y ); since a 6∈ j(X), then a ∈ j(Y ), as required. If one of the conditions in (2) also holds, then, by the argument establishing (2), D is a nonprincipal ultrafilter. It follows that A itself is infinite. Remark 9. We modify the example described in Remark 8 in a way that leads to the conclusion that the filter D derived from j is a nonprinicipal ultrafilter. As we mentioned in that remark, we need to modify the map X 7→ AX that was used there by introducing an appropriate equivalence relation. One of the problems to fix from that example is that the only element of D is A. This limitation occurs because the only way the function idA : A → A could be a member of X A, for X ⊆ A, is if X = A. This requirement can be relaxed if we allow the possibility that idA “almost” belongs to X A . We can do this by declaring two partial functions defined on A to be equivalent if they are both defined on a “large” subset of A and agree on a “large subset” of A. This new approach solves the problem: If f : A → X is a function that agrees with idA on a large set (because X itself is large), we will be able to conclude that X ∈ D. The concept of “large subset” can be captured by making use of a filter. For this purpose, let U denote a (proper) filter on A. We do not require U to be a nonprincipal ultrafilter, so it is perfectly possible for A to be finite at this step. If f : A → Z is a partial function, we call f U -good if {x ∈ A | f (x) is defined} ∈ U . For f, g : A → Z that are U -good, we define ∼ = ∼A,U,Z by f ∼ g iff {x ∈ A | f (x) = g(x)} ∈ U. We note that, for any f ∈ Z A for which |Z| ≥ 1, there is a total function f 0 : A → Z such that f ∼ f 0 : Let z ∈ Z. Define f 0 by ( f (x) if x ∈ dom f f 0 (x) = z otherwise Clearly f 0 ∼ f . Denote the equivalence class that contains f by [f ]U , or simply [f ]. For each set Z, we define Z A /U = {[f ]U | f : A → Z is U -good}. Finally, define j : V → V by j(Z) = Z A /U . We first show, in Claims (1)–(5) below, that j has property (1), mentioned in Theorem 47: Claim 1. j(∅) = ∅. 77 Proof. This follows because ∅A = ∅. Claim 2. j is 1-1. Proof. Suppose Y 6= Z are sets; without loss of generality, let y ∈ Y − Z. Consider the constant function f y : A → Y defined by f y (x) = y. Clearly f y agrees nowhere with any partial function A → Z. Therefore f y ∈ Y ω − Z ω , and so j(Y ) 6= j(Z). Claim 3. j preserves terminal objects. Proof. We show j takes singleton sets to singleton sets. Given a singleton set {z}, let fz be the unique function A → {z}. Then j({z}) = {z}A /D = {[f ] | f is a partial function from a set A ∈ D into {z}} = {[fz ]}. Claim 4. [idA ] is a critical point of j. [idA ] ∈ j(A). Therefore, j is a Dedekind self-map. Moreover, Proof. Certainly, [idA ] is not itself of the form Z A /U , so [idA ] is a critical point. Since idA ∈ AA , certainly [idA ] ∈ AA /U = j(A). Claim 5. j preserves intersections. Proof. Suppose X, Y are sets and Z = X ∩ Y . Suppose [f ] ∈ X A /U ∩ Y A /U . Let B, C ∈ U be such that for all x ∈ B, f (x) ∈ X and for all x ∈ C, f (x) ∈ Y . Since U is closed under instersections, B ∩ C ∈ U , and we have that for all x ∈ B ∩ C, f (x) ∈ X ∩ Y , so [f ] ∈ (X ∩ Y )A /U . For the converse, if E ∈ U is such that for all x ∈ E, f (x) ∈ X ∩ Y , then it follows easily, using the fact that U is closed under supersets, that [f ] ∈ X A/U and [f ] ∈ Y A /U . We have shown property (1) of Theorem 47 holds for j. Therefore, if D = {X ⊆ A | [idA ] ∈ j(X)}, then D is a filter. In the present context, this is particularly unsurprising in light of the next Claim: Claim 6. D = U . Proof. Suppose X is a set. Then X ∈ D ⇔ [idA ] ∈ j(X) ⇔ [idA ] ∈ X A/U ⇔ {x ∈ A | x ∈ X} ∈ U ⇔ X ∈ U. We next show that j preserves disjoint unions only if our starting filter U was already an ultrafilter: Claim 7. The following are equivalent: (A) j preserves disjoint unions. 78 (B) U is an ultrafilter (equivalently, D is an ultrafilter). Proof. First, observe that (A) ⇒ (B) follows from Theorem 47, part (3), using the fact that D = U . For the converse, suppose X, Y are disjoint and let Z = X ∪ Y . It is obvious that X ω ∩Y ω = ∅, and so j(X)∩j(Y ) = ∅. It is also obvious that j(X)∪j(Y ) ⊆ j(Z). To prove equality, let [f ] ∈ Z A /D. Let SX = {x ∈ A | f (x) ∈ X} and SY = {x ∈ A | f (x) ∈ Y }. Since SX ∪ SY ∈ D and D is an ultrafilter, one of SX , SY belongs to D, say SX . Then [f ] = [f SX ] ∈ X A /U. We have shown that each [f ] in Z A /U belongs to X A /U ∪ Y A /U . Assuming from the beginning that U is an ultrafilter, we have established that j preserves disjoint unions and that D is therefore, by Theorem 47, also an ultrafilter. None of the arguments so far require A to be infinite or D to be nonprincipal. In the present setting, the only way D could turn out to be a nonprincipal ultrafilter, given the assumptions we have made so far on U , is if {[idA ]} turns out to be a critical point of j, which could happen only if U itself was initially assumed to be nonprincipal. Unfortunately, therefore, tracing through this example does not reveal the mechanics by which a nonprincipal ultrafilter comes into existence; instead, it shows that existence of a nonprincipal ultrafilter is equivalent to existence of a Dedekind self-map having properties (1), (2)(B), and (3) in Theorem 47. The next Claim establishes the remaining details. Claim 8. Assume U is an ultrafilter. Then the following are equivalent: (A) {[idA ]} is a critical point of j. (B) U (equivalently, D) is a nonprincipal ultrafilter on A, whence A is infinite. Proof. The fact that (A) ⇒ (B) follows from Theorem 47(2)(B), using the fact that D = U . For the converse, assume U is nonprincipal but {[idA ]} is in the range of j, so that {[idA ]} = Z A /U , for some set Z; we will arrive at a contradiction. First, we show that Z itself must be a singleton set: If Z has two elements y, z, the constant functions f y : A → Z : x 7→ y and f z : A → Z : x 7→ z agree nowhere, and so [f y ] 6= [f z ]; hence, |j(Z)| = |Z A /U | > 1, contradicting our assumption that {[idA ]} = Z A /U . Therefore, Z = {z} for some z. Let f be the unique function from A to {z}. Then Z A /U = {[f ]} = {[idA ]}; in other words, f ∼ idA . It follows that {x ∈ A | f (x) = id(x)} ∈ U ; that is, {x ∈ A | z = x} ∈ U , and so {z} ∈ U . Since U is nonprincipal, this is impossible. We have shown therefore that {[idA ]} is also a critical point of j. Claims 7 and 8 could have been presented and proven in reverse order, with slight modifications; the only change in the proofs is that “nonprincipal” must be replaced with “nontrivial” in the new version of Claim 8 (which we will now call Claim 7’). We give the restatements here: Claim 7’. The following are equivalent for j, U as originally defined in this example: (A) {[idA ]} is a critical point of j. (B) U (equivalently, D) is a nontrivial filter on A. 79 Claim 8’. Assume U is a nontrivial filter on A. Then the following are equivalent: (A) j preserves disjoint unions. (B) U (equivalently, D) is a nonprincipal ultrafilter, whence A is infinite. The condition in Theorem 47 that the critical point a must belong to a set of the form j(A) for some set A can fail even when the other conditions on j hold; indeed, the example in Remark 9 provides an example: A is a critical point for j but does not belong to any set of the form j(X) = X A /D. We list several sufficient conditions here for the existence of such a set A, but most of these involve concepts that will be introduced later. (a) j is cofinal (p. 73) (b) j preserves ∈ and both preserves and reflects ordinals (definitions on p. 88). (c) j has a universal element (defined on p. 100). Verification of (b) can be found on p. 90. The example given in Remark 9 is an example of (c), as we will see later (p. 100). Combining Theorem 47 with condition (a) leads to a nice sufficient condition for the existence of a nonprincipal ultrafilter: Proposition 48 (ZFC − Infinity). Suppose j : V → V is a cofinal class Dedekind self-map with critical point a. Suppose also that j preserves disjoint unions, intersections, the empty set, and singletons. Then there is a nonprincipal ultrafilter over some (infinite) set A. Proof. By cofinality of j, there is a set A such that a ∈ j(A). Define D by D = {X ⊆ A | a ∈ j(X)}. Then by the proof of Theorem 47, D is a nonprincipal ultrafilter on A. We turn now to a third set of preservation properties that a Dedekind self-map may exhibit, which imply existence of an infinite set. Theorem 49. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map with a strong critical point. Suppose j preserves finite coproducts and terminal objects. Then there is an infinite set. Proof. We first show that (∗) for every finite set X, |j(X)| = |X|. First, suppose X 6= ∅. Write X = {x1 , . . . , xn}. Because j preserves finite coproducts and terminal objects, |j(X)| = |j({x1}) ∪ · · · ∪ j({xn})| = |{y1 , . . . , yn }| = |X|, 80 where, for each i, {yi } = j({xi}). We show that j(∅) = ∅, so, in particular, |j(∅)| = |∅|. Suppose not. Then |j(∅)| ≥ 1 and so, since terminal objects are preserved, for some set y, |j(∅) ∪ j({∅})| = |j(∅)| + |j({∅})| = |j(∅)| + |{y}| ≥ 2. But this contradicts the fact that coproducts are preserved, since, for this same set y, |j(∅ ∪ {∅})| = |j({∅})| = |{y}| = 1. Finally, we show there is an infinite set: Let Z be a strong critical point of j, so that |j(Z)| = 6 |Z|. But now (∗) implies that Z is not finite. Therefore, Z is infinite. Remark 10. The proof of Theorem 49 does not actually require j to be a Dedekind self-map; the same argument works assuming only that j : V → V has a strong critical point and preserves finite coproducts and terminal objects. Remark 11. The proofs of Theorems 46, 47, and 49 show that any Dedekind self-map on V that exhibits the specified preservation properties produces an infinite set, and, in the case of Theorem 47, a nonprincipal ultrafilter. Self-maps with such properties cannot be proven to exist in the theory ZFC − Infinity by Gödel Incompleteness. However, assuming existence of a set Dedekind self-map—or, equivalently, existence of ω—such examples can be found. As an example of a j : V → V that exhibits the properties specified in the hypotheses of Theorems 47(2) and 49, we summarize (with a more concrete example) the points made in Remark 9. We follow this with a simpler example that satisfies the properties of Theorem 49 only. We do not have an example of a j : V → V with precisely the properties of Theorem 46 or of Theorem 47(1); in particular, our examples do not preserve singletons (only terminal objects). Such examples do exist however, but, so far, the only examples we know involve the use of large cardinal hypotheses. Example 4 (Reduced Product Construction). We recap the development of the example in Remark 9 using a more concrete example and proving one additional property that it has. In particular, assuming ω exists, we exhibit a class Dedekind self-map with the properties listed in Theorem 47(2) and Theorem 49: As was mentioned earlier, we may obtain a nonprincipal ultrafilter D on ω. We recall several definitions from Remark 9. Given any set X, if f, g are both D-good partial functions from ω to X, we declare f, g are equivalent, and write f ∼ g, if the set of n ∈ ω at which f, g are both defined and equal belongs to D. Let [f ] denote the ∼-equivalence class containing f and let X ω /D = {[f ] | f is a D-good partial function from ω to X}. Define jD : V → V by jD (X) = X ω /D. We note as before that, for any f ∈ X ω for which |X| ≥ 1, there is a total function f 0 : ω → X such that f ∼ f 0 : Let x0 ∈ X. Define f 0 by ( f (n) if n ∈ dom f f 0 (n) = x0 otherwise 81 Clearly f 0 ∼ f . The following claims are proved in Remark 9 (setting A = ω). Claim Claim Claim Claim 1. 2. 3. 4. jD (∅) = ∅. jD is 1-1. jD preserves terminal objects. Both [idω ] and {[idω ]} are critical points of jD . Also, D = {X ⊆ A | [idω ] ∈ jD (X)}. Claim 5. jD preserves disjoint unions. Consequently, jD preserves all finite coproducts. Claim 6. jD preserves intersections. The following claim shows that our example also illustrates Theorem 49: Claim 7. ω is a strong critical point of jD . Proof. Define, for each n ∈ ω, the function fn : ω → ω by fn (i) = n + i. Then, whenever m 6= n, fm , fn disagree everywhere. Therefore {[fn ] | n ∈ ω} is an infinite subset of jD (ω) = ω ω /D. We show that jD (ω) is in fact uncountable. Suppose {[gn] | n ∈ ω} is an infinite subset of ω ω /D. Define h : ω → ω by h(n) = least element of ω − {gi(n) | i < n}. Then for all i ∈ ω and all n > i, h(n) 6= gi(n); in particular [h] 6= [gi ] for all i ∈ ω. We have shown jD (ω) is uncountable, so ω < |jD (ω)|. Therefore, ω is a strong critical point. Example 5 (Simple Model of Theorem 49). Assuming that ω does exist, we construct a Dedekind self-map j : V → V having the properties mentioned in the hypothesis of Theorem 49. ( x if x is finite j(x) = P(x) if x is infinite The first clause of the definition ensures that j preserves terminal objects, since such objects must always be finite. Since the powerset operator is 1-1, so is j. Notice ω is a strong critical point by Cantor’s Theorem; it is easy to see that ω is a critical point as well. We show that j preserves coproducts. Suppose X, Y are disjoint sets. The cases in which one of X, Y is finite are straightforward; assume both are infinite. Then |j(X ∪ Y )| = |P(X ∪ Y )| = 2max{|X|,|Y |} = max{2|X|, 2|Y | } = |P(X) ∪ P(Y )| = |j(X) ∪ j(Y )|. The proof that finite coproducts are preserved is a straightforward induction. Remark 12. We review the logic of Theorems 46, 47, and 49, along with Examples 4 and 5. Theorems 46,47, and 49 guarantee the existence of an infinite set from the existence of a class 82 Dedekind self-map with sufficiently strong preservation properties. Theorem 47 guarantees, in additions, existence of a nonprincipal ultrafilter, also derivable from a class Dedekind self-map with sufficiently strong preservation properties. Conversely, Example 5 shows that assuming the existence of an infinite set, one may recover a Dedekind self-map with the properties listed in Theorem 49. And Example 4 shows how, assuming existence of a nonprincipal ultrafilter on an infinite set, one can define a Dedekind self-map with the properties listed in Theorem 47. Our work so far shows that a class Dedekind self-map, equipped with sufficiently strong preservation properties, gives rise to an infinite set, and therefore to ω, and also to a nonprincipal ultrafilter. Another sort of structure, in the domain of graph theory, also arises with the emergence of ω, called ω-trees; the existence of a nonprincipal ultrafilter provides a key for establishing the main property that belongs to all trees of this type. Example 6 (Tree Property for ω). A tree is a partially ordered set (T, <) having a smallest element (called the root), with the property that, for every t ∈ T , the set P rt = {s ∈ T | s ≤ t} (the predecessors of t) is well-ordered by ≤. We identify this definition of a tree with that of a rooted tree from graph theory (a rooted tree is an undirected graph that is connected, has no cycles, and has a distinguished vertex called the root). Thus, there is a path from x to y in T (as a graph) if and only if either x ≤ y or y ≤ x; likewise, for any x ∈ T , the descendants of x are those y ∈ T for which x < y, and the children of x are those descendants of x that are immediate successors of x. A chain in a tree T is any linearly ordered subset of T . A tree has levels, denoted LevT0 , LevT1 , LevT2 , . . . , or, when the meaning is clear, Lev0 , Lev1 , . . .50 If r denotes the root of a tree, Lev0 = {r}, and Lev1 consists of the children of r, and so forth.51 An ω-tree is a tree with exactly ω levels and for which every 50 In the general case, this enumeration may extend into the transfinite. Further specification is required for trees having ω or more nonempty levels; we will not need this specification here. 51 83 level is finite; in particular, every element has finitely many children. A binary ω-tree is an ω-tree in which each vertex has at most two children. For any m ∈ ω ∪ {ω}, an m-branch in an ω-tree is a subset B of T , linearly ordered by <, which intersects each of the levels in L = {Levi | i < m} exactly once. A maximal branch is a subset B of T that is an m-branch for every m ∈ ω. We can make use of the fact that ω admits a nonprincipal ultrafilter to prove the following: Proposition 50. ω has the tree property. That is, every ω-tree has a maximal branch. Proof. Given an ω-tree (T, <T ), it is possible to list its elements level by level using the elements of ω as indices. Therefore, we identify the underlying set of this tree with ω, ordered by the tree ordering <T . Note that identifying vertices of T with elements of ω in this way implies the following: For any m, n ∈ ω, m <T n implies m < n. We let D be a nonprincipal ultrafilter D on ω (i.e., on T ). For each t ∈ T , we let St be the subtree of T at t; precisely: St = {t} ∪ {s ∈ T | s is a descendant of t in T }. Notice that t1 6= t2 implies St1 ∩ St2 = ∅. We let S n denote the union of all subtrees of T having roots in Lev n : [ S n = {St | t ∈ Levn }. Notice that S n includes each element of T whose level is ≥ n. Since there are only finitely many elements of T in levels 0 to n − 1, it follows that, for each n, S n ∈ D. Since D is an ultrafilter and S n is a finite union, there is tn ∈ Levn such that Stn ∈ D. We claim that (∗) T = St 0 ) St 1 ) St 2 ) · · · This can be shown by induction. Clearly St0 ) St1 . Assuming this relation holds below n, for n > 0, we show Stn ) Stn+1 . But since [ Stn = {S` | ` ∈ Levn+1 and ` is a child of tn }, one of the S` in this union also belongs to D. Since only one of the subtrees rooted in Levn+1 can belong to D (since they are all disjoint), it must be that S` = Sn+1 , as required. Let B = {tn | n ∈ ω}. We verify B is a maximal branch, completing the proof of Proposition 50. By construction, B meets every level exactly once. Also, our proof of (∗) above shows that for each n, tn+1 < tn . Therefore, B is a maximal branch in T , as required. We have seen that existence of a nonprincipal ultrafilter follows from existence of a Dedekind self-map j : V → V with sufficient preservation properties, and that nonprincipal ultrafilters on ω play a role in showing that ω has the tree property. This latter result, in turn, plays a key role in establishing a well-known combinatorial property of ω, which we introduce briefly here. Example 7 (Ramsey’s Theorem). The notation [ω]2 signifies the set of all unordered pairs of ω; that is, [ω]2 = {{m, n} | m, n ∈ ω}. We adopt the convention of representing unordered pairs {m, n} so that m < n. Suppose PS= {P0 , P1 } is a partition of [ω]2 ; that is, P0 , P1 are disjoint, each is nonempty, and [ω]2 = P0 P1 . We will call a subset H of ω homogeneous for P 84 if the pairs obtained from H lie entirely in one piece of the partition; that is, for some i = 0, 1, [H]2 ⊆ Pi . The Infinite Ramsey Theorem implies that every partition of [ω]2 into two pieces admits a homogeneous set of size ω. Apart from the trivial cases of n = 0, 1, 2, the property described by this theorem is unique to infinite sets:52 For n > 2, it is never true that a partition of [n]2 admits a homogeneous set of size n. We give a proof of the Infinite Ramsey Theorem using the fact that ω has the tree property. Suppose we are given a partition P = {P0 , P1 } of [ω]2 . We define a binary ω-tree from P —which we will call a P -tree—so that vertices are finite 0-1 sequences which are named in a way that encodes the partition P . We now describe how to build the P -tree T from P : Vertices of T will be finite 0-1 sequences. If t = he0 , e1 , . . . , em i, the length of t, denoted len t, is m + 1, and, for 0 ≤ i ≤ m, the value of t in position i is denoted t(i). Thus t = ht(0), t(1), . . . , t(m)i. We denote the vertices of T by: t0 , t1 , t2 , . . .. We specify the order relation on T as follows: If s, t are vertices of T , then s t if t extends s, that is, if s = t len s. We specify the values of t0 , t1 , t2 , . . . with an inductive definition so that for each j ∈ ω, tj is not equal to tk for k < j and len tj ≤ j. t0 = ∅, the empty sequence. Assuming t0 , . . . , tn−1 have been defined, tn is defined as follows: the length of tn is ≤ n. We define values tn (i) inductively as follows. When i = 0, we define: ( 0 if {0, n} ∈ P0 tn (0) = 1 if {0, n} ∈ P1 For 0 < i ≤ n, if the computation of tn is not yet complete (and, assuming that we have defined htn (0), tn(1), . . . , tn (i − 1)i), we perform the next step of computation of tn by considering two cases: In the first case, for some m < n, (∗) tm = htn (0), tn(1), . . ., tn (i − 1)i. In that case, since we are assuming the vertices t0 , . . . , tn−1 are distinct, tm is the unique vertex for which (∗) holds. Then we define ( 0 if {m, n} ∈ P0 tn (i) = 1 if {m, n} ∈ P1 For the second case, (∗) does not hold for any m < n. When this is the case, we define tn = htn (0), tn(1), . . ., tn (i − 1)i, and the computation of tn is considered complete. (Note that in this case, we do not assign a value to tn (i).) This completes the inductive definition of T . We must verify that T has the required properties: that len tj ≤ j and that all tj are distinct. The first of these is obvious by construction. To see that the tj are all distinct, notice that in the definition of tj , the computation is not considered complete until its sequence of values is different from each of t0 , . . ., tj−1 ; this requirement guarantees that, when the computation for tj is complete, tj is different from each of t0 , . . . , tj−1 . 52 We show here that the Ramsey property holds for sets of size ω; it does not hold for sets of all infinite sizes however; in fact, for sets of uncountable size κ, the statement that every partition of the pairs [κ]2 has a homogeneous set of size κ is equivalent to the assertion that κ is weakly compact (weak compactness is a large cardinal property). 85 Proposition 51 (Ramsey’s Theorem). Suppose P = {P0 , P1 } is a partition of [ω]2 and let T be the P -tree obtained from P . Then P admits a homongeneous set of size ω. Proof. Since any vertex t ∈ T can have at most two extensions, t_ 0, t_ 1, T is a binary tree, which, since T is infinite, must in fact be an ω-tree. Therefore T has a maximal branch B. Let C0 , C1 be subsets of ω defined by C0 = {n ∈ ω | both tn and t_ n 0 belong to B} _ C1 = {n ∈ ω | both tn and tn 1 belong to B}. We observe that at least one of C0 , C1 is infinite: If both were finite, let n be bigger than _ max{max C0 , max C1 } for which tn ∈ B. Since B is an infinite branch, either t_ n 0 or tn 1 also belongs to B, which is a contradiction. Without loss of generality, assume C0 is infinite. We show that C0 is homogeneous for P ; indeed, that [C0 ]2 ⊆ P0 . Suppose m < n and both belong to C0 . We show that {m, n} ∈ P0 . Since both belong to C0 , it follows that the following three vertices lie on B: tm , t_ m 0, tn . Let `m = len tm . Then tm (`m ) = 0. Since n > m and both tn , tm lie on B, tn ⊇ t_ 0, so, in particular, m tn (`m ) = 0. When tn (`m ) was computed, in the construction of T , it was observed that tm is the (unique) vertex equal to htn (0), . . ., tn (`m − 1)i, and so ( 0 if {m, n} ∈ P0 tn (`m ) = 1 if {m, n} ∈ P1 Since tn (`m) = 0, it follows that {m, n} ∈ P0 , as required. Since Ramsey’s Theorem does not hold for finite cardinals n > 2, we are in a position to state another alternative form of the Axiom of Infinity: There is an infinite set if and only if there is a cardinal n with the following two properties: (1) n is large enough so that [n]2 admits a partition into two pieces. (2) Every partition of [n]2 into two pieces admits a homogeneous set of size n. This formulation can be generalized somewhat. Although the proof we have given of Ramsey’s Theorem makes essential use of the well-ordering of the natural numbers, the truth of Ramsey’s Theorem does not depend on such properites. This observation leads to the following variant of the Axiom of Infinity: Ramsey Criterion for Infinite Sets. There is an infinite set if and only if there is a set A with the following two properties: (1) [A]2 admits a partition into two pieces. (2) Every partition of [A]2 into two pieces admits a homogeneous set of size |A|. We began this section with the observation that class Dedekind self-maps are not strong enough to imply the existence of an infinite set. In this section, we have identified several 86 strengthenings of the properties of a class Dedekind self-map that would imply existence of infinite sets. In some cases, these strengthenings involved the introduction of preservation properties; when a class Dedekind self-map preserves a suitable set of properties belonging to its domain, it becomes strong enough to give rise to an infinite set. Our plan is to motivate the existence of large cardinals by further strengthening the properties of a class Dedekind self-map in the ways that our results so far suggest. Our first step will be to consider one further strengthening that will allow us to obtain a Dedekind self-map from a functor. To take this step, it will be helpful to clarify the relationship between the different notions of “critical point” we have introduced. 16 The Relationship Between the Different Notions of Critical Point We spend a moment to clarify the relationship between critical points and strong critical points. We introduce one other related concept: Suppose j : V → V is a 1-1 class function. A set X in the domain of j is a weak critical point if j(X) 6= X. In general, the relationships between these three notions of critical point are summarized in the following proposition: Proposition 52. (ZFC − Infinity) (1) Suppose j : V → V is a 1-1 class function. If X is a strong critical point or a critical point, then X is also a weak critical point. (2) There is a 1-1 class map j : V → V having a weak critical point that has no strong critical point. (3) There is a 1-1 class map j : V → V having a weak critical point but no critical point. (4) There is a 1-1 class map j : V → V that has a strong critical point but no critical point. (5) There is a 1-1 class map j : V → V that has a critical point but not a strong critical point. Proof of (1). If X is a strong critical point, certainly j(X) 6= X, so X is also a weak critical point. If X is a critical point, then since X is not in the range of j, j(X) 6= X, so X is a weak critical point. Proof of (2). Obtain j : V → V as follows. Let idV be the identity on V . Define j as follows:   idV (x) if x 6= 1 and x 6= {1} j(x) = {1} if x = 1   1 if x = {1} Here, j(1) 6= 1, so 1 is a weak critical point. However, |j(x)| = |x| for all x ∈ V , so j has no strong critical point. 87 Proof of (3) & (4). Obtain j : V → V as follows:   id(x) j(x) = 0   1 follows. Let id be the identity map. Define j as if x 6= 0 and x 6= 1 if x = 1 if x = 0 Now j(0) 6= 0 so 0 is a weak critical point. However, ran j = dom j, so j has no critical point. Also, notice that |j(0)| = 6 0, so 0 is a strong critical point. Proof of (5). Define j : V → V by ( {n + 1} j(x) = x if x = {n} for some n ∈ ω otherwise. Certainly j is 1-1 and has critical point {0}. However, for each n ∈ ω, |{n + 1}| = |j({n})| = |{n}| = 1, so j has no strong critical point. Despite these differences, when j satisfies certain additional preservation properties, these three notions of critical point coincide. Moreover, we will be especially interested in maps that satisfy these preservation properties. We will see that for self-maps exhibiting strong preservation properties, these distinct notions of critical point will turn out to be equivalent. Definition 6. Suppose j : V → V is a 1-1 class function. (1) j preserves ∈ if, whenever x, y ∈ dom j and x ∈ y, then j(x) ∈ j(y). (2) j preserves cardinals if, whenever γ ∈ ON is a cardinal, j(γ) is also a cardinal. (3) j preserves functions if, for any f : X → Y and X, Y ∈ dom j, j(f ) is a function from j(X) to j(Y ) and, for all x ∈ X, j(f (x)) = j(f )(j(x)). This property of j can be represented by a commutative diagram: f X j - Y j ? j(f ) j(X) ? - j(Y ) (4) j preserves images if, whenever f : X → Y is a function, j(f ) is a function and j(ran f ) = ran j(f ). (5) j preserves ordinals if, whenever α is an ordinal, j(α) is also an ordinal. j reflects ordinals if, whenever j(x) is an ordinal, x is also an ordinal. Any 1-1 class function j : V → V that satisfies (1)–(4) will be called basic structure-preserving, or BSP. 88 Lemma 53. (ZFC − Infinity) Suppose j : V → V is a 1-1 class function. (1) If j preserves ∈, then, for all ordinals α, β, if j(α) ∈ j(β), then α ∈ β. (2) If j preserves ordinals and ∈, then, for all ordinals α, j(α) ≥ α. (3) Suppose j preserves ordinals and ∈. Then for all x, if x and j(x) have the same rank and for all y for which rank(y) < rank(x) we have j(y) = y, then j(x) = x. (4) If j preserves ordinals, preserves ∈, and has a weak critical point, then there is an ordinal α such that j(α) 6= α. In particular, j ON : ON → ON has a weak critical point. (5) Suppose j preserves ordinals, is BSP, and has a weak critical point κ. Then κ is a cardinal. Proof of (1). Suppose j(α) ∈ j(β). If α 6∈ β, then either α = β or β ∈ α (since ∈ is a total ordering on ON). If α = β, then we have j(β) = j(α) ∈ j(β) which is impossible by irreflexivity of ∈. If β ∈ α, then j(β) ∈ j(α) ∈ j(β), and this contradicts the fact that ∈ is both irreflexive and transitive. The result follows. Proof of (2). Suppose not; let α be least such that j(α) < α. Since j preserves ordinals and ∈, j(j(α)) < j(α). But now j(α) is a smaller ordinal β with the property that j(β) < β, contradicting leastness of α. Proof of (3). We first show x ⊆ j(x): Let y ∈ x. Since j preserves ∈, j(y) ∈ j(x). Since rank(y) < rank(x), y = j(y). It follows y ∈ j(x). Conversely, if y ∈ j(x), then y is of lower rank, so y = j(y). Since j(y) = y ∈ j(x), it follows by (1) then y ∈ x, as required. Proof of (4). Suppose j(α) = α for all ordinals α. Let x be a weak critical point for j, so j(x) 6= x. Let α = rank(x) + 1, and let X = Vα . Let M = {x ∈ X : j(x) 6= x}. The fact that M is a set follows from an application of Separation. Let B = {rank(x) : x ∈ M }. B is a set by Replacement. Also, B 6= ∅ since M 6= ∅. Let γ = inf B and let y ∈ M be such that rank(y) = γ. By (3) and the leastness of rank(y), we must have j(y) = y, and we have a contradiction. Therefore, j ON : ON → ON has a weak critical point. Proof of (5). Let κ be the least ordinal moved by j. By (2), j(κ) > κ. Suppose α < κ and f : α → κ is an onto function. By leastness of κ, j(α) = α. Since j preserves images, j(f ) : j(α) → j(κ) is also a function, and (+) ran j(f ) = j(ran f ) = j(κ). Since j(α) = α, j(f ) : α → j(κ). We show f = j(f ): For any β ∈ α, because j preserves functions and j(β) = β, we have j(f )(β) = j(f )(j(β)) = j(f (β)) = f (β). The last step follows because f (β) ∈ κ and κ is the least ordinal moved by j. Therefore j(f ) = f and so ran j(f ) ⊆ κ, which contradicts (+). Therefore, no such onto function exists, and κ is a 89 cardinal. With these preservation properties in mind, we can return to the task of verifying a point mentioned earlier, regarding Theorem 47. In that theorem, one of the hypotheses concerning j : V → V was that one of its critical points a belongs to a set of the form j(A). We described earlier (p. 80) several sufficient conditions for this hypothesis to hold true. One of the conditions described there is the following: j preserves ∈ and both preserves and reflects ordinals. We explain why this condition suffices: Theorem 52 shows that if j : V → V is a Dedekind self-map, j has a weak critical point; moreover, since we are assuming j preserves ∈ and ordinals, there is, by Lemma 53(4), a least ordinal α0 such that α0 < j(α0 ). This ordinal α0 must be a critical point of j because (a) for all ordinals α, α ≤ j(α) (Lemma 53(2)), and (b) by the ordinal reflecting property, no non-ordinal is mapped to α0 . It follows therefore that α0 itself is the required set A; that is, α0 ∈ j(α0 ). We show now that when a self-map j : V → V satisfies a modest set of the preservation properties described so far, the three notions of critical point coincide. Theorem 54. (ZFC − Infinity) Suppose j : V → V is a 1-1 class function that preserves and reflects ordinals, and is BSP. Then the following are equivalent. (1) j has a weak critical point. (2) j has a critical point. (3) j has a strong critical point. Proof. We have already seen that (2) ⇒ (1) and (3) ⇒ (1). It suffices to prove (1) ⇒ (2) and (1) ⇒ (3). Assume j has a weak critical point. Using Lemma 53(4),(5), there is a cardinal κ that is moved by j and that is the least ordinal moved by j. Assume for a contradiction that κ ∈ ran j, and let j(x) = κ. Because j reflects ordinals, x must also be an ordinal. Because α ≤ j(α) for all ordinals α, it follows that x < κ. However, existence of such an ordinal x contradicts the leastness of κ. It follows that κ is a critical point of j. Also, since κ is a cardinal and j preserves cardinals, j(κ) is also a cardinal. Since |κ| = κ < j(κ) = |j(κ)|, we conclude that κ is also a strong critical point of j. Remark 13. The property of j that it reflects ordinals is used only in the proof that a weak critical point is also critical point; all other implications in the theorem are provable without assuming that j reflects ordinals. 17 Class Dedekind Self-Maps and Functors In this section, we lay the groundwork for a discussion about class Dedekind self-maps that are functors. 90 Definition 7 (Functors). Suppose C = (OC , MC ) and D = (OD , MD) are catgories. A functor F : C → D is a transformation that maps OC to OD and MC to MD , satisfying the following properties: (1) Whenever f : A → B belongs to MC , F(f ) : F(A) → F(B) belongs to MD . (2) Whenever f : A → B and g : B → C belong to MC , we have F(g ◦ f ) = F(g) ◦ F(f ). (3) For any object A ∈ OC , F(1A ) = 1F(A). Notice that if F : Set → Set is a 1-1 functor and there is some Y ∈ V not in the range of F, then F can be seen as a class Dedekind self-map from V to V . As we will show now, the notion of a functor is related to that of a Dedekind self-map that preserves functions. To explain this relationship, we introduce the notion of a natural transformation from one functor to another. If F : C → D and G : C → D are both functors, one can define a certain type of transformation—called a natural transformation—from F to G that preserves the relationships exhibited by F now within the context of G. Specifically, we have the following definition: Definition 8 (Natural Transformations). Suppose F : C → D and G : C → D are functors. A natural transformation η from F to G associates to each object X in C a map ηX : F(X) → G(X) so that for any C-morphism f : X → Y , the following is commutative: F(X) F(f ) - F(Y ) ηX ? G(X) ηY G(f ) ? - G(Y ) Proposition 55. Suppose j : V → V is a 1-1 functor that preserves ∈. Define a mapping ̂ that associates to each set X the map jX = j X : X → j(X), and suppose that ̂ is a natural transformation from 1V to j (where 1V denotes the identity functor from V to V ). Then j preserves functions. Proof. We first observe that, because j preserves ∈, it follows that for every X, j[X] ⊆ j(X), so j X : X → j(X). Suppose f : X → Y is a function. By naturality of ̂, the following is commutative: X f jX ? j(X) - Y jY j(f ) ? - j(Y ) But now, tracing through the diagram beginning with any x ∈ X, we obtain j(f (x)) = (j(f ))(j(x)), 91 as required. The proposition says that relationships indicated by the identity functor are preserved by j in the structure of any function j(f ). This preservation of “silence” in dynamism is a characteristic of a class Dedekind self-map that is a BSP functor, but is not a characteristic of class Dedekind self-maps in general—for instance the global successor function s : V → V : x 7→ x ∪ {x} does not have this property. The result shows that deeper aspects of the self-referral flow of consciousness, as understood in the ancient wisdom of life, begin to be reflected more profoundly in Dedekind self-maps equipped with stronger preservation properties. We will see this theme develop more fully as we introduce still stronger preservation properties. Having introduced the notions of functor and natural transformation, we are ready to discuss one final way in which Dedekind self-maps emerge from the dynamics of a class Dedekind self-map. This final example does not make direct use53 of preservation properties as a way to enhance the class map. Instead, the mechanism for generating a Dedekind self-map that is a set is to locate a functor that can play the role of a self-map generator in such a way that it provides a balancing counterpoint to another kind of functor that plays the role of a self-map annihilator. Since we wish to “create” a Dedekind self-map—in effect, create something infinite—we will work, as in previous sections, in the theory ZFC − Infinity. At the same time, we will need an ambient category in which our new Dedekind self-map can appear. But because Dedekind selfmaps may not exist in the absence of an Axiom of Infinity, the category SelfMap may be empty, and therefore is not a suitable choice. We therefore introduce a slightly more general category SM whose objects are arbitrary self-maps—functions from a set to itself—and whose morphisms are defined as in SelfMap. More precisely, SM = (O, M) where O = {f : A → A | A is a set}, and an element α : f → g of M, where f : A → A, g : B → B belong to O, is a function α : A → B making the following diagram commutative: A f - α ? B A α g - ? B There is a naturally defined functor G : SM → Set—known as the forgetful functor—that has the effect of annihilating the structure of any self-map by outputting just the domain of the self-map. Thus, for any self-map g : B → B, G(g) = B. It turns out that the existence of a (set) Dedekind self-map is equivalent to the existence of a kind of counterbalancing functor F : Set → SM to G, called a left adjoint to G. Such a functor F would produce, from any given set A, a self-map fA : XA → XA ; that is, F(A) = fA . Moreover, F must also satisfy the following “balance condition”: For any self-map g : B → B and any set A, the number of Set morphisms from A to G(g) = B (that is, the number of ordinary functions there are from A to B) precisely equals the number of SM morphisms α from F(A) = fA to g. 53 In fact, preservation properties are involved, but the self-map that we end up defining does not actually exhibit the interesting properties that its factors do. 92 As a matter of notation, in general, for any sets X, Y , we let Set(X, Y ) denote the set of all functions from X to Y , and for any self-maps u : X → X, v : Y → Y in SM, we let SM(u, v) denote the set of all SM-morphisms from u to v. Recall that a morphism α : u → v is a function α : X → Y so that the following is commutative: - u X X α α ? - v Y ? Y With this notation, assuming F is a left adjoint to G, for any A in Set and g : B → B in SM, we let ΘA,g : SM(F (A), g) → Set(A, G(g)) denote a bijection between these two sets of morphisms. A further requirement, which we express in only an intuitive way here, is that the way the bijection Θ is defined is “the same” for all choices of A, g; this can be expressed precisely54 54 We give the precise formulation here, for the avid student of mathematics. For any category C, C op is another category, the opposite category of C, whose objects are the same as those of C and whose morphisms are those of C, but reversed (so if f : A → B is a morphism of C, then f op : B → A is a morphism of C op ). In the present context, define functors ΓSet : Setop × SM → Set and ΓSM : Setop × SM → Set defined on objects by ΓSet (A, β) = Set(A, G(β))), and ΓSM (A, β) = SM(F(A), β), where β : B → B is any SM-object. These functors are defined on respective morphisms as follows: Given any (A, β), (C, δ) ∈ Setop × SM and any Setop × SMmorphism hf op , gi : (A, β) → (C, δ), and any h : A → G(β) in Set, we have f G(g) h ΓSet (hf op , gi)(h) = C −→ A −→ G(β) −→ G(δ) = G(g) ◦ h ◦ f, and given h : F (A) → β in SM, we have F(f ) h g ΓSM (hf op , gi)(h) = F(C) −→ F(A) −→ β −→ δ = g ◦ h ◦ F(f ). Then to say that ΘA,β : SM(A, G(β)) → Set(F (A), β) is natural in A and β means Θ is a natural transformation from ΓSM to ΓSet , which in turn means that, for each Setop × SM-morphism hf op , gi : (A, β) → (C, δ), with β : B → B and δ : D → D, the following diagram is commutative (here, natural maps are drawn horizontally rather than vertically, as was originally done in the definition of “natural”): ΓSM (A, β) - ΓSet (A, β) ΘA,β ΓSM (f op ,g) ? ΓSM (C, δ) ΓSet (f op ,g) ? - ΓSet (C, δ) ΘC,δ It is convenient to draw the diagram in the following way: SM(F(A), β) - Set(A, G(β)) ΘA,β h7→g◦h◦F(f ) ? SM(F(C), δ) h7→G(g)◦h◦f ? - Set(C, G(δ)) ΘC,δ 93 by providing a certain natural transformation between functors of the right type, depending on A, g. Assuming such a functor F, and corresponding natural bijections ΘA,g , can be found, we indicate how a (set) Dedekind self-map emerges from the relationship between F and G. Once we have F and Θ, we can locate a single seed computation Θ1,f1 (recall F(1) = f1 : X1 → X1 ) from which the values of F can be derived, and from which a Dedekind self-map emerges. Notice that Θ1,f1 is the bijection from SM(f1 , f1 ) to Set(1, G(F(1)) = Set(1, X1). Let η = Θ1,f1 (1f1 ) (where 1f1 is the identity morphism in SM from f1 to itself). The function η picks out one of the maps 1 → X1 from the collection Set(1, X1). In this way η picks out a point that will turn out to be a critical point in X1 : Write η0 = η(0) ∈ X1 . The next point, which follows from the fact that ΘA,β is a natural bijection, is that, for any other map g from 1 to a set of the form G(h : M → M ) = M , we can find a unique SM-map As a matter of terminology, notice that in the above discussion, we begin with a morphism hf op , βi : (A, β) → (C, δ) in Setop × SM, which, by definition, is precisely the morphism hf, βi : (C, β) → (A, δ) in Set × SM. Applying ΓSM produces a morphism ΓSM (A, β) → ΓSM (C, δ) and applying ΓSet produces a morphism ΓSet (A, β) → ΓSet (C, δ). In each case, the domain and codomain of f , but not of β, have been reversed by the functors. We say that ΓSM and ΓSet are contravariant in the first argument, covariant in the second argument. 94 τ : f1 → h that takes η0 to g(0) and makes the following diagram commutative: 55 55 We prove a more general fact and indicate how the result mentioned in the text follows. For each Set object A, define a function ηA : A → GF(A) by ηA = ΘA,F(A) (1F(A) ). η is called the unit of the adjunction F a G. We show that ηA has the following universal property: Given any k : A → G(β), we show there is a unique k : F(A) → β such that the following diagram commutes: - GF(A) ηA A H H H HH j k G(k) ? G(β) We first observe that, for any g : F(A) → β, we have ΘA,β (g) = G(g) ◦ ηA . We show this by studying a special case of the main diagram from the previous footnote (showing naturalness of Θ), where f op is the identity map 1F(A) : SM(F(A), F(A)) - Set(A, GF(A)) ΘA,F(A) h7→g◦h h7→G(g)◦h ? SM(F(A), β) ? - Set(A, G(β)) ΘA,β By commutativity of the diagram, we obtain, for any g : F(A) → β in SM, ` ´ ` ´ (∗) G(g) ◦ η = ΓSet ΘA,F(A) (1F(A) ) = ΘA,β ΓSM (1F(A) ) = ΘA,β (g). We now prove that ηA has the desired universal property: As described earlier, we are given k : A → G(β). Since ΘA,β is a bijection, we can obtain k such that ΘA,β (k) = k. Now by (∗) we have k = ΘA,β (k) = G(k) ◦ ηA , as required. To show that k is unique, assume k = G(`) ◦ ηA , for some ` : F(A) → β. Let ` = ΘA,β (`). Then by (∗), ` = ΘA,β (`) = G(`) ◦ ηA = k. Since ΘA,β is a bijection, ` = k. Finally, we show how these observations demonstrate the claim in the text. The function η described in the text is, in the notation of this footnote, η1 . Suppose we are given g : 1 → G(h), where h : M → M is an object in SM. By the universal property of η1 , we can find a unique τ : F(1) → h so that the following is commutative: 1 η1 H H - GF(1) H HH j g G(τ ) ? G(h) 95 X1 f1 - τ τ ? M X1 h - ? M We call this the universal property of f1 . We verify in several steps that f1 : X1 → X1 is a Dedekind self-map with critical point η0 : First we observe that X1 has more than one element: If, for a contradiction, we suppose X1 = {η0 }, then let Y = {0, 1} and h : Y → Y be defined by h(x) = 1 − x. There should be a map τ taking η0 to 0 and making the following commutative: {η0 } f1 - {η0 } τ τ ? {0, 1} ? h - {0, 1} However, if this were true, we would have 0 = τ (η0 ) = τ (f1 (η0 )) = h(τ (η0)) = h(0) = 1, which is impossible. Next we show that f1 cannot be the identity. Let a 6= η0 be an element of X1 . Define τ1 (a) = τ1 (η0 ) = η0 ; for all other x ∈ X1 , let τ1 (x) = x. Define τ2 = idX1 . Whether τ = τ1 or τ = τ2 , we have, in either case, τ ◦ idX1 = idX1 ◦ τ , violating uniqueness of τ . X1 idX1 - τ1 ? X1 X1 τ1 idX1 - ? X1 Since F(1) = f1 : X1 → X1 , h : M → M , and G(τ ) is simply the Set morphism τ : X1 → M , the commutative triangle implies that g(0) = τ (η1 (0)) (in the text, η1 (0) was denoted η0 ). Also, since τ : F(1) → h is an SMmorphism, we obtain the following commutative diagram: X1 f1 - τ τ ? M X1 h 96 - ? M idX1 X1 - X1 τ2 τ2 ? idX1 X1 f1−1 - ? X1 Next we show that f1 is not onto. Suppose it is. Since f1 is not the identity, f1 6= f1−1 6= idX1 . takes η0 to η0 and makes the following commutative: f1 X1 - X1 f1−1 ? X1 f1−1 f1 - ? X1 Since idX1 also makes the diagram commutative, this is a violation of uniqueness. Next we show that η0 is a critical point. Suppose not; since f1 is not onto, it has a critical point, which we denote x. Let τ be unique such that τ (η0 ) = x and the following is commutative: X1 f1 - X1 τ ? X1 τ f1 - ? X1 Since we are assuming η0 is not a critical point, there is u ∈ X1 such that f1 (u) = η0 . But then, chasing the diagram, we have f1 (τ (u)) = τ (f1 (u)) = τ (η0 ) = x, and so x is in the range of f1 , contradicting the fact that x is a critical point of f1 . We will take a roundabout path to establish that f1 is 1-1. We will first extract, in the familiar way, the sequence s generated by f1 and η0 : s = hη0 , f1 (η0 ), f1 (f1 (η0 )), . . .i, and show that s has no repeated terms. Letting W = ran s, we will then conclude that f1 W is 1-1. We will then show the unique τ : f1 → f1 W must be an isomorphism—in fact, the identity map. This will allow us to conclude not only that f1 itself is 1-1, but in fact that (X1 , f1 , η0 ) is an initial Dedekind self-map. We proceed in a sequence of claims. First, we make use of Theorem 37 to define the class sequence s = hη0 , f1 (η0 ), f1 (f1 (η0 )), . . .i: s0 = η0 sn+1 = f1 (sn ) 97 One may use the Axiom of Separation now to conclude that W = ran s is a set: W = ran s ∩ X1 . Claim A. s has no repeated terms. We show by (class) induction that, for all n, the terms s0 , s1 , . . ., sn are distinct. Let A ⊆ ω be defined by A = {n ∈ ω | s0 , s1 , . . . , sn are distinct}. Certainly 0 ∈ A. Assume n ∈ A, and hence that s0 , s1 , s2 , . . . , sn are distinct. Define a set U = W ∪ {u} where u 6∈ W . Define t : U → U by   if x = si and 0 ≤ i < n si+1 t(x) = u if x = sn   arbitrary otherwise Now, letting ti (x) denote the ith iterate of t at x, with t0 (x) = x, it follows that for 0 ≤ i ≤ n, ti (s0 ) = si = f1i (s0 ), but f1n+1 (s0 ) 6= u = tn+1 (s0 ). By the universal property of f1 , there is a unique τ : X1 → U making the following diagram commutative. X1 f1 - τ τ ? U X1 t - ? U To show that s0 , s1 , . . . , sn , sn+1 are distinct, assume instead that sn+1 = si for some i, 0 ≤ i ≤ n; in other words, f1n+1 (s0 ) = f1i (s0 ) for some i, 0 ≤ i ≤ n. Tracing through the diagram, we have si = ti (s0 ) = ti (τ (s0 )) = τ (f1i (s0 )) = τ (sn+1 (s0 )) = tn+1 (τ (s0 )) = tn+1 (s0 ) = u, which is impossible. Therefore, s0 , s1 , . . . , sn , sn+1 are indeed distinct. Claim B. The map f1 W : W → W is 1-1. Proof. If f1 is not 1-1, since all elements of W are of the form f1n (η0 ), it follows that f1 (f1n (η0 )) = f1 (f1m (η0 )) for some m 6= n. This means f1n+1 (η0 ) = f1m+1 (η0 ), and so the sequence s contains repeated elements, contradicting Claim A. Remark 14. We observe here (for later use) that we have not used the uniqueness of τ in either of our arguments in Claims A and B. Also recall that, in these arguments, it is not yet known that f1 is 1-1. 98 Claim B establishes that our construction does produce a Dedekind self-map, namely f1 W : W → W , with critical point η0 . In fact, using the techniques described earlier, it is straightforward to show that there is a Dedekind self-map isomorphism from (ω, s, 0) to (W, f1 W, η0). Nevertheless, we plan to show that f1 itself is such a map. By the universal property of f1 : X1 → X1 , there is a unique τ : X1 → W taking η0 to η0 and making the following the diagram commutative: f1 X1 - τ ? W X1 τ f1 W - ? W Claim C. X1 = W . Therefore f1 = f1 W is 1-1 and (X1 , f1 , η0) is a Dedekind self-map; indeed, it is an initial Dedekind self-map. Proof. The second clause follows immediately from X1 = W , which we prove now. Consider the following diagram: f1 X1 - τ ? W τ f1 W - incW,X1 ? X1 W ? W incW,X1 f1 - τ ? X1 ? X1 τ f1 W - ? W It is easy to see that the middle square is commutative, using the inclusion map incW,X1 . By the universal property of f1 , idX1 is the unique map taking η0 to η0 for which the following is commutative: X1 f1 - idX1 ? X1 X1 idX1 f1 - ? X1 Therefore, incW,X1 ◦ τ = idX1 . Likewise, since (W, f W, η0) is an initial object in SelfMap, idW 99 is the unique self-map taking η0 to η0 and making the following diagram commutative: f1 W - W idW ? W W idW f1 W - ? W Therefore τ ◦ incW,X1 = idW . It follows that both τ and incW,X1 are bijections. Since one of these is an inclusion map, we conclude X1 = W . Remark 15. We note that the entire proof that f1 : X1 → X1 is an initial Dedekind self-map was derived from the universal property that f1 enjoys (p. 96). We make one final observation about the adjunction F a G and the resulting Dedekind self-map (X1 , f1 , η0 ): It turns out that η0 ∈ G(f1 ) is a universal element for G. In general, if U : C → Set is a functor from some category C to the category of sets, an element a of a set U(A) is a weakly universal element for U if, for every B ∈ C and every b ∈ U(B), there is a g : A → B ∈ C so that U(g)(a) = b. Moreover, a is called a universal element for U if the map g is unique. We can picture this universal property using the following x diagram, identifying an element x of a set X with the map 1 −→ X: 1 a H H - U(A) bHH U(g) j H ? U(B) A g ? B We recall56 that a functor U : C → Set is cofinal if, for every set y, there is c ∈ C such that y ∈ U(c). In this situation, when U has a weakly universal element, and U is cofinal, it follows that V = {U(h)(a) | h is a morphism of C}.57 We verify this point: Given a set b, we show b = U(h)(a) for some h: Using cofinality of U, we let B ∈ C be such that b ∈ U(B), and let h : A → B be a morphism of C for which U(h)(a) = b, as in the definition of a weakly universal element. An important example of a weakly universal element that we encountered earlier arose in the reduced product construction obtained from a nonprincipal ultrafilter D on ω (Example 4, p. 81). Recall that, given such a D, jD : V → V is defined by jD (X) = X ω /D. It turns out that jD is a functor Set → Set; its definition on Set-morphisms is given by: jD (f ) : X ω /D → Y ω /D : [g] 7→ [f ◦ g], 56 57 This notion was defined in exactly the same way for any class function j : V → V on p. 73. In the collection on the right hand side, we ignore morphisms h for which a is not in the domain of U(h). 100 for any f : X → Y . Then [1ω ] ∈ jD (ω) is a weakly universal element for jD : 1 [1ω ] - jD (ω) = ω ω /D H H [f ]HH ω g [g]7→[f ◦g] j H ? ? jD (X) = X ω /D X Given [f ] ∈ jD (X) = X ω /D, with f : ω → X, f itself is the required function: We show that jD (f )([1ω]) = [f ]: jD (f )([1ω]) = [f ◦ 1ω ] = [f ]. Returning to the adjunction F a G, it is easy to see from previous work that, by the universal property of f1 (p. 96), if β : B → B ∈ SM and b ∈ G(β) = B, there is a unique morphism τ : f1 → β in SM so that G(τ )(η0 ) = b. 1 η0 HH - GF(1) bHH j H f1 X1 G(τ ) - τ ? τ ? G(β) X1 β B - ? B Therefore η0 is a universal element for G. Moreover, since G is cofinal (as is easily checked), we have V = {G(τ )(η0) | τ is a morphism of SM}. As we will see,58 a converse is also true: Existence of a universal element for G suffices to guarantee the existence of a Dedekind self-map, and hence also existence of a left adjoint for G. We summarize what we have accomplished so far. We showed that, in obtaining a functor F : Set → SM that is left adjoint to the forgetful functor G : SM → Set—so that, in a sense, the “annihilating” effect of G is counterbalanced by the “generating” effect of F—we were able to locate a seed Θ1,f1 (1f1 )(0) = η0 to generate a Dedekind self-map F (1) = f1 : X1 → X1 = hη0 , f1 (η0 ), f12 (η0 ), . . .i. Moreover, the set X1 is itself obtained by the computation X1 = G(F(1)). We can combine the two influences represented by F and G to produce our final functor: j : V → V , defined by j = G ◦ F. j V @ - V F@ G R @ SM As we have just seen, j has a strong critical point 1, since j(1) = G(F(1)) = X1 (and recall that X1 is itself infinite). 1 is also a critical point since j(0) = 0 and j(A) must be infinite, for all 58 See the footnote, starting on page 103, that discusses essentially Dedekind functors. 101 A 6= 0.59 And in fact 1 is a weak critical point; indeed, 1 is the least ordinal moved by j. Finally, η0 ∈ GF(1) turns out to be a universal element for G, “generating” all of V in the sense that V = {G(τ )(η0) | τ is a morphism of SM}. Although j : V → V has a critical point, we cannot quite guarantee that it is 1-1. Let us call a functor H : V → V essentially Dedekind if it is “naturally isomorphic” to a functor K : V → V which has a critical point and is 1-1 (on objects). It turns out that, if we obtain j = G ◦ F where F is any left adjoint to G, as we have done, though we cannot guarantee that j is a Dedekind 59 We prove this in the next footnote. 102 self-map, it can be shown that j is essentially Dedekind.60 Here, then, we have another example in which a class Dedekind self-map (in this case, essentially Dedekind) gives rise to a set Dedekind self-map. In fact, working in ZFC − Infinity, one can show that existence of the adjoint functor F, and hence, existence of j itself, is equivalent to the Axiom of Infinity.61 60 We outline the argument here and demonstrate, using the same method, two other related facts. We show: (1) Any left adjoint F for G is essentially Dedekind. (2) Whenever F a G and |A| > 0, j(A) = G(F(A)) is infinite. (3) Whenever G has a universal element, there is a naturally defined initial Dedekind self-map. Moreover, G has a left adjoint. For (1), since we have already shown that existence of a left adjoint for G produces a Dedekind self-map, existence of the left adjoint guarantees that ω exists. We can then define a functor F0 by F0 (A) = 1A × s : A × ω → A × ω. Also, if h : A → B is a Set morphism, then F0 (h) : A × ω → B × ω is defined by F0 (h) = h × 1ω . A×ω 1A ×s - A×ω h×1ω h×1ω ? B×ω ? 1B ×s - B×ω It can be shown that F0 is also left adjoint to G. To see that F0 is 1-1 on objects, suppose A 6= B, say a ∈ A − B. Then (a, 0) ∈ A × ω − B × ω, and it follows that GF0(A) 6= GF0 (B). Also, j 0 = G ◦ F0 has a critical point: As before, j 0 (0) = 0, and, for all A for which |A| ≥ 1, j 0 (A) is infinite (since j 0 (A) = A × ω). A fact from category theory ([2] or [26]) is that left adjoints for the same functor must be naturally isomorphic. In this case, this means that there is, for each set A, an SM-isomorphism σA : F(A) → F0 (A) that is natural in A. Applying G, we obtain another isomorphism G(σA ) : GF(A) → GF0 (A) (note that every functor preserves isomorphisms). Moreover G(σA ) can be shown to be natural in A. Therefore, j and j 0 are naturally isomorphic and j 0 is a Dedekind self-map. It follows that j is an essentially Dedekind self-map. For (2), using these ideas, we can now verify that, for any set A having one or more elements, j(A) is infinite. Since j ∼ = j 0 , it follows that |j(A)| = |j 0 (A)| = |A × ω| ≥ ω. For (3), we make use of Remark 15, which tells us that, to show that a triple (A, f, a) is an initial Dedekind self-map, it suffices to show that for any other triple (B, g, b), we can find a unique SM-map τ : f → g that takes a to b and makes the following diagram commutative: A f - τ τ ? B A g - ? B Given a universal element a ∈ G(f ) for G, where f : A → A, we show (A, f, a) is an initial Dedekind self-map. Let (B, g, b) be a triple, with g : B → B and b ∈ G(g) = B. By the universal property of universal elements, there is a unique τ : f → g such that G(τ )(a) = b. The latter statement ensures that τ (a) = b; the former ensures that the diagram above commutes. For the “moreover” clause, the proof given in part (1) shows how to define a left adjoint for G once existence of ω has been established. 61 Assuming the adjoint F exists, we have seen that F(1) is a Dedekind self-map, with j(1) = X1 an infinite set. Conversely, assuming some form of the Axiom of Infinity, we conclude that ω exists; as in the previous footnote, 103 The diagram shows that j is the composition of two functors, F and G; we say that F and G are the factors of j. Because F is left adjoint to G, it turns out that both F and G exhibit strong preservation properties. For instance, it can be shown that F preserves coproducts; in fact, if {Xi | i ∈ I} is a collection of disjoint sets, no matter how big I may be, the functor F has the property that ! [ [ Xi = F(Xi ) . F i∈I i∈I Recall that when we defined “preservation of coproducts” for functors, we considered only the case in which |I| = 2; the preservation property that F exhibits is much stronger. The functor G exhibits the “dual” preservation property of preserving products. Let us say that a class function j : V → V preserves products if, for any sets A, B, |j(A×B)| = |j(A)×j(B)|. It can be shown that, if {Xi | i ∈ I} is a collection of sets, no matter how big I may be, ! Y Y G(Xi ) , Xi = G i∈I i∈I Q where denotes the product operator. Once again, the definition of “preserving products” requires only that |I| = 2; the property exhibited by G is much stronger. The notions of product and coproduct have natural generalizations in category theory to the notions of limit and colimit. These more general definitions allow us to see that products—along with many other well-known constructions in mathematics—are a special type of limit, and likewise, coproducts, and many other constructions, are special instances of the colimit construction. We will not define these more general notions here, but to give a feeling of concreteness, we will at least introduce some notation. Suppose {Xi | i ∈ I} is a collection of sets and suppose H is a functor. We will say that H preserves set-indexed limits if H (Limi∈I (Xi )) = Limi∈I (H(Xi)) , where Lim denotes the limit construction. Likewise, H preserves set-indexed colimits if H (Colimi∈I (Xi )) = Colimi∈I (H(Xi)) , where Colim denotes the colimit construction. Now we are in a position to state the properties of F and G. Because F is left adjoint to G, it can be shown that F preserves all set-indexed colimits and G preserves all set-indexed limits. However, the composition j = G ◦ F exhibits, by comparison, very little in the way of preservation properties. We will discuss this “shortcoming” further in the next section. We may summarize the discussion as follows. We have labeled this theorem The Lawvere Construction since most of the contents of the theorem are due to Lawvere [23]: Theorem 56 (The Lawvere Construction). (ZFC − Infinity) Let G denote the forgetful functor from SM to Set; that is, for every f : A → A ∈ SM, G(f ) = A and for every SM-arrow α : f → g, where f : A → A and g : B → B, G(α) is α : A → B. Suppose G has a left adjoint F : Set → SM. Let ΘA,β denote the natural bijection from SM(F(A), β) to Set(A, G(β)). Write F(A) = fA : XA → XA . Let η0 = Θ1,F(1)(1F(1))(0). Let j = G ◦ F. Then F : Set → SM defined on objects by F(A) = 1A × s : A × ω → A × ω is left adjoint to G. 104 (1) j : V → V is an essentially Dedekind self-map for which 1 is both a critical point and a strong critical point of j. (2) j(1) is itself a Dedekind-infinite set X1 and F(1) is the corresponding Dedekind self-map f1 : X1 → X1 with critical point η0 . Moreover, f1 is initial and X1 = {η0 , f1 (η0 ), f12 (η0 ), . . .}. (3) η0 ∈ G(F(1)) is a universal element of G. Moreover, V = {G(f )(η0 ) | f is a morphism in SM}. (4) F preserves set-indexed colimits and G preserves set-indexed limits. As a final observation, we mention that the emergence of an infinite set from the Lawvere Construction follows exactly the pattern that was mentioned earlier (on p. 73)—that a Dedekindinfinite set occurs as the image of the critical point of the class map j : V → V . 18 Emergence of the Infinite in the Universe In the previous sections, we have examined ways in which a bare class Dedekind self-map can be strengthened so a set Dedekind self-map is guaranteed to exist. In an upcoming section, we will use these observations as tools for generalizing from “something infinite exists” to “large cardinals exist.” In this section, we step back for a moment and place our efforts to strengthen class Dedekind self-maps in a slightly different context. Suppose we live in a ZFC − Infinity world. In that world—unlike the world ZFC− Infinity + ¬Infinity—we cannot be sure whether infinite sets exist. If we look very closely, is it possible to determine how exactly anything infinite comes into being (assuming that it does)? We can view the last few sections of this paper as partial answers to this question. One answer is that the infinite emerges as certain closure properties (Theorem 41) or preservation properties of class Dedekind self-maps emerge. As these closure/preservation properties “come into view” or “become part of reality,” so too do infinite sets. We have also seen that a Dedekind self-map emerges from a kind of alchemy between the collapsing influence of the forgetful functor G : SM → Set and the expansive influence of a left adjoint F : Set → SM to G. When these influences are “balanced,” through the dynamics of adjointness, a set Dedekind self-map is born. There is yet another way that the infinite can be seen to emerge, in this case on the basis of the relationship between the dynamics within SM and points within Set. In the category Set, mappings from the terminal object 1 = {0} (recall, a terminal object in Set is any singleton set) can be used to specify any value in any set. Consider, for example, a set X = {u, v, w}. We can specify each element of X by considering different maps 1 → X. The map U : 1 → X defined by U (0) = u specifies u, and something similar can be done for the other elements of X. When we step into the highly dynamic expansion SM of Set, consisting of all self-maps of sets, the ability to distinguish points within sets becomes more difficult; the terminal object in SM (which is the unique map T : 1 → 1) is unable to distinguish points of the underlying sets of 105 the self-maps belonging to SM. To see this, consider any SM-map β from T to another object f : X → X of SM: 1 T - β ? X 1 β f - ? X Notice that the only points that any β : T → f can pick out from the underlying set X of f are the fixed points of f : Suppose x = β(0). Then f (x) = f (β(0)) = β(T (0)) = β(0) = x. It follows that the only function g : Y → Y for which the maps α : T → g can pick out all points of the underlying set Y of g is the identity map g = idY : Y → Y , since all points of Y are fixed points in that case. The requirement to be able to pick out points in this way leads to the notion of a weak natural number object. A triple (A, j, a), where j : A → A is a self-map and a ∈ A, is a weak natural number object if, for any f : X → X and any x ∈ X, (∗) there is an SM-map β : A → X such that β(a) = x and f ◦ β = β ◦ j; in other words, the following is commutative: A j - β ? X A β f - ? X Notice that a weak natural number object j : A → A accompishes exactly what the SM-terminal object T failed to accomplish: For every element x ∈ X, we can find a β : (A → A) → (X → X) that picks out x: β(a) = x. We verify now that every weak natural number object gives rise to a (set) Dedekind selfmap. Suppose (A, j, a) is a weak natural number object. We first check that a is a critical point of j: Consider the set X = {0, 1} in the diagram above, where f : X → X is defined so that f (0) = f (1) = 1. Now suppose a ∈ ran j, so that, for some u ∈ A, we have j(u) = a. Tracing through the diagram, we see that 0 = β(a) = β(j(u)) = f (β(u)) = 1, which is impossible. Next, we can use transfinite recursion over ω to define the sequence a0 , a1 , a2 , . . . within A, where a0 = a, a1 = j(a0 ), and an+1 = j(an); we let W denote the set of terms of this sequence: W = {a0 , a1 , a2 , . . .}. It follows from Remark 14 that j W : W → W is 1-1. It follows that j W is a Dedekind self-map with critical point a. 106 We have shown therefore that weak natural number objects give rise to (set) Dedekind selfmaps. In this context, we see that the “infinite” emerges from the need, within the dynamic expanded universe SM, to locate point values of underlying sets. In this sense, we see how the “infinite” within the realm of sets is born from the relationship between the self-referral dynamics inherent in sets (made manifest in the universe SM) and the points within those sets. Our demonstration that Dedekind self-maps “arise from” weak natural number objects also shows that Dedekind self-maps arise from weakly universal elements of the forgetful functor G : SM → Set, because the notions of weak natural number object and weakly universal element of G are essentially the same. The close relationship between these notions is most easily seen if we consider the following diagrams. a 1 H - G(j) = A H bH H j H G(τ ) A j - τ ? ? G(β) = B B B τ β - ? B Proposition 57. Suppose j : A → A and a ∈ A. The following are equivalent. (A) (A, j, a) is a weak natural number object. (B) If G : SM → Set is the forgetful functor, a ∈ G(j) is a weakly universal element for G. Remark 16. In particular, since (ω, s, 0) is a weak natural number object, it follows that 0 ∈ G(s) is a weakly universal element for G. Proof. Suppose (A, j, a) is a weak natural number object; we show a ∈ G(j) is a weakly universal element for G. Suppose b ∈ G(β) = B, where β : B → B is an object of SM. Because (A, j, a) is a weak natural number object, there is τ : A → B such that (+) τ (a) = b which makes the square above commute. Viewing τ : j → β as an SM-morphism, it follows from (+) that G(a) = b. We have shown a ∈ G(j) is a weakly universal element for G. Conversely, suppose a ∈ G(j) is a weakly universal element. Suppose β : B → B and b ∈ B. We find τ : A → B taking a to b and making the square above commute. Because a ∈ G(j) is a weakly universal element, we obtain an SM-morphism τ : j → β (which therefore makes the square above commute) satisfying G(τ )(a) = b. Identifying τ with G(τ ), we have τ (a) = b, as required. As a final observation, we point out that a strong converse to the fact that Dedekind selfmaps arise from weak natural number objects is also true. Certainly, existence of a Dedekind self-map suffices to prove the existence of a weak natural number object: An example is (ω, s, 0), where s is the successor function. We may in fact conclude that a stronger type of structure, called a natural number object, can be obtained from a Dedekind self-map. A natural number object is a weak natural number object for which the property (∗) is strengthened to require that 107 the map β is unique. We give the full statement here: A triple (A, j, a), where j : A → A is a self-map and a ∈ A, is a natural number object if, for any f : X → X and any x ∈ X, (∗∗) there is a unique SM-map β : A → X such that β(a) = x and f ◦ β = β ◦ j. We will show that (ω, s, 0) is a natural number object; this will be sufficient since, as we have seen, a Dedekind self-map is enough to generate ω and s. Our argument will show that (ω, s, 0) is initial not only for the category SelfMap, but also for the more general category SM. The argument relies heavily on the fact that (ω, s, 0) is initial for SelfMap. Theorem 58. If there is a Dedekind self-map, there is a natural number object. In particular, if s : ω → ω is the usual successor function on ω, then (ω, s, 0) is a natural number object and is the initial object for the category SM. In detail, for every function f : X → X and any x0 ∈ X, there is a unique β : ω → X with β(0) = x0 and f ◦ β = β ◦ s. ω s - β ? X ω β f - ? X Remark 17. It is straightforward to show that the theorem remains true if we replace s : ω → ω with any Dedekind self-map t : W → W that is Dedekind self-map isomorphic to s. Therefore, objects that are initial in the category SelfMap are also initial in SM. In particular, the self-map f1 : X1 → X1 , discussed earlier, that is the image F(1) by a left adjoint F to the forgetful functor G : SM → Set, is a natural number object.62 Proof. Since there is a Dedekind self-map, we may obtain the usual successor function s : ω → ω and make use of the fact that (ω, s, 0) is initial in SelfMap. Since f : X → X may not be 1-1, we define a 1-1 function F that “preserves” f : Define F : ω × X → ω × X by F (n, x) = (n, f (x)). Clearly, (ω × X, F, (0, x0)) is a Dedekind self-map. Let τ : ω → ω × X be the unique function guaranteed by the initial property of (ω, s, 0). In particular, τ (0) = (0, x0) and the following is commutative: ω s - τ τ ? ω×X ω ? F -ω×X Claim 1. For all n ≥ 1, τ (n) = (n, f (y)), for some y. Proof of Claim 1. We proceed by induction on n ≥ 1. For the base case, τ (1) = τ (s(0)) = F (τ (0)) = F (0, a) = (1, f (a)). 62 This was in fact already proven in the footnote on p. 95. 108 For the induction step, assume the result for n ≥ 1. Then τ (s(n)) = F (τ (n)) = F (n, f (y)) = (n + 1, f (f (y))) = (n + 1, f (z)), where the second equality follows from the induction hypothesis and in the last equation z = f (y). To pick out components from an ordered pair, we use the following notational convention: For any u, v, (u, v)0 = u and (u, v)1 = v. Claim 2. For all n ∈ ω, (F (τ (n)))1 = f (τ (n)1 ). Proof of Claim 2. There are two cases. When n = 0, we have: (F (τ (0)))1 = (F (0, x0 ))1 = (1, f (x0))1 = f (x0 ) = f ((0, x0)1 ) = f (τ (0)1 ). Then, when n ≥ 1, we have, by Claim 1, (F (τ (n)))1 = (F (n, f (y)))1 = (n + 1, f (f (y)))1 = f (f (y)) = f ((n, f (y))1) = f (τ (n)1 ). Claim 3. Define β : ω → X by β(n) = τ (n)1 . Then β satisfies β(0) = x0 and makes the diagram in the statement of the theorem commutative. Proof of Claim 3. We proceed by induction on n ≥ 0. For the first part, β(0) = τ (0)1 = (0, x0)1 = x0 . This is also establishes the first part of the Claim. For the induction step, assume the result for n ≥ 0. Then, by Claim 2, β(n + 1) = τ (s(n))1 = (F (τ (n)))1 = f (τ (n)1 ) = f (β(n)). Finally, we establish uniqueness of β: Claim 4. Suppose h : ω → X satisfies h(0) = x0 and for all n ∈ ω, f (h(n)) = h(s(n)). Then h = β. Proof of Claim 4. We proceed by induction on n ≥ 0. For the base case, clearly, h(0) = x0 = β(0). Assuming the result holds for n ≥ 0, h(s(n)) = f (h(n)) = f (β(n)) = β(s(n)). This completes the proof of the theorem. 19 The “Naturalness” of the New Axiom of Infinity In previous sections, we have examined several principles from which it is possible to derive a Dedekind self-map, working in the theory ZFC − Infinity. In the last section, we summarized this 109 work from the perspective of determining how an infinite set could possibly arise in a ZFC−Infinity world. In this section, we would like to address the more philosophical question, Is the New Axiom of Infinity a natural axiom? A different but related philosophical question, which may need to be addressed in a different way, is whether the New Axiom of Infinity is true. These questions have analogies in the prevailing discussion among logicians concerning large cardinals. Early in set theory’s history, there was considerable division among researchers about whether any large cardinals, particularly the stronger ones, should be considered to exist in the mathematical universe. By now (2014), very few who work in the field of foundations doubt that large cardinals exist. But still there is no general agreement about how to formulate one or more foundational axioms from which large cardinals can be derived. The problem in this case is not the truth of large cardinal axioms but the naturalness of a formulation. When the Axiom of Infinity, as it is known today, was formulated, the issue about naturalness of an axiom that asserts existence of an infinite set was merged with the truth issue. Because it had become clear that a rigorous formulation of the theorems of calculus, and even of the notion of a real number, rested on the existence of an infinite set, one could assert that the Axiom of Infinity should be viewed as both true and natural. Looking upon this early time in set theory’s history with hindsight, it may be worthwhile—in light of the modern attitude toward formulating one or more foundational axioms that could account for large cardinals—to make a distinction between the naturalness and the truth of the Axiom of Infinity. The hope would be that if a case could be made for the naturalness of an infinite set, one may be able to argue in an analogous fashion in support of naturalness of an axiom from which large cardinals can be derived. Cantor’s work made a convincing case that an infinite set must exist. But does it make sense to say that infinite sets occur naturally in the universe simply because there is a need for a more rigorous foundation for calculus and analysis? More convincing would be an observation that, the notion of “infinite set” arises in the way many mathematical structures typically arise. We would therefore like to answer the question, “In what way do infinite sets, or Dedekind self-maps, naturally arise?” Having offered a satisfactory answer, we would then wish to use that answer to help with the larger question, “In what way can it be said that large cardinals naturally arise?” From an answer to this question, we could hope to find a “natural” axiom that could account for all large cardinals. This second question has, like the question about infinite sets, two sides. One side is, “Is it true that large cardinals exist?” This question has been addressed in many ways already by set theorists. An example is the problem of determining whether certain pointclasses in definability theory are determined. A well-known ZFC result due to Martin is that the class of Borel sets is determined. The natural generalizations to analytic sets and ultimately projective sets require large cardinals for a positive answer; for projective sets, Woodin cardinals are necessary. Because of the naturalness of the generalization, one may answer the second question by saying, “The Borel sets are determined; it should also be the case that analytic sets, and even projective sets, should also be determined. Therefore, the large cardinals required in the proof of these results should exist.” A great deal of evidence of this kind has been accumulated in the past half century. Most set theorists believe that this evidence is sufficient to conclude that large cardinals are real entities in the universe. It could also be argued that some of this evidence also provides part of an answer to the second question as well. For instance, an intuitive remark along these lines would be “Projective sets ought to be determined in the same way as Borel sets are determined; therefore, 110 it is natural to conclude that Woodin cardinals exist.” More generally, evidence of this kind has the following flavor: A certain problem has the “expected” solution under the assumption of a large cardinal; the expected solution also implies existence or consistency of the large cardinal in question (or of a large cardinal not too much weaker than the one in question); therefore, it is “natural” to conclude that such large cardinals exist. What we are looking for here, however, is some kind of dynamic overarching principle that is fundamental to mathematics itself, which would naturally suggest the existence of large cardinals; perhaps even suggest the existence of all the known cardinals. We begin with a discussion of naturalness of an infinite set in mathematics. A reasonable starting point is the list of conditions, mentioned in previous sections, that imply, in ZFC−Infinity, the existence of an infinite set. We summarize these conditions for easy reference: (1) The existence of a Dedekind self-map j on V that has a j-closure point. (2) The existence of a Dedekind self-map j on V having a weak critical point A so that ran j A $ j(A). (3) The existence of a Dedekind self-map on V that preserves disjoint unions, the empty set, and singletons. (4) The existence of a Dedekind self-map j : V → V with critical point a, which preserves disjoint unions, intersections, the empty set, and terminal objects, with the additional properties that {a} is also a critical point and a ∈ j(A) is a weakly universal element for j, for some set A. (5) The existence of a self-map on V that preserves finite coproducts and terminal objects. (6) The existence of a Dedekind self-map on V whose critical point is itself an infinite set (arising from the existence of a left adjoint to the forgetful functor G : SM → Set). (7) The existence of a weak natural number object j : A → A, which makes it possible for an SM object to pick out points from objects in Set. (8) The existence of a weakly universal element for the forgetful functor G : SM → Set. In our view, even though all of these points are useful for the purpose of formulating large cardinal analogues, mostly they do not provide compelling reasons for including an infinite set in a ZFC − Infinity universe. The category theorist may find (7) compelling; Lawvere [24, chap. 9] suggests that the requirement to be able to pick out points in this way is fundamental, but it still has the form of “solving a problem in a natural way.” For us, item (6) seems the most promising. The fact that an infinite set, or a Dedekind self-map, is equivalent to the existence of an adjoint functor demonstrates that infinite sets arise in the same way an enormous number of other structures in mathematics arise. For this to be convincing, we will offer some evidence, well-known to category theorists, of the rather ubiquitous role that is played by adjunctions in the creation of mathematical structures and their properties. We will recall several facts as evidence for our thesis. First, we recall the fact that all algebraic 111 theories, as well as many other categories such as SM, can be seen to emerge from the category of sets by way of an adjunction. Second, we will repeat a point made by Lawvere 50 years ago that many of the constructive properties inherent in any model of set theory arise naturally from existence of adjoints. And finally, we will recall a recent theorem which demonstrates that the category of sets can be characterized in terms of maximum-length adjoint strings of a certain kind. Let us begin by quoting category theorists, who find adjunctions to be a fundamental pattern in mathematics. In the preface to his classic text on category theory, S. Mac Lane mentions [26, p. vii] a slogan that captures an intuition shared by experts in the field: “Adjoint functors arise everywhere.” In the same spirit, S. Awodey mentions [2, p. 231] in his text Category Theory, “...in a sense, every functor has an adjoint.” It is obvious to those who have sought to detect the presence of adjunctions that they pervade mathematics. We turn to some evidence that category theorists often cite in support of this point. As is well known (see [34, p. 120], for example), every algebraic theory, such as group theory or ring theory, can be understood as emerging from the category of sets by way of a free object functor, which always turns out to be left adjoint to the forgetful functor. We review the mechanics of this for the case of group theory; it is a theorem of universal algebra that something like this is true for any algebraic theory. We quote some standard facts about groups. First, every group is a homomorphic image of a free group. Free groups can be seen as arising from the free object functor F : Set → Grp, where Grp is the category of groups with group homomorphisms. For each set X, F(X) is the set of all reduced words obtained from the elements of X (see [34]). Let G : Grp → Set be the forgetful functor. Then every set function f : X → G(H) induces a unique homomorphism f : F(X) → H, and in fact there is a natural bijection from Set(X, G(H)) and Grp(F(X), H), and so F a G. Philosophically speaking, one can argue that the category of groups arises from the category of sets by virtue of the fact that G has a left adjoint; in this sense, existence of a left adjoint to the forgetful functor, at least in the world of algebraic theories, is a kind of creational transformation that produces a more structured category from the “structureless” category of sets. And this happens because of the balanced relationship between the “dynamic and creative” influence of F and the “silent and destructive” influence of G. Given any category C, one can ask if it arises from Set in a similar fashion. The answer is that, in many cases, this is indeed the situation, but not always. One can answer this question by examining the internal structure of C, using the Adjoint Functor Theorem. This theorem tells us that G : SM → Set has a left adjoint precisely when an infinite set exists. This demonstrates that the notion of “infinite set” is directly related to the dynamic principle by which algebraic theories arise from adjunctions. This phenomenon therefore provides evidence for the naturalness of infinite sets. As a second point, we observe that the category of sets is an example of a cartesian closed category, whose defining characteristics all arise from adjunctions. Stated simply, a cartesian closed category is one that has a terminal object, is closed under cartesian products (for any objects X, Y , there is another object X × Y in the usual sense), and is closed under taking exponents (for any objects X, Y , there is an exponential object X Y consisting of all morphisms 112 from Y to X).63 These defining axioms can be understood as expressions of adjoint situations, as follows:64 A cartesian closed category can be defined to be a category C equipped with adjoint situations of the following three sorts: (1) Existence of Terminal Object. C has a terminal object if there is an object 1 with the property that, for every X in C, there is a unique morphism X → 1. Existence of a terminal object for C is equivalent to existence of a right adjoint 1 → C to the unique morphism (which, in the category of all categories, is a functor) from C to the category 1. (2) Existence of Products. For any objects X, Y in C, the product of X and Y is an object X × Y together with projection morphisms πX : X × Y → X, πY : X × Y having the universal property: For any object Z in C and morphisms f : Z → X, g : Z → Y , there is a unique morphism hf, gi : Z → X × Y making the following diagram commutative: Z H f g hf,gi HH HH π j ? πY X X ×Y Y X Existence of products in C is equivalent to existence of a right adjoint Π : C × C → C to the diagonal functor ∆ : C → C × C, where ∆ is defined by ∆(X) = (X, X). For each X ∈ C, we let ΠX : C → C denote the X-product functor defined by ΠX (Y ) = X × Y , with the obvious definition on morphisms. (3) Existence of Exponentials. For any objects X, Y in C, an exponential object Y X in C, together with its evaluation map evX,Y : Y X × X → Y , is an object-morphism pair that satisfies the following universal property: Given any object Z ∈ C and any g : Z × X → Y , there is a unique morphism ĝ : Z → Y X such that the following diagram commutes: Z×X ĝ×1X ? g - Y * evX,Y YX ×X For Set, evX,Y is the evaluation map given by evX,Y (f, y) = f (y). Exponentiation by X can be seen as a functor EX : C → C that is right adjoint to the X-product functor ΠX , that is, ΠX a EX . The fact that cartesian closed categories are defined in terms of adjoints indicates that the structure and dynamics of the category Set are intimately tied to adjunctions. 63 The definitions of “product” and “exponential” in this context need to be expressed in the language of categories; each of these is defined entirely in terms or morphisms with universal properties, in the spirit of the definition of coproducts, discussed earlier. 64 See [23]. 113 Another way in which adjunctions make their presence known in the category of sets was discovered relatively recently in work by Rosebrugh and Wood [33]. They show that the category Set can be completely characterized in terms of maximal adjoint strings. We give an overview of this characterization here. After stating some preliminaries, we will give the main result, and then explain the terminology somewhat more and list the background theorems that the result depends on. First, we consider a new way to construct categories. Suppose C is any category with the property that, for any objects C, D of C, the collection C(C, D) of morphisms from C to D in C is a set; such op categories are said to be locally small. We denote by SetC the category of contravariant functors from C to Set whose morphisms are natural transformations between functors.65 C can always op be “embedded” in SetC via what is known as the Yoneda embedding, which is a special functor Y defined as follows: For each object C in C, Y(C) is the functor yC : C op → Set defined by: yC(D) = C(D, C). In other words, yC takes each object D of C to the set of C-morphisms from D to C; in order for C(D, C) to be an object of Set, the condition that C be locally small is necessary. Moreover, for any morphism f : C → E in C, Y(f ) : yC → yE is the natural transformation yf defined, for each object D in C, by f h yfD (h) = D −→ C −→ E = f ◦ h. Rosebrugh-Wood Adjoint String Theorem (RAST) [33]. Suppose C is a locally small category. op (A) Suppose that the Yoneda embedding Y : C → SetC has a left adjoint X, which in turn has left adjoints W, V, U; in other words, suppose C admits the following adjoint string: U a V a W a X a Y. Then C is equivalent to Set. Moreover, the maximum possible length of an adjoint string of this type (beginning at the right with a Yoneda embedding) is 5. (B) An example of an adjoint string of length 5, of the type mentioned in (A), is given by op ∃∃Y0 a K(∃Y0) a ∃Y1 a K(Y1) a YSet : Set → SetSet . We explain the meaning of the notation in (B) and derive the adjoint string mentioned there. We will make use of the following standard results from the literature:66 We let 0 denote the empty category (no objects or morphisms); it is the initial object in the category Cat of categories. We let 1 denote the category having just one object and one morphism; it is the terminal object in Cat. If C is an object in Cat, the unique functor F : C → 1 is denoted !C , or simply ! if the context makes the meaning clear. Note that 0op = 0 and 1op = 1. 65 The definition of natural transformation is given on p. 91. The opposite category of a category C was introduced on p. 93, and the notion of a contravariant functor was introduced on p. 94. 66 These are stated, with references, in [33]. 114 op For each category C, we let K(C) denote SetC .67 K is a contravariant functor from Cat to Cat (equivalently, a functor Cat → Catop ); it is defined on morphisms of Cat in the following way: Suppose C, D are categories and F : C → D is a Cat-morphism. Then op op K(F) : SetD → SetC is defined by K(F)(H) = H ◦ F. Lemma R1 . If L1 , L2 are left adjoint to a functor F, then L1 and L2 are naturally isomorphic. Likewise, if R1 , R2 are both right adjoints of F, then R1 and R2 are naturally isomorphic. Lemma R2 . Whenever L a F, we have K(L) a K(R). Lemma R3 . If the morphisms of C form a set and if D is locally small, and F : C → D, then K(F) has a left adjoint, denoted ∃F and a right adjoint, denoted ∀F. Therefore, ∃F a K(F) a ∀F. op In particular, if F is the Yoneda embedding YC : C → K(C) = SetC , then ∀YC ∼ = YK(C). Therefore: ∃YC a K(YC ) a YK(C). We build the adjoint string of the RAST Theorem: We begin by considering the Yoneda embeddings Y0 : 0 → Set0 ∼ = 1 and Y1 : 1 → Set1 ∼ = Set. Applying R3 to Y0 , we have the following adjoint string: (1) ∃Y0 a K(Y0) a ∀Y0 ∼ = YK(0) ∼ = Y1. Notice that K(Y0) : K(1) → K(0), that is, K(Y0) is the unique morphism ! : Set → 1. This implies that the other two functors in the adjoint string (1) have the signature 1 → Set. Next, we apply R2 to (1): (2) K(∃Y0) a K(!) a K(Y1). op Since ! : Set → 1, K(!) : Set → SetSet . It follows by adjointness that the other two functors in op (2) have signature SetSet → Set. Next, we apply R3 to the functor ∃Y0 : 1 → Set to obtain the following adjoint string: (3) ∃∃Y0 a K(∃Y0) a ∀∃Y0 . Examining (2) and (3) we see that both K(!) and ∀∃Y0 are right adjoints to K(∃Y0), so, by R1 , K(!) ∼ = ∀∃Y0 . Using this observation, we can combine (1), (2) and (3) to produce a length-4 adjoint string: (4) ∃∃Y0 a K(∃Y0) a ∀∃Y0 ∼ = K(!) a K(Y1). 67 The notation K is in honor of Kan who discovered this functor and established the first mathematical results about it. 115 Next, we apply R3 to the functor Y1 : 1 → Set, producing the following adjoint string: (5) ∃Y1 a K(Y1) a ∀Y1 ∼ = YK(1). Examining (4) and (5), we see that K(Y1) has two left adjoints: K(!) and ∃Y1 , which, by R1 , must be isomorphic. Combining (4) and (5), and noting that K(1) ∼ = Set, yields the final result: (6) ∃∃Y0 a K(∃Y0 ) a ∃Y1 a K(Y1) a YSet. What does the RAST tell us about infinite sets? The fact is, the result only makes sense if op infinite sets exist; one cannot even talk about the category SetSet without infinite sets. In this sense then, RAST implies the Axiom of Infinity. We have seen that adjunctions show up throughout mathematics, with or without the Axiom of Infinity. And existence of an infinite set turns out to be equivalent to more than one important result about existence of adjunctions. These points provide evidence that infinite sets are as natural for mathematics as adjoint situations. We make one final point about the emergence of the category SM from Set by way of a (putative) adjunction. We would like to make the point that infinite sets arise not just from adjunctions generally, but from adjunctions as dynamical principles internal to the universe of sets. To see what we mean, we return to the example of group theory. We have suggested that group theory “arises” because of a left adjoint to the forgetful functor G : Grp → Set, and we wish to claim the same holds for SM: that it arises from the existence of a left adjoint to the forgetful functor G : SM → Set. But we cannot really claim these categories “arise” from Set, at least by way of a particular left adjoint, because in each case, the new category must already exist in order for a forgetful functor to exist. We close this section by addressing this point. It turns out that, for algebraic theories, as well as for the category SM, whenever we have a left adjoint F : C → Set to a particular forgetful functor G : Set → C, if we let j = G ◦ F : Set → Set, the category C and the functors F, G can be recovered by examining j and its action on Set, without reference to an external category. This shows that, in the case of G : SM → Set, an infinite set, or equivalently, existence of a left adjoint for G, arises from dynamics entirely internal to Set itself. To explore these details further, we begin by introducing a bit more information about the structure of adjunctions. Suppose C, D are categories and F : C → D, G : D → C are adjoint functors. We observed earlier (see the footnote on p. 95) that the adjunction has an associated unit, denoted η, defined, for each object A in C, by ηA : A → G(F(A)) = ΘA,F(A) (1F(A)), where ΘA,B is the natural bijection from D(F(A), B) to C(A, G(B)). One also defines a co-unit of the adjunction as follows: For each D-object B, B : F(G(B)) → B = Θ−1 G(B),B (1G(B)). One can show that these components of have a universal property that is dual to those of the components of η, and that is a natural transformation. One says that hF, G, η, i is an adjunction. 116 Any adjunction hF, G, η, i determines another functor T : C → C by composition: T = G◦F, as we have already observed in the case of the categories SM and Set. By virtue of the properties of the adjunction, T forms a central part of another structure, called a monad. Given a category C, a monad is a triple hT, η, µi for which T : C → C is a functor, and η : 1 → T and µ : T2 → T are natural transformations, so that (i) for any object A in C and any x ∈ T3 (A), µA (µT(A) (x)) = (T(µA ))(µA (x)). (ii) for any object A in C and any x ∈ T(A), µA (ηT(A)(x)) = x = µA ((T(ηA ))(x)) . Note that the maps T(ηA ) : T(A) → T2 (A), for A ∈ C, are the components of a natural transformation. Any adjunction hF, G, η, i gives rise to a monad hT, η, µi by way of the following definitions (η is the same in both structures): (i) T = G ◦ F (ii) η is the same in both structures (iii) for all objects A in C and x ∈ T2 (A), µA (x) = G(F(A)) (x). A natural question, which is relevant for us in this paper, is whether, given a monad hT, η, µi as above that is obtained from an adjunction hF, G, η, i, is it possible to recover this adjunction, and indeed the category D, from the monad? In general, the answer is no, but when D is an algebraic category, like Grp, or the special category SM that we have been considering, this can indeed be done. We outline the results in the special case in which the adjunction hF, G, η, i is for the functors F : Set → SM and G : SM → Set, and resulting monadic functor is j = G ◦ F. Note that to proceed further, we need to assume some form of the Axiom of Infinity, since existence of F depends on this assumption. Starting with j : V → V (we identify V with Set as we were doing earlier), together with natural transformations η, µ that give us that hj, η, µi is a monad, we build a category V j of j-algebras as follows: V j = {(A, a) | a : j(A) → A}, where each (A, a) in V j satisfies the equations: a ◦ η A = 1A a ◦ j(a) = a ◦ µA . Morphisms of V j of the form f : (A, a) → (B, b) are functions f : A → B satisfying b ◦ j(f ) = f ◦ a. We can define the forgetful functor Gj : V j → Set by Gj ((A, a)) = A. Gj has a left adjoint F defined by F j (A) = (j(A), µA). We have constructed the category V j from j, η, µ, which are mappings belonging to the dynamics of V alone. As it turns out, in building V j , we have reconstructed, up to isomorphism, the category SM. The Eilenberg-Moore comparison functor Φ : SM → V j is defined as follows: For any f : X → X belonging to SM, Φ(f ) is the j-algebra (X, G(f )). Given an SM-morphism j 117 h : f → g, where f : X → X and g : Y → Y , Φ(h) : (X, G(f )) → (Y, G(g )) is the function h : X → Y . One shows that Φ is the unique functor satisfying Gj ◦ Φ = G and Φ ◦ F = F j . Moreover, we have (A) Φ : SM → V j is an isomorphism (B) j = Gj ◦ F j (C) F j (1) = Φ(F(1)) = (X1 , j(X1) → X1 ) yields a j-algebra based on an infinite set; in particular, it continues to be true that X1 is the least critical point of j. By (C), we see how an infinite set emerges from the dynamics of j : V → V without reference to an external category. Our discussion in this section has made a case for the claim that existence of infinite sets is “natural” because they arise, like so many structures in mathematics, from adjunctions. Our discussion of monads gives us additional information: Every adjunction between Set and another category produces a monad j : V → V . So, like adjunctions, monads are fundamental to mathematics, and we have seen that an infinite set arises naturally from a monad j : V → V —a certain type of (essentially) Dedekind self-map—in particular, from its critical point. Moreover, the dynamics of this monad can be seen to arise, by the Eilenberg-Moore results, entirely within the universe (and not dependent on the presence of external categories). We can now ask whether the naturalness of infinite sets provides some insight into the naturalness of large cardinals in the universe. We will explore this topic further in Section 22. A reasonable hope is that the arguments for naturalness of infinite sets that we have discussed here might carry over to the context of large cardinals in the following ways: (1) “monad-like” Dedekind self-maps j : V → V occur naturally in mathematics, and large cardinals may be seen to emerge from the dynamics of such a map (2) the “largeness” of large cardinals may naturally become apparent in examining the behavior of such a j at a canonical critical point—such as its least critical point. Our argument for the naturalness of infinite sets suggests a point of view about larger infinities and points to an axiomatic formulation, involving (1) and (2). We have provided evidence for the claim that a formulation of an axiom that accounts for large cardinals and that has the characteristics of (1) and (2) is natural. Moreover, our experience with Dedekind self-maps of this kind tell us that they can be “tuned” by requiring them to satisfy certain preservation properties. Maps of this kind, as described in (1) and (2), are natural, and the ones that have been “tuned” yield the strongest large cardinal consequences. 20 Tools for Generalizing to a Context for Large Cardinals As we discussed at the beginning of this paper, one of our motivations for formulating a new Axiom of Infinity is to provide, in the structure of the axiom itself, the sort of intuition about 118 the “Infinite” that could be useful in addressing the Problem of Large Cardinals. In particular, we are looking for patterns in the statement of the axiom that would be natural choices for generalization; the intended result would be stronger, naturally motivated axioms that could possibly be used to derive known large cardinal properties. In this section, we collect together patterns of this kind that seem amenable to natural strengthenings. 20.1 The structure of a Dedekind self-map Our New Axiom of Infinity states simply that a Dedekind self-map exists. The underlying intuition is that the discrete values that belong to an infinite collection arise as a consequence of more fundamental transformational dynamics, namely, some j : A → A having a critical point a. We found that such a j has three salient characteristics: (1) It preserves essential features of its domain: The image B = j[A] is itself a Dedekind infinite set, and j B : B → B is a Dedekind self-map with critical point j(a). (2) It is not simply the identity map—some element of A is moved by j. (3) Something interesting happens in the interaction between j and its critical point. In this case, a blueprint W ⊆ A of the set ω of natural numbers arises from repeated application of j to a: W = {a, j(a), j(j(a)), . . .}. The intuition that arises from this formulation of the “infinite” is that perhaps large cardinal notions also arise from the dynamics of a Dedekind self-map. Since large cardinal properties often involve the entire universe, it is reasonable to look for answers regarding large cardinals in proper class Dedekind self-maps. As we have observed, existence of a proper class Dedekind self-map is not enough to imply the existence of infinite sets, even though the set version of Dedekind self-maps is. We collect together the techniques we have discussed for strengthening such class maps so existence of infinite sets can be derived. We view these techniques as candidates for further development in the direction of deriving large cardinals. We review the content of Theorems 46, 47, and 49: Theorem 46. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map. Suppose also that j preserves disjoint unions, the empty set, and singletons. Then there is an infinite set. Indeed, every critical point of j is infinite. Theorem 47. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map with critical point a. Suppose also that j preserves disjoint unions, intersections, and the empty set, and that there is a set A such that a ∈ j(A). Then there is an ultrafilter D on A. Moreover, if either of the following conditions holds, D is nonprincipal, and A is infinite. (1) j preserves singletons. 119 (2) j preserves terminal objects and {a} is also a critical point of j. Theorem 49. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map with a strong critical point. Suppose j preserves finite coproducts and terminal objects. Then there is an infinite set. Indeed, any strong critical point of j is infinite. We see in each of these results how requiring a class Dedekind self-map j : V → V to have certain combinations of preservation properties results in an infinite set, and that these combinations produce an infinite set that can typically be located at a (strong) critical point of j. A natural direction for further development, then, would be to require j to satisfy additional preservation properties and to see whether the properties of a critical point for j are strengthened in the direction of large cardinals. Theorem 53(3) shows that whenever j preserves ∈ and preserves ordinals, and has any type of critical point, we can locate a least ordinal moved by j; when we are working with a j with these properties, we will let crit(j) denote the least ordinal moved by j; crit(j) will be a natural critical point to examine as we seek to “generate” large cardinals from preservation properties. In this section we will provide two strengthenings of these preservation properties that lead to existence of large cardinals. We focus in this section on perhaps the simplest (and certainly one of the weakest) large cardinal notions—inaccessible cardinals—in order to illustrate the techniques. In Section 21, we will discuss other large cardinals and expand the techniques introduced here so that even the biggest large cardinals can be derived. An infinite cardinal κ is regular if unbounded subsets of κ have size κ; κ is a strong limit if for every λ < κ, 2λ < κ. An uncountable cardinal κ is inaccessible if it is regular and a strong limit. We begin with a few new preservation properties that Dedekind self-maps j : V → V may satisfy. Definition 9. Suppose j : V → V is a Dedekind self-map. (1) j preserves countable disjoint unions if, Swhenever hXS n | n ∈ N i (where either N ∈ ω or N = ω) is a sequence of disjoint sets, j X = n n∈N n∈N j(Xn). (2) j preserves unboundedness if, whenever α ∈ ON and A ⊆ α is unbounded in α, then, if j(A) ⊆ j(α), then j(A) is unbounded in j(α). (3) j preserves power sets if, for all X, j(P(X)) = P(j(X)). (4) Suppose f : A → B and g : A → B are functions. The equalizer E = Ef,g of f and g is the set {x ∈ A | f (x) = g(x)}. If j is a functor, j is said to preserve equalizers if for any f, g as above, j(Ef,g) = Ej(f ),j(g); that is, if E is the equalizer of f and g, then j(E) is the equalizer of j(f ) and j(g). Theorem 59 (Generalization of Theorem 46 to Inaccessibles). (ZFC − Infinity) Suppose j : V → V is a Dedekind self-map that preserves ∈ and preserves ordinals. Let κ = crit(j). (A) Suppose j preserves countable disjoint unions, the empty set, and singletons. Then κ > ω. (B) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, and unboundedness. Then κ is an uncountable regular cardinal. 120 (C) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, unboundedness, and power sets. Then κ is an inaccessible cardinal. Proof of (A). By Theorem 53(4), ω exists and κ ≥ ω. We have, by Theorem 53 and the list of properties that j preserves: ω = {0} ∪ {1} ∪ {2} ∪ · · · = {j(0)} ∪ {j(1)} ∪ {j(2)} ∪ · · · = j({0}) ∪ j({1}) ∪ j({2}) ∪ · · · = j ({0} ∪ {1} ∪ {2} ∪ · · · ) = j(ω). It follows that κ > ω. Proof of (B). By Theorem 53(5), κ is a cardinal. By (A), κ is an uncountable cardinal. Suppose f : α → κ and ran f is unbounded in κ. Using the BSP property of j, we can reason as in the proof of Theorem 53(5) to show that j(f ) : j(α) → j(κ) has domain α and j(f )(β) = f (β) for all β < α. Because j preserves images, we have (a) ran j(f ) = j(ran f ) ⊆ κ, but, because j also preserves unboundedness and ran f is unbounded in κ, (b) j(ranf ) is unbounded in j(κ). Clearly, since κ < j(κ), (a) and (b) contradict each other. Therefore, all functions from α to κ have bounded range. It follows, therefore, that κ is regular. We have shown κ is an uncountable regular cardinal. Proof of (C). We begin by showing that, for every A ⊆ α, where α < κ, we have j(A) = A. Because α < κ and ∈ is preserved, A ⊆ j(A). Suppose γ ∈ j(A) − A. Note that α is the disjoint union of A and α − A: α = A ∪ (α − A). Since γ 6∈ A, it follows that γ ∈ α − A. Since j preserves disjoint unions, α = j(α) = j(A) ∪ j(α − A). Since γ ∈ α − A, γ = j(γ) ∈ j(α − A), and so γ 6∈ j(A), which is a contradiction. We have shown A = j(A). Suppose, for a contradiction, that there is a surjective function g : P(α) → κ, where α < κ. Since j preserves functions and power sets and j(α) = α, j(g) : P(α) → j(κ). Note that, for each A ∈ P(α), g(A) ∈ κ, whence j(g(A)) = g(A). Therefore, for each A ∈ P(α), j(g)(A) = j(g)(j(A)) = j(g(A)) = g(A). Therefore, we have (∗) ran j(g) = ran g = κ. We also have, by preservation of images, (∗∗) ran j(g) = j(ran g) = j(κ) > κ. 121 Clearly, (∗) and (∗∗) contradict each other, and so no such function g exists. We have shown κ is a strong limit, and hence, inaccessible. The next result generalizes the approaches in Theorems 47 and 49, where somewhat different preservation properties, in some cases of a more category-theoretic flavor, were used. These results also incorporate observations from Example 4, which illustrate the role of ultrafilters in climbing further up the hierarchy of infinities. And finally, the important role of universal elements, which, as Proposition 57 shows, was the key to the Lawvere construction of a j : V → V , shows up again in the next theorem as an important element in lifting results leading to infinite sets to results leading to large cardinals. Theorem 60 (Generalization of Theorems 47 and 49 to Inaccessibles). (ZFC − Infinity) Suppose j : V → V is a functor, having a strong critical point, which preserves countable disjoint unions, intersections, equalizers, the empty set, and terminal objects, and has a weakly universal element a ∈ j(A) for j, for some set A. Then there is a collection A of subsets of A with the following properties:68 (1) For every X ∈ A, |X| is inaccessible (2) If X, Y ∈ A, then |X| = 6 |Y |. (3) The size of A itself is inaccessible. Remark 18. We will show that j induces a nonprincipal ω1 -complete ultrafilter on A. The properties (1)–(3) listed above are known consequences of the existence of such an ultrafilter, so we do not provide proofs of these. Note that properties (1)–(3) imply that there are many large cardinals that are less than |A|. We have not required j to be a Dedekind self-map; we do not have a proof, under the given hypotheses, that it must have this property. We can show, however, that j must be essentially Dedekind (see p. 102 for the definition): In the proof given below, Claim 1 shows that j and jD are naturally isomorophic. As mentioned in Example 8, jD itself is 1-1. Also, although the hypotheses require j to have a strong critical point, they do not require the weakly universal element a ∈ j(A) itself to be a strong critical point. However, under somewhat stronger hypotheses, this does turn out to be the case—see Claim 4 of Example 8, where existence of a measurable cardinal is shown to suffice to obtain this consequence. Proof. Let D = {X ⊆ A | a ∈ j(X)}. As in the first part of Theorem 47, D is an ultrafilter (we do not claim yet that it is nonprincipal). Define jD : V → V by jD (X) = X A /D. Claim 1. j is naturally isomorophic to jD . That is, there are, for all sets B, bijections φB : jD (B) → j(B), natural in B. Proof. We first define an onto map φB : B A → j(B), and then show that φB induces a bijection φB : B A /D → j(B). Define φB by φB (f ) = j(f )(a). 68 These properties follow from the fact that, assuming the hypotheses of the theorem, there must exist a measurable cardinal κ ≤ |A|. See Section 21. 122 We use the fact that a is a weakly universal element to show that φB is onto: Suppose y ∈ j(B). By weak universality, there is f : A → B so that y = j(f )(a) = φB (f ), as required. Now define φB : B A /D → j(B) by φB ([f ]) = j(f )(a). Like φB , φB is onto. To see it is well-defined and 1-1, it suffices to show that, for all partial functions f, g : A → B, f ∼ g if and only if j(f )(a) = j(g)(a). f ∼g ⇔ E = Ef,g ∈ D ⇔ a ∈ j(E) ⇔ a ∈ {z | j(f )(z) = j(g)(z)} = Ej(f ),j(g) ⇔ j(f )(a) = j(g)(a). The proof that the φB are components of a natural transformation is straightforward. Claim 2. D is nonprincipal. Proof. Let Z be a strong critical point for j. By Claim 1, Z is also a strong critical point for jD . Assume D is principal. Then there is u ∈ A such that {u} ∈ D; it follows that D = {X ⊆ A | u ∈ X}. For a contradiction, it suffices to exhibit a bijection jD (Z) → Z. For each z ∈ Z, let cz : A → Z be the constant function defined by cz (x) = z for all x ∈ A. Suppose g : A → Z is total and let z = zg = g(u). Let E = Ecz ,g = {x ∈ A | cz (x) = g(x)}. Since u ∈ E, it follows that E ∈ D, and so [cz ] = [g]. Since every [f ] ∈ Z A /D has a representative g that is total, and that for any such g, [g] = [czg ], it follows that Z A /D = {[z] | z ∈ Z}. The map [cz ] 7→ z is therefore a 1-1 correspondence between Z A /D and Z. Claim 3. D is ω1 -complete. S Proof. It suffices to show that if {Xn | n ∈ ω} are disjoint subsets of A with X = n∈ω Xn ∈ D, then for some n ∈ ω, Xn ∈ D. Since X ∈ D and since j preserves countable disjoint unions, [ a ∈ j(X) ⇐⇒ a ∈ j(Xn) ⇐⇒ ∃n ∈ ω a ∈ j(Xn) ⇐⇒ ∃n ∈ ω Xn ∈ D. n∈ω After introducing large cardinals more formally in Section 21, we will give examples of Dedekind self-maps j : V → V that exhibit the properties mentioned in the two previous theorems, assuming the existence of large cardinals. It turns out the most productive direction for generalizing further arises from strengthening the results of Theorem 59. To generalize this kind of preservation, a natural choice is some form of elementary embedding. A map j : V → V is an elementary embedding if, for any formula φ(x1 , x2 , . . . , xn ) in the language of sets and for any particular sets a1 , a2 , . . . , an , φ(a1 , a2 , . . . , an ) holds true in V if and only if φ (j(a1 ), j(a2), . . ., j(an)) holds true in V . 123 Intuitively speaking, elementarity of a Dedekind self-map j : V → V means that all properties69 and relationships of sets that hold in V are preserved by j; preservation of disjoint unions and terminal objects, for example, are special cases of this much stronger form of preservation. Though elementary embeddings seem natural to consider, it turns out that, without some modification, the preservation that they introduce is too strong. A consequence of early work by K. Kunen [20] is that whenever j : V → V is definable in V (as all class Dedekind self-maps from V to V must be), if j is elementary, j must be simply the identity map idV : V → V —it cannot have a critical point (or even a weak critical point).70 One way to pursue this approach in a consistent way is to eliminate the “definability” of j. This can be done by expanding the language of set theory. Ordinarily, the only “extralogical” symbol that is used in set theory is ∈, which is a formal symbol intended to represent the membership relation. In the expanded set theory we are suggesting, we would introduce one additional extralogical symbol j, intended to represent a map from V to V . One can then introduce axioms that collectively assert that j is elementary and has a weak critical point. In fact, the statement that there is a weak critical point is usually strengthened to assert “there is a least ordinal moved.”71 A convention we will adopt for the rest of the paper is to refer to the least critical point of any BTEE-embedding j : V → V as the critical point; the least weak critical point does indeed turn out to be a critical point—indeed, the least critical point in ON—in the usual sense of Dedekind self-maps. Collectively, these new j-axioms are given the name Basic Theory of Elementary Embeddings, or BTEE. Having removed the “definability” of j by requiring it to be an interpretation of a new function symbol j in the extended language {∈, j}, we are free to consider elementary embeddings from V to V , as formulated in the expanded theory ZFC + BTEE, and to consider elementary embeddings in strengthenings of this theory; in this context, the Kunen inconsistency result arises as an extreme special case in which ZFC + BTEE is strengthened “too much” with additional axioms. We examine consequences of the theory ZFC + BTEE after a brief but more formal introduction to large cardinals in Section 21. 21 Introduction to Large Cardinals In the previous section we mentioned some principles and strategies for strengthening a proper class Dedekind self-map in such a way that it could give rise to large cardinals. In this section, we give some background information about large cardinals, and then, in the following section, examine two approaches to achieving our goal. In the 19th century, Georg Cantor demonstrated that, assuming there is an infinite set (like the set ω of natural numbers), there must be an endless hierarchy of ever-larger infinite sizes, called infinite cardinals. He gave these different infinite sizes names: ℵ0 , ℵ1 , ℵ2 , . . . (pronounced “alephzero, aleph-one, . . .”). Later it was recognized that the best way to define cardinal numbers is as 69 Technically speaking, all first-order properties are preserved by an elementary embedding. Kunen’s result is somewhat stronger than this; a detailed discussion is given in [6, 7]. His work shows that whenever j : V → V is elementary and weakly definable, j must be the identity. 71 Details of this approach can be found in [7]. 70 124 special kinds of ordinal numbers. Viewed in this way, the infinite cardinals are ω = ω0 , ω1 , ω2 , . . .. It is known that every infinite set has size that is equal to exactly one of these infinite cardinals (though it is not always possible to determine, for a given set, which of the cardinals represents its size). In the early 20th century, certain combinations of properties of infinite cardinals were discovered to be quite strong; it was not clear that any infinite cardinal could actually have such combinations of properties. We give an example of one such combination. One of the properties involved is that of being an aleph fixed point. A cardinal κ is an aleph fixed point if ωκ = κ; in other words, the index of the aleph is identical to the aleph itself. Existence of such a cardinal seems at first to be unlikely when one considers that, among the smallest infinite cardinals, the index of an aleph is always much smaller than the aleph itself: 0 < ω0 , 1 < ω1 , 2 < ω2 , . . . It is possible however to construct an aleph fixed point. A second property is regularity, for which we gave a quick definition in an earlier section (p. 120). We give an equivalent definition here that may have more intuitive appeal. An infinite cardinal κ is regular if, whenever S is a collection of subsets of κ having these two properties: S (1) S = κ, (2) each element of S has size < κ, it must be the case that |S| = κ. In other words, κ cannot be covered by subsets of smaller size unless one uses κ many of them. ω is an easy example of a regular cardinal: Any cover of the set ω = {0, 1, 2, . . .} by finite subsets of ω must contain infinitely many elements; ω is not covered by finitely many of its finite subsets. A question that was asked early on is: Is there an infinite cardinal that is both regular and an aleph fixed-point? As was discovered many years later, if such a cardinal exists, it certainly cannot be proven to exist. This was one of the first large cardinal notions to be discovered. Any regular cardinal that is an aleph-fixed point is now known as a weakly inaccessible cardinal. We spend a moment to explain why this type of cardinal, like any other large cardinal, cannot be proved to exist using the ZFC axioms alone. In the 1930s, Gödel proved two very important things in mathematical logic that hold the key to understanding the limitative results surrounding large cardinals: Gödel’s Theorems (1) (Completeness Theorem) A mathematical theory is consistent if and only if it has a model. (2) (Second Incompleteness Theorem) If a mathematical theory is strong enough to prove the theorems of Peano Arithmetic, it cannot prove its own consistency, unless it is inconsistent. We begin the discussion by defining some terms used in these statements that may be unfamiliar. A mathematical theory is a collection of axioms—like the ZFC axioms or axioms for Euclidean geometry—together with all the theorems that can be proven from the axioms. A theory is consistent if one cannot prove from the theory a self-contradictory statement such 0 6= 0. A 125 model for a theory is a set (or for some purposes, we can consider a proper class as well) in which the extralogical symbols are interpreted and in which all the axioms hold true. A model of set theory, for example, interprets the ∈ symbol as usual membership, and the axioms of ZFC hold true in the model. An example of a model of set theory is V . However, the Second Incompleteness Theorem tells us that if ZFC is consistent, we cannot prove that ZFC is consistent using ZFC alone; in particular, there is no way to define V within ZFC (otherwise we would have defined from ZFC a model of ZFC). Now we can explain why large cardinals are “large”: Suppose κ is a large cardinal—say κ is a weakly inaccessible cardinal. It can be shown that from κ one can define a transitive model of ZFC.72 It follows, therefore, that one cannot prove from ZFC the existence of any large cardinals. Despite this fact, large cardinals show up as key ingredients in the solutions of many problems in analysis, topology, algebra, and logic. The Problem of Large Cardinals attempts to add to the axioms of ZFC new (natural) axioms from which the known large cardinals could be derived. Some of the better known types of large cardinals are listed below, in increasing order of strength: weakly inaccessible inaccessible Mahlo weakly compact Ramsey measurable supercompact extendible huge superhuge super-n-huge for every n ∈ ω In previous sections, we have already encountered inaccessible cardinals (p. 120), and the essential ingredients of measurable cardinals have also been introduced (Theorem 60 together with Remark 18). Since we will have more to say about measurable cardinals in the next section, we give a definition here: Definition 10 (Measurable Cardinals). Suppose κ is an infinite cardinal and U is an ultrafilter on κ. Then U is saidSto be κ-complete if, for any collection {Xα | α < λ} of elements of U , with λ < κ, we have that α<λ Xα ∈ U. If κ is uncountable, we say that κ is a measurable cardinal if there is a nonprincipal, κ-complete ultrafilter on κ. In Theorem 60, we showed how a nonprincipal ω1 -complete ultrafilter on some set A could be obtained from a Dedekind self-map j : V → V with sufficiently strong preservation properties, 72 For most large cardinals κ, this model is the κth stage of the universe, Vκ , but in some cases, especially for some of the smaller large cardinals (for instance, the weakly inaccessibles), a modification of Vκ is needed. The usual way to handle these special cases is to use the relativized model VκL , where L denotes Gödel’s constructible universe (another proper class model of set theory) and where the notation VκL signifies “the κth stage of the universe, built inside L.” 126 and that the presence of such an ultrafilter guaranteed existence of many inaccessible cardinals. Following this thread for a moment, it can be shown that, whenever such an ultrafilter on a set A exists, there is a measurable cardinal κ such that κ ≤ |A|, with a corresponding κ-complete ultrafilter D. It can then be shown that the set {λ < κ|λ is inaccessible} belongs to D, so that, in a sense, “almost all” cardinals below κ are inaccessible. It is by way of this fact that the conclusion to Theorem 60 can be demonstrated. 22 Some Strengthenings of Dedekind Self-Maps to Account for Large Cardinals In Section 15, we saw how certain combinations of preservation properties belonging to a class Dedekind self-map j : V → V resulted in the emergence of an infinite set, and then, with the introduction of a few more properties, the emergence of inaccessible cardinals. Examples 4 and 5 (p. 81 and p. 82) showed that class Dedekind self-maps with the required properties, at least in some cases, could be constructed under sufficiently strong assumptions. In this section, we begin by showing that the preservation properties of the Dedekind self-map described in Theorem 60, which produced many inaccessibles, can be realized under that assumption of a measurable cardinal. We also give a class Dedekind self-map example to show that the properties of Theorem 59 can also be realized, and carry this work further still to give a much fuller account of nearly all large cardinals. We reproduce Theorem 60 here, and then give the related example. Theorem 60 (ZFC − Infinity) Suppose j : V → V is a functor, having a strong critical point, which preserves countable disjoint unions, intersections, equalizers, the empty set, and terminal objects, and has a weakly universal element a ∈ j(A) for j, for some set A. Then there is a collection A of subsets of A with the following properties: (1) For every X ∈ A, |X| is inaccessible (2) If X, Y ∈ A, then |X| = 6 |Y |. (3) The size of A itself is inaccessible. As was mentioned briefly before, the hypotheses of Theorem 60 imply the existence of a measurable cardinal κ ≤ |A|. The next example exhibits a functor V → V having the properties listed in Theorem 60: Example 8. Suppose κ is a measurable cardinal and D is a nonprincipal, κ-complete ultrafilter on κ. Define jD : V → V as was done in the proof of Theorem 60: jD (X) = X κ /D. Defining jD on functions f : X → Y by jD (f )(g) = [f ◦ g], as before, turns jD into a functor. The proof of the fact that jD is 1-1 and preserves disjoint unions, intersections, and terminal objects is essentially the same (replacing ω with κ) as the corresponding verifications given in Example 4. 127 Claim 1. [idκ ] ∈ jD (κ). Moreover, D = {X ⊆ κ | [idκ ] ∈ jD (X)}. (∗) In addition, [idκ ] ∈ jD (κ) is a universal element for jD . Proof. The proof of (∗) is as in the proof of Theorem 60. The proof that [idκ ] ∈ jD (κ) is a universal element for jD follows the logic given on p. 100. Claim 2. jD preserves countable disjoint unions. S Proof. Suppose X = n∈ω Xn is a countable disjoint union. It is clear that the sets in S {Xnκ/D | n ∈ ω} are also disjoint and that n∈ω (Xnκ /D) ⊆ X κ /D. To show the converse, let f : κ → X. For each n ∈ ω, let Sn = {α S < κ | fκ (α) ∈ Xn }. By κ-completeness, some Sn κ belongs to D. It follows that [f ] ∈ Xn /D ⊆ n∈ω (Xn /D). Claim 3. jD preserves equalizers. Proof. Let f, g : X → Y and let E = Ef,g = {x ∈ X | f (x) = g(x)}. We show that jD (E) = E κ/D is the equalizer EjD (f ),jD(g) of jD (f ), jD(g): [h] ∈ jD (E) ⇐⇒ {α < κ | h(α) ∈ E} ∈ D ⇐⇒ {α < κ | f (h(α)) = g(h(α))} ∈ D ⇐⇒ [h] ∈ EjD (f ),jD (g). Claim 4. κ is a strong critical point for j. Proof. By κ-completeness and the fact that D is nonprincipal, all members of D have size κ; in particular, all final segments [α, κ) belong to D. We show κ < |jD (κ)| = |κκ /D|. Let hfα | α < κi be a sequence of κ functions κ → κ. Define g : κ → κ by g(α) = sup{fβ (α) | β < α} + 1. Then for each β, {α | fβ (α) < g(α)} ⊇ (β, κ) ∈ D. Therefore, |κκ /D| > κ, as required. It can also be shown [10] that κ is a critical point of jD and the least ordinal moved by jD . What’s more, since D is (at least) ω1 -complete, it can be shown that, if we identify κκ /D with its transitive collapse, [idκ ] is mapped to κ; in this sense, κ ∈ jD (κ) itself is a weakly universal element of jD . This Example, combined with Theorem 60, provides the following characterization: There is a measurable cardinal if and only if there is a Dedekind self-map j : V → V that is a functor, preserves unions, intersections, equalizers, the empty set, and terminal objects, and that has a weakly universal element. In [3], Trṅkova-Blass strengthen this characterization by using the notion of exact functor: A functor is exact if it preserves all finite limits and colimits.73 The main result of [3] is the following: 73 Limits and colimits are informally defined on p. 104. 128 Theorem 61 (Trṅkova-Blass Theorem). There exists a measurable cardinal if and only if there is an exact functor from V to V having a strong critical point. We consider next an example showing that the properties of a j : V → V mentioned in Theorem 59 can be realized. We reproduce the theorem here for easy reference. Theorem 59. (ZFC − Infinity) Suppose j : V → V is a Dedekind self-map that preserves ∈ and preserves ordinals. Let κ = crit(j). (A) Suppose j preserves countable disjoint unions, the empty set, and singletons. Then κ > ω. (B) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, and unboundedness. Then κ is an uncountable regular cardinal. (C) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, unboundedness, and power sets. Then κ is an inaccessible cardinal. We mentioned in Section 20 that any elementary embedding j : V → V for which (V, ∈, j) satisfies ZFC + BTEE will possess all the preservation properties that we have required of a class Dedekind self-map in the previous few sections. We formulate this observation as a proposition: Proposition 62. Suppose (V, ∈, j) is a model of ZFC + BTEE, and critj = κ. Then (A) j : V → V is a Dedekind self-map, with critical point κ and strong critical point κ. (B) j is BSP and preserves ∈, ordinals, countable disjoint unions, the empty set, singletons, unboundedness, and power sets. j also reflects ordinals. (C) j preserves intersections, equalizers, terminal objects. (D) j is an exact functor. Remark 19. Part (B) lists the properties mentioned in Theorem 59, while (C) lists the additional properties mentioned in Theorem 60. Proof. We prove some of these claims. Assume, as in the hypothesis, that (V, ∈, j) is a model of ZFC + BTEE and critj = κ. Claim 1. j preserves ∈. Proof. Suppose u ∈ v. Consider the formula φ(x, y) : x ∈ y; certainly V |= φ[u, v]. Then by elementarity, V |= φ[j(u), j(v)]; in other words, j(u) ∈ j(v). Claim 2. j preserves and reflects ordinals. Proof. Let φ(x) be the formula that asserts “x is an ordinal.” Let α be an ordinal. Then V |= φ[α]. By elementarity, V |= φ[j(α)]. Therefore, j(α) is an ordinal. (To be more precise, we would unwind the definition of an ordinal and verify that the elements of the definition are 129 preserved by j. x is an ordinal if and only if (x, ∈) is a well-ordering and x is transitive. x is transitive if and only if, for all u, v, if u ∈ x and v ∈ u, then v ∈ x. Suppose ψ(x) asserts “x is transitive.” Then, formally, ψ(x) ≡ ∀x∀u∀v (u ∈ x ∧ v ∈ u → v ∈ x). So, if w is transitive, V |= ψ[w]. By elementarity, V |= ψ[j(w)]; that is, V |= ∀x∀u∀v (u ∈ j(w) ∧ v ∈ u → v ∈ j(w)). It follows that j(w) is transitive. Similar steps of analysis allow us to conclude that if (x, ∈) is a well-ordering, so is (j(x), ∈). From these observations, whenever α is an ordinal, j(α) is also an ordinal. In future arguments, we will be more informal.) Suppose now that β is not an ordinal; it follows that V |= ¬φ[β]. By elementarity, V |= ¬φ[j(β)], and so j(β) is not an ordinal. This proves that j reflects ordinals. Claim 3. j is a Dedekind self-map, with critical point κ. Moreover, j(κ) > κ. Proof. Since j(κ) 6= κ (by assumption), and since, by Lemma 53, j(κ) ≥ κ, it follows that j(κ) > κ. Suppose κ ∈ ran j. Let j(x) = κ. Since j reflects ordinals, x is an ordinal. If x < κ, then j(x) = x, by leastness, so x > κ. But then j(x) = κ < x, contradicting Lemma 53. The fact that j is 1-1 follows from elementarity: Let φ(x, y) be the formula “x 6= y.” Suppose u 6= v. Then V |= φ[u, v], so V |= φ[j(u), j(v)], and j(u) 6= j(v). So j is 1-1. Claim 4. j is BSP and j is a functor. Proof. Recall that the formula φ(x) that asserts “x is a cardinal” is formulated as follows: φ(x) ⇐⇒ x is an ordinal and for all α < x there is no 1-1 correspondence from α to x. Clearly, if γ is a cardinal, then V |= φ[γ]; by elementarity, V |= φ[j(γ)]. In particular V |= j(γ) is an ordinal and for all α < j(γ), there is no 1-1 correspondence from α to j(γ). Thus, j(γ) is a cardinal. We verify that j is a functor. If f : A → B, since V |= “domain of f is A, codomain of f is B”, we have V |= “domain of j(f ) is j(A), codomain of j(f ) is j(B)”, so j(f ) : j(A) → j(B). A simple application of elementarity shows that j preserves composition of functions and preserves identity functions. We verify that j preserves functions. Suppose f : A → B. Then for all a ∈ A, by elementarity, j(a) ∈ j(A). Note that, for each a ∈ A and each b ∈ B, if f (a) = b, then we have: V |= “b is the value of f at a.” and so V |= “j(b) is the value of j(f ) at j(a).” But this implies that j(f )(j(a)) = j(b), as required. 130 We omit the proof that j preserves images. Claim 5. j preserves intersections and equalizers. Proof. Suppose A, B are sets. Then by elementarity of j, j(A ∩ B) = j({x ∈ A | x ∈ B}) = {x ∈ j(A) | x ∈ j(B)} = j(A) ∩ j(B). Suppose f, g : A → B are functions. Then j(Ef,g) = j({x ∈ A | f (x) = g(x)}) = {x ∈ j(A) | (j(f ))(x) = (j(g))(x)} = Ej(f ),j(g). Claim 6. j is exact. Proof. We show j is left exact; the proof that j is right exact is similar. It suffices to show that j preserves products, equalizers, and terminal objects. We have already established the latter two of these. If A, B are sets, we have j(A × B) = j({(a, b) | a ∈ A and b ∈ B}) = {(a, b) | a ∈ j(A) and b ∈ j(B)} = j(A) × j(B). Corollary 63. (ZFC − Infinity) Suppose (V, ∈, j) is a model of ZFC + BTEE, with κ = critj. Then κ is inaccessible. Proof. This follows from Theorem 59 and Proposition 62. Exhibiting models of ZFC + BTEE of the form (V, ∈, j), where V is the universe of sets, presents a peculiar problem. To see the difficulty, recall how we were able to obtain an example of the properties of Theorem 60: We started with a nonprincipal κ-complete ultrafilter D and defined a jD : V → V with the required properties. However, Kunen’s Theorem [20] tells us that it is impossible to define a j : V → V for which (V, ∈, j) |= ZFC + BTEE; such a j, if it exists at all, must be undefinable (in V ). Assuming large cardinals, one can define set models, and even certain class models, of ZFC + BTEE. We describe such a class model next in the spirit of providing examples to show that the conditions described in our earlier results—in this case, Theorem 59—can be realized. Example 9 (Gödel’s Constructible Universe). Let L denote Gödel’s constructible universe, whose construction we describe now. L, like V , is built in stages, except that at each successor step of the construction, instead of requiring the next stage to contain all subsets of the previous stage, we require only the subsets of the previous stage that are definable (with parameters) from the previous stage. Suppose X is a set and M is a model of a reasonable fragment of ZFC. Then X is a definable subset of M with parameters if, for some formula φ(x, y1 , . . . , yn ) and some elements 131 u1 , . . . , un ∈ M , X = {w ∈ M | M |= φ[w, u1 , . . ., un ]}. We can define the stages of L: L0 = ∅ Lα+1 = {X ⊆ Lα | X is a definable subset of Lα with parameters} [ Lλ = Lα (λ a limit) α<λ L = [ Lα α∈ON In the presence of large enough cardinals, such as a measurable cardinal, it is known that L is much smaller than the full universe V and, in this situation, L can be used as the underlying class for a model of ZFC + BTEE. We state the result, which is due to Kunen:74 Theorem 64 (Kunen). Suppose there is a measurable cardinal. Then there is an elementary embedding j : L → L such that hL, ∈, ji |= ZFC + BTEE. Moreover j is definable in V but not in L. Although we do not have a way of explicitly defining examples of Dedekind self-maps j : V → V for which hV, ∈, ji is a model of ZFC + BTEE, the pattern of generalization that we have described in the previous sections, introducing more and more preservation properties, seems to us to be quite natural. Of course, one could introduce “natural”-looking preservation properties that are not consistent with each other, so one must verify the consistency of the various combinations of properties that are proposed. We have been able to do this so far using explicitly defined Dedekind self-maps j : V → V . Though we will not be able to do this as we introduce strengthenings of BTEE, it is possible to check consistency of the properties involved by building set models, or, occasionally, class models as we have done in Theorem 64. By Kunen’s Theorem, the embedding j of Theorem 64 could not be definable in L, but it turns out to be quite important for j to be definable (or at least partially definable) in V itself. By Proposition 62, j : L → L has all the properties mentioned in Theorem 59, except for the fact that L 6= V . It turns out that the conclusion of Theorem 64—that there is a transitive model of ZFC + BTEE—follows from a large cardinal notion that is weaker than measurable; for instance, assuming there is a Ramsey cardinal, the conclusion of the theorem continues to hold true. Now suppose we have a model hL, ∈, ji, as in the theorem (obtained from existence of a Ramsey cardinal). It is not hard to see that, reasoning as in Proposition 62, j : L → L is an exact functor with a strong critical point. By the Trṅkova-Blass Theorem, one might naturally conclude that, at least relative to the universe L, there should exist a measurable cardinal. This, however is not the case; it is known that measurable cardinals do not exist in L; and one cannot, under any circumstances, prove the existence (or consistency) of a measurable cardinal assuming only existence of a Ramsey cardinal. A closer look at the proof of the Trṅkova-Blass Theorem, or equally well at the proof of Theorem 60, identifies the problem: The definition of the ultrafilter D from j makes sense only if j itself is definable. In the context of Theorem 60, it is implicitly 74 Hereafter, when we talk about Kunen’s Theorem, we will not be referring to the result mentioned in Theorem 64, but rather the result that says there is no weakly definable nontrivial elementary embedding j : V → V . 132 assumed that the Dedekind self-map j : V → V in the hypothesis is definable in V ; if it were not, defining D by D = {X ⊆ κ | κ ∈ j(X)} provides no guarantee that D is a set. But no such assumption about j can be made when j : M → M is supposed to be an elementary embedding with a least ordinal moved, where M is a model of ZFC, because Kunen’s Theorem shows that such embeddings don’t exist. Therefore, the class D cannot be taken to be a set, and the argument cannot be completed. In this case in which j is not definable, exactness of j (and the presence of a strong critical point) are not sufficient to produce a measurable cardinal. This observation suggests a way to strengthen the theory ZFC + BTEE further. It is known that ZFC+ BTEE, on its own, implies the existence of a weakly compact cardinal (and somewhat more). By adding to the axioms of ZFC + BTEE an axiom that asserts that the ultrafilter naturally defined from j is a set (this can be formulated as an instance of the Separation axiom for j-formulas), the theory becomes much stronger. Measurable Ultrafilter Axiom (MUA). The class {X ⊆ κ | κ ∈ j(X)} is a set. We remark that the boldface j signifies that it is a symbol of the expanded language of set theory that we are working in: ZFC + BTEE + MUA. Any model hV, ∈, ji of ZFC + BTEE + MUA will interpret j as a nontrivial elementary embedding j : V → V , where V contains, as one of its elements, {X ⊆ κ | κ ∈ j(X)}. Theorem 65. The theory ZFC+BTEE+MUA proves the existence of many measurable cardinals. Moreover, existence of a supercompact cardinal (even a 2κ -supercompact, where κ is the critical point of the embedding) is sufficient to produce a transitive model of ZFC + BTEE + MUA. In [7], a variety of strengthenings of the theory ZFC + BTEE, like ZFC + BTEE + MUA, are studied. The stronger the models are, the more fully they provide an account of the known large cardinals. Before examining the ultimate limit of this process of generalization, we take a closer look at the theory ZFC + BTEE + MUA to see the extent to which our original intuitions, based on our analysis of Dedekind self-maps and the bare notion of “infinity,” have been successful in leading to an account for the existence of “many” measurable cardinals—entities that exist in the middle-range of the spectrum of large cardinals. 23 The Theory ZFC + BTEE + MUA We recall significant properties of set Dedekind self-maps, which we expected would generalize to the context of Dedekind self-maps V → V in our search for a natural account of the origin of large cardinals (p. 60). We found that if j : A → A is a Dedekind self-map on a set A with critical point a, j exhibits the following characteristics: Properties of Set Dedekind Self-Maps (A) The self-map j preserves essential properties of its domain (A is Dedekind infinite, and so is the image j[A] of j) and of itself (the property of being a Dedekind self-map propagates to the restriction j j[A]). 133 (B) The definition of j entails a critical point that plays a key role in its dynamics. One aspect of those dynamics is that repeated restrictions of j to successive images give rise to a critical sequence W = {a, j(a), j(j(a)), . . .} which is a precursor to the set ω of finite ordinals; the emergence of this critical sequence provides a strong analogy to the ancient and quantum field theoretic perspective that “particles arise from the dynamics of an unbounded field,” where, in this context, particular finite ordinals are viewed as “particles.” The sequence of restrictions j0 , j1, j2 , . . . of j that produce the critical sequence is defined by the clauses: A0 = A j0 = j : A → A crit(j0 ) = a A1 = j[A0 ] j1 = j A1 crit(j1 ) = j(a) An+1 = j[An] jn+1 = j An+1 crit(jn+1 ) = j n+1 (a). Therefore crit(jn ) = j n (a) is a critical point of jn . (C) Through the interplay between j and its critical point, a blueprint (j W, a, E) of the set ω of finite ordinals arises. In particular, j W is a Dedekind self-map, and for every n ∈ ω, there is i ∈ E such that i(j W )(a) = n. (D) In special cases (starting with a transitive set A that is closed under pairs), there also arises a strong blueprint (S W 0 , F W 0 , ∅, E 0), where S is the singleton map x 7→ {x} (which is a Dedekind self-map), W 0 = {∅, {∅}, {{∅}} . . .}, F takes a set to a ∈-minimal element of the set (F is a co-Dedekind self-map and S A is a section of F ), and E has the property that for all x ∈ W 0 there are i, k ∈ E such that i(S W )(∅) = x and k(F W )(x) = ∅. In this case, the strong blueprint gives expression to both the generating effect of a Dedekind self-map and the collapsing effect of the dual map, a co-Dedekind self-map. Theorems 46, 47, and 49 illustrate the role of preservation properties, motivated by (A), in strengthening preservation properties of Dedekind self-maps j : V → V sufficiently to give rise to infinite sets. Moreover, the infinite set generated by each of the self-maps of Theorems 46 and 49 arose as the (strong) critical point of the self-map, as anticipated by (B). Theorem 47 accords with (B) in another way: The critical point of the map is the seed for defining a nonprincipal ultrafilter on a set A; the existence of such an ultrafilter implies that its underlying set is infinite. Elaborating on these results through the introduction of additional preservation properties led to an account of two types of large cardinals—inaccessibles (Theorem 59) and measurables (Theorem 60). In Theorem 59, the critical point turned out to be inaccessible; in Theorem 60, the critical point was the seed that was used to build an ω1 -complete nonprincipal ultrafilter, the existence of which implies the existence of a measurable cardinal. Our approach suggests that these types of large cardinals arise in the same “natural” way as infinite sets do, from Dedekind 134 self-maps V → V with suitable preservation properties, and with the critical point playing a key part in the emergence of the large cardinal at hand. Generalizing preservation properties of Dedekind self-maps to the notion of elementary embedding, as we do when we consider the theory ZFC+BTEE, provides the ultimate generalization of (A). If j : V → V is a Dedekind self-map given by a transitive model of ZFC + BTEE, its critical point is necessarily a large cardinal—at least a weakly compact cardinal. Examining the Dedekind self-maps j : V → V for any model of ZFC + BTEE + MUA will show how other characteristics of Dedekind self-maps that we have identified (and listed in (B)–(D) above) come into play to give rise to still stronger large cardinals. Since such a j is a ZFC + BTEE embedding, it is already clear that j generalizes properties mentioned in (A) and (B) (though the significance of repeatedly restricting j to subsets of its domain is not yet apparent). In particular, in the theory ZFC + BTEE + MUA, if κ is the (weak) critical point of j, κ is measurable, and is in fact the κth measurable. For ZFC + BTEE + MUA there are also natural analogues to (C)–(D), which we describe next. The fact that the collection Uj = {X ⊆ κ | κ ∈ j(X)}, where j : V → V is the embedding provided by a model of ZFC + BTEE + MUA (with critical point κ) turns out to be a set in ZFC+BTEE+MUA opens the door for a new construction, allowing us to generate (and collapse) sets starting from the critical point κ. It can be shown that Uj is a nonprincipal, κ-complete ultrafilter on κ, so we have immediately that κ is measurable. It has one additional property that should be mentioned here: Uj is closed under diagonal intersections. This T means that, whenever hXα : α < κi are elements of Uj , the diagonal intersection {ξ < κ | ξ ∈ α<ξ Xα} also belongs to Uj . Any nonprincipal, κ-complete ultrafilter on an uncountable cardinal κ that is closed under diagonal intersections is said to be a normal measure on κ. We will therefore refer to Uj as the normal measure derived from j. As we will show now, if j : V → V is an MUA-embedding with critical point κ, we can obtain from j, κ a blueprint (`, κ, E) for the stage Vκ+1 of the universe, where ` : Vκ → Vκ is the blueprint map and E consists of certain elementary embeddings (described below). In other words, the blueprint generates all sets in the universe through stage Vκ+1 . In order to understand the definitions of ` and E it is necessary to introduce a method for constructing new universes of set theory—called the ultrapower construction. We will describe the construction and its properties without proofs; proofs may be found in [19]. Suppose κ is a measurable cardinal and U is a κ-complete nonprincipal ultrafilter over κ. Recall that the collection V κ consists of all functions having domain κ. We declare two such functions f, g equivalent, and write f ∼U g, if f and g agree on a set in U , that is, if {α | f (α) = g(α)} ∈ U . It is easy to check that ∼U is an equivalence relation. We denote the equivalence class containing f by [f ]. Since [f ] is a proper class, we re-define it in the following way: Let g be a function κ → V of least rank that is equivalent to f ; so, g ∼u f and g is of least possible rank with this property. Then define [f ] to be {h : κ → V | h ∼u f and rank(h) = rank(g)}. Under this definition, [f ] is always a set (being a subset of some Vα). Let V κ /U = {[f ] | f : κ → U }. Two elements [f ], [g] of V κ /U are equal if f ∼u g. For membership, we write [f ] ∈∗ [g] if and only if {α | f (α) ∈ g(α)} ∈ U . With these definitions of equality and membership, V κ /U can be shown to be a model of ZFC. Moreover, because U is κ-complete, V κ /U is a well-founded model; this means that there is no infinite decreasing 135 ∈∗ -chain in V κ /U , i.e., no sequence f1 , f2 , f3 , . . . so that . . . ∈∗ [f3 ] ∈∗ [f2 ] ∈∗ [f1 ]. Because V κ /U is well-founded and has the property that for each [f ] ∈ V κ /U , {[g] ∈ V κ /U | [g] ∈∗ [f ]} is a set, Mostowski’s Collapsing Theorem can be applied to produce an isomorphic transitive image (N, ∈) of V κ /U , which of course is also a model of ZFC. We identify elements of V κ /U with those of N . We explain this point a bit more. Note that if π : V κ /U → N is the collapsing isomorphism, then [f ] ∈∗ [g] if and only if π([f ]) ∈ π([g]); we will identify π([f ]) with [f ]. Finally, we mention that the map iU : V → V κ /U ∼ = N defined by iU (x) = [cx], where cx is the constant function with value x, is an elementary embedding with critical point κ. The map iU is called the canonical embedding derived from U . We now start moving toward the construction of the blueprint (`, κ, E) for Vκ+1 . The map ` will turn out to be a co-Dedekind self-map, and the elements of E will be restrictions of ultrapower embeddings. We will be able to show that for each X ⊆ Vκ , there is a normal measure U on κ such that if iU is the canonical embedding derived from U , iU (`)(κ) = X. In this way, every set in the stage Vκ+1 , the initial part of the universe up to subsets of Vκ , can be seen as arising from or being generated by the interplay of κ, j and `, where ` is encoded and decoded by E. The essence of the construction of ` is a Vκ+1 -Laver function. For any set X, a function f : κ → Vκ is an X-Laver function at κ if, for each x ∈ X, there is a normal measure U on κ such that iU (f )(κ) = x. The function f is defined by a clever method of encoding information about all possible normal measures over κ. We define f and then explain how ` is obtained from f . We first define a formula that is needed both in the definition of f and in the proof that it has the desired properties: ψ(g, x, λ) : g : λ → Vλ ∧ x ⊆ Vλ ∧ for all normal measures U on λ,iU (g)(λ) 6= x. When ψ(g, x, λ) holds true, it means that g is not a Vλ+1 -Laver sequence at λ: Some subset x of Vλ cannot be computed as iU (g)(λ) for any choice of U . We can now define f : (∗) ( ∅ f (α) = x if α is not a cardinal or f α is Vα+1 -Laver at α where x satisfies ψ(f α, x, α). The definition tells us that f (α) has nonempty value just when the restriction f α is not Vα+1 -Laver at α, and in that case, its value is a witness to non-Laverness. Theorem 66 (Vκ+1 -Laver Functions Under MUA). The function f defined in (∗) is a Vκ+1 -Laver function at κ. Proof. Let j : V → V be the Dedekind self-map, with critical point κ, given to us in a model of ZFC + BTEE + MUA. Suppose f is not Vκ+1 -Laver at κ, so, in particular, for some y, ψ(f, y, κ) holds. We consider j(f ) : j(κ) → Vj(κ) . Notice that j(f ) κ = f : For each α < κ, we have j(f )(α) = j(f )(j(α)) = j (f (α))) = f (α) (since f (α) ∈ Vκ ). By elementarity, j(f ) has the same definition as f : 136 ( ∅ j(f )(α) = x if α is not a cardinal or j(f ) α is Vα+1 -Laver at α where x satisfies ψ(j(f ) α, x, α). In particular, since f = j(f ) κ is not Vκ+1 -Laver, j(f )(κ) is a witness to non-Laverness and ψ(j(f ), j(f )(κ), κ) is true. Let D = Uj be the normal measure derived from j; that is, D = {X ⊆ κ | κ ∈ j(X)}. By MUA, D is a set. Let i = iD : V → V κ /D ∼ = N be the canonical embedding, and define k : N → V by k([h]) = j(h)(κ). One can show that k is an elementary embedding with critical point > κ and makes the following diagram commutative: V j - H H V 6 iDHH k j H N By the diagram, we compute: j(f )(κ) = (k ◦ i)(f )(κ) = k(i(f )) (k(κ)) = k i(f )(κ) . Notice that, since j(f )(κ) ⊆ Vκ , k j(f )(κ) = j(f )(κ). By 1-1-ness of k, therefore, since k j(f )(κ) = j(f )(κ) = k( i(f )(κ) , it follows that j(f )(κ) = i(f )(κ). The import of this last equation is that, while it is claimed that ψ(j(f ), j(f )(κ), κ) holds true, we have just exhibited a normal measure D on κ such that iD (f )(κ) = j(f )(κ). We conclude that f is Vκ+1 -Laver after all. We mention briefly the significance of having V itself as the codomain of j. If the codomain of j were some smaller transitive model M , then, assuming f is not Vκ+1 -Laver at κ does imply the triple j(f ), j(f )(κ), κ satisfies the formula ψ(g, x, λ), but this formula is now relativized to the model M , and so asserts that for no normal measure U in M is it true that iU (f )(κ) = j(f )(κ). So, although it is true that for D as defined in the proof, iD (f )(κ) = j(f )(κ), D may not be one of M ’s normal measures, and so would not give us a contradiction. We turn to the construction of `: (+) ( f (x) `(x) = x if x is an ordinal < κ otherwise. We now show that ` is the blueprint map for a blueprint (`, κ, E) for Vκ+1 , assuming MUA. We will describe the members of E in a more precise way in the discussion in Remark 20, below. 137 Theorem 67 (Existence of Blueprint Self-Maps under MUA). (ZFC + BTEE + MUA) Suppose j : V → V is a a Dedekind self-map given by a model of ZFC + BTEE + MUA, with critical point κ. Then the function ` : Vκ → Vκ defined in (+) has the following property: (∗∗) For any set X ⊆ Vκ , there is a normal measure U on κ such that iU (`)(κ) = X, where iU : V → V κ /U ∼ = N is the canonical elementary embedding. Moreover, ` is a co-Dedekind self-map. For the rest of this section, we shall say that a self-map having the property (∗∗) has the Laver property at κ. Proof. Note that for all α < κ, f (α) = `(α). In particular, if T = {α < κ | f (α) = `(α)} and U is a normal measure on κ, then T ∈ U . Therefore, if i = iU is the canonical embedding, κ ∈ i(T ) = i ({α < κ | f (α) = `(α)}) = {α < i(κ) | i(f )(α) = `(f )(α)}. It follows that, for every normal measure U on κ, iU (f )(κ) = iU (`)(κ). It follows that ` has the Laver property at κ since f does. To see that ` is a co-Dedekind self-map, it is sufficient to show that, whenever f : κ → Vκ is Vκ+1 -Laver for subsets of Vκ, for each x ∈ Vκ, |f −1 (x)| = κ. Given x ∈ Vκ , let Tx = {α < κ | f (α) = x}. Let U be a normal measure on κ such that iU (f )(κ) = x. Note that iU (x) = x since x ∈ Vκ . Then since κ ∈ {α < i(κ) | i(f )(α) = x} = i(Tx), it follows that Tx ∈ U . Therefore |f −1 (x)| = |Tx | = κ. Remark 20 (The Blueprint Coder E). We describe the blueprint coder for the blueprint of Vκ+1 in more detail. Suppose j : V → V is a a Dedekind self-map given by a model of ZFC+BTEE+MUA, with critical point κ. For each normal measure U on κ, let iU : V → V κ /U ∼ = MU be the canonical κ /U defined in the same way as embedding and let iU = iU Vκ+1 (equivalently, iU : Vκ+1 → Vκ+1 iU ). We define E by E = {iU | U is a normal measure on κ}. We show that (`, κ, E) is a blueprint for Vκ+1 ; we use the criteria specified in Definition 3 (p. 56). The map ` satisfies (1) because it is a co-Dedekind self-map. For (2), κ is a legitimate blueprint seed since it is a critical point of j. For (3), parts (a)–(c), it is clear that each g : Vκ → Vκ belongs to Vκ+1 , and hence to the domain of each i ∈ E. Also, for each i ∈ E, by elementarity, i(`) has domain Vi(κ) ⊇ Vκ = dom `, and κ ∈ dom i(`). For the main part of (3), each i is an elementary embedding, hence structure-preserving in the same way as j. The “compatibility” requirement comes from [6]; in the present setting, we observe that there is i : Vκ+1 → N ∈ E and an elementary embedding k : N → Vj(κ)+1 with the following two properties: (a) j Vκ+1 = k ◦ i; (b) k Vκ+1 = idVκ+1 138 In the language of [6, Definition 4.18], these facts tell us that, at least in a weak sense, E is compatible with j.75 For (D), it is obvious that ` is defined from j, κ, and E. For (E), suppose X ⊆ Vκ . Since, 75 We explain more details about this compatibility criterion. In [6], familiar globally defined large cardinal notions (like supercompactness and superhugeness) were represented by classes of set embeddings. The reason for this representation was to provide a uniform setting for discussing existence of Laver sequences for arbitrary globally defined large cardinals. The starting point for this study is the notion of a suitable formula. Let θ(x, y, z, w) be a first-order formula (in the language {∈}) with all free variables displayed. We will call θ a suitable formula if the following sentence is provable in ZFC: ˆ ∀x, y, z, w θ(x, y, z, w) =⇒ “w is a transitive set” ∧ z ∈ ON ˜ ∧ “x : Vz → w is an elementary embedding with critical point y” . For each cardinal κ and each suitable θ(x, y, z, w), let Eκθ = {(i, M ) : ∃β θ(i, κ, β, M )}. The codomain of an elementary embedding i needs to be explicitly associated with i in the definition for technical reasons; for practical purposes, we think of Eκθ as a collection of elementary embeddings i : Vβ → M with critical point κ. In [6], familiar large cardinals, like supercompact and superhuge, are re-defined in terms of classes Eκθ of embeddings. With this general concept of classes of set embeddings, we defined in [6] a general notion of Laver sequence: Given a class Eκθ of embeddings, where θ is a suitable formula, a function g : κ → Vκ is defined to be Eκθ -Laver at κ if for each set x and for arbitrarily large λ there are β > λ, and i : Vβ → M ∈ Eκθ such that i(κ) > λ and i(g)(κ) = x. A key sufficient condition for Laverness, mentioned in [6], is the notion of compatibility with an ambient elementary embedding j : V → V having critical point κ, such as the embedding given by MUA. Suppose κ < λ < β, and iβ : Vβ → M is an elementary embedding with critical point κ. Then iβ is compatible with j up to Vλ if there is k : M → Vj(β) with j Vβ = k ◦ i and k Vλ ∩ M = idVλ ∩M . If θ is suitable, then Eκθ is said to be compatible with j if for each λ < j(κ) there is a β > λ and i : Vβ → M ∈ Eκθ that is compatible with j Vβ up to Vλ . j - V V Vβ i ? j Vβ - V j(β) k 3 M In the present context of ZFC+MUA, the large cardinal under consideration is not globally defined and therefore we do not expect to obtain a Laver sequence for such a cardinal. Our concern is to demonstrate the existence of a Vκ+1 -Laver sequence, appropriate for strong kinds of measurable cardinals. We can adapt the definitions given here: First, we can represent the notion of a measurable cardinal with a suitable formula θm : θm (i, κ, β, M ) : β = κ + 1 ∧ M is transitive ∧ i : Vβ → M is elementary. Now κ is measurable if and only if there exists i, β, M such that θm (i, κ, β, M ): If κ is measurable, let ı̄ : V → N N be an elementary embedding with critical point κ, and consider i = ı̄ Vκ+1 : Vκ+1 → Vj(κ)+1 = M . Conversely, given i : Vκ+1 → M with critical point κ, define U = {X ⊆ κ | κ ∈ j(X)}. U is well-defined since P(κ) ⊆ Vκ+1 , and is easily seen to be a normal measure on κ. 139 from the previous theorem, ` has the Laver property at κ, we can find U such that iU (`)(κ) = X. Since ` ∈ Vκ+1 , it follows that iU (`)(κ) = X as well. Next, we show that there is a strong blueprint for Vκ+1 − Vκ , the collection of subsets of Vκ of size κ. This strong blueprint shows the generating and collapsing effects of ` and its dual, òp , which will be defined to be a certain section of `. In particular, just as in the case of the strong blueprint (S W 0 , F W 0 , ∅, E) for W 0 , in which every element of W 0 − {∅} is returned by F to its “source” ∅, which is the seed of the blueprint (Example 1), so likewise we will have in the present context that all sets in Vκ+1 − Vκ are returned to the blueprint seed κ by òp ; that is, for each x ∈ Vκ+1 − Vκ , there is i ∈ E such that i(òp )(x) = κ. Note, however, that we do not claim that there is such an i for which i(òp)(x) = κ for x ∈ Vκ . In fact, there is no way to define a section s of ` so that this is true: Let s : Vκ → Vκ be a section of ` and let x ∈ Vκ . Let i = iU be a canonical embedding, as usual. Then i(s)(x) = i(s) (i(x)) = i(s(x)) = s(x) 6= κ. The fact that elements of Vκ are “left out” of the dynamics of the strong blueprint accords with our expectation: For any MUA-embedding j : V → V with critical point κ, we are given κ, and thereby Vκ as well, so the only sets that need to “return” to κ are those that lie outside of Vκ . On a smaller scale, in Example 1, the same thing happens: All sets in W 0 − {∅} are returned to ∅, but the function F that accomplishes this does not return the seed ∅ itself to ∅ (unless it does so by accident—recall that F (∅) has arbitrary value). Example 10 (Strong Blueprint for Vκ+1 − Vκ ). Let j : V → V be a Dedekind self-map given by a model of ZFC + BTEE + MUA, with critical point κ. Let (`, κ, E) be a blueprint for Vκ+1 . We define the dual map òp so that (`, òp, κ, E) is a strong blueprint for Vκ+1 − Vκ . We define òp : Vκ → Vκ as follows: ( α if α is the least ordinal in `−1 (x), if there is one op ` (x) = y otherwise, where y is an arbitrary element of `−1 (x) Next, we can define the notion of an X-Eκθm -Laver sequence, adapting the definition given in the main text: For any set X, a function f : κ → Vκ is an X-Eκθm -Laver function at κ if, for each x ∈ X, there is i ∈ Eκθm such that i(f )(κ) = x. It is straightforward to show that for any f : κ → Vκ and any set X, f is X-Laver at κ if and only if f is X-Eκθm -Laver at κ. These definitions naturally lead to the weak form of compatibility of Eκθm with j : V → V that is discussed in the main text: Starting with an MUA-embedding j : V → V with critical point κ, we declare that Eκθm is locally compatible with j if there is i : Vκ+1 → M ∈ Eκθm that is compatible with j Vκ+1 up to Vκ+1 . This definition is equivalent to the clauses (a) and (b) given in the main text. Vκ+1 i - j V j ? V Vκ+1 - V j(κ)+1 k 3 M 140 As òp (x) ∈ `−1 (x) for each x ∈ Vκ , it is obvious that òp is a Dedekind self-map and a section of `. Claim. Suppose X ∈ Vκ+1 − Vκ, U is a normal measure on κ, and i = iU is the canonical embedding with critical point κ for which i(`)(γ) = X. Then γ ≥ κ. Note that the Claim (once proven) continues to hold true if iU is replaced with iU . Proof. Suppose α < κ. We compute i(`)(α), using the fact that i(α) = α: i(`)(α) = i(`) (i(α)) = i(`(α)) = `(α) ∈ Vκ . The fact that i(`(α)) = `(α) follows because `(α) ∈ Vκ and i is the identity on Vκ . We have shown that if X ∈ Vκ+1 − Vκ and i(`)(γ) = X, then γ ≥ κ. We verify the main property of òp : Let X ∈ Vκ+1 − Vκ . Let U be a normal measure on κ and i = iU the canonical embedding so that i(`)(κ) = X. Note that, by elementarity, i (òp ) is defined, for each x ∈ Vi(κ), by ( α if α is the least ordinal in i(`)−1 (x), if there is one i (òp ) (x) = y otherwise, where y is an arbitrary element of i(`)−1 (x) By the Claim, κ is the least ordinal in i(`)−1 (X). By definition of i(`)op, it follows that i(`)op(X) = κ. Again, note that the same argument goes through if iU is replaced by iU . We can now formally establish that (`, òp, κ, E) is a strong blueprint for Vκ+1 −Vκ by verifying the properties in Definition 3. As we have already shown that (`, κ, E) is a blueprint for X, where X = Vκ+1 − Vκ , what remains is to establish the following points, and these were demonstrated in the paragraphs above: (A) dom òp = dom `. (B) òp is a section of `. (C) For each i ∈ E, dom i(òp) = dom i(`) (D) For every x ∈ X, there is i ∈ E such that i(òp )(x) = κ. We arrived at the theory ZFC+BTEE by noticing (1) the needed strengthening of a Dedekind self-map j : V → V to produce an infinite set was obtained by requiring j to have fairly natural preservation properties; (2) by strengthening these preservation properties further, certain large cardinals could be derived; (3) the strongest kind of preservation possible is obtained when j is an elementary embedding, and the theory ZFC + BTEE is the formal assertion of the existence of such an embedding from V to V . In our initial study of Dedekind self-maps, we found they exihibited not only interesting preservation properties, but led naturally to the concept of a nonprincipal ultrafilter. Generalizing 141 these ideas led to an example and corresponding theoretical results in which a Dedekind self-map j : V → V exhibits strong preservation properties and a nonprincipal ultrafilter plays a key role. The results in this case provided motivation for the existence of a measurable cardinal. The construction of the ultrafilter in this case could not be carried out directly in the theory ZFC+BTEE because of definability restrictions, and so we were led to postulate a supplementary axiom to ZFC + BTEE which asserts that the ultrafilter naturally derived from a j : V → V exists as a set; this is the theory ZFC + BTEE + MUA. This strengthened theory turns out to imply existence of more than just a measurable cardinal, and also exhibits many more of the characteristics that we specified in the case of a Dedekind self-map—in particular, a blueprint for Vκ+1 and a strong blueprint for Vκ1 − Vκ . These points were described in Properties of Set Dedekind Self Maps (p. 133) and elaborated in this section. However, one characteristic of Dedekind self-maps that we did not discuss—mentioned in part (B) of in Properties of Set Dedekind Self-Maps (p. 133)—is the generation of a critical sequence and the role of restrictions of j. Although it is true that the “generating” effect of j, in the theory ZFC + BTEE + MUA, was captured nicely by the blueprint and strong blueprint that are derived from j, still it is natural to ask about the properties of the sequence κ, j(κ), j(j(κ)), . . . and the role of restrictions of j. This topic reveals interesting limitations in the theory ZFC + BTEE + MUA. We mention some known results [7] and introduce some new refinements. (1) In the theory ZFC + BTEE + MUA, the critical sequence κ, j(κ), j(j(κ)), . . . cannot be shown to “exist,” even as a j-class, without supplementing the theory with additional axioms. What can be shown is that, for each particular (metatheoretic) natural number n, the theory proves that the sequence hκ, j(κ), . . ., j n(κ)i exists as a set. On the other hand, whenever we work in a transitive model of ZFC + BTEE + MUA, this problem is corrected, and the critical sequence can indeed be shown to be a j-class in the model. (2) The theory ZFC+BTEE +MUA shows, for each particular (metatheoretic) natural number n that Vκ ≺ Vj(κ) ≺ . . . ≺ Vj n (κ) , but the sequence of models Vκ , Vj(κ), Vj 2(κ) , . . . cannot be shown to be a j-class. Once again, this sequence can be shown to be a j-class inside any transitive model of the theory. However, even inside such a transitive model, without additional axioms, the reasonable conjecture Vκ ≺ Vj(κ) ≺ . . . ≺ Vj n (κ) ≺ . . . ≺ V cannot be proven. (3) A natural question, which the theory ZFC + BTEE + MUA cannot answer, is whether the critical sequence κ, j(κ), j(j(κ)), . . . is bounded. Assuming a 2κ -supercompact cardinal, there is a transitive model of ZFC + BTEE + MUA in which the critical sequence is bounded, but, on the other hand, any I3 embedding i : Vλ → Vλ (with critical point κ and with λ a limit > κ) gives rise to a transitive model (Vλ , i) of ZFC + BTEE + MUA + “the critical sequence is unbounded”. (4) The restriction j Vκ can be shown to exist as a set in ZFC + BTEE + MUA (it is equal to idVκ ); using elementarity of j, one can show that Vκ ≺ Vj(κ), and hence that j Vκ : Vκ → Vj(κ) is an elementary embedding.76 However, it is not possible to show that restrictions of 76 Here is a short proof: We first show that Vκ ≺ j(Vκ ). Arguing in the metatheory: Suppose φ(x0, x1 , x2 , . . . , xn ) 142 j to Vj n(κ) , for n ≥ 1, exist as sets in ZFC + BTEE + MUA.77 Here is a sampling of what is known: (A) The theory ZFC + BTEE + ∃z (z = j κ+ ) has consistency strength at least that of a strong cardinal (B) The theory ZFC + BTEE + ∃z (z = j Vκ+1 ) has consistency strength at least that of a Woodin cardinal. (C) The theory ZFC + BTEE + ∃z (z = j P(P(j(κ)))) directly proves κ is huge with κ huge cardinals below it.78 is a formula, a1 , a2 , . . . , an are sets, and there is c = j(b) ∈ j(Vκ ) such that j(Vκ ) |= φ[j(b), j(a1 ), . . . , j(an )]. By elementarity of j, Vκ |= φ[b, a1 , . . . , an ]; the result follows. Since j(Vκ ) = Vj(κ) , and since the statement that Vκ ≺ Vj(κ) is equivalent to the assertion that j Vκ : Vκ → Vj(κ) is elementary, we have therefore established this latter assertion. To make this metatheoretic argument formal, since we cannot quantify over formulas, we use codes for formulas; codes are sets that live in Vω . Every formula in the language {∈} of ZFC can be coded as a set in Vω . Given a formula φ(x0 , x1 , . . . , xn ) as before, let p be a code for φ. Let f : rank(p) → Vκ ; f represents a possible finite sequence of sets for the formula coded by p; in the above example, the values of f were a1 , . . . , an . Then j(p) = p and j(f ) : rank(p) → Vj(κ) . By elementarity of j, we have Sat(p, Vκ , f ) ⇐⇒ Sat(p, Vj(κ) , j(f )), where, in general Sat(x, M, g) is the ∆1 statement that asserts formally that M satisfies the formula coded by x with parameter values given by g. This argument shows formally Vκ ≺ Vj(κ) . 77 We prove this here by showing that existence of j Vj(κ) in any model (V, ∈, j) of ZFC + BTEE suffices to produce a transitive model of ZFC + BTEE + MUA; by Gödel Incompleteness, this shows that the restricted map j Vj n (κ) cannot be a set in a model of ZFC + BTEE + MUA, for any n ≥ 1. Proposition 68. The theory ZFC+ BTEE + ∃z (z = j Vj(κ) ) proves “there is a transitive model of ZFC+ BTEE + MUA.” Proof. Let j : V → V be the embedding given in the hypothesis, having critical point κ and having the property that j Vj(κ) is a set. Let g = j Vj(κ) . Since P(P(P(κ))) ∈ Vj(κ) , one can define a 2κ -supercompactness measure Ug by Ug = {X ⊆ Pκ2κ | g00 2κ ∈ g(X)}, which ensures that κ is 2κ -supercompact. But this degree of supercompactness has been shown [7, Proposition 9.10] to be sufficient to build a transitive model of the theory ZFC + BTEE + MUA. 78 Assuming a related but somewhat weaker axiom “ is enough to ” obtain a blueprint for Vκ+2 . We outline the proof here. We work in the theory ZFC + BTEE + ∃z z = j j(κ)) . Let λ = 2κ and let h = j λ. Our extra axiom, together with Lemma 69, imply that h exists as a set (since λ < j(κ)). We describe our plan for building a blueprint (`+ , κ, E + ) for Vκ+2 : We will define a Vκ+2 -Laver function f + : κ → Vκ from which `+ can be defined, as was done earlier. We let E + consist of all λ-supercompact embeddings (restricted to Pκλ+ ), defined from a normal measure U over Pκλ (where Pκλ = {X ⊆ P(λ) | |X| < κ}) as the canonical ultrapower embedding i = iU : V → V Pκ λ ∼ = MU . The Laver property that we will show f + has is that for every X ∈ Vκ+2 , there is a λ-supercompact ultrafilter U such that iU (f + )(κ) = X. We follow the steps of our earlier construction of a Vκ+1 -Laver sequence. As before, we define a formula ψ: ψ(u, x, γ) : u : γ → Vγ ∧ x ⊆ Vγ+1 ∧ for all normal measures U on Pγ 2γ ,iU (u)(γ) 6= x. When ψ(u, x, λ) holds true, it means that u is not a Vγ+2 -Laver sequence at γ: Some subset x of Vγ+1 cannot be computed as iU (u)(γ) for any choice of U . We can now define f + : 143 (∗) + f (α) = ( ∅ x if α is not a cardinal or f + α is Vα+2 -Laver at α where x satisfies ψ(f + α, x, α). The definition tells us that f + (α) has nonempty value just when the restriction f + α is not Vα+2 -Laver at α, and in that case, its value is a witness to non-Laverness. Claim. The function f + defined in (∗) is a Vκ+2 -Laver function at κ. Proof. Let j : V → V be the embedding, with critical point κ, given to us in a model of our theory ZFC + BTEE + ∃z (z = j j(κ)). Suppose f + is not Vκ+2 -Laver at κ, so, in particular, for some y, ψ(f, y, κ) holds. We consider j(f +) : j(κ) → Vj(κ) . Notice, as in the earlier proof, that j(f + ) κ = f + . Also, by elementarity, j(f + ) has the same definition as f +: ( ∅ if α is not a cardinal or j(f + ) α is Vα+2 -Laver at α + j(f )(α) = x where x satisfies ψ(j(f + ) α, x, α). In particular, since f + = j(f + ) κ is not Vκ+2 -Laver, j(f + )(κ) is itself a witness to non-Laverness and ψ(j(f + ), j(f + )(κ), κ) is true. Let D = Uj be the normal measure derived from j; that is, D = {X ⊆ Pκ λ | h[λ] ∈ j(X)} (recall h = j λ). We claim that, by our extra axiom, D is a set: Since κ is measurable, j(κ) is a measurable above κ and so <κ |P(Pκλ)| = 2λ < j(κ). Therefore, by Lemma 69, there is a set r = j P(Pκλ). Therefore, D = {X ⊆ Pκ λ | h[λ] ∈ r(X)} is a set. Let i = iD : V → V /D ∼ = N be the canonical λ-supercompact embedding, and define k : N → V by k([t]) = j(t)(h[λ]). One can show that k is an elementary embedding with critical point > 2κ and makes the following diagram commutative: Pκ λ j V H H - V 6 k H HH j iD N By the diagram, we compute: j(f + )(κ) = = = (k ◦ i)(f + )(κ) ` ´ k(i(f + )) (k(κ)) ` ´ k i(f + )(κ) . Since iD : V → N is λ-supercompact, λ N ⊆ N . It follows that N contains all sets that are hereditarily of size ≤ 2κ —in particular, Vκ+1 and each of its subsets belongs to N . Therefore V`κ+2 = P(V ´ κ+1 ) ⊆ N . Since crit(k) > 2κ , k P(Vκ+1 ) = idVκ+1 . Then since j(f + )(κ) ⊆ Vκ+1 , k j(f + )(κ) = j(f + )(κ). By 1-1-ness of k, therefore, since ` ´ ` ´ k j(f +)(κ) = j(f + )(κ) = k( i(f + )(κ) , it follows that j(f + )(κ) = i(f + )(κ). Therefore, as before, though we have claimed that ψ(j(f + ), j(f + )(κ), κ) holds true, we have just exhibited a normal measure D on Pκλ such that iD (f + )(κ) = j(f + )(κ). We have a contradiction. Therefore f + is Vκ+2 -Laver after all. 144 (D) The theory ZFC + BTEE + ∃λ∃z (λ is an upper bound for the critical sequence and z = j λ) is inconsistent. The extension of ZFC + BTEE obtained by adding the axiom MUA is derivable from an axiom that asserts that a restriction of j exists. In particular, ZFC + BTEE + ∃z (z = j P (κ)) ` MUA. This can be shown as follows: Working in the theory ZFC+BTEE+∃z (z = j P (κ)), where κ is the critical point of the embedding j : V → V , let g = j P (κ). Now the ultrafilter derived from j has the following simple ZFC definition: U = {X ⊆ κ | κ ∈ g(X)}. It can be shown that, for any set X for which |X| ≤ κ, the restriction j X does exist as a set in ZFC + BTEE + MUA (but restrictions of j to sets of larger cardinality require stronger axioms, as (A) demonstrates). This follows from two observations, which we prove below. Lemma 69. Suppose T is an extension of the theory ZFC + BTEE and j : V → V is the embedding with critical point κ. Suppose j X can be proven to exist (as a set) in the theory T . Then: (i) For any Y ⊆ X, j Y is also a set (ii) For any Y for which there is a bijection X → Y , j Y is also a set. Proof of (i). Suppose Y ⊆ X and let i = j X. Note that j Y = {(u, v) | u ∈ Y and i(u) = v}, which is a set by ordinary Replacement. Proof of (ii). Suppose f : X → Y is a bijection. By elementarity, j(f ) : j(X) → j(Y ) is also a bijection. Let i = j(f ) and k = j X, both of which are set in T . We have j Y = {(y, z) | z = j(y)} = {(y, z) | ∃x ∈ X y = f (x) and z = i(k(x))}. and the last expression is a set by ordinary Replacement. A consequence of (ii) is that the following theories are equivalent: ZFC + BTEE + ∃z (z = j Vκ+1 ) and ZFC + BTEE + ∃z (z = j P (κ)). As our sampling of results suggests, the large cardinal consequences of requiring that restrictions of j to sets X be themselves sets increase in strength as those sets X increase in size. Part (C) shows one limitative result in this direction. When the critical sequence is unbounded, however, we are free to require restrictions of j to sets of any size without inconsistency. The next section explores this possibility. ` ´ Examining the proof shows that the blueprint (`+ , κ, E + ) derived from f + is actually a blueprint for H (2κ )+ ) Vκ+2 (for any infinite cardinal γ, H(γ) is the set of sets having hereditary cardinality < γ). Moreover, the proof goes through in the weaker theory ZFC + BTEE + ∃z (z = j P(P(P(κ)))). 145 24 The Theory ZFC + WA Our analysis of the embedding j : V → V that we get from a model of ZFC + BTEE + MUA shows that our strategy for strenghtening the notion of a Dedekind self-map V → V based on observations we have made about set Dedekind self-maps, in Properties of Set Dedekind Self-Maps (see p. 133 and the remarks following), has been successful so far: We have, using techniques of generalization that are suggested to us by our new Axiom of Infinity, arrived at a formulation of a Dedekind self-map of the universe whose properties ensure the existence of many measurable cardinals. However, we have yet to tap the full potential of these Properties. For example, our blueprint for generating sets (discussed in part (C) of Properties) has taken us only up to Vκ+1 . Also, property (B) in Properties suggests that the critical sequence derived from j plays a special role (for set Dedekind self-maps, it forms a blueprint for ω) and that the critical sequence “emerges” from consideration of a sequence of restrictions of the Dedekind selfmap under consideration. However, under MUA, the critical sequence for j is not even formally defined, and it is not possible to define restrictions j Vj n (κ) for n > 0; doing so, entails much stronger large cardinal consequences than those available in the theory ZFC+BTEE+MUA. The last paragraph of the previous section suggested a way to proceed further, and to address the limitations we have just outlined. Working in the theory ZFC + BTEE provides us with a nontrivial elementary embedding j : V → v, and maximizes the preservation properties a Dedekind self-map from V to V could have. Insisting also that restrictions of j to various sets are also sets in the universe leads to significant strengthenings of the theory, in the direction of stronger large cardinals. As we mentioned in the last section, the theory ZFC + BTEE + ∃z (z = j P(j(κ))) is already strong enough to derive MUA, while the theory ZFC + BTEE + ∃z (z = j λ) + “λ is ≥ the supremum of the critical sequence” is inconsistent. In the spirit of maximizing the strength of our axiomatic extension of ZFC + BTEE, we consider the following new axiom, which we will add to ZFC + BTEE: Axiom of Amenability. For every set X, j X : X → j(X) is a set. Recall that the boldface j signifies that it is a symbol of the expanded language of set theory that we are working in, as we saw in our discussion of ZFC + BTEE + MUA. Any model hV, ∈, ji of ZFC + BTEE + Amenability will interpret j as a self-map j : V → V . Intuitively, the Axiom of Amenability is a way of ensuring that the “dynamics” of an elementary embedding j : V → V are present within the universe by requiring that the restriction of j to any set is also a set in the universe. In the literature, the set of axioms BTEE + Amenability is given the name The Wholeness Axiom or WA.79 We have the following result: 79 To be precise, the Wholeness Axiom is BTEE + Separationj , where Separation j is the usual Separation axiom, applied to j-formulas. Amenability is a consequence of Separation j . In the literature, BTEE + Amenability is denoted WA0 to indicate it is slightly weaker than WA. Nevertheless, it has been shown that all the known large cardinal consequences of WA can also be shown to be consequences of WA0 . Therefore, in this paper, we have emphasized the more intuitively appealing Amenability axiom. 146 Theorem 70 (Consequences of the Wholeness Axiom). [6] 80 Working in ZFC + WA, let j : V → V denote the WA-embedding and let κ denote the least ordinal moved by j. Then j is a Dedekind self-map (not a class or a set) that is BSP and has critical point κ. Moreover, κ has “virtually all” the known large cardinal properties; in particular, κ is super-n-huge for every n (and is in fact the κth such cardinal). Using WA, we are in a position to see even more clearly the extent to which the notion of a Dedekind self-map points the way to a deeper understanding of the origin of large cardinals. We once again refer to the Properties of Set Dedekind Self-Maps (A)–(D) mentioned earlier (p. 133). As we discussed for MUA embeddings, any WA-embedding will be a clear generalization of (A) and most of (B). In particular, relative to (B), the critical point κ in this case has extremely strong large cardinal properties; indeed, κ has virtually all known large cardinal properties.81 There are also natural analogues to (C)–(D), which we discuss next. These are very similar to those described in our analysis of ZFC + BTEE + MUA, except that we are now able to obtain a blueprint for all sets in the universe rather than just for all subsets of Vκ . The steps of reasoning are very similar to the MUA case, so we just outline the results. In place of elementary embeddings derived from a normal measure, we use, in the present context, extendible embeddings. To begin, then, we discuss the notion of an extendible cardinal a bit further. A perusal of our list of large cardinals given earlier (p. 126) shows that extendible cardinals are among the very strongest large cardinals in the list. A cardinal κ is extendible if, for every η > κ, there is an elementary embedding i : Vη → Vξ —called an extendible embedding— having critical point κ. Intuitively, extendible cardinals arise from a WA-embedding (and its iterates) by restriction: Certainly, for every η, if κ is the critical point of j : V → V , where j is a WA-embedding and κ < η < j(κ), it follows that j Vη is an extendible embedding with critical point κ. Likewise, for any η, j(κ) ≤ η < j(j(κ)), j ◦ j Vη is another extendible embedding with critical point κ, and so forth. Associated with any extendible embedding i : Vη → Vξ , with critical point κ and η ≥ κ + 1, is a particular normal measure Ui defined, exactly as for the theory ZFC + BTEE + MUA, by Ui = {X ⊆ κ | κ ∈ i(X)}. Using the notion of extendible elementary embeddings, which, as we have seen, are simply derived embeddings from a WA-embedding j : V → V , one can define from j and its critical point κ a blueprint ` : Vκ → Vκ for all sets in the universe; ` will turn out to be, as before, a 80 The theory ZFC + WA provides examples of phenomena that were alluded to earlier in the paper. First of all, any WA-embedding j : V → V , viewed as a collection of ordered pairs, is an example of a subcollection of V that is neither a set nor a class. (In fact, the range of j is also an example of this kind.) The question of whether such entities could meaningfully exist was raised on p. 65. A second interesting phenomenon arises because of the fact (which we do not prove here) that the sequence κ, j(κ), j(j(κ)), . . . , j n (κ), . . . has no upper bound in ON. It follows that the function f defined on ω by f (n) = j n (κ) is a member of V <V but not a member of V <V ∩ V . This is because the range of f is unbounded in V , so f cannot be a set. We asked (p. 44) whether a function with these properties could exist. Note that even though the domain of f is a set, and the collection of ordered pairs that determine f is only countably infinite, it is not possible to view f as a set since its range is unbounded in the universe. 81 There are a few exceptions. WA does not produce the very strongest large cardinals; there are axioms, notably I1 , I2 , I3 , that have greater consistency strength, and so the large cardinals they produce are, consistencywise, stronger than those arising from WA. 147 co-Dedekind self-map. We will once again call ` a blueprint self-map. We will show that, for every set x ∈ V , there is an extendible embedding i such that i(`)(κ) = x. In this way, every set in the universe can be seen as arising from or being generated by the interplay of κ, j and `. The essence of the construction of ` is a Laver function f : κ → Vκ . A Laver function82 is a function f : κ → Vκ with the property that for any set x, there is an extendible embedding i such that i(f )(κ) = x. This definition is in contrast with that for an X-Laver function (in particular, a Vκ+1 -Laver function, as discussed in connection with ZFC+BTEE + MUA) which restricts the possible values of x to the set X. The Wholeness Axiom guarantees the existence of a Laver function: Theorem 71. [6](WA) Suppose j : V → V is a WA-embedding with critical point κ. Then there is a Laver function f : κ → Vκ . Proof. Let φ(g, x) be the formula ∃α α is a cardinal ∧ g : α → Vα ∧ ∀η∀ζ ∀i : Vη → Vζ elem(i, α) ∧ rank(x) < η < i(α) < ζ → i(g)(α) 6= x . where elem(i, α) asserts “i is an elementary embedding with critical point α.” The formula φ(g, x) says that g is not Laver, with witness x. Define f : κ → Vκ by ( ∅ if f α is Laver at α or α is not a cardinal f (α) = x otherwise, where x satisfies φ(f α, x). Let D = Uj be the normal ultrafilter over κ that is derived from j; that is: D = {X ⊆ κ | κ ∈ j(X)}. By Amenability, D is a set, since D = {X ⊆ κ | κ ∈ g(X)}, where g = j P(κ). Define sets S1 and S2 by S1 = {α < κ | f α is Laver at α} S2 = {α < κ | φ(f α, f (α)}. Clearly, S1 ∪ S2 ∈ D. To complete the proof, it suffices to prove that S1 ∈ D, and for this, it suffices to show S2 6∈ D. Toward a contradiction, suppose S2 ∈ D. Then, φ(f, j(f )(κ)) holds in V . Let x = j(f )(κ). Pick η so that rank(x) < η < j(κ). Let i = j Vη : Vη → Vζ , where ζ = j(η). By Amenability, i is a set, and is an elementary embedding with critical point κ. Clearly, in V , rank(x) < η < i(κ) < ζ and i(f )(κ) = x, contradicting the fact that φ(f, j(f )(κ)) holds in V . Therefore S2 6∈ D, as required. We can now state the main fact about `: 82 In the literature, it is called an extendible Laver function. This notion first appeared in [6] where the existence of such a function was proved to follow from (and to be equivalent to) the existence of an extendible cardinal. 148 Theorem 72 (Existence of Blueprint Self-Maps). (WA) Suppose j : V → V is a WA-embedding with critical point κ. Then there is a blueprint self-map Vκ → Vκ; that is, there is an ` : Vκ → Vκ such that for any set x, there is an extendible elementary embedding i : Vη → Vξ with critical point κ and η ≥ κ + 1 such that i(`)(κ) = x. Moreover, ` is a co-Dedekind self-map. The proof is exactly the same as the one given for the MUA case, replacing canonical embeddings iU with extendible embeddings i. For (D) in the Properties list, we show that there is, just as in the MUA case, a natural dual to `, which we will once again denote òp , which sends every sufficiently large set back to the “point” κ. The precise statement is given in the following theorem: Theorem 73. (WA) Let j : V → V be a WA-embedding with critical point κ. For each blueprint self-map ` : Vκ → Vκ , there is a Dedekind self-map òp : Vκ → Vκ with the following properties: (1) For every x 6∈ Vκ , there is an extendible elementary embedding i : Vη → Vξ with critical point κ and η ≥ κ + 1 such that i(òp)(x) = κ (2) òp is a section of `; in particular, ` ◦ òp = idVκ . Again, the proofs are essentially identical to those given in the MUA case, replacing embeddings iU obtained from a normal measure with extendible embeddings. In this case, we are not restricted to subsets of Vκ , as we were in the MUA case, but the reasoning is the same since now we have a (full) Laver function f that gives access to sets of arbitrarily large rank. As before, we do not claim that there is an i for which i(òp)(x) = κ when x ∈ Vκ . Indeed, as before, there is no way to define a section s of ` so that this is true, and the proof of this is identical to the one given in the previous section. We turn now to a more detailed examination of the our blueprint for the universe under WA. Remark 21 (The Blueprint Coder E for V ). We describe now in more detail the blueprint coder for the blueprint of V , given to us by WA. Suppose j : V → V is a WA-embedding, given to us by a model of ZFC + WA, with critical point κ. We show that the triple (`, κ, E), where E is the class of all extendible embeddings having critical point κ, is a blueprint for V , and also that (`, òp, κ, E) is a strong blueprint for V − Vκ . We refer to Definition 3. The fact that ` is a co-Dedekind self-map satisfies (1), and the existence of the critical point κ satisfies (2). Parts (3)(a)–(c) hold just as in the MUA case, where B = Vκ . For (4), it is clear that the Laver function, on the basis of which both ` and òp are defined, is defined from E, j, κ. And we have already verified condition (5) with Theorem 72. What remains to be checked are the two requirements in (3): that E consists of “structure-preserving” functionals—having essentially the same structure-preserving characteristics as j itself—and that E is “compatible with” j. As we have seen, the extendible elementary embeddings with critical point κ, which comprise E, can be seen as approximations to the given WA-embedding j; this takes care of the “structure-preserving” requirement. To see that E is compatible with j, we again—as with MUA—defer to the notion of compatibility developed in [6]. In the present context, this notion of compatibilty can be described as follows: Given a WA-embedding j : V → V with critical 149 point κ, for each β > κ, there is an extendible embedding i : Vβ → Vη ∈ E with critical point κ such that j Vβ = i.83 To verify that (`, òp, κ, E) is a strong blueprint for V − Vκ, we need to verify points (A)–(D) in Definition 3. This verification was done in Theorem 73. 24.1 Restrictions of a WA-Embedding and Its Critical Sequence In our analysis of the theory ZFC + BTEE + MUA, we showed that most of the interesting properties of Dedekind self-maps, which we listed in the section “Properties of Set Dedekind Self-Maps” (p. 133), had natural generalizations to MUA-embeddings j : V → V . However, one of the properties in the original Properties list had to do with the critical sequence of the selfmap and its emergence on the basis of successive restrictions of the original self-map. We listed several points (p. 142) that show that these particular characteristics of Dedekind self-maps do not generalize well to the context of MUA embeddings. The situation with WA is quite different. We indicate here how the stronger theory ZFC+WA corrects the “deficiencies” we encountered in the theory ZFC + BTEE + MUA. One technical point that cannot be ignored, however, is that the “weak” version of WA that we have been using so far will not be adequate for our discussion in this section. The version of WA that we have been using, and that is usually denoted WA0 , is BTEE + Amenability; the “full” version of WA, however, is BTEE + Separationj . The results concerning WA that we have discussed so far can in every case be carried out in the theory ZFC + WA0 . It is known [7], however, that our discussion below about indiscernibilty of the terms of the critical sequence and regarding the self-application operator · on j cannot be carried out in ZFC + WA0 . Therefore, for the remainder of this section, our base theory will be the “full” ZFC + WA. We recall that, given a Dedekind self-map j : A → A with critical point a, the terms of j 0 s 83 We continue the discussion that we began in the footnote on p.139, concerning compatibility, as it was developed in [6], and as it pertains to the class of extendible elementary embeddings. In [6], the class of extendible embeddings was shown to be captured by the suitable formula: θext(i, κ, β, M ) : ∃δ > 0 ∃ζ [β = κ + δ ∧ M = Vζ ∧ i : Vβ → M is elementary with critical point κ ∧ β < i(κ) < ζ]. Applying the definition of compatibility to the class Eκθext leads to the following: Suppose κ < λ < β, and iβ : Vβ → Vη ∈ Eκθext . Then compatibility of iβ with j up to Vλ simply means that j Vλ = i Vλ . Then, to say that Eκθext is compatible with j means that for each λ < j(κ) there is a β > λ and i : Vβ → M ∈ Eκθext that is compatible with j Vβ up to Vλ . To show this requirement is met, the fact that for each β > κ, there is an extendible embedding i : Vβ → Vη ∈ Eκθext such that j Vβ = i suffices. (See [6, Theorem 4.33].) 150 critical sequence a, j(a), j(j(a)), . . . arose as critical points of successive restrictions of j: A0 = A j0 = j : A → A crit(j0 ) = a A1 = j[A0 ] j1 = j A1 crit(j1 ) = j(a) An+1 = j[An ] jn+1 = j An+1 crit(jn+1 ) = j n+1 (a). Something similar occurs when we consider a certain sequence of restrictions of a given WAembedding j : V → V with critical point κ. We observed in our study of ZFC + BTEE + MUA that j Vκ : Vκ → Vj(κ) is an elementary embedding, but we were unable to consider restrictions like j Vj(κ), j Vj(j(κ)), . . . because the theory was not strong enough to admit such restrictions as sets in the universe. In the theory ZFC+WA, we no longer have this limitation and we are now able to observe that each of these restricted embeddings serves to bring to light the next term in the critical sequence κ, j(κ), j(j(κ)), . . ., as we observed in the case of set Dedekind self-maps. In particular, the observation that j Vκ : Vκ → j(Vκ) = Vj(κ) is an elementary embedding brings to light the image j(κ) of κ. This is analogous to the discovery of the critical point j(a) for j B obtained by restricting j : A → A to its range: j B : B → B, with B = j[A]. If we now restrict j to the codomain of j Vκ , we obtain (the set) j Vj(κ). Elementarity tells us that, for any x ∈ Vj(κ), j(x) ∈ j(Vj(κ)) = Vj(j(κ)). In this way, the next term of j’s critical sequence appears, namely, j(j(κ)). In general, the critical sequence for j can be seen to arise as the sequence of successive ranks of the codomains obtained by considering restrictions j Vκ, j Vj(κ) , j Vj(j(κ)), and so forth. The critical sequence of j that we obtain in the theory ZFC + WA must be unbounded in the universe: As we mentioned at the end of Section 23, any theory which includes both Amenability and the statement that the critical sequence is bounded must be inconsistent. A surprising fact about the terms of the critical sequence is that they are indistinguishable from each other on the basis of properties formulated in the language {∈} (that is, formulas that do not include the symbol j). More formally, the critical sequence hκ, j(κ), j(j(κ)), . . .i is a j-class of ∈-indiscernibles [6, Theorem 3.18]. This means that, for any ∈-formula φ(x1 , . . . , xm) and any two finite increasing sequences of integers i1 < i2 < . . . < im , k1 < k2 < . . . < km , V |= φ[j i1 (κ), . . ., j im (κ)] ⇔ V |= φ[j k1 (κ), . . . , j km (κ)]. One consequence is that each j n (κ) has all the same large cardinal properties as κ itself. Using this observation, one can show that “almost all” cardinals in the universe are super-n-huge for every n ∈ ω: Given m ∈ ω, let Xm denote the set of all cardinals α < j m(κ) that are supern-huge cardinals for every n. By indiscernibility, j m (κ) is super-n-huge for every n (since κ has this property), and by indiscernibility again, j m(κ) admits a normal measure that contains Xm 151 (since κ has this property relative to X0 ). Finally, let us say that for any class C of cardinals, almost all cardinals belong to C if, for all but finitely many m ∈ ω, C ∩ j m(κ) ∈ Dm . It follows that if C = {α | α is super-n-huge for every n}, then almost all cardinals belong to C. We consider next another sequence of restrictions of j that also leads to the critical sequence κ, j(κ), j(j(κ)), . . .. First, we can apply j to the set function j Vκ : Vκ ,→ Vj(κ) to obtain j(j Vκ) : Vj(κ) → Vj(j(κ)). By elementarity of j, we expect j(j Vκ ) to be an inclusion map and a restriction of some elementary embedding, but which one? Formally applying j suggests that the new map is j(j) j(Vκ) but the meaning of j(j) is not clear (certainly j itself is not in the domain of j). We can address the problem in the following way. Let α > κ + 1. Let f = j Vα . Certainly j Vκ = f Vκ . Now we can apply j: j(j Vκ ) = j(f Vκ ) = j(f ) Vj(κ). This example motivates the following definition: Definition 11. Suppose i, k : V → V are elementary embeddings. Then [ i·k = i(k Vα). α∈ON The operator · is called application. If i, k are elementary embeddings V → V , then i · k is also an elementary embedding. We shall often write ik for i · k. Note that j · j gives precise expression to the intuitive notation j(j). In particular, we may now write: j(j Vκ ) = jj Vj(κ) . Moreover, by elementarity of j, we have: jj Vj(κ) : Vj(κ) ,→ Vj(j(κ)). One may apply j in this way repeatedly (for instance, the next iteration yields the fact that j(j(j Vκ)) is the inclusion map, describing the fact that Vj(j(κ)) ≺ Vj(j(j(κ)))). This sequence of steps leads to the conclusion that there is an elementary chain of elementary submodels of V [6, Proposition 3.12]: Vκ ≺ Vj(κ) ≺ . . . ≺ Vj n (κ) ≺ . . . ≺ V. This statement provides us with one other powerful consequence: Because Vκ ≺ V , we may conclude that every first-order sentence that holds in V must also hold in Vκ . Moreover, since κ = |Vκ|, this tells us that the “point” κ is a set representative of the wholeness V —intuitively, κ can declare,84 “I am the totality.” 84 To make this point more concretely, one could define a model (κ, E) of set theory as follows: Let g : κ → Vκ be a bijection. Define E on κ by: α E β ⇔ g(α) ∈ g(β). Then i : κ → V , defined by i = incl ◦ g (where incl : Vκ ,→ V is the inclusion map), is an elementary embedding; in particular, for every sentence σ in the language {∈}, (κ, E) |= σ if and only if (V, ∈) |= σ. 152 In fact, by indiscernibility, each term of the critical sequence for j “embodies totality” in this way. This observation allows us to form a rather special blueprint for ω: Let S = {κ, j(κ), j(j(κ)), . . .} denote the j-class of critical points of j. Adapting the proof of Theorem 20(3) in minor ways, one may show (ω, s, 0) and (S, j S, κ) are Dedekind self-map isomorphic; in particular, there is a unique j-bijection F : ω → S that makes the following commutative (adapting the diagram from Theorem 40 slightly): ω s - F ? S ω F j S - ? S It follows that S is a blueprint for ω of a rather special kind. Unlike the ordinary natural numbers, each element of S, as we have been saying, “embodies the totality.” Though members of S are distinct and correspond to distinct natural numbers, each stands for totality and, at that level, each is indistinguishable from the others.85 One of the motivating themes of this paper has been to consider the set of natural numbers in a different light—as precipitations of transformational dynamics of an unbounded field, embodied in the concept of a Dedekind self-map. One philosophical motivation for this change of viewpoint is the desire to recognize, in accord with points made by Maharishi and other early philosophers (Section 2), the natural numbers as different on the surface but fundamentally the same, being in each case an individual expression of wholeness. In Maharishi’s approach, this viewpoint is expressed by declaring that each natural number is an expression of the Absolute Number. Viewing the natural numbers as arising from the Dedekind self-map (S, j S, κ) expresses this point of view, in two ways. First, any initial Dedekind self-map gives expression to the idea that individual natural numbers are precipitations of an underlying field (represented by the selfmap). Second, each natural number arising from this particular self-map (S, j S, κ) arises as the (Mostowski) collapse of “wholeness” (represented by the elements of S) to a particular value. 86 We summarize the observations of this section in the following theorem. Theorem 74. Suppose j : V → V is a WA-embedding with critical point κ. (1) The critical sequence of j, hκ, j(κ), j(j(κ)), . . .i, is unbounded in ON and is a j-class of ∈-indiscernibles. (2) Vκ ≺ Vj(κ) ≺ . . . ≺ Vj n (κ) ≺ . . . ≺ V. (3) Vκ ≺ V . Moreover, there exist a binary relation E on κ and an elementary embedding i : (κ, E) → (V, ∈). In particular, for every sentence σ in the language {∈}, (V, ∈) |= σ if and only if (κ, E) |= σ. 85 From the Vedic perspective, Maharishi remarks, “. . . every point of consciousness is infinitely correlated with every other point. . . [28, p. 64].” 86 Maharishi remarks, “At the door of the Transcendent, the finite numbers begin to tap the unlimited reservoir of the Absolute Number, rendering every finite value on the ground of the infinite—the Absolute” [25, p. 633]. 153 (4) Let S = {κ, j(κ), j(j(κ)), . . .}. Then (S, j S, κ) is Dedekind self-map isomorphic to (ω, s, 0), and is therefore a blueprint of the set of natural numbers. 24.2 A Connection Between the Critical Sequence of a WA-Embedding and Mathematical Physics We have seen that the critical sequence S of a WA-embedding j : V → V with critical point κ provides a blueprint for ω in terms of mathematical representatives of wholeness (namely, κ, j(κ), j(j(κ)), . . .). This result gives expression to the insight that the expressed value of the natural numbers has its reality in the unmanifest level of the natural numbers. Maharishi’s perspective on this point extends considerably further: Not only is the reality of the expressed value of numbers to be found within the unmanifest, but in fact the reality of the universe itself— of the space-time continuum, of every particle of manifest existence—is also to be found at the unmanifest level. All the verses of Veda are within AK. . .all the Devas (the administering intelligence of the universe) are lively within AK—the universe is lively within AK—the entire dynamism of the Veda and Vedic Literature, and the corresponding expression of the Veda and Vedic Literature in terms of the physical expression within the unmanifest structure of self-referral consciousness, presents the self-referral ocean of consciousness as the invincible Cosmic Catalyst of the entire ever-expanding universe. [25, p. 545] The notion of time comes into being with the phenomenon of sequence arising in the self-referral state of Samhita of Rishi, Devata, and Chhandas—the unmanifest continuum. The holistic value of Natural Law—Samhita—has within it the notion of time and space in the realization of itself—Rishi, Devata, Chhandas—the realization of the three-in-one structure of its unified nature. [25, p. 535] We describe here how the “absolute” blueprint S for ω that we located in the previous section can be elaborated further to produce a separable Hilbert space—the sort of mathematical object that serves as a mathematical foundation for development of quantum mechanics. In this paper, we have not provided physics applications that make use of the Hilbert space we define. Our intention is to just to show that the etheric realms of existence that are pointed to by the Wholeness Axiom can be applied to provide a new account in “absolute terms” not only of the set of natural numbers, but also of structures that are used to understand the physical world—in this case, a Hilbert space, suitable for use in an application of quantum mechanics. Recall that the j-class S is obtained by repeated applications of j to its critical point κ. It should be pointed out that all of the obvious elementary embeddings V → V definable from j that we have discussed in earlier sections—such as j ◦ j, j ◦ (j ◦ j), . . .—also have κ as their critical point. We now discuss how applying j to itself (using the application operator ·) in all possible ways produces embeddings which, collectively, have an infinite number of other critical points. For instance, the critical point of j · j turns out to be j(κ). (Here is the quick proof: Apply j to the true sentence “the critical point of j is κ” to obtain “the critical point of j · j is j(κ).” 154 A more careful proof is given in Proposition 76.87 ) As we will see, the collection Aj of all the 87 Here, when applying j to “the critical point of j is κ,” we resisted the temptation of writing “the critical point of j(j) is j(κ),” since, as discussed earlier, ‘j(j)’ makes no sense; in place of j(j), we wrote j · j since j · j can be understood as playing the same role as j(j) in elementary embedding arguments. In this footnote, we clarify the connection between j ·j and the intuitively appealing notation j(j). The technique we discuss is valid in the context of elementary embeddings i : Vλ → Vλ , with critical point < λ, and for which λ is a limit ordinal. Such embeddings are reflections of elementary embeddings V → V in the realm of sets, and have essentially the same characteristics as the embeddings V → V . However, embeddings Vλ → Vλ provide a special convenience because of the fact that we can explore layers in the universe above Vλ —in particular, the sets belonging to Vλ+1 . The technique we describe should therefore be viewed simply as a heuristic for our study of elementary embeddings V → V . We begin with the observation that every elementary embedding i : Vλ → Vλ can be lifted to a “partially” elementary embedding î : Vλ+1 → Vλ+1 . As a function, î is defined on the new sets in Vλ+1 (namely, the subsets of Vλ ) by [ î(X) = i(X ∩ Vα ). α<λ Now î is not exactly an elementary embedding on Vλ+1 , but it is elementary relative to all bounded formulas. We take a moment to discuss bounded formulas. Here is an example of a bounded formula: ∀x ∈ Vλ ∃y ∈ Vλ (x ∈ y). Notice that in this formula, all quantified variables (“for all x”, “there exists y”) are bound to Vλ . The formula says that for every element of Vλ , there is another element of Vλ which contains the first as an element. This is a true statement, and it is bounded relative to Vλ+1 because every quantified variable is bound to a set (in this case, Vλ ) that is a member of Vλ+1 . Because it is bounded like this, this statement is also true relative to Vλ+1 . Vλ+1 is not a model of set theory, but it is an ∈-structure, and does satisfy some of the ZFC axioms. It is not hard to prove that the statement above is true not only in the model V , but also in the model Vλ+1 . Here is the same formula without the bounds; we’ll call this formula (for now) ψ: ψ : ∀x ∃y (x ∈ y). This statement says that any set is contained in another set. It is certainly true in V . But it is not true in Vλ+1 . For instance, if x = Vλ , it is not true that there is another set y in Vλ+1 for which x ∈ y. The first of the statements above is called the relativization of the second formula ψ to Vλ . The first formula is often denoted ψVλ . ψVλ is true in both V and Vλ+1 whereas ψ is true in V but not Vλ+1 . These considerations lead to the following. Fact 1. Suppose i : Vλ → Vλ is elementary. Let î be the extension of i to Vλ+1 described above. Then for any formula φ(x1 , x2 , . . . , xn ) and any a1 , a2 , . . . , an ∈ Vλ+1 , (a) φVλ [a1 , a2 , . . . , an ] is true if and only if it is true in Vλ+1 , and (b) φVλ [a1 , a2 , . . . , an ] iff φVλ [î(a1 ), î(a2 ), . . . , î(an )]. Fact 1(b) is a precise way of saying that î is elementary relative to certain types of formulas. All this effort allows us to view the application operation in a more comfortable way: Fact 2. For any nontrivial elementary embeddings i, k : Vλ → Vλ , i · k = î(k). î(k) makes sense here because k ⊆ Vλ . For this reason, one sometimes writes i(k) instead of i · k, as a convenient notation, even though it is not correct to do so. To illustrate how this approach works, we prove a slight generalization of the fact mentioned in the main text, that crit(j · j) = j(κ): 155 elementary embeddings obtained from j through self-application has an internal structure that lends itself to analytical methods. In particular, Aj admits a natural metric that turns it into a dense-in-itself metric space. The completion Āj of Aj is then seen to be a perfect Polish space, which is a traditional starting point for defining a complex Hilbert space. This kind of Hilbert space can be used as a mathematical foundation for development of quantum mechanics. This path into mathematical foundations of physics has the advantage that the underlying space Āj is composed of the “absolute elements” provided by the Wholeness Axiom, or limits of sequences of such embeddings. We begin once again with a fixed WA-embedding j : V → V with critical point κ. For any n ≥ 1, let κn = j n (κ), and let κ0 = κ. The next proposition collects together a few well-known properties of this application operation. Proposition 76. (1) crit(j · j) = j(κ). (2) For all n ≥ 1, j · j(κn) = κn+1 . (3) (j · j) ◦ j = j ◦ j. Proof of (1). Let α > κ. Let i = j Vα. Since crit(i) = κ, the following is true in V : ∀β < κ (i(β) = β) and i(κ) > κ. Proposition 75. For any nontrivial elementary embeddings i, k : Vλ → Vλ , crit(i · k) = i(crit(k)). This proposition says that to find the critical point of the embedding i · k, simply apply i to the critical point of k. We are not assuming that the critical point of either embedding is κ, or that the critical points of the two embeddings are the same. In the special case in which i = k = j and crit(k) = κ, the result in the main text follows: crit(j · j) = j(κ). Here’s is the logic: Let α = crit(k). The following formula states that α is the critical point of k: k(α) 6= α and for all β < α, k(β) = β In fact, this is a formula of the form φ(x, y), defined above, with the substitutions x = k and y = α. There is just one quantified variable: ∀β < α. This could be written equivalently as ∀β ∈ Vλ (β < α −→ . . . ) (since α ∈ Vλ ), so the quantifier is bound to Vλ . So, φ(k, α) is a true statement in V . Since it is bounded in Vλ+1 (in the sense above), it is also true in Vλ+1 . Now we apply î to φ(k, α), and we may therefore conclude that φ(î(k), î(α)) holds in Vλ+1 , since î is elementary relative to such a formula. Again, by Fact 1(a), φ(î(k), î(α)) must also hold in V . Finally, let’s consider what the formula φ(î(k), î(α)) actually says. Returning to the informal form of the formula, we have î(k)(î(α)) 6= î(α) and for all β < î(α), î(k)(β) = β. In other words, the critical point of î(k) is î(α). Now î(α) = i(α) = i(crit(k)), so we have shown that crit(i · k) = crit(î(k)) = i(crit(k)). 156 By elementarity of j, the following also holds in V : ∀β < j(κ) (j(i)(β) = β), (+) and (++) j(i)(j(κ)) > j(κ). Since (+) and (++) hold for all large enough α, we conclude that both hold true if j(i) is replaced by j · j. Therefore: crit(j · j) = j(κ). Proof of (2). Let α > κn−1 . Let i = j Vα. Applying j to the formula i(κn−1 ) = κn and noting that j(κn−1 ) = κn yields (+) j(i)(κn) = j(i)(j(κn−1)) = j(κn ). Since κn−1 ∈ dom i, κn ∈ dom j(i), and so j(i) and j · j agree on κn . We have from (+): j · j(κn ) = j(i)(κn) = j(κn ) = κn+1 . Proof of (3). Let x be a set and α an ordinal such that x ∈ Vα. Let i = j Vα and let y = j(x). Then, applying j to the formula i(x) = y yields (j · j)(j(x)) = j(i)(j(x)) = j(y) = (j ◦ j)(x), as required. (j was applied in the middle step.) For any nontrivial elementary embedding k : V → V (whose critical point may or may not be κ), define k[n] and k[n] inductively by: k[1] = k[1] = k k[2] = k[2] = k · k k[n+1] = k · k[n] k[n+1] = k[n] · k. Fact 0. crit(j [n+1]) = κn . Proof. By induction on n. We wish to define a collection Aj consisting of all the elementary embeddings V → V obtainable by applying j to itself finitely many times. To do this formally, we first define j-terms to be all terms obtainable by finitely many applications of the following rules: 157 (j) is a j-term if i, k are j-terms, (i · k) is a j-term Then we define Aj to be the collection of j-terms. We also define crit(Aj ) = {γ | γ is the critical point of some k in Aj }. A fact that we have already assumed without proof (see p. 152) is that all elements of Aj are elementary embeddings.88 Proposition 77. Let id be the identity function (on V). Then id 6∈ Aj . Proof. We proceed by using the inductive definition of Aj . Certainly j 6= id. Assuming neither i, k equals id, we show i · k 6= id. For this we may invoke Proposition 75: Since k 6= id, k has a critical point, and so by the Proposition, i(crit(k)) = crit(i·k). Since crit(i·k) exists, i·k 6= id. In the literature, it is observed that Aj is a left-distributive (LD) algebra (usually, though, nontrivial elementary embeddings Vλ → Vλ are considered, where λ is a limit ordinal). Aj is the LD subalgebra generated by j (and this can be used as the formal defintion). See [12, 22]. Fact 0 tells us that the familiar elements of the critical sequence of j all turn out to be critical points of appropriately defined embeddings; therefore, crit(Aj ) must be infinite. A natural question is whether there is an embedding in Aj whose critical point is not a term in the critical sequence for j. It turns out that there are many such embeddings; the least such critical point lies between κ2 and κ3 .89 88 In fact, if k ∈ Aj , then k is a WA-embedding; that is, (V, ∈, k) is a model of ZFC + WA. Here, by “WA,” we mean the stronger form: WA = BTEE + Separation j . 89 For the interested reader, we exhibit an embedding in Aj whose critical point γ is not a term in the critical sequence for j. It can be shown that crit(Aj ) ∩ κ3 = {κ0 < κ1 < κ2 < γ}, so that γ is the least in crit(Aj ) that does not belong to the critical sequence. We prove that ((j · j) · j) · (j · j) has a critical point γ strictly between κ2 and κ3 . As we have already shown, crit(j · j) = κ1 < κ2 . We write: crit(j · j) < κ2 (∗) Now we apply the map (j · j) · j to the formula (*): (and suppress the · for readability): ((jj)j)(crit(jj)) < ((jj)j)(κ2). By Proposition 75: ` ´ crit ((jj)j)(jj) < ((jj)j)(κ2 ). We compute ((jj)j)(κ2 ): Notice that jj(κ1 ) = ((jj)◦j)(κ0 ) = (j◦j)(κ0 ) = κ2 . Likewise, jj(κ2 ) = (j◦j)(κ1 ) = κ3 . So we have ((jj)j)(κ2 ) = ((jj)j)(jj(κ1)) = jj(j(κ1 )) = jj(κ2 ) = κ3 Combining the last two displays, we have ` ´ crit ((jj)j)(jj) < κ3 . ` ´ Let γ = crit ((jj)j)(jj) ; we have shown so far thatγ < κ3 . Next we show that γ > κ2 . Again, since crit(jj) = κ1 , it follows that (∗∗) crit(jj) > κ0 . 158 Fact 1 [22]. crit(Aj ) has order-type ω. In fact, for each n ∈ ω, crit(Aj ) ∩ [κn , κn+1 ) is finite. Also, for each n > 2, crit(Aj ) ∩ (κn , κn+1 ) is nonempty. Fact 1 tells us that the critical sequence S of j is a proper subcollection of crit(Aj ). The fact that crit(Aj ) is well-ordered of type ω implies there is a successor function sj : crit(Aj ) → crit(Aj ), defined by sj (α) = least β ∈ crit(Aj ) such that β > α. It is easy to see that (crit(Aj ), sj , κ) is an initial Dedekind self-map. We state next (without proof) a remarkable result due to Laver. It tells us that any element of Aj is completely determined by its behavior on critAj . Fact 2 [22]. If i, k ∈ Aj and i crit(Aj ) = k crit(Aj ), then i = k. Lemma 78. Suppose k ∈ Aj and γ ∈ crit(Aj ). Then k(γ) ∈ crit(Aj ). Proof. We need to find i ∈ Aj such that crit(i) = k(γ). Let ` ∈ Aj be such that γ = crit(`). Let i = k · `. Then crit(i) = crit(k · l) = k(crit(`)) = k(γ). Using Fact 1, we define e : ω → crit(Aj ) to be the unique increasing enumeration of crit(Aj ). For every pair (i, k) from Aj for which i 6= k, let mi,k be the least n ∈ ω such that i(e(n)) 6= k(e(n)). By Fact 2, mi,k exists. Define d : Aj × Aj → R by ( 0 if i = k d(i, k) = 1/(mi,k + 1) otherwise Proposition 79. d is a metric on Aj . Proof. We’ll prove that the triangle inequality holds; the other properties of a metric are immediate. Let x, y, z ∈ Aj . We assume that x, y, z are all different (the other cases are easy). Let We apply (jj)j to (**); applying Proposition 75 as well, we have: ` ´ (∗ ∗ ∗) γ = crit ((jj)j)(jj) > ((jj)j)(κ0 ). Next, we compute ((jj)j)(κ0 ): Let y = j(κ0 ). Applying jj, we get, ` ´ (jj)(y) = ((jj)j) (jj)(κ0 ) = ((jj)j)(κ0). Since y = j(κ0 ), (jj)(y) = (jj ◦ j)(κ0 ) = (j ◦ j)(κ0 ) = κ2 , and this leads to: κ2 = ((jj)j)(κ0 ). Plugging into (***) yields: The conclusion is that ` ´ γ = crit ((jj)j)(jj) > κ2 . ` ´ κ2 < γ = crit ((jj)j)(jj) < κ3 and so we have exhibited a member γ of crit(Aj ) that does not lie in the critical sequence of j. 159 x0 , y 0, z 0 denote their corresponding restrictions to crit(Aj ). Let m be least such that x0 (e(m)) 6= y 0 (e(m)) and let n be least such that y 0 (e(n)) 6= z 0 (e(n)). Let r be least for which x0 (e(r)) 6= z 0 (e(r)). We show that r ≥ min{m, n}. Suppose s < min{m, n}. Then x0 (e(s)) = y 0 (e(s)) and y 0 (e(s)) = z 0 (e(s)), so x0 (e(s)) = z 0 (e(s)). Therefore, x0 and z 0 do not differ on any s < min{m, n}, whence r ≥ min{m, n}. To complete the proof, we must show that 1 1 1 + ≥ m+1 n+1 r+1 Without loss of generality, assume m ≤ n. Then, since r ≥ m, then 1 1 1 1 ≤ ≤ + , r+1 m+1 m+1 n+1 as required. Definition. A metric space (X, ρ) is dense-in-itself if for every x ∈ X, there is a sequence hyn in∈ω , where each yn is different from x, which converges to x (i.e., the sequence hρ(x, yn)in converges to 0). Proposition 80. (Aj , d) is dense-in-itself. Proof. Let k ∈ Aj . We define a sequence hin in∈ω of elements of Aj , all different from k, such that in → k. To define in , let α = e(n) ∈ crit(Aj ). Let mn be large enough so that κmn > k(α); it follows that κmn > α. Define in = j [mn+3] · k. Claim 1. For each n, in he(0), e(1), . . . , e(n)i = k he(0), e(1), . . ., e(n)i. Therefore, in → k. Proof of Claim 1. Let β ∈ crit(Aj ), β ≤ α. Let y = k(β). Applying j [mn +3] to the statement k applied to β = y yields j [mn+3] · k applied to j [mn +3] (β) = j [mn+3] (y). But j [mn+3] (β) = β and j [mn +3] (y) = j [mn+3] (k(β)) = k(β) since κmn > k(β). We therefore have: j [mn +3] · k applied to β = k(β). The result follows. Claim 2. For each n, in 6= k. 160 Proof of Claim 2. Case I. crit(k) ≥ κmn +2 . We show in and k have different critical points in this case: crit(in ) = crit(j [mn+3] · k) = j [mn +3] (crit(k)) > crit(k), as required. Case II. crit(k) < κmn +2 . By Fact 1, crit(Aj ) ∩ [crit(k), κmn+2 ) is finite. Let β be the largest element in this finite set. Since k(β) ∈ crit(Aj ) (by Lemma 78), k(β) > β (since β ≥ crit(k)), and β is the largest element of crit(Aj ) ∩ [crit(k), κmn +2 ), it follows that k(β) ≥ κmn +2 . Applying j [mn +3] to the statement k applied to β is k(β) yields j [mn +3] · k applied to j [mn+3] (β) is j [mn+3] (k(β)). But since j [mn+3] (β) = β and k(β) ≥ κmn +2 , (j [mn+3] · k)(β) > k(β). Therefore, in (β) 6= k(β). Proposition 80 will be useful when we look at the completion of (Aj , d); it will tell us that the completion must be a perfect Polish space [21, Theorem 1] and must therefore have cardinality 2ℵ0 (see for example [31, p. 66]). Before discussing this point, we will describe the completion of (Aj , d) in some detail. We will consider three ways of performing the construction. The least interesting way is to form equivalence classes of Cauchy sequences of members of Aj . This approach is certainly sufficient to provide a complete metric space in which Aj is densely embedded. But the elements of the space are not especially useful. A second approach to forming the completion—and we will use this approach to produce the completion in two other ways—is to embed (Aj , d) into a complete metric space and take the closure of the image. The resulting set, being a closed subspace of a complete metric space, must itself be a complete metric space. The all-purpose complete metric space that is often used in this strategy is obtained by defining a certain metric on the set of all functions from the given space into R that are bounded and continuous. In the present setting, let Y = {f : Aj → R | f is bounded and continuous}. Define a metric σ on Y by σ(f, g) = sup{|f (i) − g(i)| : i ∈ Aj }. Now we embed (Aj , d) into (Y, σ) using the following mapping H: For each k ∈ Aj obtain an fk ∈ Y defined as follows: fk (i) = d(i, k) − d(i, j). One shows that each fk is bounded and continuous (relative to the metric d). One also shows that H is an isometry with respect to d and σ; that is, H is one-one and for all `, k ∈ Aj , σ(f` , fk ) = d(`, k). 161 We outline the proofs: Claim A. For each k ∈ Aj , fk is bounded and continuous. Proof. It is easy to verify that range(fk ) is a subset of [−1, 1], whence fk is bounded. To prove 1 1 continuity we show that for each m ∈ ω, if d(i, i0) < 2m+2 , then |fk (i) − fk (i0 )| < m+1 . |fk (i) − fk (i0 )| = = ≤ ≤ ≤ |d(i, k) − d(i, j) − d(i0 , k) + d(i0 , j)| |(d(i, k) − d(i0 , k)) + (d(i0, j) − d(i, j))| |d(i, k) − d(i0, k)| + |d(i0 , j) − d(i, j)| d(i, i0) + d(i, i0) 2d(i, i0) 1 < . m+1 Claim B. H is one-one. Proof. Suppose fk = f` . We show that k = ` by showing that d(k, `) = 0. Since fk = f` , the two functions agree at ` ∈ Aj . We have 0 = fk (`) − f` (`) = (d(`, k) − d(`, j)) − (d(`, `) − d(`, j)) = d(`, k) − d(`, `) = d(`, k). Claim C. For every k, ` ∈ Aj , σ(f` , fk ) = d(`, k). Proof. First notice that by the triangle inequality, for any i ∈ Aj , d(k, `) ≥ |d(i, k) − d(i, `)|, and so the supremum of the set {|d(i, k) − d(i, `)| : i ∈ Aj } is achieved when i = `. Therefore, we have: σ(fk , f` ) = sup {|fk (i) − f` (i)| : i ∈ Aj } = sup {|d(i, k) − d(i, j) + d(i, j) − d(i, `)| : i ∈ Aj } = sup {|d(i, k) − d(i, `)| : i ∈ Aj } = |d(`, k) − d(`, `)| = d(`, k). These facts tell us that for each k ∈ Aj , fk is the representative of k in the space Y . In this way, we have identified each k ∈ Aj with a bounded continuous real-valued function. Unfortunately, in representing k by fk , one loses much information about the critical point and the values of k at crit(Aj ). In particular, given j and e, one cannot typically compute the critical point of k from fk without testing fk against arbitrarily many elements of Aj . For this reason, we will make use of an alternative approach. Another space in which to embed (Aj , d) is the space ω ω of all functions from ω to ω. This space is extremely useful because of its close relationship to R and several of its subspaces. A subspace of ω ω that will be important for our work is the set ω ↑ω of strictly increasing functions from ω to ω. We record some useful facts about ω ω and ω ↑ω , and then describe the embedding of (Aj , d). An important related subspace 162 Theorem 81 (ω ω ). (1) Define ρ : ω ω × ω ω → R by ( 0 ρ(f, g) = 1/(m + 1) if f = g if f = 6 g and m is least for which f (m) 6= g(m) Then (ω ω , d) is a complete metric space. The induced topology has a base consisting of sets of the form Np = {f : ω → ω | f ⊃ p}, where p is a finite sequence of elements of ω. (2) ω ω , with the topology induced by the metric ρ, is homeomorphic to both P, the set of irrationals (with the topology induced by R), and P[0,1] = P ∩ [0, 1]. (3) There is a continuous bijection τ : ω ω → [0, 1) that is “measure-preserving” in the following sense: Let µ be the product measure on ω ω , obtained from the measure on ω that assigns the value 2−n−1 to each n ∈ ω, and let λ denote Lebesgue measure on [0, 1). Then for any µ-measurable set X in ω ω , τ (X) is λ-measurable in [0, 1) and µ(X) = λ(τ (X)). (4) The subspace ω ↑ω of strictly increasing functions is closed in ω ω . Remark 22. The results mentioned in Theorem 81 are well-known [4, 30] so we omit the proofs, but include a few remarks about them. The proof of (1) is similar to Proposition 79. For (2), we give the construction that underlies the proof; we describe the constructions that demonstrate ω ω is homeomorphic to P and also to P[0,1], simultaneously. We begin by observing that any bijection v : ω → Z (Z is the set of integers) induces a homeomorphism v̄ : ω ω → Zω , defined by v̄(f )(n) = v(f (n)). Now build a collection {Is | s ∈ Z<ω } of open intervals in R (or [0, 1]) as follows: For each n ∈ Z, let Ihni = (n, n + 1) (or, in the case of [0, 1], let Ihni be the following collection of subintervals of (0, 1): 1 1 1 3 3 7 7 15 1 1 , , , , , , , , , , . . . .) . . ., 16 8 8 4 4 4 4 8 8 16 For any s ∈ Z<ω , assuming Is has been defined (for either construction), let {Is_ n | n ∈ Z} be a disjoint family of open subintervals of Is , ordered like Z and lying next to each other: The right endpoint of Is_ n is the left endpoint of Is_ n+1 , with union dense in Is and each having diameter ≤ half the diameter of Is . Now define τ : Zω → R (or τ : Zω → (0, 1)) by \ {τ (g)} = {Ig n | n ∈ ω}. Then τ is a homeomorphism Zω → R − H (or Zω → (0, 1) − H), where H is a countable dense set (consisting of the endpoints of the Is ’s). If Zω is given the lexicographic ordering, then τ is order-preserving. Since H is order-isomorphic to Q (and to Q ∩ (0, 1)), there is an order isomorphism R → R (or (0, 1) → (0, 1)) taking H to Q (or H to Q ∩ (0, 1)). We have exhibited homeomorphisms Zω → R − Q = P and Zω → P[0,1]. 163 For (3), we first describe τ : ω ω → [0, 1): Given f : ω → ω, τ (f ) is the real number x ∈ [0, 1) obtained as follows: We first specify a sequence s of 0s and 1s, coded by f : start with f (0) ones, then append a 0, then f (1) ones, then a 0, and so forth. Now define x by x= ∞ X s(i) i=1 2i . It is not hard to see that τ is a continuous bijection; however, τ −1 fails to be continuous at precisely the dyadic rationals. Let µ be the product measure on ω ω , where we first form the measure space based on ω ω : For each j ∈ ω, let Ej be the measure space (ω, P(ω), ν), with ν(n) = 2−n−1 . The product measure space is (ω ω , B, µ) where B is the smallest σ-algebra containing subsets A ⊆ ω × ω × . . . = ω ω of the form A0 × A1 × · · · where all but finitely many of the Ai are equal Q to ω, and µ(A) = j ν(Aj ). A couple of simple examples will clarify the relationship between µ and Lebesgue measure λ on [0, 1). We consider how µ acts on a couple of basic open sets of ω ω . Let p = h0i; we compute µ(Np): Since, as a product of components, Np has only one component different from ω, namely {0}, then µ(Np ) = 20−1 · 1 · 1 · . . . = 21 . Now τ (0) = 0, and it is easy to see that τ (Np) = [0, 12 ). Thus, λ(τ (Np)) = 21 = µ(Np ). Likewise, 1 3 µ(Nh1i) = 2−2 = λ , = λ(τ (Nh1i)). 2 4 We give a proof of (4): Suppose f ∈ ω ω − ω ↑ω , and let n be least such that f (n) ≥ f (n + 1). Let p = f n + 2. Then f ∈ Np and Np ∩ ω ↑ω = ∅. Therefore, whenever an element f of ω ω does not belong to ω ↑ω , there is an open set containing f and missing ω ↑ω ; hence, ω ↑ω is closed. Next, we embed (Aj , d) into (ω ω , ρ) by mapping each k ∈ Aj to the function fk : ω → ω defined by: (∗∗) fk (n) = e−1 (k(e(n))). One shows easily that the map F : k 7→ fk is an isometry: Let i 6= k in Aj . Notice that for each m, i(e(m)) 6= k(e(m)) if and only if fi (m) 6= fk (m). So ρ(fi , fk ) = 1/(m + 1) where m is least for which fi (m) 6= fk (m) = 1/(m + 1) where m is least for which i(e(m)) 6= k(e(m)) = d(i, k) We also observe here that each fk is strictly increasing: Certainly any k ∈ Aj is strictly increasing. Suppose m < n ∈ ω. Since both k and e are strictly increasing, fk (m) < fk (n) ⇔ e−1 (k(e(m))) < e−1 (k(e(n))) ⇔ k(e(m)) < k(e(n)) ⇔ e(m) < e(n) ⇔ m < n. 164 Therefore, the image B of (Aj , d) under F is a representation of Aj by increasing functions ¯ ω → ω.90 Let Āj be the closure of B in ω ω . Let d¯ be the restriction of ρ to Āj × Āj . Then (Āj , d) is (a third form of) the completion of (Aj , d). A complete metric space is separable if it has a countable dense set. The completion X̄ of any countable metric space X is always separable since X is dense in its completion. A point x in a metric space X is said to be an isolated point if {x} is an open set in X. A Polish space is a complete separable metric space. A metric space is perfect if it has no isolated points. (Both Euclidean spaces and Hilbert spaces are perfect.) The first part of the following proposition is well-known: Proposition 82. [21] The completion of a countable dense-in-itself metric space is a perfect ¯ is a perfect Polish space. Polish space. Therefore (Āj , d) Below, we list some additional properties of (Āj , d̄). These require a few preliminary definitions. Let Y be a Polish space. A subset X of Y is nowhere dense if, for every open subset U of Y , there is an open set V ⊆ U such that V ∩ X = ∅. A set X ⊆ Y has universal measure zero if, for every diffused (takes singletons to 0) Borel measure µ on Y , there is a µ-measure zero Borel set in Y that includes X. X has Marczewski measure zero if, for every perfect subset P ⊆ Y , there is a perfect subset Q ⊆ P such that Q ∩ X = ∅. Theorem 83. (1) All elements of Āj are strictly increasing; that is, Āj ⊆ ω ↑ω . (2) The set ω ↑ω is nowhere dense in ω ω ; thus, in particular, Āj is nowhere dense in ω ω . (3) There exists a finite diffused Borel measure µ on ω ω such that µ(Āj ) > 0 (in other words, Āj does not have universal measure zero). (4) (Āj , d̄) is closed and bounded in (ω ω , ρ) but not compact. Indeed, we can find an infinite subset X = {i1 , i2 , . . .} of Aj such that F [X] has no limit point in Āj (indeed, not even in ω ω ). (5) There is a real number c > 0 and an infinite decreasing T chain F0 ⊇ F1 ⊇ . . . of nonempty closed subsets of Āj for which hδ(Fn )i → c and yet n≥0 Fn = ∅. Proof of (1). Suppose hin in → k, where k ∈ Āj and each in belongs to Aj . As we observed earlier, each in is strictly increasing. Let m ∈ ω. It suffices to show k(m) < k(m + 1). Let n0 ∈ ω be large enough so that (+) for all n ≥ n0 , in m + 2 = k m + 2. 90 Dougherty-Jech [12] discuss B starting from first principles. A subset A of ω ω of strictly increasing functions is an embedding algebra if A is equipped with a binary operation · so that, for all f, g, h ∈ A, (1) f · (g · h) = (f · h) · (f · h) (self-distributivity) (2) if cr(f ) denotes the least integer moved by f (whenever there is one), then cr(f · g) = f (cr(g)). The set B is naturally an embedding algebra: Define · on B by fi · fk = F (i · k). Note that for any fk ∈ B, cr(fk ) = e−1 (crit(k)). With these definitions, (B, ·) can be shown to be an embedding algebra. 165 Since in is increasing, in (m) < in (m + 1). It follows by (+) that k(m) < k(m + 1), as required. Proof of (2). The basic open sets of ω ω are of the form Np = {f : ω → ω | f ⊃ p}, where p is a function p : n → ω, for some n ∈ ω. Given such a basic open set Np, we exihibit q for which Nq ⊆ Np and Nq misses Āj . If p is not strictly increasing, then we let q = p; this works because all elements of Āj are strictly increasing. If p is strictly increasing, let q = p_ h1, 0i. Clearly q is not strictly increasing, so Nq misses Āj , yet is a subset of Np . Proof of (3). It suffices to show that Āj is not universal measure zero. But this follows from the fact (see [30]) that every universal measure zero set has Marczewski measure zero (but no perfect set could have Marczewski measure zero). Proof of (4). Recall that for each k ∈ Aj , fk : ω → ω is defined as in (∗∗) by fk (m) = e−1 (k(e(m))). First we note that the ρ-diameter of ω ω is 1: sup{ρ(f, g) | f, g ∈ ω ω } = 1. Thus, in particular, the space Āj is bounded. To show Āj is not compact, we will make use of some known results about Aj . Let us say γ that, for any limit ordinal γ and any i, k ∈ Aj , i = k if for all x ∈ Vγ , i(x) ∩ Vγ = k(x) ∩ Vγ ; it γ can be shown [11] that = is an equivalence relation on Aj . We refer the reader to [11] for proofs of the following two facts: γ Fact A. [11, Section XII, Lemma 4.4]. Suppose i = k and α < γ. Then if i(α) < γ, we have that i(α) = k(α). Fact B. [11, Section XII, Lemma 4.13]. Let j : V → V be a WA-embedding with critical point κ. Let n ∈ ω. Then there is i ∈ Aj such that κn+1 i = j n. Claim. For each n ∈ ω there is in ∈ Aj such that crit(in ) = κ and in (κ) = κn . κn+1 Proof. Given n ∈ ω, using Fact B, we obtain in ∈ Aj such that in = j n . Since κ < κn+1 and j n (κ) < κn+1 , by Fact A, we conclude that in (κ) = j n (κ). Using the notation from the Claim, let X = {fin | n ∈ ω}. Notice that for each n, fin ∈ Nhni, and for m 6= n, Nhni ∩ Nhmi = ∅. It follows that X is infinite. To see X has no limit point (even in ω ω ), let g ∈ ω ω , and let n = g(0). Then g ∈ Nhni and X ∩ Nhni = {fin }. This shows g is not a limit point of X, since in a metric space, a limit point of a set Y has the property that every open set containing the limit point contains infinitely many points of Y . Proof of (5). Let X = {fin | n = 1, 2, . . .}, from part (4). Let F0 = X and for each n ≥ 1, let Fn = X − {fi1 , . . ., fin }. Each Fn is closed and F0 ⊇ F1 ⊇ . . .. Since for each n, fn 6∈ Fn , it 166 T ¯ i , fi ) = 1 whenever m 6= n, it follows that δ(Fn ) = 1 follows that n≥0 Fn = ∅. Also, since d(f m n for each n and so hδ(Fn )in → 1. This completes the proof that the sets F0 , F1 , F2 , . . . have the required properties. Remark 23. Part (4) shows that even though Āj is closed and nowhere dense in ω ω , Āj is not homeomorphic to the Cantor set (since the Cantor set is compact). Part (5) shows Āj is somewhat pathological; the result of (5) contrasts with the fact that, since Āj is a complete metric space,Tit has the property that whenever F0 ⊇ F1 ⊇ . . . are closed subsets with hδ(Fn )in → 0, then n≥0 Fn 6= ∅. This way of representing the completion of Aj , as in (∗∗), allows us to preserve information about the behavior of elements of Aj on crit(Aj ). For instance, if crit(k) = e(m), we can discover this fact by examining fk : Look for the least r for which fk (r) 6= r; this r must equal m. In addition, if k(e(m)) = e(n), we can discover this fact by applying fk to m; it must be the case that fk (m) = n. This feature of strong preservation of structure is in sharp contrast to the characteristic of the space of bounded continuous real-valued functions on Aj in which, as we have seen, very little of this structure is preserved. This representation of elements of Aj also allows us to understand the new elements of Āj as other functions from ω to ω, that is, as sequences of natural numbers. Example 11. We show how a limit of a sequence of elements in Aj is realized as a function ω → ω in Āj − Aj . Consider the sequence hj [m] : m ≥ 1i. For each m ∈ ω, let rm be such that e(rm ) = κm , which is the critical point of j [m+1]. Under F , j [m+1] is mapped to a function fm+1 where fm+1 rm = id rm , and fm+1 (rm ) > rm . So ρ(fm+1 , id) = 1/(rm + 1), and it follows that hfm im → id. As above, if we let B = F 00 Aj , I claim that id ∈ Āj − B: Suppose instead that it lies in B. Then there is an ` ∈ Aj whose restriction to crit(Aj ) is id = idC , where C = crit(Aj ). But this is impossible because, as we showed earlier, id 6∈ Aj , as required. Finally, we outline the steps one might typically take to make use of our perfect Polish space Āj in applications to quantum physics. We begin with a complex-valued Borel measure µ on Āj ; it would suffice to use the measure guaranteed by Theorem 83(3). As a first step toward defining the Hilbert space L 2 (Āj , µ), we define: V = {f : Āj → C | |f |2 is µ-integrable over Āj }, where C is the set of complex numbers. Defining addition and scalar multiplication pointwise turns V into a vector space. A quasi-inner product can be defined by Z (∗) hf1 , f2 i = f1 f2∗ dµ. Āj As usual in this construction, a property that is needed, but that we do not have, is hf, f i = 0 ⇔ f = 0. This difficulty is corrected by introducing an equivalence relation ≈ on V by f1 ≈ f2 iff |f1 − f2 | = 0 (µ-a.e.) on Āj . 167 For each f ∈ V , let [f ] denote the ≈-equivalence class containing f . Define V̄ = {[f ] | f ∈ V }. One shows that the inner product computation (∗) does not depend on the choice of representatives, and that addition and scalar multiplication are well-defined. Now, we have properly defined L 2 (Āj , µ) by the triple V̄ , µ and this inner product defined on equivalence classes. Because Āj is a Polish space, it follows that L 2 (Āj , µ) is a separable Hilbert space. We have now the mathematical setting for performing a quantum-mechanical analysis of some system behaving within Āj . We began this section by elaborating on the dynamics of a WA-embedding j : V → V . By studying the self-application operation · that allows us to combine WA-embeddings V → V to produce others, we discovered an infinity of such embeddings—forming the collection Aj —and a countable j-class crit(Aj ) of critical points that determine the behavior of the embeddings belonging to Aj . Having observed that Aj is a dense-in-itself metric space, we took the next logical mathematical step and obtained the completion Āj of Aj , which turned out to be a nowhere dense perfect Polish space—a space suitable for building a canonical Hilbert space X = L 2 (Āj , µ). Although we do not have a particular physical application in mind for this formalism, we have taken a step in the direction of forging a theoretical link between, on the one hand, our highly abstract mathematical model of wholeness and its dynamics, and, on the other hand, the mathematical formalism (that of separable Hilbert spaces) for studying the physical universe through the eyes of quantum mechanics. Such a link parallels, we feel, the real-life link that joins the unmanifest dynamics of pure consciousness to the phenomena of manifest existence. We close this section with a technical question that must not be overlooked, but which we have neglected so far, namely: How can we provide a proper formal treatment of the collection Aj as a j-class when each i ∈ Aj is itself already a j-class? Indeed, the collection Aj itself is too big to be a j-class; it doesn’t “fit” in the universe V . There are several ways to handle this situation. An elegant solution would be to make use of the following theorem due to Laver (see [12]): Theorem 84. (Aj , ·) is, up to isomorphism, the unique, free, monogenic, left-distributive algebra. Left-distributivity asserts that i · (k · `) = (i · k) · (i · `); Aj satisfies this property [11]. It is monogenic because the algebra is generated by a single element, namely, j. Because of freeness, this algebra is unique up to isomorphism. But there are examples of such algebras in the realm of sets. One example that has been studied extensively is found in the braid group B∞ with infinitely many generators b1 , b2 , . . .[11]. One can define an operation · on B∞ that makes B∞ a left-distributive algebra. It follows that if we form the left-distributive subalgebra B1 = hb1 i of B∞ , then B1 is an example of a free (monogenic) left-distributive algebra. Therefore, the interesting applications of Aj that we have in mind could be carried out using B1 in its place.91 91 We describe here a direct approach to coding elements of Aj so that Aj may be treated as if it were a j-class. We begin by recalling that the elements of j can be obtained by forming the closure of the following statements which define “elements” of Aj : (i) (j) is an element (ii) if k, ` are elements, (k · `) is an element. This finitary definition can be abstracted as a notion of a set of expressions, defined independently of the notion of elementary embeddings, and then any particular i ∈ Aj can be viewed as a realization of an expression, obtained 168 by substituting suitable restrictions of j for each element in the expression. This footnote is an elaboration of most of the details of this approach. We begin with an abstract definition of the set Expr of expressions: Start with an alphabet A = {a}, with a single element a. Expr is the smallest set closed under the following rules: (i) (a) is an expression (ii) if b, c are expressions, (bc) is an expression The length of an expression ex, denoted |ex|, is the number of occurrences of a in ex. An indexed expression is an expression in which each of the integers 1, 2, . . . , |ex| is assigned to an occurrence of a in ex, in descending order, scanning ex from left to right. An indexing of an expression ex is the result of assigning integers to occurrences of a so that ex becomes an indexed expression. We shall denote the assignment of an integer i to an occurrence a in an expression by ai . For example, an indexing of the expression (((a)(a))(((a)(a))(a)))(a) is (((a6 )(a5 ))(((a4 )(a3 ))(a2 )))(a1 ). One can show by induction that every expression has a unique indexing: Certainly (a) has the unique indexing (a1 ). Assuming expressions b and c have unique indexings, obtain an indexing for (bc) by increasing each index in b by the amount |c|. Since |bc| is a fixed size and indices must decrease from left to right, the indexing is unique. To represent a computation of i(x) for some i ∈ Aj , our plan is to substitute into each indexed expression a term of the form j Vα , for α large enough. The following lemma tells us what “large enough” needs to be. Lemma 85. For each n ∈ ω, j Vκn +ω·n ⊆ Vκn+1 +ω·n . Proof. For each x ∈ Vκn +ω·n , j(x) ∈ j (Vκn +ω·n ) = Vκn+1 +ω·n . Clearly, then, all ordered pairs of the form (x, j(x)) for x ∈ Vκn +ω·n , belong to Vκn+1 +ω·n . Next, we define a function that will facilitate our intended substitutions, in light of the Lemma. Definition 12. Define t : V → ω by t(z) = least m ∈ ω such that rank z < κm We define a substitution function Sub : (indexed expressions) → V , so that for each ex, z, Sub(ex, z) is obtained in the following way: For each ai in ex, substitute j Vκt(z)+i +ω·i for ai , and consider concatenations of these restrictions of j to be functional application. For example, let ex = (x3 )((x2 )(x1 )) be an indexed expression, and let z = κ + κ. Then t(z) = 1 since κ1 = j(κ) is the least cardinal in the critical sequence above z. We have: “ “ ”” Sub(ex, z) = j Vκt(z)+3 +ω·3 j Vκt(z)+2 +ω·2 j Vκt(z)+1 +ω “ “ ”” = j Vκ4 +ω·3 j Vκ3 +ω·2 j Vκ2 +ω . Now let A0j = {Sub(ex, z) | ex is an indexed expression and z ∈ V }. Note that the range of each Sub(ex, z) is bounded in V . Therefore, A0j is a j-class, and provides us with (a j-class of) codes for Aj in the following way: For each i ∈ Aj , let ex be an expression realized by i. Then for any z ∈ V , i(z) can be computed by Sub(ex, z)(z). (It can be shown that this computation does not depend on the representing expression ex—see [11].) The class A0j of codes is good enough to provide us with j-class versions of the collections/maps: (a) crit(Aj ) (b) the increasing enumeration e : ω → crit(Aj ) (c) the operator mi,k defined to be the least n for which i(e(n)) 6= k(e(n)). (d) the metric d : Aj × Aj → R. For (a), crit(Aj ) = {α | ∃ex, z (z is an ordinal and Sub(ex, z)(z) 6= z and ∀δ < z (Sub(ex, z)(δ) = δ))}. Parts (b)–(d) follow easily from (a). 169 Let us also remark that the seemingly larger completion Āj of Aj is not difficult to formalize since it has been formally defined as a subspace of ω ω . 24.3 The WA Blueprint and the Ancient View of Blueprints A theme that we have traced in this article from our initial study of the notion of infinite set, by way of Dedekind self-maps, through strong forms of Dedekind self-maps of the universe, including the theories ZFC+BTEE+MUA and ZFC+WA, is the idea of a blueprint. We think of a blueprint as containing the essence, the knowledge, and the vitality of all that is shaped from the blueprint. The blueprint is the link between the primary intelligence, out of which the blueprint is woven, and the diversity of form to which the blueprint gives rise. There are many instances in the ancient wisdom of this idea of a blueprint, emerging from the Source, and giving rise to everything else. A well-known example is Plato’s world of forms. Plato’s forms are fundamental archetypes or templates on the basis of which the changeable manifest world is constructed. Plato cites as examples of forms the form of virtue and geometric forms, such as a circle. The form of virtue is to be understood as the archetype by which we recognize a great variety of behaviors—which could arise in an endless variety of contexts—as instances of virtue. Likewise, geometric forms, like a circle, represent archetypes of another kind: For instance, though no one has ever seen in the physical world a perfect circle, it is by virtue of the form of a circle that the physical approximations to a circle that we encounter are recognized as approximations of a circle. In Plato’s dialogue, The Phaedo, he makes it clear that the forms are unchangeable, eternal, beyond sense perception (but knowable), of divine nature, and the cause of the multiplicity of beings, each shaped according to the parent form’s nature. Although there is a vast diversity of forms, Plato makes clear in The Sophist that the forms are subsumed under the highest form: the one being. By virtue of “participating” in the one being, each form is existent and is a “one” on its own, and all forms collectively form a unity, a oneness of being. The Sophist also makes clear a distinction between one being and the one; in Plato’s view, the one is devoid of multiplicty, while the one being is the interface between one and many.92 From the one being emerges, according to Plato, the intelligible realm, which has a threefold division, described in the Philebus as bound, infinite, mixed, and also as symmetry, truth, beauty;93 and described by the Neoplatonist Simplicius, in his commentary94 on Aristotle’s Physics, as intellect, intellection, intelligible. Simplicius remarks95 that these three, when considered as the constituents of the one 92 In The Sophist, we read, “According to right reason, that which is truly one should be. . . entirely without parts.. . . For, since in a certain respect being is passive to the one, it does not appear to be the same as the one” [35, Vol. 3, pp. 246-7]. On this passage, Thomas Taylor comments, “Plato here dividing the one and being from each other, and showing that the conception of the one is different from that of being, evinces that what is most properly and primarily one is exempt from the one being. For the one being does not abide purely in an unmultiplied and uniform hyparxis. But the one withdraws itself from all addition, since by adding any thing to it you diminish its supreme and ineffable union.”[35, Vol. 3, p. 243]. 93 See Thomas Taylor’s discussion of this division in his notes on the Parmenides [35, Vol. 3, pp. 162-3]. 94 See [35, Vol. 3, pp. 245–7]. 95 Simplicius says, “It remains, therefore, that the Parmenidean one being must be the intelligible, the cause of all things, and hence it is intellect and intellection, in which all things are unitedly and contractedly comprehended according to one union. . .” The use of these variations of the word “intellect” can easily be misinterpreted, though they are commonly encountered translations of the terminology used by Plato and the Neoplatonists. Thomas 170 being, are identical with one another. This emergence of three within the one being is, according to Neoplatonic doctrines, further elaborated in groups of three, forming a vast hierarchy of intelligence, remaining all the while unified with its source. See [32]. In a different direction, in the Sophist, Plato enumerates five genera of being—essence, sameness, difference, permanence, and motion. Again, these are to be understood as being no different from being itself, but provide rather a texture for grasping the purely abstract field of being. In the Timaeus, Plato discusses how the manifest universe is shaped by the world of forms. In that dialogue, he explains how the demiurge, a fundamental principle of intelligence, molds “the Receptacle”—which may be understood as pure emptiness—using the forms as a template. In the process, this intelligence intends that “all things should come as near as possible to being like himself” [5, Part I, p. 217]. Summarizing these points, we see in Plato’s vision that the ultimate nature of things is unity, which has an entirely unexpressed aspect that Plato calls the One, and an aspect that interfaces with and gives rise to multiplicity, which he calls the one being. In particular, the one being gives rise to a threefold division intellect, intellection, intelligible, which in turn unfold into additional triads and ultimately unfold as the world of forms. These divisions—the one being, the threefold division, the triads, and the realm of forms—are to be understood as one reality, in more or less elaborated form. Each can be viewed, in the sense we intend in this paper, as a blueprint for manifest existence. Then, manifest existence arises as an image of this blueprint, architected by a fundamental principle of intelligence. Another approach that considers manifest existence to be an expression of a fundamental blueprint is given in Maharishi Mahesh Yogi’s discussion of the Vedic literature. In the Vedic conception, the first step in the move of wholeness—the ultimate totality—within itself involves a gathering of all its unbounded potential into a single focal point; this point of focus is said to be a point of “infinite dynamism,” teeming with the tremendous power and possibilities of the Taylor hastens to distinguish ordinary discursive reasoning, which arises from the dianoetic power of the soul, from the “intellection” described by Simplicius, which arises from the noetic power of the soul: The rational and gnostic power of the soul receive a triple division: for one of these is opinion, another, the dianoetic power, and another intellect. Opinion therefore is conversant with the universal in sensibles, which also it knows, as that every man is a biped, and that all color is the object of sight. It likewise knows the conclusions of the dianoetic energy, but it does not know them scientifically. For it knows that the soul is immortal, but is ignorant why it is so, because this is the province of the dianoetic power.. . . For, the dianoetic power, having collected by a syllogistic process that the soul is immortal, opinion receiving this conclusion knows this alone, that it is immortal. But the dianoetic power. . .[makes] a transition from propositions to conclusions.. . . But the province of the intellect is to dart itself, as it were, to things themselves by simple projections, like the emissions of visual rays, and by an energy superior to demonstration. And in this respect intellect is similar to the sense of sight, which by simple intuition knows the objects which present themselves to its view. [35, Vol. 1, pp. 353–4] 171 Absolute.96 This focal point arises in the collapse of unboundedness;97 this collapse, according to Maharishi, is captured in the first syllable, AK, of Rik Veda, where the letter ‘A’ embodies the unbounded value (pronounced with a fully open throat) and ‘K’ embodies the point value (pronounced with full restriction of the throat). In this way, the letter ‘K’ represents this focal point of all possibilities and infinite dynamism.98 In the Vedic conception, this focal point begins to expand, step by step, into the fundamental frequencies from which the manifest universe is comprised. Collectively, these frequencies are known as the Veda and form a blueprint for the manifest universe. The term Veda is translated as “knowledge”; these fundamental frequencies represent the self-knowledge that emerges sequentially within the pure wakefuleness of wholeness.99 In the following passage, the Rik Veda itself describes its own unfoldment from its first syllable AK [8, pp. 180–181]: r.icho akshare parame vyoman yasmindevā adhi vishve nisheduh. The verses of the Veda exist in the collapse of fullness in the transcendental field, in which reside the fundamental frequencies responsible for the whole manifest universe. –R . k Veda I.164.39 The expression “verses” in the above passage is also sometimes translated “frequencies.” The quotation indicates how, within wholeness, the “transcendental field,” the basic frequencies of the univese emerge in the collapse of the unbounded fullness of the Absolute. The fact that this collapse is a gathering of potential to a point is important and is brought out by a basic feature in the structure of Vedic Sanskrit: The structure of the Vedic literature and the Vedic language is such that the first word in a verse, hymn, or longer section in the Veda contains in seed form the total content of this verse, hymn or section. Moreover, the sound of 96 Like Plato, Maharishi also considers the ultimate wholeness to have two aspects, one that is entirely unexpressed, which he calls singularity or pure Samhita (“Samhita” can be translated as “unity”) and Samhita of rishi, devata, and chhandas (that is, knower, knowing, and known in their unified state). In our discussion here, we focus on the latter since it is this aspect that provides a blueprint for the universe. In the following quotation, Maharishi describes this distinction between the two aspects of the ultimate and how these two aspects of wholeness are related to each other: Veda equates with the unmanifest self-referral intelligence of Samhita, which conceives of the three qualities of rishi, devata, and chhandas within its own self-referral singularity—singularity finds diversity within its structure. [28, p. 61] 97 On this point, Maharishi remarks, “It is interesting to observe that A in its continuum stands for the continuum of silence, and the collapse of this infinite value onto its point value generates dynamism within the nature of silence, and this silent dynamism is the lively home of all the Lasw of Nature from where specific Laws of Nature emerge within the wholeness of the total potential of Natural Law in A.” [25, p. 620] 98 Maharishi remarks, “Vedic Science identifies ‘K’ as the point of all possibilities . . . from which the whole sequential progression of the Samhita emerges simultaneously. This locates in ‘K’ the infinite creative potential of nature that in one stroke can give expression to the infinite diversity of creation” [29]. 99 In a quote from Maharishi cited in [36, p. 218], we read that the Veda is “a beautiful, sequentially available script of nature in its own unmanifest state, eternally functioning within itself, and, on that basis of self-interaction, creating the whole universe and governing it.” 172 this first word—indeed, even the first syllable—is said to contain the fundamental impulses or frequencies that structure the content of the entire verse, hymn or section. Maharishi describes this structure of the Veda as a“self-created commentary”—Apaurusheya Bhasya—since later stages of unfoldment express in ever greater detail the totality of knowledge inherent in the earlier stages [25, pp. 495–500]. Not only does the Veda give rise to every detail of manifest creation, but it does so, according to Maharishi, by virtue of its nature as simultaneously infinite dynamic and infinitely silent; all opposite values find their lively integration within this field.100 This high degree of integration is due to the fact that at each stage of unfoldment, the Veda remains completely self-referral and united with itself; the parts of expression never dominate the original underlying wholeness. This integration within the Veda is responsible for all unity and coherence displayed in manifest existence. How then does the Veda, as blueprint of the universe, actually give rise to the universe? According to Maharishi Vedic Science, the tranformational dynamics by which the universe “emerges” from the Veda is a principle of intelligence within the nature of pure consciousness itself, called vivarta. “The principle of vivarta makes . . . the Veda and Vedic Literature appear as Vishwa [the universe]” [25, pp. 377, 589]. So, in reality, the universe is not produced by the Veda, but by a power embedded in intelligence itself, the Veda appears as the universe.101 Summarizing the discussion so far, we list some of the key characteristics of the Veda, as a blueprint for manifest existence; we compare these points with the points we extracted earlier from Plato’s philosophy. (1) Every detail of the manifest universe arises from Veda; this manifestation or appearance is by virtue of an aspect of intelligence—vivarta—inherent in pure consciousness itself. In Plato, likewise, the world of forms provide a template for the manifest world, which arises by virtue of the dynamics of intelligence embodied in the demiurge. (2) Maharishi’s Apaurusheya Bhasya asserts that knowledge of the whole of Veda is present in expanding packets of knowledge, in increasingly elaborated forms: From the first letter, to the first syllable, to the first pada, to the first richa, to the first mandala, to the entire Veda. Each contains the totality of knowledge. In Plato, also, we see that the world of forms is appreciated in different degrees of elaboration—first as one being, then as the triad intellect, intellection, intelligence, and ultimately as the full range of eternal forms. Each stage of this elaboration is understood to be fundamentally the same as all the others, differing only in degree of elaboration. (3) The Veda embodies, in its unfoldment, simultaneously infinite silence and infinite dynamism. Recall also that two of the genera of being in Plato’s account of forms are permanence and motion, which resemble closely the Vedic notions. (4) Everything that emerges from the Veda arises in the collapse of fullness to a point. It is not clear from Plato’s writings that he held a similar view, but Neoplatonist commentators 100 See for example [25, p. 566]. Maharishi remarks, “Vishwa (universe) is not the parinam (result) of Veda, it is the vivarta of Veda” [25, p. 589]. 101 173 observe [35] that inspiration for Plato’s doctrine of forms (or “ideas” as they are somtimes called) came in part from the Chaldean oracles. Proclus in particular traces Plato’s forms to a particular poem from the Chaldeans in which it is stated, “The intellect of the Father made a crashing noise, understanding with unwearied counsel omniform ideas. But with winged speed they leaped forth from one fountain: for both the counsel and the end were from one Father” [35, Vol. 3, p. 20]. This “crashing” has been interpreted as the first sprouting of the world of forms, and perhaps parallels the Vedic conception of “collapse.” We now turn to a summary of characteristic features of a Laver sequence, derived from ZFC + WA, as a blueprint for the sets in V . The numbering of the points below is intended to correspond to the numbering in the list above. In the points below, we have taken certain liberties in our mathematical use of the term “blueprint.” Formally, a blueprint for V has been defined to be a triple (`, κ, E). However, ` was defined on the basis of a Laver sequence f : κ → Vκ . For this discussion, therefore, it will be convenient at times to refer to the Laver sequence f , rather than ` itself, as the blueprint for V . (1) As we already discussed (Remark 21), every set in the universe arises from the blueprint (`, κ, E), formed through the interaction of a WA-embedding j : V → V with its critical point κ. Moreover, the “decoding” of this blueprint is by virtue of mappings that resemble j itself (reminiscent of the demiurge transforming the world of forms into the universe, in such a way that “all things should come as near as possible to being like himself”). (2) Maharishi’s Apaurusheya Bhasya, and sequence of elaborations of Plato’s one being, have a parallel in the structure of a Laver sequence f : κ → Vκ under WA: For almost all α < κ, f α is itself Laver, another blueprint for the universe. Proposition 86. Suppose j : V → V is a WA-embedding with critical point κ and let E be the class of extendible embeddings with critical point κ. (A) Let D = {X ⊆ κ | κ ∈ j(X)}. Let f : κ → Vκ be a Laver sequence at κ. Then (+) {α < κ | f α is Laver at α} ∈ D. (B) There is a normal measure U on κ such that if X is defined by X = {α < κ | f (α) = f α and f α is Laver at α}, then X ∈ U . In other words, for almost all α, the αth term of f is f α which is itself a Laver sequence at α. Proof. Notice that whenever i ∈ E or i = j, i(f ) κ = f : For each α < κ, i(f )(α) = i(f )(i(α)) = i i(α) = i(α). The last step follows because i(α) ∈ Vκ . Proof of (A). Applying j to the set (+), it is clear that κ ∈ {α < j(κ) | j(f ) α is Laver at α} 174 since j(f ) κ = f . Proof of (B). Let i ∈ E be such that i(f )(κ) = f and let U = {X ⊆ κ | κ ∈ i(X)}. Then the result follows from the fact that κ ∈ {α < i(κ) | i(f ) α = i(f )(α) and i(f )(α) is Laver at α}. (3) The fact that the Veda simultaneously incorporates infinite dynamism and infinite silence, and that Plato’s one being embodies at once permanency and motion, find a parallel in the structure of a Laver sequence f in the following ways: Almost all terms of f are “silent,” and, at the same time, almost all terms are “dynamic.” Relative to a Laver function f , we interpret “silence” or “permanency” at a point α to mean that f does not move α; that is, f (α) = α. And we consider “dynamism” or “motion” to be represented by rapidly growing behavior; if a function dominates a known fast-growing function (like a generalized exponential), we consider that such a function exhibits a high degree of dynamism. Proposition 87. Suppose j : V → V is a WA-embedding with critical point κ. Let f : κ → Vκ be a Laver sequence at κ. Then (A) There is a normal measure U on κ that contains the set {α < κ | f (α) = α}. Therefore, f is “silent” almost everywhere (relative to U ). (B) Consider the following generalized exponential function A(α, β), for α, β below κ, defined by the following clauses A(α, 0) = 2α A(α, γ + 1) = 2A(α,γ) A(α, λ) = sup{A(α, γ) | γ < λ}(λ a limit) Then there is a normal measure U on κ such that the set {ρ < κ | f (ρ) is a cardinal and f (ρ) > A(ρ, ρ)} belongs to U . Therefore, f is “highly dynamic” almost everywhere (relative to U ). (C) Assuming102 no Laver sequence at κ is definable in (Vκ , ∈) with parameters,103 there is a Laver sequence f : κ → Vκ such that for each h : κ → κ definable in Vκ , {α | f (α) is a cardinal and f (α) > h(α)} 102 Assuming the existence of an I3 -embedding is consistent with ZFC, the theory ZFC + WA + “No Laver sequence at κ is definable in Vκ ” is also consistent. This is proven in [6]. The idea is: start with an I3 -embedding, which directly provides a transitive model of ZFC + WA; then use a modified reverse Easton forcing to obtain a model of ZFC + WA + V = HOD. In that model, no Laver sequence is definable in Vκ with parameters. 103 A function q : κ → Vκ is definable in (Vκ , ∈) with parameters if there are sets a1 , . . . , am ∈ Vκ and a formula ψ(x, y, z, u1 , . . . , um ) such that for all sets X, Y, Z ∈ Vκ Z = q(X, Y ) if and only if ψ(X, Y, Z, a1 , . . . , am ). An easy counting argument shows that there are precisely κ functions κ → κ that are definable in (Vκ , ∈) with parameters; therefore, they can be enumerated as {hξ | ξ < κ}, as discussed in the proof of this result. 175 belongs to D. In other words, f dominates, on a large set, every definable function κ → κ. Proof of (A). Let i ∈ E be such that i(f )(κ) = κ. Let U = {X ⊆ κ | κ ∈ i(X)} Let X = {α < κ | f (α) = α}. We claim that X ∈ U , which will complete the proof. We observe that κ ∈ i(X) = {α < i(κ) | i(f )(α) = α}, from which it follows that X ∈ U . Proof of (B) Let i ∈ E be such that i(f )(κ) = 2A(κ,κ) . Let X = {α < κ | f (α) = 2A(α,α)}. Since A is definable, i(X) = {α < i(κ) | i(f )(α) = 2A(α,α)}, and clearly κ ∈ i(X). But now let Y = {α < κ | f (α) is a cardinal and f (α) > A(α, α)}. Clearly X ⊆ Y , so Y ∈ U , as required. Proof of (C). To ensure that α 7→ f (α) dominates, on a D-measure 1 set, every function κ → κ that is definable in Vκ (with parameters), we define the sequence t of sets as follows: Let hhξ : ξ < κi enumerate the members of κ κ which are definable in Vκ . Define a function t : κ → κ by t(α) = sup{hξ (α) : ξ < α} + 1. Let φ(g, x) be the formula φ(g, x) : ∃α α is a cardinal ∧ g : α → Vα ∧ ∀η ∀ζ∀i : Vη → Vζ elem(i, α) ∧ rank(x) < η < i(α) < ζ → i(g)(α) 6= x , where elem(i, α) asserts “i is an elementary embedding with critical point α.” As in our earlier construction of a Laver sequence, the formula φ(g, x) says that g is not Laver, with witness x. Define f : κ → Vκ by   t(α) if f α is Laver at α f (α) = ∅ if α is not a cardinal   x otherwise, where x satisfies φ(f α, x). Let D = Uj be the normal ultrafilter over κ that is derived from j; that is: D = {X ⊆ κ | κ ∈ j(X)}. By Amenability, D is a set, since D = {X ⊆ κ | κ ∈ h(X)}, where h = j P(κ). As in our previous argument, define sets S1 and S2 by S1 = {α < κ | f α is Laver at α} S2 = {α < κ | φ(f α, f (α)}. 176 Clearly, S1 ∪ S2 ∈ D. We show that S1 ∈ D, and for this, it suffices to show S2 6∈ D. Toward a contradiction, suppose S2 ∈ D. Then, φ(f, j(f )(κ)) holds in V . Let x = j(f )(κ). Pick η so that rank(x) < η < j(κ). Let i = j Vη : Vη → Vζ , where ζ = j(η). By Amenability, i is a set, and is an elementary embedding with critical point κ. Clearly, in V , rank(x) < η < i(κ) < ζ and i(f )(κ) = x, contradicting the fact that φ(f, j(f )(κ)) holds in V . Therefore S2 6∈ D, as required. Finally, we prove that f has the required additional property. Suppose h : κ → κ is a function that is definable in Vκ with parameters a1 , . . ., am ∈ Vκ . Let T = {α | α is a cardinal, f (α) is a cardinal, and h(α) < f (α)}. Let χ < κ be such that h = hχ (this is possible since the hξ collectively enumerate all functions κ → κ that are definable from parameters in Vκ ). Suppose α ∈ S1 ∩ (χ, κ). Then f (α) = t(α) = sup{hξ (α) : ξ < α} + 1 ≥ hχ (α) + 1 > hχ (α). Therefore T ⊇ S1 ∩ (χ, κ) ∈ D, and so T ∈ D. We have shown that f dominates each definable function κ → κ on a set of D-normal measure 1, as required. (4) Just as everything in existence emerges from the collapsing dynamics of the Veda, so likewise every set in V arises from the collapsing dynamics of the blueprint (`, κ, E). We indicate here some highly “self-referral” dynamics that are responsible for the emergence of several particularly interesting sets: The critical point κ, the Laver sequence f itself, and each extendible elementary embedding e : Vη → Vξ with critical point κ. As usual, j : V → V denotes a WA-embedding with critical point κ and f : κ → Vκ is a Laver sequence at κ, and E is the class of extendible embeddings with critical point κ. (A) Collapse giving rise to κ. Let i ∈ E with i(f )(κ) = κ. As we showed before, such an i forces it to be the case that for some normal measure U on κ, the set {α < κ | f (α) = α} belongs to U . (B) Collapse giving rise to f itself. Let i ∈ E with i(f )(κ) = f . Then there must be a normal measure U on κ (namely, {X ⊆ κ | κ ∈ i(X)}) so that the set {α < κ | f (α) = f α} belongs to U . This follows from the fact that f = i(f ) κ, as we observed earlier. This phenomenon, that for so many α we have f (α) = f α, is reminiscent of the fact that the Veda is structured in self-referral loops;104 almost every value of f is structured in terms of the wholeness of the Laver sequence. (C) Collapse giving rise to each extendible embedding e : Vη → Vξ . Let i ∈ E be such that i(f )(κ) = e. Then there is a normal measure U on κ such that {α < κ | f (α) is a nontrivial elementary embedding of the form Vη → Vξ } belongs to U . From this point of view, the range of f contains almost exclusively extendible embeddings! This is evidence of another kind of dynamism inherent in the blueprint. 104 See [28, pp. 63–64]. 177 25 Conclusion We began our discussion of the set of natural numbers with a brief look at the viewpoint of the ancients. We found that ancient traditions of knowledge have recognized the natural numbers as having a common source, a unified foundation, whose internal dynamics lead to the sequential emergence of these numbers, while at the same time remaining connected to their source. We also recalled the Maharishi Vedic Science admonition that failure to view the natural numbers as connected to their source marks the beginning of ignorance. Maharishi’s name for the self-referral dynamics at the source of all the natural numbers is the Absolute Number. We went on to observe that the usual way of asserting the existence of the set N of natural numbers axiomatically, in the foundations of mathematics, tends to disregard the question of the possible “origin” of this set, and instead, simply asserts that it exists. We suggested that failure to view N as originating from something more fundamental has led to unnecessary limitations on the mathematician’s intuition regarding the infinite in mathematics. This difficulty, we argued, becomes apparent when one attempts to address the Problem of Large Cardinals. Trying to determine which of the known large cardinals should be taken to be “real” entities in the universe of sets requires a deeper intuition about the nature of the mathematical infinite. If one attempts to acquire the necessary insight by examining more deeply the foundational ZFC axioms, one is led to the Axiom of Infinity, which, in simply asserting that the set N exists, tells us very little about which large cardinals exist. We have suggested, in accord with the ancient view of the matter, that the natural numbers should be seen as emerging from self-referral dynamics of an unbounded field, and that it is these dynamics that should be declared by the Axiom of Infinity to exist. Moreover, we have shown that this philosophy is easy to implement: Our New Axiom of Infinity states that a Dedekind self-map—a map from a set to itself that is 1-1 but not onto—exists. The axiom emphasizes the existence of transformational dynamics rather than the existence of discrete values. We then went to considerable lengths to show how the sequence of natural numbers can be derived from this new axiom, without any reliance on the natural numbers in the process. We then were in a position to verify whether this new perspective could shed more light on questions about the mathematical infinite than the old perspective. It seemed natural, considering the formulation of the New Axiom of Infinity, to ask whether large cardinals, like the sequence of natural numbers, might be understood as emerging from some kind of Dedekind self-map; we suggested looking at such self-maps that were themselves proper classes. We found that proper class Dedekind self-maps are weaker than Dedekind self-maps that are sets, and do not imply the existence of infinite sets. We showed however that slight strengthenings of the properties of the proper class maps did allow us to derive infinite sets, and we argued that the strategies for strengthening the maps in these ways could serve as further guidance in formulating axioms to account for large cardinals. Moreover, the strategy of strengthening preservation properties was inspired by an examination of the characteristics of set Dedekind self-maps themselves: Dedekind self-maps always preserve the character of their domain—the image B of a Dedekind self-map j : A → A is itself Dedekind-infinite—and, in turn, j induces another such map, obtained by restricting j to B. Our efforts to find appropriate preservation properties that a Dedekind self-map j : V → V 178 might have, which would give rise to an infinite set, led to the following results: Existence of an infinite set is derivable from each of the following. (1) A Dedekind self-map j : V → V that preserves disjoint unions, the empty set, and singletons. (2) A Dedekind self-map j : V → V with critical point a with the following properties: (a) j preserves disjoint unions, intersections, terminal objects, and the empty set. (b) a ∈ j(A) is a universal element for j, for some set A. (c) The set {a} is also a critical point of j. (3) A self-map j : V → V with a strong critical point that preserves finite coproducts and terminal objects. Moreover, we showed that these properties, in each case, are consistent by giving examples under a suitable axiom of infinity. (For (2) and (3), we could provide an example assuming only the existence of ω. For (1), our only example was a model of ZFC + BTEE—the example given was a nontrivial elementary embedding j : L → L; this example, however, requires large cardinals.105) In another direction, we also observed that an essential characteristic of a Dedekind self-map j : A → A is the presence of a critical point a. We found that the sequence of natural numbers could be computed through interaction between j and its critical point; indeed, we discussed in some detail the sense in which the sequence {a, j(a), j(j(a)), . . .} forms a blueprint for the set ω of natural numbers, developing a reasonably precise notion of blueprint. The significance of the critical point in the study of Dedekind self-map suggested the possibility that large cardinals themselves could arise from the interaction between a proper class Dedekind self-map with its critical point, and that some parallel notion of “blueprint” for large sections of the universe should be definable. We saw an example of the role of the critical point for a suitably strong j : V → V in the construction of the Lawvere functor j : V → V, j = G ◦ F. In that construction, we found that the notion of the infinite—in the form of a Dedekind-infinite set and corresponding Dedekind selfmap—first emerges as the image of the critical point of j: j(1) = X1 = {η0 , f1 (η0 ), f12(η0 ), . . .} (where f1 = F(1)). Using these tools for generalization, we showed how certain combinations of preservation properties belonging to a Dedekind self-map j : V → V had large cardinal consequences. For instance, if j : V → V is a Dedekind self-map with critical point κ, κ must be inaccessible if the following properties hold: (A) j preserves ∈ (B) j preserves ordinals (C) j is BSP (Definition 6, p. 88) 105 It is unlikely that a model of (1) actually requires large cardinals, but we do not have a proof at this time. 179 (D) j preserves countable unions (E) j preserves the empty set (F) j preserves singletons (G) j preserves unboundedness (H) j preserves power sets. Moreover, we showed that these properties are consistent by showing that for any model hV, ∈, ji of ZFC + BTEE, these properties are satisfied by j (once again, an example of such a model is given by any nontrivial elementary embedding j : L → L).106 Another result in this direction was that, assuming j : V → V is a (essentially Dedekind) functor with a strong critical point and a weakly universal element a ∈ j(A) for j, for some set A, then, if j has the additional preservation properties listed below, there must exist a nonprincipal ω1 -complete ultrafilter on A—in fact, there must exist a measurable cardinal ≤ |A|. (A) j preserves countable disjoint unions (B) j preserves intersections (C) j preserves equalizers (D) j preserves the empty set (E) j preserves terminal objects. Consistency of these properties follows from the existence of a canonical model that can be built assuming the existence of a κ-complete nonprincipal ultrafilter on some uncountable cardinal κ; in other words, assuming existence of a measurable cardinal is enough to produce a model of all the preservation properties just mentioned. These latter results were improved slightly and expressed more elegantly by Trṅkova-Blass, who showed that there is a measurable cardinal if and only if there is an exact functor j : V → V having a strong critical point. To form still stronger statements, using these same tools, we strengthened the preservation properties even further. We considered stronger models hV, ∈, ji of ZFC + BTEE. For such models, because of limitative results due to Kunen, we observed that j cannot be definable in V ; however, it was observed that the formulation of these axioms using an extended language of set theory in which there is an extralogical symbol j averts the Kunen inconsistency. Recall that BTEE is a collection of axioms that collectively assert that j is an elementary embedding V → V with a least ordinal moved, denoted κ. Such embeddings are necessarily Dedekind self-maps, but are neither sets nor proper classes; they resemble proper classes because they have values arbitrarily high in V , but they are not defined by a single ∈-formula. We observed that the critical point κ of a BTEE embedding j : V → V is weakly compact; however, the theory ZFC + BTEE is not strong enough to derive a measurable cardinal; in 106 Existence of such a model requires considerably more than an inaccessible (more than an ineffable cardinal is required). We do not know at this time how to build such a model assuming only a single inaccessible. 180 particular, κ is not measurable. The natural choice for defining a normal measure on κ in the theory ZFC + BTEE, namely, (∗) U = {X ⊆ κ | κ ∈ j(X)}, is not available in this theory (that is, it is not a set in any model of the theory). Having seen by the Blass-Trnková result the naturalness of a measurable cardinal, and also having seen the naturalness of the notion of “nonprincipal ultrafilter” on a set A, we introduced the postulate MUA, which asserts that this naturally defined normal measure U , as defined in (∗), exists. The theory ZFC + BTEE + MUA renders the critical point κ of the embedding j : V → V a very strong kind of measurable cardinal. In addition to the fact that j exhibits the strongest possible preservation properties and strong large cardinal properties at its critical point, the theory is now strong enough to exhibit the beginnings of an interesting blueprint: From ZFC + BTEE + MUA, we can derive the existence of a blueprint (`, κ, E) for Vκ+1 , where E consists of ultrapower embeddings iU : Vκ+1 → M , having critical point κ. Still motivated by the properties that we originally found present in a set Dedekind selfmap—namely, the existence of a blueprint for ω—we sought to strengthen the theory ZFC + BTEE + MUA even further. One objective was to see whether it is possible to have a j : V → V that yields a blueprint of all of V rather than just an initial part of V . The direction for strengthening the theory further appeared when we attempted to generalize another feature of a set Dedekind self-map, namely, that many of the “interesting things” about the self-map arise when we consider successive restrictions. In the case of a Dedekind self-map j : A → A with critical point a, the sequence a, j(a), j(j(a)), . . . arose from a consideration of restrictions of j to successive ranges: First jB = j B : B → B, where B = ran j; then jC = j C : C → C where C = ran jB , and so forth. A critical point of jB is j(a); a critical point of jC is j(j(a)), and so on. When we attempt a similar sequence of steps in a model (V, ∈, j) of ZFC + BTEE + MUA, we immediately discover the limits of the theory; while we can define the restriction j Vκ : Vκ ,→ Vj(κ) in the theory, we cannot take the next step suggested by the set Dedekind self-map dynamics: The restriction j Vj(κ) is not a set in this theory; the reason is that existence of such a map (as a set in the universe) implies large cardinals that are stronger than the theory itself allows. Indeed, asserting the existence of restrictions of j to various power sets of κ and j(κ) results in a number of extremely strong theories—strong enough to derive existence of some of the strongest large cardinals; for instance, the theory ZFC + BTEE + ∃z (z = j P(P(j(κ)))) proves that κ is the κth huge cardinal. We also discovered where the limit of generalization in this direction lies: We cannot have a consistent theory that allows j λ to be a set where λ lies above the critical sequence κ, j(κ), j(j(κ)), . . .. This tells us that to carry this direction of generalization as far as possible, we require (if possible) that the critical sequence κ, j(κ), j(j(κ)), . . . is unbounded in the class of ordinals. Following this line of reasoning, we were led to the Amenability Axiom, which states that the BTEE-embedding j should have the additional property that for any set X, j X is also be a set; again, this theory has a chance to be consistent only if the critical sequence of j is unbounded in ON.107 107 It turns out that the theory ZFC + BTEE + Amenability is indeed consistent, relative to another very strong 181 The axioms BTEE together with the Amenability Axiom are known collectively as the (weak) Wholeness Axiom, or WA. We indicated earlier that the theory ZFC + WA is enough to imply existence of virtually all large cardinals; moreover, if j : V → V is a WA-embedding with critical point κ, we found that the “place of origin” of all large cardinal properites is κ itself, so, in this sense, our earlier observation that the natural numbers emerge from dynamics of a Dedekind self map interacting with its critical point is realized on the larger scale of emergence of large cardinals. Moreover, in the theory ZFC + WA, we also showed that our objective regarding construction of a blueprint for all sets is finally achieved; we exhibited a Laver sequence f : κ → Vκ (based on extendible embeddings) for V, and showed how existence of f leads to a blueprint (`, κ, E) for V itself, where κ is the critical point of the WA-embedding j and E consists of all extendible embeddings with critical point κ. We also showed that the theory ZFC + WA (in its stronger form, where WA = BTEE + Separationj ) is strong enough to include as sets all restrictions of j of the form j Vκ , j Vj(κ), j Vj(j(κ)), . . . , providing an alternative way to generate the critical sequence κ, j(κ), j(j(κ)), . . . and also providing a satisfying generalization of the restriction properties of set Dedekind self-maps. We also showed that another sequence of restrictions of a WA-embedding j : V → V is possible in the theory ZFC + WA, namely, the sequence j Vκ ≡ Vκ ≺ Vj(κ) j(j Vκ ) = j · j Vj(κ) : Vj(κ) ,→ Vj(j(κ)) ≡ Vj(κ) ≺ Vj(j(κ)) j(j · j Vj(κ)) = j [3] Vj(j(κ)) : Vj(j(κ)) ,→ Vj(j(j(κ))) ≡ Vj(j(κ)) ≺ Vj(j(j(κ))) ··· ··· leading to the remarkable fact that Vκ ≺ Vj(κ) ≺ · · · ≺ V . An easy proof then demonstrated that the critical sequence S of a WA-embedding j : V → V with critical point κ turns out to be yet another blueprint for ω, just as in the case of (ordinary) set Dedekind self-maps, except in this case, all elements of the blueprint have “full knowledge” of the universe V itelf since all statements σ (in the language {∈}) hold in V if and only if they hold in any one of the models Vj n (κ) . In this way, the elements of S are like Maharishi’s Absolute Number. Studying S further, we were led to a deeper investigation of the application operation · on the collection Aj of elementary embeddings V → V with fixed critical point κ, obtained by applying j to itself finitely many times in all possible ways. This investigation led to the discovery that Aj admits a natural metric that renders it a dense-in-itself metric space, whose completion Āj in ω ω forms a nowhere dense, perfect Polish space, which admits a finite, non-zero, diffused Borel measure. We arrived at the conclusion that we can obtain from Āj a reasonably natural Hilbert space L 2 (Āj , µ), which could in principle be used in the context of quantum mechanics. This discovery shows how, having discovered a blueprint for ω consisting of all “absolute” elements, large cardinal notion, called I3 . In any such model, the critical sequence of j is indeed unbounded in the ordinals of the model. 182 we could expand this discovery to include a rather concrete mathematical object, also composed entirely of “absolute” elements. Having arrived at a reasonable generalization of the characteristics of a Dedekind self-map to give an account of large cardinals, we are in a position to comment on the naturalness of this solution. In the first place, the path we have taken, starting from a set Dedekind self-map i : A → A, extending to self-maps V → V and incrementally strengthening them by gradually introducing generalizations of characteristics of i, already provides considerable evidence for the naturalness of the theory ZFC + WA. Arguing from another perspective, as in Section 19, one can use as a criterion for naturalness a parallel criterion for existence of an infinite set. We suggested in that section that the formulation There is an infinite set if and only if the forgetful functor SM → Set has a left adjoint illustrates naturalness of infinite sets in a particularly striking way, since it shows how the concept of an infinite set arises in the same way as so many other mathematical structures have arisen: from an adjoint situation, or equivalently, from a monad j : V → V . We suggested in that discussion (p. 118) that if it turned out that large cardinals also arise from a “monad-like” Dedekind self-map j : V → V , appearing first in the vicinity of the least critical point of j, then we would have similar evidence of the naturalness of large cardinals. The j : V → V that arises from the Wholeness Axiom satisfies these criteria to a large extent, though not completely. While it is unlikely that j itself is a monad (for example, it is known that κ ∈ j(κ) is not a universal element of j), large cardinals do arise in the vicinity of its critical point, and the existence of a Laver sequence seems to be a reasonable strengthening of the idea of a universal element: One can show, for example, that, if f is a Laver function obtained from j, then V = {i(f )(κ) | i ∈ E}, where E is the class of extendible embeddings with critical point κ. In this sense, then, the j : V → V provided by the Wholeness Axiom can be viewed as natural not just because of the natural process of generalization from existence of Dedekind self-maps that was employed, but also because it reflects to a large extent the highly natural dynamics of the adjunction relationship, by which the very notion of “infinite set” can be seen to emerge in mathematics. We began this paper by proposing a new Axiom of Infinity, that would reflect some principles of ancient and modern wisdom regarding the possible “origin” of the set of natural numbers. We suggested that by viewing the concept of an infinite set, particularly the set of natural numbers, as fundamentally the self-referral dynamics of an unbounded field, rather than a bag of discrete points, we would be able to approach questions about the infinite in mathematics, like the Problem of Large Cardinals, with more clarity and a stronger intuition. By now, we have demonstrated our claim reasonably well, and in so doing, justified to a large degree this new viewpoint about the “underlying reality” of the set of natural numbers. We summarize our conclusions in the following short list of assertions: (1) The infinite in mathematics is, fundamentally, the self-referral dynamics of an unbounded field. 183 (2) The set of natural numbers is a side effect of the dynamics of a Dedekind self-map. (3) Large cardinals are justifiable entities in mathematics because they arise through the same natural means and self-referral dynamics as the set of natural numbers themselves. 184 26 Appendix I: A First Look at How 1 Emerges from 0 We have discussed in this paper how notions of the mathematical infinite—from the bare existence of an infinite set to large cardinals—arise from the dynamics of Dedekind self-maps, which can be seen as mathematical representatives of the self-referral dynamics of an unbounded field. In particular, we have seen how the “Infinite” emerges in the hereditarily finite world HF by the introduction of a set Dedekind self-map, and how large cardinals can be seen to emerge into the world of ZFC by way of class Dedekind self-maps with strong preservation properties. In both cases, Dedekind self-maps can be seen as the catalysts that drive expansion from ZFC − Infinity to ZFC and from ZFC to ZFC + large cardinals. With these insights in mind, we venture into a rather speculative topic: How does the empty set ever give rise to a nonempty set? How does 0 give rise to 1? How does something arise from nothing? In the set theoretic universe V , to answer this question is to shed light on the first sprouting of the unmanifest into manifestation. Our proposal is to view the transition from 0 to 1 as being of the same character as the expansions described in the previous paragraph; indeed, that this fundamental transformation, from 0 to 1, is then reflected outwardly as the more elaborate transformations from ZFC − Infinity to ZFC and from ZFC to ZFC + large cardinals. This theme of unfoldment reflects, in the world of mathematics, the analysis of the structure of pure consciousness as described in Maharishi’s Apaurusheya Bhasya: The transformational dynamics that are found in the sequential unfoldment of the Ved, and ultimately of the creation itself, are to be traced back to the very first transformation—the transformation from ‘A’ to ‘K’. In this collapse of unboundedness to a point is found the seed of all transformations that emerge from it. Likewise, we hope to locate within this first stage of unfoldment—from 0 to 1—the dynamics at work in the larger-scale transitions within the dynamics of V . As a first naı̈ve attempt at answering this question, we consider whether there could exist anything like a Dedekind self-map from 0 to 0. We can certainly get started by examining any map j : ∅ → ∅. But there is only one such map, namely, the empty map, which is in fact simply ∅ itself. In particular, since domain, codomain, and map are all empty, there can be no critical point of such a map, no seed that can germinate into something that is not empty. j : ∅ → ∅ represents unmovable silence, a stillness that never changes, a state of unperturbable balance and symmetry. How does the emptiness of the empty set develop into concrete, bounded, nonempty, individual sets? Maharishi remarks [27, p. 286], “...the [empty] set is the self-perpetuating source of all sets;” in the same sentence, he says, “We have seen in mathematics that the consciousness of the mathematician is the source of all order....”. He also remarks [27, p. 294], In order to bring about the transformation of anything into anything, the intelligence of the [mathematician] must operate from the transcendental level, which is the basic level of intelligence of every object in creation. These points remind us of something that is in a sense obvious: The impulse of intelligence that drives the empty map j : ∅ → ∅ to expand into the universe is the impulse of the transcendental level of the consciousness of the mathematician. The empty set is the unshaped all-possibilities source of all sets, but the impulse for this emptiness to take shape as individual 185 sets does not arise from the objective form on its own.108 It is consciousness itself that is responsible for locating and bringing into expression the potential that is locked in the empty set. In this way, the empty set can be seen as the meeting point between the objective world of sets and mathematics and the subjective reality of consciousness. The impulse of consciousness that causes the sprouting of j : ∅ → ∅, we suggest, finds expression in what we could call unmanifest flows—transformational dynamics of wholeness that are realized as proper class maps, like the global successor function s and the power set operator P. We discovered that maps of this type do not have much strength on their own—for instance, they do not imply the existence even of an infinite set—but they provide a context in which manifestation occurs. To show how all-pervasive proper class maps really are, one can prove, for example, that every (set) Dedekind self-map is obtained from a proper class Dedekind self-map by restriction: Let f : A → A be a Dedekind self-map with critical point a. Define f : V → V by ( f (x) if x ∈ A f (x) = x if x 6∈ A Certainly f is a proper class map with the property that f = f A. But now, even though this is the case, it is not generally true, as we have already seen, that, for a given proper class Dedekind self-map j : V → V with critical point a that there is any set A at all that contains a and is j-closed; in other words, none of the restrictions of j to sets may turn out to be Dedekind self-maps. The conclusion is that the notion of a Dedekind self-map that is a proper class is what we would call an unmanifest flow, creating a possibility but not quite able to manifest on its own. We speculate that two of these unmanifest flows are primary, and work together to “form the universe” through their interaction, corresponding to two fundamental principles underlying all expression from the unmanifest: existence and intelligence. The flows we have in mind are the global successor function s and the power set operation. Each provides a sequential unfoldment of sets, but according to different themes. The power set operation P draws out from any set argument A all possibilities from the set, giving rise to the set of all subsets of the set A. Overall, P is what gives rise to the raw material of mathematics — all structures that are used in mathematics arise from the work of P. In this sense, P gives expression to the principle of existence. The successor operation s, applied to any set, points in the direction of the “next step.” This directional character of s is especially meaningful in the context of the class ON of ordinals, as it provides a dynamic expression of the well-ordered property of ON: For each ordinal α, s(α) is equal to the smallest ordinal that is greater than α. For this reason, we think of s as expressing the principle of intelligence in the unfoldment from the unmanifest. These two operations, s and P, follow identical steps at early stages. The critical step for each, from a philosophical point of view, is the move from 0 to 1. This is the transformation from 0 to {0}. Here, the value of emptiness in a sense awakens to itself to see itself as an object 108 Compare with this remark from Maharishi [27, p. 296]: “What is lacking in the objective approach of science, where the whole knowledge is truthfully contained in one little symbol, is that the symbolic expression of total knowledge . . . does not by itself evolve into the self-referral quality—the self referral subjectivity of the mathematician”. 186 of awareness. Seeing itself as an object means viewing the boundless in terms of a boundary— in this case, as a singleton set whose only element is itself. This is the crucial step, according to Maharishi, in the unfoldment of pure consciousness: consciousness becoming conscious, and becoming conscious of itself. This vital step, in the mathematical context, is performed by the unmanifest dynamics represented by the global successor operator and the power set operator simultaneously. Once 1 emerges, each operator is applied to produce the number 2. Then, as Maharishi has described in his Vedic Science, the number 3 emerges; 3 is the number that gives expression to the relationship between the two, between existence and intelligence; it is the number that gives expression to the flow between these, and to the awareness that one has of the other. Maharishi identifies the number 3 with the fundamental principles of rishi, devata, and chhandas—knower, process of knowing, and the known. In a similar way, we find that 3 marks the point at which the operations of s and P begin to be different: s(2) = 3, whereas P(2) = {0, 1, 2, {1}}, a new set that is not an ordinal. The operators continue thereafter to unfold along different paths, but there is a close relationship between them. This relationship can be expressed in the following commutative diagram: ON s - ON T T ? V P - ? V where T is the map ON → V defined in the remarks following Theorem 33, where it is shown to satisfy T(α) = Vα for every ordinal α. In particular, P(T(α)) = P(Vα ) = Vα+1 = T(s(α)). In this way, the impulse for sequential unfoldment, embodied in s is combined with the expansive power of the set-generator P to produce a sequence of stages V0 , V1, . . . , Vα, . . ., providing us with an orderly unfoldment of all sets in the universe.109 What we have suggested here is that 1 arises from 0 because of the confluence of two unmanifest tendencies operating in the “unmanifest context” in which 0 originally appears. This unmanifest context, and the impulses that arise within it, we recognize as the expression of the transcendental level of the consciousness of the mathematician: 1 cannot arise from 0 without the involvement of the subjective reality of consciousness. We speculated that among such unmanifest flows, two are primary: The power set operator P and the global successor operator s. The 109 In these points we have emphasized the active side of the unfoldment of the stages in the construction of V . In reality, the development of V0 , V1 , . . . , Vα , . . . involves a mix of active and silent steps. Active steps are performed at successor ordinals, having the form β + 1. Stages of the form Vβ+1 are obtained by applying the power set operator to the previous stage: Vβ+1 = P(Vβ ). However, whenever α is an infinite limit ordinal, Vα is obtained by collecting togetherSall stages that have been built so far—it is a restful phase in which no new sets are produced. In this case, Vα = β<α Vβ . We can represent the silent phase of construction with a diagram as follows: Let LIM denote the class of limit ordinals. Let H : LIM → V be defined by H(α) = T α. And let S : V → V be defined by (S ran x if x is a function S(x) = ∅ otherwise. 187 power set operator elicits from 0 its hidden value as a storehouse for all sets. The global successor operator elicits from 0 its capacity to set things in order through sequential unfoldment. Each operating on 0 in its own way, 0 becomes as if aware of itself and discovers itself to be {0}, or 1. Then, as 1 gives rise to 2, these operators begin separate but closely related lines of unfoldement, giving rise to the ordered unfoldment of all sets. Finally, we return to the hypothesis that we began with: That the dynamics of emergence of 1 from 0 should contain in seed form the dynamics of the larger-scale transitions in the unfoldment of all sets, such as the emergence of an infinite set in a ZFC − Infinity world. The dynamics involved in the emergence of 1 from 0 entail an objectification of the pure subjectivity represented by ∅, so that this emptiness is appreciated in terms of an objective, single-element set {∅}. This transformation occurs by introducing what we have called unmanifest flows, or proper class maps. In particular, s(∅) = {∅} = P(∅). In a similar vein, we have seen that the notion of an infinite set also arises as a kind of objectification of a potential. Without explicitly introducing the Axiom of Infinity, the sequences ∅, s(∅), s2 (∅), . . . and ∅, P(∅), P2 (∅), . . . cannot be seen as sets, but rather as proper class sequences, representing in a sense unmanifest possibilities. This potential for infinity is realized by declaring—or, in a parallel way, in the subjective world, recognizing—that these class maps have a closure point: There is a set A such that s A : A → A and P A : A → A. The recognition of something infinite is at the same time a bringing into concrete realization certain unmanifest flows or possibilities. Something similar happens in taking the step of including the Wholeness Axiom as a fundamental axiom of set theory; this step corresponds on the subjective level to a certain kind of awakening to the reality that self-referral dynamics of consciousness are in fact responsible for the emergence of everything. The truth of WA is recognized in acknowledging the presence of a transformation j : V → V , a transformation of wholeness, that preserves all first-order properties but at the same time remains connected to every point (j X is a set for every set X). In this way, we see that emergence of 1 from 0, of infinite from finite, and of large cardinals from the ordinary world of sets occurs as a realization or manifestation of a possibility that is evident at an undeveloped stage: The empty set contains the possibility of 1; the world HF of hereditarily finite sets recognizes a potential infinity; and in the standard universe, the large cardinal nature of V itself, as well as its natural transformations V → V that accompany its construction, points to the potential for existence of large cardinals. In every case, these potentials are realized objectively in the mathematics by declaring that certain proper class functions The silent phase of the construction is then represented by the following commutative diagram: LIM id - LIM H T ? V S - ? V Here, the action of the diagram is driven by the identity transformation id : LIM → LIM rather than by s as was done for successor stages. 188 exhibit certain new (but natural) characteristics; this objective declaration corresponds on the subjective level to a certain realization about the nature of one’s own consciousness. 189 27 Appendix II: Unfoldment of the Universe Within the Unmanifest In Appendix I, we attempted to give an account of how 1 arises from 0—the mathematical analogue to the philosophical question of how “something” arises from “nothing.” We observed that the transformation of 0 into 1 occurs by virtue of class operators, like the power set operator and the global successor function. These, we argued, live in the unmanifest context for creation and provide the hidden impetus for unfoldment. From the point of view of giving an account of the universe from first principles, there is something unsatisfactory about the approach taken there. From the ancient point of view, everything arises from the dynamics of pure consciousness, the dynamics of one fundamental field. Nothing from the outside is needed to stimulate these dynamics. In the case of mathematics, however, the empty set really has nothing in it that would cause it to expand into other sets. While it is true that, by virtue of the construction of the cumulative hierarchy V , each set in V can be appreciated as a pattern involving set braces ({, }) and the empty set, these complex patterns cannot be said to arise from dynamics that are somehow “inherent in” the empty set. In this Appendix, we describe another approach, which does account for the universe in a way that parallels more closely the ancient view. We begin by asking for a solution to a set of equations that express essential characteristics of the field of pure consciousness. These essential characteristics are sufficient, from the Vedic point of view, to give an account of the origin of everything within the field of consciousness. We then ask whether, within the realm of mathematics itself, there is a solution to these equations. We write these equations first and then explain their significance: (+) xx = x = {x}. The notation xx is intended to mean the set of all transformations from x to x, in the usual way. A solution Ω to (+) would have the following characteristics: (1) Ω consists precisely of all possible transformations of itself to itself. (2) One of the transformations of Ω to itself is Ω itself: Ω : Ω → Ω (since Ω ∈ Ω). (3) Ω and its collapse to a point {Ω} are one and the same In this Appendix, we show there is a natural (and unique) mathematical solution Ω to (+), using a slight “expansion” of the usual ZFC axioms. We show also how it is possibe to derive all sets in the universe from this one point Ω, and at the same time, how all sets naturally collapse back to Ω. Moreover, the entire mathematical landscape will be seen, on this view, to be nothing but patterns and permutations of this one “reality” Ω. Properties (1)–(3) tell us that Ω—or, equivalently, the ‘x’ of equations (+)—exhibits the essential characteristics of pure consciousness itself, as described by Maharishi [25]: It is single and undifferentiated (like Ω) but, from a certain point of view, has internal dynamics that arise from the fact that, being consciousness, it assumes the roles of knower, known, and process of knowing. In the role of process of knowing, it transforms itself within itself. This is seen in the 190 case of Ω from the fact that Ω ∈ ΩΩ , which implies that Ω : Ω → Ω. Also, in the role of knower, it is the witness of all such transformations; it contains within itself all possible transformations that could arise within it. This role as knower is expressed by the fact that Ω = ΩΩ . It also assumes the role of a point within itself (Ω = {Ω}) and in this way becomes the object of its own perception, its own process of knowing. In addition, its dynamics can be appreciated as an expression of the field of Maharishi’s Vedic Mathematics—the dynamics by which, behind the scenes, Nature carries out its administration of the universe. This can be seen by the fact that Ω is itself equal to the evaluation map eval : ΩΩ × Ω → Ω defined by eval(f, x) = f (x). The evaluation map expresses the way in which Natural Law is applied to each point in existence to carry it forward to the next stage of its unfoldment. Ω As Samhita of Rishi, Devata, and Chhandas, and Administrator of the Cosmos (1) Rishi: Ω = ΩΩ (2) Devata: Ω : Ω → Ω (3) Chhandas: Ω = {Ω} (4) Structuring Dynamics of Creation: Ω = eval : ΩΩ × Ω → Ω : (Ω, Ω) 7→ Ω. We demonstrate the fact that Ω = eval in the following Proposition. We take as our background theory the ZFC axioms without the Axiom of Foundation (denoted ZFC− ), together with the assumption that Ω is indeed a solution to (+). To draw further conclusions, we will assume somewhat more later on. Proposition 88. (ZFC− ) Suppose Ω is a solution to the equations (+). (A) Ω ∈ ΩΩ , so we may write Ω : Ω → Ω. (B) Ω = (Ω, Ω) (C) Ω = Ω × Ω = ΩΩ × Ω. (D) For all x ∈ Ω, Ω(x) = Ω (using the representation of Ω in (A)). (E) Ω = eval. Proof of (A). This follows from the fact that Ω ∈ Ω = ΩΩ . 191 Proof of (B). We have the following derivation (using the fact that, by definition, for any sets x, y, (x, y) = {{x}, {x, y}}). (Ω, Ω) = {{Ω}, {Ω, Ω}} = {{Ω}, {Ω}} = {{Ω}} = {Ω} = Ω. Note that the proof of (B) makes use only of the fact that Ω = {Ω} (and not the fact that Ω = ΩΩ ). Proof of (C). Note that Ω × Ω = ΩΩ × Ω since Ω = ΩΩ . For the first equality, we have: Ω × Ω = {(x, y) | x ∈ Ω and y ∈ Ω} = {(Ω, Ω)} = {Ω} by (B) = Ω Proof of (D). Since Ω = {Ω}, the only element of Ω is Ω. Therefore, it suffices to show that Ω(Ω) = Ω. But this is equivalent to the assertion that (Ω, Ω) ∈ Ω, and this follows from (B) and the fact that Ω ∈ Ω. Proof of (E). The set-theoretic definition of eval is: eval = {(x, y, z) ∈ ΩΩ × Ω × Ω | x(y) = z}. For any (x, y, z) ∈ eval, the fact that (x, y) ∈ ΩΩ × Ω implies that (x, y) = (Ω, Ω); and the fact that z ∈ Ω implies z = Ω. Therefore, the only element of ΩΩ × Ω × Ω is (Ω, Ω, Ω), and this element (x, y, z) does satisfy x(y) = z (as shown in (D)). Therefore, eval = {(Ω, Ω, Ω)} = {(Ω, Ω)} = {Ω} = Ω, as required. The Proposition has been established in the theory ZFC− together with (+), but we have not yet demonstrated that this theory is consistent. We handle this issue by showing that (+) is derivable from the theory ZFC− + AFA, where AFA denotes Aczel’s Anti-Foundation Axiom. It is known that if ZFC is consistent, so is the theory ZFC− + AFA. Later in this appendix, we will discuss interesting points about the proof of this fact. We give a quick introduction to this axiom AFA so that we can use it to gain additional insights into this model of pure consciousness that we have proposed. A (directed) graph G is a pair (M, E) consisting of a set M of vertices and a set E of edges. Edges are represented by pairs of vertices.110 So, if u, v are vertices in a graph G and G has an edge from u to v, this edge is denoted (u, v). We also write u → v to indicate that (u, v) ∈ E. 110 We assume that our graphs do not admit parallel edges—i.e., each pair of vertices has at most one edge incident to both vertices. 192 A pointed graph is a graph with a designated vertex, called its point. When we draw pointed graphs, the point of the graph is the top vertex (whenever that makes sense) and descendants of the point evolve downward. (See examples below.) A pointed graph (G, p) = (M, E, p) is accessible if for all v ∈ M there is a path from p to v in G. Accessible pointed graphs are referred to by the acronym apg. If, for all v, there is exactly one path from p to v, then G is called a tree. A decoration of a graph is an assignment of a set to each node of the graph in such a way that the elements of a set assigned to a node are always assigned to the children of that node. In symbols, a decoration of G is a map d : G → V such that, for all vertices v, w of G, v → w if and only if d(w) ∈ d(v). A picture of a set X is an apg that has a decoration in which X is assigned to the point. A graph is well-founded if it has no infinite path. The following theorem does not require AFA; it follows from ZFC: Theorem 89. (A) Every well-founded graph has a unique decoration. (B) Every well-founded apg is picture of a unique set. (C) Every set has a picture. Here are some examples of well-founded apgs; each apg shown is a picture of the set that labels it. The first apg has no edges; this means that the set it represents can have no elements. Accordingly, the unique set that is represented by this apg is the empty set ∅. The designated point for the second apg has just one child, which, in turn, has no children; accordingly, the set it represents is {∅}, the set whose only element is ∅. Similarly, the third apg shown represents the set having as children the empty set and the set whose only element is the empty set, namely, {∅, {∅}}. These examples illustrate Theorem 89(A); in these simple cases, it is easy to see that there is only one way to decorate the vertices of the given apgs with sets. A reasonable generalization of this theorem to all possible apgs is the Anti-Foundation Axiom: The Anti-Foundation Axiom (AFA). Every graph has a unique decoration. An immediate consequence is the following: 193 Proposition 90 (Uniqueness Theorem). Every apg is a picture of a unique set. A consequence of the Uniqueness Theorem is that the following apg uniquely determines a set: The unique way to decorate this graph is with a set whose only element is itself; such sets can be shown not to exist in any universe built from the standard ZFC axioms; in particular, existence of such a set contradicts the Axiom of Foundation. For this reason, we must consider this new axiom AFA only in the absence of the Axiom of Foundation. In the theory ZFC− +AFA, the Uniqueness Theorem does indeed hold true. The unique set pictured by this single-loop graph is usually denoted Ω. Single-loop Graph with Unique Decoration Ω We observe that, with regard to the single-loop graph shown above, AFA tells us two things: First, that the graph is a picture for some set; and second, that there is only one such set. This latter point is as important as the first. Without AFA, even if we assume that the single-loop graph above can be decorated with a set X = {X}, there is no guarantee that it is unique (there could be X and Y such that X = {X} and Y = {Y } and X 6= Y ). We have now used the symbol Ω in two different ways: as a solution to (+) and as the unique decoration for the single-loop graph. We show now that this ambiguous usage is justified. Theorem 91. (ZFC− + AFA). Let Ω be the unique decoration of the loop graph shown above. Then Ω is the unique solution to the equations (+); that is, Ω is the unique set satisfying Ω = {Ω} and Ω = ΩΩ . Proof. The fact that Ω = {Ω} follows from the structure of the apg for Ω: certainly the designated point of the single-loop graph is a child of itself, so it follows Ω ∈ Ω. But it is also clear that the designated point is its only child. Therefore, Ω is the only element of Ω, and so Ω = {Ω}. To see that Ω = ΩΩ , since Ω is the singleton set {Ω}, it follows there is exactly one map f : Ω → Ω, namely, the function defined by f (Ω) = Ω. The function f can be represented as a 194 set of ordered pairs, namely f = {(Ω, Ω)}. As we argued before (Proposition 88(B)), Ω = (Ω, Ω). It follows that f = Ω. Therefore, ΩΩ = {f } = {Ω} = Ω, as required. To see that Ω is the unique solution to the equations (+), note that any solution to these equations—even to the single equation x = {x}—is a decoration of the single-loop apg pictured above. By AFA, there is only one such decoration. Therefore, there is only one solution to (+). The Theorem, together with the fact that ZFC− + AFA is consistent whenever ZFC is consistent, shows that ZFC− is consistent with the equations (+), and so our earlier work in this section is fully legitimized. Before embarking on a discussion about how all real sets can be seen to arise from, and return to, the single non-well-founded set Ω, we spend some time exploring how a ZFC− + AFA universe is built. Building a ZFC− + AFA Universe Let V denote the usual universe of sets—a model of ZFC—as we have discussed in the main part of this paper. One can build, within V , a model V̂ of ZFC− + AFA. Moreover, any such model will have a well-founded part WF = WFV̂ (consisting of all the well-founded sets in V̂ ) that is isomorphic to the original ZFC model V : WF ∼ =V. In this way, we obtain the intuitive picture that a ZFC− + AFA universe is an expansion of the standard cumulative hierarchy of well-founded sets—an expansion consisting of well-founded sets together with ideal elements, which in the present context are the non-well-founded sets, like Ω. We take a moment to describe, at a high level, how a universe for ZFC− + AFA can be constructed. We begin with the usual well-founded universe V of ZFC. Roughly speaking, we wish to think of the sets of our new universe as being precisely the apgs that live in V . This isn’t quite right though because different (nonisomorphic) apgs can picture the same set, even in the well-founded case. For example, the following pictures of the (well-founded) set {0, 1} are nonisomorphic: 195 This situation is similar to the construction of the reals from the rationals—a first try is to declare that a real is a Cauchy sequence of rationals; the problem here also is that many Cauchy sequences converge to the same real, even Cauchy sequences that converge to a rational. The solution is to form a quotient by an appropriate equivalence relation. We wish to say that two apgs are equivalent if they picture the same set, but, since we have not yet constructed all the sets of our new universe, this statement of the relation is not formally correct (likewise, one would like to say that two Cauchy sequences are equivalent if and only if they converge to the same real, but this definition, for the same reason, is not formally correct). To capture this idea without assuming existence of the non-well-founded sets we are trying to build, the necessary equivalence relation is formulated in another way. We give a brief description of the required equivalence relation. Suppose G = (MG , EG ), H = (MH , EH ) are graphs. A bisimulation for G, H is any relation R ⊆ MG ×MH having the following properties: There is a relation R+ with R ⊆ R+ ⊆ MG × MH satisfying the following: For each a ∈ MG and b ∈ MH , aR+ b if and only if both of the following hold: (i) whenever a → x there is y ∈ MH with b → y and xRy (ii) whenever b → y there is x ∈ MG with a → x and xRy The equivalence relation that we will need is a maximum bisimulation; we will discuss this concept after giving an example. Example 1. Consider the two apgs mentioned earlier that picture the set {0, 1}; for this example, we will call them G and H. 196 We have labelled these graphs differently here to emphasize the fact that bisimulation relations make sense for any kind of directed graph, not just apgs. We define three bisimulations for G, H. The first is a trivial bisimulation, which is obtained by declaring that childless nodes are related to each other: R1 = {(b, f ), (d, f )}. The conditions for a bisimulation are satisfied vacuously because none of the vertices in the relation have children. The second bisimulation includes vertices that do have children: R2 = {(c, g), (d, f ), (b, f )}. The new pair (c, g) can be included because the respective children of c and g occur as pairs also. Notice that R02 = {(c, f ), (d, f ), (b, f )} is not a bisimulation since, although c → d, there is no d0 in H such that f → d0 . Also, R002 = {(c, g), (b, f )} is not a bisimulation because, though g → f in H, there is no corresponding child of c that is paired with f in the relation. The third bisimulation includes all the vertices. R3 = {(a, e), (c, g), (d, f ), (b, f )}. It is not hard to see that there is only one bisimulation that includes all the vertices; it is called the total bisimulation, or the maximum bisimulation. Two graphs that admit such a bisimulation are called bisimilar. Considering these graphs as apgs (with respective designated points a and e), it is apparent in this example that, although the graphs are not isomorphic, the membership structures that they specify are the same; it is clear in this case that these apgs must picture the same set. Bisimilarity can be shown to be an equivalence relation on directed graphs. Moreover, the discussion in the example gives some idea about why this equivalence relation is the one we are seeking: When we consider apgs as displays of potential membership structure, it seems intuitively clear that whenever apgs are bisimilar, the membership structure of both apgs is the same, so they should picture the same set. Using this equivalence relation ≡ on apgs, we wish to form the collection of resulting equivalence classes. Since typically each equivalence class will itself be a proper class, we use a familiar technique (known as Scott’s trick) to reduce their size: we take a representative r from each such class having least rank and we let [r] denote the set of all apgs equivalent to r and having the same rank as r; and finally, we let V̂ denote the collection of all such reduced equivalence classes [r]. We have described the “sets” of our new univere. We also need to specify the “membership relation.” Suppose [r], [s] ∈ V̂ . Since r, s are pointed graphs, we may write r = (G, pG) and s = (H, pH ), where pG and pH are the designated points of the apgs. We declare that [r] is a “member of” [s] if there is a vertex q in H with pH → q such that the sub-apg Hq of H determined by q (defined as: Hq = {h ∈ MH | there is a path in H from q to h}) is equivalent to G: ∃q ∈ MH Hq ≡ G. 197 We give a simple example: Example 2. Consider the following apgs: We observe that [G] is an “element” of [K], according to our new definition. We show that G is equivalent to a sub-apg of K whose designated point is a child of u. Here, there is only one way for this to happen since u has only one child, namely, e. Clearly, the sub-apg of K whose designated point is e is precisely the apg H of Example 1, which, as we have seen, is indeed equivalent to G. Therefore, [G]“∈”[K]. Of course, this is what we expect since G is a picture of {0, 1} and K is a picture of {{0, 1}}. It can be shown that the equivalence classes belonging to V̂ that contain well-founded trees correspond exactly to the sets in the original universe V ; more precisely, if we let WF = {[r] ∈ V̂ | r is well-founded}, then V ∼ = WF. The Ideal Elements of V̂ As Solutions to Equations This viewpoint that a ZFC− + AFA universe is an expansion of the usual universe V of well-founded sets by ideal elements is familiar from other areas of mathematics, such as the expansion of the reals to the complex numbers, through the introduction of an “ideal” solution to the equation x2 + 1 = 0. In fact, one can also view the expansion from V to V̂ as arising from the introduction of “ideal” solutions to—in this case—classes of equations. One such equation, as we have seen, is x = {x}. Another is x = (0, x). An example of a small system of such equations is: x = {0, x, y} y = {{x}} We can give a precise formulation of the relevant equations as follows: By analogy with the expansion from R to C, we need to introduce indeterminate elements. To take the step from R to C, we first need to obtain the domain R[x] of polynomials in the indeterminate x; then x2 + 1 ∈ R[x] is an expression for which we seek a root; moreover, any root will be an expression that does not implicitly contain the indeterminate x. Likewise, we will expand V with a class X of indeterminates, one for each set in V : X = {xa : a ∈ V }.111 And the “polynomials” we 111 More formally, we are re-building the universe starting with atoms or urelements at the 0th stage. Urelements 198 obtain—which we will call complex sets—are sets built up from other sets together with elements of X. As a simple example, consider the complex set A(xa , xb) = {0, {1, xa}, (xb, 2)}, where xa , xb ∈ X. One of the equations that we wish to be able to solve is xa = A(xa , xb). A solution to such an equation will be a set whose build-up does not contain any of the elements of X; such a set is called a pure set. The Solution Lemma, which is equivalent in ZFC− to AFA, gives a precise statement of classes of equations that we wish to consider, and asserts that any such system of equations always has a unique solution. Theorem 92 (Solution Lemma). Suppose Ax is a complex set, for each x ∈ X. Then the system of equations (∗∗) x = Ax has a unique solution; that is, there is a unique family {bx | x ∈ X} of pure sets such that for each x ∈ X, and each indeterminate xa occurring in Ax , if we replace in Ax each such occurrence of xa with bxa , and if we denote the resulting set Bx , then, for each x ∈ X, bx = Bx . Therefore, a ZFC− + AFA universe is obtained as a universe that includes the well-founded sets and provides unique solutions to all equations of the form (∗∗). We now return to our discussion about the relationship between Ω and all sets in the standard universe V . Ω As the Underlying Reality of All Sets In the first part of this Appendix, we suggested that a mathematical model of pure consciousness, as the ultimate reality, unfolding within itself and knowing itself and itself alone, could be obtained by finding a solution to the equations xx = x = {x}. We found that, although no such solution can be found in the standard universe of sets, there is one, and only one, solution (the set Ω) if we expand the universe with ideal elements—non-wellfounded sets—by working in the theory ZFC− + AFA. Next, we will indicate how the dynamics inherent in this solution Ω give rise to an “unmanifest” version of all mathematical objects. We will see how this unmanifest “cosmic counterpart” to each mathematical object behaves exactly as the mathematical object itself does, but the transcendental reality of the mathematical object in its cosmic counterpart is not lost in the expression of its details. are sets, different from the empty set, that have no elements. 199 Even without introducing the set Ω, the standardard universe V can already be seen as an expression of a single reality: the empty set. As we have seen, V is built in stages V0 , V1 , V2, . . ., where V0 = ∅, V1 = P(V0 ) = {∅}, V2 = P(V1 ) = {∅, {∅}}, . . . 112 Notice that in these first few stages, every set can be seen as a pattern of set braces ({, }) and the empty set ∅. The universe V is obtained as the union of all these stages: [ V = Vα = V0 ∪ V1 ∪ V2 ∪ · · · α∈ON The observation we made regarding the first few stages continues to hold throughout the construction: Each stage consists of sets which can be seen as just patterns of set braces ({, }) and the empty set ∅. Moreover, the Axiom of Foundation guarantees that, for any set A ∈ V , there is a finite sequence of sets a0 , a1, . . . , am satisfying: (#) a0 = ∅, am = A, and a0 ∈ a1 ∈ · · · ∈ am . In this way, every set can be seen as being “made of” the emptyset, and it is always possible to trace each set back to its “origin” in ∅. These facts already provide a profound analogy to the ancient wisdom concerning the structure of the universe: Everything is, from this point of view, an expression or precipitation of one reality, pure consciousness, and everything has, deep within its nature, the full value of pure consciousness. Another insight from the ancient perspective, which the present analogy fails to capture, is that everything is “nothing but” pure consciounsness; that everything is nothing but the dynamics of pure consciousness. The reality is one; differences arise as a point of view, a way of looking at or conceiving, this one reality. We do not see a parallel to this profound insight and potential realization as we consider the dynamics of the empty set and set formation. What we find instead is that, even though the empty set is at the core of every set, differences among sets in the universe dominate. The entire enterprise of modern mathematics relies on the fact that sets that do not have precisely the same elements are different sets. In no mathematical sense can it be said, for example, that {1, 2} and {0, 4, 9} are “the same.” The fact that these sets have a common source in the empty set is a hidden reality, not a “living” reality. The reason that this fundamental unity is not seen “on the surface” of mathematics is, we suggest, because of the fact that even the dynamics of ∅ are hidden from view: How does the empty set actually “give rise” to anything? Even to the set {∅}? Although the very concept of set formation naturally leads us from ∅ to {∅}, this concept is not in any mathematical way present in ∅; it is, rather, added “from the outside” to propel ∅ to {∅}. This kind of transformation is in sharp contrast to the dynamics of pure consciousness, from the ancient perspective. On that 112 For this discussion, we have suppressed the fact that limit stages of the construction are obtained by forming the union of all previous stages; the logic of the discussion remains the same whether or not this technical point is included. 200 view, the dynamics that propel pure consciousness to go through its inner transformations are the dynamics of consciousness alone; indeed, they are nothing but consciousness. Nothing “outside” of pure consciousness impels it to move or to know itself. Indeed, from this point of view, there is nothing else. A more suitable mathematical analogue to pure consciousness and its dynamics is the set Ω, which we have introduced in this appendix. The difference between the empty set and Ω is captured nicely in contrasting their respective apgs: In the single-loop graph, we see a picture (and apg) of a self-relationship, an inherent dynamism between the vertex and itself. Moreover, the starting point, the target, and the relationship between them are, when realized as a set, one and the same: We have Ω : Ω → Ω, with Ω(Ω) = Ω; indeed, Ω(x) = Ω for every x ∈ Ω. In this way, Ω succeeds as a model of pure consciousness in ways that ∅ cannot. One way to view the single-loop graph that pictures Ω is as a kind of refinement of the single-vertex graph that pictures ∅, in the sense that the self-referral dynamics that one may imagine are “hidden” deep within the empty set have been brought into plain view. The edge from the single vertex to itself that is added to the single-vertex graph can be seen as symbolic of “re-connecting” the point to itself. This viewpoint provides a new way of viewing all sets in the standard ZFC universe V . In V , all sets are seen as distinct and unrelated, even though the “core” of every set is always simply the empty set (as we discussed earlier). We can use our insights about the relationship between ∅ and Ω to explicitly re-connect every set to its “source” in the following way. First, let us observe that every well-founded set can be pictured by an apg that has exactly one childless vertex, and in every case, this childless vertex is decorated with the empty set. This is true because, for every set A, by the Axiom of Foundation (as we have observed), every maximal ∈-chain starting at A is finite and terminates in ∅, as in the display (#) mentioned earlier and reproduced here: ∅ ∈ . . . ∈ ai ∈ . . . ∈ A. Therefore, as in the different apg pictures of the set {0, 1} discussed earlier, all edges pointing to ∅ can be directed to a single node decorated with ∅, and this node is necessarily childless (since ∅ has no elements). We shall call any such picture of a well-founded set a canonical picture of the set. Then, to reconnect any well-founded set A to its source, we can simply add one edge to a canonical picture from the vertex decorated with ∅ to the designated point, decorated with A. We will call the edge that it is added in this way the reconnecting edge; notice that there is, for 201 any canonical picture of a well-founded set, just one reconnecting edge. Here is an example: In the example, we begin with a canonical picture of the well-founded set {0, 1}; recall that 0 = ∅ and 1 = {∅}, so the graph on the left is also a canonical picture of {∅, {∅}}. Its unique childless node is located in the lower left of the picture, labelled by 0 = ∅. The middle graph shows what happens when we reconnect this node to its source by adding an edge from 0 to {0, 1}. The decoration will necessarily change because of the addition of the reconnecting edge (so no decoration is shown in the middle graph). Finally, in the third apg, we attempt to decorate the graph in the middle with some set. Certainly Ω can be used, as one may easily verify. But now because every apg has a unique decoration, the only way to decorate this middle graph is by placing Ω at every vertex. What the example shows is that, by adding the reconnecting edge to a canonical apg of a well-founded set, everything about the set, including its elements and internal relationships, reveal themselves to be nothing but Ω. This captures in a mathematical way the vision of Vedanta, the reality of Brahman Consciousness according to Maharishi Mahesh Yogi, in which every object is recognized to be nothing but pure consciousness. Though differences remain, what dominates perception is the reality of pure consciousness in every detail. This same approach can be taken for every well-founded set, as the following theorem shows: Theorem 93 (The Ω-Theorem). An apg is a picture of Ω if and only if every vertex of the apg has a child. By the Ω-Theorem, the “fundamental reality” underlying each well-founded set can be discovered by adding a single reconnecting edge to its canonical picture, from the vertex labelled with 0 to the designated point of the apg. We observe that, in particular, one can apply this procedure to every stage Vα of the universe, and so the “reality” of every stage of V is seen to be Ω itself. Finally, one may form the union of all stages, as one does to obtain V itself, and what one gets after adding the reconnecting edge to canonical apgs of each Vα and then forming the union of all these refined stages is, once again, Ω. One thereby sees that the “reality” of the entire universe is nothing other than Ω. Returning to our example for a moment, notice also that by removing the reconnecting edge, the original apg, and hence the original set, comes back into view. For example, removing the reconnecting edge in the example just displayed, Ω is transformed back to the set {0, 1}. This same maneuver can be applied to any canonical apg that has a reconnecting edge. If we form a collection U of all apgs having a single vertex decorated with ∅ and for which there is a reconnecting edge, one could say that every well-founded set “arises from” the act of removing the reconnecting edge and thereby breaking the apg’s (and the set’s) connection to its source. This is a kind of “symmetry-breaking” by which the “perfectly balanced” state of Ω is disrupted through the removal of one edge from one of Ω’s representations as an apg. In this sense, Ω is 202 seen as the source of all sets, arising from a broken symmetry in the perfectly balanced state represented by Ω. These observations suggest that all sets have their fundamental underlying reality in Ω; that the viewpoint that sets are discrete unconnected expressions is a mirage, just a peculiar way of looking at the one reality Ω. One could say that the final truth about sets is that each is a unique expression of one reality, but both the “essence” of the set and its individual characteristics originate in the dynamics of the ideal set Ω. Unlike the empty set, which is in a sense the “core” of every set, Ω embodies the dynamics of each set, at an unmanifest level. As an example, the set (∅, {∅}), after adding the reconnecting edge to its canonical apg, becomes (Ω, {Ω}). This modified version is built up from Ω in exactly the same way the original set was built up from the empty set, and so retains all the unique features of the original set, except that the new version is nothing other than Ω: (Ω, {Ω}) = Ω. All the differences that distinguished the original set have become, as if, transparent, revealing the true nature of the set as Ω. In this sense, Ω plays the role of Maharishi’s Absolute Number, as described at the beginning of this paper, revealing the ultimate nature of all numbers as well as all sets, without obliterating the differences that distinguish each from another.113 113 See p. 5. 203 References [1] P. Aczek. Non-Well-Founded Sets. in CSLI Lecture Notes, #14. Center for the Study of Language and Information. 1988. [2] S. Awodey. Category Theory, Second Edition. Oxford University Press. 2011. [3] A. Blass. Exact functors and Measurable Cardinals. Pacific Journal of Mathematics. (1976) 63(2), pp. 335–346.. [4] A. Blass. Combinatorial cardinal characteristics of the continuum. In Handbook of Set Theory. Eds. A. Kanamori, M. Foreman. Springer. 2010. pp. 393–489. [5] S.J. Coppleston. History of philosophy, volume 1: Greece & Rome. Image Books. 1962. [6] P. Corazza. The Wholeness Axiom and Laver Sequences. Annals of Pure and Applied Logic. (2000) 105, pp. 157–260. [7] P. Corazza. The Spectrum of Elementary Embeddings j : V → V Annals of Pure and Applied Logic. (2006) 139, pp. 327–399. [8] P. Corazza. The Wholeness Axiom. In Consciousness-Based Education: A Foundation for Teaching and Learning in the Academic Disciplines, Volume 5: Consciousness-Based Education in Mathematics, Part I. Eds. P. Corazza, A. Dow, D. Llewelyn, C. Pearson. MUM Press. 2011. [9] P. Corazza. Vedic Wholeness and the Mathematical Universe: Maharishi Vedic Science As a Tool for Research in the Foundations of Mathematics. In Consciousness-Based Education: A Foundation for Teaching and Learning in the Academic Disciplines, Volume 5: ConsciousnessBased Education in Mathematics, Part II. Eds. P. Corazza, A. Dow, D. Llewelyn, C. Pearson. MUM Press. 2011. [10] P. Corazza. The Axiom of Infinity and Transformations j : V → V . The Bulletin of Symbolic Logic. (2010) 16, pp. 37–84. [11] P. Dehornoy. Braids and Self-Distributivity. Birkhäuser. 2000. [12] R. Dougherty, T. Jech. Finite Left-Distributive Algebras and Embedding Algebras. Advances in Mathematics. (1997) 130, pp. 201–241. [13] F. D’Olivet, N.L. Redfield. Golden Verses of Pythagoras (English and French Edition). Kessinger Publishing, LLC. 1992. [14] F. Drake. Set theory: An introduction to large cardinals. American Elsevier Publishing Co., New York, 1974. [15] A. Enayat, J. Schmerl, A. Visser. ω-Models of Finite Set Theory. In Set Theory, Arithmetic, and Foundations of Mathematics: Theorems, Philosophies. Eds. Kennedy, J. and Kossak, R.. Cambridge University Press. United Kingdom. pp. 43–65. 204 [16] G. Feng, J. English. Tao Te Ching, 25th Anniversary Edition. http://www.duhtao.com/translations.html. Vintage. 1997. [17] M. Hallett. Cantorian Set Theory and the Limitation of Size. Clarendon Press. 1988. [18] K. Hrbacek, T. Jech. Introduction to set theory. Marcel Dekker, New York, 1999. [19] T. Jech. Set Theory: The Third Millenium Edition, Revised and Expanded. Springer-Verlag. Berlin, Heidelberg, New York. 2003. [20] K. Kunen. Elementary Embeddings and Infinitary Combinatorics. The Journal of Symbolic Logic. (1971) 36, pp. 407–413. [21] K. Kuratowski. Topology, Volume 1. Academic Press. 1966. [22] R. Laver. On the Algebra of Elementary Embeddings from a Rank into Itself. Advances in Mathematics. (1995) 110, pp. 334–346. [23] W. Lawvere. Adjointness in Foundations. Dialectica. (1969) 23. pp. 281–296. [24] W. Lawvere. Sets for Mathematics. Cambridge University Press. 2003. [25] Maharishi Mahesh Yogi. Maharishi’s Absolute Theory of Defence. Maharishi Vedic University Press. Vlodrop, The Netherlands. 1996. [26] S. Mac Lane. Categories for the Working Mathematician, Second Edition. Springer. 1978. [27] Maharishi Mahesh Yogi. Maharishi’s Absolute Theory of Government. Maharishi Prakashan. India. 1995. [28] Maharishi Mahesh Yogi. Vedic Knowledge for Everyone. Maharishi Vedic University Press. Vlodrop, The Netherlands. 1994. [29] Maharishi Mahesh Yogi. Maharishi Vedic University, Bulletin for 1986-1988. MERU Press. West Germany. 1986. [30] A. Miller. Special Subsets of the Real Line. In Handbook of Set-Theoretic Topology. Eds. K. Kunen, J.E. Vaughn. North-Holland. 1984. [31] Y. Moschovakis. Descriptive Set Theory. North-Holland. 1980. [32] G.R.S. Mead. Orpheus. John W. Watkins. London. 1965. [33] R. Rosebrugh, R.J. Wood. An Adjoint Characterization of the Category of Sets. Proceedings of the American Mathematical Society. (1994) 122. pp. 409-413. [34] Saunders Mac Lane. Categories for the Working Mathematician. Springer-Verlag. New York. 1971. [35] Thomas Taylor. The Works of Plato. AMS Press. New York. 1972. 205 [36] R.K. Wallace. The Physiology of consciousness. Maharishi International University Press. Fairfield, IA. 1993. [37] H. Weber. Leopold Kronecker. Mathematische Annalen. (1893) 43, pp. 1–25. 206

The Magical Origin of the Natural Numbers

Related documents

Products

Support

The Magical Origin of the Natural Numbers

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib