The Magical Origin of the Natural Numbers

Paul Corazza
Department of Mathematics
Maharishi University of Management
Presented at the Annual Faculty Research Seminar
April 26, 2013, Maharishi University of Management, Fairfield, IA
DRAFT Version 6: September 12, 2015
Abstract: The usual treatment of the set N of natural numbers in foundational
theories, particularly ZFC, is simply to assert axiomatically that “N exists,” but not to
provide a perspective about how the sequence of natural numbers might originate. We
review the perspective of the ancient wisdom, particularly Maharishi Vedic Science,
which considers knowledge to be fundamentally incomplete when discrete values are
understood only in terms of their separateness and not in terms of a unified source.
We argue that the usual—incomplete—way of treating the natural numbers in set
theory, in the form of the Axiom of Infinity, has made certain modern-day problems
about the infinite in mathematics more difficult to solve, particularly the well-known
Problem of Large Cardinals. We suggest in this paper a way to reformulate the
Axiom of Infinity so that the natural numbers are seen to emerge as precipitations
or side-effects of more fundamental self-referral dynamics of an unbounded field. We
observe that this New Axiom of Infinity has characteristics that naturally generalize
to stronger forms, which can be used—and, in fact, have been used—to solve the
Problem of Large Cardinals.
Contents
1 The Role of N in Mathematics
2 The Maharishi Vedic Science Perspective
3 The Origin of N According to Modern Mathematics
4 A Plan for a New Axiom of Infinity
5 Dedekind-Infinite Sets and Dedekind Self-Maps
6 A New Axiom of Infinity
7 Obtaining the Well-Ordered Set W
8 Generating the Natural Numbers with the Mostowski Collapse
9 The Self-Map s : ω → ω As the Smallest Among All Dedekind Self-Maps
10 Co-Dedekind Self-Maps, Dedekind Maps, and co-Dedekind Maps
10.1 Dedekind Maps and co-Dedekind Maps
11 Blueprints Generated by a Dedekind Self-Map
12 A First Look at Dedekind Self-Maps of the Universe
13 The Classes ON and V
14 The Class of Natural Numbers
15 Strengthening Proper Class Dedekind Self-Maps
16 The Relationship Between the Different Notions of Critical Point
17 Class Dedekind Self-Maps and Functors
18 Emergence of the Infinite in the Universe
19 The “Naturalness” of the New Axiom of Infinity
20 Tools for Generalizing to a Context for Large Cardinals
20.1 The structure of a Dedekind self-map
21 Introduction to Large Cardinals
22 Some Strengthenings of Dedekind Self-Maps to Account for Large Cardinals
23 The Theory ZFC + BTEE + MUA
24 The Theory ZFC + WA
24.1 Restrictions of a WA-Embedding and Its Critical Sequence
24.2 A Connection Between the Critical Sequence of a WA-Embedding and Mathematical Physics
24.3 The WA Blueprint and the Ancient View of Blueprints
25 Conclusion
26 Appendix I: A First Look at How 1 Emerges from 0
27 Appendix II: Unfoldment of the Universe Within the Unmanifest
The set N = {0, 1, 2, 3, . . .} of natural numbers has always been recognized as playing a vital
role in the development of mathematics. Historically, though, mathematics has not felt the need
to examine the more philosophical question of the origin of the natural numbers—what might
be their source?—though, of course, the fundamental axioms (the ZFC axioms) do provide some
information about it since the natural numbers, like the rest of mathematics, are derived from
ZFC. In this paper we will offer an alternative formulation of the ZFC axiom that declares the
existence of an infinite set, the Axiom of Infinity. Our new Axiom of Infinity, though mathematically equivalent to the original version, views the set of natural numbers not as simply “given,”
as is usually assumed, but rather as emerging from self-referral dynamics of an unbounded field.
We will argue that this alternative perspective provides advantages and insight in addressing
issues having to do with the mathematical infinite—most importantly, the century-old Problem
of Large Cardinals.
1 The Role of N in Mathematics
Prior to the emergence of the ZFC axioms as the standard foundation for mathematics, mathematicians, at least in an intuitive sense, viewed the natural numbers as the source of all of
mathematics. Nineteenth-century German mathematician Leopold Kronecker made the often-quoted statement [37], “God made natural numbers; all else is the work of man.” In an attempt
to identify a “starting point” for establishing results about the natural numbers, Italian mathematician Giuseppe Peano, building upon earlier work of Peirce and Dedekind, formulated a set
of axioms for the natural numbers and their fundamental properties; his axioms are known today
as Peano Arithmetic, or PA.
With the advent of ZFC, the prominent role of N in mathematics appeared to fade somewhat,
since now both N and each particular natural number were to be seen as special kinds of sets, built
from the ZFC axioms. However, it was observed in subsequent decades that ZFC is equivalent
to Peano Arithmetic together with the statement that all the natural numbers can be collected
together into a single completed set:
ZFC ⇔ PA + the natural numbers 1, 2, 3, . . . form a set.1
One may therefore conclude that, even from the perspective of modern mathematics, all
of mathematics can be derived from the natural numbers and the knowledge that the natural
numbers form an infinite set.
Footnote 1: Recent work [15] in the model theory of PA suggests that a more accurate formulation of the equivalence here
is that PA is equivalent to the theory ZFC − Infinity + ¬Infinity + Trans, where Trans asserts that every set is
included in a transitive set. (A set X is transitive if, for all y, z, if y ∈ X and z ∈ y, then z ∈ X.)
2 The Maharishi Vedic Science Perspective
In 1994–95, Maharishi gave a series of lectures on his Vedic Mathematics and Absolute Number.
In one such lecture (which the author had the privilege of attending), he made the following
remarks:2
All of creation comes from the natural numbers,
and, in the same lecture:
To view the natural numbers only as separate quantities, unconnected to their source,
is the beginning of ignorance.3
Certainly the view that Maharishi warns against in the second of these quotes—that the
natural numbers 0, 1, 2, . . . are nothing more than conceptual devices, without a common
source, to serve the practical need of counting items in the world around us—is the common view,
the view that one learns in school, and a view that one would have little occasion to reflect upon
or call into question.
To remedy this incomplete view of the natural numbers, Maharishi introduced his Absolute
Number, the ultimate wholeness underlying all numbers and all creation [25, pp. 625–626]—the
field of “infinite dynamism on the ground of infinite silence” [25, p. 620]. Appreciating each
number in terms of its source in the Absolute Number, as Maharishi explains, restores the full
dignity of each number as both an individual expression with its own unique characteristics, and
as an expression of infinite wholeness. He says [25, pp. 614-15]:
The ever-expanding value of the universe, in terms of an infinity of numbers, is the
natural characteristic feature of the Absolute Number, which enables all numbers
to function from their common basis. It is this effect of the Absolute Number on
all numbers that actually initiates and maintains order in the ever-evolving infinite
diversity of the universe.
He also explains [25, p. 633]:
Without the support of the Absolute Number the study of modern mathematics can
never account for the total reality—which is infinite—because it [mathematics] proceeds in steps. This is the time for the mathematics of finite number, of finite steps, to
peep into the reality of the Absolute, and this Absolute Number represents complete
knowledge, the Veda.
When the natural numbers are able to realize their “full potential” on the ground of the
Absolute Number, one may wonder, in what way would they be different from the natural numbers
as they are commonly understood today? Maharishi has elaborated on the significance of several
natural numbers, indicating that their true nature is more of a universal principle than mere
distinct quantities. He describes the number 1 as unity, an eternal continuum [25, p. 613], from
which all the other natural numbers emerge. He has described the number 2 as the view of
wholeness in which wholeness assumes the role of subject and object, infinite silence and infinite
dynamism, intelligence and existence, Purusha and Prakriti [25, p. 630]. He has described the
number 3 as the fundamental structure of unity, in terms of Rishi, Devata, and Chhandas [25, p.
630]. In these examples we see that each number has its own character but at the same time gives
expression to, and remains connected to, fullness. It is this more expanded view of the nature of the
natural numbers, it would seem, that makes it possible for them to truly give rise to everything
in the universe.
Footnote 2: These statements are quotes taken from the author's notes from the lecture and do not, as far as we know,
appear in any of Maharishi’s published works; however, similar statements do appear in his Absolute Theory of
Defence [25].
Footnote 3: Elaborating further on the loss to life that occurs when individuals are disconnected from their source,
Maharishi remarks, “This segregation of the individual from the cosmos is very unnatural. . . When the connectedness of
individual life with Cosmic Life is damaged, individual intelligence remains disconnected from its own cosmic value.
It remains like a bud without flowering” [28, pp. 200–201].
The view that the natural numbers have a deeper, universal significance has been expressed
in many ancient traditions of knowledge. In the West, Pythagoras and his school maintained that
all things in the universe are, fundamentally, natural numbers. “All is number” is an expression
attributed to this school.4 Pythagoras also maintained [13, p. 137] that at the basis of all natural
numbers is a “Number of numbers,” an ultimate source of all numbers, something Divine in
nature. Likewise, the Tao Te Ching, one of the deepest treasures of China’s ancient wisdom, sees
unity, one, giving rise to two, two giving rise to three, and three giving rise to all things [16, v.
42]; and the source of all number is Tao, wholeness. There is even an admonition in the Tao
Te Ching to appreciate all individual expressions in terms of their source—that the role of Tao
in relation to the natural numbers, and to all particular expressions in the universe, is similar to
that of Maharishi’s Absolute Number.5
3 The Origin of N According to Modern Mathematics
It is natural to wonder to what extent the common view of the natural numbers is truly the
mathematical view. All mathematics arises from set theory, so we can look to the axioms of set
theory to see to what extent the commonly held restricted view of the natural numbers is present
even in the foundation of mathematics.
Footnote 4: A discussion of the Pythagorean school can be found in [5, Part I].
Footnote 5: In verse 52 [16], we read:
The beginning of the universe is the mother of all things.
Knowing the mother, one also knows the sons.
Knowing the sons, yet remaining in touch with the mother, brings freedom from the fear of death.
The passage emphasizes the importance of individual expressions remaining connected to, and being appreciated
in terms of, their source.
Among the axioms of ZFC6 (Zermelo-Fraenkel set theory with the Axiom of Choice), there
is just one axiom that talks about infinite sets; this axiom is called the Axiom of Infinity. Historically,
this axiom was considered to be essential because of the work of Cantor, who showed
that a rigorous formulation of a number system as fundamental as the real number line would
not be possible without the notion of infinite sets; in particular, it is necessary to conceive of
the natural numbers 0, 1, 2, 3, . . . as elements of a single, completed set, which we denote, in
modern times, by N [17].
Footnote 6: In this paper, we will often switch between the theories ZFC and ZFC without the Axiom of Infinity: ZFC −
Infinity. In the latter case, the theory that actually is needed (in light of footnote 1) is ZFC − Infinity + Trans.
To avoid becoming too involved in fine points, in this paper we simply assume that Trans is one of the standard
axioms of ZFC. With this assumption, ZFC − Infinity automatically also includes Trans. For reference, we list
our version of the ZFC axioms here. A functional formula is a formula φ(x, y, z1, . . . , zk) with the property that,
for any sets a1, . . . , ak, whenever t, u1, u2 are sets and both φ(t, u1, a1, . . . , ak) and φ(t, u2, a1, . . . , ak) hold, then
u1 = u2.
(Empty Set) There is a set with no element.
(Axiom of Infinity) There is an inductive set.
(Axiom of Extensionality) Two sets are equal if and only if they have the same elements.
(Pairing Axiom) For any sets x, y, the collection {x, y} is also a set. More precisely, for all x, y, there
is z such that the only elements of z are x and y.
(Union Axiom) The union of any set of sets is again a set. More precisely, for all x, there is a z such
that z = ∪x.
(Power Set Axiom) The collection of all subsets of a set is again a set. More precisely, for all x, there
is z such that, for all u, u ∈ z if and only if u ⊆ x.
(Foundation) Every nonempty set has an ∈-minimal element. In other words, for every nonempty x, there
is y ∈ x such that for all z ∈ y, z ∉ x.
(Separation) For every formula φ(x, z1, . . . , zk), every set A, and all sets a1, . . . , ak the collection
{y ∈ A | φ(y, a1, . . . , ak)} is a set.
(Replacement) For any functional formula φ(x, y, z1, . . . , zk), any set A, and any sets a1, . . . , ak, the
collection {v | ∃u ∈ A φ(u, v, a1, . . . , ak)} is a set.
(Choice) For any set X of nonempty sets, there is another set Y containing an element of each of the
elements of X.
(Trans) For every set X there is a transitive set Y such that X ⊆ Y.
In coming up with a precise statement of the Axiom of Infinity, the early founders of set
theory had to decide how to express the idea that “an infinite set exists” or “there is a set whose
elements are the natural numbers” in the language of set theory. In the language of set theory,
everything is taken to be a set, but at the time the axioms were being formulated, the natural
numbers themselves were not usually thought of in this way. To meet the need of representing
natural numbers as sets and asserting that there is a set that contains all natural numbers, the
early crafters of the axioms settled on the notion of an inductive set [19, 18].
Therefore, the Axiom of Infinity as it is known today asserts the existence of an inductive
set. A set S is inductive if it satisfies two properties: (1) S contains the empty set ∅, and (2)
for any set x belonging to S, the set s(x) = x ∪ {x} also belongs to S. The natural numbers
are then defined to be precisely those sets that belong to all inductive sets; the set of all natural
numbers is denoted by the Greek letter ω (“omega”). This definition established ω, the set of
natural numbers, as the smallest inductive set. This approach to the natural numbers provides
a formal way of declaring that the natural numbers have the following definition:
0 = ∅
1 = {0}
2 = {0, 1}
3 = {0, 1, 2}
⋮
n + 1 = s(n) = n ∪ {n} = {0, 1, 2, . . . , n}
⋮
The function s is fundamental to the definition of the natural numbers; it tells us, given any
number n, which number follows n in sequence. The usual way of defining s is by s(n) = n + 1,
but in the context of pure sets, this definition becomes s(x) = x ∪ {x}.7
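The construction of the natural numbers from ∅ by repeated application of s can be illustrated with a short Python sketch. It assumes that finite sets are modeled by nested frozenset values; the names successor and von_neumann are illustrative and do not come from the paper.

```python
def successor(x: frozenset) -> frozenset:
    """Set-theoretic successor s(x) = x ∪ {x}."""
    return x | frozenset({x})


def von_neumann(n: int) -> frozenset:
    """Build the set representing n by applying s to the empty set n times."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


zero, one, two, three = (von_neumann(n) for n in range(4))

# Each number is the set of all smaller numbers.
assert one == frozenset({zero})
assert two == frozenset({zero, one})
assert three == frozenset({zero, one, two})

# s adds exactly one new element, so the set representing n has n elements.
assert all(len(von_neumann(n)) == n for n in range(8))
```

Only finitely many stages can be built this way; collecting all of them into the single set ω is exactly what the Axiom of Infinity provides.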
Viewing the natural numbers in this way—as simply the sets ∅, {∅}, {∅, {∅}}, . . ., or even as
the “smallest inductive set”—provides no clue that there might be, even within the field of mathematics itself, an origin to the natural numbers; that the natural numbers might reasonably be
seen as emerging from some source—a source that could be considered a mathematical analogue
to Maharishi’s Absolute Number.
4 A Plan for a New Axiom of Infinity
We will argue that failure to recognize such an origin has resulted in an unnecessarily limited
view of the nature of the mathematical infinite. One of the ongoing problems in foundations of
mathematics has been to discover axioms that could be added to the ZFC axioms to provide an
axiomatic foundation for certain extremely large sets that have arisen in mathematical practice,
known as large cardinals.8 A natural place to look for answers to questions about the “infinite”
in mathematics (such as “Which, if any, large cardinals exist; that is, which should be taken
as valid objects in the universe of mathematics?”) is the Axiom of Infinity. As we have seen,
the Axiom of Infinity tells us little more than that an infinite set exists, and that, in particular,
there is a set ω consisting precisely of the set-versions of the natural numbers. Surprisingly, the
properties of ω do already suggest the naturalness of some of the smaller large cardinals, but
these properties are not nearly deep enough to motivate or justify the existence of the stronger
large cardinal notions.
Footnote 7: Note that s, acting on any input x, has the effect of producing a new set x ∪ {x} that includes all elements of
x together with one additional element, x itself. It should be observed that the set x ∪ {x} always consists of one
more set than x itself because x and {x} are disjoint. They are disjoint because, in the universe of sets, no set is
a member of itself: it is never the case that x ∈ x.
Footnote 8: Cantor showed, at the end of the 19th century, that there are different sizes of infinite sets; every infinite set
has one of these sizes. But shortly after his discovery, certain types of infinite set emerged that were so enormous,
they were very difficult to classify, and, years later, it was shown that such infinities could not actually be proven to
exist at all from ZFC. These notions of infinity are known today as large cardinals. What is surprising about these
large cardinals is that they have appeared as key elements in the solutions of a wide range of research problems
in mathematics; moreover, despite concerted efforts by many early set theorists, no one has ever proved that large
cardinals do not exist. The so-called “Problem of Large Cardinals” is the problem of adding to the standard
axioms of ZFC one or more axioms that could be used to derive the known large cardinals. The need here is to
find naturally motivated axioms that could be truly considered foundational axioms for mathematics, in the same
spirit as the ZFC axioms themselves. See [6, 7, 9].
Our aim in this paper is to take seriously the proposal that the natural numbers do indeed
have a source that can be represented naturally in the axiomatic foundation of the natural numbers.
We will suggest an alternative to the Axiom of Infinity that asserts, in the language of set
theory, the existence of “transformational self-referral dynamics of an unbounded field”; we will
see that a consequence of these dynamics is the sequential emergence of the natural numbers, but
the Axiom of Infinity itself will be concerned simply with the existence of the underlying transformational
dynamics. We will suggest that this new viewpoint concerning the formulation of
the Axiom of Infinity provides strong motivation for a solution to the Problem of Large Cardinals.
5 Dedekind-Infinite Sets and Dedekind Self-Maps
Among the many early definitions of “infinite set” that were considered, as the axioms of set
theory were being formulated, a notion of infinity that did not rely on the sequence of natural
numbers was Dedekind-Infinite sets, named after the mathematician, Richard Dedekind, who
proposed the idea. We state the formal definition here:
Definition 1 (Dedekind-Infinite Sets). A set A is Dedekind-Infinite if it can be put in 1-1
correspondence with a proper subset9 of itself. In other words, A is Dedekind-Infinite if there is
a 1-1 and onto function f : A → B where B is a proper subset of A.
Footnote 9: A subset B of a set A is a proper subset if B ≠ A.
An easy example of a Dedekind-Infinite set is A = {1, 2, 3, . . .}. Here, an example of a proper
subset B of A that can be put in 1-1 correspondence with A is given by B = {2, 3, 4, . . .}, with
correspondence given by f (n) = n + 1.
Associated with every Dedekind-Infinite set A is a corresponding self-map j : A → A. For
instance, if A is Dedekind-Infinite and B ⊆ A is a proper subset, and f : A → B is a 1-1 correspondence, the associated self-map j : A → A is defined just to be f itself, except the codomain10
has changed from B to A. Now, since j is a function from A to A, j is not itself a 1-1 correspondence: It is 1-1 but not onto.11 Since j is not onto, there is at least one element a in the
codomain of j that is not in the range.12 Any such element a that is in the codomain but not
the range will be called a critical point of j. We state a formal definition:
Definition 2 (Dedekind Self-Maps). A Dedekind self-map is a 1-1 function j : A → A that has
a critical point.
The diagram shows an example of a Dedekind self-map with critical point 1. Here, the
function f : A → B of the previous example has been replaced by j : A → A, but acts on
elements of A in exactly the same way: For each n, j(n) = n + 1. Notice that the critical point
of j has been circled.
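The example j(n) = n + 1 can be explored with a small Python sketch. Since A = {1, 2, 3, . . .} is infinite, the code inspects only a finite sample of A, which is an assumption of this illustration; the names SAMPLE, missed, and small are hypothetical. The second half checks, by brute force on a three-element set, the standard fact that a 1-1 self-map of a finite set is always onto, so that no finite set carries a Dedekind self-map.

```python
from itertools import product


def j(n: int) -> int:
    """The self-map j(n) = n + 1 on A = {1, 2, 3, ...}."""
    return n + 1


# A is infinite, so this sketch looks only at a finite sample of it.
SAMPLE = range(1, 1001)
images = {j(n) for n in SAMPLE}

# 1-1 on the sample: distinct inputs give distinct outputs.
injective_on_sample = len(images) == len(SAMPLE)

# Sample elements that are not of the form j(n). A preimage of 1 would have
# to be 0, which is not in A, so 1 is a critical point of j.
missed = [a for a in SAMPLE if a not in images]

# Contrast: every 1-1 self-map of a finite set is onto.
small = (0, 1, 2)
self_maps = list(product(small, repeat=len(small)))  # m[i] is the image of i
one_to_one = [m for m in self_maps if len(set(m)) == len(small)]
finite_injections_are_onto = all(set(m) == set(small) for m in one_to_one)

print(injective_on_sample, missed, finite_injections_are_onto)
```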
Footnote 10: The codomain of a function h : C → D is D; one writes cod h = D.
Footnote 11: A function h : C → D is 1-1 if, whenever x ≠ y are distinct elements of C, h(x) ≠ h(y) are distinct elements
of D. Also, h is onto if, for every d ∈ D there is an x ∈ C such that h(x) = d.
Footnote 12: The range of a function h : C → D is the set of all outputs of h: ran h = {h(x) | x ∈ C}.
A critical point a of a Dedekind self-map j : A → A plays an important role in the “internal
dynamics” of A because it guarantees that the range of j, which we denote B (as in the examples),
is itself a Dedekind-Infinite set, and that its restriction13 to B is in fact a Dedekind self-map with
critical point j(a).
We spend a moment to verify these details: Let B = j[A] and let i = j ↾ B. We show
ran i ⊆ B: If b ∈ B, let x ∈ A be such that b = j(x). Since j(x) ∈ A, we have
i(b) = j(b) = j(j(x)) ∈ j[A] = B.
It follows that i : B → B.
Since i is a restriction of j, i is 1-1. We verify that j(a) is a critical point of i: If i(b) = j(a)
for some b ∈ B, let x ∈ A with b = j(x). Then i(j(x)) = j(j(x)) = j(a) implies, since j is 1-1,
that j(x) = a, which is impossible since a ∉ ran j. It follows, therefore, that i = j ↾ B : B → B
is a Dedekind self-map with critical point j(a). Also, since i is 1-1, it follows that
i : B → C = i[B] is a bijection; since j(a) ∈ B but j(a) ∉ C, the set C is a proper subset of B,
and so B itself is Dedekind-infinite.
It is not difficult to carry the reasoning further and show that C itself is Dedekind-infinite
and j ↾ C : C → C is another Dedekind self-map with critical point j(j(a)). This reasoning leads
to an infinite chain of Dedekind-infinite sets, Dedekind self-maps, and a sequence of critical points
a, j(a), j(j(a)), . . .. This perspective is represented in the following diagram:
Footnote 13: The restriction i of a function h : S → T to a subset R of its domain S, denoted i = h ↾ R, has domain R and
is defined by i(x) = h(x) for all x ∈ R.
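The chain of ranges and the sequence of critical points a, j(a), j(j(a)), . . . can be traced in a concrete case. The sketch below assumes the particular map j(n) = 2n on A = {1, 2, 3, . . .} with a = 1, chosen only for illustration; for this map the k-th set in the chain A, B = j[A], C = j[B], . . . consists of the positive multiples of 2^k. The function names are hypothetical.

```python
def j(n: int) -> int:
    """A 1-1 self-map of A = {1, 2, 3, ...}; every odd number is a critical point."""
    return 2 * n


def in_level(n: int, k: int) -> bool:
    """True if n belongs to the k-th set of the chain A, B = j[A], C = j[B], ...

    For this particular j, the k-th set is the set of positive multiples of 2**k.
    """
    return n >= 1 and n % (2 ** k) == 0


def critical_sequence(a: int, length: int) -> list[int]:
    """Return the first `length` terms of a, j(a), j(j(a)), ..."""
    seq = [a]
    while len(seq) < length:
        seq.append(j(seq[-1]))
    return seq


seq = critical_sequence(1, 6)

# The k-th term lies in the k-th set (the domain of the k-th restricted map)
# but not in the (k+1)-th set (its range), so it is a critical point of that map.
checks = [in_level(c, k) and not in_level(c, k + 1) for k, c in enumerate(seq)]

print(seq, all(checks))
```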
The reader will notice that the sequence a, j(a), j(j(a)), . . . closely resembles the sequence
of natural numbers. As we will show, it is natural to regard this sequence as a precursor to the
“real” natural numbers. Another sequence, that represents the natural numbers more abstractly,
is idA , j, j ◦ j, j ◦ j ◦ j, . . ., where idA : A → A is the identity function (that is, idA (x) = x for all
x ∈ A).
From this perspective—which, at this point in the discussion, has not yet been fully or
rigorously developed—“natural numbers” a, j(a), j(j(a)), . . ., seem to arise as concrete “precipitations” from a self-referral flow (j : A → A), originating from the interaction between j and its
critical point a. And, more abstractly, a subtler version of the natural numbers, idA , j, j ◦ j, . . .,
arises simply from the interaction of j with itself; from this point of view, each natural number
is seen as a self-referral loop.
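Both sequences can be generated side by side in a few lines of Python. This is a sketch under the assumption j(n) = n + 1 on A = {1, 2, 3, . . .} with critical point a = 1; the helper names compose and iterate are illustrative only.

```python
from typing import Callable

Func = Callable[[int], int]


def identity(x: int) -> int:
    return x


def j(n: int) -> int:
    return n + 1


def compose(g: Func, h: Func) -> Func:
    """Return the composite g ∘ h, that is, x -> g(h(x))."""
    return lambda x: g(h(x))


def iterate(func: Func, n: int) -> Func:
    """Return the n-fold composite of func; the 0-fold composite is the identity."""
    result: Func = identity
    for _ in range(n):
        result = compose(func, result)
    return result


a = 1
abstract_numbers = [iterate(j, n) for n in range(5)]   # id, j, j∘j, j∘j∘j, ...
concrete_numbers = [h(a) for h in abstract_numbers]    # a, j(a), j(j(a)), ...

print(concrete_numbers)
```

Evaluating each composite at the critical point recovers the concrete sequence, which is one way to see that the two sequences carry the same information.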
In addition, the dynamics expressed by j have an important characteristic: While j does in
fact transform A (in the sense that j is not simply the identity function; values of A are moved), j
nevertheless preserves the essential character of A, namely, that of being Dedekind-infinite, since,
as we showed earlier, by virtue of the critical point of j, the image B of A under j also turns out
to be Dedekind-infinite.
These characteristics of a Dedekind self-map j : A → A are reminiscent of the transformational dynamics of wholeness, as described in Maharishi Vedic Science, from which the Veda, the
blueprint of creation, emerges. In these dynamics, the first move of wholeness, of “A,” gives rise
to a sequence of transformations from “AK” to “Agni” to “Agnimile” to the first pada, first richa,
first mandala, to the entire Rik Veda [25, p. 636]. In these transformations, what is preserved is
the essential fullness of “A”, even in its collapse to “K” (which Maharishi describes as “fullness of
emptiness”). The particulars of manifest existence, like the natural numbers themselves, emerge
as a side-effect of the unmanifest dynamics of wholeness. Moreover, Maharishi has described
the ultimate nature of these particulars as “self-referral loops” within pure consciousness: “The
evolution of consciousness into its object-referral expressions . . . progresses in self-referral
loops—every step of progress is in terms of a self-referral loop” [28, pp. 63–64].
When the “natural numbers” are seen to be represented by idA , j, j ◦ j, j ◦ j ◦ j, . . ., it is
possible to see that each successive term of the sequence is an elaboration of the previous term.
This viewpoint is expressed in the following diagrams, showing how the move to each successive
term elaborates the previous term.
Corresponding to the number 0, and by analogy, the first letter A of Rik Veda, we have the
identity map idA : A → A that does nothing; it represents complete silence:
      idA
A −−−−→ A.
Corresponding to the number 1, and by analogy, the first syllable AK of Rik Veda, we have
the fundamental transformation given by a Dedekind self-map j : A → A, including the “collapse”
of A to the critical point a.
         j
    A ------> A
     \       ^
    f \     / inc
       v   /
        B
In the diagram, f : A → B is a 1-1 correspondence from A to the proper subset B of A, and
inc : B → A is a function that behaves like the identity map in that inc(x) = x for all x ∈ B; the
only difference between the identity map on B and the function inc is that the codomain of
inc is A rather than B. The function inc is called the inclusion map from B to A; when necessary,
its domain and codomain are indicated as well: inc = incB,A. The diagram as a whole (known
as a commutative diagram) indicates that, following the behavior of the map across the top,
namely, j, produces the same result as following the two lower arrows: f followed by inc, denoted
inc ◦ f . Thus, for any x ∈ A, the value of j(x) is the same as inc(f (x)), and it is easy to verify
the truth of this. The diagram shows that j is formed in two steps: The first step is the 1-1
correspondence between A and the set B, given by f ; the second step is the assertion that B
itself is in fact a proper subset, expressed by the fact that inc(x) = x for all x ∈ B and that inc
is not onto. In this diagram, the self-map does something; there is a critical point; elements of
A are moved; in this way, the realm of pure silence has been enlivened to include considerable
activity.
Corresponding to the number 2, and by analogy, to the fuller elaboration of AK given by
the first word of Rik Veda, Agnim, is the composition map j ◦ j.
         j ◦ j
    A ---------> A
    |            ^
  f |            | inc
    v            |
    B ---------> B
         j ↾ B
Recall that if j : A → A is a Dedekind self-map with critical point a and range B, then
j ↾ B : B → B is a Dedekind self-map with critical point j(a). The diagram shows that the
self-map j ◦ j : A → A is obtained by first applying f , then j ↾ B, and finally the inclusion map
inc from B to A. One can view this suite of maps as a further elaboration of j itself. The visual
impact of the diagram itself suggests that the first diagram is being expanded: The identity map
idB : B → B that is only implicit in the first diagram is replaced in the second diagram by the
Dedekind self-map j ↾ B.
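The two factorizations described above (j as f followed by inc, and j ◦ j as f, then the restriction of j to B, then inc) can be checked mechanically in a concrete case. The sketch assumes j(n) = n + 1 on A = {1, 2, 3, . . .}, so that B = {2, 3, 4, . . .}; the membership tests in_A and in_B and the name diagrams_commute are hypothetical, and only finitely many inputs are tested.

```python
def j(n: int) -> int:
    return n + 1


def in_A(n: int) -> bool:
    return n >= 1


def in_B(n: int) -> bool:
    return n >= 2          # B = j[A]


def f(x: int) -> int:
    """j viewed as a 1-1 correspondence from A onto B."""
    assert in_A(x)
    return j(x)


def j_on_B(b: int) -> int:
    """The restriction of j to B."""
    assert in_B(b)
    return j(b)


def inc(b: int) -> int:
    """Inclusion of B into A: same value, larger codomain."""
    assert in_B(b)
    return b


def diagrams_commute(limit: int) -> bool:
    """Check j = inc ∘ f and j ∘ j = inc ∘ (j restricted to B) ∘ f on 1..limit-1."""
    return all(
        j(x) == inc(f(x)) and j(j(x)) == inc(j_on_B(f(x)))
        for x in range(1, limit)
    )


print(diagrams_commute(200))
```

The assertions inside f, j_on_B, and inc play the role of the domains in the diagram: each map refuses inputs that lie outside the set it is defined on.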
As our last example, the number 3, analogous to the next packet of expression in the Veda—
the first pada, consisting of 8 syllables—corresponds to j ◦ j ◦ j.
           j ◦ j ◦ j
      A --------------> A
      |                 ^
    f |                 | incB,A
      v                 |
      B --------------> B
      | (j ↾ B) ◦ (j ↾ B) ^
f ↾ B |                 | incC,B
      v                 |
      C --------------> C
             j ↾ C
Here, incB,A : B → A is the inclusion map from B to A, while incC,B is the inclusion map
from C to B. Again, the diagram illustrates the more fully elaborated transformational dynamics
as j ◦ j moves to j ◦ j ◦ j. The dynamics seen in the diagram for j ◦ j are now recapitulated in
the lower square of the diagram for j ◦ j ◦ j, but the diagram as a whole tells a fuller story about
how j acts on A in various ways. In this way, we see that the view that takes idA , j, j ◦ j, . . . as
the blueprint of the natural numbers illustrates how the emergence of the natural numbers, at
its foundation, consists of successive elaborations of the dynamics inherent in j : A → A.
One final observation about this view of the natural numbers as arising from a sequence of
Dedekind self-maps is that this sequence itself originates from a higher-order Dedekind self-map
in the following way: Let j : A → A be a Dedekind self-map with critical point a, and let
A^A = {h | h is a self-map with domain A}.14 Note that the identity map idA : A → A is one of
the elements of A^A. Define a self-map J_j : A^A → A^A by J_j(h) = j ◦ h. Notice that idA is not in
the range of J_j (if j ◦ h = idA for some h, then j would be onto, which is impossible since
a ∉ ran j) and that J_j is 1-1: For h, k ∈ A^A we have
J_j(h) = J_j(k) ⇒ j ◦ h = j ◦ k ⇒ h = k,
since 1-1 functions are left-cancellable.
We have shown that J_j : A^A → A^A is a Dedekind self-map with critical point idA. Now we
can observe that, in the same way as a, j(a), j(j(a)), . . . are obtained by repeated applications of
j to its critical point a, so likewise idA, j, j ◦ j, . . . is obtained by repeated applications of J_j to
its critical point idA. We will establish in a moment the way in which both of these sequences
can be considered blueprints for the natural numbers.
Footnote 14: In general, if X, Y are sets, Y^X denotes the set of all maps from X to Y.
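The higher-order self-map that sends h to j ◦ h can be written directly as a function on functions. The following Python sketch assumes j(n) = n + 1 on A = {1, 2, 3, . . .}; because two functions on an infinite set cannot be compared exhaustively, it compares them on a finite sample, which is an assumption of the illustration. The names J, table, and some_maps are hypothetical.

```python
def j(n: int) -> int:
    return n + 1


def identity(x: int) -> int:
    return x


def J(h):
    """The higher-order map h -> j ∘ h."""
    return lambda x: j(h(x))


SAMPLE = range(1, 50)


def table(h):
    """Values of h on a finite sample, used here to compare functions."""
    return tuple(h(x) for x in SAMPLE)


# Repeated application of J to the identity yields id, j, j∘j, j∘j∘j.
sequence = [identity]
for _ in range(3):
    sequence.append(J(sequence[-1]))

matches_iterates = (
    table(sequence[1]) == table(j)
    and table(sequence[2]) == table(lambda x: j(j(x)))
    and table(sequence[3]) == table(lambda x: j(j(j(x))))
)

# For any h, the value (j ∘ h)(1) lies in the range of j, so it cannot equal
# identity(1) = 1. Hence the identity is not of the form j ∘ h.
some_maps = [identity, j, lambda x: 2 * x, lambda x: x * x]
identity_missed = all(J(h)(1) != identity(1) for h in some_maps)

print(matches_iterates, identity_missed)
```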
6 A New Axiom of Infinity
We have seen that the existence of a Dedekind self-map gives rise to the natural numbers by
considering iterations of the map; in fact, as we have also seen, there are several ways to generate natural numbers by iteration of such a map. Each of these approaches presents the natural
numbers as emerging from self-referral dynamics of an “unbounded” field—the degree of “unboundedness” of the underlying field increases as we consider more and more abstract versions
of the self-map. And we have suggested that deriving the natural numbers in this way has the
advantage that it provides an insight that can be applied to solve problems that require deeper
intuition about the “true nature” of the mathematical infinite—our primary example of this
phenomenon is the Problem of Large Cardinals.
With these points in mind, we propose a new version of the Axiom of Infinity:
New Axiom of Infinity. There is a Dedekind self-map.
Stating the Axiom of Infinity in this way really causes a shift in viewpoint—a shift away from
the idea that “infinite” means a vast collection of discrete objects and a shift in the direction
of the insight, mentioned above, that the reality of the “infinite” is self-referral dynamics of an
unbounded field, and that large collections of discrete objects should be thought of as emerging
from these dynamics, as a consequence or effect of the dynamics of the field.
It is well-known to set theorists that, from the ZFC axioms minus the usual Axiom of Infinity,
one can prove that the usual Axiom of Infinity and the New Axiom of Infinity are equivalent;
using this known equivalence, we could then use the existence of an inductive set to demonstrate
that the sequences a, j(a), j(j(a)), . . . and idA , j, j ◦ j, . . ., described in the previous section, do
indeed form sets that each represent—in a sense that can be made precise—the set ω of natural
numbers.
We do not wish to rely on this standard result, however. The usual proof involves an
auxiliary class of “natural numbers,” which does not necessarily form a set; the usual notions
of induction and inductive definition can be shown to hold relative to this class even without
the Axiom of Infinity. Roughly speaking, these “natural numbers” exist because of the fact that
ZFC without the Axiom of Infinity15 is essentially the same as PA, and PA gives us the natural
numbers as a class.16 Then, one can, given a Dedekind self-map j : A → A, define the class
W = {a, j(a), j(j(a)), . . .} by proceeding as follows: Define a class function F on the class of
natural numbers by
F(0) = a
F(n + 1) = j(F(n)).
One may then use the Axiom of Replacement to define the set W by
W = ran F ∩ A.
15 More precisely, the theory ZFC − Infinity + ¬Infinity.
16 After we have verified that the natural numbers can be derived from a Dedekind self-map without the help
of the natural numbers in any form, we will make use of this class of natural numbers. This subject is treated in
Section 12.
Therefore the collection W = {a, j(a), j(j(a)), . . .} does indeed form a set, and one may derive
the canonical set ω of natural numbers from W (we will carry out these last steps in detail in our
approach as well).
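The recursion F(0) = a, F(n + 1) = j(F(n)) described above can be sketched as follows. The sketch assumes, for illustration only, the Dedekind self-map j(n) = 2n + 1 on the nonnegative integers with critical point a = 0; the helper name `initial_segment_of_W` is ours. Note that the code indexes the recursion by Python integers, which mirrors the reliance on a pre-existing notion of natural number in this approach.

```python
def F(n, j, a):
    """Compute F(n), where F(0) = a and F(n + 1) = j(F(n))."""
    x = a
    for _ in range(n):
        x = j(x)
    return x

def initial_segment_of_W(j, a, count):
    """List F(0), F(1), ..., F(count - 1), the first `count` elements of ran F."""
    return [F(n, j, a) for n in range(count)]

def example_j(n):
    """Illustrative Dedekind self-map: 1-1, and 0 is never a value of 2n + 1."""
    return 2 * n + 1

segment = initial_segment_of_W(example_j, 0, 5)
```

Only a finite initial segment of W can be listed this way; the full set is obtained in the text by Replacement, not by computation.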
This approach to defining W , though mathematically sound, is somewhat unsatisfying since
we make use of a class version of the natural numbers to obtain a set version of them. It would
be more revealing to see how the dynamics of j give rise to the set of natural numbers without
any reliance on the notion of a natural number at the beginning.
We will therefore take a somewhat longer journey17 than is usually done to arrive at the
natural numbers. We will use the following outline to guide our steps of reasoning:
(A) We will define the notion of a j-inductive subset of A and we will let W denote the smallest
j-inductive set.
(B) We will define an order relation E on W . Roughly speaking, we will say that x E y if
and only if one can obtain y from x by applying j at most finitely many times to x: y =
j(j . . . (j(x)) . . .). This definition must not rely on any notion of natural number.
(C) We will show that E is a well-founded partial order.
(D) Using the Axiom of Choice, we will show that E is a total order as well, so that in fact, E
is a well-ordering of W .
These four points will set the stage for the final step, in which we derive the set of natural
numbers, and the canonical successor function, by way of the Mostowski Collapsing Map; this
map will produce the set of natural numbers because E is a well-ordering (and because the collapsing map outputs a transitive set), and will produce the standard successor function because
of the way E is defined in terms of j; in other words, the successor s : ω → ω will be seen to be
the “collapse” of the Dedekind self-map j restricted to W .
7 Obtaining the Well-Ordered Set W
For this section we fix once and for all a Dedekind self-map j : A → A with critical point a.
Establishing (A). We define the notion of a j-inductive set: A set B ⊆ A will be called j-inductive
if
(1) a ∈ B
(2) whenever x ∈ B, j(x) is also in B.
Notice that A itself is j-inductive. Therefore, the set I = {B ⊆ A | B is j-inductive} is
nonempty. Let W = ⋂I.
Remark 1. For later reference, we observe here that W is definable from j and a.
17 The reader who does not want to take this longer journey and is satisfied with the definition of W just described,
which relies on a class of natural numbers, can skip to Section 8 where the set of natural numbers is computed.
Lemma 1. W is a j-inductive subset of A.
Proof. For (1), since a belongs to every j-inductive subset of A, a ∈ W . For (2), assume x ∈ W .
Then x belongs to every j-inductive subset of A. For each such j-inductive subset B, since x ∈ B,
j(x) ∈ B. Therefore j(x) ∈ W .
Establishing (B). As mentioned earlier, we wish to define an order relation E on W by declaring
that, for all x, y ∈ W , x E y if and only if y can be obtained by finitely many applications of
j to x: y = j(j(. . .(j(x)) . . .)). The difficulty with this definition in our present context is the
word “finite,” since a set is ordinarily defined to be finite if it can be put in 1-1 correspondence
with one of the elements n of ω (recall that each such n is equal to its set {0, 1, 2, . . ., n − 1} of
predecessors).
To get around this difficulty, we will reword our requirements on E so that no mention
of “finiteness” is necessary; the properties of the underlying structure W will ensure in the
background that the number of applications of j that actually occur to reach from x to y whenever
x E y will always be finite, but we will not need to prove this or make this fact explicit at any
point in this early stage of development.
In order to define E properly, we will first define a more elementary relation E0 on W : We
shall say that, for all x, y ∈ W , x E0 y if and only if y = j(x). We prove some basic facts about j
and E0 .
For any relation R on a set X, an R-minimal element of X is an x ∈ X such that for all
y ∈ X, it is not the case that y R x. Moreover, R is said to be wellfounded if every nonempty
subset Y of X has an R-minimal element. A familiar example is the set N = {1, 2, 3, . . .} of
natural numbers with its less than relation < (so here, R is simply <); < is wellfounded as every
nonempty subset of N has a smallest element. We will not make use of this example here, of
course, since the natural numbers have not yet formally been defined; the example is intended
just to give some concrete sense for the new definitions.
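For finite subsets, the notion of an R-minimal element can be checked mechanically. The following sketch is illustrative only: the relation is passed as a two-argument predicate, and the three sample relations (the usual less-than, an "immediate successor" relation modeled on E0 with j(n) = n + 1, and a two-element cycle) are our own assumptions, not objects defined in the paper.

```python
def minimal_elements(Y, R):
    """Return the R-minimal elements of the finite set Y:
    those x in Y for which no y in Y satisfies y R x."""
    return {x for x in Y if not any(R(y, x) for y in Y)}

def less_than(y, x):
    """The usual order: y R x iff y < x."""
    return y < x

def immediate(y, x):
    """An E0-style relation for the illustrative map j(n) = n + 1: y R x iff x = j(y)."""
    return x == y + 1

def two_cycle(y, x):
    """A relation on {'p', 'q'} in which each element is related to the other."""
    return {y, x} == {"p", "q"}

min_lt = minimal_elements({3, 5, 9}, less_than)
min_e0 = minimal_elements({3, 5, 6}, immediate)
min_cycle = minimal_elements({"p", "q"}, two_cycle)
```

The last example has no minimal element at all, which is exactly the situation a wellfounded relation rules out for nonempty subsets.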
Claim 1. j has no fixed points in W ; that is, j(x) ≠ x for all x ∈ W . Consequently, E0 is
irreflexive.
Proof. The last part follows immediately from the fact that j has no fixed points. We prove
the first part: Suppose B ⊆ A is j-inductive and x ∈ B is a fixed point of j. We observe that
B − {x} is also j-inductive: First, notice a ≠ x since a ∉ ran j but x = j(x) ∈ ran j. Therefore,
a ∈ B − {x}. Suppose y ∈ B − {x}; we show j(y) ∈ B − {x}. Certainly j(y) ∈ B since B is
j-inductive. If j(y) = x, then since j(x) = x too, we have j(x) = j(y), and by 1-1-ness of j,
x = y, which is impossible. Therefore j(y) ∈ B − {x}. We have shown B − {x} is j-inductive.
To prove the claim, suppose B ⊆ A is a j-inductive set that contains a fixed point x of j. As
we have seen, B − {x} is j-inductive, and by leastness of W , W ⊆ B − {x}. Therefore B ≠ W .
This proves the claim.
Claim 2. E0 is wellfounded.
Proof. Suppose X ⊆ W is a subset having no E0 -minimal element. Let B ⊆ W be defined by
B = {x ∈ W | x ∉ X}.
We show that B is j-inductive and therefore equal to W ; the conclusion will then be that X itself
is empty, as required.
First we show that a ∈ B. If not, then a ∈ X, and since X has no E0 -minimal element there is a b ∈ X such that b E0 a; that is,
j(b) = a. But this is impossible since a ∉ ran j.
Next, assume x ∈ B; we show j(x) ∈ B. Assume for a contradiction that j(x) ∉ B, so
j(x) ∈ X. Since X has no E0 -minimal element, there is u ∈ X with j(u) = j(x). Since j is 1-1,
u = x. But this is impossible because u ∈ X but x is not.
This completes the proof that B is j-inductive and hence that X is empty.
Claim 3. E0 is antisymmetric.
Proof. Suppose x, y ∈ W and x E0 y. If y E0 x also, then the set {x, y} is a nonempty subset of
W with no E0 -minimal element.
The relation E0 is a first try at defining a prototype of the less than relation on the natural
numbers. It has many of the characteristics that are needed, but it is not transitive: It is not the
case that if z = j(y) and y = j(x), then z = j(x). To obtain a transitive order relation, we will
expand E0 to a relation E as follows: For all x, y ∈ W , x E y if and only if there is a subset F of
W satisfying the following:
(i) x, y ∈ F
(ii) for some v ∈ F , y = j(v)
(iii) there is no u ∈ F for which x = j(u)
(iv) if u ∈ F and u ≠ y, then j(u) ∈ F
(v) if v ∈ F and v ≠ x, there is u ∈ F such that v = j(u)
The set F is said to join x to y, and is called a joining set. Intuitively, F is finite, but we do
not state this in the definition, nor try to prove it (in fact, at this point, we do not even have a
definition of “finite”).18
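Conditions (i)–(v) can be checked directly for any concrete candidate set F. The sketch below does this for finite candidates, assuming for illustration the Dedekind self-map j(n) = n + 1 on the nonnegative integers; the function name `joins` is ours. The check itself never counts applications of j, in keeping with the definition.

```python
def joins(F, x, y, j):
    """Return True if the finite set F satisfies conditions (i)-(v)
    for joining x to y with respect to the self-map j."""
    F = set(F)
    cond_i = x in F and y in F
    cond_ii = any(j(v) == y for v in F)
    cond_iii = not any(j(u) == x for u in F)
    cond_iv = all(j(u) in F for u in F if u != y)
    cond_v = all(any(j(u) == v for u in F) for v in F if v != x)
    return cond_i and cond_ii and cond_iii and cond_iv and cond_v

def example_j(n):
    """Illustrative Dedekind self-map on the nonnegative integers."""
    return n + 1

chain_ok = joins({2, 3, 4}, 2, 4, example_j)         # the full chain from 2 to 4
gap_fails = joins({2, 4}, 2, 4, example_j)           # 3 is missing
overshoot_fails = joins({2, 3, 4, 5}, 2, 4, example_j)  # 5 lies beyond y
reflexive_fails = joins({2}, 2, 2, example_j)        # nothing joins 2 to itself
backwards_fails = joins({2, 3, 4}, 4, 2, example_j)  # wrong direction
```

In this model the only set joining 2 to 4 is the chain {2, 3, 4}: dropping an element violates (ii) or (iv), and adding a later element violates (iv).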
18 We come back to this point later. The fact that F must always be finite is established in Theorem 13.
Establishing (C).
We begin by proving some facts about E and joining sets.
Claim 4. E0 ⊆ E . That is, whenever x, y ∈ W and x E0 y, then x E y.
Proof. Suppose x E0 y, so y = j(x). Let F = {x, y}. We show F joins x to y. Parts (i),(ii),(iv),(v)
in the definition of E are obviously true for F . We prove that (iii) also holds: We must verify
that neither of the following holds: (a) j(x) = x, (b) j(y) = x. (a) fails because j has no fixed
points; (b) fails because E0 is antisymmetric. We have established (iii); the result follows.
Claim 5. E is irreflexive.
Proof. Suppose x ∈ W . If x E x, let F join x to x. By (ii), there is u ∈ F such that x = j(u),
but by (iii), there is no u ∈ F for which x = j(u). This contradiction shows that it is never the
case that x E x.
Claim 6. E is wellfounded.
Proof. We use the same strategy as was used in the proof of Claim 2. Suppose X ⊆ W is a set
with no E-minimal element. Let B = {x ∈ W | x ∉ X}. We show B is j-inductive.
We first prove that a ∈ B. Suppose, for a contradiction, that a ∈ X. Since X has no
E -minimal element, there is x ∈ X with x E a. Let F join x to a. Then there is u ∈ F such
that j(u) = a. But this is impossible since a ∉ ran j.
Next, suppose x ∈ B; we show j(x) ∈ B. Suppose not; then j(x) ∈ X and so for some u ∈ X,
u E j(x). Let F join u to j(x). By property (ii) of the definition of E, there is some v ∈ F ⊆ X
such that v E0 j(x), that is, j(v) = j(x). Since j is 1-1, v = x. But this is impossible since v ∈ X
but x is not.
We extract an observation made in the proof:
Claim 7. For all x ∈ W , ¬(x E a).
Claim 8. E is anti-symmetric; that is, for all x, y ∈ W , if x E y then it is not the case that y E x.
Proof. Suppose x, y ∈ W are such that x E y and also y E x. Then {x, y} has no E -minimal
element, contradicting Claim 6.
Claim 9. For all x ∈ W , if a ≠ x, then a E x.
Proof. Let B = {z ∈ W | a = z or a E z}. It suffices to show B is a j-inductive set. Since it
is obvious that a ∈ B, it suffices to assume z ∈ B and prove j(z) ∈ B. If a = z, then certainly
a E j(a), and so j(z) ∈ B. If a ≠ z, then, since z ∈ B, we have a E z. Let F join a to z. Let
G = F ∪ {j(z)}. We wish to show that G joins a to j(z). We verify that G satisfies (i)–(v) above.
Properties (i), (ii), (iv), and (v) are immediate. For (iii), certainly no u ∈ F has the property
that u E a, since F joins a to z (note (iii) already holds for F ). But ¬(j(z) E a) either because of
Claim 7. We have shown B is j-inductive, as required.
Claim 10. Suppose x, y, z ∈ W .
(1) If x E y, then j(x) E j(y).
(2) x E j(j(x)).
(3) Suppose x E y and y ≠ j(x). Then j(x) E y.
(4) Suppose x E j(y) and x ≠ y. Then x E y.
(5) Suppose x E y. Then ¬(j(y) E x).
(6) If x E y, then x E j(y).
(7) If j(x) E y, then x E y.
(8) Suppose F joins x to y, u ∈ F , and u ≠ x. Then x E u.
(9) Suppose F joins x to y and u ∈ F . Then ¬(y E u).
(10) Suppose F joins x to y, u ∈ F , and u ≠ y. Then u E y.
(11) If x ≠ a, there is u ∈ W such that j(u) = x. Moreover, there is no v ∈ W such that
u E v E x.
Regarding part (11), for a given x ∈ W different from a, we will call an element u ∈ W for which
j(u) = x an E -immediate predecessor of x, or, when the context makes it clear, an immediate
predecessor of x. Since j is 1-1, if x has an immediate predecessor, it is unique. Part (11) says
that every element of W other than a has an immediate predecessor. The notion “immediate predecessor” is to be distinguished from the unqualified “predecessor”: Whenever x E y, x is called
a predecessor of y, but x is not necessarily the immediate predecessor of y.
Proof of (1). Suppose F joins x to y. It is straightforward to verify that if G = {j(u) | u ∈ F },
then G joins j(x) to j(y).
Proof of (2). Let B = {x ∈ W | x E j(j(x))}. We show B is j-inductive. By Claim 9, a ∈ B.
Assume x ∈ B; we show j(x) ∈ B. Since x ∈ B, x E j(j(x)). By (1), j(x) E j(j(j(x))). It follows
that j(x) ∈ B, as required.
Proof of (3). Suppose x E y and y ≠ j(x). Let F join x to y. Let G = F − {x}. It is straightforward to verify that G joins j(x) to y.
Proof of (4). Suppose x E j(y) and x ≠ y. Let F join x to j(y). Let G = F − {j(y)}. It is
straightforward to verify that G joins x to y.
Proof of (5). Fix x ∈ W . Let B = {y ∈ W | if x E y then ¬(j(y) E x)}. We show B is j-inductive.
Vacuously, a ∈ B. Assume y ∈ B; we show j(y) ∈ B. Assume x E j(y). Suppose first that x = y.
By (2), y E j(j(y)). Since E is antisymmetric, then ¬(j(j(y)) E y). Therefore, in this case, j(y) ∈ B.
Now assume x ≠ y. By (4), x E y. Since y ∈ B, it follows that ¬(j(y) E x). But now if j(j(y)) E x,
by (3) we would have j(y) E x, which is a contradiction. Therefore, ¬(j(j(y)) E x) and so j(y) ∈ B
in this case as well. This completes the proof that B is j-inductive. We have shown therefore
that for all x, y ∈ W , if x E y, then ¬(j(y) E x).
Proof of (6). Suppose x E y. We show x E j(y). Let F join x to y. Let G = F ∪ {j(y)}. We show
G joins x to j(y). Verification of (i),(ii), (iv), (v) in the definition of E is straightforward. We
check that (iii) holds: Suppose u ∈ G; we must show j(u) ≠ x. Certainly j(u) ≠ x if u ∈ F (since
F joins x to y). We consider the case in which u = j(y). If j(j(y)) = x, then since j(y) E j(j(y)),
it follows that j(y) E x, and this contradicts (5). Therefore, in this case also, j(u) ≠ x, and (iii)
is established. We have shown G joins x to j(y), and so x E j(y).
Proof of (7). Suppose j(x) E y. We show x E y. Let F join j(x) to y and let G = F ∪ {x}.
We show G joins x to y. Verification of (i),(ii), (iv), (v) in the definition of E is straightforward.
We check that (iii) holds: We must show that j(u) ≠ x for any u ∈ G. First we observe that if
u = j(x), j(u) ≠ x: If j(j(x)) = x, then j(x) E x, violating the antisymmetric property of E . Next,
suppose u ∈ F is such that j(u) E x. By (6), j(u) E j(x), but this contradicts the fact that F
joins j(x) to y (in particular, (iii) is violated). Therefore, no such u exists, and (iii) holds for G
as well. We have shown G joins x to y, and so x E y.
Proof of (8). Suppose F joins x to y, u ∈ F , and u ≠ x. We show x E u. Let B = {u ∈ W |
if u ∈ F and x ≠ u, then x E u}. We show B is j-inductive. Since a ∉ ran j, property (v) implies that a ∈ F only if a = x; in either case a ∈ B.
Assume u ∈ B and assume j(u) ∈ F and x ≠ j(u). If x = u, then x E j(u), as required. On the
other hand, if x ≠ u, then since u ∈ B, x E u. It follows by (6) that x E j(u). In each case we
have shown that j(u) ∈ B, and so B is j-inductive. The result follows.
Proof of (9). Suppose F joins x to y and u ∈ F . We show ¬(y E u). Let B = {u ∈ W |
if u ∈ F , then ¬(y E u)}. Since for no z ∈ W is it true that z E a, we have that a ∈ B. Assume
u ∈ B and j(u) ∈ F . We show ¬(y E j(u)). If u ∉ F , then (reasoning as we did in (8)), it must be that
j(u) = x. But now by antisymmetry of E , ¬(y E x). On the other hand, suppose u ∈ F . Then since
u ∈ B, ¬(y E u). If y E j(u), then y E u (by (4)), giving a contradiction. Therefore, ¬(y E j(u)). We have
shown j(u) ∈ B. Therefore, B is j-inductive, as required.
Proof of (10). Suppose F joins x to y, u ∈ F , and u ≠ y. We show u E y. Let B = {u ∈ W |
if u ∈ F and y ≠ u, then u E y}. We show B is j-inductive. Vacuously, a ∈ B. Assume u ∈ B
and j(u) ∈ F with j(u) ≠ y. We show j(u) E y. The first case is that u ∉ F ; we claim j(u) = x.
If not, for some v ∈ F , j(v) = j(u) (by property (v)), and since j is 1-1, v = u; but this is
impossible since u ∉ F but v ∈ F . Therefore, j(u) = x. But now F witnesses that j(u) E y.
The other case is that u ∈ F . I claim y ≠ u: If y = u, then y E j(u), but, since j(u) ∈ F , this
contradicts the result established in (9). Therefore y ≠ u. Since u ∈ B, u E y. By (3), j(u) E y.
Therefore, j(u) ∈ B. We have shown B is j-inductive, as required.
Proof of (11). Since a E x, let F join a to x. By definition of “join”, there must be a u ∈ F for
which j(u) = x. If there is a v ∈ W such that u E v E x, let G join v to x. By definition of “join”
again, there is a z ∈ G such that j(z) = x. By the antisymmetric property of E0 , z ≠ x. Clearly,
either v = z or v E z. It follows that u E z. By irreflexivity, u ≠ z. However, j(u) = x = j(z), and
so, by 1-1-ness of j, we must have u = z. This is a contradiction and shows that no such v exists.
Claim 11. Suppose x, y, z ∈ W are distinct. Then if F joins x to y and G joins y to z, then
F ∪ G joins x to z. In particular, E is transitive.
Proof. Verification of properties (i), (ii), (iv), (v) for F ∪ G is straightforward. For (iii), suppose
u ∈ F ∪ G is such that j(u) = x. Since F joins x to y, it follows that u ∈ G and u ∉ F . In
particular, u ≠ y. By Claim 10(8), y E u. By Claim 10(6), y E j(u), and so y E x. But this violates
the antisymmetric property of E (since x E y by assumption). Therefore, for no u ∈ F ∪ G is it
the case that j(u) = x. Therefore, we have established that all properties (i)–(v) hold for F ∪ G,
as required.
Theorem 2. (W, E) is a wellfounded partial order.
Proof. This follows from Claims 5, 6, 8, and 11.
Establishing (D).
We are now ready to show that E is a well-ordering. For this purpose we will use the Axiom
of Choice in its following equivalent form:
Hausdorff Maximal Principle. Every partially ordered set contains a maximal
totally ordered subset.
Theorem 3. W is well-ordered under E .
Proof. It suffices to show that E is a total order on W. We have shown that E is a partial order.
Applying the Hausdorff Maximal Principle, let W0 ⊆ W be a maximal totally ordered subset of
W , with respect to E . We show that W0 is j-inductive and therefore equal to W .
By Claim 9, a E x for all x ∈ W for which a ≠ x. Therefore, if a ∉ W0 , W0 ∪ {a} is still
totally ordered but W0 is a proper subset of W0 ∪ {a}, and this violates the maximality of W0 as
a totally ordered subset of W . Thus, a ∈ W0 .
Next, suppose x ∈ W0 ; we show j(x) ∈ W0 . Notice x is not an upper bound of W0 since,
if it were, W0 ∪ {j(x)} would also be a total order, properly containing W0 . Therefore, there is
y ∈ W0 such that x E y. Since W0 is totally ordered and wellfounded, it is well-ordered and the
set {z ∈ W0 | x E z} has a least element z0 . Let F join x to z0 . If z0 ≠ j(x), it follows from
Claims 4 and 10(3) that x E j(x) E z0 . I claim that W0 ∪ {j(x)} is a totally ordered subset of
W that properly contains W0 : We must show that j(x) is related to every element of W0 . Let
z ∈ W0 . Then because W0 is totally ordered and because of leastness of z0 , either z E x, z = x,
z = z0 , or z0 E z. If z E x, by transitivity of E , z E j(x). If z = x, clearly again, z E j(x). If z = z0 ,
as we have seen, j(x) E z0 . And if z0 E z, then by transitivity of E, j(x) E z. We have shown that
W0 ∪ {j(x)} is a totally ordered subset of W that properly contains W0 , contradicting maximality of W0 . Therefore, the least element z0 in W0 for which x E z0 must be j(x); in particular,
j(x) ∈ W0 . This completes the proof that W0 is j-inductive, and that W0 = W . Hence (W, E) is
a well-ordered set.
The proof shows one additional useful fact, which we extract in the form of a Corollary:
Corollary 4. Let x ∈ W . Then j(x) is the E -least element of {z ∈ W | x E z}.
Therefore, we may list the elements of W by a E j(a) E j(j(a)) E . . .. After we define the
natural numbers, we will be able to establish more formally that W = {a, j(a), j(j(a)), . . .}.19
For future use, for each x ∈ W , let Wx = {z ∈ W | z E x} and let W^x = {z ∈ W | x E z}. Thus,
for each x ∈ W , W = Wx ∪ {x} ∪ W^x .
Corollary 5. Suppose x, y ∈ W and x ≠ y. Then Wx ≠ Wy .
Proof. Since E is a total order, we may assume, without loss of generality, that x E y. Then
x ∈ Wy − Wx .
Because E has the property given by Corollary 5, one says that E is extensional. For later
use, we compute Wx for a few values x ∈ W :
Corollary 6.
(1) Wa = ∅.
(2) Wj(a) = {a}.
(3) Wj(j(a)) = {a, j(a)}.
(4) Wj(j(j(a))) = {a, j(a), j(j(a))}.
Proof of (1). This follows from Claim 7.
Proof of (2). Recall that by Claim 10(4), if x E j(a), either x E a or x = a. Thus,
Wj(a) = {x ∈ W | x E j(a)} = {a}.
Proof of (3). By Claim 10(4) again, if x E j(j(a)), then either x E j(a) or x = j(a), and the only
x satisfying the first of these is a itself. Therefore
Wj(j(a)) = {x ∈ W | x E j(j(a))} = {a, j(a)}.
Proof of (4). By Claim 10(4) again, if x E j(j(j(a))), then either x E j(j(a)) or x = j(j(a)). We
have seen in (3) that the x for which x E j(j(a)) are a and j(a). Therefore,
Wj(j(j(a))) = {x ∈ W | x E j(j(j(a)))} = {a, j(a), j(j(a))}.
19 This is done in Theorem 16.
8 Generating the Natural Numbers with the Mostowski Collapse
Before describing the Mostowski Collapsing Map, we need to introduce one other concept. A set
T is said to be transitive if, whenever z ∈ T and y ∈ z, we have y ∈ T . It is not hard to verify
that ω = {0, 1, 2, . . .} together with each of its elements, is transitive. Transitive sets are the
“well-behaved” sets in the universe.
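For hereditarily finite sets, transitivity can be tested directly. The sketch below models sets as Python `frozenset` values and builds the first few von Neumann numerals by hand; the function name `is_transitive` and the sample sets are ours, chosen only to illustrate the definition.

```python
def is_transitive(T):
    """Return True if every element of an element of T is itself an element of T."""
    return all(y in T for z in T for y in z)

# The first few natural numbers as sets: 0 = {}, 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2}.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

# {1} is not transitive: 0 is an element of 1, but 0 is not an element of {1}.
singleton_one = frozenset({one})

results = {
    "three": is_transitive(three),
    "zero": is_transitive(zero),
    "singleton_one": is_transitive(singleton_one),
}
```

The failing example shows what transitivity excludes: a set that contains an element without containing that element's own members.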
We define a function π on W , which is known as the Mostowski Collapsing Map and which
can be defined more generally on any well-founded, extensional relation defined on a set. The
function π, defined on W , will turn out to be a bijection and has the effect of identifying its
domain with a transitive set N whose membership relation ∈ behaves exactly like the relation
E on W . The effect of the Mostowski Collapsing Map is to transform a set and its internal
relationships into a “concrete, well-behaved” set that resides at the very bottom of the universe
but that still exhibits the same relationships as before. As we will see, π will collapse the elements
of W to corresponding elements of the set ω = {0, 1, 2, 3, . . .}. From our initial assessment of W ,
we would expect that a corresponds to 0, j(a) to 1, j(j(a)) to 2, and so forth.
Theorem 7 (Mostowski Collapsing Theorem for W ). There is a unique function π defined on
W that satisfies the following relation, for every x ∈ W :
π(x) = {π(y) | y E x}. (∗)
Proof. Let B ⊆ W be defined by putting z ∈ B if and only if the formula ψ(z) ≡ ∃!g φ(z, g)
holds, where “∃!” means “there exists exactly one” and
φ(z, g) ⇔ g is defined on Wz ∪ {z} so that it satisfies, for all x ∈ Wz ∪ {z},
g(x) = {g(y) | y E x}.
Whenever there exists a g such that φ(z, g), we say that g is a witness for ψ(z). When such a g
defined on Wz ∪ {z} exists, it will typically be denoted πz .
We will show that B is j-inductive, and then, from B, obtain the Mostowski Collapsing map.
We first observe that a ∈ B: Since there is no y ∈ W for which y E a, {πa (y) | y E a} must be
empty, no matter how πa is defined on “predecessors” of a. Therefore, there is one and only one
function πa with domain Wa ∪ {a} = {a} that satisfies πa (a) = {πa(y) | y E a}, and that is the
function for which πa(a) = ∅. We have shown that ψ(a) holds, so a ∈ B.
Now assume z ∈ B and let πz be the unique map defined on Wz ∪ {z} that is a witness for
ψ(z). We prove j(z) ∈ B. We define πj(z) on Wj(z) ∪ {j(z)} by
πj(z) (x) = πz (x) if x E j(z),
πj(z) (x) = {πz (y) | y E j(z)} if x = j(z).
Notice that if y E j(z), then either y = z or y E z (by Claim 10(4)). Therefore, defining
πj(z) (x) to be {πz (y) | y E j(z)} when x = j(z) makes sense. We verify that πj(z) is a witness for
ψ(j(z)):
If x E j(z), then
πj(z) (x) = πz (x)
= {πz (y) | y E x}
= {πj(z) (y) | y E x}.
The last line follows because, by definition of πj(z) , πj(z) agrees with πz on all y for which y E x
since in that case y E j(z).
On the other hand, if x = j(z), then
πj(z) (x) = {πz (y) | y E j(z)}
= {πj(z) (y) | y E j(z)}
Once again, by definition of πj(z) , πj(z) agrees with πz on y for which y E j(z), so the second
equality in the display follows from the first.
This shows that a witness πj(z) for ψ(j(z)) exists; we need to show it is unique. Assume f ,
defined on Wj(z) ∪ {j(z)}, is also a witness for ψ(j(z)); in particular, f satisfies f (x) =
{f (y) | y E x}. It is not hard to check that f ↾ (Wz ∪ {z}) is a witness for ψ(z), so by uniqueness of πz as
a witness for ψ(z), f ↾ (Wz ∪ {z}) = πz . By this observation, we have
f (j(z)) = {f (y) | y E j(z)} = {πz (y) | y E j(z)} = {πj(z) (y) | y E j(z)} = πj(z) (j(z)).
Hence f = πj(z) and we have established uniqueness. It follows that j(z) ∈ B.
We have shown B is j-inductive, and so W = B. We now define the Mostowski Collapsing
map π on W as follows: For each x ∈ W ,
π(x) = πx (x).
Claim. For all y ∈ W , if x E y, then πx (x) = πy (x). (+)
Proof. Let B = {y ∈ W | if x E y then πx (x) = πy (x)}. We show B is j-inductive. Vacuously,
a ∈ B. Suppose x ∈ B; we show j(x) ∈ B. Let y be such that j(x) E y. Then, using the fact
(twice) that x ∈ B, we have
πy (x) = {πy (u) | u E x} = {πx (u) | u E x} = {πj(x) (u) | u E x} = πj(x) (x).
Therefore j(x) ∈ B. Since B is j-inductive, B = W and the result follows.
We show that π satisfies (∗) for each x ∈ W . Using (+), we have:
π(x) = πx (x) = {πx (y) | y E x} = {π(y) | y E x},
as required.
Finally, we show that π is the unique f satisfying, for all x ∈ W , f (x) = {f (y) | y E x}.
Given any such f , we show f = π. Let
B = {x ∈ W | for all y such that y = x or y E x, f (y) = π(y)}.
We show B is j-inductive. The fact that a ∈ B is immediate; in particular, π(a) = ∅ = f (a).
Assume x ∈ B, so that f (y) = π(y) for all y for which y = x or y E x. Then since y E j(x) implies
y = x or y E x (as we observed earlier),
f (j(x)) = {f (y) | y E j(x)} = {π(y) | y E j(x)} = π(j(x)).
To show j(x) ∈ B, we must also show that for u E j(x), f (u) = π(u), but this follows from the
fact that x ∈ B. We have shown j(x) ∈ B. Therefore B is j-inductive and B = W . It follows
that f = π, as required.
As we establish properties of the collapsing map π, we will denote its range by N . We
establish some important properties of π and N :
Theorem 8 (Properties of π and N ).
(1) N is a transitive set.
(2) π is 1-1.
(3) For all x, y ∈ W , x E y if and only if π(x) ∈ π(y).
(4) (N, ∈) is a well-ordered set. In particular, 0 = ∅ is the ∈-least element of N .
(5) Each n ∈ N is a transitive set. Moreover, n = {m ∈ N | m ∈ n}.
Remark 2. Part (3) of the theorem tells us that the usual membership relation ∈ on the range
N of π exactly parallels the relation E on W : Whenever x is an E-predecessor of y in W , in the
sense of the relation E , the images π(x), π(y) have the same relationship to each other, namely,
that π(x) is an ∈-predecessor of π(y); the converse statement also holds. This property, together
with the fact that π is a bijection from W to N , tells us that π is an order isomorphism from W
to N ; this means that (N, ∈) is an exact reproduction of (W, E). Intuitively, we may think of this
relationship as indicating that the “blueprint” W has been faithfully reproduced as a concrete
(transitive) object in the bottom portion of the universe. As we shall show in a moment, N turns
out to be the concrete set ω of natural numbers.
Part (5) tells us that each n ∈ N consists precisely of the elements of N that precede it in
the well-ordering ∈. Thus, 1 = {0}, 2 = {0, 1}, and so forth.
Proof of (1). Suppose π(x) ∈ N and u ∈ π(x). We must show u ∈ N . Since π(x) = {π(y) |
y E x}, it follows that u = π(y) for some y ∈ W . Thus u ∈ N .
Proof of (2). Suppose π(x) = π(z) but x ≠ z. Recall that Wx ≠ Wz . Without loss of generality,
assume there is u ∈ Wx − Wz , so u E x and ¬(u E z). Then π(u) ∈ π(x). Since π(z) = π(x), then
π(u) ∈ π(z), whence u E z, and this is a contradiction. We have shown π is 1-1.
Proof of (3). This follows immediately from the definition of π.
Proof of (4). It is easy to see that the first part follows from (2) and (3); we verify a few
of the required points. For the irreflexive property, notice that for x ∈ W , x E x if and only if
π(x) ∈ π(x); since the former is false, the latter is also false. Similar reasoning shows (N, ∈) is
antisymmetric, transitive, and a total order. We verify that ∈ well-orders N: Suppose C ⊆ N is
nonempty. Let B denote the preimages of C in W ; that is, b ∈ B if and only if π(b) ∈ C. Let
b0 be the least element of B. Let c0 = π(b0). We show c0 is ∈-least in C: Let π(b) ∈ C with b ≠ b0. Then
b0 E b, and so π(b0) ∈ π(b). Since b was arbitrary, we have shown C has an ∈-least element.
For the second clause, let m ∈ N . In defining π, we have already seen that π(a) = 0. Let
x ∈ W be such that π(x) = m. Since a E x, then by (3), 0 = π(a) ∈ π(x) = m.
Proof of (5). The fact that each n ∈ N is a transitive set follows from the fact that ∈ is
transitive as an order relation: For all m, n, r ∈ N , if m ∈ n and n ∈ r, then m ∈ r. To show
that n = {m ∈ N | m ∈ n}, we perform a computation: Let x ∈ W be such that n = π(x).
n = π(x)
= {π(y) | y E x}
= {π(y) | π(y) ∈ π(x)} (because π is an order isomorphism)
= {m ∈ N | m ∈ n}.
Having defined the Mostowski Collapsing Map π on W , we can now demonstrate how π
“collapses” the blueprint W to the concrete set ω of natural numbers. We begin with the computation of the first few elements of ω. We will make use of Corollary 6 in which a few values of
Wx were computed.
Computation of 0. Here, 0 arises as the collapse of a in W :
π(a) = {π(y) | y E a} = ∅ = 0.
Computation of 1. Here, 1 arises as the collapse of j(a) in W . Recall from Corollary 6 that
Wj(a) = {a}.
π(j(a)) = {π(x) | x E j(a)}
= {π(x) | x ∈ Wj(a)}
= {π(x) | x ∈ {a}}
= {π(a)}
= {0}
= 1.
Computation of 2. Here, 2 arises as the collapse of j(j(a)) in W . Recall from Corollary 6 that
Wj(j(a)) = {a, j(a)}.
π(j(j(a))) = {π(x) | x E j(j(a))}
= {π(x) | x ∈ Wj(j(a))}
= {π(x) | x ∈ {a, j(a)}}
= {π(a), π(j(a))}
= {0, 1}
= 2.
Computation of 3. Here, 3 arises as the collapse of j(j(j(a))) in W . Recall from Corollary 6
that Wj(j(j(a))) = {a, j(a), j(j(a))}.
π(j(j(j(a)))) = {π(x) | x E j(j(j(a)))}
= {π(x) | x ∈ Wj(j(j(a)))}
= {π(x) | x ∈ {a, j(a), j(j(a))}}
= {π(a), π(j(a)), π(j(j(a)))}
= {0, 1, 2}
= 3.
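The computations of 0, 1, 2, and 3 above can be reproduced mechanically on an initial segment of W. The sketch below is illustrative only: it assumes the Dedekind self-map j(n) = 2n + 1 on the nonnegative integers with critical point a = 0, models sets as `frozenset` values, and takes for granted that the E-predecessors of each listed element are exactly the elements listed before it (as Corollary 6 shows for the first four). The function name `collapse_initial_segment` is ours.

```python
def collapse_initial_segment(j, a, count):
    """Return a dict sending each of a, j(a), j(j(a)), ... (count elements)
    to its collapse pi(x) = {pi(y) | y E x}."""
    elements = []
    x = a
    for _ in range(count):
        elements.append(x)
        x = j(x)
    images = []
    for _ in elements:
        # Assumption: the E-predecessors of the current element are exactly
        # the earlier elements, so pi(x) is the set of all earlier images.
        images.append(frozenset(images))
    return dict(zip(elements, images))

def example_j(n):
    """Illustrative Dedekind self-map: 1-1, with 0 outside its range."""
    return 2 * n + 1

pi = collapse_initial_segment(example_j, 0, 4)   # keys: 0, 1, 3, 7

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})
```

Although the elements of W in this model are 0, 1, 3, 7, their collapses are the sets 0, 1, 2, 3: the collapse records only the position of an element in the order, not the element itself.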
These observations suggest that, at least in some sense, π “gives rise” to the natural numbers.
We must be careful not to claim too much here, though, because the numbers 0, 1, 2, 3, . . ., with
their usual definition as sets ∅, {∅}, {∅, {∅}}, . . ., are already present in the universe, even without
the Axiom of Infinity, as part of the way the first stages of the universe are built; these come into
being on the basis of the other axioms of ZFC. What the other axioms do not tell us, however,
is how to form the set of natural numbers, and, in fact, without the Axiom of Infinity, it is not
possible to prove that such a set even exists. Therefore, the real content of our claim that π
“gives rise” to the natural numbers is that it gives rise to the sequence 0, 1, 2, . . . in its entirety,
as a completed set—seen as emerging from the self-referral dynamics of the original Dedekind
self-map j : A → A.
What we wish to do then is to show how to derive from π the successor function s, which,
once defined, specifies the full sequence of natural numbers: 0, s(0), s(s(0)), . . .. Recall that the
set-based version of the successor function has this form: s(x) = x ∪ {x}. For instance,
s(0) = s(∅) = ∅ ∪ {∅} = {∅} = {0} = 1.
What we will see is that the values of the successor function s are obtained simultaneously
from a certain type of interaction between π and j; to say it another way, s turns out to be the
unique map that makes the following diagram commutative:
[Diagram: commutative square. Top: j ↾ W : W → W. Sides: π : W → N. Bottom: s : N → N.]
Notice that, in order for any map s : N → N to make the diagram commutative, it must be
true that s = π ◦ (j ↾ W) ◦ π^{-1}. (Note that this composition makes sense because π is a bijection.)
We can therefore take this to be the definition of s and then verify that s is truly the successor
function on the usual set of natural numbers.
Theorem 9. Define s = π ◦ (j ↾ W) ◦ π^{-1} : N → N. Then, for all n ∈ N, s(n) = n ∪ {n}.
Proof. By the definition of s, the diagram is commutative: s ◦ π = π ◦ (j ↾ W). Let B = {x ∈ W | s(π(x)) =
π(x) ∪ {π(x)}}. We show B is j-inductive, which gives B = W because W is the smallest j-inductive set. This is enough because every n ∈ N is π(x) for some
x ∈ W, so, once B = W, we have s(n) = s(π(x)) = π(x) ∪ {π(x)} = n ∪ {n}.
To prove B is j-inductive, first we show a ∈ B: By commutativity,
s(π(a)) = π(j(a)) = {π(u) | u E j(a)} = {π(a)} = {0} = 0 ∪ {0} = π(a) ∪ {π(a)}.
Next, assume x ∈ B, so s(π(x)) = π(x) ∪ {π(x)}. We show j(x) ∈ B, that is, s(π(j(x))) =
π(j(x)) ∪ {π(j(x))}. But
s(π(j(x))) = π(j(j(x)))
= {π(y) | y E j(j(x))}
= {π(y) | y E j(x) or y = j(x)}
= {π(y) | y E j(x)} ∪ {π(y) | y = j(x)}
= π(j(x)) ∪ {π(j(x))}.
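Theorem 9 can be spot-checked on a finite initial segment of W. The sketch below is illustrative only and rests on assumptions not in the paper: W is modelled by the integers 0, ..., K (with k standing in for the k-th iterate of j applied to a), E is modelled by <, and the names `collapse`, `pi_inv`, and `checks` are hypothetical.

```python
def collapse(chain):
    """pi[x] = {pi[y] | y E x}, where E is "appears earlier in the list"."""
    pi = {}
    for k, x in enumerate(chain):
        pi[x] = frozenset(pi[y] for y in chain[:k])
    return pi


K = 6
W = list(range(K + 1))          # stand-ins for a, j(a), ..., up to K iterates
pi = collapse(W)
pi_inv = {value: x for x, value in pi.items()}   # pi is 1-1 on the segment


def j(x):
    """The restriction of j to the segment: move one step along the chain."""
    return x + 1


def s(n):
    """s = pi o j o pi^{-1}; only defined here for n = pi(x) with x < K."""
    return pi[j(pi_inv[n])]


# Compare s(n) with n ∪ {n} for every n in the segment except the last.
checks = [s(pi[x]) == pi[x] | {pi[x]} for x in W[:-1]]
```

Every entry of `checks` should be true. This is a finite sanity check of the identity s(n) = n ∪ {n}, not a substitute for the j-induction in the proof.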
We can now verify in several theorems below that N and s have the expected properties. We
will say that a set S is inductive if ∅ ∈ S and whenever x ∈ S, we have x ∪ {x} is in S.
Theorem 10. N is an inductive set. Indeed, N = ⋂{I | I is inductive}.
Proof. We have seen already that ∅ ∈ N . Suppose n ∈ N . Then for some x ∈ W , n = π(x).
But now
n ∪ {n} = s(n) = s(π(x)) = π(j(x)) ∈ N.
For the second clause, it is sufficient to show that N ⊆ I for every inductive set I. Let I be
any inductive set. Let B = {x ∈ W | π(x) ∈ I}. We show B is j-inductive. By definition
π(a) = ∅ ∈ I, so a ∈ B. If x ∈ B, then n = π(x) ∈ I. But now
j(x) ∈ B ⇔ π(j(x)) ∈ I ⇔ s(π(x)) ∈ I ⇔ s(n) ∈ I ⇔ n ∪ {n} ∈ I,
and the last of these statements is true by definition of “inductive.” Hence B is j-inductive, and
so for every n ∈ N , n ∈ I, as required.
The property that N has of being an inductive set leads to the Principle of Mathematical
Induction:
Principle of Mathematical Induction. Suppose A ⊆ N has the following two properties:
(1) ∅ ∈ A
(2) whenever n ∈ A, s(n) ∈ A.
Then A = N .
We can now prove the Principle of Mathematical Induction. Before doing so, we state a
weaker version that is often also used, based on formulas. A formula is simply an expression
involving set parameters. Examples include statements like “x is an even natural number,” “x
has at least two elements,” “y = P(x)” (where P is the power set operation, which outputs the
set of all subsets of a set x). We have the following weak principle of induction:
Weak Principle of Mathematical Induction. Suppose φ(x) is a formula (see footnote 20) whose parameter
x represents an element of N . Suppose further that
(1) φ(∅) holds true
(2) whenever φ(n) holds, φ(s(n)) also holds.
Then φ(n) holds for every n ∈ N .
This principle is called “weak” because the Principle of Induction implies it: Given any
formula φ(x), where x represents elements of N , suppose (1) and (2) of the Weak Principle hold.
The Axiom of Separation (a ZFC axiom) implies that the collection A = {n ∈ N | φ(n)} is a
set, and so (1) and (2) of the Principle also hold. Since the Principle of Induction holds, the
conclusion of the Weak Principle now follows. The Weak Principle is useful because it generalizes
to classes—a topic we will take up in Section 11.
Theorem 11. The Principle of Mathematical Induction is correct.
Proof. Let A ⊆ N satisfy properties (1) and (2). Properties (1) and (2) assert that A is in fact
an inductive set. By Theorem 10, N ⊆ A. But since A ⊆ N also, we conclude that A = N.
An alternative proof is obtained by using our earlier observation that (W, E) and (N, ∈) are
order-isomorphic, so that, in particular, N is well-ordered under ∈. Thus, suppose A ⊆ N satisfies (1) and (2). We will use the fact that N is well-ordered to show A = N. Certainly 0 ∈ A.
Assume A ≠ N. Let S = {m ∈ N | m ∉ A}. S ≠ ∅ by assumption. Let n be the ∈-least element
of S, using the well-ordering; we have seen that n ≠ 0. Let y ∈ W with π(y) = n. Let x be the
E-immediate predecessor of y. Let m = π(x). The order isomorphism between (W, E) and (N, ∈)
guarantees that m is the ∈-immediate predecessor of n (so that s(m) = n, and, moreover, m ∈ n,
but for all r ∈ N, it is not true that m ∈ r ∈ n). Now m ∉ S by leastness of n, so m ∈ A, and it
follows that s(m) ∈ A. But this contradicts the fact that n ∉ A.
Notation. A small part of the proof shows that, as we would expect, every nonzero element of
N has a unique immediate predecessor. For n ≠ 0 in N, we denote the immediate predecessor
of n by n − 1. Similarly, we will adopt the usual convention of writing n + 1 for s(n), whenever
n ∈ N. Finally, it will be convenient, when emphasizing the use of ∈ on N as an order relation,
to write < in place of ∈, as is customarily done. Moreover, we shall write m ≤ n if and only if
either m ∈ n or m = n.
Theorem 12. 0 ∉ ran s and s is 1-1. Moreover, N = {0} ∪ ran s.
Footnote 20. Formally, formulas are formed from variables and the membership relation ∈; notions like “natural number”
and “function” are defined in terms of these. Formulas are defined inductively (on the length of the expression). If
x, y are variables, both x = y and x ∈ y are formulas. If φ, ψ are formulas and x is a variable, so are φ ∧ ψ, φ ∨ ψ,
φ → ψ, ¬φ, ∃x φ, and ∀x φ. Formulas can have any finite number of parameters, so, to be more formal, we could write
φ(x, y_1, . . . , y_k), where y_1, . . . , y_k are variables standing for arbitrary sets; for readability, we just display x.
Proof. If s(m) = 0 for some m ∈ N , let x ∈ W be such that π(x) = m. Then 0 = π(a) =
s(π(x)) = π(j(x)), and so, by 1-1-ness of π, a = j(x), which is impossible by Claim 7.
To prove s is 1-1, suppose s(m) = s(n), where m, n ∈ N . Let x, y ∈ W with π(x) =
m, π(y) = n. Then
s(m) = s(n) ⇔ π(j(x)) = π(j(y)) ⇔ j(x) = j(y) ⇔ x = y ⇔ m = n.
Finally, suppose n ≠ 0 is in N. Let m be its unique immediate predecessor, so s(m) = n.
We have shown that ran s = N − {0}. It follows that N = {0} ∪ ran s.
Now we can define the concrete notion “natural number.” A set n is a natural number if
and only if n ∈ N , if and only if n belongs to every inductive set. This concrete definition is the
one that allows us to see each number explicitly rendered as a set:
Theorem 13. For each n ∈ N, if n ≠ 0, then n = {0, 1, 2, . . . , n − 1}.
Proof. By Claim 10(11) and the fact that π is an isomorphism, for every m ∈ n, m ≤ n − 1.
Conversely, if m ≤ n − 1, then either m ∈ n − 1—so by transitivity, m ∈ n—or m = n − 1, in
which case also m ∈ n. Thus n consists precisely of the elements of N that precede n, the largest
of these being n − 1.
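Theorem 13 and the convention of writing < for ∈ can be checked directly on small numerals. The sketch below is an illustration under the assumption that frozensets model hereditarily finite sets; the helper `numeral` and the names `elements_ok` and `order_ok` are hypothetical.

```python
def numeral(n):
    """Build the von Neumann numeral for n by iterating x -> x ∪ {x} from ∅."""
    current = frozenset()
    for _ in range(n):
        current = current | {current}
    return current


LIMIT = 7
nums = [numeral(k) for k in range(LIMIT)]

# Theorem 13: each numeral is exactly the set of its predecessors.
elements_ok = all(nums[n] == frozenset(nums[:n]) for n in range(LIMIT))

# Membership between numerals agrees with the usual strict order.
order_ok = all(
    (nums[m] in nums[n]) == (m < n)
    for m in range(LIMIT)
    for n in range(LIMIT)
)
```

Both flags come out true on this range, which matches the statement n = {0, 1, . . . , n − 1} and the identification of ∈ with < on N.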
In light of Theorem 10, it is clear that N and ω (defined at the beginning of this paper) are
the same set. For the rest of the paper, we will use “ω” as the name of this set.
9 The Self-Map s : ω → ω As the Smallest Among All Dedekind Self-Maps
Starting from an arbitrary Dedekind self-map, we have derived the set ω of natural numbers
together with its usual successor function s : ω → ω, which is itself a Dedekind self-map. Intuitively, one considers the set of natural numbers as the smallest type of infinite set. In the usual
development of ZFC set theory, for example, one proves that the size ℵ_0 of ω is the smallest
infinite cardinal, and one can also show that ω is faithfully embedded in every infinite set; that
is, for any infinite S, there is a 1-1 function f : ω → S. Here, we will show something that is
apparently even stronger: That the successor function is “embedded” in every Dedekind self-map,
in a unique way. This result will have a number of practical consequences that we will mention
at the end of this section.
To begin, we return to the diagram that showed how j ↾ W : W → W is collapsed to the
successor function s : ω → ω:
[Diagram: commutative square. Top: j ↾ W : W → W. Sides: π : W → ω. Bottom: s : ω → ω.]
The Mostowski Collapsing Theorem tells us that π is the unique function defined on W
satisfying π(x) = {π(y) | y E x}. This result allowed us to demonstrate that the set ω and the
successor function s exist (as a function from ω to ω).
We wish now to look at the diagram in a slightly different way. Initially, we viewed the
diagram as starting with the maps j ↾ W and π, which occupied the top and two vertical sides of
the square determined by the diagram:
[Diagram: the square with its bottom edge omitted. Top: j ↾ W : W → W. Sides: π : W → ω.]
The bottom edge was filled in by defining s to be a composition of the other maps (and their
inverses): s = π ◦ (j ↾ W) ◦ π^{-1}.
Having established the existence of ω and s, we can view the diagram in another way. We
begin as before with j ↾ W : W → W and now also take as given the map s : ω → ω, defined now
by the formula s(n) = n ∪ {n} (see footnote 21). For the moment, we leave the two vertical edges unspecified.
[Diagram: the square with only its horizontal edges drawn. Top: j ↾ W : W → W. Bottom: s : ω → ω.]
We now define π on W as before, this time as a candidate to fill in the vertical edges of the
diagram: π(x) = {π(y) | y E x}. As we argued earlier, π is uniquely determined by this relation.
Let A ⊆ ω be defined by
A = {n ∈ ω | for some x ∈ W , π(x) = n}.
Showing that A is inductive will establish that ran π = ω. Since π(a) = 0, as we argued before,
0 ∈ A. If n ∈ A, then let x ∈ W with π(x) = n. We show s(n) ∈ A by showing π(j(x)) = s(n) =
n ∪ {n}. But
π(j(x)) = {π(y) | y E j(x)} = {π(y) | y E x} ∪ {π(x)} = π(x) ∪ {π(x)} = n ∪ {n},
as required.
We wish to show now that π : W → ω is the unique map for which the following two
conditions hold:
Footnote 21. Technically, this amounts to defining s on ω to be s̄ ↾ ω; recall that s̄ is defined on all sets by s̄(x) = x ∪ {x}.
One needs to verify that the range of s̄ ↾ ω is a subset of ω, so that we may write, as usual, s : ω → ω. We do this
quick verification here: Let A ⊆ ω be the set {n ∈ ω | s(n) ∈ ω}. Certainly 0 ∈ A, and if n ∈ A, then s(n) ∈ ω, so
s(s(n)) ∈ ω since ω is inductive, and hence s(n) ∈ A. Therefore, A is inductive, so ω ⊆ A, yielding that A = ω.
(i) π(a) = 0.
(ii) The diagram is commutative; that is,
s ◦ π = π ◦ (j ↾ W). (∗∗)
We showed in Theorem 9 that (∗∗) holds true (we established this using a different definition
of s, but we also showed that for each n, s(n) = n ∪ {n}). We verify uniqueness: Suppose
h : W → ω satisfies (i) and (ii) above; that is, h(a) = 0 and s ◦ h = h ◦ (j ↾ W).
[Diagram: commutative square. Top: j ↾ W : W → W. Sides: h : W → ω. Bottom: s : ω → ω.]
We establish uniqueness by j-induction: Let B ⊆ W be defined by
B = {x ∈ W | h(x) = π(x)}.
Since (i) holds for both functions, a ∈ B. Assume x ∈ B. Then since both functions make the
diagram commutative, we have
π(j(x)) = s(π(x)) = s(h(x)) = h(j(x)),
and so j(x) ∈ B, and B is j-inductive, as required.
We observed earlier that π is an order isomorphism. In the present context, π is another kind
of isomorphism, called a Dedekind self-map isomorphism. We define this notion now. Suppose
g : B → B is a Dedekind self-map with critical point b and h : C → C is a Dedekind self-map
with critical point c. A Dedekind self-map morphism from g to h is a map β : B → C satisfying:
(1) β(b) = c
(2) The diagram is commutative; that is, β ◦ g = h ◦ β.
[Diagram: commutative square. Top: g : B → B. Sides: β : B → C. Bottom: h : C → C.]
Moreover, if β is a bijection, β is a Dedekind self-map isomorphism (see footnote 22). The intuition behind a
Dedekind self-map morphism β from g : B → B to h : C → C is that the structural relationships
given by g are reflected in h: if g takes x to y in B, then h takes β(x) to β(y) in C. Moreover, if
β is also an isomorphism, then the relationships between the two maps are structurally identical:
For all x, y ∈ B, g takes x to y if and only if h takes β(x) to β(y).
Footnote 22. As an interesting sideline, we point out that the concept of a Dedekind self-map isomorphism from j ↾ W to
s : ω → ω is a slight weakening of “E-order isomorphism” to “E′-order isomorphism.” We can explain this point
in the following way. Let g and h be as in the definition of Dedekind self-map morphisms. If E_g is defined on
B by x E_g y if and only if y = g(x), and E_h is defined on C by x E_h y if and only if y = h(x), then β is a Dedekind
self-map isomorphism if and only if β is an (E_g, E_h)-isomorphism (that is, β is a bijection and x E_g y if and only
if β(x) E_h β(y)). In particular, to say that π is a Dedekind self-map isomorphism from j ↾ W to s is the same as
saying that it is an order isomorphism, relative to the relation E′.
We introduce some notation for Dedekind self-map morphisms: Notice that every Dedekind
self-map g : B → B with critical point b specifies three parameters: g, B, and b. We therefore
may describe the triple (B, g, b) as being the Dedekind self-map. Then, we can indicate that β
is a Dedekind self-map morphism from g to h taking b to c by simply saying that β : (B, g, b) →
(C, h, c) is a Dedekind self-map morphism.
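The two morphism conditions, β(b) = c and β ◦ g = h ◦ β, can be tested on a finite sample. The Python sketch below is an illustration only: the function `is_morphism_on` and the particular maps are assumptions chosen for the example (successor on ω with critical point 0, doubling on ω with critical point 1), and a check on finitely many points can refute the morphism property but cannot prove it.

```python
def is_morphism_on(sample, beta, g, b, h, c):
    """Check beta(b) == c and beta(g(x)) == h(beta(x)) for each x in sample."""
    return beta(b) == c and all(beta(g(x)) == h(beta(x)) for x in sample)


def succ(n):
    """g : ω → ω, the successor map, with critical point 0."""
    return n + 1


def double(m):
    """h : ω → ω, doubling, with critical point 1 (1 is not in its range)."""
    return 2 * m


def power_of_two(n):
    """Candidate morphism β from (ω, succ, 0) to (ω, double, 1)."""
    return 2 ** n


sample = range(20)
ok = is_morphism_on(sample, power_of_two, succ, 0, double, 1)

# A map that sends 0 to 1 but does not commute with the two self-maps.
bad = is_morphism_on(sample, lambda n: n + 1, succ, 0, double, 1)
```

Here `ok` is true and `bad` is false: the second candidate satisfies condition (1) but fails the commutativity condition (2) at x = 1.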
Our reasoning in the previous paragraphs has demonstrated the following:
Theorem 14. Let π : W → ω be defined by π(x) = {π(y) | y E x}. Then π is the unique
Dedekind self-map morphism from (W, j ↾ W, a) to (ω, s, 0). Moreover, π is in fact a Dedekind
self-map isomorphism.
Theorem 14 tells us that, structurally, W and ω are identical, with “successor” functions
that behave in exactly the same way; the structures (W, j ↾ W, a) and (ω, s, 0) are identical except
for notational differences.
It is natural to expect that the isomorphism π is invertible, in the sense that π^{-1} is also a Dedekind
self-map isomorphism, and that it is the unique morphism from s to j ↾ W. We verify this now.
Henceforth, we let τ denote π^{-1}.
[Diagram: commutative square. Top: s : ω → ω. Sides: τ : ω → W. Bottom: j ↾ W : W → W.]
Theorem 15. Let τ : ω → W be π^{-1}. Then τ is a bijection and is the unique Dedekind self-map
morphism from (ω, s, 0) to (W, j ↾ W, a).
Proof. Since τ = π^{-1}, τ is a bijection and τ(0) = a. Then, writing j for j ↾ W,
τ ◦ s = j ◦ τ ⇔ s = π ◦ j ◦ τ ⇔ s ◦ π = π ◦ j.
Since π is a Dedekind self-map morphism, the last of these equations holds true, and so the first
one does as well.
For uniqueness, we first observe that if g is a Dedekind self-map morphism from (ω, s, 0) to
(W, j ↾ W, a), then g must be 1-1 and onto:
[Diagram: commutative square. Top: s : ω → ω. Sides: g : ω → W. Bottom: j ↾ W : W → W.]
Let A = {n ∈ ω | g(n) ∉ {g(0), g(1), . . . , g(n − 1)}}. We show A is inductive; this will
establish that g is 1-1. Clearly 0 ∈ A. If n ∈ A and g(s(n)) = g(i) for some i, 0 ≤ i ≤ n,
notice first that i ≠ 0 since we would have in that case
a = g(0) = g(s(n)) = j(g(n)),
which is impossible since for no x ∈ W is it true that x E a. Therefore, i = s(k) for some
k, 0 ≤ k ≤ n − 1, and we have
j(g(n)) = g(s(n)) = g(s(k)) = j(g(k)).
Since j is 1-1, g(n) = g(k), which contradicts the fact that n ∈ A. Therefore, s(n) ∈ A as required.
To see that g is also onto, let B ⊆ W be defined by B = {x ∈ W | for some n ∈ ω, g(n) = x}.
Clearly a ∈ B. If x ∈ B, let n ∈ ω with g(n) = x. We show j(x) ∈ B. But
j(x) = j(g(n)) = g(s(n)),
as required. Since B is j-inductive, B = W , and g is onto.
To complete the proof, we must show that τ = g. But notice now that g^{-1} makes the
following diagram commutative:
[Diagram: commutative square. Top: j ↾ W : W → W. Sides: g^{-1} : W → ω. Bottom: s : ω → ω.]
By uniqueness of π, g^{-1} = π, and so g = (g^{-1})^{-1} = π^{-1} = τ.
We can now use Theorem 15 to demonstrate several facts about W and the relation E that
we have not been able to establish until now. First, we give some definitions. A set X is finite if
there is n ∈ ω for which there is a bijection from n to X. Let i = j ↾ W : W → W be defined as
above. For each n > 1, let i^n be the n-fold composition of i with itself: i^n = i ◦ i ◦ · · · ◦ i, where
the right-hand side has n factors. We let i^0 = id_W and i^1 = i (see footnote 23).
[Diagram: commutative square. Top: s : ω → ω. Sides: τ : ω → W. Bottom: i : W → W.]
Theorem 16.
(1) For each n ∈ ω, i^n : W → W and i^{n+1} = i ◦ i^n. Moreover, τ(n) = i^n(a) = j^n(a).
(2) W = {a, j(a), j^2(a), . . .} = {a, i(a), i^2(a), . . .} = {i^n(a) | n ∈ ω}.
(3) Suppose F joins x to y in W. Then F is finite.
Footnote 23. At this point, we do not yet claim that the values i^0, i^1, i^2, . . . form an actual set, but we will use a result like
Theorem 15, to be described shortly, that will allow us to make (and prove) such a claim.
Proof of (1). Define A by
A = {n ∈ ω | i^{n+1} = i ◦ i^n and ran i^{n+1} ⊆ W}.
To prove the first half of (1), it suffices to show that A = ω. Since i^0 = id_W, 0 ∈ A. Assume
n ∈ A. Both i^{n+1} and i ◦ i^n are (n + 1)-fold compositions of i with itself, so they are equal.
Also, if x ∈ W, i^{n+1}(x) = i(i^n(x)). Since i^n(x) ∈ W (by the induction hypothesis), it follows
that i^{n+1}(x) ∈ W as well. For the “moreover” clause, we perform a separate induction. Let
B = {n ∈ ω | τ(n) = i^n(a) = j^n(a)}. It is clear that 0 ∈ B. Assuming n ∈ B, we have
τ(n + 1) = τ(s(n)) = i(τ(n)) = i(i^n(a)) = i^{n+1}(a),
and similarly if we replace i by j.
Proof of (2). By (1), for all n ∈ ω, i^n(a) ∈ W. This shows that W contains all the terms i^n(a).
We show that these are the only elements of W. Let B ⊆ W be defined by
B = {x ∈ W | for some n ∈ ω, x = i^n(a)}.
Notice a ∈ B since a = i^0(a). Assume x ∈ B, so x = i^n(a) for some n ∈ ω. Then
i(x) = i(i^n(a)) = i^{n+1}(a).
Therefore, i(x) ∈ B, so B is j-inductive and B = W. Therefore, every element of W is one of the terms i^n(a). This
completes the proof of (2).
Proof of (3). Define E_n on W by x E_n y if there are x_1, . . . , x_n ∈ W so that x = x_1, y = x_n,
and for each k, 1 ≤ k ≤ n − 1, x_{k+1} = j(x_k). We show that whenever x E y, there is n ∈ ω so
that x E_n y. Recall that E is a well-ordering of W, isomorphic under π to ω. Given x, y such
that x E y, let m, n be such that π(x) = m and π(y) = n, and let F join x to y. Then the chain
m < m + 1 < . . . < n has ℓ = n − m + 1 terms. It follows that x E_ℓ y, as required.
We show that F consists of ℓ elements, F = {y_1, . . . , y_ℓ}, where, for each k, 1 ≤ k < ℓ, j(y_k) =
y_{k+1}. Let Y = {y_1, . . . , y_ℓ}. A simple induction shows each element of Y is in F. Suppose some
element of F is not in Y. Let y be the E-least such. The E-least element of F must be y_1.
Therefore, by properties of F, there is some z ∈ F such that y = j(z). Since z E y, z = y_k for
some k. But now y = j(y_k) = y_{k+1}, contradicting the assumption that y is not in Y. We have
shown F = Y = {y_1, . . . , y_ℓ} as required.
We can now show the special relationship the successor function has with all other Dedekind
self-maps.
Theorem 17.
(1) Suppose j : A → A is a Dedekind self-map with critical point a. Then there is a unique
Dedekind self-map morphism τ from (ω, s, 0) to (A, j, a).
[Diagram: commutative square. Top: s : ω → ω. Sides: τ : ω → A. Bottom: j : A → A.]
(2) Suppose j : A → A is a Dedekind self-map with critical point a and suppose k : U → U is any
Dedekind self-map, with critical point u, such that (U, k, u) is Dedekind self-map isomorphic
to (ω, s, 0). Then there is also in this case a unique Dedekind self-map morphism σ from
(U, k, u) to (A, j, a).
[Diagram: commutative square. Top: k : U → U. Sides: σ : U → A. Bottom: j : A → A.]
Proof. We have already completed the main steps of the proof; we review these now. We obtain
W ⊆ A as the smallest j-inductive set. We have seen that π : W → ω, as defined earlier, is
the unique map such that π(a) = 0 and π ◦ (j ↾ W) = s ◦ π, and that π is a bijection. It follows, as we have
shown, that if τ = π^{-1}, then τ is a bijection and the unique Dedekind self-map morphism from (ω, s, 0)
to (W, j ↾ W, a). The existence of a morphism into (A, j, a) requires one small additional step. Consider the following
diagram, in which i = j ↾ W and inc_{W,A} : W → A is the inclusion map:
[Diagram: two stacked commutative squares. Top row: s : ω → ω. Upper vertical maps: τ : ω → W. Middle row: i : W → W. Lower vertical maps: inc_{W,A} : W → A. Bottom row: j : A → A.]
We now take the required morphism from ω to A to be the composite inc_{W,A} ◦ τ, which we continue to denote by τ.
Clearly inc_{W,A}(τ(0)) = a and
j ◦ inc_{W,A} ◦ τ = inc_{W,A} ◦ i ◦ τ = inc_{W,A} ◦ τ ◦ s, (+)
so the diagram is commutative. It follows that, for the composite,
j ◦ τ = τ ◦ s,
as required.
For uniqueness, suppose h : ω → A satisfies the same conditions: h(0) = a and h ◦ s = j ◦ h.
We show h = τ by proving by induction that h(n) = τ (n) for all n ∈ ω. Certainly h(0) = a = τ (0)
by assumption. Assuming h(n) = τ (n) we show h(s(n)) = τ (s(n)). But
h(s(n)) = j(h(n)) = j(τ(n)) = τ (s(n)),
as required. This completes the induction and shows that h = τ .
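Computationally, the existence half of Theorem 17(1) is definition by iteration: τ(0) = a and τ(s(n)) = j(τ(n)). The Python sketch below is a hedged illustration; the helper name `recursion` and the example self-map on strings (append one character, with the empty string as critical point) are assumptions made for the example and do not appear in the paper.

```python
def recursion(j, a):
    """Return tau with tau(0) = a and tau(n + 1) = j(tau(n))."""
    def tau(n):
        value = a
        for _ in range(n):
            value = j(value)
        return value
    return tau


def append_x(word):
    """A 1-1 self-map on finite strings; "" is not in its range."""
    return word + "x"


tau = recursion(append_x, "")

# Spot-check the commutativity condition tau(s(n)) = j(tau(n)).
commutes = all(tau(n + 1) == append_x(tau(n)) for n in range(10))
```

With these choices `tau(3)` is the string of three x's. Uniqueness corresponds to the observation that any other h with h(0) = a and h ◦ s = j ◦ h is forced, step by step, to take the same values.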
Proof of (2). The proof is similar to the one given for (1). Let β be a Dedekind self-map
isomorphism from (U, k, u) to (ω, s, 0) (we have already seen that if there is such an isomorphism
from s to k, the inverse is also such a map, now from k to s). Let τ be the Dedekind self-map
morphism from (ω, s, 0) to (A, j, a), as defined in (1).
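[Diagram: two stacked commutative squares. Top row: k : U → U. Upper vertical maps: β : U → ω. Middle row: s : ω → ω. Lower vertical maps: τ : ω → A. Bottom row: j : A → A.]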
Define σ from k to j by σ = τ ◦ β. The proof that σ(u) = a and j ◦ σ = σ ◦ k is essentially
identical to (+). The proof of uniqueness is also essentially the same as for (1).
Theorem 17(1) allows us to perform inductive definitions of sequences, indexed by the elements of ω, in the usual way (see footnote 24). For instance, suppose we wish to formally define the sequence
of powers of 2: 2^0, 2^1, 2^2, . . . = 1, 2, 4, . . .. Theorem 17(1) guarantees that a single function or
sequence exists that will produce precisely these values.
[Diagram: commutative square. Top: s : ω → ω. Sides: τ : ω → A. Bottom: j : A → A.]
Footnote 24. Some familiar examples of inductive definitions actually require a slight variation of Theorem 17(1) in which
j : A → A is replaced by a proper class Dedekind self-map j : V → V; we will introduce this variant later in this
discussion and develop it further in Sections 10–12.
To use the theorem, we need to specify a function j (as in the bottom map of the diagram).
The map j declares, as we move through the sequence, how the next value is computed from the
current value. For this purpose, we define j : ω → ω by j(m) = 2 · m. The critical point of j that
we will use is 1. Now the theorem guarantees there is a unique τ : ω → ω that takes 0 to 1 and
makes the diagram commutative. Now we can see that τ is precisely the exponential function
n ↦ 2^n (this is actually the definition of this exponential function). Notice by commutativity of
the diagram that
τ (1) = τ (s(0)) = j(τ(0)) = j(1) = 2.
and
τ (2) = τ (s(1)) = j(τ(1)) = j(2) = 2 · 2 = 4.
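The same two steps can be continued mechanically. The following Python sketch is illustrative only: it takes j(m) = 2 · m and critical point 1 from the text, while the function names and the list `powers` are hypothetical.

```python
def j(m):
    """The self-map j : ω → ω from the text, j(m) = 2 · m."""
    return 2 * m


def tau(n):
    """tau(0) = 1 and tau(n + 1) = j(tau(n)), computed by iteration."""
    value = 1
    for _ in range(n):
        value = j(value)
    return value


powers = [tau(n) for n in range(6)]
agrees = all(tau(n) == 2 ** n for n in range(20))
```

The list `powers` begins 1, 2, 4, in agreement with τ(1) = 2 and τ(2) = 4 above, and `agrees` compares τ with Python's built-in exponentiation on a finite range.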
A slightly more complicated example is to compute the factorial function, where n! (“n
factorial”) is defined to be the product 1 · 2 · · · · · n (see footnote 25). The first terms of the sequence are
1!, 2!, 3!, 4!, . . . = 1, 2, 6, 24, . . .. Once again, to use the theorem, it is necessary to define a function
j : A → A, as in the bottom row of the diagram. For factorial, we will need to use two
arguments, and our final “factorial” function will be defined to be a sequence (see footnote 26) of pairs f =
⟨(0, 0!), (1, 1!), (2, 2!), (3, 3!), . . .⟩ = ⟨(0, 1), (1, 1), (2, 2), (3, 6), . . .⟩. We define j : ω × ω → ω × ω by
j(m, n) = (m + 1, (m + 1) · n), and we will use as critical point the pair (0, 1).
[Diagram: commutative square. Top: s : ω → ω. Sides: τ : ω → ω × ω. Bottom: j : ω × ω → ω × ω.]
Footnote 25. This function is more elegantly defined using the variant of Theorem 17(1) mentioned in the previous footnote.
Footnote 26. Using a similar and more traditional approach, we can derive the factorial function directly, but the maps in
the diagram change and look a bit more complicated. We give an idea about how it works in this note. What can
be shown is that, given any f : ω × ω → ω and value t_0, there is a unique function t : ω → ω so that t(0) = t_0,
and for all n, t(s(n)) = f(n, t(n)). (A statement of a more general result is given in Theorem 21; the proof is like
the one given for Theorem 15.) To express this in the form of a commutative diagram, one uses the diagonal map
∆ : ω → ω × ω defined by ∆(n) = (n, n), and another map 1_ω × t defined by (1_ω × t)(m, n) = (m, t(n)). Thus,
(1_ω × t)(∆(n)) = (n, t(n)).
[Diagram: commutative square. Top: s : ω → ω. Left: (1_ω × t) ◦ ∆ : ω → ω × ω. Right: t : ω → ω. Bottom: f : ω × ω → ω.]
Now, the diagram commutes if and only if, for every n,
t(s(n)) = f((1_ω × t)(∆(n))) = f(n, t(n)),
as expected. In this example, defining f by f(m, n) = (m + 1) · n and t_0 = 1, the function t is the factorial function,
as one may prove by induction:
t(n + 1) = f(n, t(n)) = (n + 1) · n! = (n + 1)!
Thus, t(n) = n! for every n.
Now there is a unique τ : ω → ω × ω that takes 0 to (0, 1) and makes the diagram commutative.
We can argue by induction to show that τ(n) = (n, n!) for each n: By assumption, τ(0) = (0, 1) =
(0, 0!). Assuming the result for n, we have
τ (s(n)) = j(τ(n)) = j(n, n!) = (n + 1, (n + 1) · n!) = (n + 1, (n + 1)!).
As a final observation, note that in this case the map τ is a function whose range is itself a
function: ran τ = {(n, n!) | n ∈ ω} is precisely the factorial function.
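The pair-valued recursion can be run directly. In the Python sketch below, j and the critical point (0, 1) are taken from the text; representing ran τ as a dictionary, and the names `factorial_graph` and `matches`, are assumptions of this illustration.

```python
import math


def j(pair):
    """The self-map on ω × ω from the text: (m, n) -> (m + 1, (m + 1) · n)."""
    m, n = pair
    return (m + 1, (m + 1) * n)


def tau(n):
    """tau(0) = (0, 1) and tau(n + 1) = j(tau(n))."""
    value = (0, 1)
    for _ in range(n):
        value = j(value)
    return value


# A finite piece of ran tau, read as a function: argument -> value.
factorial_graph = dict(tau(n) for n in range(8))

# Compare with the standard library's factorial on the same range.
matches = all(factorial_graph[n] == math.factorial(n) for n in range(8))
```

The dictionary `factorial_graph` is a finite piece of ran τ, and it agrees with `math.factorial` on the range checked, which illustrates the remark that the range of τ is the factorial function.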
The most useful application of the theorem, for our purposes, is the definition of the first ω
stages of the universe of sets. The stages V_0, V_1, V_2, . . . of the universe V are obtained by taking
repeated power sets, starting with the empty set ∅, where, by definition, the power set P(X)
of a set X is the set whose elements are the subsets of X. To define this sequence formally,
we define a class version of Theorem 17(1). We discuss classes in more detail in Section 11.
For our purposes, a class is essentially a formula (see footnote 27), but we think of it as a collection of objects
defined by the formula. After we have given a definition of the universe V , we will think of
classes as enormous subcollections of V . For now, a class will simply be viewed as a collection
C = {x | φ(x)} for some formula φ (having possibly finitely many other set parameters not
displayed).
An example of a class is B = {{y} | y is a set}. Formally, the formula φ(x) that defines
B states “there is z such that for all u, u ∈ x if and only if u = z”: B = {x | φ(x)}. (Here,
x represents a set of the form {z} and the condition states that the only element of x is z.)
Intuitively, a collection like B spans the universe and so is too big to be a set.
We wish to establish a class version of Theorem 17(1):
Theorem 18. Suppose C is a class, c ∈ C, and j : C → C is 1-1 and has critical point c. Then
there is a unique class map τ : ω → C such that τ (0) = c and the following is commutative:
[Diagram: commutative square. Top: s : ω → ω. Sides: τ : ω → C. Bottom: j : C → C.]
Proof. The proof follows the same steps as the proof of the Mostowski Collapsing Theorem
(p. 24), so we just give an outline. Let φ(x, u) be the formula “u ∈ ω and x is a function with
domain u so that x(0) = c and for all i, 0 ≤ i < u − 1, x(s(i)) = j(x(i)).” We show that A = ω,
where
A = {n ∈ ω | there is a unique function t_n such that φ(t_n, n) holds},
by showing that A is inductive.
[Diagram: square for the partial map t_n. Top: s : ω → ω. Sides: t_n into C. Bottom: j : C → C.]
Footnote 27. See Section 11 for a discussion.
Certainly 0 ∈ A; the unique map t_0 is the empty function. Assume n ∈ A, so that a unique
t_n is defined as in the definition of A. Define t_{n+1} = t_n ∪ {(n, j(t_n(s(n − 2))))}, where s(n − 2) = n − 1
is the last element of the domain of t_n (when n = 0, t_n is empty and we instead set t_1 = {(0, c)}). Note that
j(t_n(s(n − 2))) ∈ C since t_n(s(n − 2)) ∈ C. The pair added to t_n in the definition ensures that
t_{n+1}(s(n − 1)) = j(t_n(s(n − 2))) = j(t_{n+1}(s(n − 2))) = j(t_{n+1}(n − 1)),
and so we have t_{n+1}(0) = c and t_{n+1} ◦ s = j ◦ t_{n+1}. Uniqueness of t_{n+1} is straightforward to
verify since, whenever r is defined on n + 1, r(0) = c, and r ◦ s = j ◦ r, we must have r ↾ n = t_n
(by uniqueness of t_n) and r(n) = j(t_n(s(n − 2))) = t_{n+1}(n).
We have shown A = ω, so for each n we have a uniquely defined t_n as described in the definition of A. We define τ on ω by: τ(n) = t_{n+1}(n). As in the proof of the Mostowski Collapsing
Theorem (in particular, (+)), it follows that τ(n) = t_m(n) for all m ≥ n + 1. Verification that τ
has the required properties follows easily.
Now we define a particular class HF (the hereditarily finite sets) as follows: We place x ∈ HF
if the smallest transitive set that contains x as a subset is finite (see footnote 28). We define j : HF → HF by
j(x) = P(x), where P denotes the power set operator: P(x) is the set of all subsets of x. We
verify that ran j ⊆ HF: Notice that if y is finite and transitive, so is P(y), so if x ∈ HF and y
is the smallest transitive set containing x (which, in particular, must be finite), then, since P(y)
is finite and transitive and P(x) ⊆ P(y), it follows P(x) ∈ HF. Therefore, ran j ⊆ HF. Notice
that ∅ is a critical point for j and that j is 1-1. Therefore, j is a Dedekind self-map on the class
HF (see footnote 29).
Using Theorem 18, we obtain a unique τ : ω → HF satisfying:
(a) τ (0) = ∅
(b) for all n ∈ ω, τ (s(n)) = j(τ (n)).
If we write V_n = τ(n), then these clauses become
(a′) V_0 = ∅
(b′) for all n ∈ ω, V_{n+1} = P(V_n).
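The first few stages are small enough to compute explicitly. The Python sketch below is an illustration under the assumption that frozensets model hereditarily finite sets; `power_set`, `V`, and `sizes` are hypothetical names, and only stages 0 through 4 are built because the sizes grow as iterated powers of 2.

```python
from itertools import chain, combinations


def power_set(x):
    """Return P(x) as a frozenset of frozensets."""
    items = list(x)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(subset) for subset in subsets)


def V(n):
    """V(0) = ∅ and V(n + 1) = P(V(n)), computed by iteration."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage


sizes = [len(V(n)) for n in range(5)]

# Each stage computed here is contained in the next one.
increasing = all(V(n) <= V(n + 1) for n in range(4))
```

The computed sizes are 0, 1, 2, 4, 16, and each stage is a subset of the next, consistent with forming the union of the stages afterwards.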
The Axiom of Replacement guarantees that we have a sequence (see footnote 30) ⟨V_0, V_1, V_2, . . .⟩. The Union
Axiom allows us now to form the union, which we denote V_ω:
V_ω = ⋃_{n∈ω} V_n. (++)
Footnote 28. Recall that the axiom Trans (which asserts that every set is included in a transitive set) has been included in
ZFC; see the footnote on p. 6. The “smallest” such transitive set is found by forming the intersection of them all.
Footnote 29. Reasoning of this kind leads to a proof from the theory ZFC − Infinity + ¬Infinity that V = HF: Given any
set x, let y be the smallest transitive set that contains x. Since ¬Infinity holds, y is finite, and so x ∈ HF.
Footnote 30. In other words, the sequence is a set, not a proper class.
It can be shown that, in fact, V_ω = HF (see footnote 31). We complete the construction of the stages of V
in Section 11. In the absence of the Axiom of Infinity (in particular, when we assume that no
infinite set exists), HF is still defined in the same way, but in that context, it is a class that is not
a set. One may also use a variation of Theorem 18 to define the class sequence ⟨V_0, V_1, V_2, . . .⟩,
exactly as above; in this case, every “set” belongs to one of these stages.
Next, we continue with a converse for Theorem 17(2):
Corollary 19. Suppose k : U → U is a Dedekind self-map with critical point u. Suppose
(U, k, u) has the property that for every Dedekind self-map j : A → A with critical point a, there
is a unique Dedekind self-map morphism σ from (U, k, u) to (A, j, a). Then there is a Dedekind
self-map isomorphism from (ω, s, 0) to (U, k, u).
Remark 3. Corollary 19 is the converse to Theorem 17(2). These two results tell us that the
structure of the Dedekind self-map (ω, s, 0)—and therefore the fundamental structure of the set of
natural numbers—is completely characterized by the universal property stated in the hypothesis
of Corollary 19.
Proof. The idea here is that both (ω, s, 0) and (U, k, u) have the universal property described in
the hypothesis of Corollary 19. The following diagram captures the relationships involved:
[Diagram: stacked commutative squares. Rows, from top to bottom: k : U → U, s : ω → ω, k : U → U, s : ω → ω. Vertical maps, from top to bottom: β : U → ω, γ : ω → U, β : U → ω.]
Footnote 31. We prove this fact here. First we prove the following claim:
Claim. If X is finite and X ⊆ V_ω, then X ∈ V_ω.
Proof of Claim. Let X = {x_1, x_2, . . . , x_k} and let n_1, n_2, . . . , n_k be such that for 1 ≤ i ≤ k, x_i ∈ V_{n_i}. Let
n = max{n_1, . . . , n_k}. Then X ⊆ V_n, so X ∈ V_{n+1}, and hence, X ∈ V_ω.
To prove that V_ω = HF, it is enough to prove HF ⊆ V_ω. To prove this, we will make use of the Foundation
Axiom, which states that every nonempty set in the universe has an ∈-minimal element; that is, for every nonempty set x, there is a
z ∈ x such that no element of x belongs to z.
Given X ∈ HF, let Y ⊇ X be a finite, transitive set containing X; certainly, Y ∈ HF. Let Z = Y − V_ω. We
wish to show that Z = ∅; then Y ⊆ V_ω, so the finite set X satisfies X ⊆ V_ω, and by the Claim we can conclude
X ∈ V_ω, and so HF ⊆ V_ω, as required. To prove
Z = ∅, we give an indirect argument, assuming Z ≠ ∅. Let u ∈ Z be ∈-minimal in Z. Since u ∈ Y − V_ω, u ≠ ∅;
let v ∈ u. Since Y is transitive, v ∈ Y. Since u is ∈-minimal, v ∉ Z, so v ∈ V_ω. Since v was arbitrary, we have
shown u ⊆ V_ω. Since u ⊆ Y is finite, the Claim gives u ∈ V_ω, but this contradicts the original choice of u. Therefore, the
assumption that Z is nonempty is incorrect; we have shown Z = ∅, as required.
We let β : (U, k, u) → (ω, s, 0) be the unique Dedekind self-map morphism guaranteed to exist because of
the hypothesis, and we let γ : (ω, s, 0) → (U, k, u) be the unique Dedekind self-map morphism guaranteed to
exist by Theorem 17(1). Notice γ ◦ β takes u to u and makes the following diagram commutative:
[Diagram: commutative square. Top: k : U → U. Sides: γ ◦ β : U → U. Bottom: k : U → U.]
but id_U : U → U does the same: id_U(u) = u and the following is commutative:
[Diagram: commutative square. Top: k : U → U. Sides: id_U : U → U. Bottom: k : U → U.]
By the uniqueness guaranteed by the hypothesis of Corollary 19,
id_U = γ ◦ β. (#)
Likewise, we have β(γ(0)) = 0 and the following diagram is commutative:
s
ω
-
β◦γ
β◦γ
?
s
ω
ω
-
?
ω
but idω : ω → ω does the same: idω (0) = 0 and the following is commutative:
           s
      ω ------> ω
      |         |
  idω |         | idω
      v         v
      ω ------> ω
           s
Therefore, we also have
idω = β ◦ γ.        (##)
It is straightforward to verify that (#) and (##) together imply that both β and γ are bijections.
Hence, in particular, γ : (ω, s, 0) → (U, k, u) is a Dedekind self-map isomorphism.
The theorem tells us that the Dedekind self-map (ω, s, 0) is “less than or equal to” all other
Dedekind self-maps, in the same way that 0 is less than or equal to every natural number. This
idea can be made precise with the concept of a category. A category is a pair (O, M), where O
is a collection of objects and M is a collection of morphisms, satisfying the following:
(1) Each morphism f ∈ M has a domain and a codomain, written dom f, cod f, respectively,
both belonging to O; in the familiar way, if A = dom f and B = cod f , we write f : A → B.
(2) Morphisms can be composed: If f : A → B and g : B → C both belong to M, then there
is another morphism g ◦ f : A → C that also belongs to M; moreover, composition is
associative: h ◦ (g ◦ f ) = (h ◦ g) ◦ f .
(3) For each object A ∈ O, there is an identity morphism 1A : A → A, which has the following
two properties: For all f : X → Y in M, 1Y ◦ f = f and f ◦ 1X = f . We sometimes denote
1A by idA .
A simple example of a category is Set, which has all sets as its objects, and all functions
between sets as its morphisms. If we denote the collection of all functions defined on a set in
V by V^{<V} , then32 Set = (V, V^{<V} ∩ V ). Another very different example is the category Nat,
whose objects are the natural numbers 0, 1, 2, . . . and whose morphisms are pairs (m, n) of natural
numbers for which m ≤ n. If S = {(m, n) ∈ ω × ω | m ≤ n}, then Nat = (ω, S). Notice that
in this case, “morphisms” are nothing like the usual notion of functions, but they do satisfy the
requirements mentioned in the definition of categories.33 Finally, a category of primary interest
to us in this paper is SelfMap, whose objects are Dedekind self-maps and whose morphisms are
Dedekind self-map morphisms.
Checking that the requirements for a category have been met for each of our examples is
straightforward, and we shall assume that this verification has been done.
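As a small illustration of the definition, the category Nat can be sketched in Python. This is only a sketch: the function names (is_morphism, dom, cod, compose, identity) are hypothetical and are not part of the paper; a morphism is represented, as in the text, by a pair (m, n) with m ≤ n.

```python
def is_morphism(f):
    """A morphism of Nat is a pair (m, n) of natural numbers with m <= n."""
    m, n = f
    return 0 <= m <= n

def dom(f):
    return f[0]

def cod(f):
    return f[1]

def compose(g, f):
    """Return g ∘ f for f : A -> B and g : B -> C (requires cod f == dom g)."""
    if cod(f) != dom(g):
        raise ValueError("morphisms are not composable")
    return (dom(f), cod(g))

def identity(a):
    """The identity morphism 1_a is the pair (a, a)."""
    return (a, a)

f, g, h = (1, 3), (3, 4), (4, 9)
print(compose(g, f))  # (1, 4)
```

Composition simply forgets the middle object, which is why associativity and the identity laws hold here even though the morphisms are not functions.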
A notion that we have already defined in particular cases, but which is best formulated in
the language of category theory, is isomorphism. In any category C = (O, M), if A, B ∈ O, an
isomorphism from A to B is a morphism f ∈ M with the property that there is another g ∈ M,
g : B → A, with f ◦ g = idB and g ◦ f = idA . In Set, the isomorphisms are the bijections. In Nat,
two objects are isomorphic if and only if they are equal. And in SelfMap, the isomorphisms are
precisely the Dedekind self-map isomorphisms.
The concept we wish to introduce at this point is that of an initial object. An initial object
in a category is the “smallest” object of the category. We give the definition: An object I in a
category C = (O, M) is initial if there is exactly one morphism from I to each object X in O.
32
The curious reader may wonder why the collection of morphisms for Set is V^{<V} ∩ V rather than simply V^{<V} .
The reason is that it is possible to devise a function f defined on some set X ∈ V that is not itself a member of
V . We will see an example at the end of this paper. Such functions are necessarily not definable in the universe,
nor derivable from the axioms of set theory, and hence do not properly belong to Set, which must be viewed as
a (definable) class. Moreover, the range of such functions is not itself a set, so it does not make sense to include
them as morphisms. Nevertheless, such undefinable functions play an important role in the theory of sets (but not
so much in the category of sets); see the footnote on p. 147 for an example.
33
It is helpful to think of a morphism f : A → B in a category as a directed edge in a directed graph. One could
have directed graphs in which such edges are indeed functions, and other graphs in which they are not. It can be
shown that any category is a directed graph with additional properties.
In Set, ∅ is initial because there is just one map (the empty map) from ∅ to any other set. In
Nat, 0 is initial because 0 ≤ m for all m ∈ ω. Now, with this new notion in hand, it is easy to
see that (ω, s, 0) is an initial object for SelfMap. In fact, Theorem 17 and Corollary 19 can be
summarized as follows, using this new concept:
Theorem 20.
(1) (ω, s, 0) is initial in the category SelfMap.
(2) Any object (U, k, u) of SelfMap that is initial is Dedekind self-map isomorphic (hence
SelfMap-isomorphic) to (ω, s, 0).
(3) Any object (U, k, u) of SelfMap that is Dedekind self-map isomorphic (hence SelfMap-isomorphic) to (ω, s, 0) is initial.
Proof. This follows from Theorem 17 and Corollary 19.
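The initiality of (ω, s, 0) can be illustrated computationally. The sketch below assumes a hypothetical Dedekind self-map k(x) = 2x + 1 on the natural numbers with critical point u = 0 (these choices are ours, not the paper's) and computes the unique morphism γ : (ω, s, 0) → (U, k, u), which must be γ(n) = k^n(u).

```python
def s(n):
    """Successor function on the natural numbers."""
    return n + 1

def k(x):
    # Hypothetical Dedekind self-map on the natural numbers:
    # it is 1-1, and 0 is not in its range.
    return 2 * x + 1

u = 0  # critical point of k

def gamma(n):
    """gamma(n) = k^n(u): apply k to u exactly n times."""
    x = u
    for _ in range(n):
        x = k(x)
    return x

print([gamma(n) for n in range(6)])  # [0, 1, 3, 7, 15, 31]
```

The two defining conditions of a morphism, γ(0) = u and γ ◦ s = k ◦ γ, force the value of γ at every n, which is the content of uniqueness; the code can only check these conditions on finitely many inputs.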
Our reasoning so far has shown that, given a Dedekind self-map j : A → A with critical
point a, if we form the smallest j-inductive set W , it turns out that W = {j^n (a) | n ∈ ω}.
We have also just seen that (W, j ↾ W, a) is initial, a structural duplicate of (ω, s, 0). We will
call W , defined as the smallest j-inductive set, the initial object generated by j. We have
also shown that τ = π^{−1} : ω → W is a map that lists the elements of W : For all n ∈ ω,
τ (n) = j^n (a) = j(j(. . . (a) . . .)) (where the right-hand side of the expression consists of n
applications of j to a). We will call τ the canonical enumeration of W . Recall that τ is itself a
Dedekind self-map isomorphism from (ω, s, 0) to (W, j ↾ W, a).
We can now verify that, whenever j : A → A is a Dedekind self-map with critical point a, the
sequence of maps idA , j, j ◦ j, . . . forms a set that is, like W = {a, j(a), j(j(a)), . . .}, a structure
isomorphic to ω. Moreover, this sequence is generated by a Dedekind self-map J_j : A^A → A^A
defined by J_j (h) = j ◦ h, with critical point idA .
Theorem 21.
(1) The map J_j : A^A → A^A is a Dedekind self-map with critical point idA .
(2) There is a set W whose elements are precisely idA , j, j ◦ j, . . .. Indeed, the initial object
generated by J_j has this property, and if τ denotes the canonical enumeration of W , then
for each n ∈ ω, τ (n) = j^n .
Proof of (1). The only point that needs to be checked is that J_j is 1-1. Suppose f, g ∈ A^A and
j ◦ f = j ◦ g. Then f = g because j is 1-1.
Proof of (2). Let W denote the initial object generated by J_j and let τ : ω → W be its canonical
enumeration. We show by induction that for all n ∈ ω, τ (n) = j^n .
           s
      ω ------> ω
      |         |
    τ |         | τ
      v         v
     A^A -----> A^A
          J_j
Let B ⊆ ω be the set {n ∈ ω | τ (n) = j^n }. Since idA is the critical point of J_j , τ (0) = idA = j^0 ,
so 0 ∈ B. Suppose n ∈ B, so τ (n) = j^n . Then τ (s(n)) = J_j (τ (n)) = J_j (j^n ) = j ◦ j^n = j^{n+1} .
Thus, s(n) ∈ B, and so the claim is proved. As was proved in Theorem 16(2),
W = {idA , J_j (idA ), J_j (J_j (idA )), . . .} = {idA , j, j ◦ j, j ◦ j ◦ j, . . .}.
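The induction in the proof of (2) can be mirrored in a few lines of Python. This is a minimal sketch under stated assumptions: j(x) = x + 3 on the natural numbers is a hypothetical Dedekind self-map chosen for illustration, and the helper names (J, tau, iterate) are ours.

```python
def j(x):
    # Hypothetical Dedekind self-map on the natural numbers (1-1, 0 not in its range).
    return x + 3

def J(j):
    """Return the functional J_j, which sends h to j ∘ h."""
    return lambda h: (lambda x: j(h(x)))

def identity(x):
    return x

def tau(j, n):
    """tau(n): apply J_j to id_A exactly n times."""
    h = identity
    Jj = J(j)
    for _ in range(n):
        h = Jj(h)
    return h

def iterate(j, n, x):
    """Compute j^n(x) directly by applying j to x exactly n times."""
    for _ in range(n):
        x = j(x)
    return x
```

Comparing tau(j, n) with the directly computed iterate j^n on sample inputs checks the claim τ(n) = j^n pointwise; it is a finite spot check, not a proof.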
Our results in this section lead to important characterizations of the notions of “infinite set”
and “set of natural numbers.” We began our discussion with an effort to show that the Axiom of
Infinity could be formulated as the assertion that a Dedekind self-map exists. We showed that,
without reliance on the natural numbers (defined in the usual way), the set ω of natural numbers
could be derived, using just a Dedekind self-map in conjunction with the other axioms of ZFC.
These efforts led not only to a derivation of ω but also a characterization of those Dedekind
self-maps that are in every respect, up to notational differences, equivalent to ω together with its
successor function.
Speaking philosophically, we have given evidence for the following two conclusions:
(1) The “underlying reality” of infinite sets is the notion of a Dedekind self-map.
(2) The “underlying reality” of the set of natural numbers is the notion of an initial Dedekind
self-map.
In particular, every Dedekind self-map j : A → A with critical point a generates a blueprint
for the set of natural numbers, in the form W = {a, j(a), j(j(a)), . . .} ⊆ A; indeed, j ↾ W : W →
W is itself an initial Dedekind self-map, isomorphic to s : ω → ω and naturally embedded in
every other Dedekind self-map.
We consider next another equivalent form of Dedekind self-map that reveals another characteristic underlying the notion of the infinite.
10 Co-Dedekind Self-Maps, Dedekind Maps, and co-Dedekind Maps
In our effort to capture a deeper meaning of the notion of the infinite in our Axiom of Infinity, we
isolated the concept of a Dedekind self-map. This notion gives mathematical expression to one
end of a polarity that characterizes the Infinite, according to the ancient wisdom, namely, the
dynamics of expansion to the infinite from a singularity. We have seen that this expansion takes
place, for a given Dedekind self-map j : A → A with critical point a, by repeated applications of
j to its critical point, producing the infinite sequence a, j(a), j(j(a)), . . ..
The other half of this polarity is, according to the ancient view, the dynamics of collapse, by
which the vast diversity returns to the point value from which it originates. In a rather natural
way, these dynamics are expressed mathematically using the dual notion of a co-Dedekind self-map.
Let us say that an onto self-map h : A → A is co-Dedekind if some preimage h^{−1}(a), for
some a ∈ A, has two or more elements. Whenever a ∈ A is such that |h^{−1}(a)| ≥ 2, any element
of h^{−1}(a) will be called a co-critical point of h.
Recall that, whenever h : A → A is onto, there is, by the Axiom of Choice, a function
s : A → A that selects, for each a ∈ A, exactly one element s(a) of the preimage h^{−1}(a); such
an s is called a section of h. Clearly, any section of an onto map must be 1-1. Moreover, we have
the following:
Theorem 22. (ZFC-Infinity). The following are equivalent:
(1) There is a Dedekind self-map.
(2) There is a co-Dedekind self-map.
Proof. Suppose there is a Dedekind self-map j : A → A with critical point a. Let a0 ∈ ran j.
Define h : A → A by
h(x) = a0 if x ∉ ran j,
h(x) = y otherwise, where y ∈ A is the unique element such that j(y) = x.
Under the definition, we have that h(a) = a0 since a is a critical point of j. It is obvious that
h : A → A is onto since even h ↾ ran j is onto. It follows that some b = j(x) ∈ ran j is also mapped
by h to a0 , since h ↾ ran j is onto. Certainly b ≠ a since a ∉ ran j. Therefore, |h^{−1}(a0 )| ≥ 2, so h
is co-Dedekind.
Conversely, suppose h : A → A is co-Dedekind with co-critical point a and suppose s : A → A
is a section of h. We show s itself is Dedekind. We have already observed that s is 1-1. Let x ∈ A
be such that |h^{−1}(x)| ≥ 2, and let u ≠ v ∈ A be elements of h^{−1}(x). Then one of u, v does not
belong to the range of s; in particular, one of u, v is a critical point of s.
The argument shows that any section s of a co-Dedekind self-map h : A → A is itself a
Dedekind self-map; moreover at least one co-critical point of h is a critical point of s.
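Both directions of the proof can be seen in a concrete case. The sketch below assumes the hypothetical Dedekind self-map j(x) = x + 1 on the natural numbers, with critical point a = 0 and the choice a0 = 1 ∈ ran j; the names h and section are ours.

```python
def j(x):
    # Hypothetical Dedekind self-map on the natural numbers; critical point a = 0.
    return x + 1

a0 = 1  # a chosen element of ran j, as in the proof

def h(x):
    """The co-Dedekind self-map built from j as in the proof."""
    if x == 0:           # x is not in ran j
        return a0
    return x - 1         # the unique y with j(y) = x

def section(y):
    """One section of h: pick j(y), which lies in the preimage of y under h."""
    return j(y)
```

Here both 0 and 2 are sent by h to a0 = 1, so the preimage of a0 has at least two elements, and the section misses the co-critical point 0, which is therefore a critical point of the section.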
We introduce a somewhat specialized context to bring to light the “collapsing” effect of a
co-Dedekind self-map. Let us say that a set A is closed under singletons if, for all x ∈ A, we
have {x} ∈ A; more generally, A is closed under pairs if, whenever x, y ∈ A, {x, y} ∈ A. It turns
out that, in studying co-Dedekind self-maps A → A, there is nothing lost if we assume A is a
transitive set closed under pairs:
Proposition 23. There is a co-Dedekind self-map on a set if and only if there is a co-Dedekind
self-map on a transitive set that is closed under pairs. Moreover, any set that is closed under
pairs admits a co-Dedekind self-map.
Proof. Suppose h : A → A is a co-Dedekind self-map on A. Let t : A → A be a section of h that
is a Dedekind self-map with critical point a. Then the successor s : ω → ω is, by Theorem 17(1),
uniquely embedded in t. Let S : V → V be defined by S(x) = {x}. By Theorem 18, there is a
unique τ̄ : ω → V such that τ̄ (0) = a and S ◦ τ̄ = τ̄ ◦ s. In particular, there is a set W whose
elements are precisely a, {a}, {{a}}, . . .; that is,
W = {a, {a}, {{a}}, . . .},
and S ↾ W : W → W is initial. In particular, (ω, s, 0) and (W, S ↾ W, a) are Dedekind self-map
isomorphic.
Claim. There is a set B that is both transitive and closed under pairs that admits a Dedekind
self-map. Indeed, there is such a set B that includes A.
Proof of Claim. We first observe that for any set C, we can obtain a set T (C) ⊇ C that is
closed under pairs: Define C = C0 ⊆ C1 ⊆ · · · as follows: Let C0′ = [C0 ]^2 (where, for any set
D, [D]^2 denotes the set of all unordered pairs from D), and let C1 = C0 ∪ C0′ . In general,34
Cn+1 = Cn ∪ Cn′ where Cn′ = [Cn ]^2 . Let T (C) = ⋃_{n∈ω} Cn . If u, v ∈ T (C), then for some n, u, v ∈ Cn , and
so {u, v} ∈ Cn+1 ⊆ T (C).
Now we build a set B as the union of the following chain:
A = A0 ⊆ B0 ⊆ A1 ⊆ B1 ⊆ · · · ,
where, for each i ∈ ω, Bi = T (Ai ) and Ai+1 is a transitive set that contains Bi . If u, v ∈ B, then
u, v ∈ Bi for some i, and so, because Bi = T (Ai ), {u, v} ∈ Bi ⊆ B, and so B is closed under
pairs. Also, if u ∈ v ∈ B, then v ∈ Ai for some i > 0, and so, by transitivity of Ai , u ∈ Ai ⊆ B;
this shows B is transitive, as required.
We obtain a Dedekind self-map t̂ : B → B as follows:
t̂(b) = b if b ∉ A,
t̂(b) = t(b) if b ∈ A.
It is easy to see that t̂ is a Dedekind self-map.
To complete the proof of the Proposition, we simply recall that by the proof of Theorem 22,
whenever there is a Dedekind self-map B → B, there is also a co-Dedekind self-map B → B.
Finally, for the “moreover” clause, notice that if B is closed under pairs, it is closed under
singletons. If x ∈ B, then W = {x, {x}, {{x}}, . . .} is a copy of ω in B, and so B is infinite, and
therefore must admit a co-Dedekind self-map (by the first half of the Proposition).
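The finite stages Cn of the closure T (C) used in the proof of the Claim can be computed directly for small hereditarily finite sets. The sketch below represents sets as Python frozensets and, as an assumption matching the footnote's definition of j, allows u = v when forming pairs, so that singletons are included; the names pairs and stage are ours.

```python
from itertools import combinations_with_replacement

def pairs(c):
    """All sets {u, v} with u, v in c (u == v allowed, giving singletons)."""
    return {frozenset((u, v)) for u, v in combinations_with_replacement(c, 2)}

def stage(c0, n):
    """Return C_n, where C_0 = c0 and C_{n+1} = C_n ∪ [C_n]^2."""
    c = set(c0)
    for _ in range(n):
        c = c | pairs(c)
    return c

EMPTY = frozenset()
c0 = {EMPTY}
print([len(stage(c0, n)) for n in range(3)])  # [1, 2, 4]
```

Only the finite stages are computable this way; T (C) itself is the union over all n ∈ ω and is infinite whenever C is nonempty.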
Example 1 (Generate/Collapse Duality). Let A be a transitive set that is closed under pairs.
We assume that A is a set “in the universe” in the sense that it is governed by the usual axioms
of set theory. (In particular, no element of A is an element of itself.) Consider the self-map
F : A → A, defined by
F (x) = any ∈-minimal element of x.35
34
Formally, we are using Theorem 18 here. Define j : V → V by j(x) = {{u, v} | u ∈ x and v ∈ x}; j is a
Dedekind self-map. As in the Theorem, there is a unique τ : ω → V satisfying j ◦ τ = τ ◦ s, where s : ω → ω is the
successor function. Then define Cn = τ (n) for each n ∈ ω. This gives us the sequence ⟨C0 , C1 , C2 , . . .⟩ as in the
main text, and one may then form the union, as described there. This formal treatment makes it clear that one
obtains a set T (A) for any set A only if ω already exists.
35
F (∅) is arbitrary. The notion of an ∈-minimal element of a set is defined on p. 17.
Because A is transitive, ran F ⊆ A. We also observe that F is onto: Suppose y ∈ A. Then
{y} ∈ A, and clearly F ({y}) = y. Finally, suppose z ∈ A and consider the sets x = {z} and
y = {z, {z}}. By the Axiom of Foundation, neither z nor {z} is an element of z, so z is the only ∈-minimal element of y, and so
F (y) = z = F (x). Thus, |F^{−1}(z)| > 1. We have shown F is a co-Dedekind self-map.
Let SA = S ↾ A : A → A, where, we recall, S(x) = {x} for all x ∈ A. We show that SA is a
section of F by showing F ◦ S = idA :
F (S(x)) = F ({x}) = x.
The dual notions of Dedekind self-map and co-Dedekind self-map are expressed in the self-maps
SA and F . Certainly, SA plays the role of generating a blueprint for the natural numbers:
Given a, SA ↾ {a, SA (a), SA^2 (a), . . .} is an initial Dedekind self-map. We wish to show that, conversely,
F plays the role of collapsing the values of A to their point of origin.
We show that for every x ∈ A, there is n ∈ ω such that F^n (x) = ∅. Suppose not. Then for
each n ∈ ω, F^n (x) ≠ ∅. It follows that the following is an infinite descending ∈-chain:
· · · ∈ F^n (x) ∈ F^{n−1} (x) ∈ · · · ∈ F (x) ∈ x.
Such chains cannot exist in the presence of the Axiom of Foundation.
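The generate/collapse behavior of S and F can be tried out on hereditarily finite sets, modeled here as nested Python frozensets. This is only a sketch: the tie-breaking rule in F (choose a candidate of least rank) is our own assumption, since the text allows any ∈-minimal element, and the helper names rank and steps_to_empty are hypothetical.

```python
EMPTY = frozenset()

def S(x):
    """The singleton map S(x) = {x}."""
    return frozenset([x])

def rank(x):
    """Set-theoretic rank of a hereditarily finite set."""
    return 0 if not x else 1 + max(rank(y) for y in x)

def F(x):
    """Return an ∈-minimal element of x (F(∅) is arbitrary; ∅ is used here)."""
    if not x:
        return EMPTY
    # z is ∈-minimal in x when no element of x belongs to z.
    minimal = [z for z in x if not any(w in z for w in x)]
    return min(minimal, key=rank)

def steps_to_empty(x):
    """Count how many applications of F are needed to reach ∅ from x."""
    n = 0
    while x != EMPTY:
        x = F(x)
        n += 1
    return n
```

For example, starting from z = {{∅}}, both {z} and {z, {z}} are sent by F to z, and repeated application of F to the fivefold singleton of ∅ reaches ∅ in exactly five steps.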
Let E = {i_n | n ∈ ω}, where, for each n, i_n : A^A → A^A is defined by
i_n (f ) = f^n , the nth iterate of f .
We have the following:
Proposition 24. Suppose A is a transitive set that is closed under pairs. Let W = {∅, {∅},
{{∅}}, . . .} ⊆ A.
(1) For every x ∈ A—in particular, for every x ∈ W —there is i ∈ E such that i(F )(x) = ∅.
(2) For every x ∈ W , there is i ∈ E such that i(SA)(∅) = x.
The Proposition indicates how every element of A is “returned to its source” via the interplay
of F and a naturally occurring set E of functionals defined over A. Likewise, through the interplay
of SA and E, we also see once again in the present context how a blueprint for the whole numbers
is generated. In summary, the dual self-maps SA and F play the roles, respectively, of “generating
a blueprint” and “returning elements to their source.”
The concept of a “blueprint” appears naturally in the context of large cardinals, as we shall
see in later sections. In the next section, we make the notion of a blueprint, suggested by our
results here, more precise. The points of our work here that we wish to emphasize, and that
will generalize nicely, are that any Dedekind self-map A → A produces a kind of blueprint for
the natural numbers, and also that in the special case in which A is transitive and closed under
pairs, there are dual maps SA ↾ W : W → W and F ↾ W : W → W (where W is the set of
iterates ∅, SA (∅), SA (SA (∅)), . . .) which play the roles of generating the elements of W from ∅ and
returning elements of W to ∅.
10.1 Dedekind Maps and co-Dedekind Maps
We close this section by considering two slight generalizations of Dedekind self-maps and co-Dedekind self-maps. Let us say that a function f : A → B is a Dedekind map if
(1) |A| = |B|
(2) f is 1-1 but not onto.
We will call any element b ∈ B not in the range of f a critical point of f .
Dually, we define a function g : A → B to be a co-Dedekind map if
(1) |A| = |B|
(2) g is onto but not 1-1.
We will call an element a ∈ A a co-critical point of g if a ∈ g^{−1}(b) for some b ∈ B with
|g^{−1}(b)| > 1.
As we now show, Dedekind maps factor as a composition of a bijection and a Dedekind
self-map.
Proposition 25. Suppose f : A → B and b ∈ B. Then f is a Dedekind map with critical point
b if and only if there exist functions π, j so that f = π ◦ j, where π : A → B is a bijection and
j : A → A is a Dedekind self-map with critical point π^{−1}(b).
           f
      A ------> B
       \       ^
      j \     / π
         v   /
          A
Proof. For one direction, suppose A, B are sets, b ∈ B, and f : A → B can be factored as
f = π ◦ j, where π : A → B is a bijection and j : A → A is a Dedekind self-map with critical
point π^{−1}(b). We show f is a Dedekind map. Certainly, f is 1-1. We show b is a critical point
of f . Since π is onto, there is a ∈ A with π(a) = b. Note that a = π^{−1}(b) is, by assumption, a
critical point of j. If b ∈ ran f , then let x ∈ A be such that f (x) = b. Then by commutativity
of the diagram, π(j(x)) = b, and so j(x) = π^{−1}(b) = a, which is impossible since a ∉ ran j. We
have shown b is a critical point of f , f is 1-1, and |A| = |B| (by way of π), as required.
For the other direction, suppose f : A → B is a Dedekind map with critical point b. Since
|A| = |B|, there is a bijection π : A → B. Let B0 = f [A] ⊆ B and let A0 = π^{−1}[B0 ] ⊆ A. Let
π0 = π ↾ A0 . We show ran π0 = B0 : Suppose y ∈ A0 . Then y = π^{−1}(x) for some x ∈ B0 ; that is,
for some x ∈ B0 , π(y) = x. This shows ran π0 ⊆ B0 . If x ∈ B0 , then let y ∈ A0 with π^{−1}(x) = y.
Then x = π(y) ∈ ran π0 . We have shown therefore that π0 maps A0 onto B0 . Since π0 is a restriction of the
bijection π, π0 is also 1-1. Therefore, π0 : A0 → B0 is a bijection.
Note that f : A → B0 is a bijection. Let j = π0^{−1} ◦ f : A → A0 . Then j is a bijection, and so,
viewed as a map j : A → A, j is a Dedekind self-map. The following diagram is commutative:
           f
      A ------> B0
       \       /
      j \     / π0^{−1}
         v   v
          A0
We claim that π^{−1}(b) is a critical point for j : A → A: Suppose j(x) = π^{−1}(b) for some
x ∈ A. Then
b = π(j(x)) = π(π^{−1}(f (x))) = f (x),
which is impossible because b is a critical point of f .
Now notice that, for any x ∈ A, since j(x) = π0^{−1}(f (x)), we have π(j(x)) = π0 (j(x)) = f (x).
We have shown that f = π ◦ j, π is a bijection, and j : A → A is a Dedekind self-map with critical
point π^{−1}(b), as required.
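A concrete instance of the factorization f = π ◦ j may help. The sketch below assumes hypothetical choices not taken from the paper: A is the set of natural numbers, B is the set of even natural numbers, f (n) = 2n + 2 is the Dedekind map (with critical point b = 0), and π(n) = 2n is the bijection.

```python
def f(n):
    # Hypothetical Dedekind map from N to the even numbers; b = 0 is not in its range.
    return 2 * n + 2

def pi(n):
    # Hypothetical bijection from N onto the even numbers.
    return 2 * n

def pi_inv(m):
    # Inverse of pi, defined on even numbers.
    return m // 2

def j(n):
    """j = pi^{-1} ∘ f, the Dedekind self-map in the factorization f = pi ∘ j."""
    return pi_inv(f(n))

b = 0
```

With these choices j turns out to be the successor function n ↦ n + 1, and its critical point is π^{−1}(b) = 0, as the Proposition predicts.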
Any decomposition of a Dedekind map f by f = π ◦ j as in Proposition 25 will be called a
Dedekind factorization of f . The proof shows that for each bijection π : A → B we can obtain
a Dedekind self-map j : A → A so that f = π ◦ j. If π and π′ are bijections from A to B
whose inverse images of f [A] are distinct, then the resulting Dedekind self-maps j = π^{−1} ◦ f and
j′ = (π′)^{−1} ◦ f are also distinct. It follows that every Dedekind map has infinitely many distinct
Dedekind factorizations, as the following Lemma shows:
Lemma 26. Suppose A, B are infinite sets with |A| = |B|, b ∈ B, and B0 ⊆ B − {b} also infinite.
Then for k ∈ ω, there exist bijections πk : A → B with the property that, whenever k1 ≠ k2 are
finite ordinals, π_{k1}^{−1}(B0 ) ≠ π_{k2}^{−1}(B0 ).
Proof. Let B1 = {b1 , . . . , bn, . . .} ⊆ B0 be a countably infinite subset of B0 . Let π0 : A → B be
any bijection. For n = 1, 2, 3, . . ., let an ∈ A be such that π0 (an ) = bn , and let a be such that
π0 (a) = b. For each n > 0, we define another bijection πn : A → B as follows:
πn (u) = π0 (u) if u ≠ a and u ≠ an ,
πn (u) = b if u = an ,
πn (u) = bn if u = a.
Now if k ≠ n, then ak ∉ πk^{−1}(B0 ) but ak ∈ πn^{−1}(B0 ), so πk^{−1}(B0 ) ≠ πn^{−1}(B0 ), as required.
For co-Dedekind maps, we have the following dual to Proposition 25.
Proposition 27. Suppose g : A → B and a ∈ A. Then g is a co-Dedekind map with co-critical
point a if and only if there exist functions π, h so that g = h ◦ π, where π : A → B is a bijection
and h : B → B is a co-Dedekind self-map with co-critical point π(a).
           g
      A ------> B
       \       ^
      π \     / h
         v   /
          B
Proof. First, suppose g : A → B is co-Dedekind with co-critical point a ∈ A. Since |A| = |B|,
we let π : A → B be any bijection. Then define h : B → B by
h(π(x)) = g(x).        (+)
This definition guarantees that the diagram above is commutative. We show h is a co-Dedekind
self-map with co-critical point π(a). Clearly h = g ◦ π^{−1} is onto. Let b ∈ B be such that g(a) = b,
and let a0 ≠ a be such that g(a0) = b as well (since |g^{−1}(b)| > 1). It follows from (+) that
h(π(a)) = b = h(π(a0)),        (++)
and so, since π is 1-1, |h^{−1}(b)| > 1, and h is a co-Dedekind self-map. Moreover, by (++), π(a) ∈ h^{−1}(b), so π(a)
is a co-critical point of h.
Conversely, suppose g : A → B, a ∈ A, and g = h ◦ π, where π : A → B is a bijection and
h : B → B is a co-Dedekind self-map with co-critical point π(a). We show g is a co-Dedekind
map with co-critical point a. Since both h and π are onto, it follows that g is onto. Since π(a) is
co-critical for h, there is a0 ∈ A with a0 ≠ a so that h(π(a)) = b = h(π(a0 )). It follows that g(a) = b = g(a0 ),
so |g^{−1}(b)| > 1 and a ∈ g^{−1}(b). We have shown g is a co-Dedekind map with co-critical point a.
The next proposition shows the relationship between Dedekind maps, co-Dedekind maps,
and infinite sets.
Proposition 28. (ZFC – Infinity) The following are equivalent for any set A:
(1) A is infinite.
(2) A is the domain of a Dedekind map.
(3) A is the codomain of a Dedekind map.
(4) A is the codomain of a co-Dedekind map.
Proof of (1) ⇔ (2). Suppose f : A → B is a Dedekind map. Then, by Proposition 25, there
is a Dedekind self-map j : A → A, and so j : A → j[A] is a bijection between A and one of
its proper subsets. Therefore, A is Dedekind-infinite (hence, infinite). Conversely, suppose A is
Dedekind-infinite and A0 ⊆ A is a proper subset for which there is a bijection j : A → A0 . Then,
viewed as a map A → A, j is a Dedekind self-map. Trivially, then, we have a Dedekind map
(letting B = A, π = idA , and f = j).
Proof of (1) ⇔ (3). If A is infinite, let x ∈ A; then the inclusion map A − {x} ↪ A is a Dedekind
map. Conversely, if g : B → A is a Dedekind map, then by (2) ⇒ (1), B is infinite; since |B| = |A|, A is infinite as well.
Proof of (4) ⇒ (3). Suppose g : B → A is a co-Dedekind map with co-critical point in g^{−1}(a).
Let s : A → B be a section of g. Then s is 1-1 but not onto since at least one element of g^{−1}(a)
is not in the range of s. It follows that s is a Dedekind map, and A is the domain of a Dedekind
map. By (2) ⇒ (3), A is also the codomain of a Dedekind map.
Proof of (2) ⇒ (4). Suppose f : A → B is a Dedekind map with critical point b ∈ B. Let
b0 ∈ f [A], and suppose a ∈ A with f (a) = b0 . Define g : B → A as follows:
g(y) = x if f (x) = y,
g(y) = a if y ∉ ran f .
Given u ∈ A, let v = f (u) ∈ B; then g(v) = u. Therefore g is onto. Since
g(b0) = a = g(b), |g^{−1}(a)| > 1, so g is a co-Dedekind map and A is the codomain of a co-Dedekind map.
The next Proposition highlights the close relationship between Dedekind maps and arbitrary
functions that are 1-1 and not onto, and also between co-Dedekind maps and arbitrary functions
that are onto but not 1-1.
Proposition 29.
(1) Suppose h : A → D is 1-1 but not onto. Then there is B ⊆ D such that ran h ⊆ B and
h : A → B is a Dedekind map. In particular, h factors as the composition of h, regarded as a map A → B, with the inclusion map
i : B ↪ D.
           h
      A ------> D
       \       ^
      h \     / i
         v   /
          B
(2) Suppose k : D → A is onto but not 1-1. Then there is B ⊆ D such that k ↾ B : B → A
is a co-Dedekind map. In particular, k factors as k = (k ↾ B) ◦ t where t = sB ◦ k and
sB : A → B is a section of k ↾ B.
                    k
           D -------------> A
            \              ^
  t = sB ◦ k \            / k ↾ B
              v          /
                   B
Remark 4. In part (2), the definition of t need not be quite so precise. In fact, if we define t by
t(u) = u if u ∈ B,
t(u) arbitrary otherwise,
commutativity of the diagram is still guaranteed.
Proof of (1). Since h is 1-1 but not onto, A and D must both be infinite. Let d ∈ D lie outside
the range of h. Let B = h[A] ∪ {d}. Clearly, |A| = |B| and h : A → B is 1-1 and not onto.
Therefore, h : A → B is Dedekind.
Proof of (2). Let s : A → D be a section of k and let B = ran s ∪ {y}, where y ∈ D − ran s.
Let x ∈ A be such that k(y) = x. Since s : A → ran s is a bijection, there is y′ ∈ ran s such that
k(y′ ) = x. It follows that k ↾ B : B → A is a co-Dedekind map. Let sB be the map i ◦ s : A → B,
where i : ran s ↪ B is inclusion. Letting t = sB ◦ k, it follows that k = (k ↾ B) ◦ t; that is, the
diagram displayed in part (2) of the Proposition is commutative.
Remark 5. We observe that Dedekind maps have essentially the same preservation properties
as Dedekind self-maps: Given a Dedekind map f : A → B, in a sense we will now specify,
the “image” of f is again the domain of a Dedekind map and is itself a Dedekind infinite set.
Moreover, iterating the process of obtaining new Dedekind maps leads to a sequence of critical
points that provides a blueprint for the set ω of finite ordinals, just as in the case of Dedekind
self-maps.
Suppose f : A → B is a Dedekind map with critical point b and let f = π◦j be a factorization.
           f
      A ------> B
       \       ^
      j \     / π
         v   /
          A
As before, let B0 = f [A], A0 = π^{−1}[B0 ] ⊆ A, π0 = π ↾ A0 : A0 → B0 , a = π^{−1}(b), and we have
crit j = a. Notice A0 = π^{−1}[B0 ] = π^{−1}[f [A]] = j[A]. Therefore, since a ∉ ran j, a ∉ A0 .
Let j0 = j ↾ A0 . We verify that j(a) is a critical point of j0 : If not, let z ∈ A0 with
j0 (z) = j(a). Then j(z) = j(a) and z = a, but this is impossible since a ∉ A0 .
Let f0 = f ↾ A0 . We check that ran f0 ⊆ B0 : Given x ∈ A0 , f (x) ∈ f [A] = B0 . We note
that f (a) = π(j(a)) is a critical point for f0 : If f0 (x) = π(j(a)) = f (a), then x = a, which is
impossible since a ∉ A0 . Since |A0 | = |B0 | (via π0 ) and f0 : A0 → B0 is 1-1 and not onto, it
follows that f0 is a Dedekind map. And once again, A0 is Dedekind-infinite because j0 : A0 → A0
is a Dedekind self-map.
Summarizing the discussion so far, Dedekind maps have essentially the same preservation
properties as a Dedekind self-map: Given a Dedekind map f : A → B with critical point b,
witnessing bijection π : A → B, and a = π^{−1}(b), the set A0 = π^{−1}[ran f ] is, like A, the domain of another Dedekind map
f ↾ A0 , with critical point π(j(a)). And, since there is a Dedekind self-map (obtained from f0 , π0 )
from A0 to itself, A0 is itself, like A, Dedekind-infinite.
           f0
     A0 ------> B0
       \       ^
     j0 \     / π0
         v   /
          A0
Moreover, similar logic leads to a sequence of Dedekind maps, f : A → B, f0 : A0 → B0 , f1 :
A1 → B1 , . . ., with critical points b, π(j(a)), π(j(j(a))), . . ., which is the sequence
(π ◦ j ◦ π^{−1})^0 (b), (π ◦ j ◦ π^{−1})^1 (b), (π ◦ j ◦ π^{−1})^2 (b), . . .,        (∗)
together with witnessing bijections π, π0 , π1 , . . ., where each of A, A0 , A1 , . . . is Dedekind-infinite.
From previous work, it is not hard to show that (∗) determines a sequence that is Dedekind self-map
isomorphic to s : ω → ω. This can be shown by first letting W′ be the smallest j′-inductive
subset of B, where j′ = π ◦ j ◦ π^{−1} . Then one shows that there is a unique Dedekind self-map
isomorphism τ′ : ω → W′ taking 0 to b and making the following commute:
           s
      ω ------> ω
      |         |
   τ′ |         | τ′
      v         v
      W′ -----> W′
           j′
Indeed, τ′(n) = (j′)^n (b) is the required map. Therefore, in the same sense in which a Dedekind
self-map j : A → A gives rise to a kind of blueprint W ⊆ A for ω, so likewise any Dedekind map
f : A → B gives rise to a blueprint W′ ⊆ B for ω.
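The blueprint in B can be computed in a simple case. As a sketch under assumptions of our own choosing, take A to be the natural numbers, B the even natural numbers, j(n) = n + 1, and π(n) = 2n, so that f = π ◦ j has critical point b = 0; the names j_prime and tau_prime stand for the conjugated map π ◦ j ◦ π^{−1} and the enumeration of its iterates at b.

```python
def j(n):
    # Hypothetical Dedekind self-map on N; critical point a = 0.
    return n + 1

def pi(n):
    # Hypothetical bijection from N onto the even numbers.
    return 2 * n

def pi_inv(m):
    # Inverse of pi, defined on even numbers.
    return m // 2

def j_prime(m):
    """The conjugate pi ∘ j ∘ pi^{-1}, a self-map of the even numbers."""
    return pi(j(pi_inv(m)))

b = 0  # critical point of f = pi ∘ j

def tau_prime(n):
    """Apply j_prime to b exactly n times."""
    x = b
    for _ in range(n):
        x = j_prime(x)
    return x
```

In this instance the enumeration lists the even numbers 0, 2, 4, . . . in order, which is the image under π of the blueprint {a, j(a), j(j(a)), . . .} in A.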
To close this section, we show that Dedekind (self)-maps and co-Dedekind (self)-maps, together with bijections, are the building blocks of functions whose domain, codomain, and range
all have the same infinite size. For any such function g : A → B, we show below that exactly one
of the following must be true:
(1) g is a bijection;
(2) g is a Dedekind map;
(3) g is a co-Dedekind map;
(4) g factors as g = i ◦ f , where i is a Dedekind map and f is a co-Dedekind map.
Remark 6. Part (4) is a more informative variant—in the special case in which g is neither 1-1
nor onto and its domain, codomain and range all have the same infinite size—of the fact that
every function g : A → B can be factored as g = i ◦ f where f is onto and i is 1-1 (known as an
epi-monic factorization).
Proposition 30. Suppose g : A → B is a function that is not a bijection. Let C = ran g. Suppose
also that |A| = |C| = |B|. Then:
(A) If g is 1-1, then g : A → B is a Dedekind map.
(B) If g is onto, then g : A → B is a co-Dedekind map.
(C) If g is neither 1-1 nor onto, then g factors as g = i ◦ f where f = g : A → C is co-Dedekind
and i : C → B is Dedekind.
Proof of (A) and (B). If g is 1-1 then since g is not a bijection, it is not onto; since |A| = |B|
as well, it follows that g is Dedekind. If g is onto, then g is not 1-1, and, by similar reasoning, g
is co-Dedekind.
Proof of (C). Since g is not 1-1, it follows that g : A → C is co-Dedekind. Let i : C ↪ B be
the inclusion map. Since g is not onto B, C ≠ B, so i is 1-1 but not onto; because |C| = |B|, i is a Dedekind map.
The result follows.
11 Blueprints Generated by a Dedekind Self-Map
In this section, we give a more formal treatment of the concept of a blueprint, introduced in
the last section. This approach will generalize well to broader contexts and will provide further
evidence of the rich foreshadowing of large cardinals suggested by the notion of a Dedekind
self-map.
Starting with a Dedekind self-map j : A → A with critical point a, and a set X ⊆ A, we
describe in a reasonably precise way what we mean by a blueprint for X. Because the intended
meaning can vary considerably depending on the context, our treatment will leave the notion
of a blueprint somewhat underspecified. The intention is that we have a class E of structure-preserving maps (preserving structure in the same way that j preserves structure) definable
from j. Through the interaction of j, a, and E, a self-map f (either Dedekind or co-Dedekind)
is obtained, which encodes the set X. The map f is the blueprint for X, and E plays the role of
encoding and decoding the blueprint. In particular, it should be the case that for every x ∈ X,
there is i ∈ E such that i(f )(a) = x. We will also consider the special case in which there is
also a dual g to f (so that either f is a section of g or g is a section of f ), playing the role of
“collapsing” X to a, with the property that for every x ∈ X, there is i ∈ E such that i(g)(x) = a.
Here are the details:
Definition 3 (Blueprints and Strong Blueprints). Suppose j : A → A is a Dedekind self-map
with critical point a and X is a set. A j-blueprint (or simply a blueprint) for X is a triple (f, a, E)
having the following properties:
(1) For some set B, f is a map B → B, either a Dedekind self-map or a co-Dedekind self-map,
called the blueprint map.
(2) a is a critical point of j and is called the blueprint seed.
(3) E is a set of “structure-preserving” functionals (having essentially the same structure-preserving characteristics as j itself), “compatible with” j, such that, for each i ∈ E:
(a) dom i ⊇ $B^B$.
(b) i(f) is a function and dom i(f) ⊇ B.
(c) a ∈ dom i(f).
The collection E is called a blueprint coder.
(4) (Encoding) The self-map f is defined from E, j, a.
(5) (Decoding) For every x ∈ X, there is i ∈ E such that i(f )(a) = x.
Moreover, a strong j-blueprint (or simply a strong blueprint) for X is a quadruple (f, g, a, E)
such that (f, a, E) is a blueprint for X and
(A) dom g = dom f
(B) Either f is a section of g or g is a section of f
(C) For each i ∈ E, dom i(g) = dom i(f )
(D) For every x ∈ X, there is i ∈ E such that i(g)(x) = a.
We have under-specified E in our requirement that its members are “structure-preserving”
and “compatible with j.” Presumably, with additional work and a study of a wider range of
examples, we can make this requirement more precise. For the examples we know about so far,
when our starting point is a Dedekind self-map j : A → A, as described in the previous section,
a member of $E = \{i_0, i_1, \ldots, i_n, \ldots\}$ is “structure-preserving” if it is a Dedekind map (with
exceptions in the base cases n = 0, 1), and is “compatible with j” if it is definable from j and its
critical point, without extra parameters. When we expand to the context of Dedekind self-maps
j : V → V , we will require the elements of E to be elementary embeddings, and compatibility
with j will have the technical meaning discussed in [6] of compatibility of a class of set elementary
embeddings relative to an ambient embedding j : V → V .
Intuitively, if (f, a, E) is a blueprint for X, E will have been defined in some way from j, a,
and has been used to encode X in the map f , and E may be used to “read” f , recovering the
elements of X.
As we shall show in a moment, the notion that the iterates a, j(a), j(j(a)), . . . obtained from a
Dedekind self-map j : A → A form a “blueprint” of the set ω of finite ordinals can be expressed in
this framework. In later sections, we will show that analogous blueprints arise with various large
cardinals, providing evidence for our hypothesis that properties of Dedekind self-maps naturally
generalize to the context of large cardinals.
The work in the previous section provides us with an example of both a blueprint and a
strong blueprint, as we now show.
Example 2 (Blueprint for ω). Suppose j : A → A is a Dedekind self-map with critical point a.
Let W denote the smallest j-inductive set in A. (W corresponds to the set B mentioned in the
definition of blueprint.) Let π : W → ω be the Mostowski collapsing isomorphism. We define
$E = \{i_0, i_1, i_2, \ldots\}$ as follows: For each n, $i_n : W^W \to \omega^W$ is defined by
$$i_n(h) = \begin{cases} \pi & \text{if } n = 0 \\ \pi \circ j^{n-1} \circ h & \text{if } n > 0 \end{cases}$$
Note that when n = 0, $i_n$ is the constant function $h \mapsto \pi$ and when n = 1, $i_n(h) = \pi \circ h$. E is the
blueprint coder.
Define $f = j \restriction W : W \to W$; f is the blueprint map. We verify that (f, a, E) is a blueprint
for ω.
We verify points (1)–(5) from the definition. Parts (1) and (2) are obvious. For (4), as
$f = j \restriction W$, we see f is definable from j and W. However, W itself (see Remark 1) is definable
from j and a.
For (3), parts (a)–(c) are easy to check. For the main statement of (3), first it is clear (by
inspecting the proof of the Mostowski Collapsing Theorem) that π itself is definable from j and
each $i_n$ is definable from j and π. For n > 1, $i_n$ is a Dedekind map: It is easy to see $i_n$ is 1-1,
and, for each such n, the map r : W → ω, defined by $r(j^n(a)) = s^n(0)$, fails to be in the range of
$i_n$, and so each $i_n$, for n > 1, is a Dedekind map; in particular, each such $i_n$ has the same “kind”
of preservation properties as j itself, as discussed in Remark 5. For n = 0, 1, $i_n$ is not a Dedekind
map; these are the exceptional base cases. One can easily check that, in fact, $i_1$ is a bijection.
For (5), certainly $i_0(f)(a) = \pi(a) = 0$. Let n ∈ ω, n ≥ 1. Then
$$i_n(f)(a) = \pi(j^{n-1}(f(a))) = \pi(j^n(a)) = s^n(\pi(a)) = s^n(0) = n.$$
We have shown that (f, a, E) is a blueprint for ω.
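The decoding step in Example 2 can be illustrated with a small computational sketch. Everything concrete here is an assumption chosen for illustration and is not taken from the paper: the Dedekind self-map is the hypothetical map x ↦ 2x + 1 on the natural numbers with critical point 0, Python integers stand in for the elements of A, and the helper names `iterate`, `pi`, and `coder` are invented. The collapse π is computed by counting how many applications of j are needed to reach a given element of W.

```python
from typing import Callable

def j(x: int) -> int:
    """Hypothetical Dedekind self-map on the naturals: 1-1, and 0 is never a value."""
    return 2 * x + 1

a = 0  # critical point of j (assumption: 2x + 1 = 0 has no natural solution)

def iterate(fn: Callable[[int], int], n: int) -> Callable[[int], int]:
    """Return the n-fold composition fn^n (the identity when n = 0)."""
    def composed(x: int) -> int:
        for _ in range(n):
            x = fn(x)
        return x
    return composed

def pi(w: int) -> int:
    """Collapse on W = {a, j(a), j(j(a)), ...}: sends j^n(a) to n."""
    n, x = 0, a
    while x != w:
        if x > w:  # j is strictly increasing, so w cannot be in W
            raise ValueError(f"{w} is not in W")
        x, n = j(x), n + 1
    return n

def coder(n: int):
    """The functional i_n: h |-> pi if n = 0, and h |-> pi . j^(n-1) . h if n > 0."""
    if n == 0:
        return lambda h: pi
    return lambda h: (lambda x: pi(iterate(j, n - 1)(h(x))))

f = j  # blueprint map: j restricted to W (the restriction is left implicit)

# Decoding: i_n(f)(a) should recover the finite ordinal n.
decoded = [coder(n)(f)(a) for n in range(6)]
print(decoded)
```

In this sketch the list `decoded` plays the role of the finite ordinals extracted from the blueprint by the coder E.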
In Example 2, we see how j : A → A provides a blueprint for ω and how the blueprint coder
E extracts the finite ordinals from the blueprint. The next example exhibits a strong blueprint
for W .
Example 3 (Strong Blueprint for W). Suppose A is a transitive set closed under pairs. Define
$S_A : A \to A$ and F : A → A as in Example 1. We treat $S_A$ as a Dedekind self-map with critical
point ∅. Let W ⊆ A be the smallest $S_A$-inductive set in A and let π : W → ω be the Mostowski
collapsing isomorphism. Let $f = S_A \restriction W : W \to W$ and let $g = F \restriction W : W \to W$. Define a
collection $E = \{i_0, i_1, \ldots\}$ of functionals as follows: For each i ∈ E, dom i = $W^W$. For each
$h \in W^W$:
$$i_0(h) = \mathrm{id}_W, \qquad i_n(h) = h^n \ \text{ for } n \geq 1.$$
Then (f, g, ∅, E) is a strong blueprint for W.
We verify the properties of a blueprint. For (1), the set B is W = {∅, {∅}, {{∅}}, . . .},
and, as shown in the previous section, f is a Dedekind self-map with critical point ∅ and g is a
co-Dedekind self-map. Parts (2) and (4) are clear.
For (3), parts (a)–(c) are immediate; for the main part of (3), as in the previous proposition,
when n = 0, $i_n$ is constant; when n = 1, $i_n$ is a bijection; and for n > 1, $i_n$ is a Dedekind
self-map, with critical point f. Also note that each of the elements of E is definable from ω, which,
as previous work shows, is definable from $S_A$ and ∅.
For (5), write j = $S_A$ and a = ∅. Certainly $i_0(f)(a) = a$. For n ≥ 1, we have
$$i_n(f)(a) = f^n(a) = j^n(a),$$
as required.
The additional properties (A)–(D) for strong blueprints were established in the previous
section. In particular, g is a co-Dedekind self-map and for each n ∈ ω, $i_n(g)(j^n(a)) = F^n(j^n(a)) = \emptyset$.
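The generating and collapsing roles of f and g in Example 3 can be sketched with hereditarily finite sets modeled as Python frozensets. This is only an illustration under stated assumptions: the convention F(∅) = ∅ is assumed here (Example 1 is not reproduced in this section), F simply returns the unique element of a singleton, and the helper names `power` and `coder` are invented.

```python
EMPTY = frozenset()

def S(x: frozenset) -> frozenset:
    """Singleton map x |-> {x}."""
    return frozenset({x})

def F(x: frozenset) -> frozenset:
    """Collapsing map on W: returns the unique element of a singleton.
    Assumption for this sketch: F(EMPTY) = EMPTY."""
    if not x:
        return EMPTY
    return next(iter(x))

def power(h, n: int):
    """Return the n-fold composition h^n (the identity when n = 0)."""
    def composed(x):
        for _ in range(n):
            x = h(x)
        return x
    return composed

def coder(n: int):
    """The functional i_n: h |-> h^n, with i_0(h) the identity."""
    return lambda h: power(h, n)

# Generating: i_n(f)(EMPTY) runs through EMPTY, {EMPTY}, {{EMPTY}}, ...
W_prefix = [coder(n)(S)(EMPTY) for n in range(6)]

# Collapsing: i_n(g) sends the n-th element of W back to EMPTY.
collapsed = [coder(n)(F)(x) for n, x in enumerate(W_prefix)]
print(all(c == EMPTY for c in collapsed))
```

The check `F(S(x)) == x` on these elements reflects the requirement that f be a section of g.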
12 A First Look at Dedekind Self-Maps of the Universe
One of our motivations for introducing the New Axiom of Infinity is to wrap within the formulation
of the set theory axiom “there exists an infinite set” some intuition about the nature of “the
infinite”—the intention is that, by including an intuition of this kind, the axiom can suggest
a direction for answering questions about the mathematical infinite that cannot be resolved by
the ZFC axioms alone. The intuition that we have wished to include is the notion that the
“essence” of infinite collections consists of underlying self-referral dynamics of an unbounded
field, represented mathematically as a Dedekind self-map. To express Cantor’s original insight
that “infinity exists” by saying “a Dedekind self-map exists” is to say that it is enough that
the axiom should assert the existence of the source of infinite collections; then one derives the
existence of concrete infinite collections.
One fundamental question that has been of interest to set theorists for many decades is the
Problem of Large Cardinals: Is there a natural axiom that can be added to the axioms of ZFC
on the basis of which the known large cardinals can be derived?36 Since this is a question about
the mathematical infinite, it is reasonable to consult the Axiom of Infinity to get some hint about
how we might strengthen the axiom in a natural way to produce a new axiom that could account
for large cardinals. But the usual Axiom of Infinity tells us very little.
On the other hand, our New Axiom of Infinity suggests a direction that one would be unlikely
to consider on the basis of the usual Axiom of Infinity. We put together two clues as we seek new
axioms to justify large cardinals:
(1) Many of the known large cardinals have a global character, asserting things about the
universe V as a whole.
(2) The New Axiom of Infinity (which has “wrapped within it” some intuition about the origin
of infinite sets) asserts the existence of a Dedekind self-map.
Based on these clues, we conjecture that the axiom for large cardinals that we are seeking is of
the form “There is a Dedekind self-map defined on the universe of all sets.” We therefore are
interested in investigating Dedekind self-maps of the form j : V → V .
36 Large cardinals will be introduced more systematically in Section 21.
Based on our work so far, we can describe properties we would expect such a j : V → V to
have. Properties of a Dedekind self-map j : A → A with critical point a that we have discovered
so far are:
(A) The self-map j preserves essential properties of its domain (A is Dedekind infinite, and so
is the image j[A] of j) and of itself (the property of being a Dedekind self-map propagates
to the restriction j ↾ j[A]).
(B) The definition of j entails a critical point that plays a key role in its dynamics. One aspect
of those dynamics is that repeated restrictions of j to successive images give rise to a critical
sequence W = {a, j(a), j(j(a)), . . .} which is a precursor to the set ω of finite ordinals; the
emergence of this critical sequence provides a strong analogy to the ancient and quantum
field theoretic perspective that “particles arise from the dynamics of an unbounded field,”
where, in this context, particular finite ordinals are viewed as “particles.” The sequence of
restrictions $j_0, j_1, j_2, \ldots$ of j that produce the critical sequence is defined by the clauses:
$$\begin{array}{lll}
A_0 = A & j_0 = j : A \to A & \mathrm{crit}(j_0) = a \\
A_1 = j[A_0] & j_1 = j \restriction A_1 & \mathrm{crit}(j_1) = j(a) \\
A_{n+1} = j[A_n] & j_{n+1} = j \restriction A_{n+1} & \mathrm{crit}(j_{n+1}) = j^{n+1}(a).
\end{array}$$
Therefore, $\mathrm{crit}(j_n) = j^n(a)$ is a critical point of $j_n$.
(C) Through the interplay between j and its critical point, a blueprint (j ↾ W, a, E) of the set ω
of finite ordinals arises. In particular, j ↾ W is a Dedekind self-map, and for every n ∈ ω,
there is i ∈ E such that i(j ↾ W)(a) = n.
(D) In special cases (starting with a transitive set A that is closed under pairs), there also arises
a strong blueprint (S ↾ W′, F ↾ W′, ∅, E′), where S is the singleton map x ↦ {x} (which is
a Dedekind self-map), W′ = {∅, {∅}, {{∅}}, . . .}, F takes a set to an ∈-minimal element of
the set (F is a co-Dedekind self-map and S ↾ A is a section of F), and E′ has the property
that for all x ∈ W′ there are i, k ∈ E′ such that i(S ↾ W′)(∅) = x and k(F ↾ W′)(x) = ∅. In
this case, the strong blueprint gives expression to both the generating effect of a Dedekind
self-map and the collapsing effect of the dual map, a co-Dedekind self-map.
Using (A)–(D) to guide intuition about the “nature of the infinite,” as we expand our consideration from local to global, it is natural to search for Dedekind self-maps j : V → V that
exhibit similar properties. We, in a sense, expect to find Dedekind self-maps that:
(A) Preserve essential properties of their domain
(B) Have a critical point that plays a key role in the dynamics of j. Moreover, restrictions of j
should give rise to a sequence of critical points and reveal further dynamics of j.
(C) Give rise to a blueprint for some set related to the critical point.
(D) Give rise to a strong blueprint for some other set related to the critical point.
As we begin studying Dedekind self-maps defined on the universe V , an issue comes into view
that needs to be resolved. Consider the global successor function s̄ defined by s̄(x) = x ∪ {x}.
Certainly this is an example of a Dedekind self-map defined on the universe of sets, having critical
point ∅ (just like the successor function on ω). However, the definition of this function does not
imply the existence of an infinite set.
Suppose we attempt to carry out the proof given above, which showed that from a set
Dedekind self-map, we can derive s and the set ω of natural numbers, but now using s̄ in place
of the set Dedekind self-map. The first thing to notice is that the proof in Theorem 10, showing
that N is the smallest s̄-inductive set, doesn’t work. The problem is that, in order for there to
be a smallest inductive set, there must be at least one inductive set in the first place. Certainly
V itself is inductive (but it’s not a set, as we shall discuss in the next section), and certainly we
can form the sequence ∅, s(∅), s(s(∅)), . . . by iterating s̄ (and, intuitively, this collection should
form an inductive set), but we have no guarantee that all of these values belong to any single set;
therefore, in this case, forming the intersection of all inductive sets may result in the empty set,
and forming the intersection of all inductive classes may result in a proper class. Thus, N does
not emerge from the dynamics of s̄ : V → V as it does from any Dedekind self-map defined on a
set.
The fact is that the definition of s̄ is just as valid37 in ZFC − Infinity as it is in ZFC.38 This
means that the existence of a Dedekind self-map on V cannot properly be viewed as an axiom of
infinity even though existence of a set Dedekind self-map can.
Nevertheless, it is reasonable to expect that one could introduce slight strengthenings of the
properties of a self-map V → V that would imply the Axiom of Infinity (or the New Axiom of
Infinity). Our next step, therefore, is to formulate several strengthenings of this kind, which we
can then use for further intuition concerning how to proceed in the direction of justifying not just
infinite sets, but large cardinals as well. To take this step, we will need a deeper understanding
of the universe V and of the notion of a class.
37 In fact, s̄ is definable on any transitive class that is closed under formation of pairs and unions; in other words,
s̄ is definable in the theory {Pairing, Union}, which is just a tiny fragment of ZFC.
38 One consequence of this observation is the fact that the existence of a Dedekind self-map on a set is stronger
than the existence of such a map on the universe V . We mention briefly here the precise sense in which this is
true (the reader unfamiliar with consistency results concerning ZFC can skip this note). Using only ZFC-Infinity,
we can demonstrate the existence of a Dedekind self-map from the universe to itself (the generalized successor
s̄ : V → V is an example). However, the existence of a Dedekind self-map on a set, in conjunction with the other
axioms of ZFC, is sufficient to construct a model of the theory ZFC-Infinity (for instance, Vω ). This shows that the
existence of a Dedekind self-map on a set bears the same relationship to the theory ZFC-Infinity as the existence of
large cardinals bears to the theory ZFC: One in principle cannot prove from ZFC-Infinity that a Dedekind self-map
(or any kind of infinite set) exists; likewise, one cannot prove from ZFC, for the same reason, that any type of large
cardinal exists.
13 The Classes ON and V
The collection ON of ordinal numbers is to the universe V as the set of natural numbers is to
the “universe” Vω . They allow us to measure the sizes of sets and arrange elements in (long)
sequences. Both ON and V are examples of proper classes—collections that are too big to be
sets, but that we can speak about because they are definable by a formula. In this section, we
give precise definitions of these ideas.
We begin with the construction of V . The universe V is built in stages, and, as we have
already seen, the first few stages of V can be defined by the following inductive definition:
$$V_0 = \emptyset, \qquad V_{n+1} = \mathcal{P}(V_n).$$
We also observed before that, assuming the Axiom of Infinity, we can obtain the set
$$V_\omega = \bigcup_{n \in \omega} V_n.$$
To continue the construction further, we need to define the ordinal numbers.
Definition 4 (Ordinal Numbers). An ordinal number is any transitive set X with the property
that (X, ∈) is a well-ordered set.39 The collection of ordinal numbers is denoted ON.
The list of all ordinals begins with the finite ordinals 0, 1, 2, . . . (defined, as we described
earlier, as sets, with the property that each is the set comprised of all finite ordinals that precede
it). The first infinite ordinal is ω, which, like the finite ordinals, is the set consisting of all the
ordinals that precede it. Adopting the convention that, for any ordinal α, α + 1 = α ∪ {α}, the
next few ordinals can be listed as follows: ω, ω + 1, ω + 2, ω + 3, . . . (note that by ω + 2 we mean
(ω + 1) + 1, and so forth).
The order relation (which is simply the membership relation ∈) on the ordinal numbers
closely resembles that defined on the natural numbers because of the following fact, which we do
not prove here:
Proposition 31. Every nonempty subset of ON is well-ordered (by ∈).
Because the ordinals are well-ordered, they can be used for inductively defining long sequences
in exactly the same way as is done with the natural numbers. The theorem that makes this
assertion is given below, Theorem 21. We begin with two more definitions.
A successor ordinal is an ordinal β for which, for some ordinal α, β = α + 1 (recall α + 1 is
shorthand for α ∪ {α}). All the positive natural numbers are successor ordinals: For instance,
3 = 2 + 1. A limit ordinal is an ordinal that is not a successor ordinal; equivalently, an ordinal
that has no immediate predecessor. The easiest example is 0. Whether or not any other limit
ordinals exist depends on whether the Axiom of Infinity holds. If not, then 0 is the only limit
ordinal. If it does hold, then ω exists and it is the smallest nonzero limit ordinal.
39 Technically, we should say (X, ∈ ↾ X × X), since ∈ is a relation defined on all of V.
The ordinals can be used as indices to continue the construction of the stages of the universe.
The proof that this is so—as was the case for inductive definitions over ω—requires a principle
of induction. We state this principle first, and then show how inductive definitions are done over
ON.
Theorem 32 (Transfinite Induction). (ZFC-Infinity) Suppose φ(x) is a formula with ordinal
parameter x and possibly other set parameters. Suppose the following three conditions hold:
(1) (Basis) φ(0) holds.
(2) (Successor) If α is a successor ordinal, α = β + 1, and φ(β) holds, then φ(α) also holds.
(3) (Limit) If α is a limit ordinal and φ(β) holds for every β < α, then φ(α) holds.
Then φ(α) holds for every α ∈ ON.
For transfinite induction, the induction step involves two sub-steps—the Successor step and
the Limit step. Likewise, inductive definitions across ON take into account successor and limit
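ordinals separately. We give the inductive definition to define the sequence $\langle V_0, V_1, V_2, \ldots, V_\alpha, \ldots \rangle$
of stages of the universe, using a compact format, and then follow this by the more rigorous
underpinnings of this simpler format.
$$\begin{aligned}
V_0 &= \emptyset \\
V_\alpha &= \mathcal{P}(V_\beta) && \text{if } \alpha = \beta + 1 \\
V_\alpha &= \bigcup_{\beta < \alpha} V_\beta && \text{if } \alpha \text{ is a limit ordinal} \\
V &= \bigcup_{\alpha \in \mathrm{ON}} V_\alpha
\end{aligned}$$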
The definition of the long sequence $\langle V_0, V_1, \ldots, V_\alpha, \ldots \rangle_{\alpha \in \mathrm{ON}}$ involves two “induction” steps,
both of which are concerned with how to define the αth stage on the basis of stages $V_\beta$ for β < α.
When α is a successor β + 1, Vα is computed to be the power set of the previous stage; when α
is a limit, Vα is computed to be the union of all previous stages.
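For the finite stages, the successor clause can be run directly. The following is a minimal sketch, assuming hereditarily finite sets are modeled as Python frozensets; the helper names `powerset` and `stages` are invented for illustration, and only the first few stages are computed because the sizes grow very quickly.

```python
from itertools import combinations

def powerset(s: frozenset) -> frozenset:
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )

def stages(n: int) -> list:
    """Return [V_0, V_1, ..., V_n] using V_0 = {} and V_{k+1} = P(V_k)."""
    V = [frozenset()]
    for _ in range(n):
        V.append(powerset(V[-1]))
    return V

V = stages(4)
sizes = [len(v) for v in V]
print(sizes)
```

Each stage in this sketch is a subset of the next, which is the finite shadow of the fact that the limit clause takes unions of an increasing sequence.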
Formally, the mechanism for justifying the existence of such a long sequence is a generalization of the mechanism given in Theorem 18. We state the result without directly using
Dedekind self-maps on a class, as was done in that theorem, but our result could be reworded in
the language of Dedekind self-maps.
Theorem 33 (Transfinite Recursion). (ZFC − Infinity) Suppose C is a class, well-ordered by
some relation <. For each x ∈ C, let $C_x = \{y \in C \mid y < x\}$. Suppose F : C × V → V is a class
function. Then there is a unique class map T : C → V satisfying, for all x ∈ C,
$$T(x) = F(x, T \restriction C_x).$$
An important example of C is ON. To see how the stages of the universe emerge from the
theorem, let F : ON × V → V be defined by
$$F(\alpha, z) = \begin{cases}
z & \text{if } \alpha = 0 \\
\mathcal{P}(\bigcup(\mathrm{ran}(z))) & \text{if } z \text{ is a function on } \alpha \text{ and } \alpha \text{ is a successor} \\
\bigcup(\mathrm{ran}(z)) & \text{if } z \text{ is a function on } \alpha \text{ and } \alpha \text{ is a limit} \\
\emptyset & \text{otherwise}
\end{cases}$$
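Let T : ON → V be as in the theorem. For each α, write $V_\alpha = T(\alpha)$. We show by transfinite
induction that the $V_\alpha$ satisfy the relations given in the compact definition given earlier. First,
$$V_0 = T(0) = F(0, T \restriction 0) = \emptyset = 0.$$
Next, assume α ∈ ON is a successor, so α = β + 1, and let $z = T \restriction \alpha = \langle V_\gamma \mid \gamma < \alpha \rangle$. Then
$$V_\alpha = T(\beta + 1) = F(\beta + 1, z) = \mathcal{P}(\bigcup(\mathrm{ran}(z))) = \mathcal{P}(V_\beta).$$
If α is a limit, then, with $z = T \restriction \alpha$ as before,
$$V_\alpha = T(\alpha) = F(\alpha, z) = \bigcup(\mathrm{ran}(z)) = \bigcup_{\beta < \alpha} V_\beta.$$
Now the result follows by transfinite induction.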
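The recursion scheme T(x) = F(x, T ↾ C_x) can be sketched for finite ordinals only, where every nonzero ordinal is a successor and the limit clause is never reached. The representation is an assumption of this sketch: Python integers stand in for finite ordinals, the restriction T ↾ α is passed as a tuple of the earlier values, and the names `big_union` and `powerset` are invented helpers.

```python
from itertools import combinations

def powerset(s: frozenset) -> frozenset:
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )

def big_union(family) -> frozenset:
    """Union of a family of frozensets."""
    out = frozenset()
    for member in family:
        out |= member
    return out

def F(alpha: int, z: tuple) -> frozenset:
    """Finite-ordinal version of the class function F from the text."""
    if alpha == 0:
        return frozenset(z)           # z is the empty function here, so this is the empty set
    if len(z) == alpha:               # z is a function on alpha; finite alpha > 0 is a successor
        return powerset(big_union(z))
    return frozenset()                # the "otherwise" clause

def T(alpha: int) -> frozenset:
    """Compute T(alpha) = F(alpha, T restricted to alpha) by building the table of earlier values."""
    table = []
    for beta in range(alpha + 1):
        table.append(F(beta, tuple(table)))
    return table[alpha]

print([len(T(n)) for n in range(5)])
```

The values produced agree with the stages obtained from the compact definition, which is what the transfinite induction above establishes in general.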
Notice that in the theorem, we make reference to V even before we have defined the stages
of V . This is because V is, first and foremost, a class, defined by a formula (namely, the formula
“x = x,” so V = {x | x = x}); the representation of V as a union of stages is made possible by
the Foundation Axiom (one of the ZFC axioms).40
A set is understood to be any element of V . A class is any subcollection of V that one can
specify using a formula with set parameters. Any set x is, trivially, a class since x is specified by
the formula (with set parameters x, y) “y ∈ x”: x = {y | y ∈ x}. A proper class is a class that is
not a set. An example we have just seen is the class ON of ordinal numbers. Another example is
K = {x | x is finite}. It is easy to see that a collection C defined by a formula is a proper class
if and only if, for every ordinal α, there is an element of C that does not belong to $V_\alpha$; in other
words, there are elements of C arbitrarily high up in V, so C ⊈ $V_\alpha$ for any ordinal α. So, our
two examples ON and K are proper classes because, for every α, neither α nor {α} belongs to $V_\alpha$
(and α is an ordinal so belongs to ON, and {α} is finite, so belongs to K).
40 More formally, the Foundation Axiom is used to prove that for every set x, there is an ordinal α such that
x ∈ $V_\alpha$.
A map F : A → B is a set if both A and B are sets. In some contexts, even if B is a proper
class, F may be considered to be a set as long as the collection of ordered pairs that determine F
forms a set. (In that case, the range R of F certainly is a set, so one could, for example, consider
F to be F : A → R, now presented as a set function.) On the other hand, if the collection of
ordered pairs that determine F is definable from some formula, F is a class map. If, in addition,
these ordered pairs form a proper class, then F is a proper class map. We will see later that it
is (meaningfully) possible for F to be neither a set nor a class.41
We now state a slight strengthening of Proposition 31:42
Proposition 34. (ZFC − Infinity) Every nonempty subclass of ON has an ∈-least element.
14 The Class of Natural Numbers
We mentioned at the beginning of this paper43 that one could define, without any form of the
Axiom of Infinity, the class of natural numbers. Our proof that the set ω of natural numbers
can be derived from a Dedekind self-map could have been simplified greatly by making use of
this class, but we chose not to do so in order to emphasize that it is possible to view the natural
numbers as emerging from the dynamics of a Dedekind self-map without any “help” from the
natural numbers themselves.
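At this point, since our discipline has already borne the intended fruit, we will no longer
attempt to avoid using this class. We define the class ω̄ (written with a bar to distinguish it from the set ω) as follows:
$$\bar{\omega} = \begin{cases}
\{0\} \cup \{\bar{s}(\alpha) \mid \alpha \in \mathrm{ON} \text{ and } \alpha < \gamma\} & \text{if } \gamma \text{ is the least nonzero limit ordinal} \\
\{0\} \cup \{\bar{s}(\alpha) \mid \alpha \in \mathrm{ON}\} & \text{if ON has no nonzero limit ordinal}
\end{cases}$$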
Theorem 35. (ZFC − Infinity) The following are equivalent:
(1) The Axiom of Infinity (there is an inductive set).
(2) ω̄ = ω.
(3) There is a nonzero limit ordinal.
Proof. For (1) ⇒ (2), assuming there is an inductive set, we can form (in the usual way) the
intersection ω of all inductive sets. It is easy to verify that ω̄ is inductive, so ω ⊆ ω̄.
The usual principle of induction and definition by induction theorems follow. By induction, one
verifies that s = s̄ ↾ ω : ω → ω and that ω = {0} ∪ {s(n) | n ∈ ω}. Since ω̄ is defined using the
first clause in the definition in this case, the result follows. For (2) ⇒ (3), we can use ω. For
(3) ⇒ (1), let γ be a nonzero limit ordinal. It is easy to see that γ is inductive, so the Axiom of
Infinity holds.
41 See the footnote on p.147 for an example of such a map.
42 This follows from the other ZFC axioms in conjunction with Proposition 31: Given a class C of ordinals, let
α ∈ C. By Separation, E = {β ∈ α | β ∈ C} is a set, as is {α}; therefore, their union is a set of ordinals having a
least element γ. Now, for any δ ∈ C, either δ < α or δ ≥ α. In the first case, δ ∈ E, so γ ≤ δ. In the second case,
γ ≤ α ≤ δ.
43 See p.15.
Whether or not any form of the Axiom of Infinity holds, ω̄ still satisfies a form of the Principle
of Induction.
Theorem 36 (Class Induction Over ω̄). (ZFC − Infinity) Suppose C is a subclass of ω̄ with the
following properties:
(1) 0 ∈ C
(2) whenever α ∈ C, s(α) ∈ C
Then C = ω̄.
Proof. Let B = ω̄ − C. Assume B ≠ ∅, and let α be its least element. Since 0 ∈ C, α > 0. By
the definition of ω̄, α is not a limit ordinal, and so has an immediate predecessor β, which must
be in C. By (2), the successor of β must therefore be in C, contradicting the fact that α ∉ C.
Therefore, B is empty, and the result follows.
A special case of Theorem 33 allows us to recursively define sequences with indices in ω̄.
Theorem 37 (Class Recursion Over ω̄). (ZFC − Infinity) Suppose F : ω̄ × V → V is a class
function and $t_0$ is a set. Then there is a unique class function T : ω̄ → V satisfying:
(1) $T(0) = t_0$.
(2) For all n ∈ ω̄, T(s(n)) = F(n, T(n)).
The theorem shows that, in order to specify a sequence $\langle x_n \mid n \in \bar{\omega} \rangle$ of sets, it is enough to
specify $x_0$ and to declare how $x_{n+1}$ should be computed from $x_n$; this latter declaration takes
the form of a class function F.
In practice, one uses a simpler but equivalent form of Theorem 37 to define a sequence inductively (we encountered this format before in our definition of the first stages of the universe V,
p.41). To obtain a sequence $\langle x_n \mid n \in \bar{\omega} \rangle$, it suffices to specify a value $t_0$ and a function t so that
(1) $x_0 = t_0$;
(2) $x_{n+1} = t(x_n)$.
In the context of the theory ZFC − Infinity a set X is said to be finite if |X| = n for some
n ∈ ω̄.44 When ω̄ = ω, of course, this new definition of “finite” coincides with the one given
earlier,45 but this slightly more general definition gives meaning to the term “finite” even in the
absence of infinite sets. This is the definition we will use for the rest of the paper. We can now
define a set to be infinite if it is not finite. We can verify that this notion of “infinite” matches the
equivalent forms given in Theorem 35. Since we have already shown that existence of a Dedekind
self-map is equivalent to the Axiom of Infinity, the following suffices. We make use of the easily
proven fact that a finite union of finite sets is finite.
44 The expression “|X| = n” means there is a bijection f : X → n.
45 See p. 35.
Theorem 38. (ZFC − Infinity) The following are equivalent:
(1) There is a Dedekind self-map.
(2) There is an infinite set.
Proof of (1) ⇒ (2). It suffices to show that no finite set admits a Dedekind self-map, and we do
this by induction over ω̄. Certainly the empty set is not Dedekind-infinite since the empty map has
no critical point. Assuming the result holds for all finite sets of size n, let $X = \{x_1, \ldots, x_n, x_{n+1}\}$
be a set of size n + 1 and assume t : X → X is a 1-1 function with critical point $x_i$. Let k ≠ i be such
that $x_k$ is in the range of t, and let ℓ be such that $t(x_\ell) = x_k$. Let $Y = X - \{x_\ell\}$, let $Z = X - \{x_k\}$, and let $j = t \restriction Y : Y \to Z$.
Certainly j is 1-1 but not onto since $x_i$ is not in the range of j. Since both Y and Z have size n,
there is a bijection α : Z → Y. Now j ∘ α : Z → Z is 1-1 but not onto, since if $x_i = (j \circ \alpha)(z) = j(\alpha(z))$,
then, letting $x_r = \alpha(z)$, we would have $j(x_r) = x_i$, which is impossible since $x_i \notin \mathrm{ran}\, j$. We have
shown that j ∘ α is a Dedekind self-map on a set of size n, which contradicts the induction hypothesis. Thus, the assumption that t is a Dedekind self-map is incorrect. We have completed
the induction step and therefore the proof that no finite set admits a Dedekind self-map.
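The claim just proved can be checked by brute force for very small sets. The sketch below is an illustration rather than a substitute for the induction: it enumerates every self-map of the set {0, ..., n−1} for n up to 5 and records those that are 1-1 but not onto. The function name `injective_non_surjective_maps` is invented for this example.

```python
from itertools import product

def injective_non_surjective_maps(n: int) -> list:
    """List all self-maps of {0, ..., n-1} that are 1-1 but not onto.
    A map t is encoded as the tuple (t(0), ..., t(n-1))."""
    X = set(range(n))
    found = []
    for images in product(range(n), repeat=n):
        is_injective = len(set(images)) == n
        is_surjective = set(images) == X
        if is_injective and not is_surjective:
            found.append(images)
    return found

counts = [len(injective_non_surjective_maps(n)) for n in range(6)]
print(counts)
```

Every count is zero, so none of these finite sets carries a Dedekind self-map.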
Proof of (2) ⇒ (1). Suppose X is infinite. We prove by induction that for any n ∈ ω̄, there is a
1-1 function f : n → X: This is clear for n = 0, 1. Assuming f : n → X is 1-1, since |ran f| = n,
there must be an x ∈ X not in the range of f. Define g : n + 1 → X by g = f ∪ {(n, x)}. Clearly g is
also 1-1. This completes the induction. We next inductively define a sequence $S = \langle S_n \mid n \in \bar{\omega} \rangle$ of
disjoint, nonempty subsets of X so that, for each n > 0, $|S_n| = n$. Let $x_1 \in X$; we let $S_1 = \{x_1\}$.
Assume we have defined $S_1, S_2, \ldots, S_n$. An easy induction argument shows $A = S_1 \cup S_2 \cup \cdots \cup S_n$
is finite. The set X − A cannot be finite; otherwise X = (X − A) ∪ A would be finite. By our
earlier claim, we can obtain f : n + 1 → X − A that is 1-1. Let $S_{n+1} = \mathrm{ran}\, f$. This completes the
inductive definition and gives us a sequence (or function) h : ω̄ → X. To complete the proof of
(2) ⇒ (1), define Y ⊆ X by $Y = \bigcup \mathrm{ran}\, h$; that is, $Y = \bigcup_n S_n$. Obtain a set W using the Axiom
of Choice by picking one element $s_n$ from each $S_n$. Define w : W → W by $w(s_n) = s_{n+1}$. Clearly
w is a Dedekind self-map.
Next, we wish to show that ω̄ consists precisely of its elements of the form $s^n(0)$. We first
define by induction on ω̄ the sequence of iterates of s: $s^0, s^1, s^2, \ldots, s^n, \ldots$. Let $t_0 = \mathrm{id}_{\bar{\omega}}$. Define
F on ω̄ by
$$F(n, x) = \begin{cases} s \circ x & \text{if } x : \bar{\omega} \to \bar{\omega} \text{ is a function} \\ \emptyset & \text{otherwise} \end{cases}$$
Using Theorem 37, we obtain a unique class function T on ω̄ satisfying $T(0) = t_0 = s^0$ and
$$T(s(n)) = F(n, T(n)) = s \circ T(n).$$
Letting $s^n = T(n)$ for each n ∈ ω̄, one shows by induction on ω̄ that
$$s^0 = \mathrm{id}_{\bar{\omega}}, \qquad s^{n+1} = s \circ s^n.$$
Alternatively, we can view these clauses as an application of the simpler form of definition by
induction given above.
Theorem 39. (ZFC − Infinity) $\bar{\omega} = \{s^n(0) \mid n \in \bar{\omega}\}$. Moreover, for each n ∈ ω̄, $s^n(0) = n$.
Proof. An easy induction shows that each $s^n(0)$ belongs to ω̄. To complete the proof, it suffices
to prove the “moreover” clause. We proceed by induction on n. If n = 0, then $n = s^0(0)$. Assume
the result for n ≥ 0. Then
$$s^{n+1}(0) = s(s^n(0)) = s(n) = n + 1.$$
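The identity $s^n(0) = n$ can be observed concretely with von Neumann ordinals modeled as Python frozensets, where 0 is the empty set and s(x) = x ∪ {x}. This is a small sketch for the first few values only; the helper name `iterate_s` is invented.

```python
def s(x: frozenset) -> frozenset:
    """Successor x |-> x ∪ {x}."""
    return x | frozenset({x})

def iterate_s(n: int) -> frozenset:
    """Compute s^n(0), starting from 0 = the empty set."""
    x = frozenset()
    for _ in range(n):
        x = s(x)
    return x

ordinals = [iterate_s(n) for n in range(6)]

# Each s^n(0) is the set of all earlier iterates, so it has exactly n elements.
print([len(o) for o in ordinals])
```

The comparison `ordinals[n] == frozenset(ordinals[:n])` expresses that each finite ordinal is the set of the finite ordinals preceding it.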
We can now show that (ω̄, s, 0) has the property of being initial in the category of class
Dedekind self-maps.
Theorem 40. (ZFC − Infinity) Suppose j : C → C is a class Dedekind self-map with critical
point c. Then there is a unique class map F, definable from j and s, for which F(0) = c and the
following diagram is commutative:
$$\begin{array}{ccc}
\bar{\omega} & \xrightarrow{\;s\;} & \bar{\omega} \\
\downarrow{\scriptstyle F} & & \downarrow{\scriptstyle F} \\
C & \xrightarrow{\;j\;} & C
\end{array}$$
Proof. By the proof of Theorem 17(1), a class function F on ω̄ is specified uniquely by the
following data:
$$F(0) = c, \qquad F(s(n)) = j(F(n)).$$
This uniquely defined F satisfies the requirements of the Theorem.
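The defining clauses F(0) = c and F(s(n)) = j(F(n)) translate directly into a loop. In the sketch below, Python integers stand in for the natural numbers with s(n) = n + 1, and the target is the hypothetical Dedekind self-map x ↦ 3x + 2 on the naturals with critical point 0; both choices, and the name `induced_map`, are assumptions made only for illustration.

```python
def j(x: int) -> int:
    """Hypothetical Dedekind self-map on the naturals: 1-1, and 0 is never a value."""
    return 3 * x + 2

c = 0  # critical point of j

def s(n: int) -> int:
    """Successor on the natural numbers (integers stand in for finite ordinals)."""
    return n + 1

def induced_map(j, c):
    """Return F with F(0) = c and F(s(n)) = j(F(n))."""
    def F(n: int) -> int:
        x = c
        for _ in range(n):
            x = j(x)
        return x
    return F

F = induced_map(j, c)
values = [F(n) for n in range(5)]
print(values)
```

Commutativity of the square amounts to the check `F(s(n)) == j(F(n))` for each n tested.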
The work in this section has been largely redundant. We have
already proven the induction and recursion theorems for ω. On the other hand, if ω does not
exist, it is easy to see that ω̄ = ON, since the only ordinals that exist in that case are the finite
ordinals; for that case, we have also provided an induction and recursion theorem (in Section 13).
Nevertheless, we do have in hand now a definition of the class of natural numbers, and this notion
will prove useful.
15 Strengthening Proper Class Dedekind Self-Maps
We turn now to several ways of strengthening a Dedekind self-map of the form j : V → V in the
theory ZFC − Infinity so that some form of the Axiom of Infinity is derivable. Given such a j,
we say that a subset (or more generally, a subclass) A of V is j-closed if, for all x ∈ A, j(x) ∈ A.
Theorem 41. Suppose j : V → V is a Dedekind self-map with critical point a. Suppose A ∈ V
is a set such that a ∈ A and A is j-closed. Then j ↾ A : A → A is a Dedekind self-map with
critical point a.
Proof. This is a straightforward verification.
We point out that, in the theory ZFC − Infinity, it is impossible to prove the existence of
a Dedekind self-map j : V → V and a set A containing the critical point of j for which A is
j-closed, since existence of an infinite set is not provable from that theory. On the other hand,
the global successor function s, together with Vω , provides an example in ZFC.
Properties of the global successor s suggest another related way to generate Dedekind selfmaps:
Theorem 42. There is a Dedekind self-map if and only if there is a set A such that |A| = |s(A)|.
Proof. An easy but rather unenlightening proof can be given as follows: If there is a Dedekind
self-map, an infinite set must exist, so, in particular, ω exists. Since ω = |ω + 1|, we can let A = ω
in this case. For the converse, suppose |A| = |s(A)|. Let κ = |A|. Then κ = |s(κ)| = |κ + 1|. But
for each finite ordinal n, n < n + 1 = |n + 1|. Therefore, κ (hence, A as well) is infinite, and so, in ZFC, A admits a Dedekind self-map.
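The finite half of this argument (n < n + 1 = |s(n)| for every finite ordinal n) can be checked mechanically. The sketch below assumes an encoding of hereditarily finite sets as nested frozensets; the helper names successor and von_neumann are ours and not part of the text.

```python
def successor(x):
    """Global successor s(x) = x ∪ {x} on hereditarily finite sets."""
    return x | frozenset([x])


def von_neumann(n):
    """Return the finite ordinal n = {0, 1, ..., n - 1} as a nested frozenset."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x
```

For each finite ordinal built this way, the successor has exactly one more element, so no finite set A can satisfy |A| = |s(A)|.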
We give an alternative proof⁴⁶ using the tools of category theory. This approach will have the
advantage that it will reveal an alternative characterization of initial Dedekind self-maps. Recall
that, by the Axiom of Foundation, for any set A, {A} must be disjoint from A, so s(A) is the
disjoint union of A and {A}. In category theoretic notation, we may view s(A) as a coproduct of
A and {A}: s(A) = A ⊔ {A}. Since all singleton sets are isomorphic in the category of sets, we
may equally well write
s(A) = A ⊔ 1,⁴⁷
where, as usual, 1 = {0}.
We take a moment to define the notion of coproducts (in any category) more carefully:
Given objects X, Y , the coproduct of X and Y consists of an object X ⊔ Y and morphisms
iX : X → X ⊔ Y and iY : Y → X ⊔ Y with the following universal property: Whenever
h : X → Z and k : Y → Z are morphisms, there is a unique morphism denoted ⟨h, k⟩ which
makes the following diagram commute:
               iX               iY
          X -------→ X ⊔ Y ←------- Y
            \          |          /
           h  \        | ⟨h,k⟩  /  k
                ↘      ↓      ↙
                       Z

The diagram X --iX--> X ⊔ Y <--iY-- Y is called a coproduct diagram. In any model of ZFC −
Infinity, the axioms guarantee the existence of coproducts: Given sets X, Y , X ⊔ Y is the disjoint
union of X, Y ; that is:
X ⊔ Y = {(0, x) | x ∈ X} ∪ {(1, y) | y ∈ Y },
together with the maps iX , iY defined by
iX (x) = (0, x)
iY (y) = (1, y).
Then, given h : X → Z and k : Y → Z, we define ⟨h, k⟩ by ⟨h, k⟩((0, x)) = h(x) and
⟨h, k⟩((1, y)) = k(y). One verifies commutativity of the diagram and uniqueness of ⟨h, k⟩.
⁴⁶ This is a special case of a much deeper result due to Freyd.
⁴⁷ In this context, it is even more usual to write s(A) = A + 1, but to avoid confusion we will adopt the notational
convention mentioned in the text.
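The set-based coproduct just described translates directly into code. The following Python sketch tags elements with 0 or 1 exactly as in the definition; the function names (coproduct, inj_left, inj_right, copair) and the sample sets and maps are our own choices for illustration.

```python
def coproduct(X, Y):
    """Tagged disjoint union X ⊔ Y = {(0, x) | x in X} ∪ {(1, y) | y in Y}."""
    return {(0, x) for x in X} | {(1, y) for y in Y}


def inj_left(x):
    """The injection iX : X -> X ⊔ Y."""
    return (0, x)


def inj_right(y):
    """The injection iY : Y -> X ⊔ Y."""
    return (1, y)


def copair(h, k):
    """Return the mediating map <h, k> : X ⊔ Y -> Z, defined by cases on the tag."""
    def mediating(tagged):
        tag, value = tagged
        return h(value) if tag == 0 else k(value)
    return mediating


# Sample data (hypothetical): X and Y overlap, yet X ⊔ Y keeps them apart.
X = {1, 2, 3}
Y = {2, 3}
h = lambda x: x * 10   # h : X -> Z
k = lambda y: -y       # k : Y -> Z
m = copair(h, k)
```

Commutativity of the two triangles amounts to m(inj_left(x)) = h(x) and m(inj_right(y)) = k(y), and the tagging guarantees |X ⊔ Y| = |X| + |Y| even when X and Y overlap.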
The main lemma for our second proof of the theorem is the following. Let us call a Dedekind
self-map singular if it has only one critical point.
Lemma 43. Suppose A --iA--> A ⊔ 1 <--i1-- 1 is a coproduct diagram. Suppose j : A → A and a : 1 → A
(identifying an element a of A with the map 1 → A that sends 0 to a) are functions. Then
(A) If j is a Dedekind self-map with critical point a, then ⟨j, a⟩ is 1-1, from which it follows
that |A| = |A ⊔ 1|. If, in addition, j is singular, then ⟨j, a⟩ is itself a bijection.
(B) If ⟨j, a⟩ is 1-1, then j is a Dedekind self-map with critical point a. Moreover, if ⟨j, a⟩ is a
bijection, j is a singular Dedekind self-map.
               iA               i1
          A -------→ A ⊔ 1 ←------- 1
            \          |          /
           j  \        | ⟨j,a⟩  /  a
                ↘      ↓      ↙
                       A
Proof of Lemma. For (A), we start by showing ⟨j, a⟩ is 1-1. Suppose ⟨j, a⟩((0, x)) = ⟨j, a⟩((0, y)).
Using the set-based definition of ⟨j, a⟩, it follows that j(x) = j(y), so, by 1-1-ness of j, x = y. On
the other hand, if ⟨j, a⟩((0, x)) = ⟨j, a⟩((1, 0)) (note that (1, 0) is the only element of A ⊔ 1 having
first component 1), it follows that j(x) = a. But this is impossible since a ∉ ran j. We have
shown ⟨j, a⟩ is 1-1, whence |A ⊔ 1| ≤ |A|; by the Schröder-Bernstein Theorem, since |A| ≤ |A ⊔ 1|
as well, it follows that |A| = |A ⊔ 1|.
For the “moreover” clause of (A), suppose j is a singular Dedekind self-map. Let x ∈ A. If
x ≠ a, then x ∈ ran j, say x = j(y), and so ⟨j, a⟩((0, y)) = x. On the other hand, ⟨j, a⟩((1, 0)) = a. It follows
that ⟨j, a⟩ is onto, as required.
For (B), suppose ⟨j, a⟩ is 1-1. Since j = ⟨j, a⟩ ◦ iA and iA is 1-1, it follows that j, being a composite
of 1-1 functions, is itself 1-1. We show a ∉ ran j: Suppose, for a contradiction, that j(x) = a. It
follows that ⟨j, a⟩((0, x)) = a = ⟨j, a⟩((1, 0)), contradicting the 1-1-ness of ⟨j, a⟩. For the “moreover” clause, assume now that ⟨j, a⟩ is a bijection. It suffices to show that every element of A − {a} lies in ran j, that is, that ⟨j, a⟩ ↾ ((A ⊔ 1) − {(1, 0)})
maps (A ⊔ 1) − {(1, 0)} onto A − {a}. Let b ∈ A − {a}. Since ⟨j, a⟩ is onto and ⟨j, a⟩((1, 0)) = a ≠ b, there is x ∈ A with
⟨j, a⟩((0, x)) = b, and so j(x) = b.
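The two clauses of the lemma can be seen on the natural numbers. The sketch below is only a finite-window check, not a proof: we take A to be the naturals, a = 0, and compare the singular map x ↦ x + 1 with the non-singular map x ↦ x + 2 (both hypothetical choices made for illustration). The window size N is arbitrary.

```python
def copair(j, a):
    """<j, a> on A ⊔ 1: (0, x) goes to j(x), and (1, 0) goes to a."""
    return lambda t: j(t[1]) if t[0] == 0 else a


N = 50  # size of the finite window we inspect (an arbitrary choice)
window = [(0, x) for x in range(N)] + [(1, 0)]

succ = copair(lambda x: x + 1, 0)  # singular: 0 is the only critical point
skip = copair(lambda x: x + 2, 0)  # not singular: 1 is a second critical point

succ_values = [succ(t) for t in window]
skip_values = [skip(t) for t in window]
```

Both maps are 1-1 on the window, as part (A) predicts. Only the singular one fills an initial segment with no gaps; the other never produces 1.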
Finally, we prove the theorem. If there is a Dedekind self-map j : A → A, then by part (A) of the
lemma, |A| = |A ⊔ 1| = |s(A)|. Conversely, if |A| = |s(A)|, let q : A ⊔ 1 → A be a bijection and
let a = q((1, 0)). Define j : A → A by j(x) = q((0, x)). With these definitions, the following
diagram is commutative:

               iA               i1
          A -------→ A ⊔ 1 ←------- 1
            \          |          /
           j  \        | q      /  a
                ↘      ↓      ↙
                       A

But, by uniqueness of the morphism ⟨j, a⟩ (by the universal property of coproducts), it follows
that q = ⟨j, a⟩. Since q is a bijection, it follows from the lemma that j : A → A is a Dedekind
self-map.
The theorem, together with the easy proof given in the first paragraph, can be generalized
somewhat, using a reasonable Dedekind self-map j : V → V in place of s:
Corollary 44. There is a set Dedekind self-map if and only if there is a Dedekind self-map
j : V → V and a set A having the following properties:
(A) ran (j ↾ A) ⊊ j(A)
(B) |A| = |j(A)|.
Proof. If there is a set Dedekind self-map, then ω exists. Letting A = ω in the theorem, the
global successor s : V → V has the required properties, as the previous theorem shows. For the
other direction, let j : V → V be a Dedekind self-map satisfying (A) and (B). Let f : j(A) → A
be a bijection. Let i = f ◦ (j ↾ A) : A → A. Since j is 1-1, so is j ↾ A, and so i must also be 1-1.
Since j ↾ A is not onto, let y ∈ j(A) − ran (j ↾ A). Then f (y) is not in the range of i. We have
shown that i : A → A is a Dedekind self-map.
The reader will recall that we went to the trouble of giving a second proof of the previous
theorem because it leads to an alternative characterization of initial Dedekind self-maps. To state
the result, we need to define one other universal construction from category theory. Suppose
f : A → B and g : A → B are functions, with the same domain and codomain. A coequalizer
for f, g is a function q : B → C (for some C) satisfying q ◦ f = q ◦ g, and having the following
universal property: For any h : B → D satisfying h ◦ f = h ◦ g, there is a unique k : C → D
making the following diagram commutative (that is, with k ◦ q = h):

             f
          ------→       q
        A          B ------→ C
          ------→    \       |
             g      h  \     | k
                         ↘   ↓
                             D
Theorem 45. Suppose j : A → A and a ∈ A.
(A) If (A, j, a) is an initial Dedekind self-map, then,
(i) for the coproduct diagram A --iA--> A ⊔ 1 <--i1-- 1, ⟨j, a⟩ is an isomorphism, and
(ii) if u : A → 1 is the unique function from A to 1, then u is a coequalizer for j, idA .
(B) Suppose (i) and (ii) above hold true. Then (A, j, a) is an initial Dedekind self-map.
Proof of (A). For (i), the fact that every initial Dedekind self-map (A, j, a) is singular follows
from the fact that it must be uniquely isomorphic to (ω, s, 0). By the previous lemma, it follows
that ⟨j, a⟩ is an isomorphism. We prove (ii). Let h : A → B be such that
(∗) h ◦ j = h ◦ idA = h.
Certainly h(j(a)) = h(a). Moreover, by (∗), h(j^{n+1}(a)) = h(j^n(a)). It follows that h(j^n(a)) =
h(a) for all n ≥ 0. Since A = {a, j(a), j^2(a), . . .}, it follows that, for all x ∈ A, h(x) = h(a).
Consider the function k : 1 → B : 0 ↦ h(a). Certainly k is the unique function that makes the
diagram commutative.

             j
          ------→       u
        A          A ------→ 1
          ------→    \       |
            idA     h  \     | k
                         ↘   ↓
                             B

We have shown that u coequalizes j, idA .
Proof of (B). Assuming conditions (i) and (ii), we show j is an initial Dedekind self-map with
critical point a. By the Lemma, because (i) holds, j is a singular Dedekind self-map with critical
point a. To see j is initial, letting W = {a, j(a), j^2(a), . . .}, it suffices by previous work to show
W = A. Suppose there is t ∈ A − W . It follows that j(t) ∉ W : j(t) ≠ a since a ∉ ran j, and
j(t) ≠ j^{n+1}(a) since j is 1-1 and t ∉ W . The same argument shows that j(x) ∉ W for every x ∈ A − W . Define h : A → 2 by
h(x) = 0 if x ∈ W,
h(x) = 1 if x ∉ W.
We show h(j(x)) = h(x) for all x ∈ A, establishing that h ◦ j = h ◦ idA . If x ∈ W , this is obvious.
If x ∉ W , as we have shown, j(x) ∉ W , and so h(j(x)) = 1 = h(x).
To complete the proof, we show there is no k : 1 → 2 that makes the following diagram
commutative:

             j
          ------→       u
        A          A ------→ 1
          ------→    \       |
            idA     h  \     | k
                         ↘   ↓
                             2

There are two possible choices for k: Either k(0) = 0 or k(0) = 1. If k(0) = 0, then we have
k(u(t)) = 0 but h(t) = 1. If k(0) = 1, then we have k(u(a)) = 1 but h(a) = 0. Either way, we
have a contradiction. This shows that there is no k that makes the diagram commute, and this
contradicts the fact that u : A → 1 is a coequalizer for j, idA . Therefore, W = A, as required.
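The separating map h used in the proof of (B) can be made concrete. In the sketch below we assume, purely for illustration, that A is the set of natural numbers, j(x) = x + 2, and a = 0, so that W is the set of even numbers and A − W is nonempty. The check runs over a finite window only, and all names are ours.

```python
j = lambda x: x + 2   # 1-1 on the naturals; misses both 0 and 1
a = 0
u = lambda x: 0       # the unique map A -> 1 = {0}


def h(x):
    """h(x) = 0 on W = {a, j(a), j(j(a)), ...} (the evens), and 1 off W."""
    return 0 if x % 2 == 0 else 1


sample = range(40)    # finite window into A (arbitrary size)

# h coequalizes j and the identity: h(j(x)) = h(x).
h_coequalizes = all(h(j(x)) == h(x) for x in sample)

# The only candidates k : 1 -> 2 are k(0) = 0 and k(0) = 1.
candidates = [{0: 0}, {0: 1}]
factorizations = [k for k in candidates
                  if all(k[u(x)] == h(x) for x in sample)]
```

Here h satisfies h ◦ j = h, yet neither candidate k satisfies k ◦ u = h, so u fails to be a coequalizer for this non-initial j, in line with the argument above.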
Another way that a class Dedekind self-map j : V → V could give rise to a set Dedekind
self-map is if j(x) is itself Dedekind-infinite. From a philosophical point of view, this kind of
situation matches the intuition that a critical point of a Dedekind self-map is directly related to
the existence of an infinite set. We will discuss a somewhat more elaborate example of this type
in Section 17.
Two other ways a class map j : V → V could be strengthened to give rise to a Dedekind self-map involve strengthening the preservation properties of j. Preservation properties turn out to be
a key element in strengthening maps like j in such a way that stronger forms of “infinity” emerge.
We introduce several properties that could be preserved by such a j and give two examples that
lead to a derivation of an infinite set from a class Dedekind self-map.
Recall that if C = (O, M) is a category, an initial object I in C is an object with the property
that for every X ∈ O there is a unique m ∈ M such that m : I → X. We saw that in Set,
the empty set is the unique initial object. The “dual” notion is a terminal object. An object
T ∈ O is terminal if, for every X ∈ O there is a unique t ∈ M such that t : X → T . In Set,
every singleton set {x} is terminal since, for any set X, the only function f : X → {x} is the one
defined by f (z) = x for all z ∈ X.
Suppose j : V → V is a 1-1 class function. A set X ∈ V is said to be a strong critical point
if |X| ≠ |j(X)|. It is possible that even with a strong critical point, j may not be a Dedekind
self-map. The relationship between strong critical points and critical points is addressed in Proposition 32 and Theorem 34, below.
Definition 5. Suppose j : V → V is any function.
(1) j is said to preserve disjoint unions if, whenever X, Y ∈ V are disjoint, j(X), j(Y ) are also
disjoint and j(X ∪ Y ) = j(X) ∪ j(Y ).
(2) j preserves singletons if, for any X, j({X}) = {j(X)}.
(3) j is said to preserve coproducts if, whenever X, Y are disjoint, |j(X ∪ Y )| = |j(X) ∪ j(Y )|.
Moreover, j preserves finite coproducts if, whenever n ∈ ω and X1 , X2 , . . . , Xn are disjoint,
|j(X1 ∪ X2 ∪ · · · ∪ Xn )| = |j(X1) ∪ j(X2) ∪ · · · ∪ j(Xn)|.
(4) j preserves terminal objects if, whenever T is terminal, j(T ) is terminal. In particular, j
maps singleton sets to singleton sets since, in Set, the terminal objects are precisely the
singleton sets.
(5) j preserves initial objects if, whenever I is initial, j(I) is also initial. In particular, j(∅) = ∅,
since ∅ is the unique initial object in Set.
(6) j is cofinal if, for every a ∈ V there is A ∈ V with a ∈ j(A).
(7) j preserves subsets if, whenever X ⊆ Y , we have j(X) ⊆ j(Y ).
Remark 7. Note that preservation of disjoint unions is different from the property that every
1-1 function f has, namely, that whenever A, B are disjoint, the images f [A], f [B] are disjoint,
and f [A ∪ B] = f [A] ∪ f [B]. The example f : V → V defined by f (x) = {x} highlights the
distinction since
f ({1} ∪ {2}) = f ({1, 2}) = {{1, 2}} ≠ {{1}, {2}} = f ({1}) ∪ f ({2}),
and yet
f [{1} ∪ {2}] = f [{1, 2}] = {f (1), f (2)} = f [{1}] ∪ f [{2}].
We also observe here that whenever j preserves disjoint unions, j preserves subsets: Given
sets A, B with A ⊆ B, note that B = A ∪ (B − A), which is a disjoint union, and so we may write
j(B) = j(A) ∪ j(B − A), so that j(A) ⊆ j(B).
Finally, we remark that if j has critical point a and j preserves singletons and disjoint
unions, then {a} is also a critical point of j: Suppose {a} = j(z). If z is not a singleton, let z0 ∈ z
and, because {z0 }, z − {z0 } is a partition of z, by preservation of singletons and disjoint unions,
j(z) = {j(z0 )} ∪ j(z − {z0 }), and so |j(z)| > 1, which is impossible. Therefore, z is a singleton
set {y}. We have {a} = j(z) = j({y}) = {j(y)}, and so a = j(y), contradicting the fact that a is
a critical point of j.
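The distinction drawn in Remark 7 between applying f to a union and taking the image of a union can be replayed in a few lines of Python. The encoding of sets as frozensets and the helper name image are assumptions of this sketch.

```python
def f(x):
    """The singleton map f(x) = {x}."""
    return frozenset([x])


def image(func, s):
    """The pointwise image func[s] = {func(y) | y in s}."""
    return frozenset(func(y) for y in s)


one, two = frozenset([1]), frozenset([2])

applied_to_union = f(one | two)      # f({1} ∪ {2}) = {{1, 2}}
union_of_applied = f(one) | f(two)   # f({1}) ∪ f({2}) = {{1}, {2}}
```

The first two values differ, so f does not preserve disjoint unions, while image(f, one | two) and image(f, one) | image(f, two) coincide, as they do for every function.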
Theorem 46. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map. Suppose also
that j preserves disjoint unions, the empty set, and singletons. Then there is an infinite set.
Proof. We show by induction on n that j preserves finite disjoint unions. The result is clear for
n = 2 since j preserves disjoint unions. Assume that whenever {X1 , . . . , Xn } is a collection of n
disjoint sets, then the sets j(X1), . . . , j(Xn) are also disjoint, and
j(X1 ∪ · · · ∪ Xn ) = j(X1 ) ∪ · · · ∪ j(Xn).
Given disjoint sets Y1 , . . . , Yn , Yn+1 , certainly j(Y1 ), . . . , j(Yn) are disjoint. Let Y = Y1 ∪ · · · ∪ Yn .
By induction hypothesis, j(Y ) = j(Y1 ) ∪ · · · ∪ j(Yn ). Since Y and Yn+1 are disjoint, so are j(Y )
and j(Yn+1 ), and
j(Y1 ∪ · · · ∪ Yn ∪ Yn+1 ) = j(Y ∪ Yn+1 ) = j(Y ) ∪ j(Yn+1 ) = j(Y1 ) ∪ · · · ∪ j(Yn ) ∪ j(Yn+1 ),
as required. This completes the induction and proves the claim.
We proceed by induction to show that j(n) = n for all n ∈ ω. By assumption on j, j(∅) = ∅.
Assume j(n) = n. Note that for all n ∈ ω, n, {n} are disjoint because (n, ∈) is a well-ordering
(so, in particular, n ∉ n). We complete the induction step and the proof that j(n) = n for all
n ∈ ω with the following:
j(n + 1) = j(n ∪ {n}) = j(n) ∪ j({n}) = j(n) ∪ {j(n)} = n ∪ {n} = n + 1.
Next we show that, for all x ∈ HF, j(x) = x. As we observed earlier,⁴⁸ HF = ⋃_{n∈ω} Vn . For
each x ∈ HF, let rank x denote the least n ∈ ω for which x ⊆ Vn .⁴⁹ We proceed by induction
on n ∈ ω to show that, for all x of rank ≤ n, j(x) = x. This is clear for n = 0, since j(∅) = ∅.
Assume the result for n. For the induction step, it suffices to show that if rank x = n + 1, then
j(x) = x. Given such an x, certainly x is finite; write x = {y1 , . . . , yk }. For each i, yi ∈ x ⊆ Vn+1 ,
so yi ⊆ Vn and rank yi ≤ n. By the induction hypothesis, for each i, j(yi) = yi . We have
j(x) = j({y1 , . . . , yk })
     = j(⋃_{1≤i≤k} {yi })
     = ⋃_{1≤i≤k} j({yi })
     = ⋃_{1≤i≤k} {j(yi)}
     = ⋃_{1≤i≤k} {yi }
     = {y1 , . . . , yk }
     = x.
This completes the induction and proves that, for all x ∈ HF, j(x) = x.
⁴⁸ A proof is given in a footnote on p. 42. The proof that this is true when ω does not exist is essentially
the same even though, in that case, the union of the Vn is a proper class.
⁴⁹ More generally, for any x ∈ V , rank x is defined to be the least ordinal α for which x ⊆ Vα .
Finally, we complete the proof of the theorem: Let a be a critical point of j. Since j ↾ HF
is just the identity map on HF, which is onto HF, a ∉ HF. Therefore, every transitive set that
contains a is infinite; since the Trans axiom holds, some transitive set must include a. Therefore,
there is an infinite set.
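The induction showing that j fixes every hereditarily finite set can be mimicked computationally. The sketch below does not construct a Dedekind self-map; it only evaluates what the three preservation rules force j to be on HF, using an assumed encoding of HF sets as nested frozensets. The names j_on_hf and von_neumann are ours.

```python
def j_on_hf(x):
    """Value forced on a hereditarily finite x by the three rules:
    j(∅) = ∅, j({y}) = {j(y)}, and j(X ∪ Y) = j(X) ∪ j(Y) for disjoint X, Y."""
    result = frozenset()                    # j(∅) = ∅
    for y in x:                             # x is the disjoint union of the {y}
        result |= frozenset([j_on_hf(y)])   # j({y}) = {j(y)}
    return result


def von_neumann(n):
    """The finite ordinal n as a nested frozenset."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])
    return x


samples = [von_neumann(n) for n in range(6)]
samples.append(frozenset([von_neumann(2), frozenset([von_neumann(3)])]))
```

Every sample is returned unchanged, matching the conclusion that j ↾ HF is the identity, so a critical point of j must lie outside HF.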
A filter F on a set A is a collection of subsets of A that is closed under intersections and
supersets, and for which A ∈ F and ∅ ∉ F (sometimes such an F is called a proper filter). If
there is X ⊆ A such that F = {Y | X ⊆ Y }, F is a principal filter with generator X. If there is
x ∈ A such that {x} ∈ F , then F is a trivial filter. Every trivial filter is principal.
A filter F is an ultrafilter if, for every X ⊆ A, either X or A − X belongs to F . If F is an
ultrafilter, F is nontrivial if and only if F is nonprincipal. If F is a nonprincipal ultrafilter on A,
F contains no finite sets: If Y ⊆ A is finite, Y = {y1 , . . . , yk }, then,
since {yi } ∉ F for 1 ≤ i ≤ k, each A − {yi } ∈ F , and so, by closure under intersections,
A − Y = ⋂_{1≤i≤k} (A − {yi }) ∈ F,
whence Y ∉ F .
It is possible to construct nontrivial filters on a finite set, and also to construct an ultrafilter
on a finite set. But it is not possible to construct a nontrivial (nonprincipal) ultrafilter on a finite
set, as is shown in the previous paragraph. These observations give us another equivalent form
of the Axiom of Infinity:
Ultrafilter Criterion for Infinite Sets. There is an infinite set if and only if there
is a nonprincipal ultrafilter.
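The finite half of the criterion can be confirmed by brute force on a small set. The Python sketch below enumerates every family of subsets of a three-element set (a choice made only to keep the search small) and keeps those that are ultrafilters in the sense defined above. The helper names are ours.

```python
from itertools import chain, combinations


def powerset(s):
    """All subsets of s, each returned as a frozenset."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def is_filter(F, A, subsets):
    """Proper filter: contains A, omits ∅, closed under ∩ and supersets."""
    if A not in F or frozenset() in F:
        return False
    closed_meet = all((X & Y) in F for X in F for Y in F)
    closed_up = all(Y in F for X in F for Y in subsets if X <= Y)
    return closed_meet and closed_up


def is_ultrafilter(F, A, subsets):
    """A filter containing X or A - X for every X ⊆ A."""
    return is_filter(F, A, subsets) and all(
        (X in F) or ((A - X) in F) for X in subsets)


A = frozenset({0, 1, 2})
subsets = powerset(A)
families = powerset(subsets)   # all 256 families of subsets of A
ultrafilters = [F for F in families if is_ultrafilter(F, A, subsets)]
```

Exactly three ultrafilters survive, one for each point of A, and each contains a singleton; none is nonprincipal.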
Using the Axiom of Choice, one can construct a nonprincipal ultrafilter on any infinite set.
Ultrafilters (including nonprincipal ultrafilters) also arise from Dedekind self-maps j : V → V ,
equipped with adequate preservation properties, as we discuss next in Theorem 47.
Theorem 47. Suppose j : V → V is a class Dedekind self-map with critical point a and there is
a set A such that a ∈ j(A). Let D = {X ⊆ A | a ∈ j(X)}.
(1) If j preserves subsets, intersections, and the empty set, then D is a filter.
(2) If, in addition to (1), one of the following conditions holds, then D is a nontrivial filter:
(A) j preserves singletons.
(B) j preserves terminals and {a} is a second critical point of j.
(3) If, in addition to (1), j preserves disjoint unions, then D is an ultrafilter. Moreover, if one
of the conditions in (2) also holds, D is a nonprincipal ultrafilter (and A is infinite).
Motivation for the conditions in part (2) is given in Remark 7.
Remark 8. In the absence of (3), properties (1) and (2) are rather weak. We give an example
to illustrate this point, but also to motivate a more complex example for which (3) does hold.
Suppose A is a set with two or more elements, and define j : V → V by j(X) = X^A =
{f | f : A → X}. Let idA : A → A denote the identity map. Then idA ∈ j(A). Clearly
idA and {idA } are critical points of j. Also, for any sets X, Y , X ≠ Y ⇒ X^A ≠ Y^A , and so
j is 1-1; indeed, j is a Dedekind self-map. Since ∅^A = ∅, j preserves the empty set. For any
set X, j({X}) = {X}^A = {t}, where t : A → {X} is the unique function from A to {X};
therefore, j preserves terminals. Finally, notice j preserves intersections: It suffices to show that
(X ∩ Y )^A = X^A ∩ Y^A . If f ∈ (X ∩ Y )^A , it follows easily that f ∈ X^A ∩ Y^A . Now suppose
f ∈ X^A ∩ Y^A , so that ran f ⊆ X and ran f ⊆ Y . It follows that ran f ⊆ X ∩ Y , and so
f ∈ (X ∩ Y )^A .
We have shown that this particular j : V → V satisfies properties (1) and (2) of Theorem 47,
and so we conclude that D is a nontrivial filter. However, j does not satisfy property (3): Given
two nonempty disjoint subsets B, C of A with B ∪ C = A, while it is true that B^A and C^A
are disjoint, it is not the case that B^A ∪ C^A = (B ∪ C)^A ; indeed, idA belongs to the set on the
right-hand side, but not to the set on the left-hand side.
In fact, D is rather uninteresting: Although it is true that A ∈ D (since idA ∈ j(A)), D
contains no other set! If X ⊊ A, then it is not the case that idA ∈ X^A .
To make D more interesting, and to ensure that j preserves disjoint unions, it is necessary
to “blur” the sharp differences between the sets of the form Z A , and this can be done with an
appropriate choice of equivalence relation.
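The behaviour of the map X ↦ X^A described in Remark 8 can be inspected on a two-element set. In the sketch below, a function A → X is encoded as the tuple of its values (an encoding we assume for illustration), so that idA becomes the tuple (0, 1).

```python
from itertools import chain, combinations, product

A = (0, 1)  # a two-element set, listed in a fixed order


def j(X):
    """j(X) = X^A, each function f encoded as the tuple (f(0), f(1))."""
    return set(product(sorted(X), repeat=len(A)))


identity = A  # id_A is encoded as (0, 1)

subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(A, r) for r in range(len(A) + 1))]

# D = {X ⊆ A | id_A ∈ j(X)}
D = [X for X in subsets if identity in j(X)]

B, C = {0}, {1}  # a partition of A into nonempty pieces
```

The computation gives D = {A}, confirms that j(∅) = ∅ and that intersections are preserved on the sample sets, and shows that j(B) ∪ j(C) is strictly smaller than j(B ∪ C).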
Proof of (1). A ∈ D by the definition of D and ∅ ∉ D since j(∅) = ∅. D is closed under
intersections since j preserves intersections. D is also closed under supersets since j preserves
subsets. We have shown that D is a filter.
Proof of (2). We verify D is nontrivial assuming either (2)(A) or (2)(B). For this, it suffices to
show that D has no element of the form {z}. Suppose for some z ∈ A, {z} ∈ D.
If (2)(A) holds, then a ∈ j({z}) = {j(z)}, since j preserves singletons in this case. It follows
a = j(z), which is impossible since a is a critical point of j.
If, instead, (2)(B) holds, then a ∈ j({z}) = {y} for some y ∈ j(A), since j preserves
terminals. It follows that a = y, and so {a} = j({z}); in other words, {a} ∈ ran j contradicting
the assumption that {a} is a second critical point for j.
We have shown that D is a nontrivial filter.
Proof of (3). Assuming j preserves disjoint unions, we show that D is an ultrafilter: Suppose
X ⊆ A and X ∉ D. Let Y = A − X. We show a ∈ j(Y ). Since j preserves disjoint unions,
j(A) = j(X) ∪ j(Y ); since a ∉ j(X), then a ∈ j(Y ), as required. If one of the conditions in (2)
also holds, then, by the argument establishing (2), D is a nonprincipal ultrafilter. It follows that
A itself is infinite.
Remark 9. We modify the example described in Remark 8 in a way that leads to the conclusion
that the filter D derived from j is a nonprincipal ultrafilter. As we mentioned in that remark, we
need to modify the map X ↦ X^A that was used there by introducing an appropriate equivalence
relation. One of the problems to fix from that example is that the only element of D is A. This
limitation occurs because the only way the function idA : A → A could be a member of X^A ,
for X ⊆ A, is if X = A. This requirement can be relaxed if we allow the possibility that idA
“almost” belongs to X^A . We can do this by declaring two partial functions defined on A to be
equivalent if they are both defined on a “large” subset of A and agree on a “large subset” of A.
This new approach solves the problem: If f : A → X is a function that agrees with idA on a large
set (because X itself is large), we will be able to conclude that X ∈ D.
The concept of “large subset” can be captured by making use of a filter. For this purpose,
let U denote a (proper) filter on A. We do not require U to be a nonprincipal ultrafilter, so it
is perfectly possible for A to be finite at this step. If f : A → Z is a partial function, we call f
U-good if {x ∈ A | f (x) is defined} ∈ U . For f, g : A → Z that are U-good, we define ∼ = ∼_{A,U,Z}
by
f ∼ g iff {x ∈ A | f (x) = g(x)} ∈ U.
We note that, for any U-good partial function f : A → Z for which |Z| ≥ 1, there is a total function f′ : A → Z such that
f ∼ f′ : Let z ∈ Z. Define f′ by
f′(x) = f (x) if x ∈ dom f,
f′(x) = z otherwise.
Clearly f′ ∼ f .
Denote the equivalence class that contains f by [f ]U , or simply [f ]. For each set Z, we define
Z^A /U = {[f ]U | f : A → Z is U-good}.
Finally, define j : V → V by j(Z) = Z^A /U . We first show, in Claims 1–5 below, that j has
property (1), mentioned in Theorem 47:
Claim 1. j(∅) = ∅.
Proof. This follows because ∅^A = ∅: the only partial function from A into ∅ is the empty function, which is not U-good since ∅ ∉ U .
Claim 2. j is 1-1.
Proof. Suppose Y ≠ Z are sets; without loss of generality, let y ∈ Y − Z. Consider the constant
function f_y : A → Y defined by f_y (x) = y. Clearly f_y agrees nowhere with any partial function
A → Z. Therefore [f_y ] ∈ Y^A /U − Z^A /U , and so j(Y ) ≠ j(Z).
Claim 3. j preserves terminal objects.
Proof. We show j takes singleton sets to singleton sets. Given a singleton set {z}, let fz be the
unique function A → {z}. Then
j({z}) = {z}^A /U = {[f ] | f is a partial function from a set B ∈ U into {z}} = {[fz ]}.
Claim 4. [idA ] is a critical point of j. Therefore, j is a Dedekind self-map. Moreover, [idA ] ∈ j(A).
Proof. Certainly, [idA ] is not itself of the form Z^A /U , so [idA ] is a critical point. Since idA ∈ A^A ,
certainly [idA ] ∈ A^A /U = j(A).
Claim 5. j preserves intersections.
Proof. Suppose X, Y are sets and Z = X ∩ Y . Suppose [f ] ∈ X^A /U ∩ Y^A /U . Let B, C ∈ U be
such that for all x ∈ B, f (x) ∈ X and for all x ∈ C, f (x) ∈ Y . Since U is closed under intersections, B ∩ C ∈ U , and we have that for all x ∈ B ∩ C, f (x) ∈ X ∩ Y , so [f ] ∈ (X ∩ Y )^A /U . For
the converse, if E ∈ U is such that for all x ∈ E, f (x) ∈ X ∩ Y , then it follows easily, using the
fact that U is closed under supersets, that [f ] ∈ X^A /U and [f ] ∈ Y^A /U .
We have shown property (1) of Theorem 47 holds for j. Therefore, if D = {X ⊆ A | [idA ] ∈
j(X)}, then D is a filter. In the present context, this is particularly unsurprising in light of the
next Claim:
Claim 6. D = U .
Proof. Suppose X ⊆ A. Then
X ∈ D ⇔ [idA ] ∈ j(X) ⇔ [idA ] ∈ X^A /U ⇔ {x ∈ A | x ∈ X} ∈ U ⇔ X ∈ U.
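Claim 6 can be checked by exhaustive search when A is small. The sketch below assumes A = {0, 1, 2} and takes U to be the principal filter generated by {0, 1}; both are hypothetical choices, permitted here because U is not yet required to be an ultrafilter. Partial functions are encoded as tuples, with None marking an undefined value, and we rely on the elements of A coinciding with tuple positions.

```python
from itertools import chain, combinations, product

A = (0, 1, 2)
subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(A, r) for r in range(len(A) + 1))]

generator = frozenset({0, 1})                 # hypothetical choice of filter
U = {S for S in subsets if generator <= S}    # a proper (principal) filter on A


def partial_functions(X):
    """All partial functions from A to X as tuples; None means 'undefined'."""
    return product([None] + sorted(X), repeat=len(A))


def is_good(f):
    """U-good: the domain of f belongs to U."""
    return frozenset(x for x in A if f[x] is not None) in U


def equivalent_to_identity(f):
    """f ~ id_A: the set where f(x) = x belongs to U."""
    return frozenset(x for x in A if f[x] == x) in U


def identity_class_in_j(X):
    """Does [id_A] belong to X^A / U?"""
    return any(is_good(f) and equivalent_to_identity(f)
               for f in partial_functions(X))


D = {X for X in subsets if identity_class_in_j(X)}
```

The search returns D = U, and in particular D now contains the proper subset {0, 1} of A, which the unquotiented map of Remark 8 could not achieve.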
We next show that j preserves disjoint unions only if our starting filter U was already an
ultrafilter:
Claim 7. The following are equivalent:
(A) j preserves disjoint unions.
(B) U is an ultrafilter (equivalently, D is an ultrafilter).
Proof. First, observe that (A) ⇒ (B) follows from Theorem 47, part (3), using the fact that
D = U . For the converse, suppose X, Y are disjoint and let Z = X ∪ Y . Since U is closed under intersections and ∅ ∉ U , no function can take values in X on a set in U and values in Y on a set in U ; hence
X^A /U ∩ Y^A /U = ∅, and so j(X) ∩ j(Y ) = ∅. It is also obvious that j(X) ∪ j(Y ) ⊆ j(Z). To prove equality,
let [f ] ∈ Z^A /U . Let SX = {x ∈ A | f (x) ∈ X} and SY = {x ∈ A | f (x) ∈ Y }. Since SX ∪ SY ∈ U
and U is an ultrafilter, one of SX , SY belongs to U , say SX . Then [f ] = [f ↾ SX ] ∈ X^A /U . We
have shown that each [f ] in Z^A /U belongs to X^A /U ∪ Y^A /U .
Assuming from the beginning that U is an ultrafilter, we have established that j preserves
disjoint unions and that D is therefore, by Theorem 47, also an ultrafilter. None of the arguments so far require A to be infinite or D to be nonprincipal. In the present setting, the only
way D could turn out to be a nonprincipal ultrafilter, given the assumptions we have made so
far on U , is if {[idA ]} turns out to be a critical point of j, which could happen only if U itself
was initially assumed to be nonprincipal. Unfortunately, therefore, tracing through this example
does not reveal the mechanics by which a nonprincipal ultrafilter comes into existence; instead, it
shows that existence of a nonprincipal ultrafilter is equivalent to existence of a Dedekind self-map
having properties (1), (2)(B), and (3) in Theorem 47. The next Claim establishes the remaining
details.
Claim 8. Assume U is an ultrafilter. Then the following are equivalent:
(A) {[idA ]} is a critical point of j.
(B) U (equivalently, D) is a nonprincipal ultrafilter on A, whence A is infinite.
Proof. The fact that (A) ⇒ (B) follows from Theorem 47(2)(B), using the fact that D = U . For
the converse, assume U is nonprincipal but {[idA ]} is in the range of j, so that {[idA ]} = Z^A /U ,
for some set Z; we will arrive at a contradiction.
First, we show that Z itself must be a singleton set: If Z has two distinct elements y, z, the constant
functions f_y : A → Z : x ↦ y and f_z : A → Z : x ↦ z agree nowhere, and so [f_y ] ≠ [f_z ]; hence,
|j(Z)| = |Z^A /U | > 1, contradicting our assumption that {[idA ]} = Z^A /U .
Therefore, Z = {z} for some z. Let f be the unique function from A to {z}. Then
Z^A /U = {[f ]} = {[idA ]}; in other words, f ∼ idA . It follows that {x ∈ A | f (x) = idA (x)} ∈ U ;
that is, {x ∈ A | z = x} ∈ U , and so {z} ∈ U . Since U is nonprincipal, this is impossible. We
have shown therefore that {[idA ]} is also a critical point of j.
Claims 7 and 8 could have been presented and proven in reverse order, with slight modifications; the only change in the proofs is that “nonprincipal” must be replaced with “nontrivial” in
the new version of Claim 8 (which we will now call Claim 7’). We give the restatements here:
Claim 7’. The following are equivalent for j, U as originally defined in this example:
(A) {[idA ]} is a critical point of j.
(B) U (equivalently, D) is a nontrivial filter on A.
Claim 8’. Assume U is a nontrivial filter on A. Then the following are equivalent:
(A) j preserves disjoint unions.
(B) U (equivalently, D) is a nonprincipal ultrafilter, whence A is infinite.
The condition in Theorem 47 that the critical point a must belong to a set of the form
j(A) for some set A can fail even when the other conditions on j hold; indeed, the example in
Remark 9 illustrates this: A is a critical point for j but does not belong to any set of the
form j(X) = X^A /U .
We list several sufficient conditions here for the existence of such a set A, but most of these
involve concepts that will be introduced later.
(a) j is cofinal (p. 73)
(b) j preserves ∈ and both preserves and reflects ordinals (definitions on p. 88).
(c) j has a universal element (defined on p. 100).
Verification of (b) can be found on p. 90. The example given in Remark 9 is an example of (c),
as we will see later (p. 100).
Combining Theorem 47 with condition (a) leads to a nice sufficient condition for the existence
of a nonprincipal ultrafilter:
Proposition 48 (ZFC − Infinity). Suppose j : V → V is a cofinal class Dedekind self-map with
critical point a. Suppose also that j preserves disjoint unions, intersections, the empty set, and
singletons. Then there is a nonprincipal ultrafilter over some (infinite) set A.
Proof. By cofinality of j, there is a set A such that a ∈ j(A). Define D by D = {X ⊆ A | a ∈
j(X)}. Then by the proof of Theorem 47, D is a nonprincipal ultrafilter on A.
We turn now to a third set of preservation properties that a Dedekind self-map may exhibit,
which imply existence of an infinite set.
Theorem 49. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map with a strong
critical point. Suppose j preserves finite coproducts and terminal objects. Then there is an infinite
set.
Proof. We first show that
(∗)
for every finite set X, |j(X)| = |X|.
First, suppose X ≠ ∅. Write X = {x1 , . . . , xn}. Because j preserves finite coproducts and
terminal objects,
|j(X)| = |j({x1}) ∪ · · · ∪ j({xn})| = |{y1 , . . . , yn }| = |X|,
where, for each i, {yi } = j({xi}).
We show that j(∅) = ∅, so, in particular, |j(∅)| = |∅|. Suppose not. Then |j(∅)| ≥ 1 and so,
since terminal objects are preserved, for some set y,
|j(∅) ∪ j({∅})| = |j(∅)| + |j({∅})| = |j(∅)| + |{y}| ≥ 2.
But this contradicts the fact that coproducts are preserved, since, for this same set y,
|j(∅ ∪ {∅})| = |j({∅})| = |{y}| = 1.
Finally, we show there is an infinite set: Let Z be a strong critical point of j, so that
|j(Z)| ≠ |Z|. But now (∗) implies that Z is not finite. Therefore, Z is infinite.
Remark 10. The proof of Theorem 49 does not actually require j to be a Dedekind self-map;
the same argument works assuming only that j : V → V has a strong critical point and preserves
finite coproducts and terminal objects.
Remark 11. The proofs of Theorems 46, 47, and 49 show that any Dedekind self-map on V
that exhibits the specified preservation properties produces an infinite set, and, in the case of
Theorem 47, a nonprincipal ultrafilter. Self-maps with such properties cannot be proven to
exist in the theory ZFC − Infinity by Gödel Incompleteness. However, assuming existence of a set
Dedekind self-map—or, equivalently, existence of ω—such examples can be found. As an example
of a j : V → V that exhibits the properties specified in the hypotheses of Theorems 47(2) and 49,
we summarize (with a more concrete example) the points made in Remark 9. We follow this with
a simpler example that satisfies the properties of Theorem 49 only. We do not have an example
of a j : V → V with precisely the properties of Theorem 46 or of Theorem 47(1); in particular,
our examples do not preserve singletons (only terminal objects). Such examples do exist however,
but, so far, the only examples we know involve the use of large cardinal hypotheses.
Example 4 (Reduced Product Construction). We recap the development of the example in
Remark 9 using a more concrete example and proving one additional property that it has. In
particular, assuming ω exists, we exhibit a class Dedekind self-map with the properties listed
in Theorem 47(2) and Theorem 49: As was mentioned earlier, we may obtain a nonprincipal
ultrafilter D on ω. We recall several definitions from Remark 9. Given any set X, if f, g are both
D-good partial functions from ω to X, we declare f, g are equivalent, and write f ∼ g, if the set
of n ∈ ω at which f, g are both defined and equal belongs to D. Let [f ] denote the ∼-equivalence
class containing f and let
X^ω /D = {[f ] | f is a D-good partial function from ω to X}.
Define jD : V → V by
jD (X) = X^ω /D.
We note as before that, for any D-good partial function f from ω to X for which |X| ≥ 1, there is a total function f′ : ω → X
such that f ∼ f′ : Let x0 ∈ X. Define f′ by
f′(n) = f (n) if n ∈ dom f,
f′(n) = x0 otherwise.
Clearly f′ ∼ f .
The following claims are proved in Remark 9 (setting A = ω).
Claim 1. jD (∅) = ∅.
Claim 2. jD is 1-1.
Claim 3. jD preserves terminal objects.
Claim 4. Both [idω ] and {[idω ]} are critical points of jD . Also,
D = {X ⊆ ω | [idω ] ∈ jD (X)}.
Claim 5. jD preserves disjoint unions. Consequently, jD preserves all finite coproducts.
Claim 6. jD preserves intersections.
The following claim shows that our example also illustrates Theorem 49:
Claim 7. ω is a strong critical point of jD .
Proof. Define, for each n ∈ ω, the function f_n : ω → ω by f_n(i) = n + i. Then, whenever m ≠ n,
f_m and f_n disagree everywhere, so [f_m] ≠ [f_n]. Therefore {[f_n] | n ∈ ω} is an infinite subset of j_D(ω) = ω^ω/D. We
show that j_D(ω) is in fact uncountable. Suppose {[g_n] | n ∈ ω} is any countably infinite subset of ω^ω/D, where each g_n is total.
Define h : ω → ω by
h(n) = least element of ω − {g_i(n) | i < n}.
Then for all i ∈ ω and all n > i, h(n) ≠ g_i(n). Thus h and g_i can agree only on a subset of {0, . . . , i}, which is finite and
therefore does not belong to the nonprincipal ultrafilter D; in particular [h] ≠ [g_i] for all i ∈ ω. Since no countable subset
exhausts j_D(ω), we have shown j_D(ω) is uncountable, so ω < |j_D(ω)|. Therefore, ω is a strong critical point.
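The diagonal function h in the proof of Claim 7 can be tried out on a finite list of functions. The following is only a sketch: the helper name `diagonal` and the sample family are illustrative assumptions, and a finite list can only hint at the countable argument in the text.

```python
from typing import Callable, List

def diagonal(gs: List[Callable[[int], int]]) -> Callable[[int], int]:
    """Return h with h(n) = least natural number not in {g_i(n) : i < n}.

    Hypothetical helper: with a finite list, only the functions
    g_0, ..., g_{min(n, len(gs)) - 1} are consulted at input n.
    """
    def h(n: int) -> int:
        used = {gs[i](n) for i in range(min(n, len(gs)))}
        k = 0
        while k in used:
            k += 1
        return k
    return h

# Sample family (an assumption for illustration): g_i(n) = n + i,
# the same shape as the functions f_n used at the start of the proof.
gs = [lambda n, i=i: n + i for i in range(6)]
h = diagonal(gs)

# For each i, record the inputs n < 50 at which h agrees with g_i.
agreements = {
    i: [n for n in range(50) if h(n) == gs[i](n)]
    for i in range(len(gs))
}
```

Every recorded agreement occurs at some n ≤ i, so each agreement set is finite; this is the property that keeps [h] distinct from every [g_i] once D is nonprincipal.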
Example 5 (Simple Model of Theorem 49). Assuming that ω does exist, we construct a Dedekind
self-map j : V → V having the properties mentioned in the hypothesis of Theorem 49:
\[
j(x) = \begin{cases} x & \text{if } x \text{ is finite}, \\ \mathcal{P}(x) & \text{if } x \text{ is infinite.} \end{cases}
\]
The first clause of the definition ensures that j preserves terminal objects, since such objects
must always be finite. Since the powerset operator is 1-1, so is j. Notice ω is a strong critical
point by Cantor’s Theorem; it is easy to see that ω is a critical point as well. We show that j
preserves coproducts. Suppose X, Y are disjoint sets. The cases in which one of X, Y is finite are
straightforward; assume both are infinite. Then
\[
|j(X \cup Y)| = |\mathcal{P}(X \cup Y)| = 2^{\max\{|X|,|Y|\}} = \max\{2^{|X|}, 2^{|Y|}\} = |\mathcal{P}(X) \cup \mathcal{P}(Y)| = |j(X) \cup j(Y)|.
\]
The proof that finite coproducts are preserved is a straightforward induction.
Remark 12. We review the logic of Theorems 46, 47, and 49, along with Examples 4 and 5.
Theorems 46, 47, and 49 guarantee the existence of an infinite set from the existence of a class
Dedekind self-map with sufficiently strong preservation properties. Theorem 47 guarantees, in
addition, the existence of a nonprincipal ultrafilter, also derivable from a class Dedekind self-map
with sufficiently strong preservation properties.
Conversely, Example 5 shows that assuming the existence of an infinite set, one may recover
a Dedekind self-map with the properties listed in Theorem 49. And Example 4 shows how,
assuming existence of a nonprincipal ultrafilter on an infinite set, one can define a Dedekind
self-map with the properties listed in Theorem 47.
Our work so far shows that a class Dedekind self-map, equipped with sufficiently strong
preservation properties, gives rise to an infinite set, and therefore to ω, and also to a nonprincipal
ultrafilter. Another sort of structure, in the domain of graph theory, also arises with the emergence
of ω, called ω-trees; the existence of a nonprincipal ultrafilter provides a key for establishing the
main property that belongs to all trees of this type.
Example 6 (Tree Property for ω). A tree is a partially ordered set (T, <) having a smallest
element (called the root), with the property that, for every t ∈ T, the set Pr_t = {s ∈ T | s ≤ t}
(the predecessors of t) is well-ordered by ≤. We identify this definition of a tree with that of a
rooted tree from graph theory (a rooted tree is an undirected graph that is connected, has no
cycles, and has a distinguished vertex called the root).
Thus, there is a path from x to y in T (as a graph) if and only if either x ≤ y or y ≤ x;
likewise, for any x ∈ T , the descendants of x are those y ∈ T for which x < y, and the children
of x are those descendants of x that are immediate successors of x. A chain in a tree T is
any linearly ordered subset of T. A tree has levels, denoted Lev^T_0, Lev^T_1, Lev^T_2, . . . , or, when the
meaning is clear, Lev_0, Lev_1, . . . .50 If r denotes the root of a tree, Lev_0 = {r}, and Lev_1 consists
of the children of r, and so forth.51 An ω-tree is a tree with exactly ω levels and for which every
level is finite; in particular, every element has finitely many children. A binary ω-tree is an ω-tree
in which each vertex has at most two children. For any m ∈ ω ∪ {ω}, an m-branch in an ω-tree is
a subset B of T, linearly ordered by <, which intersects each of the levels in L = {Lev_i | i < m}
exactly once. A maximal branch is a subset B of T that is an m-branch for every m ∈ ω. We
can make use of the fact that ω admits a nonprincipal ultrafilter to prove the following:
50 In the general case, this enumeration may extend into the transfinite.
51 Further specification is required for trees having ω or more nonempty levels; we will not need this specification
here.
Proposition 50. ω has the tree property. That is, every ω-tree has a maximal branch.
Proof. Given an ω-tree (T, <T ), it is possible to list its elements level by level using the elements
of ω as indices. Therefore, we identify the underlying set of this tree with ω, ordered by the
tree ordering <T . Note that identifying vertices of T with elements of ω in this way implies the
following: For any m, n ∈ ω, m <_T n implies m < n. We let D be a nonprincipal ultrafilter
on ω (i.e., on T).
For each t ∈ T, we let S_t be the subtree of T at t; precisely:
S_t = {t} ∪ {s ∈ T | s is a descendant of t in T}.
Notice that if t1 ≠ t2 lie on the same level, then S_{t1} ∩ S_{t2} = ∅. We let S^n denote the union of all subtrees of T having
roots in Lev_n:
S^n = ⋃{S_t | t ∈ Lev_n}.
Notice that S^n includes each element of T whose level is ≥ n. Since there are only finitely many
elements of T in levels 0 to n − 1 and D is nonprincipal, it follows that, for each n, S^n ∈ D. Since D is an ultrafilter
and S^n is a finite union, there is t_n ∈ Lev_n such that S_{t_n} ∈ D. We claim that
(∗) T = S_{t_0} ⊋ S_{t_1} ⊋ S_{t_2} ⊋ · · ·
This can be shown by induction. Clearly S_{t_0} ⊋ S_{t_1}. Assuming this relation holds below n, for
n > 0, we show S_{t_n} ⊋ S_{t_{n+1}}. Since
S_{t_n} = {t_n} ∪ ⋃{S_ℓ | ℓ ∈ Lev_{n+1} and ℓ is a child of t_n}
and {t_n} ∉ D, one of the S_ℓ in this union also belongs to D. Since only one of the subtrees rooted in Lev_{n+1}
can belong to D (since they are all disjoint), it must be that S_ℓ = S_{t_{n+1}}, as required.
Let B = {t_n | n ∈ ω}. We verify B is a maximal branch, completing the proof of Proposition 50. By construction, B meets every level exactly once. Also, our proof of (∗) above shows
that for each n, t_n <_T t_{n+1}, so B is linearly ordered. Therefore, B is a maximal branch in T, as required.
We have seen that existence of a nonprincipal ultrafilter follows from existence of a Dedekind
self-map j : V → V with sufficient preservation properties, and that nonprincipal ultrafilters on
ω play a role in showing that ω has the tree property. This latter result, in turn, plays a key role
in establishing a well-known combinatorial property of ω, which we introduce briefly here.
Example 7 (Ramsey’s Theorem). The notation [ω]^2 signifies the set of all unordered pairs of elements of
ω; that is, [ω]^2 = {{m, n} | m, n ∈ ω and m ≠ n}. We adopt the convention of representing unordered
pairs {m, n} so that m < n. Suppose P = {P0, P1} is a partition of [ω]^2; that is, P0, P1 are
disjoint, each is nonempty, and [ω]^2 = P0 ∪ P1. We will call a subset H of ω homogeneous for P
if the pairs obtained from H lie entirely in one piece of the partition; that is, for some i ∈ {0, 1},
[H]^2 ⊆ P_i. The Infinite Ramsey Theorem implies that every partition of [ω]^2 into two pieces
admits a homogeneous set of size ω. Apart from the trivial cases of n = 0, 1, 2, the property
described by this theorem is unique to infinite sets:52 For n > 2, it is never true that a partition
of [n]^2 admits a homogeneous set of size n. We give a proof of the Infinite Ramsey Theorem
using the fact that ω has the tree property.
Suppose we are given a partition P = {P0, P1} of [ω]^2. We define a binary ω-tree T from
P, which we will call a P-tree, whose vertices are finite 0-1 sequences named in a
way that encodes the partition P. If t = ⟨e_0, e_1, . . . , e_m⟩ is such a sequence, the length of t, denoted len t, is m + 1,
and, for 0 ≤ i ≤ m, the value of t in position i is denoted t(i). Thus t = ⟨t(0), t(1), . . . , t(m)⟩.
We denote the vertices of T by t_0, t_1, t_2, . . . . We specify the order relation on T as follows:
If s, t are vertices of T, then s ≤ t if t extends s, that is, if s = t ↾ len s. We specify the values of
t_0, t_1, t_2, . . . with an inductive definition so that for each j ∈ ω, t_j is not equal to t_k for k < j and
len t_j ≤ j. Let t_0 = ∅, the empty sequence. Assuming t_0, . . . , t_{n−1} have been defined, t_n is defined as
follows; its length will be ≤ n. We define the values t_n(i) inductively. When i = 0, we
define:
\[
t_n(0) = \begin{cases} 0 & \text{if } \{0, n\} \in P_0, \\ 1 & \text{if } \{0, n\} \in P_1. \end{cases}
\]
For 0 < i ≤ n, if the computation of t_n is not yet complete (and assuming that we have defined
⟨t_n(0), t_n(1), . . . , t_n(i − 1)⟩), we perform the next step of the computation of t_n by considering two
cases. In the first case, for some m < n,
(∗) t_m = ⟨t_n(0), t_n(1), . . . , t_n(i − 1)⟩.
In that case, since we are assuming the vertices t_0, . . . , t_{n−1} are distinct, t_m is the unique vertex
for which (∗) holds. Then we define
\[
t_n(i) = \begin{cases} 0 & \text{if } \{m, n\} \in P_0, \\ 1 & \text{if } \{m, n\} \in P_1. \end{cases}
\]
In the second case, (∗) does not hold for any m < n. When this is the case, we define
t_n = ⟨t_n(0), t_n(1), . . . , t_n(i − 1)⟩,
and the computation of t_n is considered complete. (Note that in this case, we do not assign a
value to t_n(i).) This completes the inductive definition of T.
We must verify that T has the required properties: that len t_j ≤ j and that all t_j are distinct.
The first of these is obvious by construction. To see that the t_j are all distinct, notice that in
the definition of t_j, the computation is not considered complete until its sequence of values is
different from each of t_0, . . . , t_{j−1}; this requirement guarantees that, when the computation for t_j
is complete, t_j is different from each of t_0, . . . , t_{j−1}.
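The inductive definition of the P-tree is effectively an algorithm, and a finite initial segment of it can be computed. The sketch below is an illustration only: the function names and the sample partition `color` are assumptions introduced here, with `color(m, n)` returning i when {m, n} ∈ P_i for m < n.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Vertex = Tuple[int, ...]

def build_p_tree(color: Callable[[int, int], int], size: int) -> List[Vertex]:
    """Return [t_0, ..., t_{size-1}] for the partition given by color(m, n), m < n."""
    vertices: List[Vertex] = [()]            # t_0 is the empty sequence
    index: Dict[Vertex, int] = {(): 0}       # maps each t_m to its index m
    for n in range(1, size):
        t: Vertex = ()
        # First case: the prefix built so far equals some earlier t_m,
        # so append the piece (0 or 1) of the partition containing {m, n}.
        while t in index:
            t = t + (color(index[t], n),)
        # Second case: no earlier vertex equals the prefix; t_n is complete.
        index[t] = n
        vertices.append(t)
    return vertices

def path_indices(vertices: List[Vertex], n: int, e: int) -> List[int]:
    """Indices m with t_m a proper initial segment of t_n and t_n(len t_m) == e."""
    index = {t: m for m, t in enumerate(vertices)}
    t = vertices[n]
    return [index[t[:j]] for j in range(len(t)) if t[j] == e]

def color(m: int, n: int) -> int:
    # Hypothetical sample partition, chosen only for illustration.
    return (m * m + n) % 3 % 2

vertices = build_p_tree(color, 40)
deepest = max(range(len(vertices)), key=lambda n: len(vertices[n]))
H0 = path_indices(vertices, deepest, 0)
H1 = path_indices(vertices, deepest, 1)
```

Here `H0` and `H1` are finite analogues of the sets read off from a branch: along the path to the deepest vertex, all pairs drawn from `H0` lie in P0 and all pairs drawn from `H1` lie in P1.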
52
We show here that the Ramsey property holds for sets of size ω; it does not hold for sets of all infinite
sizes, however. In fact, for uncountable cardinals κ, the statement that every partition of the pairs [κ]^2 has a
homogeneous set of size κ is equivalent to the assertion that κ is weakly compact (weak compactness is a large
cardinal property).
Proposition 51 (Ramsey’s Theorem). Suppose P = {P0, P1} is a partition of [ω]^2 and let T be
the P-tree obtained from P. Then P admits a homogeneous set of size ω.
Proof. Since any vertex t ∈ T can have at most two immediate extensions, t⌢0 and t⌢1, T is a binary tree,
which, since T is infinite, must in fact be an ω-tree. Therefore T has a maximal branch B. Let
C0, C1 be subsets of ω defined by
C0 = {n ∈ ω | both t_n and t_n⌢0 belong to B},
C1 = {n ∈ ω | both t_n and t_n⌢1 belong to B}.
We observe that at least one of C0, C1 is infinite: If both were finite, let n be bigger than
every element of C0 ∪ C1 and such that t_n ∈ B. Since B is an infinite branch, either t_n⌢0 or t_n⌢1 also
belongs to B, so n ∈ C0 ∪ C1, which is a contradiction.
Without loss of generality, assume C0 is infinite. We show that C0 is homogeneous for P;
indeed, that [C0]^2 ⊆ P0. Suppose m < n and both belong to C0. We show that {m, n} ∈ P0.
Since both belong to C0, it follows that the following three vertices lie on B: t_m, t_m⌢0, t_n. Let
ℓ_m = len t_m. Then (t_m⌢0)(ℓ_m) = 0. Since n > m and both t_n and t_m⌢0 lie on B, t_n ⊇ t_m⌢0, so, in particular,
t_n(ℓ_m) = 0. When t_n(ℓ_m) was computed, in the construction of T, it was observed that t_m is the
(unique) vertex equal to ⟨t_n(0), . . . , t_n(ℓ_m − 1)⟩, and so
\[
t_n(\ell_m) = \begin{cases} 0 & \text{if } \{m, n\} \in P_0, \\ 1 & \text{if } \{m, n\} \in P_1. \end{cases}
\]
Since t_n(ℓ_m) = 0, it follows that {m, n} ∈ P0, as required.
Since Ramsey’s Theorem does not hold for finite cardinals n > 2, we are in a position to
state another alternative form of the Axiom of Infinity: There is an infinite set if and only if
there is a cardinal n with the following two properties:
(1) n is large enough so that [n]^2 admits a partition into two pieces.
(2) Every partition of [n]^2 into two pieces admits a homogeneous set of size n.
This formulation can be generalized somewhat. Although the proof we have given of Ramsey’s Theorem makes essential use of the well-ordering of the natural numbers, the truth of
Ramsey’s Theorem does not depend on such properties. This observation leads to the following
variant of the Axiom of Infinity:
Ramsey Criterion for Infinite Sets. There is an infinite set if and only if there is
a set A with the following two properties:
(1) [A]^2 admits a partition into two pieces.
(2) Every partition of [A]^2 into two pieces admits a homogeneous set of size |A|.
We began this section with the observation that class Dedekind self-maps are not strong
enough to imply the existence of an infinite set. In this section, we have identified several
strengthenings of the properties of a class Dedekind self-map that would imply existence of infinite sets. In some cases, these strengthenings involved the introduction of preservation properties;
when a class Dedekind self-map preserves a suitable set of properties belonging to its domain, it
becomes strong enough to give rise to an infinite set. Our plan is to motivate the existence of
large cardinals by further strengthening the properties of a class Dedekind self-map in the ways
that our results so far suggest. Our first step will be to consider one further strengthening that
will allow us to obtain a Dedekind self-map from a functor. To take this step, it will be helpful
to clarify the relationship between the different notions of “critical point” we have introduced.
16 The Relationship Between the Different Notions of Critical Point
We spend a moment to clarify the relationship between critical points and strong critical points.
We introduce one other related concept: Suppose j : V → V is a 1-1 class function. A set X in
the domain of j is a weak critical point if j(X) ≠ X. In general, the relationships between these
three notions of critical point are summarized in the following proposition:
Proposition 52. (ZFC − Infinity)
(1) Suppose j : V → V is a 1-1 class function. If X is a strong critical point or a critical point,
then X is also a weak critical point.
(2) There is a 1-1 class map j : V → V having a weak critical point but no strong critical
point.
(3) There is a 1-1 class map j : V → V having a weak critical point but no critical point.
(4) There is a 1-1 class map j : V → V that has a strong critical point but no critical point.
(5) There is a 1-1 class map j : V → V that has a critical point but not a strong critical point.
Proof of (1). If X is a strong critical point, certainly j(X) ≠ X, so X is also a weak critical
point. If X is a critical point, then since X is not in the range of j, j(X) ≠ X, so X is a weak
critical point.
Proof of (2). Obtain j : V → V as follows. Let id_V be the identity on V. Define j as follows:
\[
j(x) = \begin{cases} \mathrm{id}_V(x) & \text{if } x \neq 1 \text{ and } x \neq \{1\}, \\ \{1\} & \text{if } x = 1, \\ 1 & \text{if } x = \{1\}. \end{cases}
\]
Here, j(1) ≠ 1, so 1 is a weak critical point. However, |j(x)| = |x| for all x ∈ V, so j has no
strong critical point.
Proof of (3) & (4). Obtain j : V → V as follows. Let id be the identity map. Define j as
follows:
\[
j(x) = \begin{cases} \mathrm{id}(x) & \text{if } x \neq 0 \text{ and } x \neq 1, \\ 0 & \text{if } x = 1, \\ 1 & \text{if } x = 0. \end{cases}
\]
Now j(0) ≠ 0 so 0 is a weak critical point. However, ran j = dom j, so j has no critical point.
Also, notice that |j(0)| ≠ 0, so 0 is a strong critical point.
Proof of (5). Define j : V → V by
\[
j(x) = \begin{cases} \{n + 1\} & \text{if } x = \{n\} \text{ for some } n \in \omega, \\ x & \text{otherwise.} \end{cases}
\]
Certainly j is 1-1 and has critical point {0}. However, for each n ∈ ω, |{n + 1}| = |j({n})| =
|{n}| = 1, so j has no strong critical point.
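The three maps used in the proofs of (2) through (5) can be tried out on hereditarily finite sets. The sketch below models such sets as nested Python frozensets; the names `j2`, `j34`, `j5` and the finite `sample` are illustrative assumptions, and a finite sample can only spot-check claims that the proofs establish for all of V.

```python
def ordinal(n: int) -> frozenset:
    """von Neumann numeral: 0 is the empty set and n + 1 = n ∪ {n}."""
    s: frozenset = frozenset()
    for _ in range(n):
        s = s | frozenset([s])
    return s

ZERO, ONE = ordinal(0), ordinal(1)
SING_ONE = frozenset([ONE])              # the set {1}

def j2(x: frozenset) -> frozenset:
    """Map from the proof of (2): swap 1 and {1}, fix everything else."""
    if x == ONE:
        return SING_ONE
    if x == SING_ONE:
        return ONE
    return x

def j34(x: frozenset) -> frozenset:
    """Map from the proof of (3) and (4): swap 0 and 1, fix everything else."""
    if x == ZERO:
        return ONE
    if x == ONE:
        return ZERO
    return x

def j5(x: frozenset) -> frozenset:
    """Map from the proof of (5): {n} goes to {n + 1}, identity elsewhere."""
    if len(x) == 1:
        (member,) = x
        n = len(member)                  # a numeral n has exactly n elements
        if member == ordinal(n):
            return frozenset([ordinal(n + 1)])
    return x

# A small sample of hereditarily finite sets (note that {0} is the numeral 1).
sample = (
    [ordinal(n) for n in range(5)]
    + [frozenset([ordinal(n)]) for n in range(5)]
    + [frozenset([ZERO, ordinal(2)])]
)

moved_by_j2 = [x for x in sample if j2(x) != x]
size_changed_by_j34 = [x for x in sample if len(j34(x)) != len(x)]
hits_singleton_zero = [x for x in sample if j5(x) == frozenset([ZERO])]
```

On this sample, `j2` moves sets without changing any cardinality, `j34` changes a cardinality while permuting the sample, and `j5` never takes the value {0}, matching the roles these maps play in the proposition.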
Despite these differences, when j satisfies certain additional preservation properties, these
three notions of critical point coincide. We will be especially interested in maps that
satisfy such preservation properties.
Definition 6. Suppose j : V → V is a 1-1 class function.
(1) j preserves ∈ if, whenever x, y ∈ dom j and x ∈ y, then j(x) ∈ j(y).
(2) j preserves cardinals if, whenever γ ∈ ON is a cardinal, j(γ) is also a cardinal.
(3) j preserves functions if, for any f : X → Y and X, Y ∈ dom j, j(f ) is a function from j(X)
to j(Y ) and, for all x ∈ X, j(f (x)) = j(f )(j(x)). This property of j can be represented by
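a commutative diagram:
\[
\begin{array}{ccc}
X & \xrightarrow{\;f\;} & Y \\
{\scriptstyle j}\big\downarrow & & \big\downarrow{\scriptstyle j} \\
j(X) & \xrightarrow{\;j(f)\;} & j(Y)
\end{array}
\]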
(4) j preserves images if, whenever f : X → Y is a function, j(f ) is a function and j(ran f ) =
ran j(f ).
(5) j preserves ordinals if, whenever α is an ordinal, j(α) is also an ordinal. j reflects ordinals
if, whenever j(x) is an ordinal, x is also an ordinal.
Any 1-1 class function j : V → V that satisfies (1)–(4) will be called basic structure-preserving,
or BSP.
Lemma 53. (ZFC − Infinity) Suppose j : V → V is a 1-1 class function.
(1) If j preserves ∈, then, for all ordinals α, β, if j(α) ∈ j(β), then α ∈ β.
(2) If j preserves ordinals and ∈, then, for all ordinals α, j(α) ≥ α.
(3) Suppose j preserves ordinals and ∈. Then for all x, if x and j(x) have the same rank and
for all y for which rank(y) < rank(x) we have j(y) = y, then j(x) = x.
(4) If j preserves ordinals, preserves ∈, and has a weak critical point, then there is an ordinal
α such that j(α) ≠ α. In particular, j ↾ ON : ON → ON has a weak critical point.
(5) Suppose j preserves ordinals, is BSP, and has a weak critical point κ. Then κ is a cardinal.
Proof of (1). Suppose j(α) ∈ j(β). If α ∉ β, then either α = β or β ∈ α (since ∈ is a total
ordering on ON). If α = β, then we have j(β) = j(α) ∈ j(β) which is impossible by irreflexivity
of ∈. If β ∈ α, then j(β) ∈ j(α) ∈ j(β), and this contradicts the fact that ∈ is both irreflexive
and transitive. The result follows.
Proof of (2). Suppose not; let α be least such that j(α) < α. Since j preserves ordinals and ∈,
j(j(α)) < j(α). But now j(α) is a smaller ordinal β with the property that j(β) < β, contradicting leastness of α.
Proof of (3). We first show x ⊆ j(x): Let y ∈ x. Since j preserves ∈, j(y) ∈ j(x). Since
rank(y) < rank(x), y = j(y). It follows y ∈ j(x). Conversely, if y ∈ j(x), then y is of lower rank,
so y = j(y). Since j(y) = y ∈ j(x), it follows by (1) that y ∈ x, as required.
Proof of (4). Suppose j(α) = α for all ordinals α. Let x be a weak critical point for j, so
j(x) ≠ x. Let α = rank(x) + 1, and let X = V_α. Let M = {z ∈ X : j(z) ≠ z}. The fact that
M is a set follows from an application of Separation. Let B = {rank(z) : z ∈ M}. B is a set by
Replacement. Also, B ≠ ∅ since M ≠ ∅. Let γ = inf B and let y ∈ M be such that rank(y) = γ.
By (3) and the leastness of rank(y), we must have j(y) = y, and we have a contradiction. Therefore, j ↾ ON : ON → ON has a weak critical point.
Proof of (5). Let κ be the least ordinal moved by j. By (2), j(κ) > κ. Suppose α < κ
and f : α → κ is an onto function. By leastness of κ, j(α) = α. Since j preserves images,
j(f ) : j(α) → j(κ) is also a function, and
(+) ran j(f) = j(ran f) = j(κ).
Since j(α) = α, j(f ) : α → j(κ). We show f = j(f ): For any β ∈ α, because j preserves functions
and j(β) = β, we have
j(f )(β) = j(f )(j(β)) = j(f (β)) = f (β).
The last step follows because f (β) ∈ κ and κ is the least ordinal moved by j. Therefore j(f ) = f
and so ran j(f ) ⊆ κ, which contradicts (+). Therefore, no such onto function exists, and κ is a
cardinal.
With these preservation properties in mind, we can return to the task of verifying a point
mentioned earlier, regarding Theorem 47. In that theorem, one of the hypotheses concerning
j : V → V was that one of its critical points a belongs to a set of the form j(A). We described
earlier (p. 80) several sufficient conditions for this hypothesis to hold true. One of the conditions
described there is the following:
j preserves ∈ and both preserves and reflects ordinals.
We explain why this condition suffices: Proposition 52(1) shows that if j : V → V is a Dedekind
self-map, j has a weak critical point; moreover, since we are assuming j preserves ∈ and ordinals,
there is, by Lemma 53(4), a least ordinal α0 such that α0 < j(α0 ). This ordinal α0 must be a
critical point of j because (a) for all ordinals α, α ≤ j(α) (Lemma 53(2)), and (b) by the ordinal
reflecting property, no non-ordinal is mapped to α0 . It follows therefore that α0 itself is the
required set A; that is, α0 ∈ j(α0 ).
We show now that when a self-map j : V → V satisfies a modest set of the preservation
properties described so far, the three notions of critical point coincide.
Theorem 54. (ZFC − Infinity) Suppose j : V → V is a 1-1 class function that preserves and
reflects ordinals, and is BSP. Then the following are equivalent.
(1) j has a weak critical point.
(2) j has a critical point.
(3) j has a strong critical point.
Proof. We have already seen that (2) ⇒ (1) and (3) ⇒ (1). It suffices to prove (1) ⇒ (2) and
(1) ⇒ (3).
Assume j has a weak critical point. Using Lemma 53(4),(5), there is a cardinal κ that is
moved by j and that is the least ordinal moved by j. Assume for a contradiction that κ ∈ ran j,
and let j(x) = κ. Because j reflects ordinals, x must also be an ordinal. Because α ≤ j(α) for
all ordinals α, it follows that x < κ. However, existence of such an ordinal x contradicts the
leastness of κ. It follows that κ is a critical point of j. Also, since κ is a cardinal and j preserves
cardinals, j(κ) is also a cardinal. Since |κ| = κ < j(κ) = |j(κ)|, we conclude that κ is also a
strong critical point of j.
Remark 13. The property of j that it reflects ordinals is used only in the proof that a weak
critical point is also a critical point; all other implications in the theorem are provable without
assuming that j reflects ordinals.
17 Class Dedekind Self-Maps and Functors
In this section, we lay the groundwork for a discussion about class Dedekind self-maps that are
functors.
Definition 7 (Functors). Suppose C = (O_C, M_C) and D = (O_D, M_D) are categories. A functor
F : C → D is a transformation that maps O_C to O_D and M_C to M_D, satisfying the following
properties:
(1) Whenever f : A → B belongs to M_C, F(f) : F(A) → F(B) belongs to M_D.
(2) Whenever f : A → B and g : B → C belong to M_C, we have F(g ◦ f) = F(g) ◦ F(f).
(3) For any object A ∈ O_C, F(1_A) = 1_{F(A)}.
Notice that if F : Set → Set is a 1-1 functor and there is some Y ∈ V not in the range
of F, then F can be seen as a class Dedekind self-map from V to V . As we will show now,
the notion of a functor is related to that of a Dedekind self-map that preserves functions. To
explain this relationship, we introduce the notion of a natural transformation from one functor
to another. If F : C → D and G : C → D are both functors, one can define a certain type of
transformation—called a natural transformation—from F to G that preserves the relationships
exhibited by F now within the context of G. Specifically, we have the following definition:
Definition 8 (Natural Transformations). Suppose F : C → D and G : C → D are functors. A
natural transformation η from F to G associates to each object X in C a map ηX : F(X) → G(X)
so that for any C-morphism f : X → Y , the following is commutative:
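\[
\begin{array}{ccc}
F(X) & \xrightarrow{\;F(f)\;} & F(Y) \\
{\scriptstyle \eta_X}\big\downarrow & & \big\downarrow{\scriptstyle \eta_Y} \\
G(X) & \xrightarrow{\;G(f)\;} & G(Y)
\end{array}
\]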
Proposition 55. Suppose j : V → V is a 1-1 functor that preserves ∈. Define a mapping ĵ
that associates to each set X the map ĵ_X = j ↾ X : X → j(X), and suppose that ĵ is a natural
transformation from 1_V to j (where 1_V denotes the identity functor from V to V). Then j
preserves functions.
Proof. We first observe that, because j preserves ∈, it follows that for every X, j[X] ⊆ j(X),
so j ↾ X : X → j(X). Suppose f : X → Y is a function. By naturality of ĵ, the following is
commutative:
\[
\begin{array}{ccc}
X & \xrightarrow{\;f\;} & Y \\
{\scriptstyle \hat{\jmath}_X}\big\downarrow & & \big\downarrow{\scriptstyle \hat{\jmath}_Y} \\
j(X) & \xrightarrow{\;j(f)\;} & j(Y)
\end{array}
\]
But now, tracing through the diagram beginning with any x ∈ X, we obtain
j(f(x)) = (j(f))(j(x)),
as required.
The proposition says that relationships indicated by the identity functor are preserved by j in
the structure of any function j(f ). This preservation of “silence” in dynamism is a characteristic
of a class Dedekind self-map that is a BSP functor, but is not a characteristic of class Dedekind
self-maps in general; for instance, the global successor function s : V → V : x ↦ x ∪ {x} does not
have this property. The result shows that deeper aspects of the self-referral flow of consciousness,
as understood in the ancient wisdom of life, begin to be reflected more profoundly in Dedekind
self-maps equipped with stronger preservation properties. We will see this theme develop more
fully as we introduce still stronger preservation properties.
Having introduced the notions of functor and natural transformation, we are ready to discuss
one final way in which Dedekind self-maps emerge from the dynamics of a class Dedekind self-map.
This final example does not make direct use53 of preservation properties as a way to enhance the
class map. Instead, the mechanism for generating a Dedekind self-map that is a set is to locate a
functor that can play the role of a self-map generator in such a way that it provides a balancing
counterpoint to another kind of functor that plays the role of a self-map annihilator.
Since we wish to “create” a Dedekind self-map—in effect, create something infinite—we will
work, as in previous sections, in the theory ZFC − Infinity. At the same time, we will need an
ambient category in which our new Dedekind self-map can appear. But because Dedekind self-maps may not exist in the absence of an Axiom of Infinity, the category SelfMap may be empty,
and therefore is not a suitable choice. We therefore introduce a slightly more general category
SM whose objects are arbitrary self-maps—functions from a set to itself—and whose morphisms
are defined as in SelfMap. More precisely, SM = (O, M) where O = {f : A → A | A is a set},
and an element α : f → g of M, where f : A → A, g : B → B belong to O, is a function
α : A → B making the following diagram commutative:
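\[
\begin{array}{ccc}
A & \xrightarrow{\;f\;} & A \\
{\scriptstyle \alpha}\big\downarrow & & \big\downarrow{\scriptstyle \alpha} \\
B & \xrightarrow{\;g\;} & B
\end{array}
\]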
There is a naturally defined functor G : SM → Set—known as the forgetful functor—that
has the effect of annihilating the structure of any self-map by outputting just the domain of the
self-map. Thus, for any self-map g : B → B, G(g) = B.
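For finite sets, the category SM and the forgetful functor G can be modelled directly. In the sketch below a self-map is stored as a Python dict whose keys form its underlying set; the helper names (`forget`, `is_sm_morphism`, `sm_morphisms`) and the two sample self-maps are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Hashable, List

SelfMap = Dict[Hashable, Hashable]       # f : A -> A, with A = set of keys

def forget(f: SelfMap) -> frozenset:
    """The forgetful functor G on objects: keep only the underlying set."""
    return frozenset(f)

def is_sm_morphism(alpha: Dict[Hashable, Hashable], f: SelfMap, g: SelfMap) -> bool:
    """alpha : A -> B is an SM-morphism f -> g when alpha∘f = g∘alpha."""
    return (
        set(alpha) == set(f)
        and all(v in g for v in alpha.values())
        and all(alpha[f[a]] == g[alpha[a]] for a in f)
    )

def sm_morphisms(f: SelfMap, g: SelfMap) -> List[Dict[Hashable, Hashable]]:
    """Enumerate all SM-morphisms f -> g by brute force over functions A -> B."""
    A, B = sorted(f), sorted(g)
    candidates = (dict(zip(A, values)) for values in product(B, repeat=len(A)))
    return [alpha for alpha in candidates if is_sm_morphism(alpha, f, g)]

rotation = {0: 1, 1: 2, 2: 0}            # a 3-cycle on {0, 1, 2}
swap = {0: 1, 1: 0}                      # h(x) = 1 - x on {0, 1}

functions_between_sets = len(sorted(swap)) ** len(sorted(rotation))
structure_preserving = sm_morphisms(rotation, swap)
```

There are 8 plain functions from G(rotation) to G(swap), but none of them commutes with the two self-maps, which shows how much structure G discards.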
It turns out that the existence of a (set) Dedekind self-map is equivalent to the existence of a
kind of counterbalancing functor F : Set → SM to G, called a left adjoint to G. Such a functor
F would produce, from any given set A, a self-map fA : XA → XA ; that is, F(A) = fA . Moreover,
F must also satisfy the following “balance condition”: For any self-map g : B → B and any set
A, the number of Set morphisms from A to G(g) = B (that is, the number of ordinary functions
there are from A to B) precisely equals the number of SM morphisms α from F(A) = fA to g.
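The balance condition can be illustrated if one allows oneself the natural numbers, which the surrounding argument in ZFC − Infinity does not assume. Under that assumption, a familiar candidate for F(A) is the shift (a, n) ↦ (a, n + 1) on A × ω; this choice, and the helper names `transpose` and `restrict`, are assumptions of the sketch and are not taken from the text. A function k : A → B then corresponds to the map (a, n) ↦ g^n(k(a)).

```python
from itertools import product
from typing import Callable, Dict, Hashable, List

def transpose(k: Dict[Hashable, Hashable],
              g: Dict[Hashable, Hashable]) -> Callable[[Hashable, int], Hashable]:
    """Send k : A -> B to k_bar : A x omega -> B with k_bar(a, n) = g^n(k(a))."""
    def k_bar(a: Hashable, n: int) -> Hashable:
        b = k[a]
        for _ in range(n):
            b = g[b]
        return b
    return k_bar

def restrict(alpha: Callable[[Hashable, int], Hashable],
             A: List[Hashable]) -> Dict[Hashable, Hashable]:
    """Opposite direction: recover a function A -> B from the values at (a, 0)."""
    return {a: alpha(a, 0) for a in A}

A = ["x", "y"]                           # sample set A (assumption)
g = {0: 1, 1: 2, 2: 2}                   # sample self-map g : B -> B (assumption)
B = sorted(g)
all_functions = [dict(zip(A, values)) for values in product(B, repeat=len(A))]

# Each k_bar commutes with the shift (a, n) -> (a, n + 1); checked for n < 10.
commutes = all(
    transpose(k, g)(a, n + 1) == g[transpose(k, g)(a, n)]
    for k in all_functions for a in A for n in range(10)
)
# Passing from k to k_bar and back returns k.
round_trip = all(restrict(transpose(k, g), A) == k for k in all_functions)
```

Since a morphism out of the shift is determined by its values at the points (a, 0), the two checks together suggest why the functions A → B and the SM-morphisms F(A) → g are in one-to-one correspondence in this model.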
53
In fact, preservation properties are involved, but the self-map that we end up defining does not actually exhibit
the interesting properties that its factors do.
As a matter of notation, in general, for any sets X, Y , we let Set(X, Y ) denote the set of all
functions from X to Y , and for any self-maps u : X → X, v : Y → Y in SM, we let SM(u, v)
denote the set of all SM-morphisms from u to v. Recall that a morphism α : u → v is a function
α : X → Y so that the following is commutative:
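\[
\begin{array}{ccc}
X & \xrightarrow{\;u\;} & X \\
{\scriptstyle \alpha}\big\downarrow & & \big\downarrow{\scriptstyle \alpha} \\
Y & \xrightarrow{\;v\;} & Y
\end{array}
\]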
With this notation, assuming F is a left adjoint to G, for any A in Set and g : B → B in
SM, we let Θ_{A,g} : SM(F(A), g) → Set(A, G(g)) denote a bijection between these two sets of
morphisms. A further requirement, which we express in only an intuitive way here, is that the
way the bijection Θ is defined is “the same” for all choices of A, g; this can be expressed precisely54
by providing a certain natural transformation between functors of the right type, depending on
A, g.
54
We give the precise formulation here, for the avid student of mathematics. For any category C, C^op is another
category, the opposite category of C, whose objects are the same as those of C and whose morphisms are those
of C, but reversed (so if f : A → B is a morphism of C, then f^op : B → A is a morphism of C^op). In the
present context, define functors Γ_Set : Set^op × SM → Set and Γ_SM : Set^op × SM → Set on objects by
Γ_Set(A, β) = Set(A, G(β)) and Γ_SM(A, β) = SM(F(A), β), where β : B → B is any SM-object. These functors
are defined on morphisms as follows: Given any (A, β), (C, δ) ∈ Set^op × SM, any Set^op × SM-morphism
⟨f^op, g⟩ : (A, β) → (C, δ), and any h : A → G(β) in Set, we have
\[
\Gamma_{\mathrm{Set}}(\langle f^{\mathrm{op}}, g\rangle)(h) = \bigl(C \xrightarrow{\;f\;} A \xrightarrow{\;h\;} G(\beta) \xrightarrow{\;G(g)\;} G(\delta)\bigr) = G(g) \circ h \circ f,
\]
and given h : F(A) → β in SM, we have
\[
\Gamma_{\mathrm{SM}}(\langle f^{\mathrm{op}}, g\rangle)(h) = \bigl(F(C) \xrightarrow{\;F(f)\;} F(A) \xrightarrow{\;h\;} \beta \xrightarrow{\;g\;} \delta\bigr) = g \circ h \circ F(f).
\]
Then to say that Θ_{A,β} : SM(F(A), β) → Set(A, G(β)) is natural in A and β means Θ is a natural transformation
from Γ_SM to Γ_Set, which in turn means that, for each Set^op × SM-morphism ⟨f^op, g⟩ : (A, β) → (C, δ), with
β : B → B and δ : D → D, the following diagram is commutative (here, natural maps are drawn horizontally
rather than vertically, as was originally done in the definition of “natural”):
\[
\begin{array}{ccc}
\Gamma_{\mathrm{SM}}(A, \beta) & \xrightarrow{\;\Theta_{A,\beta}\;} & \Gamma_{\mathrm{Set}}(A, \beta) \\
{\scriptstyle \Gamma_{\mathrm{SM}}(f^{\mathrm{op}}, g)}\big\downarrow & & \big\downarrow{\scriptstyle \Gamma_{\mathrm{Set}}(f^{\mathrm{op}}, g)} \\
\Gamma_{\mathrm{SM}}(C, \delta) & \xrightarrow{\;\Theta_{C,\delta}\;} & \Gamma_{\mathrm{Set}}(C, \delta)
\end{array}
\]
It is convenient to draw the diagram in the following way:
\[
\begin{array}{ccc}
\mathrm{SM}(F(A), \beta) & \xrightarrow{\;\Theta_{A,\beta}\;} & \mathrm{Set}(A, G(\beta)) \\
{\scriptstyle h \mapsto g \circ h \circ F(f)}\big\downarrow & & \big\downarrow{\scriptstyle h \mapsto G(g) \circ h \circ f} \\
\mathrm{SM}(F(C), \delta) & \xrightarrow{\;\Theta_{C,\delta}\;} & \mathrm{Set}(C, G(\delta))
\end{array}
\]
Assuming such a functor F, and corresponding natural bijections ΘA,g , can be found, we
indicate how a (set) Dedekind self-map emerges from the relationship between F and G. Once
we have F and Θ, we can locate a single seed computation Θ1,f1 (recall F(1) = f1 : X1 → X1 )
from which the values of F can be derived, and from which a Dedekind self-map emerges.
Notice that Θ1,f1 is the bijection from SM(f1 , f1 ) to Set(1, G(F(1)) = Set(1, X1). Let
η = Θ1,f1 (1f1 ) (where 1f1 is the identity morphism in SM from f1 to itself). The function η picks
out one of the maps 1 → X1 from the collection Set(1, X1). In this way η picks out a point that
will turn out to be a critical point in X1 : Write η0 = η(0) ∈ X1 .
The next point, which follows from the fact that ΘA,β is a natural bijection, is that, for any
other map g from 1 to a set of the form G(h : M → M ) = M , we can find a unique SM-map
As a matter of terminology, notice that in the above discussion, we begin with a morphism hf op , βi :
(A, β) → (C, δ) in Setop × SM, which, by definition, is precisely the morphism hf, βi : (C, β) → (A, δ) in
Set × SM. Applying ΓSM produces a morphism ΓSM (A, β) → ΓSM (C, δ) and applying ΓSet produces a morphism ΓSet (A, β) → ΓSet (C, δ). In each case, the domain and codomain of f , but not of β, have been reversed by
the functors. We say that ΓSM and ΓSet are contravariant in the first argument, covariant in the second argument.
94
τ : f1 → h that takes η0 to g(0) and makes the following diagram commutative:
55
55
We prove a more general fact and indicate how the result mentioned in the text follows. For each Set object A,
define a function ηA : A → GF(A) by ηA = ΘA,F(A) (1F(A) ). η is called the unit of the adjunction F a G. We show
that ηA has the following universal property: Given any k : A → G(β), we show there is a unique k : F(A) → β
such that the following diagram commutes:
- GF(A)
ηA
A
H
H
H
HH
j
k
G(k)
?
G(β)
We first observe that, for any g : F(A) → β, we have
ΘA,β (g) = G(g) ◦ ηA .
We show this by studying a special case of the main diagram from the previous footnote (showing naturalness of
Θ), where f op is the identity map 1F(A) :
SM(F(A), F(A))
- Set(A, GF(A))
ΘA,F(A)
h7→g◦h
h7→G(g)◦h
?
SM(F(A), β)
?
- Set(A, G(β))
ΘA,β
By commutativity of the diagram, we obtain, for any g : F(A) → β in SM,
`
´
`
´
(∗)
G(g) ◦ η = ΓSet ΘA,F(A) (1F(A) ) = ΘA,β ΓSM (1F(A) ) = ΘA,β (g).
We now prove that ηA has the desired universal property: As described earlier, we are given k : A → G(β). Since
ΘA,β is a bijection, we can obtain k such that ΘA,β (k) = k. Now by (∗) we have
k = ΘA,β (k) = G(k) ◦ ηA ,
as required.
To show that k is unique, assume k = G(`) ◦ ηA , for some ` : F(A) → β. Let ` = ΘA,β (`). Then by (∗),
` = ΘA,β (`) = G(`) ◦ ηA = k.
Since ΘA,β is a bijection, ` = k.
Finally, we show how these observations demonstrate the claim in the text. The function η described in the text
is, in the notation of this footnote, η1 . Suppose we are given g : 1 → G(h), where h : M → M is an object in SM.
By the universal property of η1 , we can find a unique τ : F(1) → h so that the following is commutative:
\begin{tikzcd}
1 \arrow[r, "\eta_1"] \arrow[dr, "g"'] & GF(1) \arrow[d, "G(\tau)"] \\
 & G(h)
\end{tikzcd}
\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
M \arrow[r, "h"'] & M
\end{tikzcd}
We call this the universal property of f1 .
We verify in several steps that f1 : X1 → X1 is a Dedekind self-map with critical point η0 :
First we observe that X1 has more than one element: If, for a contradiction, we suppose
X1 = {η0 }, then let Y = {0, 1} and h : Y → Y be defined by h(x) = 1 − x. There should be a
map τ taking η0 to 0 and making the following commutative:
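\begin{tikzcd}
\{\eta_0\} \arrow[r, "f_1"] \arrow[d, "\tau"'] & \{\eta_0\} \arrow[d, "\tau"] \\
\{0, 1\} \arrow[r, "h"'] & \{0, 1\}
\end{tikzcd}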
However, if this were true, we would have
0 = τ (η0 ) = τ (f1 (η0 )) = h(τ (η0)) = h(0) = 1,
which is impossible.
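The contradiction above can also be checked by exhaustive search. The following Python sketch is only an illustration: the helper name `commuting_maps` and the dictionary encoding of self-maps are assumptions made here, not notation from the text. It lists every function τ from a candidate set into Y that sends the base point to a prescribed target and commutes with the two self-maps, and it finds none for the one-point candidate X1 = {η0}.

```python
from itertools import product

def commuting_maps(X, f, Y, h, base, target):
    """All tau: X -> Y with tau(base) == target and tau(f(x)) == h(tau(x)) for every x."""
    X = list(X)
    found = []
    for values in product(list(Y), repeat=len(X)):
        tau = dict(zip(X, values))
        if tau[base] == target and all(tau[f[x]] == h[tau[x]] for x in X):
            found.append(tau)
    return found

# One-point candidate for X1; its only self-map fixes eta0.
X1 = ["eta0"]
f1 = {"eta0": "eta0"}

# The test object from the text: Y = {0, 1} with h(x) = 1 - x.
Y = [0, 1]
h = {0: 1, 1: 0}

no_maps = commuting_maps(X1, f1, Y, h, "eta0", 0)
print(no_maps)  # prints []

# For contrast (hypothetical data): a two-point candidate whose self-map swaps its points
# does admit exactly one such tau, so the obstruction is specific to the one-point case.
two_point = commuting_maps(["eta0", "a"], {"eta0": "a", "a": "eta0"}, Y, h, "eta0", 0)
```

The search is exponential in the size of X, so it is meant only for very small finite examples such as this one.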
Next we show that f1 cannot be the identity. Suppose, for a contradiction, that f1 = idX1. Let a ≠ η0 be an element of X1. Define
τ1(a) = τ1(η0) = η0; for all other x ∈ X1, let τ1(x) = x. Define τ2 = idX1. Whether τ = τ1 or
τ = τ2, we have, in either case, τ(η0) = η0 and τ ◦ idX1 = idX1 ◦ τ, violating uniqueness of τ.
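\begin{tikzcd}
X_1 \arrow[r, "\mathrm{id}_{X_1}"] \arrow[d, "\tau_1"'] & X_1 \arrow[d, "\tau_1"] \\
X_1 \arrow[r, "\mathrm{id}_{X_1}"'] & X_1
\end{tikzcd}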
Since F(1) = f1 : X1 → X1 , h : M → M , and G(τ ) is simply the Set morphism τ : X1 → M , the commutative
triangle implies that g(0) = τ(η1(0)) (in the text, η1(0) was denoted η0). Also, since τ : F(1) → h is an SM-morphism, we obtain the following commutative diagram:
\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
M \arrow[r, "h"'] & M
\end{tikzcd}
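\begin{tikzcd}
X_1 \arrow[r, "\mathrm{id}_{X_1}"] \arrow[d, "\tau_2"'] & X_1 \arrow[d, "\tau_2"] \\
X_1 \arrow[r, "\mathrm{id}_{X_1}"'] & X_1
\end{tikzcd}
Next we show that f1 is not onto. Suppose it is. Since f1 is not the identity, f1 ≠ f1^{−1} ≠ idX1.
Then f1^{−1} takes η0 to η0 and makes the following commutative: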
\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "f_1^{-1}"'] & X_1 \arrow[d, "f_1^{-1}"] \\
X_1 \arrow[r, "f_1"'] & X_1
\end{tikzcd}
Since idX1 also makes the diagram commutative, this is a violation of uniqueness.
Next we show that η0 is a critical point. Suppose not; since f1 is not onto, it has a critical
point, which we denote x. Let τ be unique such that τ (η0 ) = x and the following is commutative:
\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
X_1 \arrow[r, "f_1"'] & X_1
\end{tikzcd}
Since we are assuming η0 is not a critical point, there is u ∈ X1 such that f1 (u) = η0 . But then,
chasing the diagram, we have
f1 (τ (u)) = τ (f1 (u)) = τ (η0 ) = x,
and so x is in the range of f1 , contradicting the fact that x is a critical point of f1 .
We will take a roundabout path to establish that f1 is 1-1. We will first extract, in the
familiar way, the sequence s generated by f1 and η0 :
s = ⟨η0, f1(η0), f1(f1(η0)), …⟩,
and show that s has no repeated terms. Letting W = ran s, we will then conclude that f1 ↾ W is
1-1. We will then show the unique τ : f1 → f1 ↾ W must be an isomorphism, and in fact the identity
map. This will allow us to conclude not only that f1 itself is 1-1, but in fact that (X1 , f1 , η0 ) is
an initial Dedekind self-map.
We proceed in a sequence of claims. First, we make use of Theorem 37 to define the class
sequence s = ⟨η0, f1(η0), f1(f1(η0)), …⟩:
s_0 = η0
s_{n+1} = f1(s_n)
One may use the Axiom of Separation now to conclude that W = ran s is a set:
W = ran s ∩ X1 .
Claim A. s has no repeated terms.
We show by (class) induction that, for all n, the terms s0 , s1 , . . ., sn are distinct. Let A ⊆ ω
be defined by
A = {n ∈ ω | s0 , s1 , . . . , sn are distinct}.
Certainly 0 ∈ A. Assume n ∈ A, and hence that s0 , s1 , s2 , . . . , sn are distinct. Define a set
U = W ∪ {u} where u ∉ W. Define t : U → U by
\[
t(x) = \begin{cases}
s_{i+1} & \text{if } x = s_i \text{ and } 0 \le i < n, \\
u & \text{if } x = s_n, \\
\text{arbitrary} & \text{otherwise.}
\end{cases}
\]
Now, letting t^i(x) denote the ith iterate of t at x, with t^0(x) = x, it follows that for 0 ≤ i ≤ n,
t^i(s0) = s_i = f1^i(s0), but f1^{n+1}(s0) ≠ u = t^{n+1}(s0). By the universal property of f1, there is a
unique τ : X1 → U taking s0 to s0 and making the following diagram commutative.
\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
U \arrow[r, "t"'] & U
\end{tikzcd}
To show that s_0, s_1, …, s_n, s_{n+1} are distinct, assume instead that s_{n+1} = s_i for some i, 0 ≤
i ≤ n; in other words, f1^{n+1}(s0) = f1^i(s0) for some i, 0 ≤ i ≤ n. Tracing through the diagram, we
have
s_i = t^i(s0) = t^i(τ(s0)) = τ(f1^i(s0)) = τ(f1^{n+1}(s0)) = t^{n+1}(τ(s0)) = t^{n+1}(s0) = u,
which is impossible. Therefore, s_0, s_1, …, s_n, s_{n+1} are indeed distinct.
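To see the mechanism of this argument on a concrete case, the following Python sketch takes a hypothetical finite candidate in which the sequence does repeat: X1 = {0, 1, 2} with f1(x) = x + 1 mod 3 and s0 = 0, so that s3 = s0 with n = 2. The test map t is built as in the proof, with a new point "u". The function name `forced_tau` and the dictionary encoding are illustrative assumptions. The code follows the equation τ(f1(x)) = t(τ(x)) along the orbit of s0 and reports the first point that is forced to take two different values.

```python
def forced_tau(f1, t, s0, steps):
    """Follow tau(f1(x)) = t(tau(x)) along the orbit of s0, starting from tau(s0) = s0.

    Returns (tau, conflict). conflict is None if no clash occurs within `steps` steps;
    otherwise it is (point, value_already_assigned, value_now_required).
    """
    tau = {s0: s0}
    x = s0
    for _ in range(steps):
        nxt, val = f1[x], t[tau[x]]
        if nxt in tau and tau[nxt] != val:
            return tau, (nxt, tau[nxt], val)
        tau[nxt] = val
        x = nxt
    return tau, None

# Hypothetical candidate with a repeated term: s_3 = s_0.
f1 = {0: 1, 1: 2, 2: 0}

# The map t from the proof with n = 2: t(s_0) = s_1, t(s_1) = s_2, t(s_2) = u;
# the value t(u) is arbitrary and is set to u here.
t = {0: 1, 1: 2, 2: "u", "u": "u"}

tau, conflict = forced_tau(f1, t, 0, 3)
print(conflict)  # prints (0, 0, 'u')
```

The reported clash says that τ(s0) would have to equal both s0 and u, which is the contradiction derived in the displayed chain of equalities.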
Claim B. The map f1 ↾ W : W → W is 1-1.
Proof. If f1 ↾ W is not 1-1, since all elements of W are of the form f1^n(η0), it follows that f1(f1^n(η0)) =
f1(f1^m(η0)) for some m ≠ n. This means f1^{n+1}(η0) = f1^{m+1}(η0), and so the sequence s contains
repeated elements, contradicting Claim A.
Remark 14. We observe here (for later use) that we have not used the uniqueness of τ in either
of our arguments in Claims A and B. Also recall that, in these arguments, it is not yet known
that f1 is 1-1.
Claim B establishes that our construction does produce a Dedekind self-map, namely f1 ↾ W :
W → W, with critical point η0. In fact, using the techniques described earlier, it is straightforward to show that there is a Dedekind self-map isomorphism from (ω, s, 0) to (W, f1 ↾ W, η0).
Nevertheless, we plan to show that f1 itself is such a map.
By the universal property of f1 : X1 → X1, there is a unique τ : X1 → W taking η0 to η0
and making the following diagram commutative:
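\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
W \arrow[r, "f_1 \upharpoonright W"'] & W
\end{tikzcd}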
Claim C. X1 = W. Therefore f1 = f1 ↾ W is 1-1 and (X1, f1, η0) is a Dedekind self-map; indeed,
it is an initial Dedekind self-map.
Proof. The second clause follows immediately from X1 = W , which we prove now. Consider the
following diagram:
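\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
W \arrow[r, "f_1 \upharpoonright W"] \arrow[d, "\mathrm{inc}_{W,X_1}"'] & W \arrow[d, "\mathrm{inc}_{W,X_1}"] \\
X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
W \arrow[r, "f_1 \upharpoonright W"'] & W
\end{tikzcd}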
It is easy to see that the middle square is commutative, using the inclusion map incW,X1 .
By the universal property of f1 , idX1 is the unique map taking η0 to η0 for which the following
is commutative:
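\begin{tikzcd}
X_1 \arrow[r, "f_1"] \arrow[d, "\mathrm{id}_{X_1}"'] & X_1 \arrow[d, "\mathrm{id}_{X_1}"] \\
X_1 \arrow[r, "f_1"'] & X_1
\end{tikzcd}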
Therefore, incW,X1 ◦ τ = idX1. Likewise, since (W, f1 ↾ W, η0) is an initial object in SelfMap, idW
is the unique self-map taking η0 to η0 and making the following diagram commutative:
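\begin{tikzcd}
W \arrow[r, "f_1 \upharpoonright W"] \arrow[d, "\mathrm{id}_W"'] & W \arrow[d, "\mathrm{id}_W"] \\
W \arrow[r, "f_1 \upharpoonright W"'] & W
\end{tikzcd}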
Therefore τ ◦ incW,X1 = idW . It follows that both τ and incW,X1 are bijections. Since one of
these is an inclusion map, we conclude X1 = W .
Remark 15. We note that the entire proof that f1 : X1 → X1 is an initial Dedekind self-map
was derived from the universal property that f1 enjoys (p. 96).
We make one final observation about the adjunction F ⊣ G and the resulting Dedekind
self-map (X1 , f1 , η0 ): It turns out that η0 ∈ G(f1 ) is a universal element for G.
In general, if U : C → Set is a functor from some category C to the category of sets, an
element a of a set U(A) is a weakly universal element for U if, for every B ∈ C and every
b ∈ U(B), there is a g : A → B ∈ C so that U(g)(a) = b. Moreover, a is called a universal
element for U if the map g is unique. We can picture this universal property using the following
diagram, identifying an element x of a set X with the map x : 1 → X:
\begin{tikzcd}
1 \arrow[r, "a"] \arrow[dr, "b"'] & U(A) \arrow[d, "U(g)"] & A \arrow[d, "g"] \\
 & U(B) & B
\end{tikzcd}
We recall56 that a functor U : C → Set is cofinal if, for every set y, there is c ∈ C such that
y ∈ U(c). In this situation, when U has a weakly universal element, and U is cofinal, it follows
that
V = {U(h)(a) | h is a morphism of C}.57
We verify this point: Given a set b, we show b = U(h)(a) for some h: Using cofinality of U, we
let B ∈ C be such that b ∈ U(B), and let h : A → B be a morphism of C for which U(h)(a) = b,
as in the definition of a weakly universal element.
An important example of a weakly universal element that we encountered earlier arose in the
reduced product construction obtained from a nonprincipal ultrafilter D on ω (Example 4, p. 81).
Recall that, given such a D, jD : V → V is defined by jD(X) = X^ω/D. It turns out that jD is a
functor Set → Set; its definition on Set-morphisms is given by:
jD(f) : X^ω/D → Y^ω/D : [g] ↦ [f ◦ g],
for any f : X → Y. Then [1ω] ∈ jD(ω) is a weakly universal element for jD:
\begin{tikzcd}
1 \arrow[r, "{[1_\omega]}"] \arrow[dr, "{[f]}"'] & j_D(\omega) = \omega^\omega/D \arrow[d, "{[g] \mapsto [f \circ g]}"] & \omega \arrow[d, "f"] \\
 & j_D(X) = X^\omega/D & X
\end{tikzcd}
56 This notion was defined in exactly the same way for any class function j : V → V on p. 73.
57 In the collection on the right hand side, we ignore morphisms h for which a is not in the domain of U(h).
Given [f] ∈ jD(X) = X^ω/D, with f : ω → X, f itself is the required function: We show that
jD (f )([1ω]) = [f ]:
jD (f )([1ω]) = [f ◦ 1ω ] = [f ].
Returning to the adjunction F ⊣ G, it is easy to see from previous work that, by the
universal property of f1 (p. 96), if β : B → B ∈ SM and b ∈ G(β) = B, there is a unique
morphism τ : f1 → β in SM so that G(τ )(η0 ) = b.
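\begin{tikzcd}
1 \arrow[r, "\eta_0"] \arrow[dr, "b"'] & GF(1) \arrow[d, "G(\tau)"] & X_1 \arrow[r, "f_1"] \arrow[d, "\tau"'] & X_1 \arrow[d, "\tau"] \\
 & G(\beta) & B \arrow[r, "\beta"'] & B
\end{tikzcd}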
Therefore η0 is a universal element for G. Moreover, since G is cofinal (as is easily checked),
we have
V = {G(τ )(η0) | τ is a morphism of SM}.
As we will see,58 a converse is also true: Existence of a universal element for G suffices to
guarantee the existence of a Dedekind self-map, and hence also existence of a left adjoint for G.
We summarize what we have accomplished so far. We showed that, in obtaining a functor
F : Set → SM that is left adjoint to the forgetful functor G : SM → Set—so that, in a
sense, the “annihilating” effect of G is counterbalanced by the “generating” effect of F—we
were able to locate a seed Θ1,f1(1f1)(0) = η0 to generate a Dedekind self-map F(1) = f1 :
X1 → X1, where X1 = {η0, f1(η0), f1^2(η0), …}. Moreover, the set X1 is itself obtained by the computation
X1 = G(F(1)). We can combine the two influences represented by F and G to produce our final
functor: j : V → V , defined by j = G ◦ F.
\begin{tikzcd}
V \arrow[rr, "j"] \arrow[dr, "F"'] & & V \\
 & \mathbf{SM} \arrow[ur, "G"'] &
\end{tikzcd}
As we have just seen, j has a strong critical point 1, since j(1) = G(F(1)) = X1 (and recall that
X1 is itself infinite). 1 is also a critical point since j(0) = 0 and j(A) must be infinite, for all
A ≠ 0.59 And in fact 1 is a weak critical point; indeed, 1 is the least ordinal moved by j. Finally,
η0 ∈ GF(1) turns out to be a universal element for G, “generating” all of V in the sense that
V = {G(τ)(η0) | τ is a morphism of SM}.
58 See the footnote, starting on page 103, that discusses essentially Dedekind functors.
Although j : V → V has a critical point, we cannot quite guarantee that it is 1-1. Let us call
a functor H : V → V essentially Dedekind if it is “naturally isomorphic” to a functor K : V → V
which has a critical point and is 1-1 (on objects). It turns out that, if we obtain j = G ◦ F where
F is any left adjoint to G, as we have done, though we cannot guarantee that j is a Dedekind
self-map, it can be shown that j is essentially Dedekind.60 Here, then, we have another example
in which a class Dedekind self-map (in this case, essentially Dedekind) gives rise to a set Dedekind
self-map. In fact, working in ZFC − Infinity, one can show that existence of the adjoint functor
F, and hence, existence of j itself, is equivalent to the Axiom of Infinity.61
59 We prove this in the next footnote.
60 We outline the argument here and demonstrate, using the same method, two other related facts. We show:
(1) Any left adjoint F for G is essentially Dedekind.
(2) Whenever F ⊣ G and |A| > 0, j(A) = G(F(A)) is infinite.
(3) Whenever G has a universal element, there is a naturally defined initial Dedekind self-map. Moreover, G
has a left adjoint.
For (1), since we have already shown that existence of a left adjoint for G produces a Dedekind self-map, existence
of the left adjoint guarantees that ω exists. We can then define a functor F′ by F′(A) = 1A × s : A × ω → A × ω.
Also, if h : A → B is a Set morphism, then F′(h) : A × ω → B × ω is defined by F′(h) = h × 1ω.
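\begin{tikzcd}
A \times \omega \arrow[r, "1_A \times s"] \arrow[d, "h \times 1_\omega"'] & A \times \omega \arrow[d, "h \times 1_\omega"] \\
B \times \omega \arrow[r, "1_B \times s"'] & B \times \omega
\end{tikzcd}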
It can be shown that F′ is also left adjoint to G. To see that F′ is 1-1 on objects, suppose A ≠ B, say a ∈ A − B.
Then (a, 0) ∈ A × ω − B × ω, and it follows that GF′(A) ≠ GF′(B). Also, j′ = G ◦ F′ has a critical point: As
before, j′(0) = 0, and, for all A for which |A| ≥ 1, j′(A) is infinite (since j′(A) = A × ω).
A fact from category theory ([2] or [26]) is that left adjoints for the same functor must be naturally isomorphic.
In this case, this means that there is, for each set A, an SM-isomorphism σA : F(A) → F′(A) that is natural in
A. Applying G, we obtain another isomorphism G(σA) : GF(A) → GF′(A) (note that every functor preserves
isomorphisms). Moreover G(σA) can be shown to be natural in A. Therefore, j and j′ are naturally isomorphic
and j′ is a Dedekind self-map. It follows that j is an essentially Dedekind self-map.
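The claim that the functor A ↦ 1A × s is left adjoint to G can be made concrete with a small Python sketch. Everything named below (`iterate`, `unit`, `transpose`, `succ`, and the sample data) is an illustrative assumption rather than notation from the text. The idea is that a function k : A → B and a self-map β : B → B determine a map k̄ : A × ω → B by k̄(a, n) = β^n(k(a)); this map recovers k along a ↦ (a, 0) and commutes with the two self-maps.

```python
def iterate(beta, n, x):
    """Apply the self-map beta to x exactly n times."""
    for _ in range(n):
        x = beta(x)
    return x

def unit(a):
    """Candidate unit at A (illustrative): a |-> (a, 0) in A x omega."""
    return (a, 0)

def transpose(k, beta):
    """Send k: A -> B to k_bar: A x omega -> B with k_bar(a, n) = beta^n(k(a))."""
    return lambda pair: iterate(beta, pair[1], k(pair[0]))

def succ(pair):
    """The self-map 1_A x s on A x omega: (a, n) |-> (a, n + 1)."""
    return (pair[0], pair[1] + 1)

# Hypothetical data: A = {"x", "y"}, B = {0, ..., 4}, beta(b) = (2b + 1) mod 5.
k = {"x": 0, "y": 3}.get
beta = lambda b: (2 * b + 1) % 5
k_bar = transpose(k, beta)

# k_bar recovers k along the unit ...
recovered = [k_bar(unit(a)) == k(a) for a in ("x", "y")]
# ... and is a morphism of self-maps: k_bar o (1_A x s) = beta o k_bar.
commutes = [k_bar(succ((a, n))) == beta(k_bar((a, n)))
            for a in ("x", "y") for n in range(10)]
```

The check covers only finitely many stages n, so it illustrates the shape of the correspondence rather than proving the adjunction.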
For (2), using these ideas, we can now verify that, for any set A having one or more elements, j(A) is infinite.
Since j ≅ j′, it follows that
|j(A)| = |j′(A)| = |A × ω| ≥ ω.
For (3), we make use of Remark 15, which tells us that, to show that a triple (A, f, a) is an initial Dedekind
self-map, it suffices to show that for any other triple (B, g, b), we can find a unique SM-map τ : f → g that takes
a to b and makes the following diagram commutative:
\begin{tikzcd}
A \arrow[r, "f"] \arrow[d, "\tau"'] & A \arrow[d, "\tau"] \\
B \arrow[r, "g"'] & B
\end{tikzcd}
Given a universal element a ∈ G(f ) for G, where f : A → A, we show (A, f, a) is an initial Dedekind self-map.
Let (B, g, b) be a triple, with g : B → B and b ∈ G(g) = B. By the universal property of universal elements, there
is a unique τ : f → g such that G(τ )(a) = b. The latter statement ensures that τ (a) = b; the former ensures that
the diagram above commutes.
For the “moreover” clause, the proof given in part (1) shows how to define a left adjoint for G once existence of
ω has been established.
61 Assuming the adjoint F exists, we have seen that F(1) is a Dedekind self-map, with j(1) = X1 an infinite set.
Conversely, assuming some form of the Axiom of Infinity, we conclude that ω exists; as in the previous footnote,
The diagram shows that j is the composition of two functors, F and G; we say that F and
G are the factors of j. Because F is left adjoint to G, it turns out that both F and G exhibit
strong preservation properties. For instance, it can be shown that F preserves coproducts; in
fact, if {Xi | i ∈ I} is a collection of disjoint sets, no matter how big I may be, the functor F has
the property that
\[
F\Big(\bigcup_{i \in I} X_i\Big) = \bigcup_{i \in I} F(X_i).
\]
Recall that when we defined “preservation of coproducts” for functors, we considered only the
case in which |I| = 2; the preservation property that F exhibits is much stronger.
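For the two-set case, the preservation property can be checked on a finite truncation. The Python sketch below assumes the concrete left adjoint described in an earlier footnote, which sends A to the self-map 1A × s on A × ω, and cuts ω off after a fixed number of stages. The truncation bound, the function names, and the sample sets are illustrative assumptions.

```python
def underlying(A, stages):
    """Underlying set of the self-map 1_A x s, truncated to pairs (a, n) with n < stages."""
    return {(a, n) for a in A for n in range(stages)}

def shift(pair):
    """The self-map 1_A x s: (a, n) |-> (a, n + 1)."""
    return (pair[0], pair[1] + 1)

# Two disjoint sets (hypothetical data).
A, B = {"a1", "a2"}, {"b1"}

left = underlying(A | B, 4)                     # image of the disjoint union
right = underlying(A, 4) | underlying(B, 4)     # disjoint union of the images

same_points = left == right
# The self-map never mixes the two summands: shifting a point over A stays over A.
stays_in_summand = all(shift(p)[0] in A for p in underlying(A, 4))
```

Both checks succeed because a pair (a, n) records which summand a came from and the shift changes only n.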
The functor G exhibits the “dual” preservation property of preserving products. Let us say
that a class function j : V → V preserves products if, for any sets A, B, |j(A×B)| = |j(A)×j(B)|.
It can be shown that, if {Xi | i ∈ I} is a collection of sets, no matter how big I may be,
\[
G\Big(\prod_{i \in I} X_i\Big) = \prod_{i \in I} G(X_i),
\]
where ∏ denotes the product operator. Once again, the definition of “preserving products”
requires only that |I| = 2; the property exhibited by G is much stronger.
The notions of product and coproduct have natural generalizations in category theory to the
notions of limit and colimit. These more general definitions allow us to see that products—along
with many other well-known constructions in mathematics—are a special type of limit, and likewise, coproducts, and many other constructions, are special instances of the colimit construction.
We will not define these more general notions here, but to give a feeling of concreteness, we will
at least introduce some notation. Suppose {Xi | i ∈ I} is a collection of sets and suppose H is a
functor. We will say that H preserves set-indexed limits if
H (Limi∈I (Xi )) = Limi∈I (H(Xi)) ,
where Lim denotes the limit construction. Likewise, H preserves set-indexed colimits if
H (Colimi∈I (Xi )) = Colimi∈I (H(Xi)) ,
where Colim denotes the colimit construction.
Now we are in a position to state the properties of F and G. Because F is left adjoint to
G, it can be shown that F preserves all set-indexed colimits and G preserves all set-indexed
limits. However, the composition j = G ◦ F exhibits, by comparison, very little in the way of
preservation properties. We will discuss this “shortcoming” further in the next section.
We may summarize the discussion as follows. We have labeled this theorem The Lawvere
Construction since most of the contents of the theorem are due to Lawvere [23]:
Theorem 56 (The Lawvere Construction). (ZFC − Infinity) Let G denote the forgetful functor
from SM to Set; that is, for every f : A → A ∈ SM, G(f ) = A and for every SM-arrow
α : f → g, where f : A → A and g : B → B, G(α) is α : A → B. Suppose G has a left adjoint
F : Set → SM. Let ΘA,β denote the natural bijection from SM(F(A), β) to Set(A, G(β)). Write
F(A) = fA : XA → XA . Let η0 = Θ1,F(1)(1F(1))(0). Let j = G ◦ F. Then
(1) j : V → V is an essentially Dedekind self-map for which 1 is both a critical point and a
strong critical point of j.
(2) j(1) is itself a Dedekind-infinite set X1 and F(1) is the corresponding Dedekind self-map f1 :
X1 → X1 with critical point η0. Moreover, f1 is initial and X1 = {η0, f1(η0), f1^2(η0), …}.
(3) η0 ∈ G(F(1)) is a universal element of G. Moreover,
V = {G(f)(η0) | f is a morphism in SM}.
(4) F preserves set-indexed colimits and G preserves set-indexed limits.
61 (continued) F : Set → SM defined on objects by F(A) = 1A × s : A × ω → A × ω is left adjoint to G.
As a final observation, we mention that the emergence of an infinite set from the Lawvere
Construction follows exactly the pattern that was mentioned earlier (on p. 73): a Dedekind-infinite set occurs as the image of the critical point of the class map j : V → V.
18 Emergence of the Infinite in the Universe
In the previous sections, we have examined ways in which a bare class Dedekind self-map can
be strengthened so a set Dedekind self-map is guaranteed to exist. In an upcoming section, we
will use these observations as tools for generalizing from “something infinite exists” to “large
cardinals exist.”
In this section, we step back for a moment and place our efforts to strengthen class Dedekind
self-maps in a slightly different context. Suppose we live in a ZFC − Infinity world. In that
world (unlike the world ZFC − Infinity + ¬Infinity) we cannot be sure whether infinite sets exist.
If we look very closely, is it possible to determine how exactly anything infinite comes into being
(assuming that it does)?
We can view the last few sections of this paper as partial answers to this question. One
answer is that the infinite emerges as certain closure properties (Theorem 41) or preservation
properties of class Dedekind self-maps emerge. As these closure/preservation properties “come
into view” or “become part of reality,” so too do infinite sets.
We have also seen that a Dedekind self-map emerges from a kind of alchemy between the
collapsing influence of the forgetful functor G : SM → Set and the expansive influence of a left
adjoint F : Set → SM to G. When these influences are “balanced,” through the dynamics of
adjointness, a set Dedekind self-map is born.
There is yet another way that the infinite can be seen to emerge, in this case on the basis of
the relationship between the dynamics within SM and points within Set. In the category Set,
mappings from the terminal object 1 = {0} (recall, a terminal object in Set is any singleton set)
can be used to specify any value in any set. Consider, for example, a set X = {u, v, w}. We can
specify each element of X by considering different maps 1 → X. The map U : 1 → X defined by
U (0) = u specifies u, and something similar can be done for the other elements of X.
When we step into the highly dynamic expansion SM of Set, consisting of all self-maps of
sets, the ability to distinguish points within sets becomes more difficult; the terminal object in
SM (which is the unique map T : 1 → 1) is unable to distinguish points of the underlying sets of
the self-maps belonging to SM. To see this, consider any SM-map β from T to another object
f : X → X of SM:
\begin{tikzcd}
1 \arrow[r, "T"] \arrow[d, "\beta"'] & 1 \arrow[d, "\beta"] \\
X \arrow[r, "f"'] & X
\end{tikzcd}
Notice that the only points that any β : T → f can pick out from the underlying set X of f are
the fixed points of f : Suppose x = β(0). Then
f (x) = f (β(0)) = β(T (0)) = β(0) = x.
It follows that the only function g : Y → Y for which the maps α : T → g can pick out all points
of the underlying set Y of g is the identity map g = idY : Y → Y , since all points of Y are fixed
points in that case.
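The observation that maps out of T detect only fixed points is easy to test on a small example. In the Python sketch below, the function name `points_picked_out` and the particular self-map on X = {u, v, w} are illustrative assumptions; the code tries each candidate β : 1 → X and keeps those satisfying f ◦ β = β ◦ T.

```python
def points_picked_out(X, f):
    """Values beta(0) over all SM-maps beta: T -> f, where T is the self-map of 1 = {0}."""
    T = {0: 0}                          # the terminal self-map
    picked = []
    for x in X:
        beta = {0: x}                   # candidate function 1 -> X with beta(0) = x
        if f[beta[0]] == beta[T[0]]:    # commutativity: f(beta(0)) == beta(T(0))
            picked.append(x)
    return picked

X = ["u", "v", "w"]
f = {"u": "u", "v": "w", "w": "w"}          # hypothetical self-map; v is not a fixed point
identity = {"u": "u", "v": "v", "w": "w"}

from_f = points_picked_out(X, f)
from_identity = points_picked_out(X, identity)
print(from_f)         # prints ['u', 'w']
print(from_identity)  # prints ['u', 'v', 'w']
```

Only the identity self-map lets every point of X be picked out, in line with the remark above.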
The requirement to be able to pick out points in this way leads to the notion of a weak
natural number object. A triple (A, j, a), where j : A → A is a self-map and a ∈ A, is a weak
natural number object if, for any f : X → X and any x ∈ X,
(∗) there is an SM-map β : A → X such that β(a) = x and f ◦ β = β ◦ j;
in other words, the following is commutative:
\begin{tikzcd}
A \arrow[r, "j"] \arrow[d, "\beta"'] & A \arrow[d, "\beta"] \\
X \arrow[r, "f"'] & X
\end{tikzcd}
Notice that a weak natural number object j : A → A accomplishes exactly what the SM-terminal
object T failed to accomplish: For every element x ∈ X, we can find a β : (A → A) → (X → X)
that picks out x: β(a) = x.
We verify now that every weak natural number object gives rise to a (set) Dedekind self-map. Suppose (A, j, a) is a weak natural number object. We first check that a is a critical point
of j: Consider the set X = {0, 1} in the diagram above, where f : X → X is defined so that
f(0) = f(1) = 1, and let β be a map as in (∗) with β(a) = 0. Now suppose a ∈ ran j, so that, for some u ∈ A, we have j(u) = a. Tracing
through the diagram, we see that
0 = β(a) = β(j(u)) = f (β(u)) = 1,
which is impossible.
Next, we can use transfinite recursion over ω to define the sequence a0, a1, a2, … within A,
where a0 = a and a_{n+1} = j(a_n); we let W denote the set of terms of this sequence:
W = {a0, a1, a2, …}. It follows from Remark 14 that j ↾ W : W → W is 1-1, and hence that
j ↾ W is a Dedekind self-map with critical point a.
We have shown therefore that weak natural number objects give rise to (set) Dedekind self-maps. In this context, we see that the “infinite” emerges from the need, within the dynamic
expanded universe SM, to locate point values of underlying sets. In this sense, we see how the
“infinite” within the realm of sets is born from the relationship between the self-referral dynamics
inherent in sets (made manifest in the universe SM) and the points within those sets.
Our demonstration that Dedekind self-maps “arise from” weak natural number objects also
shows that Dedekind self-maps arise from weakly universal elements of the forgetful functor
G : SM → Set, because the notions of weak natural number object and weakly universal element
of G are essentially the same. The close relationship between these notions is most easily seen if
we consider the following diagrams.
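\begin{tikzcd}
1 \arrow[r, "a"] \arrow[dr, "b"'] & G(j) = A \arrow[d, "G(\tau)"] & A \arrow[r, "j"] \arrow[d, "\tau"'] & A \arrow[d, "\tau"] \\
 & G(\beta) = B & B \arrow[r, "\beta"'] & B
\end{tikzcd}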
Proposition 57. Suppose j : A → A and a ∈ A. The following are equivalent.
(A) (A, j, a) is a weak natural number object.
(B) If G : SM → Set is the forgetful functor, a ∈ G(j) is a weakly universal element for G.
Remark 16. In particular, since (ω, s, 0) is a weak natural number object, it follows that 0 ∈ G(s)
is a weakly universal element for G.
Proof. Suppose (A, j, a) is a weak natural number object; we show a ∈ G(j) is a weakly universal
element for G. Suppose b ∈ G(β) = B, where β : B → B is an object of SM. Because (A, j, a)
is a weak natural number object, there is τ : A → B such that
(+) τ(a) = b
and which makes the square above commute. Viewing τ : j → β as an SM-morphism, it follows from
(+) that G(τ)(a) = b. We have shown a ∈ G(j) is a weakly universal element for G.
Conversely, suppose a ∈ G(j) is a weakly universal element. Suppose β : B → B and b ∈ B.
We find τ : A → B taking a to b and making the square above commute. Because a ∈ G(j) is
a weakly universal element, we obtain an SM-morphism τ : j → β (which therefore makes the
square above commute) satisfying G(τ )(a) = b. Identifying τ with G(τ ), we have τ (a) = b, as
required.
As a final observation, we point out that a strong converse to the fact that Dedekind self-maps arise from weak natural number objects is also true. Certainly, existence of a Dedekind
self-map suffices to prove the existence of a weak natural number object: An example is (ω, s, 0),
where s is the successor function. We may in fact conclude that a stronger type of structure,
called a natural number object, can be obtained from a Dedekind self-map. A natural number
object is a weak natural number object for which the property (∗) is strengthened to require that
the map β is unique. We give the full statement here: A triple (A, j, a), where j : A → A is a
self-map and a ∈ A, is a natural number object if, for any f : X → X and any x ∈ X,
(∗∗) there is a unique SM-map β : A → X such that β(a) = x and f ◦ β = β ◦ j.
We will show that (ω, s, 0) is a natural number object; this will be sufficient since, as we have
seen, a Dedekind self-map is enough to generate ω and s. Our argument will show that (ω, s, 0)
is initial not only for the category SelfMap, but also for the more general category SM. The
argument relies heavily on the fact that (ω, s, 0) is initial for SelfMap.
Theorem 58. If there is a Dedekind self-map, there is a natural number object. In particular,
if s : ω → ω is the usual successor function on ω, then (ω, s, 0) is a natural number object and is
the initial object for the category SM. In detail, for every function f : X → X and any x0 ∈ X,
there is a unique β : ω → X with β(0) = x0 and f ◦ β = β ◦ s.
\begin{tikzcd}
\omega \arrow[r, "s"] \arrow[d, "\beta"'] & \omega \arrow[d, "\beta"] \\
X \arrow[r, "f"'] & X
\end{tikzcd}
Remark 17. It is straightforward to show that the theorem remains true if we replace s : ω → ω
with any Dedekind self-map t : W → W that is Dedekind self-map isomorphic to s. Therefore,
objects that are initial in the category SelfMap are also initial in SM. In particular, the self-map
f1 : X1 → X1 , discussed earlier, that is the image F(1) by a left adjoint F to the forgetful functor
G : SM → Set, is a natural number object.62
Proof. Since there is a Dedekind self-map, we may obtain the usual successor function s : ω → ω
and make use of the fact that (ω, s, 0) is initial in SelfMap. Since f : X → X may not be 1-1,
we define a 1-1 function F that “preserves” f: Define F : ω × X → ω × X by F(n, x) = (n + 1, f(x)).
Clearly, (ω × X, F, (0, x0)) is a Dedekind self-map. Let τ : ω → ω × X be the unique function
guaranteed by the initial property of (ω, s, 0). In particular, τ (0) = (0, x0) and the following is
commutative:
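\begin{tikzcd}
\omega \arrow[r, "s"] \arrow[d, "\tau"'] & \omega \arrow[d, "\tau"] \\
\omega \times X \arrow[r, "F"'] & \omega \times X
\end{tikzcd}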
Claim 1. For all n ≥ 1, τ (n) = (n, f (y)), for some y.
Proof of Claim 1. We proceed by induction on n ≥ 1. For the base case,
τ(1) = τ(s(0)) = F(τ(0)) = F(0, x0) = (1, f(x0)).
For the induction step, assume the result for n ≥ 1. Then
τ(s(n)) = F(τ(n)) = F(n, f(y)) = (n + 1, f(f(y))) = (n + 1, f(z)),
where the second equality follows from the induction hypothesis and in the last equation z = f(y).
62 This was in fact already proven in the footnote on p. 95.
To pick out components from an ordered pair, we use the following notational convention:
For any u, v, (u, v)0 = u and (u, v)1 = v.
Claim 2. For all n ∈ ω, (F (τ (n)))1 = f (τ (n)1 ).
Proof of Claim 2. There are two cases. When n = 0, we have:
(F (τ (0)))1 = (F (0, x0 ))1 = (1, f (x0))1 = f (x0 ) = f ((0, x0)1 ) = f (τ (0)1 ).
Then, when n ≥ 1, we have, by Claim 1,
(F (τ (n)))1 = (F (n, f (y)))1 = (n + 1, f (f (y)))1 = f (f (y)) = f ((n, f (y))1) = f (τ (n)1 ).
Claim 3. Define β : ω → X by β(n) = τ (n)1 . Then β satisfies β(0) = x0 and makes the diagram
in the statement of the theorem commutative.
Proof of Claim 3. We proceed by induction on n ≥ 0. For the base case, β(0) = τ(0)1 =
(0, x0)1 = x0. This also establishes the first part of the Claim. For the induction step, assume
the result for n ≥ 0. Then, by Claim 2,
β(n + 1) = τ (s(n))1 = (F (τ (n)))1 = f (τ (n)1 ) = f (β(n)).
Finally, we establish uniqueness of β:
Claim 4. Suppose h : ω → X satisfies h(0) = x0 and for all n ∈ ω, f (h(n)) = h(s(n)). Then
h = β.
Proof of Claim 4. We proceed by induction on n ≥ 0. For the base case, clearly, h(0) = x0 =
β(0). Assuming the result holds for n ≥ 0,
h(s(n)) = f (h(n)) = f (β(n)) = β(s(n)).
This completes the proof of the theorem.
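The construction in the proof can be traced numerically. The Python sketch below is an illustration under stated assumptions: it uses F(k, x) = (k + 1, f(x)), the form that appears in the computations of Claims 1 and 2, and the function names `tau` and `beta` together with the sample self-map f(x) = x² mod 7 are chosen here for the example. It computes τ(n) by iterating F from (0, x0), reads off β(n) as the second component, and checks the recursion β(n + 1) = f(β(n)).

```python
def tau(n, f, x0):
    """tau(n) = F^n(0, x0), where F(k, x) = (k + 1, f(x)) as used in Claims 1 and 2."""
    pair = (0, x0)
    for _ in range(n):
        pair = (pair[0] + 1, f(pair[1]))
    return pair

def beta(n, f, x0):
    """beta(n) is the second component of tau(n), as in Claim 3."""
    return tau(n, f, x0)[1]

# Hypothetical self-map of {0, ..., 6}; it is not 1-1 (for example, f(3) == f(4)).
f = lambda x: (x * x) % 7
x0 = 3

values = [beta(n, f, x0) for n in range(5)]
print(values)  # prints [3, 2, 4, 2, 4]

# The defining equations of the theorem, checked on an initial segment of omega.
starts_at_x0 = beta(0, f, x0) == x0
recursion_holds = all(beta(n + 1, f, x0) == f(beta(n, f, x0)) for n in range(20))
```

The first component of τ(n) is always n, which is what keeps the values of τ distinct even though the values of β repeat.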
19 The “Naturalness” of the New Axiom of Infinity
In previous sections, we have examined several principles from which it is possible to derive a
Dedekind self-map, working in the theory ZFC − Infinity. In the last section, we summarized this
work from the perspective of determining how an infinite set could possibly arise in a ZFC−Infinity
world. In this section, we would like to address the more philosophical question, Is the New Axiom
of Infinity a natural axiom? A different but related philosophical question, which may need to
be addressed in a different way, is whether the New Axiom of Infinity is true.
These questions have analogies in the prevailing discussion among logicians concerning large
cardinals. Early in set theory’s history, there was considerable division among researchers about
whether any large cardinals, particularly the stronger ones, should be considered to exist in the
mathematical universe. By now (2014), very few who work in the field of foundations doubt that
large cardinals exist. But still there is no general agreement about how to formulate one or more
foundational axioms from which large cardinals can be derived. The problem in this case is not
the truth of large cardinal axioms but the naturalness of a formulation.
When the Axiom of Infinity, as it is known today, was formulated, the issue about naturalness
of an axiom that asserts existence of an infinite set was merged with the truth issue. Because it
had become clear that a rigorous formulation of the theorems of calculus, and even of the notion
of a real number, rested on the existence of an infinite set, one could assert that the Axiom of
Infinity should be viewed as both true and natural. Looking upon this early time in set theory’s
history with hindsight, it may be worthwhile—in light of the modern attitude toward formulating
one or more foundational axioms that could account for large cardinals—to make a distinction
between the naturalness and the truth of the Axiom of Infinity. The hope would be that if a case
could be made for the naturalness of an infinite set, one may be able to argue in an analogous
fashion in support of naturalness of an axiom from which large cardinals can be derived.
Cantor’s work made a convincing case that an infinite set must exist. But does it make sense
to say that infinite sets occur naturally in the universe simply because there is a need for a more
rigorous foundation for calculus and analysis? More convincing would be an observation that
the notion of “infinite set” arises in the way many mathematical structures typically arise.
We would therefore like to answer the question, “In what way do infinite sets, or Dedekind
self-maps, naturally arise?” Having offered a satisfactory answer, we would then wish to use that
answer to help with the larger question, “In what way can it be said that large cardinals naturally
arise?” From an answer to this question, we could hope to find a “natural” axiom that could
account for all large cardinals.
This second question has, like the question about infinite sets, two sides. One side is, “Is it
true that large cardinals exist?” This question has been addressed in many ways already by set
theorists. An example is the problem of determining whether certain pointclasses in definability
theory are determined. A well-known ZFC result due to Martin is that the class of Borel sets is
determined. The natural generalizations to analytic sets and ultimately projective sets require
large cardinals for a positive answer; for projective sets, Woodin cardinals are necessary. Because
of the naturalness of the generalization, one may answer the second question by saying, “The
Borel sets are determined; it should also be the case that analytic sets, and even projective sets,
should also be determined. Therefore, the large cardinals required in the proof of these results
should exist.” A great deal of evidence of this kind has been accumulated in the past half century.
Most set theorists believe that this evidence is sufficient to conclude that large cardinals are real
entities in the universe. It could also be argued that some of this evidence also provides part of an
answer to the second question as well. For instance, an intuitive remark along these lines would be
“Projective sets ought to be determined in the same way as Borel sets are determined; therefore,
it is natural to conclude that Woodin cardinals exist.” More generally, evidence of this kind has
the following flavor: A certain problem has the “expected” solution under the assumption of a
large cardinal; the expected solution also implies existence or consistency of the large cardinal in
question (or of a large cardinal not too much weaker than the one in question); therefore, it is
“natural” to conclude that such large cardinals exist.
What we are looking for here, however, is some kind of dynamic overarching principle that is
fundamental to mathematics itself, which would naturally suggest the existence of large cardinals;
perhaps even suggest the existence of all the known cardinals.
We begin with a discussion of naturalness of an infinite set in mathematics. A reasonable
starting point is the list of conditions, mentioned in previous sections, that imply, in ZFC−Infinity,
the existence of an infinite set.
We summarize these conditions for easy reference:
(1) The existence of a Dedekind self-map j on V that has a j-closure point.
(2) The existence of a Dedekind self-map j on V having a weak critical point A so that
ran (j ↾ A) ⊊ j(A).
(3) The existence of a Dedekind self-map on V that preserves disjoint unions, the empty set,
and singletons.
(4) The existence of a Dedekind self-map j : V → V with critical point a, which preserves
disjoint unions, intersections, the empty set, and terminal objects, with the additional
properties that {a} is also a critical point and a ∈ j(A) is a weakly universal element for j,
for some set A.
(5) The existence of a self-map on V that preserves finite coproducts and terminal objects.
(6) The existence of a Dedekind self-map on V whose critical point is itself an infinite set
(arising from the existence of a left adjoint to the forgetful functor G : SM → Set).
(7) The existence of a weak natural number object j : A → A, which makes it possible for an
SM object to pick out points from objects in Set.
(8) The existence of a weakly universal element for the forgetful functor G : SM → Set.
In our view, even though all of these points are useful for the purpose of formulating large
cardinal analogues, mostly they do not provide compelling reasons for including an infinite set
in a ZFC − Infinity universe. The category theorist may find (7) compelling; Lawvere [24, chap.
9] suggests that the requirement to be able to pick out points in this way is fundamental, but it
still has the form of “solving a problem in a natural way.”
For us, item (6) seems the most promising. The fact that an infinite set, or a Dedekind
self-map, is equivalent to the existence of an adjoint functor demonstrates that infinite sets arise
in the same way an enormous number of other structures in mathematics arise. For this to be
convincing, we will offer some evidence, well-known to category theorists, of the rather ubiquitous
role that is played by adjunctions in the creation of mathematical structures and their properties.
We will recall several facts as evidence for our thesis. First, we recall the fact that all algebraic
theories, as well as many other categories such as SM, can be seen to emerge from the category
of sets by way of an adjunction. Second, we will repeat a point made by Lawvere 50 years ago
that many of the constructive properties inherent in any model of set theory arise naturally from
existence of adjoints. And finally, we will recall a recent theorem which demonstrates that the
category of sets can be characterized in terms of maximum-length adjoint strings of a certain
kind.
Let us begin by quoting category theorists, who find adjunctions to be a fundamental pattern
in mathematics. In the preface to his classic text on category theory, S. Mac Lane mentions [26,
p. vii] a slogan that captures an intuition shared by experts in the field: “Adjoint functors arise
everywhere.” In the same spirit, S. Awodey mentions [2, p. 231] in his text Category Theory,
“...in a sense, every functor has an adjoint.” It is obvious to those who have sought to detect the
presence of adjunctions that they pervade mathematics. We turn to some evidence that category
theorists often cite in support of this point.
As is well known (see [34, p. 120], for example), every algebraic theory, such as group
theory or ring theory, can be understood as emerging from the category of sets by way of a free
object functor, which always turns out to be left adjoint to the forgetful functor. We review the
mechanics of this for the case of group theory; it is a theorem of universal algebra that something
like this is true for any algebraic theory. We quote some standard facts about groups. First, every
group is a homomorphic image of a free group. Free groups can be seen as arising from the free
object functor F : Set → Grp, where Grp is the category of groups with group homomorphisms.
For each set X, F(X) is the set of all reduced words obtained from the elements of X (see [34]).
Let G : Grp → Set be the forgetful functor. Then every set function f : X → G(H) induces a
unique homomorphism f̄ : F(X) → H extending f, and in fact there is a natural bijection between Set(X, G(H))
and Grp(F(X), H), and so F ⊣ G.
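To make the universal property concrete, the following Python sketch models words in a free group on a small alphabet and extends a function f on generators to a homomorphism into a group of permutations. It is an illustration only: the names `reduce_word`, `multiply`, `compose`, `inverse`, and `extend`, and the choice of permutations of {0, 1, 2} as the target group H, are assumptions introduced here and are not taken from [34].

```python
from typing import Callable, Dict, List, Tuple

# A word in the free group on X: a list of (generator, exponent) pairs,
# where the exponent is +1 or -1. (Hypothetical encoding, for illustration.)
Word = List[Tuple[str, int]]
Perm = Tuple[int, ...]


def reduce_word(word: Word) -> Word:
    """Cancel adjacent pairs x x^-1 and x^-1 x until none remain."""
    out: Word = []
    for gen, exp in word:
        if out and out[-1] == (gen, -exp):
            out.pop()          # the new letter cancels the previous one
        else:
            out.append((gen, exp))
    return out


def multiply(u: Word, v: Word) -> Word:
    """Group operation of F(X): concatenate, then reduce."""
    return reduce_word(u + v)


def compose(p: Perm, q: Perm) -> Perm:
    """Product in the target group H: (p . q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p: Perm) -> Perm:
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def extend(f: Dict[str, Perm], degree: int) -> Callable[[Word], Perm]:
    """Given f : X -> G(H), return the induced map f_bar : F(X) -> H."""
    identity = tuple(range(degree))

    def f_bar(word: Word) -> Perm:
        result = identity
        for gen, exp in reduce_word(word):
            factor = f[gen] if exp == 1 else inverse(f[gen])
            result = compose(result, factor)
        return result

    return f_bar


# Usage: two generators sent to two transpositions of {0, 1, 2}.
f = {"x": (1, 0, 2), "y": (0, 2, 1)}
f_bar = extend(f, 3)
u = [("x", 1), ("y", 1)]
v = [("y", -1), ("x", 1)]
# f_bar agrees with f on generators and respects multiplication of words.
assert f_bar([("x", 1)]) == f["x"]
assert f_bar(multiply(u, v)) == compose(f_bar(u), f_bar(v))
```

The value of `f_bar` is forced by its values on generators, which is the uniqueness half of the universal property; the bijection between Set(X, G(H)) and Grp(F(X), H) sends f to `f_bar` and a homomorphism back to its restriction to the one-letter words.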
Philosophically speaking, one can argue that the category of groups arises from the category
of sets by virtue of the fact that G has a left adjoint; in this sense, existence of a left adjoint to the
forgetful functor, at least in the world of algebraic theories, is a kind of creational transformation
that produces a more structured category from the “structureless” category of sets. And this
happens because of the balanced relationship between the “dynamic and creative” influence of F
and the “silent and destructive” influence of G.
Given any category C, one can ask if it arises from Set in a similar fashion. The answer is
that, in many cases, this is indeed the situation, but not always. One can answer this question
by examining the internal structure of C, using the Adjoint Functor Theorem. This theorem tells
us that G : SM → Set has a left adjoint precisely when an infinite set exists. This demonstrates
that the notion of “infinite set” is directly related to the dynamic principle by which algebraic
theories arise from adjunctions. This phenomenon therefore provides evidence for the naturalness
of infinite sets.
As a second point, we observe that the category of sets is an example of a cartesian closed
category, whose defining characteristics all arise from adjunctions. Stated simply, a cartesian
closed category is one that has a terminal object, is closed under cartesian products (for any
objects X, Y , there is another object X × Y in the usual sense), and is closed under taking
exponents (for any objects X, Y , there is an exponential object $X^Y$ consisting of all morphisms
from Y to X).⁶³ These defining axioms can be understood as expressions of adjoint situations,
as follows:⁶⁴ A cartesian closed category can be defined to be a category C equipped with adjoint
situations of the following three sorts:
(1) Existence of Terminal Object. C has a terminal object if there is an object 1 with the
property that, for every X in C, there is a unique morphism X → 1. Existence of a
terminal object for C is equivalent to existence of a right adjoint 1 → C to the unique
morphism (which, in the category of all categories, is a functor) from C to the category 1.
(2) Existence of Products. For any objects X, Y in C, the product of X and Y is an object
X × Y together with projection morphisms πX : X × Y → X, πY : X × Y → Y having the
universal property: For any object Z in C and morphisms f : Z → X, g : Z → Y , there is a
unique morphism ⟨f, g⟩ : Z → X × Y making the following diagram commutative:
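\[
\begin{array}{ccccc}
 & & Z & & \\
 & \swarrow{\scriptstyle f} & \downarrow{\scriptstyle \langle f, g\rangle} & \searrow{\scriptstyle g} & \\
X & \xleftarrow{\;\pi_X\;} & X \times Y & \xrightarrow{\;\pi_Y\;} & Y
\end{array}
\qquad \pi_X \circ \langle f, g\rangle = f, \quad \pi_Y \circ \langle f, g\rangle = g.
\]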
Existence of products in C is equivalent to existence of a right adjoint Π : C × C → C to the
diagonal functor ∆ : C → C × C, where ∆ is defined by ∆(X) = (X, X). For each X ∈ C,
we let ΠX : C → C denote the X-product functor defined by ΠX (Y ) = X × Y , with the
obvious definition on morphisms.
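(3) Existence of Exponentials. For any objects X, Y in C, an exponential object $Y^X$ in C,
together with its evaluation map evX,Y : $Y^X$ × X → Y , is an object-morphism pair that
satisfies the following universal property: Given any object Z ∈ C and any g : Z × X → Y ,
there is a unique morphism ĝ : Z → $Y^X$ such that the following diagram commutes:
\[
\begin{array}{ccc}
Z \times X & & \\
\downarrow{\scriptstyle \hat{g} \times 1_X} & \searrow{\scriptstyle g} & \\
Y^X \times X & \xrightarrow{\;\mathrm{ev}_{X,Y}\;} & Y
\end{array}
\qquad \mathrm{ev}_{X,Y} \circ (\hat{g} \times 1_X) = g.
\]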
For Set, evX,Y is the evaluation map given by evX,Y (f, x) = f (x). Exponentiation by X
can be seen as a functor EX : C → C that is right adjoint to the X-product functor ΠX ,
that is, ΠX ⊣ EX .
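In Set, the adjunction between the X-product functor and exponentiation by X is the familiar passage between a two-argument function and its curried form. The Python sketch below is an illustration under assumptions made here: finite sets are modelled as lists, an element of the exponential is modelled as a dictionary, and the names `curry`, `uncurry`, and `ev` are hypothetical.

```python
from itertools import product


def ev(f, x):
    """Evaluation map: apply an element f of the exponential to x."""
    return f[x]


def curry(g, X):
    """Send g : Z x X -> Y to g_hat : Z -> Y^X.

    g_hat(z) is the function x |-> g(z, x), stored as a dict on X.
    """
    return lambda z: {x: g(z, x) for x in X}


def uncurry(h):
    """Send h : Z -> Y^X back to the map (z, x) |-> ev(h(z), x)."""
    return lambda z, x: ev(h(z), x)


# Usage on small finite sets.
Z = [0, 1]
X = ["a", "b"]
g = lambda z, x: (z, x.upper())
g_hat = curry(g, X)

# The triangle commutes: ev . (g_hat x 1_X) = g on every pair (z, x).
assert all(ev(g_hat(z), x) == g(z, x) for z, x in product(Z, X))
# Currying followed by uncurrying returns a map that agrees with g.
assert all(uncurry(g_hat)(z, x) == g(z, x) for z, x in product(Z, X))
```

The two assertions check, on a small example, the two directions of the bijection between maps Z × X → Y and maps Z → Y^X that underlies the adjunction.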
The fact that cartesian closed categories are defined in terms of adjoints indicates that the
structure and dynamics of the category Set are intimately tied to adjunctions.
⁶³ The definitions of “product” and “exponential” in this context need to be expressed in the language of categories; each of these is defined entirely in terms of morphisms with universal properties, in the spirit of the definition of coproducts, discussed earlier.
⁶⁴ See [23].
Another way in which adjunctions make their presence known in the category of sets was
discovered relatively recently in work by Rosebrugh and Wood [33]. They show that the category
Set can be completely characterized in terms of maximal adjoint strings. We give an overview
of this characterization here.
After stating some preliminaries, we will give the main result, and then explain the terminology somewhat more and list the background theorems that the result depends on. First, we
consider a new way to construct categories. Suppose C is any category with the property that,
for any objects C, D of C, the collection C(C, D) of morphisms from C to D in C is a set; such
categories are said to be locally small. We denote by $\mathrm{Set}^{C^{op}}$ the category of contravariant functors
from C to Set whose morphisms are natural transformations between functors.⁶⁵ C can always
be “embedded” in $\mathrm{Set}^{C^{op}}$ via what is known as the Yoneda embedding, which is a special functor
Y defined as follows: For each object C in C, Y(C) is the functor yC : $C^{op}$ → Set defined by:
yC(D) = C(D, C).
In other words, yC takes each object D of C to the set of C-morphisms from D to C; in order for
C(D, C) to be an object of Set, the condition that C be locally small is necessary. Moreover, for
any morphism f : C → E in C, Y(f ) : yC → yE is the natural transformation yf defined, for
each object D in C, by
$(yf)_D(h) = \bigl(D \xrightarrow{\;h\;} C \xrightarrow{\;f\;} E\bigr) = f \circ h.$
Rosebrugh-Wood Adjoint String Theorem (RAST) [33]. Suppose C is a locally small
category.
(A) Suppose that the Yoneda embedding Y : C → $\mathrm{Set}^{C^{op}}$ has a left adjoint X, which in turn
has left adjoints W, V, U; in other words, suppose C admits the following adjoint string:
U ⊣ V ⊣ W ⊣ X ⊣ Y.
Then C is equivalent to Set. Moreover, the maximum possible length of an adjoint string
of this type (beginning at the right with a Yoneda embedding) is 5.
(B) An example of an adjoint string of length 5, of the type mentioned in (A), is given by
∃∃Y0 ⊣ K(∃Y0) ⊣ ∃Y1 ⊣ K(Y1) ⊣ YSet : Set → $\mathrm{Set}^{\mathrm{Set}^{op}}$.
We explain the meaning of the notation in (B) and derive the adjoint string mentioned there.
We will make use of the following standard results from the literature:⁶⁶
We let 0 denote the empty category (no objects or morphisms); it is the initial object in
the category Cat of categories. We let 1 denote the category having just one object and one
morphism; it is the terminal object in Cat. If C is an object in Cat, the unique functor F : C → 1
is denoted !C , or simply ! if the context makes the meaning clear. Note that $0^{op} = 0$ and $1^{op} = 1$.
⁶⁵ The definition of natural transformation is given on p. 91. The opposite category of a category C was introduced on p. 93, and the notion of a contravariant functor was introduced on p. 94.
⁶⁶ These are stated, with references, in [33].
For each category C, we let K(C) denote $\mathrm{Set}^{C^{op}}$.⁶⁷ K is a contravariant functor from
Cat to Cat (equivalently, a functor Cat → $\mathrm{Cat}^{op}$); it is defined on morphisms of Cat in
the following way: Suppose C, D are categories and F : C → D is a Cat-morphism. Then
K(F) : $\mathrm{Set}^{D^{op}}$ → $\mathrm{Set}^{C^{op}}$ is defined by K(F)(H) = H ◦ F.
Lemma R1 . If L1 , L2 are left adjoint to a functor F, then L1 and L2 are naturally isomorphic.
Likewise, if R1 , R2 are both right adjoints of F, then R1 and R2 are naturally isomorphic.
Lemma R2 . Whenever L ⊣ R, we have K(L) ⊣ K(R).
Lemma R3 . If the morphisms of C form a set and if D is locally small, and F : C → D, then
K(F) has a left adjoint, denoted ∃F, and a right adjoint, denoted ∀F. Therefore,
∃F ⊣ K(F) ⊣ ∀F.
In particular, if F is the Yoneda embedding YC : C → K(C) = $\mathrm{Set}^{C^{op}}$, then ∀YC ≅ YK(C).
Therefore:
∃YC ⊣ K(YC ) ⊣ YK(C).
We build the adjoint string of the RAST Theorem: We begin by considering the Yoneda
embeddings Y0 : 0 → $\mathrm{Set}^{0}$ ≅ 1 and Y1 : 1 → $\mathrm{Set}^{1}$ ≅ Set. Applying R3 to Y0 , we have the
following adjoint string:
(1)   ∃Y0 ⊣ K(Y0) ⊣ ∀Y0 ≅ YK(0) ≅ Y1.
Notice that K(Y0) : K(1) → K(0), that is, K(Y0) is the unique morphism ! : Set → 1. This
implies that the other two functors in the adjoint string (1) have the signature 1 → Set.
Next, we apply R2 to (1):
(2)   K(∃Y0) ⊣ K(!) ⊣ K(Y1).
Since ! : Set → 1, K(!) : Set → $\mathrm{Set}^{\mathrm{Set}^{op}}$. It follows by adjointness that the other two functors in
(2) have signature $\mathrm{Set}^{\mathrm{Set}^{op}}$ → Set.
Next, we apply R3 to the functor ∃Y0 : 1 → Set to obtain the following adjoint string:
(3)   ∃∃Y0 ⊣ K(∃Y0) ⊣ ∀∃Y0 .
Examining (2) and (3) we see that both K(!) and ∀∃Y0 are right adjoints to K(∃Y0), so, by
R1 , K(!) ≅ ∀∃Y0 . Using this observation, we can combine (1), (2) and (3) to produce a length-4
adjoint string:
(4)   ∃∃Y0 ⊣ K(∃Y0) ⊣ ∀∃Y0 ≅ K(!) ⊣ K(Y1).
⁶⁷ The notation K is in honor of Kan, who discovered this functor and established the first mathematical results about it.
Next, we apply R3 to the functor Y1 : 1 → Set, producing the following adjoint string:
(5)   ∃Y1 ⊣ K(Y1) ⊣ ∀Y1 ≅ YK(1).
Examining (4) and (5), we see that K(Y1) has two left adjoints: K(!) and ∃Y1 , which, by
R1 , must be isomorphic. Combining (4) and (5), and noting that K(1) ≅ Set, yields the final
result:
(6)   ∃∃Y0 ⊣ K(∃Y0 ) ⊣ ∃Y1 ⊣ K(Y1) ⊣ YSet.
What does the RAST tell us about infinite sets? The fact is, the result only makes sense if
infinite sets exist; one cannot even talk about the category $\mathrm{Set}^{\mathrm{Set}^{op}}$ without infinite sets. In this
sense then, RAST implies the Axiom of Infinity.
We have seen that adjunctions show up throughout mathematics, with or without the Axiom
of Infinity. And existence of an infinite set turns out to be equivalent to more than one important
result about existence of adjunctions. These points provide evidence that infinite sets are as
natural for mathematics as adjoint situations.
We make one final point about the emergence of the category SM from Set by way of a
(putative) adjunction. We would like to make the point that infinite sets arise not just from
adjunctions generally, but from adjunctions as dynamical principles internal to the universe of
sets. To see what we mean, we return to the example of group theory. We have suggested that
group theory “arises” because of a left adjoint to the forgetful functor G : Grp → Set, and we
wish to claim the same holds for SM: that it arises from the existence of a left adjoint to the
forgetful functor G : SM → Set. But we cannot really claim these categories “arise” from Set, at
least by way of a particular left adjoint, because in each case, the new category must already exist
in order for a forgetful functor to exist. We close this section by addressing this point. It turns
out that, for algebraic theories, as well as for the category SM, whenever we have a left adjoint
F : Set → C to a particular forgetful functor G : C → Set, if we let j = G ◦ F : Set → Set, the
category C and the functors F, G can be recovered by examining j and its action on Set, without
reference to an external category. This shows that, in the case of G : SM → Set, an infinite set,
or equivalently, existence of a left adjoint for G, arises from dynamics entirely internal to Set
itself.
To explore these details further, we begin by introducing a bit more information about the
structure of adjunctions. Suppose C, D are categories and F : C → D, G : D → C are adjoint
functors. We observed earlier (see the footnote on p. 95) that the adjunction has an associated
unit, denoted η, defined, for each object A in C, by
$\eta_A : A \to G(F(A)), \qquad \eta_A = \Theta_{A,F(A)}(1_{F(A)}),$
where ΘA,B is the natural bijection from D(F(A), B) to C(A, G(B)). One also defines a co-unit
ε of the adjunction as follows: For each D-object B,
$\varepsilon_B : F(G(B)) \to B, \qquad \varepsilon_B = \Theta^{-1}_{G(B),B}(1_{G(B)}).$
One can show that these components of ε have a universal property that is dual to that of
the components of η, and that ε is a natural transformation. One says that ⟨F, G, η, ε⟩ is an
adjunction.
Any adjunction ⟨F, G, η, ε⟩ determines another functor T : C → C by composition: T = G ◦ F,
as we have already observed in the case of the categories SM and Set. By virtue of the properties
of the adjunction, T forms a central part of another structure, called a monad.
Given a category C, a monad is a triple ⟨T, η, µ⟩ for which T : C → C is a functor, and
η : 1 → T and µ : $T^2$ → T are natural transformations, so that
(i) for any object A in C and any x ∈ $T^3$(A), µA (µT(A) (x)) = µA ((T(µA ))(x)).
(ii) for any object A in C and any x ∈ T(A), µA (ηT(A)(x)) = x = µA ((T(ηA ))(x)).
Note that the maps T(ηA ) : T(A) → $T^2$(A), for A ∈ C, are the components of a natural
transformation.
Any adjunction ⟨F, G, η, ε⟩ gives rise to a monad ⟨T, η, µ⟩ by way of the following definitions:
(i) T = G ◦ F
(ii) η is the same in both structures
(iii) for all objects A in C and x ∈ $T^2$(A), µA (x) = G(εF(A))(x).
A natural question, which is relevant for us in this paper, is whether, given a monad ⟨T, η, µ⟩
as above that is obtained from an adjunction ⟨F, G, η, ε⟩, it is possible to recover this adjunction,
and indeed the category D, from the monad. In general, the answer is no, but when D is an
algebraic category, like Grp, or the special category SM that we have been considering, this can
indeed be done. We outline the results in the special case in which the adjunction ⟨F, G, η, ε⟩ is
for the functors F : Set → SM and G : SM → Set, and the resulting monadic functor is j = G ◦ F.
Note that to proceed further, we need to assume some form of the Axiom of Infinity, since
existence of F depends on this assumption.
Starting with j : V → V (we identify V with Set as we were doing earlier), together with
natural transformations η, µ that give us that ⟨j, η, µ⟩ is a monad, we build a category $V^j$ of
j-algebras as follows:
$V^j$ = {(A, a) | a : j(A) → A},
where each (A, a) in $V^j$ satisfies the equations:
a ◦ ηA = 1A
a ◦ j(a) = a ◦ µA .
Morphisms of $V^j$ of the form f : (A, a) → (B, b) are functions f : A → B satisfying
b ◦ j(f ) = f ◦ a.
We can define the forgetful functor $G^j$ : $V^j$ → Set by $G^j$((A, a)) = A. $G^j$ has a left adjoint
$F^j$ defined by $F^j$(A) = (j(A), µA).
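The algebra equations can be checked mechanically in a small model. The Python sketch below rests on assumptions that are not spelled out in this section: that SM is the category of self-maps f : X → X, and that the free self-map on a set A is ω × A with (n, x) ↦ (n + 1, x), so that j(A) = ω × A, ηA(x) = (0, x), and µA(m, (n, x)) = (m + n, x). The function names are hypothetical. Under these assumptions, a self-map f determines the j-algebra a(n, x) = fⁿ(x), and f is recovered as x ↦ a(1, x).

```python
def eta(x):
    """Unit (assumed form): x |-> (0, x)."""
    return (0, x)


def mu(nested):
    """Multiplication (assumed form): (m, (n, x)) |-> (m + n, x)."""
    m, (n, x) = nested
    return (m + n, x)


def j_map(f):
    """Action of j on a function f : A -> B, giving j(f) : w x A -> w x B."""
    return lambda pair: (pair[0], f(pair[1]))


def algebra_from_self_map(f):
    """Turn a self-map f : A -> A into a structure map a : j(A) -> A."""
    def a(pair):
        n, x = pair
        for _ in range(n):     # apply f exactly n times
            x = f(x)
        return x
    return a


# Usage: a self-map on A = {0, ..., 6}.
A = range(7)
f = lambda x: (x * x + 1) % 7
a = algebra_from_self_map(f)

# First algebra equation: a . eta_A = 1_A.
assert all(a(eta(x)) == x for x in A)
# Second algebra equation: a . j(a) = a . mu_A, checked on sample elements.
assert all(
    a(j_map(a)((m, (n, x)))) == a(mu((m, (n, x))))
    for m in range(4) for n in range(4) for x in A
)
# The self-map is recovered from its algebra.
assert all(a((1, x)) == f(x) for x in A)
```

Only finitely many elements of j(A) are sampled, so this is a sanity check of the equations and not a proof; the set ω × A itself is never materialized.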
We have constructed the category $V^j$ from j, η, µ, which are mappings belonging to the
dynamics of V alone. As it turns out, in building $V^j$, we have reconstructed, up to isomorphism,
the category SM. The Eilenberg-Moore comparison functor Φ : SM → $V^j$ is defined as follows:
For any f : X → X belonging to SM, Φ(f ) is the j-algebra (X, G(εf )). Given an SM-morphism
h : f → g, where f : X → X and g : Y → Y , Φ(h) : (X, G(εf )) → (Y, G(εg )) is the function
h : X → Y . One shows that Φ is the unique functor satisfying
$G^j$ ◦ Φ = G and Φ ◦ F = $F^j$.
Moreover, we have
(A) Φ : SM → $V^j$ is an isomorphism
(B) j = $G^j$ ◦ $F^j$
(C) $F^j$(1) = Φ(F(1)) = (X1 , j(X1) → X1 ) yields a j-algebra based on an infinite set; in
particular, it continues to be true that X1 is the least critical point of j.
By (C), we see how an infinite set emerges from the dynamics of j : V → V without reference to
an external category.
Our discussion in this section has made a case for the claim that existence of infinite sets
is “natural” because they arise, like so many structures in mathematics, from adjunctions. Our
discussion of monads gives us additional information: Every adjunction between Set and another
category produces a monad j : V → V . So, like adjunctions, monads are fundamental to
mathematics, and we have seen that an infinite set arises naturally from a monad j : V → V —a
certain type of (essentially) Dedekind self-map—in particular, from its critical point. Moreover,
the dynamics of this monad can be seen to arise, by the Eilenberg-Moore results, entirely within
the universe (and not dependent on the presence of external categories).
We can now ask whether the naturalness of infinite sets provides some insight into the
naturalness of large cardinals in the universe. We will explore this topic further in Section 22. A
reasonable hope is that the arguments for naturalness of infinite sets that we have discussed here
might carry over to the context of large cardinals in the following ways:
(1) “monad-like” Dedekind self-maps j : V → V occur naturally in mathematics, and large
cardinals may be seen to emerge from the dynamics of such a map
(2) the “largeness” of large cardinals may naturally become apparent in examining the behavior
of such a j at a canonical critical point—such as its least critical point.
Our argument for the naturalness of infinite sets suggests a point of view about larger infinities
and points to an axiomatic formulation, involving (1) and (2). We have provided evidence for
the claim that a formulation of an axiom that accounts for large cardinals and that has the
characteristics of (1) and (2) is natural. Moreover, our experience with Dedekind self-maps of
this kind tells us that they can be “tuned” by requiring them to satisfy certain preservation
properties. Maps of this kind, as described in (1) and (2), are natural, and the ones that have
been “tuned” yield the strongest large cardinal consequences.
20 Tools for Generalizing to a Context for Large Cardinals
As we discussed at the beginning of this paper, one of our motivations for formulating a new
Axiom of Infinity is to provide, in the structure of the axiom itself, the sort of intuition about
the “Infinite” that could be useful in addressing the Problem of Large Cardinals. In particular,
we are looking for patterns in the statement of the axiom that would be natural choices for
generalization; the intended result would be stronger, naturally motivated axioms that could
possibly be used to derive known large cardinal properties.
In this section, we collect together patterns of this kind that seem amenable to natural
strengthenings.
20.1 The structure of a Dedekind self-map
Our New Axiom of Infinity states simply that a Dedekind self-map exists. The underlying intuition is that the discrete values that belong to an infinite collection arise as a consequence of
more fundamental transformational dynamics, namely, some j : A → A having a critical point a.
We found that such a j has three salient characteristics:
(1) It preserves essential features of its domain: The image B = j[A] is itself a Dedekind-infinite
set, and j ↾ B : B → B is a Dedekind self-map with critical point j(a).
(2) It is not simply the identity map—some element of A is moved by j.
(3) Something interesting happens in the interaction between j and its critical point. In this
case, a blueprint W ⊆ A of the set ω of natural numbers arises from repeated application
of j to a: W = {a, j(a), j(j(a)), . . .}.
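The blueprint W can be generated lazily for a concrete Dedekind self-map. The sketch below assumes, as in the earlier sections, that a Dedekind self-map is an injective map whose critical point lies outside its range; the particular map j(n) = 2n + 1 on the non-negative integers, with critical point 0, and the name `orbit` are choices made here purely for illustration.

```python
from itertools import islice


def orbit(j, a):
    """Yield a, j(a), j(j(a)), ... one element at a time."""
    x = a
    while True:
        yield x
        x = j(x)


# Illustrative Dedekind self-map: injective, and 0 is not of the form 2n + 1.
j = lambda n: 2 * n + 1

first_six = list(islice(orbit(j, 0), 6))
assert first_six == [0, 1, 3, 7, 15, 31]
# No repetitions occur, as injectivity and the critical point guarantee.
assert len(set(first_six)) == len(first_six)
```

Because j is injective and 0 is outside its range, no element of the orbit can repeat; this is why W, although built from a single point and a single map, is an infinite set.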
The intuition that arises from this formulation of the “infinite” is that perhaps large cardinal
notions also arise from the dynamics of a Dedekind self-map. Since large cardinal properties often
involve the entire universe, it is reasonable to look for answers regarding large cardinals in proper
class Dedekind self-maps.
As we have observed, existence of a proper class Dedekind self-map is not enough to imply
the existence of infinite sets, even though the set version of Dedekind self-maps is. We collect
together the techniques we have discussed for strengthening such class maps so existence of infinite sets can be derived. We view these techniques as candidates for further development in the
direction of deriving large cardinals.
We review the content of Theorems 46, 47, and 49:
Theorem 46. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map. Suppose also
that j preserves disjoint unions, the empty set, and singletons. Then there is an infinite set.
Indeed, every critical point of j is infinite.
Theorem 47. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map with critical
point a. Suppose also that j preserves disjoint unions, intersections, and the empty set, and that
there is a set A such that a ∈ j(A). Then there is an ultrafilter D on A. Moreover, if either of
the following conditions holds, D is nonprincipal, and A is infinite.
(1) j preserves singletons.
(2) j preserves terminal objects and {a} is also a critical point of j.
Theorem 49. (ZFC − Infinity) Suppose j : V → V is a class Dedekind self-map with a strong
critical point. Suppose j preserves finite coproducts and terminal objects. Then there is an
infinite set. Indeed, any strong critical point of j is infinite.
We see in each of these results how requiring a class Dedekind self-map j : V → V to
have certain combinations of preservation properties results in an infinite set, and that these
combinations produce an infinite set that can typically be located at a (strong) critical point of
j. A natural direction for further development, then, would be to require j to satisfy additional
preservation properties and to see whether the properties of a critical point for j are strengthened
in the direction of large cardinals. Theorem 53(3) shows that whenever j preserves ∈ and preserves
ordinals, and has any type of critical point, we can locate a least ordinal moved by j; when we
are working with a j with these properties, we will let crit(j) denote the least ordinal moved by
j; crit(j) will be a natural critical point to examine as we seek to “generate” large cardinals from
preservation properties.
In this section we will provide two strengthenings of these preservation properties that lead
to existence of large cardinals. We focus in this section on perhaps the simplest (and certainly
one of the weakest) large cardinal notions—inaccessible cardinals—in order to illustrate the techniques. In Section 21, we will discuss other large cardinals and expand the techniques introduced
here so that even the biggest large cardinals can be derived. An infinite cardinal κ is regular if
every unbounded subset of κ has size κ; κ is a strong limit if for every λ < κ, $2^{\lambda} < \kappa$. An uncountable
cardinal κ is inaccessible if it is regular and a strong limit. We begin with a few new preservation
properties that Dedekind self-maps j : V → V may satisfy.
Definition 9. Suppose j : V → V is a Dedekind self-map.
(1) j preserves countable disjoint unions if, whenever ⟨Xn | n ∈ N ⟩ (where either N ∈ ω or
N = ω) is a sequence of disjoint sets, $j\bigl(\bigcup_{n \in N} X_n\bigr) = \bigcup_{n \in N} j(X_n)$.
(2) j preserves unboundedness if, whenever α ∈ ON and A ⊆ α is unbounded in α, then, if
j(A) ⊆ j(α), then j(A) is unbounded in j(α).
(3) j preserves power sets if, for all X, j(P(X)) = P(j(X)).
(4) Suppose f : A → B and g : A → B are functions. The equalizer E = Ef,g of f and g is
the set {x ∈ A | f (x) = g(x)}. If j is a functor, j is said to preserve equalizers if for any
f, g as above, j(Ef,g) = Ej(f ),j(g); that is, if E is the equalizer of f and g, then j(E) is the
equalizer of j(f ) and j(g).
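Preservation properties of this kind are easy to test on a toy functor on finite sets, and doing so shows that they are genuinely independent of one another. The sketch below uses the functor j(X) = {0, 1} × X with j(f)(i, x) = (i, f(x)); this functor, and the names `equalizer`, `j_obj`, `j_map`, and `power_set`, are assumptions introduced for illustration and are not the class maps j : V → V studied in the text. On the sample data the toy functor preserves the equalizer but does not preserve the power set.

```python
from itertools import chain, combinations


def equalizer(f, g, A):
    """E_{f,g} = {x in A | f(x) = g(x)}."""
    return {x for x in A if f(x) == g(x)}


def j_obj(X):
    """Toy functor on objects: j(X) = {0, 1} x X."""
    return {(i, x) for i in (0, 1) for x in X}


def j_map(f):
    """Toy functor on maps: j(f)(i, x) = (i, f(x))."""
    return lambda pair: (pair[0], f(pair[1]))


def power_set(X):
    """All subsets of X, as frozensets."""
    xs = list(X)
    subsets = chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1)
    )
    return {frozenset(s) for s in subsets}


# Usage: two maps on A = {0, 1, 2, 3} that agree exactly on 0 and 3.
A = {0, 1, 2, 3}
f = lambda x: x % 2
g = lambda x: x // 2
E = equalizer(f, g, A)
assert E == {0, 3}

# j(E_{f,g}) = E_{j(f), j(g)}: the equalizer is preserved.
assert j_obj(E) == equalizer(j_map(f), j_map(g), j_obj(A))

# j(P(X)) and P(j(X)) already differ in size for X = {0, 1}.
X = {0, 1}
assert len(j_obj(power_set(X))) == 8
assert len(power_set(j_obj(X))) == 16
```

The failure of power-set preservation for so simple a functor gives some feeling for why adding that requirement in Theorem 59(C) is a real strengthening.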
Theorem 59 (Generalization of Theorem 46 to Inaccessibles). (ZFC − Infinity) Suppose j : V →
V is a Dedekind self-map that preserves ∈ and preserves ordinals. Let κ = crit(j).
(A) Suppose j preserves countable disjoint unions, the empty set, and singletons. Then κ > ω.
(B) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, and
unboundedness. Then κ is an uncountable regular cardinal.
(C) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, unboundedness, and power sets. Then κ is an inaccessible cardinal.
Proof of (A). By Theorem 53(4), ω exists and κ ≥ ω. We have, by Theorem 53 and the list of
properties that j preserves:
ω = {0} ∪ {1} ∪ {2} ∪ · · ·
= {j(0)} ∪ {j(1)} ∪ {j(2)} ∪ · · ·
= j({0}) ∪ j({1}) ∪ j({2}) ∪ · · ·
= j ({0} ∪ {1} ∪ {2} ∪ · · · )
= j(ω).
Since j(ω) = ω, the ordinal ω is not moved by j, so κ ≠ ω. It follows that κ > ω.
Proof of (B). By Theorem 53(5), κ is a cardinal. By (A), κ is an uncountable cardinal. Suppose,
for a contradiction, that α < κ, f : α → κ, and ran f is unbounded in κ. Using the BSP property of j, we can reason as in the
proof of Theorem 53(5) to show that j(f ) : j(α) → j(κ) has domain α and j(f )(β) = f (β) for
all β < α. Because j preserves images, we have
(a)   ran j(f ) = j(ran f ) ⊆ κ,
but, because j also preserves unboundedness and ran f is unbounded in κ,
(b)   j(ran f ) is unbounded in j(κ).
Clearly, since κ < j(κ), (a) and (b) contradict each other. Therefore, for every α < κ, all functions from α to κ
have bounded range. It follows, therefore, that κ is regular. We have shown κ is an uncountable
regular cardinal.
Proof of (C). We begin by showing that, for every A ⊆ α, where α < κ, we have j(A) = A.
Because α < κ and ∈ is preserved, A ⊆ j(A). Suppose γ ∈ j(A) − A. Note that α is the disjoint
union of A and α − A: α = A ∪ (α − A). Since γ ∉ A, it follows that γ ∈ α − A. Since j preserves
disjoint unions, α = j(α) = j(A) ∪ j(α − A). Since γ ∈ α − A, γ = j(γ) ∈ j(α − A), and so
γ ∉ j(A), which is a contradiction. We have shown A = j(A).
Suppose, for a contradiction, that there is a surjective function g : P(α) → κ, where α < κ.
Since j preserves functions and power sets and j(α) = α, j(g) : P(α) → j(κ). Note that, for each
A ∈ P(α), g(A) ∈ κ, whence j(g(A)) = g(A). Therefore, for each A ∈ P(α),
j(g)(A) = j(g)(j(A)) = j(g(A)) = g(A).
Therefore, we have
(∗)   ran j(g) = ran g = κ.
We also have, by preservation of images,
(∗∗)   ran j(g) = j(ran g) = j(κ) > κ.
Clearly, (∗) and (∗∗) contradict each other, and so no such function g exists. We have shown κ
is a strong limit, and hence, inaccessible.
The next result generalizes the approaches in Theorems 47 and 49, where somewhat different
preservation properties, in some cases of a more category-theoretic flavor, were used. These results
also incorporate observations from Example 4, which illustrate the role of ultrafilters in climbing
further up the hierarchy of infinities. And finally, the important role of universal elements, which,
as Proposition 57 shows, was the key to the Lawvere construction of a j : V → V , shows up again
in the next theorem as an important element in lifting results leading to infinite sets to results
leading to large cardinals.
Theorem 60 (Generalization of Theorems 47 and 49 to Inaccessibles). (ZFC − Infinity) Suppose
j : V → V is a functor, having a strong critical point, which preserves countable disjoint unions,
intersections, equalizers, the empty set, and terminal objects, and has a weakly universal element
a ∈ j(A) for j, for some set A. Then there is a collection 𝒜 of subsets of A with the following
properties:⁶⁸
(1) For every X ∈ 𝒜, |X| is inaccessible.
(2) If X, Y ∈ 𝒜 are distinct, then |X| ≠ |Y |.
(3) The size of 𝒜 itself is inaccessible.
Remark 18. We will show that j induces a nonprincipal ω1 -complete ultrafilter on A. The
properties (1)–(3) listed above are known consequences of the existence of such an ultrafilter, so
we do not provide proofs of these. Note that properties (1)–(3) imply that there are many large
cardinals that are less than |A|.
We have not required j to be a Dedekind self-map; we do not have a proof, under the given
hypotheses, that it must have this property. We can show, however, that j must be essentially
Dedekind (see p. 102 for the definition): In the proof given below, Claim 1 shows that j and jD
are naturally isomorphic. As mentioned in Example 8, jD itself is 1-1.
Also, although the hypotheses require j to have a strong critical point, they do not require the
weakly universal element a ∈ j(A) itself to be a strong critical point. However, under somewhat
stronger hypotheses, this does turn out to be the case—see Claim 4 of Example 8, where existence
of a measurable cardinal is shown to suffice to obtain this consequence.
Proof. Let D = {X ⊆ A | a ∈ j(X)}. As in the first part of Theorem 47, D is an ultrafilter (we
do not claim yet that it is nonprincipal). Define jD : V → V by jD(X) = X^A/D.
Claim 1. j is naturally isomorphic to jD. That is, there are, for all sets B, bijections
φB : jD(B) → j(B), natural in B.
Proof. We first define an onto map φ̄B : B^A → j(B), and then show that φ̄B induces a bijection
φB : B^A/D → j(B). Define φ̄B by
φ̄B(f) = j(f)(a).
We use the fact that a is a weakly universal element to show that φ̄B is onto: Suppose y ∈ j(B).
By weak universality, there is f : A → B so that y = j(f)(a) = φ̄B(f), as required.
Now define φB : B^A/D → j(B) by
φB([f]) = j(f)(a).
Like φ̄B, φB is onto. To see it is well-defined and 1-1, it suffices to show that, for all partial
functions f, g : A → B, f ∼ g if and only if j(f)(a) = j(g)(a):
\[
\begin{aligned}
f \sim g &\iff E = E_{f,g} \in D \\
&\iff a \in j(E) \\
&\iff a \in \{z \mid j(f)(z) = j(g)(z)\} = E_{j(f),\,j(g)} \\
&\iff j(f)(a) = j(g)(a).
\end{aligned}
\]
The proof that the φB are components of a natural transformation is straightforward.
⁶⁸ These properties follow from the fact that, assuming the hypotheses of the theorem, there must exist a measurable cardinal κ ≤ |A|. See Section 21.
Claim 2. D is nonprincipal.
Proof. Let Z be a strong critical point for j. By Claim 1, Z is also a strong critical point
for jD . Assume D is principal. Then there is u ∈ A such that {u} ∈ D; it follows that
D = {X ⊆ A | u ∈ X}. For a contradiction, it suffices to exhibit a bijection jD (Z) → Z.
For each z ∈ Z, let c_z : A → Z be the constant function defined by c_z(x) = z for all x ∈ A.
Suppose g : A → Z is total and let z = z_g = g(u). Let E = E_{c_z,g} = {x ∈ A | c_z(x) = g(x)}. Since
u ∈ E, it follows that E ∈ D, and so [c_z] = [g]. Since every [f] ∈ Z^A/D has a total representative g,
and since [g] = [c_{z_g}] for any such g, it follows that Z^A/D = {[c_z] | z ∈ Z}. The map
[c_z] ↦ z is therefore a 1-1 correspondence between Z^A/D and Z.
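The collapse described in this proof can be checked by brute force in a small finite case. The following is a minimal sketch, not part of the paper's argument: the sets `A`, `Z` and the point `u` are hypothetical stand-ins, and only total functions are enumerated (the paper also allows partial ones).

```python
from itertools import product

# Hypothetical finite stand-ins: index set A, target set Z, and the point u
# generating the principal ultrafilter D = {X ⊆ A | u ∈ X}.
A = (0, 1, 2)
Z = ("p", "q")
u = 1

def in_D(X):
    """Membership in the principal ultrafilter generated by u."""
    return u in X

def agreement_set(f, g):
    """E_{f,g} = {x in A | f(x) = g(x)}, with f and g stored as dicts."""
    return {x for x in A if f[x] == g[x]}

def equivalent(f, g):
    """f ~ g iff the agreement set belongs to D."""
    return in_D(agreement_set(f, g))

# All total functions A -> Z (an assumption of this sketch).
functions = [dict(zip(A, values)) for values in product(Z, repeat=len(A))]

# Group the functions into ~-classes, comparing against one representative.
classes = []
for f in functions:
    for cls in classes:
        if equivalent(f, cls[0]):
            cls.append(f)
            break
    else:
        classes.append([f])

# The constant functions c_z, and the index of the class containing each one.
constants = {z: {x: z for x in A} for z in Z}
class_of = {
    z: next(i for i, cls in enumerate(classes) if constants[z] in cls)
    for z in Z
}
```

In this sketch the number of classes equals the size of `Z`, and every function `g` lands in the class of the constant function with value `g[u]`, which mirrors the correspondence [c_z] ↦ z.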
Claim 3. D is ω1-complete.
Proof. It suffices to show that if {Xn | n ∈ ω} are disjoint subsets of A with X = ⋃_{n∈ω} Xn ∈ D,
then for some n ∈ ω, Xn ∈ D. Since X ∈ D and since j preserves countable disjoint unions,
\[
a \in j(X) \iff a \in \bigcup_{n\in\omega} j(X_n) \iff \exists n \in \omega\ \, a \in j(X_n) \iff \exists n \in \omega\ \, X_n \in D.
\]
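The "some piece of a disjoint union lies in D" mechanism can be illustrated on a finite set. This is only a sketch under strong assumptions: a finite set carries only principal ultrafilters, so it shows the partition behaviour and says nothing about nonprincipal ω1-complete ultrafilters. The names `A`, `u` and `three_way` are hypothetical.

```python
from itertools import combinations

A = frozenset(range(4))  # hypothetical finite index set
u = 2                    # hypothetical generator of the principal ultrafilter

def in_D(X):
    """X belongs to D = {X ⊆ A | u ∈ X}."""
    return u in X

def subsets(S):
    """Yield every subset of S as a frozenset."""
    items = sorted(S)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def pieces_in_D(pieces):
    """Return the pieces of a disjoint family that belong to D."""
    return [X for X in pieces if in_D(X)]

# Split every X in D into two disjoint pieces in every possible way and
# record any split where the number of pieces in D is not exactly one.
violations = [
    (X0, X - X0)
    for X in subsets(A) if in_D(X)
    for X0 in subsets(X)
    if len(pieces_in_D([X0, X - X0])) != 1
]

# A three-block partition of A, for a second check.
three_way = [frozenset({0}), frozenset({1, 2}), frozenset({3})]
```

No violations are recorded, and in the three-block partition only the block containing `u` belongs to D.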
After introducing large cardinals more formally in Section 21, we will give examples of
Dedekind self-maps j : V → V that exhibit the properties mentioned in the two previous theorems, assuming the existence of large cardinals.
It turns out the most productive direction for generalizing further arises from strengthening
the results of Theorem 59. To generalize this kind of preservation, a natural choice is some form
of elementary embedding. A map j : V → V is an elementary embedding if, for any formula
φ(x1 , x2 , . . . , xn ) in the language of sets and for any particular sets a1 , a2 , . . . , an ,
φ(a1 , a2 , . . . , an ) holds true in V if and only if φ (j(a1 ), j(a2), . . ., j(an)) holds true in V .
Intuitively speaking, elementarity of a Dedekind self-map j : V → V means that all properties⁶⁹
and relationships of sets that hold in V are preserved by j; preservation of disjoint unions and
terminal objects, for example, are special cases of this much stronger form of preservation.
Though elementary embeddings seem natural to consider, it turns out that, without some
modification, the preservation that they introduce is too strong. A consequence of early work by
K. Kunen [20] is that whenever j : V → V is definable in V (as all class Dedekind self-maps from
V to V must be), if j is elementary, j must be simply the identity map idV : V → V —it cannot
have a critical point (or even a weak critical point).⁷⁰
One way to pursue this approach in a consistent way is to eliminate the “definability” of j.
This can be done by expanding the language of set theory. Ordinarily, the only “extralogical”
symbol that is used in set theory is ∈, which is a formal symbol intended to represent the
membership relation. In the expanded set theory we are suggesting, we would introduce one
additional extralogical symbol j, intended to represent a map from V to V . One can then
introduce axioms that collectively assert that j is elementary and has a weak critical point. In
fact, the statement that there is a weak critical point is usually strengthened to assert “there is a
least ordinal moved.”⁷¹ A convention we will adopt for the rest of the paper is to refer to the least
critical point of any BTEE-embedding j : V → V as the critical point; the least weak critical
point does indeed turn out to be a critical point—indeed, the least critical point in ON—in the
usual sense of Dedekind self-maps. Collectively, these new j-axioms are given the name Basic
Theory of Elementary Embeddings, or BTEE.
Having removed the “definability” of j by requiring it to be an interpretation of a new function symbol j in the extended language {∈, j}, we are free to consider elementary embeddings
from V to V , as formulated in the expanded theory ZFC + BTEE, and to consider elementary
embeddings in strengthenings of this theory; in this context, the Kunen inconsistency result arises
as an extreme special case in which ZFC + BTEE is strengthened “too much” with additional
axioms. We examine consequences of the theory ZFC + BTEE after a brief but more formal
introduction to large cardinals in Section 21.
21 Introduction to Large Cardinals
In the previous section we mentioned some principles and strategies for strengthening a proper
class Dedekind self-map in such a way that it could give rise to large cardinals. In this section,
we give some background information about large cardinals, and then, in the following section,
examine two approaches to achieving our goal.
In the 19th century, Georg Cantor demonstrated that, assuming there is an infinite set (like
the set ω of natural numbers), there must be an endless hierarchy of ever-larger infinite sizes, called
infinite cardinals. He gave these different infinite sizes names: ℵ0, ℵ1, ℵ2, . . . (pronounced “aleph-zero, aleph-one, . . .”). Later it was recognized that the best way to define cardinal numbers is as
special kinds of ordinal numbers. Viewed in this way, the infinite cardinals are ω = ω0, ω1, ω2, . . ..
It is known that every infinite set has size that is equal to exactly one of these infinite cardinals
(though it is not always possible to determine, for a given set, which of the cardinals represents
its size).
⁶⁹ Technically speaking, all first-order properties are preserved by an elementary embedding.
⁷⁰ Kunen’s result is somewhat stronger than this; a detailed discussion is given in [6, 7]. His work shows that whenever j : V → V is elementary and weakly definable, j must be the identity.
⁷¹ Details of this approach can be found in [7].
In the early 20th century, certain combinations of properties of infinite cardinals were discovered to be quite strong; it was not clear that any infinite cardinal could actually have such
combinations of properties. We give an example of one such combination. One of the properties
involved is that of being an aleph fixed point. A cardinal κ is an aleph fixed point if ω_κ = κ; in
other words, the index of the aleph is identical to the aleph itself. Existence of such a cardinal
seems at first to be unlikely when one considers that, among the smallest infinite cardinals, the
index of an aleph is always much smaller than the aleph itself:
0 < ω0, 1 < ω1, 2 < ω2, . . .
It is possible however to construct an aleph fixed point.
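One standard construction, sketched here as a supplement (the names κ_n are introduced only for this illustration), iterates the aleph function countably many times and takes the supremum:

```latex
\begin{align*}
\kappa_0 &= \omega, \qquad \kappa_{n+1} = \omega_{\kappa_n}, \qquad \kappa = \sup_{n\in\omega} \kappa_n, \\
\omega_\kappa &= \sup_{n\in\omega} \omega_{\kappa_n} = \sup_{n\in\omega} \kappa_{n+1} = \kappa .
\end{align*}
```

The first equality in the second line uses the fact that κ is a limit ordinal and that the map α ↦ ω_α is continuous at limit ordinals. The cardinal κ obtained this way is the supremum of a countable increasing sequence of smaller cardinals.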
A second property is regularity, for which we gave a quick definition in an earlier section
(p. 120). We give an equivalent definition here that may have more intuitive appeal. An infinite
cardinal κ is regular if, whenever S is a collection of subsets of κ having these two properties:
(1) ⋃S = κ,
(2) each element of S has size < κ,
it must be the case that |S| ≥ κ. In other words, κ cannot be covered by subsets of smaller size
unless one uses at least κ many of them. ω is an easy example of a regular cardinal: Any cover of the set
ω = {0, 1, 2, . . .} by finite subsets of ω must contain infinitely many elements; ω is not covered
by finitely many of its finite subsets.
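As a contrasting non-example (a standard fact, added here only for illustration), the cardinal ω_ω fails this definition, because a countable family of smaller sets already covers it:

```latex
\[
S = \{\omega_n \mid n \in \omega\}, \qquad
\bigcup S = \omega_\omega, \qquad
|\omega_n| < \omega_\omega \ \text{for all } n \in \omega, \qquad
|S| = \omega < \omega_\omega .
\]
```

So ω_ω is covered by only countably many subsets of smaller size, and is therefore not regular.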
A question that was asked early on is: Is there an infinite cardinal that is both regular and
an aleph fixed point? As was discovered many years later, the existence of such a cardinal cannot
be proven from the ZFC axioms (assuming ZFC is consistent). This was one of the first large cardinal notions to be discovered. Any
regular cardinal that is an aleph fixed point is now known as a weakly inaccessible cardinal. We
spend a moment to explain why this type of cardinal, like any other large cardinal, cannot be
proved to exist using the ZFC axioms alone.
In the 1930s, Gödel proved two very important things in mathematical logic that hold the
key to understanding the limitative results surrounding large cardinals:
Gödel’s Theorems
(1) (Completeness Theorem) A mathematical theory is consistent if and only if it has a model.
(2) (Second Incompleteness Theorem) If a mathematical theory is strong enough to prove the
theorems of Peano Arithmetic, it cannot prove its own consistency, unless it is inconsistent.
We begin the discussion by defining some terms used in these statements that may be unfamiliar. A mathematical theory is a collection of axioms—like the ZFC axioms or axioms for
Euclidean geometry—together with all the theorems that can be proven from the axioms. A theory is consistent if one cannot prove from the theory a self-contradictory statement such as 0 ≠ 0. A
model for a theory is a set (or for some purposes, we can consider a proper class as well) in which
the extralogical symbols are interpreted and in which all the axioms hold true. A model of set
theory, for example, interprets the ∈ symbol as usual membership, and the axioms of ZFC hold
true in the model. An example of a model of set theory is V . However, the Second Incompleteness
Theorem tells us that if ZFC is consistent, we cannot prove that ZFC is consistent using ZFC
alone; in particular, there is no way to define V within ZFC (otherwise we would have defined
from ZFC a model of ZFC).
Now we can explain why large cardinals are “large”: Suppose κ is a large cardinal—say κ is
a weakly inaccessible cardinal. It can be shown that from κ one can define a transitive model of
ZFC.⁷² It follows, therefore, that one cannot prove from ZFC the existence of any large cardinals.
Despite this fact, large cardinals show up as key ingredients in the solutions of many problems
in analysis, topology, algebra, and logic. The Problem of Large Cardinals asks for new (natural)
axioms that can be added to the axioms of ZFC and from which the known large cardinals can be derived.
Some of the better known types of large cardinals are listed below, in increasing order of
strength:
weakly inaccessible
inaccessible
Mahlo
weakly compact
Ramsey
measurable
supercompact
extendible
huge
superhuge
super-n-huge for every n ∈ ω
In previous sections, we have already encountered inaccessible cardinals (p. 120), and the
essential ingredients of measurable cardinals have also been introduced (Theorem 60 together
with Remark 18). Since we will have more to say about measurable cardinals in the next section,
we give a definition here:
Definition 10 (Measurable Cardinals). Suppose κ is an infinite cardinal and U is an ultrafilter
on κ. Then U is said to be κ-complete if, for any collection {Xα | α < λ} of elements of U, with
λ < κ, we have that ⋂_{α<λ} Xα ∈ U. If κ is uncountable, we say that κ is a measurable cardinal if
there is a nonprincipal, κ-complete ultrafilter on κ.
In Theorem 60, we showed how a nonprincipal ω1-complete ultrafilter on some set A could
be obtained from a Dedekind self-map j : V → V with sufficiently strong preservation properties,
and that the presence of such an ultrafilter guaranteed existence of many inaccessible cardinals.
Following this thread for a moment, it can be shown that, whenever such an ultrafilter on a set
A exists, there is a measurable cardinal κ such that κ ≤ |A|, with a corresponding κ-complete
ultrafilter D. It can then be shown that the set {λ < κ | λ is inaccessible} belongs to D, so that,
in a sense, “almost all” cardinals below κ are inaccessible. It is by way of this fact that the
conclusion to Theorem 60 can be demonstrated.
⁷² For most large cardinals κ, this model is the κth stage of the universe, V_κ, but in some cases, especially for some of the smaller large cardinals (for instance, the weakly inaccessibles), a modification of V_κ is needed. The usual way to handle these special cases is to use the relativized model V_κ^L, where L denotes Gödel’s constructible universe (another proper class model of set theory) and where the notation V_κ^L signifies “the κth stage of the universe, built inside L.”
22 Some Strengthenings of Dedekind Self-Maps to Account for Large Cardinals
In Section 15, we saw how certain combinations of preservation properties belonging to a class
Dedekind self-map j : V → V resulted in the emergence of an infinite set, and then, with
the introduction of a few more properties, the emergence of inaccessible cardinals. Examples 4
and 5 (p. 81 and p. 82) showed that class Dedekind self-maps with the required properties, at
least in some cases, could be constructed under sufficiently strong assumptions. In this section,
we begin by showing that the preservation properties of the Dedekind self-map described in
Theorem 60, which produced many inaccessibles, can be realized under the assumption of a
measurable cardinal. We also give a class Dedekind self-map example to show that the properties
of Theorem 59 can also be realized, and carry this work further still to give a much fuller account
of nearly all large cardinals.
We reproduce Theorem 60 here, and then give the related example.
Theorem 60 (ZFC − Infinity) Suppose j : V → V is a functor, having a strong critical point,
which preserves countable disjoint unions, intersections, equalizers, the empty set, and terminal
objects, and has a weakly universal element a ∈ j(A) for j, for some set A. Then there is a
collection 𝒜 of subsets of A with the following properties:
(1) For every X ∈ 𝒜, |X| is inaccessible.
(2) If X, Y ∈ 𝒜 are distinct, then |X| ≠ |Y|.
(3) The size of 𝒜 itself is inaccessible.
As was mentioned briefly before, the hypotheses of Theorem 60 imply the existence of a
measurable cardinal κ ≤ |A|. The next example exhibits a functor V → V having the properties
listed in Theorem 60:
Example 8. Suppose κ is a measurable cardinal and D is a nonprincipal, κ-complete ultrafilter
on κ. Define jD : V → V as was done in the proof of Theorem 60: jD(X) = X^κ/D. Defining
jD on functions f : X → Y by jD(f)([g]) = [f ◦ g], as before, turns jD into a functor. The
proof of the fact that jD is 1-1 and preserves disjoint unions, intersections, and terminal objects
is essentially the same (replacing ω with κ) as the corresponding verifications given in Example 4.
Claim 1. [idκ] ∈ jD(κ). Moreover,
\[ D = \{X \subseteq \kappa \mid [\mathrm{id}_\kappa] \in j_D(X)\}. \tag{$*$} \]
In addition, [idκ] ∈ jD(κ) is a universal element for jD.
Proof. The proof of (∗) is as in the proof of Theorem 60. The proof that [idκ ] ∈ jD (κ) is a
universal element for jD follows the logic given on p. 100.
Claim 2. jD preserves countable disjoint unions.
Proof. Suppose X = ⋃_{n∈ω} Xn is a countable disjoint union. It is clear that the sets in
{Xn^κ/D | n ∈ ω} are also disjoint and that ⋃_{n∈ω} (Xn^κ/D) ⊆ X^κ/D. To show the reverse inclusion,
let f : κ → X. For each n ∈ ω, let Sn = {α < κ | f(α) ∈ Xn}. The sets Sn are disjoint and their
union is κ ∈ D, so, by κ-completeness, some Sn belongs to D. It follows that
[f] ∈ Xn^κ/D ⊆ ⋃_{n∈ω} (Xn^κ/D).
Claim 3. jD preserves equalizers.
Proof. Let f, g : X → Y and let E = E_{f,g} = {x ∈ X | f(x) = g(x)}. We show that
jD(E) = E^κ/D is the equalizer E_{jD(f),jD(g)} of jD(f), jD(g):
\[
\begin{aligned}
[h] \in j_D(E) &\iff \{\alpha < \kappa \mid h(\alpha) \in E\} \in D \\
&\iff \{\alpha < \kappa \mid f(h(\alpha)) = g(h(\alpha))\} \in D \\
&\iff [h] \in E_{j_D(f),\,j_D(g)}.
\end{aligned}
\]
Claim 4. κ is a strong critical point for jD.
Proof. By κ-completeness and the fact that D is nonprincipal, all members of D have size κ; in
particular, all final segments [α, κ) belong to D. We show κ < |jD(κ)| = |κ^κ/D|. Let ⟨fα | α < κ⟩
be a sequence of κ functions κ → κ. Define g : κ → κ by
g(α) = sup{fβ(α) | β < α} + 1.
Then for each β, {α | fβ(α) < g(α)} ⊇ (β, κ) ∈ D, so [g] ≠ [fβ]. Since no sequence of κ functions can represent every element of κ^κ/D, we have |κ^κ/D| > κ, as required.
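The diagonal function g can be tried out with ω in place of κ. The following is a minimal sketch under that assumption; the sample family `family` and the bound `WINDOW` are hypothetical, and the check only covers a finite window rather than a full final segment.

```python
def make_diagonal(family):
    """Return g with g(alpha) = sup{f_beta(alpha) | beta < alpha} + 1,
    where family(beta) is the function f_beta (sup of the empty set is 0)."""
    def g(alpha):
        values = [family(beta)(alpha) for beta in range(alpha)]
        return max(values, default=0) + 1
    return g

def family(beta):
    """Hypothetical sample sequence: f_beta(alpha) = beta * alpha + beta."""
    return lambda alpha: beta * alpha + beta

g = make_diagonal(family)

# Check f_beta(alpha) < g(alpha) for every alpha in the interval (beta, WINDOW).
WINDOW = 30
dominated = all(
    family(beta)(alpha) < g(alpha)
    for beta in range(WINDOW)
    for alpha in range(beta + 1, WINDOW)
)
```

Each `f_beta` is strictly below `g` at every checked point past `beta`, which is the finite shadow of the statement that {α | fβ(α) < g(α)} contains the final segment (β, κ).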
It can also be shown [10] that κ is a critical point of jD and the least ordinal moved by jD .
What’s more, since D is (at least) ω1-complete, it can be shown that, if we identify κ^κ/D with
its transitive collapse, [idκ ] is mapped to κ; in this sense, κ ∈ jD (κ) itself is a weakly universal
element of jD .
This Example, combined with Theorem 60, provides the following characterization: There
is a measurable cardinal if and only if there is a Dedekind self-map j : V → V that is a functor,
preserves unions, intersections, equalizers, the empty set, and terminal objects, and that has a
weakly universal element. In [3], Trnková-Blass strengthen this characterization by using the
notion of exact functor: A functor is exact if it preserves all finite limits and colimits.⁷³ The
main result of [3] is the following:
⁷³ Limits and colimits are informally defined on p. 104.
Theorem 61 (Trnková-Blass Theorem). There exists a measurable cardinal if and only if there
is an exact functor from V to V having a strong critical point.
We consider next an example showing that the properties of a j : V → V mentioned in
Theorem 59 can be realized. We reproduce the theorem here for easy reference.
Theorem 59. (ZFC − Infinity) Suppose j : V → V is a Dedekind self-map that preserves ∈ and
preserves ordinals. Let κ = crit(j).
(A) Suppose j preserves countable disjoint unions, the empty set, and singletons. Then κ > ω.
(B) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, and
unboundedness. Then κ is an uncountable regular cardinal.
(C) Suppose j is BSP and preserves countable disjoint unions, the empty set, singletons, unboundedness, and power sets. Then κ is an inaccessible cardinal.
We mentioned in Section 20 that any elementary embedding j : V → V for which (V, ∈, j)
satisfies ZFC + BTEE will possess all the preservation properties that we have required of a class
Dedekind self-map in the previous few sections. We formulate this observation as a proposition:
Proposition 62. Suppose (V, ∈, j) is a model of ZFC + BTEE, and crit(j) = κ. Then
(A) j : V → V is a Dedekind self-map, with critical point κ and strong critical point κ.
(B) j is BSP and preserves ∈, ordinals, countable disjoint unions, the empty set, singletons,
unboundedness, and power sets. j also reflects ordinals.
(C) j preserves intersections, equalizers, and terminal objects.
(D) j is an exact functor.
Remark 19. Part (B) lists the properties mentioned in Theorem 59, while (C) lists the additional
properties mentioned in Theorem 60.
Proof. We prove some of these claims. Assume, as in the hypothesis, that (V, ∈, j) is a model
of ZFC + BTEE and crit(j) = κ.
Claim 1. j preserves ∈.
Proof. Suppose u ∈ v. Consider the formula φ(x, y) : x ∈ y; certainly V |= φ[u, v]. Then by
elementarity, V |= φ[j(u), j(v)]; in other words, j(u) ∈ j(v).
Claim 2. j preserves and reflects ordinals.
Proof. Let φ(x) be the formula that asserts “x is an ordinal.” Let α be an ordinal. Then
V |= φ[α]. By elementarity, V |= φ[j(α)]. Therefore, j(α) is an ordinal. (To be more precise,
we would unwind the definition of an ordinal and verify that the elements of the definition are
preserved by j. x is an ordinal if and only if (x, ∈) is a well-ordering and x is transitive. x is
transitive if and only if, for all u, v, if u ∈ x and v ∈ u, then v ∈ x. Suppose ψ(x) asserts “x
is transitive.” Then, formally, ψ(x) ≡ ∀u∀v (u ∈ x ∧ v ∈ u → v ∈ x). So, if w is transitive,
V |= ψ[w]. By elementarity, V |= ψ[j(w)]; that is, V |= ∀u∀v (u ∈ j(w) ∧ v ∈ u → v ∈ j(w)).
It follows that j(w) is transitive. Similar steps of analysis allow us to conclude that if (x, ∈) is a
well-ordering, so is (j(x), ∈). From these observations, whenever α is an ordinal, j(α) is also an
ordinal. In future arguments, we will be more informal.)
Suppose now that β is not an ordinal; it follows that V |= ¬φ[β]. By elementarity, V |=
¬φ[j(β)], and so j(β) is not an ordinal. This proves that j reflects ordinals.
Claim 3. j is a Dedekind self-map, with critical point κ. Moreover, j(κ) > κ.
Proof. Since j(κ) ≠ κ (by assumption), and since, by Lemma 53, j(κ) ≥ κ, it follows that
j(κ) > κ. Next, suppose for a contradiction that κ ∈ ran j. Let j(x) = κ. Since j reflects ordinals, x is an ordinal. If x < κ,
then j(x) = x ≠ κ, by leastness; and x ≠ κ because j(κ) > κ; so x > κ. But then j(x) = κ < x, contradicting Lemma 53. The
fact that j is 1-1 follows from elementarity: Let φ(x, y) be the formula “x ≠ y.” Suppose u ≠ v.
Then V |= φ[u, v], so V |= φ[j(u), j(v)], and j(u) ≠ j(v). So j is 1-1.
Claim 4. j is BSP and j is a functor.
Proof. Recall that the formula φ(x) that asserts “x is a cardinal” is formulated as follows:
φ(x) ⇐⇒ x is an ordinal and for all α < x there is no 1-1 correspondence from α to x.
Clearly, if γ is a cardinal, then V |= φ[γ]; by elementarity, V |= φ[j(γ)]. In particular
V |= j(γ) is an ordinal and for all α < j(γ), there is no 1-1 correspondence from α to j(γ).
Thus, j(γ) is a cardinal.
We verify that j is a functor. If f : A → B, since
V |= “domain of f is A, codomain of f is B”,
we have
V |= “domain of j(f ) is j(A), codomain of j(f ) is j(B)”,
so j(f ) : j(A) → j(B). A simple application of elementarity shows that j preserves composition
of functions and preserves identity functions.
We verify that j preserves functions. Suppose f : A → B. Then for all a ∈ A, by elementarity,
j(a) ∈ j(A). Note that, for each a ∈ A and each b ∈ B, if f (a) = b, then we have:
V |= “b is the value of f at a.”
and so
V |= “j(b) is the value of j(f ) at j(a).”
But this implies that j(f )(j(a)) = j(b), as required.
We omit the proof that j preserves images.
Claim 5. j preserves intersections and equalizers.
Proof. Suppose A, B are sets. Then by elementarity of j,
j(A ∩ B) = j({x ∈ A | x ∈ B}) = {x ∈ j(A) | x ∈ j(B)} = j(A) ∩ j(B).
Suppose f, g : A → B are functions. Then
j(E_{f,g}) = j({x ∈ A | f(x) = g(x)}) = {x ∈ j(A) | (j(f))(x) = (j(g))(x)} = E_{j(f),j(g)}.
Claim 6. j is exact.
Proof. We show j is left exact; the proof that j is right exact is similar. It suffices to show that
j preserves products, equalizers, and terminal objects. We have already established the latter
two of these. If A, B are sets, we have
j(A × B) = j({(a, b) | a ∈ A and b ∈ B}) = {(a, b) | a ∈ j(A) and b ∈ j(B)} = j(A) × j(B).
Corollary 63. (ZFC − Infinity) Suppose (V, ∈, j) is a model of ZFC + BTEE, with κ = crit(j).
Then κ is inaccessible.
Proof. This follows from Theorem 59 and Proposition 62.
Exhibiting models of ZFC + BTEE of the form (V, ∈, j), where V is the universe of sets,
presents a peculiar problem. To see the difficulty, recall how we were able to obtain an example
of the properties of Theorem 60: We started with a nonprincipal κ-complete ultrafilter D and
defined a jD : V → V with the required properties. However, Kunen’s Theorem [20] tells us that
it is impossible to define a j : V → V for which (V, ∈, j) |= ZFC + BTEE; such a j, if it exists at
all, must be undefinable (in V ).
Assuming large cardinals, one can define set models, and even certain class models, of
ZFC + BTEE. We describe such a class model next in the spirit of providing examples to show
that the conditions described in our earlier results—in this case, Theorem 59—can be realized.
Example 9 (Gödel’s Constructible Universe). Let L denote Gödel’s constructible universe, whose
construction we describe now. L, like V , is built in stages, except that at each successor step of
the construction, instead of requiring the next stage to contain all subsets of the previous stage,
we require only the subsets of the previous stage that are definable (with parameters) from the
previous stage. Suppose X is a set and M is a model of a reasonable fragment of ZFC. Then X is
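a definable subset of M with parameters if, for some formula φ(x, y1, . . . , yn) and some elements
u1, . . . , un ∈ M, X = {w ∈ M | M |= φ[w, u1, . . . , un]}. We can define the stages of L:
\[
\begin{aligned}
L_0 &= \emptyset \\
L_{\alpha+1} &= \{X \subseteq L_\alpha \mid X \text{ is a definable subset of } L_\alpha \text{ with parameters}\} \\
L_\lambda &= \bigcup_{\alpha<\lambda} L_\alpha \qquad (\lambda \text{ a limit}) \\
L &= \bigcup_{\alpha \in \mathrm{ON}} L_\alpha
\end{aligned}
\]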
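The first few stages can be computed directly. The sketch below rests on one assumption that should be stated plainly: at finite stages every subset of a stage is definable with parameters (a finite set can be described by listing its elements), so the definable subsets coincide with all subsets and the full power set can be used. The difference between the constructible stages and the full power set only appears at infinite stages, which this code does not reach. The helper names are hypothetical.

```python
from itertools import combinations

def power_set(s):
    """All subsets of the finite set s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for r in range(len(items) + 1)
        for combo in combinations(items, r)
    )

def finite_stages(n):
    """Return [L_0, ..., L_n], using the full power set at successor steps
    (valid only for finite stages; see the assumption above)."""
    stages = [frozenset()]
    for _ in range(n):
        stages.append(power_set(stages[-1]))
    return stages

stages = finite_stages(4)
sizes = [len(stage) for stage in stages]

# The stages are cumulative: each one is contained in the next.
cumulative = all(stages[i] <= stages[i + 1] for i in range(len(stages) - 1))
```

The sizes grow as 0, 1, 2, 4, 16, and each stage is a subset of the next.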
In the presence of large enough cardinals, such as a measurable cardinal, it is known that L
is much smaller than the full universe V and, in this situation, L can be used as the underlying
class for a model of ZFC + BTEE. We state the result, which is due to Kunen:⁷⁴
Theorem 64 (Kunen). Suppose there is a measurable cardinal. Then there is an elementary
embedding j : L → L such that ⟨L, ∈, j⟩ |= ZFC + BTEE. Moreover j is definable in V but not
in L.
Although we do not have a way of explicitly defining examples of Dedekind self-maps j :
V → V for which ⟨V, ∈, j⟩ is a model of ZFC + BTEE, the pattern of generalization that we
have described in the previous sections, introducing more and more preservation properties,
seems to us to be quite natural. Of course, one could introduce “natural”-looking preservation
properties that are not consistent with each other, so one must verify the consistency of the
various combinations of properties that are proposed. We have been able to do this so far using
explicitly defined Dedekind self-maps j : V → V . Though we will not be able to do this as we
introduce strengthenings of BTEE, it is possible to check consistency of the properties involved
by building set models, or, occasionally, class models as we have done in Theorem 64.
By Kunen’s Theorem, the embedding j of Theorem 64 could not be definable in L, but it
turns out to be quite important for j to be definable (or at least partially definable) in V itself.
By Proposition 62, j : L → L has all the properties mentioned in Theorem 59, except for the fact
that L ≠ V.
It turns out that the conclusion of Theorem 64 (that there is a transitive model of ZFC +
BTEE) follows from a large cardinal notion that is weaker than measurable; for instance, assuming there is a Ramsey cardinal, the conclusion of the theorem continues to hold true. Now
suppose we have a model ⟨L, ∈, j⟩, as in the theorem (obtained from existence of a Ramsey cardinal). It is not hard to see that, reasoning as in Proposition 62, j : L → L is an exact functor
with a strong critical point. By the Trnková-Blass Theorem, one might naturally conclude that,
at least relative to the universe L, there should exist a measurable cardinal. This, however, is
not the case: it is known that measurable cardinals do not exist in L, and one cannot, under
any circumstances, prove the existence (or consistency) of a measurable cardinal assuming only
existence of a Ramsey cardinal. A closer look at the proof of the Trnková-Blass Theorem, or
equally well at the proof of Theorem 60, identifies the problem: The definition of the ultrafilter
D from j makes sense only if j itself is definable. In the context of Theorem 60, it is implicitly
assumed that the Dedekind self-map j : V → V in the hypothesis is definable in V; if it were
not, defining D by D = {X ⊆ κ | κ ∈ j(X)} provides no guarantee that D is a set. But no such
assumption about j can be made when j : M → M is supposed to be an elementary embedding
with a least ordinal moved, where M is a model of ZFC, because Kunen’s Theorem shows that
definable embeddings of this kind do not exist. Therefore, the class D cannot be taken to be a set, and the argument cannot be completed. In this case, in which j is not definable, exactness of j (and the
presence of a strong critical point) is not sufficient to produce a measurable cardinal.
⁷⁴ Hereafter, when we talk about Kunen’s Theorem, we will not be referring to the result mentioned in Theorem 64, but rather the result that says there is no weakly definable nontrivial elementary embedding j : V → V.
This observation suggests a way to strengthen the theory ZFC + BTEE further. It is known
that ZFC + BTEE, on its own, implies the existence of a weakly compact cardinal (and somewhat
more). By adding to the axioms of ZFC + BTEE an axiom that asserts that the ultrafilter
naturally defined from j is a set (this can be formulated as an instance of the Separation axiom
for j-formulas), the theory becomes much stronger.
Measurable Ultrafilter Axiom (MUA). The class {X ⊆ κ | κ ∈ 𝐣(X)} is a set.
We remark that the boldface 𝐣 signifies that it is a symbol of the expanded language of set theory
that we are working in: ZFC + BTEE + MUA. Any model ⟨V, ∈, j⟩ of ZFC + BTEE + MUA
will interpret 𝐣 as a nontrivial elementary embedding j : V → V, where V contains, as one of its
elements, {X ⊆ κ | κ ∈ j(X)}.
Theorem 65. The theory ZFC + BTEE + MUA proves the existence of many measurable cardinals.
Moreover, existence of a supercompact cardinal (even a 2^κ-supercompact, where κ is the critical
point of the embedding) is sufficient to produce a transitive model of ZFC + BTEE + MUA.
In [7], a variety of strengthenings of the theory ZFC + BTEE, like ZFC + BTEE + MUA, are
studied. The stronger the models are, the more fully they provide an account of the known large
cardinals.
Before examining the ultimate limit of this process of generalization, we take a closer look
at the theory ZFC + BTEE + MUA to see the extent to which our original intuitions, based
on our analysis of Dedekind self-maps and the bare notion of “infinity,” have been successful in
leading to an account for the existence of “many” measurable cardinals—entities that exist in
the middle-range of the spectrum of large cardinals.
23 The Theory ZFC + BTEE + MUA
We recall significant properties of set Dedekind self-maps, which we expected would generalize
to the context of Dedekind self-maps V → V in our search for a natural account of the origin
of large cardinals (p. 60). We found that if j : A → A is a Dedekind self-map on a set A with
critical point a, j exhibits the following characteristics:
Properties of Set Dedekind Self-Maps
(A) The self-map j preserves essential properties of its domain (A is Dedekind infinite, and so
is the image j[A] of j) and of itself (the property of being a Dedekind self-map propagates
to the restriction j ↾ j[A]).
(B) The definition of j entails a critical point that plays a key role in its dynamics. One aspect
of those dynamics is that repeated restrictions of j to successive images give rise to a critical
sequence W = {a, j(a), j(j(a)), . . .} which is a precursor to the set ω of finite ordinals; the
emergence of this critical sequence provides a strong analogy to the ancient and quantum
field theoretic perspective that “particles arise from the dynamics of an unbounded field,”
where, in this context, particular finite ordinals are viewed as “particles.” The sequence of
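restrictions j0, j1, j2, . . . of j that produce the critical sequence is defined by the clauses:
\[
\begin{array}{lll}
A_0 = A & j_0 = j : A \to A & \operatorname{crit}(j_0) = a \\
A_1 = j[A_0] & j_1 = j \upharpoonright A_1 & \operatorname{crit}(j_1) = j(a) \\
A_{n+1} = j[A_n] & j_{n+1} = j \upharpoonright A_{n+1} & \operatorname{crit}(j_{n+1}) = j^{n+1}(a).
\end{array}
\]
Therefore crit(jn) = j^n(a) is a critical point of jn.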
(C) Through the interplay between j and its critical point, a blueprint (j ↾ W, a, E) of the set ω
of finite ordinals arises. In particular, j ↾ W is a Dedekind self-map, and for every n ∈ ω,
there is i ∈ E such that i(j ↾ W)(a) = n.
(D) In special cases (starting with a transitive set A that is closed under pairs), there also arises
a strong blueprint (S ↾ W′, F ↾ W′, ∅, E′), where S is the singleton map x ↦ {x} (which is
a Dedekind self-map), W′ = {∅, {∅}, {{∅}}, . . .}, F takes a set to an ∈-minimal element of
the set (F is a co-Dedekind self-map and S ↾ A is a section of F), and E′ has the property
that for all x ∈ W′ there are i, k ∈ E′ such that i(S ↾ W′)(∅) = x and k(F ↾ W′)(x) = ∅. In
this case, the strong blueprint gives expression to both the generating effect of a Dedekind
self-map and the collapsing effect of the dual map, a co-Dedekind self-map.
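The clauses in (B) can be traced on a toy example. The sketch below is only an illustration under stated assumptions: it takes A to be the natural numbers and j(n) = 2n + 1, both chosen here for convenience and not taken from the text, so it presupposes the very numbers the paper aims to derive. It computes the first terms of the critical sequence and checks that j^n(a) lies in A_n but not in A_{n+1}.

```python
from typing import List, Optional


def j(n: int) -> int:
    """Hypothetical Dedekind self-map on the naturals: injective, not surjective."""
    return 2 * n + 1


def j_inverse(m: int) -> Optional[int]:
    """Return the unique preimage of m under j, or None if m is not in the image."""
    return (m - 1) // 2 if m % 2 == 1 else None


def in_A(x: int, n: int) -> bool:
    """Decide whether x belongs to A_n = j^n[A] by undoing j exactly n times."""
    current: Optional[int] = x
    for _ in range(n):
        current = j_inverse(current)
        if current is None:
            return False
    return True


def critical_sequence(a: int, length: int) -> List[int]:
    """Return [a, j(a), j(j(a)), ...] with the requested number of terms."""
    seq = [a]
    for _ in range(length - 1):
        seq.append(j(seq[-1]))
    return seq


# 0 is a critical point of this toy j, since 0 is not of the form 2n + 1.
W = critical_sequence(0, 6)
```

In this toy setting each term j^n(0) belongs to A_n and is missing from A_{n+1}, which is the sense in which it is a critical point of the restriction j_n.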
Theorems 46, 47, and 49 illustrate the role of preservation properties, motivated by (A), in
strengthening preservation properties of Dedekind self-maps j : V → V sufficiently to give rise to
infinite sets. Moreover, the infinite set generated by each of the self-maps of Theorems 46 and 49
arose as the (strong) critical point of the self-map, as anticipated by (B). Theorem 47 accords
with (B) in another way: The critical point of the map is the seed for defining a nonprincipal
ultrafilter on a set A; the existence of such an ultrafilter implies that its underlying set is infinite. Elaborating on these results through the introduction of additional preservation properties
led to an account of two types of large cardinals—inaccessibles (Theorem 59) and measurables
(Theorem 60). In Theorem 59, the critical point turned out to be inaccessible; in Theorem 60,
the critical point was the seed that was used to build an ω1 -complete nonprincipal ultrafilter, the
existence of which implies the existence of a measurable cardinal. Our approach suggests that
these types of large cardinals arise in the same “natural” way as infinite sets do, from Dedekind
self-maps V → V with suitable preservation properties, and with the critical point playing a key
part in the emergence of the large cardinal at hand.
Generalizing preservation properties of Dedekind self-maps to the notion of elementary embedding, as we do when we consider the theory ZFC+BTEE, provides the ultimate generalization
of (A). If j : V → V is a Dedekind self-map given by a transitive model of ZFC + BTEE, its
critical point is necessarily a large cardinal—at least a weakly compact cardinal.
Examining the Dedekind self-maps j : V → V for any model of ZFC + BTEE + MUA
will show how other characteristics of Dedekind self-maps that we have identified (and listed in
(B)–(D) above) come into play to give rise to still stronger large cardinals. Since such a j is
a ZFC + BTEE embedding, it is already clear that j generalizes properties mentioned in (A)
and (B) (though the significance of repeatedly restricting j to subsets of its domain is not yet
apparent). In particular, in the theory ZFC + BTEE + MUA, if κ is the (weak) critical point of
j, κ is measurable, and is in fact the κth measurable.
For ZFC + BTEE + MUA there are also natural analogues to (C)–(D), which we describe
next. The fact that the collection Uj = {X ⊆ κ | κ ∈ j(X)}, where j : V → V is the embedding
provided by a model of ZFC + BTEE + MUA (with critical point κ) turns out to be a set in
ZFC+BTEE+MUA opens the door for a new construction, allowing us to generate (and collapse)
sets starting from the critical point κ. It can be shown that Uj is a nonprincipal, κ-complete
ultrafilter on κ, so we have immediately that κ is measurable. It has one additional property that
should be mentioned here: Uj is closed under diagonal intersections. This means that, whenever
⟨Xα : α < κ⟩ are elements of Uj, the diagonal intersection {ξ < κ | ξ ∈ ⋂_{α<ξ} Xα} also belongs to
Uj . Any nonprincipal, κ-complete ultrafilter on an uncountable cardinal κ that is closed under
diagonal intersections is said to be a normal measure on κ. We will therefore refer to Uj as the
normal measure derived from j.
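To make the set operation concrete, here is a small sketch of the diagonal intersection on a finite index set. This is an assumption-laden toy: a finite κ carries no nonprincipal ultrafilter, so nothing here models a normal measure; the code only shows how the diagonal intersection differs from the ordinary intersection. The family of "tail" sets is a hypothetical example chosen for the illustration.

```python
def diagonal_intersection(family, kappa):
    """Return {xi < kappa | xi belongs to family[alpha] for every alpha < xi}."""
    result = set()
    for xi in range(kappa):
        # For xi = 0 the condition is vacuous, so 0 always belongs to the result.
        if all(xi in family[alpha] for alpha in range(xi)):
            result.add(xi)
    return result


kappa = 8
# Hypothetical family: X_alpha is the tail {xi | alpha < xi < kappa}.
family = [set(range(alpha + 1, kappa)) for alpha in range(kappa)]

diagonal = diagonal_intersection(family, kappa)
plain = set.intersection(*family)
```

For this family the ordinary intersection is empty while the diagonal intersection is all of κ, which suggests why closure under diagonal intersections is a natural requirement where closure under κ-many ordinary intersections would be too much to ask.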
As we will show now, if j : V → V is an MUA-embedding with critical point κ, we can
obtain from j, κ a blueprint (ℓ, κ, E) for the stage Vκ+1 of the universe, where ℓ : Vκ → Vκ is
the blueprint map and E consists of certain elementary embeddings (described below). In other
words, the blueprint generates all sets in the universe through stage Vκ+1.
In order to understand the definitions of ℓ and E it is necessary to introduce a method for
constructing new universes of set theory—called the ultrapower construction. We will describe
the construction and its properties without proofs; proofs may be found in [19].
Suppose κ is a measurable cardinal and U is a κ-complete nonprincipal ultrafilter over κ.
Recall that the collection V^κ consists of all functions having domain κ. We declare two such
functions f, g equivalent, and write f ∼U g, if f and g agree on a set in U, that is, if {α | f(α) =
g(α)} ∈ U. It is easy to check that ∼U is an equivalence relation. We denote the equivalence
class containing f by [f]. Since [f] is a proper class, we re-define it in the following way: Let g be
a function κ → V of least rank that is equivalent to f; so, g ∼U f and g is of least possible rank
with this property. Then define [f] to be {h : κ → V | h ∼U f and rank(h) = rank(g)}. Under
this definition, [f] is always a set (being a subset of some Vα).
Let V^κ/U = {[f] | f : κ → V}. Two elements [f], [g] of V^κ/U are equal if f ∼U g. For
membership, we write [f] ∈∗ [g] if and only if {α | f(α) ∈ g(α)} ∈ U. With these definitions
of equality and membership, V^κ/U can be shown to be a model of ZFC. Moreover, because
U is κ-complete, V^κ/U is a well-founded model; this means that there is no infinite decreasing
∈∗-chain in V^κ/U, i.e., no sequence f1, f2, f3, . . . such that
. . . ∈∗ [f3] ∈∗ [f2] ∈∗ [f1].
Because V^κ/U is well-founded and has the property that for each [f] ∈ V^κ/U, {[g] ∈ V^κ/U |
[g] ∈∗ [f]} is a set, Mostowski’s Collapsing Theorem can be applied to produce an isomorphic
transitive image (N, ∈) of V^κ/U, which of course is also a model of ZFC. We identify elements
of V^κ/U with those of N. We explain this point a bit more. Note that if π : V^κ/U → N is the
collapsing isomorphism, then [f] ∈∗ [g] if and only if π([f]) ∈ π([g]); we will identify π([f]) with
[f].
Finally, we mention that the map iU : V → V^κ/U ≅ N defined by iU(x) = [cx], where cx is
the constant function with value x, is an elementary embedding with critical point κ. The map
iU is called the canonical embedding derived from U .
We now start moving toward the construction of the blueprint (ℓ, κ, E) for Vκ+1. The map ℓ
will turn out to be a co-Dedekind self-map, and the elements of E will be restrictions of ultrapower
embeddings. We will be able to show that for each X ⊆ Vκ, there is a normal measure U on κ
such that if iU is the canonical embedding derived from U, iU(ℓ)(κ) = X. In this way, every set
in the stage Vκ+1, the initial part of the universe up to subsets of Vκ, can be seen as arising from
or being generated by the interplay of κ, j and ℓ, where ℓ is encoded and decoded by E.
The essence of the construction of ℓ is a Vκ+1-Laver function. For any set X, a function
f : κ → Vκ is an X-Laver function at κ if, for each x ∈ X, there is a normal measure U on κ
such that iU(f)(κ) = x. The function f is defined by a clever method of encoding information
about all possible normal measures over κ. We define f and then explain how ℓ is obtained from
f. We first define a formula that is needed both in the definition of f and in the proof that it
has the desired properties:
ψ(g, x, λ) : g : λ → Vλ ∧ x ⊆ Vλ ∧ for all normal measures U on λ, iU(g)(λ) ≠ x.
When ψ(g, x, λ) holds true, it means that g is not a Vλ+1 -Laver sequence at λ: Some subset
x of Vλ cannot be computed as iU (g)(λ) for any choice of U . We can now define f :
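\[
(\ast)\qquad
f(\alpha) =
\begin{cases}
\emptyset & \text{if } \alpha \text{ is not a cardinal or } f \restriction \alpha \text{ is } V_{\alpha+1}\text{-Laver at } \alpha,\\
x & \text{otherwise, where } x \text{ satisfies } \psi(f \restriction \alpha, x, \alpha).
\end{cases}
\]
The definition tells us that f(α) has nonempty value just when the restriction f ↾ α is not
Vα+1-Laver at α, and in that case, its value is a witness to non-Laverness.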
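For readers who want to unpack the condition iU(f)(κ) = x that defines Laverness, the following short computation may help. It relies on two standard facts about the ultrapower by a normal measure U on κ that are not stated in the text and are assumed here: Łoś's Theorem, and the fact that the identity function id : κ → κ represents κ, that is, [id] = κ after the Mostowski collapse. Under those assumptions, for any f : κ → Vκ:

```latex
\begin{align*}
i_U(f)(\kappa)
  &= [c_f]\bigl([\mathrm{id}]\bigr)
     && \text{since } i_U(f) = [c_f] \text{ and } [\mathrm{id}] = \kappa \\
  &= \bigl[\alpha \mapsto c_f(\alpha)\bigl(\mathrm{id}(\alpha)\bigr)\bigr]
     && \text{by \L o\'s's Theorem} \\
  &= [\alpha \mapsto f(\alpha)] \;=\; [f].
\end{align*}
```

On this reading, saying that iU(f)(κ) = x amounts to saying that f itself represents x in the ultrapower by U.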
Theorem 66 (Vκ+1 -Laver Functions Under MUA). The function f defined in (∗) is a Vκ+1 -Laver
function at κ.
Proof. Let j : V → V be the Dedekind self-map, with critical point κ, given to us in a model of
ZFC + BTEE + MUA. Suppose f is not Vκ+1 -Laver at κ, so, in particular, for some y, ψ(f, y, κ)
holds. We consider j(f ) : j(κ) → Vj(κ) .
Notice that j(f) ↾ κ = f: For each α < κ, we have j(f)(α) = j(f)(j(α)) = j(f(α)) = f(α)
(since f (α) ∈ Vκ ). By elementarity, j(f ) has the same definition as f :
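\[
j(f)(\alpha) =
\begin{cases}
\emptyset & \text{if } \alpha \text{ is not a cardinal or } j(f) \restriction \alpha \text{ is } V_{\alpha+1}\text{-Laver at } \alpha,\\
x & \text{otherwise, where } x \text{ satisfies } \psi(j(f) \restriction \alpha, x, \alpha).
\end{cases}
\]
In particular, since f = j(f) ↾ κ is not Vκ+1-Laver, j(f)(κ) is a witness to non-Laverness and
ψ(j(f), j(f)(κ), κ) is true.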
Let D = Uj be the normal measure derived from j; that is, D = {X ⊆ κ | κ ∈ j(X)}.
By MUA, D is a set. Let i = iD : V → V^κ/D ≅ N be the canonical embedding, and define
k : N → V by k([h]) = j(h)(κ). One can show that k is an elementary embedding with critical
point > κ and makes the following diagram commutative:
\[
\begin{array}{ccc}
V & \xrightarrow{\ j\ } & V \\
 & {\scriptstyle i_D}\searrow & \big\uparrow{\scriptstyle k} \\
 & & N
\end{array}
\]
By the diagram, we compute:
\[
\begin{aligned}
j(f)(\kappa) &= (k \circ i)(f)(\kappa)\\
&= k(i(f))\,(k(\kappa))\\
&= k\bigl(i(f)(\kappa)\bigr).
\end{aligned}
\]
Notice that, since j(f)(κ) ⊆ Vκ, k(j(f)(κ)) = j(f)(κ). By 1-1-ness of k, therefore, since
k(j(f)(κ)) = j(f)(κ) = k(i(f)(κ)),
it follows that
j(f)(κ) = i(f)(κ).
The import of this last equation is that, while it is claimed that ψ(j(f ), j(f )(κ), κ) holds true, we
have just exhibited a normal measure D on κ such that iD (f )(κ) = j(f )(κ). We conclude that f
is Vκ+1 -Laver after all.
We mention briefly the significance of having V itself as the codomain of j. If the codomain
of j were some smaller transitive model M, then the assumption that f is not Vκ+1-Laver at κ would still imply that
the triple j(f), j(f)(κ), κ satisfies the formula ψ(g, x, λ), but this formula would now be relativized to the
model M, and so would assert only that for no normal measure U in M is it true that iU(f)(κ) = j(f)(κ).
So, although it is true that for D as defined in the proof, iD (f )(κ) = j(f )(κ), D may not be one
of M ’s normal measures, and so would not give us a contradiction.
We turn to the construction of ℓ:
\[
(+)\qquad
\ell(x) =
\begin{cases}
f(x) & \text{if } x \text{ is an ordinal } < \kappa,\\
x & \text{otherwise.}
\end{cases}
\]
We now show that ℓ is the blueprint map for a blueprint (ℓ, κ, E) for Vκ+1, assuming MUA.
We will describe the members of E in a more precise way in the discussion in Remark 20, below.
Theorem 67 (Existence of Blueprint Self-Maps under MUA). (ZFC + BTEE + MUA) Suppose
j : V → V is a Dedekind self-map given by a model of ZFC + BTEE + MUA, with critical point
κ. Then the function ℓ : Vκ → Vκ defined in (+) has the following property:
(∗∗) For any set X ⊆ Vκ, there is a normal measure U on κ such that iU(ℓ)(κ) = X,
where iU : V → V^κ/U ≅ N is the canonical elementary embedding. Moreover, ℓ is a co-Dedekind
self-map.
For the rest of this section, we shall say that a self-map having the property (∗∗) has the
Laver property at κ.
Proof. Note that for all α < κ, f(α) = ℓ(α). In particular, if T = {α < κ | f(α) = ℓ(α)} and U
is a normal measure on κ, then T = κ ∈ U. Therefore, if i = iU is the canonical embedding,
κ ∈ i(T) = i({α < κ | f(α) = ℓ(α)}) = {α < i(κ) | i(f)(α) = i(ℓ)(α)}.
It follows that, for every normal measure U on κ, iU(f)(κ) = iU(ℓ)(κ). Hence ℓ has the
Laver property at κ since f does.
To see that ℓ is a co-Dedekind self-map, it is sufficient to show that, whenever f : κ → Vκ
is Vκ+1-Laver for subsets of Vκ, for each x ∈ Vκ, |f^−1(x)| = κ. Given x ∈ Vκ, let Tx = {α < κ |
f(α) = x}. Let U be a normal measure on κ such that iU(f)(κ) = x, and let i = iU. Note that i(x) = x since
x ∈ Vκ. Then since κ ∈ {α < i(κ) | i(f)(α) = x} = i(Tx), it follows that Tx ∈ U. Since every member
of a nonprincipal κ-complete ultrafilter on κ has cardinality κ, we conclude that
|f^−1(x)| = |Tx| = κ.
Remark 20 (The Blueprint Coder E). We describe the blueprint coder for the blueprint of Vκ+1 in
more detail. Suppose j : V → V is a Dedekind self-map given by a model of ZFC + BTEE + MUA,
with critical point κ. For each normal measure U on κ, let iU : V → V^κ/U ≅ MU be the canonical
embedding and let īU = iU ↾ Vκ+1 (equivalently, īU : Vκ+1 → (Vκ+1)^κ/U defined in the same way as
iU). We define E by
E = {īU | U is a normal measure on κ}.
We show that (ℓ, κ, E) is a blueprint for Vκ+1; we use the criteria specified in Definition 3
(p. 56). The map ℓ satisfies (1) because it is a co-Dedekind self-map. For (2), κ is a legitimate
blueprint seed since it is a critical point of j. For (3), parts (a)–(c), it is clear that each g : Vκ → Vκ
belongs to Vκ+1, and hence to the domain of each i ∈ E. Also, for each i ∈ E, by elementarity,
i(ℓ) has domain Vi(κ) ⊇ Vκ = dom ℓ, and κ ∈ dom i(ℓ). For the main part of (3), each i is an
elementary embedding, hence structure-preserving in the same way as j. The “compatibility”
requirement comes from [6]; in the present setting, we observe that there is i : Vκ+1 → N ∈ E
and an elementary embedding k : N → Vj(κ)+1 with the following two properties:
(a) j ↾ Vκ+1 = k ∘ i;
(b) k ↾ Vκ+1 = id_{Vκ+1}.
In the language of [6, Definition 4.18], these facts tell us that, at least in a weak sense, E is
compatible with j.⁷⁵
For (D), it is obvious that ℓ is defined from j, κ, and E. For (E), suppose X ⊆ Vκ. Since,
75
We explain more details about this compatibility criterion. In [6], familiar globally defined large cardinal
notions (like supercompactness and superhugeness) were represented by classes of set embeddings. The reason
for this representation was to provide a uniform setting for discussing existence of Laver sequences for arbitrary
globally defined large cardinals. The starting point for this study is the notion of a suitable formula.
Let θ(x, y, z, w) be a first-order formula (in the language {∈}) with all free variables displayed. We will call θ a
suitable formula if the following sentence is provable in ZFC:
∀x, y, z, w [θ(x, y, z, w) ⟹ “w is a transitive set” ∧ z ∈ ON ∧ “x : Vz → w is an elementary embedding with critical point y”].
For each cardinal κ and each suitable θ(x, y, z, w), let
Eκθ = {(i, M ) : ∃β θ(i, κ, β, M )}.
The codomain of an elementary embedding i needs to be explicitly associated with i in the definition for technical
reasons; for practical purposes, we think of Eκθ as a collection of elementary embeddings i : Vβ → M with critical
point κ.
In [6], familiar large cardinals, like supercompact and superhuge, are re-defined in terms of classes Eκθ of embeddings. With this general concept of classes of set embeddings, we defined in [6] a general notion of Laver
sequence:
Given a class Eκθ of embeddings, where θ is a suitable formula, a function g : κ → Vκ is defined to be Eκθ -Laver
at κ if for each set x and for arbitrarily large λ there are β > λ, and i : Vβ → M ∈ Eκθ such that i(κ) > λ and
i(g)(κ) = x.
A key sufficient condition for Laverness, mentioned in [6], is the notion of compatibility with an ambient elementary embedding j : V → V having critical point κ, such as the embedding given by MUA. Suppose κ < λ < β, and
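iβ : Vβ → M is an elementary embedding with critical point κ. Then iβ is compatible with j up to Vλ if there is
k : M → Vj(β) with
j ↾ Vβ = k ∘ iβ and k ↾ (Vλ ∩ M) = id_{Vλ ∩ M}.
If θ is suitable, then Eκθ is said to be compatible with j if for each λ < j(κ) there is a β > λ and i : Vβ → M ∈ Eκθ
that is compatible with j ↾ Vβ up to Vλ.
\[
\begin{array}{ccc}
V & \xrightarrow{\ j\ } & V \\[2pt]
V_\beta & \xrightarrow{\ j \restriction V_\beta\ } & V_{j(\beta)} \\
{\scriptstyle i}\big\downarrow & \nearrow{\scriptstyle k} & \\
M & &
\end{array}
\]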
In the present context of ZFC+MUA, the large cardinal under consideration is not globally defined and therefore
we do not expect to obtain a Laver sequence for such a cardinal. Our concern is to demonstrate the existence of
a Vκ+1 -Laver sequence, appropriate for strong kinds of measurable cardinals. We can adapt the definitions given
here: First, we can represent the notion of a measurable cardinal with a suitable formula θm :
θm (i, κ, β, M ) :
β = κ + 1 ∧ M is transitive ∧ i : Vβ → M is elementary.
Now κ is measurable if and only if there exist i, β, M such that θm(i, κ, β, M): If κ is measurable, let ı̄ : V → N
be an elementary embedding with critical point κ, and consider i = ı̄ ↾ Vκ+1 : Vκ+1 → V^N_{ı̄(κ)+1} = M. Conversely,
given i : Vκ+1 → M with critical point κ, define
U = {X ⊆ κ | κ ∈ i(X)}.
U is well-defined since P(κ) ⊆ Vκ+1, and is easily seen to be a normal measure on κ.
from the previous theorem, ℓ has the Laver property at κ, we can find U such that iU(ℓ)(κ) = X.
Since ℓ ∈ Vκ+1, it follows that īU(ℓ)(κ) = X as well.
Next, we show that there is a strong blueprint for Vκ+1 − Vκ, the collection of subsets of Vκ
of size κ. This strong blueprint shows the generating and collapsing effects of ℓ and its dual, ℓ^op,
which will be defined to be a certain section of ℓ. In particular, just as in the case of the strong
blueprint (S ↾ W′, F ↾ W′, ∅, E) for W′, in which every element of W′ − {∅} is returned by F to
its “source” ∅, which is the seed of the blueprint (Example 1), so likewise we will have in the
present context that all sets in Vκ+1 − Vκ are returned to the blueprint seed κ by ℓ^op; that is, for
each x ∈ Vκ+1 − Vκ, there is i ∈ E such that i(ℓ^op)(x) = κ. Note, however, that we do not claim
that there is such an i for which i(ℓ^op)(x) = κ for x ∈ Vκ. In fact, there is no way to define a
section s of ℓ so that this is true: Let s : Vκ → Vκ be a section of ℓ and let x ∈ Vκ. Let i = iU be
a canonical embedding, as usual. Then
i(s)(x) = i(s)(i(x)) = i(s(x)) = s(x) ≠ κ.
The fact that elements of Vκ are “left out” of the dynamics of the strong blueprint accords
with our expectation: For any MUA-embedding j : V → V with critical point κ, we are given κ,
and thereby Vκ as well, so the only sets that need to “return” to κ are those that lie outside of
Vκ. On a smaller scale, in Example 1, the same thing happens: All sets in W′ − {∅} are returned
to ∅, but the function F that accomplishes this does not return the seed ∅ itself to ∅ (unless it
does so by accident—recall that F (∅) has arbitrary value).
Example 10 (Strong Blueprint for Vκ+1 − Vκ ). Let j : V → V be a Dedekind self-map given by
a model of ZFC + BTEE + MUA, with critical point κ. Let (ℓ, κ, E) be a blueprint for Vκ+1. We
define the dual map ℓ^op so that (ℓ, ℓ^op, κ, E) is a strong blueprint for Vκ+1 − Vκ.
We define ℓ^op : Vκ → Vκ as follows:
\[
\ell^{\mathrm{op}}(x) =
\begin{cases}
\alpha & \text{if } \alpha \text{ is the least ordinal in } \ell^{-1}(x), \text{ if there is one},\\
y & \text{otherwise, where } y \text{ is an arbitrary element of } \ell^{-1}(x).
\end{cases}
\]
Next, we can define the notion of an X-Eκθm -Laver sequence, adapting the definition given in the main text: For
any set X, a function f : κ → Vκ is an X-Eκθm -Laver function at κ if, for each x ∈ X, there is i ∈ Eκθm such that
i(f )(κ) = x. It is straightforward to show that for any f : κ → Vκ and any set X, f is X-Laver at κ if and only if
f is X-Eκθm -Laver at κ.
These definitions naturally lead to the weak form of compatibility of Eκθm with j : V → V that is discussed in
the main text: Starting with an MUA-embedding j : V → V with critical point κ, we declare that Eκθm is locally
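compatible with j if there is i : Vκ+1 → M ∈ Eκθm that is compatible with j ↾ Vκ+1 up to Vκ+1. This definition is
equivalent to the clauses (a) and (b) given in the main text.
\[
\begin{array}{ccc}
V & \xrightarrow{\ j\ } & V \\[2pt]
V_{\kappa+1} & \xrightarrow{\ j \restriction V_{\kappa+1}\ } & V_{j(\kappa)+1} \\
{\scriptstyle i}\big\downarrow & \nearrow{\scriptstyle k} & \\
M & &
\end{array}
\]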
As ℓ^op(x) ∈ ℓ^−1(x) for each x ∈ Vκ, it is obvious that ℓ^op is a Dedekind self-map and a section of ℓ.
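The "least ordinal preimage" rule used to define the dual map can be sketched on a finite toy. This is an illustration only, under assumptions made here: the map below is a hypothetical surjection between two different finite sets (a finite self-map cannot be surjective without being injective), integers stand in for ordinals, and the "arbitrary element" is simply the first preimage encountered.

```python
def least_preimage_section(ell, is_ordinal):
    """Build a section of the surjection `ell` (a dict from arguments to values).

    For each value, pick the least ordinal in its fibre when the fibre contains
    one; otherwise pick an arbitrary element (here: the first one encountered).
    """
    fibres = {}
    for arg, value in ell.items():
        fibres.setdefault(value, []).append(arg)
    section = {}
    for value, fibre in fibres.items():
        ordinals = [a for a in fibre if is_ordinal(a)]
        section[value] = min(ordinals) if ordinals else fibre[0]
    return section


# Hypothetical toy map: integer arguments play the role of ordinals.
ell = {0: "a", 1: "b", 2: "a", 3: "c", "p": "c", "q": "d", "r": "d"}
section = least_preimage_section(ell, lambda a: isinstance(a, int))
```

Applying `ell` after `section` returns every value to itself, which is what being a section means; the value "d" shows the fallback case, where no ordinal preimage exists.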
Claim. Suppose X ∈ Vκ+1 − Vκ, U is a normal measure on κ, and i = iU is the canonical
embedding with critical point κ for which i(ℓ)(γ) = X. Then γ ≥ κ.
Note that the Claim (once proven) continues to hold true if iU is replaced with īU.
Proof. Suppose α < κ. We compute i(ℓ)(α), using the fact that i(α) = α:
i(ℓ)(α) = i(ℓ)(i(α)) = i(ℓ(α)) = ℓ(α) ∈ Vκ.
The fact that i(ℓ(α)) = ℓ(α) follows because ℓ(α) ∈ Vκ and i is the identity on Vκ. We have
shown that if X ∈ Vκ+1 − Vκ and i(ℓ)(γ) = X, then γ ≥ κ.
We verify the main property of ℓ^op: Let X ∈ Vκ+1 − Vκ. Let U be a normal measure on κ
and i = iU the canonical embedding so that i(ℓ)(κ) = X. Note that, by elementarity, i(ℓ^op) is
defined, for each x ∈ Vi(κ), by
\[
i(\ell^{\mathrm{op}})(x) =
\begin{cases}
\alpha & \text{if } \alpha \text{ is the least ordinal in } i(\ell)^{-1}(x), \text{ if there is one},\\
y & \text{otherwise, where } y \text{ is an arbitrary element of } i(\ell)^{-1}(x).
\end{cases}
\]
By the Claim, κ is the least ordinal in i(ℓ)^−1(X). By definition of i(ℓ)^op, it follows that i(ℓ)^op(X) =
κ. Again, note that the same argument goes through if iU is replaced by īU.
We can now formally establish that (ℓ, ℓ^op, κ, E) is a strong blueprint for Vκ+1 − Vκ by verifying
the properties in Definition 3. As we have already shown that (ℓ, κ, E) is a blueprint for X, where
X = Vκ+1 − Vκ, what remains is to establish the following points, and these were demonstrated
in the paragraphs above:
(A) dom ℓ^op = dom ℓ.
(B) ℓ^op is a section of ℓ.
(C) For each i ∈ E, dom i(ℓ^op) = dom i(ℓ).
(D) For every x ∈ X, there is i ∈ E such that i(ℓ^op)(x) = κ.
We arrived at the theory ZFC+BTEE by noticing (1) the needed strengthening of a Dedekind
self-map j : V → V to produce an infinite set was obtained by requiring j to have fairly natural
preservation properties; (2) by strengthening these preservation properties further, certain large
cardinals could be derived; (3) the strongest kind of preservation possible is obtained when j is
an elementary embedding, and the theory ZFC + BTEE is the formal assertion of the existence
of such an embedding from V to V .
In our initial study of Dedekind self-maps, we found that they not only exhibited interesting
preservation properties, but also led naturally to the concept of a nonprincipal ultrafilter. Generalizing
these ideas led to an example and corresponding theoretical results in which a Dedekind self-map
j : V → V exhibits strong preservation properties and a nonprincipal ultrafilter plays a key
role. The results in this case provided motivation for the existence of a measurable cardinal.
The construction of the ultrafilter in this case could not be carried out directly in the theory
ZFC+BTEE because of definability restrictions, and so we were led to postulate a supplementary
axiom to ZFC + BTEE which asserts that the ultrafilter naturally derived from a j : V → V
exists as a set; this is the theory ZFC + BTEE + MUA. This strengthened theory turns out to
imply existence of more than just a measurable cardinal, and also exhibits many more of the
characteristics that we specified in the case of a Dedekind self-map—in particular, a blueprint
for Vκ+1 and a strong blueprint for Vκ+1 − Vκ. These points were described in Properties of Set
Dedekind Self-Maps (p. 133) and elaborated in this section.
However, one characteristic of Dedekind self-maps that we did not discuss, mentioned in
part (B) of Properties of Set Dedekind Self-Maps (p. 133), is the generation of a critical
sequence and the role of restrictions of j. Although it is true that the “generating” effect of j, in
the theory ZFC + BTEE + MUA, was captured nicely by the blueprint and strong blueprint that
are derived from j, still it is natural to ask about the properties of the sequence κ, j(κ), j(j(κ)), . . .
and the role of restrictions of j.
This topic reveals interesting limitations in the theory ZFC + BTEE + MUA. We mention
some known results [7] and introduce some new refinements.
(1) In the theory ZFC + BTEE + MUA, the critical sequence κ, j(κ), j(j(κ)), . . . cannot be
shown to “exist,” even as a j-class, without supplementing the theory with additional
axioms. What can be shown is that, for each particular (metatheoretic) natural number n,
the theory proves that the sequence ⟨κ, j(κ), . . . , j^n(κ)⟩ exists as a set. On the other hand,
whenever we work in a transitive model of ZFC + BTEE + MUA, this problem is corrected,
and the critical sequence can indeed be shown to be a j-class in the model.
(2) The theory ZFC + BTEE + MUA shows, for each particular (metatheoretic) natural number
n, that Vκ ≺ Vj(κ) ≺ . . . ≺ V_{j^n(κ)}, but the sequence of models Vκ, Vj(κ), V_{j^2(κ)}, . . . cannot
be shown to be a j-class. Once again, this sequence can be shown to be a j-class inside
any transitive model of the theory. However, even inside such a transitive model, without
additional axioms, the reasonable conjecture Vκ ≺ Vj(κ) ≺ . . . ≺ V_{j^n(κ)} ≺ . . . ≺ V cannot
be proven.
(3) A natural question, which the theory ZFC + BTEE + MUA cannot answer, is whether
the critical sequence κ, j(κ), j(j(κ)), . . . is bounded. Assuming a 2^κ-supercompact cardinal, there is a transitive model of ZFC + BTEE + MUA in which the critical sequence is
bounded, but, on the other hand, any I3 embedding i : Vλ → Vλ (with critical point κ
and with λ a limit > κ) gives rise to a transitive model (Vλ , i) of ZFC + BTEE + MUA +
“the critical sequence is unbounded”.
(4) The restriction j ↾ Vκ can be shown to exist as a set in ZFC + BTEE + MUA (it is equal to
id_{Vκ}); using elementarity of j, one can show that Vκ ≺ Vj(κ), and hence that j ↾ Vκ : Vκ →
Vj(κ) is an elementary embedding.⁷⁶ However, it is not possible to show that restrictions of
76
Here is a short proof: We first show that Vκ ≺ j(Vκ ). Arguing in the metatheory: Suppose φ(x0, x1 , x2 , . . . , xn )
j to V_{j^n(κ)}, for n ≥ 1, exist as sets in ZFC + BTEE + MUA.⁷⁷ Here is a sampling of what
is known:
(A) The theory ZFC + BTEE + ∃z (z = j ↾ κ^+) has consistency strength at least that of a
strong cardinal.
(B) The theory ZFC + BTEE + ∃z (z = j ↾ Vκ+1) has consistency strength at least that of
a Woodin cardinal.
(C) The theory ZFC + BTEE + ∃z (z = j ↾ P(P(j(κ)))) directly proves κ is huge with κ
huge cardinals below it.⁷⁸
is a formula, a1 , a2 , . . . , an are sets, and there is c = j(b) ∈ j(Vκ ) such that j(Vκ ) |= φ[j(b), j(a1 ), . . . , j(an )]. By
elementarity of j, Vκ |= φ[b, a1 , . . . , an ]; the result follows. Since j(Vκ ) = Vj(κ) , and since the statement that
Vκ ≺ Vj(κ) is equivalent to the assertion that j ↾ Vκ : Vκ → Vj(κ) is elementary, we have therefore established this
latter assertion.
To make this metatheoretic argument formal, since we cannot quantify over formulas, we use codes for formulas;
codes are sets that live in Vω . Every formula in the language {∈} of ZFC can be coded as a set in Vω . Given
a formula φ(x0 , x1 , . . . , xn ) as before, let p be a code for φ. Let f : rank(p) → Vκ ; f represents a possible finite
sequence of sets for the formula coded by p; in the above example, the values of f were a1 , . . . , an . Then j(p) = p
and j(f ) : rank(p) → Vj(κ) . By elementarity of j, we have
Sat(p, Vκ , f ) ⇐⇒ Sat(p, Vj(κ) , j(f )),
where, in general Sat(x, M, g) is the ∆1 statement that asserts formally that M satisfies the formula coded by x
with parameter values given by g. This argument shows formally Vκ ≺ Vj(κ) .
77
We prove this here by showing that existence of j ↾ Vj(κ) in any model (V, ∈, j) of ZFC + BTEE suffices to
produce a transitive model of ZFC + BTEE + MUA; by Gödel Incompleteness, this shows that the restricted map
j ↾ V_{j^n(κ)} cannot be proven to exist as a set in ZFC + BTEE + MUA, for any n ≥ 1.
Proposition 68. The theory ZFC + BTEE + ∃z (z = j ↾ Vj(κ)) proves “there is a transitive model of ZFC + BTEE +
MUA.”
Proof. Let j : V → V be the embedding given in the hypothesis, having critical point κ and having the property
that j ↾ Vj(κ) is a set. Let g = j ↾ Vj(κ). Since P(P(P(κ))) ∈ Vj(κ), one can define a 2^κ-supercompactness measure
Ug by
Ug = {X ⊆ Pκ 2^κ | g″2^κ ∈ g(X)},
which ensures that κ is 2^κ-supercompact. But this degree of supercompactness has been shown [7, Proposition 9.10]
to be sufficient to build a transitive model of the theory ZFC + BTEE + MUA.
78
Assuming a related but somewhat weaker axiom is enough to obtain a blueprint for Vκ+2. We outline the proof
here. We work in the theory ZFC + BTEE + ∃z (z = j ↾ j(κ)). Let λ = 2^κ and let h = j ↾ λ. Our extra axiom,
together with Lemma 69, implies that h exists as a set (since λ < j(κ)).
We describe our plan for building a blueprint (ℓ^+, κ, E^+) for Vκ+2: We will define a Vκ+2-Laver function f^+ :
κ → Vκ from which ℓ^+ can be defined, as was done earlier. We let E^+ consist of all λ-supercompact embeddings
(restricted to Pκλ+), defined from a normal measure U over Pκλ (where Pκλ = {X ⊆ λ | |X| < κ}) as the
canonical ultrapower embedding i = iU : V → V^{Pκλ}/U ≅ MU. The Laver property that we will show f^+ has is that
for every X ∈ Vκ+2, there is a λ-supercompact ultrafilter U such that iU(f^+)(κ) = X.
We follow the steps of our earlier construction of a Vκ+1-Laver sequence. As before, we define a formula ψ:
ψ(u, x, γ) : u : γ → Vγ ∧ x ⊆ Vγ+1 ∧ for all normal measures U on Pγ 2^γ, iU(u)(γ) ≠ x.
When ψ(u, x, γ) holds true, it means that u is not a Vγ+2-Laver sequence at γ: Some subset x of Vγ+1 cannot
be computed as iU(u)(γ) for any choice of U. We can now define f^+:
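\[
(\ast)\qquad
f^{+}(\alpha) =
\begin{cases}
\emptyset & \text{if } \alpha \text{ is not a cardinal or } f^{+} \restriction \alpha \text{ is } V_{\alpha+2}\text{-Laver at } \alpha,\\
x & \text{otherwise, where } x \text{ satisfies } \psi(f^{+} \restriction \alpha, x, \alpha).
\end{cases}
\]
The definition tells us that f^+(α) has nonempty value just when the restriction f^+ ↾ α is not Vα+2-Laver at α,
and in that case, its value is a witness to non-Laverness.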
Claim. The function f^+ defined in (∗) is a Vκ+2-Laver function at κ.
Proof. Let j : V → V be the embedding, with critical point κ, given to us in a model of our theory ZFC + BTEE +
∃z (z = j ↾ j(κ)).
Suppose f^+ is not Vκ+2-Laver at κ, so, in particular, for some y, ψ(f^+, y, κ) holds. We consider j(f^+) : j(κ) →
Vj(κ). Notice, as in the earlier proof, that j(f^+) ↾ κ = f^+. Also, by elementarity, j(f^+) has the same definition as
f^+:
\[
j(f^{+})(\alpha) =
\begin{cases}
\emptyset & \text{if } \alpha \text{ is not a cardinal or } j(f^{+}) \restriction \alpha \text{ is } V_{\alpha+2}\text{-Laver at } \alpha,\\
x & \text{otherwise, where } x \text{ satisfies } \psi(j(f^{+}) \restriction \alpha, x, \alpha).
\end{cases}
\]
In particular, since f^+ = j(f^+) ↾ κ is not Vκ+2-Laver, j(f^+)(κ) is itself a witness to non-Laverness and
ψ(j(f^+), j(f^+)(κ), κ) is true.
Let D = Uj be the normal measure derived from j; that is, D = {X ⊆ Pκλ | h[λ] ∈ j(X)} (recall h = j ↾ λ).
We claim that, by our extra axiom, D is a set: Since κ is measurable, j(κ) is a measurable above κ and so
|P(Pκλ)| = 2^{λ^{<κ}} < j(κ). Therefore, by Lemma 69, there is a set r = j ↾ P(Pκλ). Therefore,
D = {X ⊆ Pκλ | h[λ] ∈ r(X)} is a set.
Let i = iD : V → V^{Pκλ}/D ≅ N be the canonical λ-supercompact embedding, and define k : N → V by
k([t]) = j(t)(h[λ]). One can show that k is an elementary embedding with critical point > 2^κ and makes the
following diagram commutative:
\[
\begin{array}{ccc}
V & \xrightarrow{\ j\ } & V \\
 & {\scriptstyle i_D}\searrow & \big\uparrow{\scriptstyle k} \\
 & & N
\end{array}
\]
By the diagram, we compute:
\[
\begin{aligned}
j(f^{+})(\kappa) &= (k \circ i)(f^{+})(\kappa)\\
&= k(i(f^{+}))\,(k(\kappa))\\
&= k\bigl(i(f^{+})(\kappa)\bigr).
\end{aligned}
\]
Since iD : V → N is λ-supercompact, ^λN ⊆ N. It follows that N contains all sets that are hereditarily of size
≤ 2^κ; in particular, Vκ+1 and each of its subsets belongs to N. Therefore Vκ+2 = P(Vκ+1) ⊆ N.
Since crit(k) > 2^κ, k ↾ P(Vκ+1) = id_{P(Vκ+1)}. Then since j(f^+)(κ) ⊆ Vκ+1, k(j(f^+)(κ)) = j(f^+)(κ). By 1-1-ness of
k, therefore, since
k(j(f^+)(κ)) = j(f^+)(κ) = k(i(f^+)(κ)),
it follows that
j(f^+)(κ) = i(f^+)(κ).
Therefore, as before, though we have claimed that ψ(j(f^+), j(f^+)(κ), κ) holds true, we have just exhibited a
normal measure D on Pκλ such that iD(f^+)(κ) = j(f^+)(κ). We have a contradiction. Therefore f^+ is Vκ+2-Laver
after all.
(D) The theory ZFC + BTEE + ∃λ∃z (λ is an upper bound for the critical sequence and
z = j ↾ λ) is inconsistent.
The extension of ZFC + BTEE obtained by adding the axiom MUA is derivable from an
axiom that asserts that a restriction of j exists. In particular,
ZFC + BTEE + ∃z (z = j ↾ P(κ)) ⊢ MUA.
This can be shown as follows: Working in the theory ZFC + BTEE + ∃z (z = j ↾ P(κ)), where
κ is the critical point of the embedding j : V → V, let g = j ↾ P(κ). Now the ultrafilter
derived from j has the following simple ZFC definition: U = {X ⊆ κ | κ ∈ g(X)}.
It can be shown that, for any set X for which |X| ≤ κ, the restriction j ↾ X does exist
as a set in ZFC + BTEE + MUA (but restrictions of j to sets of larger cardinality require
stronger axioms, as (A) demonstrates). This follows from two observations, which we prove
below.
Lemma 69. Suppose T is an extension of the theory ZFC + BTEE and j : V → V is the
embedding with critical point κ. Suppose j ↾ X can be proven to exist (as a set) in the theory T.
Then:
(i) For any Y ⊆ X, j ↾ Y is also a set.
(ii) For any Y for which there is a bijection X → Y, j ↾ Y is also a set.
Proof of (i). Suppose Y ⊆ X and let i = j ↾ X. Note that
j ↾ Y = {(u, v) | u ∈ Y and i(u) = v},
which is a set by ordinary Replacement.
Proof of (ii). Suppose f : X → Y is a bijection. By elementarity, j(f) : j(X) → j(Y) is also a
bijection. Let i = j(f) and k = j ↾ X, both of which are sets in T. We have
j ↾ Y = {(y, z) | y ∈ Y and z = j(y)}
= {(y, z) | ∃x ∈ X (y = f(x) and z = i(k(x)))},
and the last expression is a set by ordinary Replacement.
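The identity z = i(k(x)) used in the proof of (ii) rests on a single application of elementarity. The following is a short sketch of that step, written in the notation of the lemma (i = j(f) and k is the restriction of j to X); it adds nothing beyond what the proof already assumes.

```latex
% Sketch: why j(y) = i(k(x)) whenever y = f(x), with i = j(f) and k = j \restriction X.
\begin{align*}
j(y) &= j\bigl(f(x)\bigr)      && \text{since } y = f(x), \\
     &= j(f)\bigl(j(x)\bigr)   && \text{by elementarity of } j, \\
     &= i\bigl(k(x)\bigr)      && \text{since } i = j(f) \text{ and } k(x) = j(x) \text{ for } x \in X.
\end{align*}
```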
A consequence of (ii) is that the following theories are equivalent: ZFC + BTEE + ∃z (z =
j ↾ Vκ+1) and ZFC + BTEE + ∃z (z = j ↾ P(κ)).
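One way to see this equivalence is sketched below. It assumes that |Vκ| = κ (as holds when κ is inaccessible), which this passage does not restate; the names h and H are introduced only for illustration.

```latex
% Assumption: h : V_\kappa \to \kappa is a bijection (available when |V_\kappa| = \kappa).
% h and H are hypothetical names used only in this sketch.
\begin{align*}
V_{\kappa+1} &= P(V_\kappa), \\
H : P(V_\kappa) &\to P(\kappa), \qquad H(X) = h[X] = \{\, h(x) \mid x \in X \,\}.
\end{align*}
% H is a bijection with inverse Y \mapsto h^{-1}[Y]. Lemma 69(ii) then applies in both
% directions: j \restriction V_{\kappa+1} is a set if and only if j \restriction P(\kappa) is a set.
```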
As our sampling of results suggests, the large cardinal consequences of requiring that restrictions of j to sets X be themselves sets increase in strength as those sets X increase in size.
Part (C) shows one limitative result in this direction. When the critical sequence is unbounded,
however, we are free to require restrictions of j to sets of any size without inconsistency. The
next section explores this possibility.
Examining the proof shows that the blueprint $(\ell^{+}, \kappa, E^{+})$ derived from $f^{+}$ is actually a blueprint for $H\bigl((2^{\kappa})^{+}\bigr) \supseteq V_{\kappa+2}$
(for any infinite cardinal $\gamma$, $H(\gamma)$ is the set of sets having hereditary cardinality $< \gamma$). Moreover, the proof
goes through in the weaker theory ZFC + BTEE + $\exists z\,(z = j \restriction P(P(P(\kappa))))$.
24 The Theory ZFC + WA
Our analysis of the embedding j : V → V that we get from a model of ZFC + BTEE + MUA
shows that our strategy for strengthening the notion of a Dedekind self-map V → V, based
on observations we have made about set Dedekind self-maps in Properties of Set Dedekind
Self-Maps (see p. 133 and the remarks following), has been successful so far: We have, using
techniques of generalization that are suggested to us by our new Axiom of Infinity, arrived at
a formulation of a Dedekind self-map of the universe whose properties ensure the existence of
many measurable cardinals. However, we have yet to tap the full potential of these Properties.
For example, our blueprint for generating sets (discussed in part (C) of Properties) has taken
us only up to Vκ+1. Also, property (B) in Properties suggests that the critical sequence derived
from j plays a special role (for set Dedekind self-maps, it forms a blueprint for ω) and that the
critical sequence “emerges” from consideration of a sequence of restrictions of the Dedekind self-map under consideration. However, under MUA, the critical sequence for j is not even formally
defined, and it is not possible to define restrictions j ↾ V_{j^n(κ)} for n > 0; doing so entails much
stronger large cardinal consequences than those available in the theory ZFC + BTEE + MUA.
The last paragraph of the previous section suggested a way to proceed further, and to address
the limitations we have just outlined. Working in the theory ZFC + BTEE provides us with
a nontrivial elementary embedding j : V → V, and maximizes the preservation properties a
Dedekind self-map from V to V could have. Insisting also that restrictions of j to various sets
are also sets in the universe leads to significant strengthenings of the theory, in the direction of
stronger large cardinals. As we mentioned in the last section, the theory ZFC + BTEE + ∃z (z =
j ↾ P(j(κ))) is already strong enough to derive MUA, while the theory
ZFC + BTEE + ∃z (z = j ↾ λ) + “λ is ≥ the supremum of the critical sequence”
is inconsistent.
In the spirit of maximizing the strength of our axiomatic extension of ZFC + BTEE, we
consider the following new axiom, which we will add to ZFC + BTEE:
Axiom of Amenability. For every set X, j ↾ X : X → j(X) is a set.
Recall that the boldface j signifies that it is a symbol of the expanded language of set theory that
we are working in, as we saw in our discussion of ZFC + BTEE + MUA. Any model ⟨V, ∈, j⟩ of
ZFC + BTEE + Amenability will interpret j as a self-map j : V → V.
Intuitively, the Axiom of Amenability is a way of ensuring that the “dynamics” of an elementary embedding j : V → V are present within the universe by requiring that the restriction
of j to any set is also a set in the universe.
In the literature, the set of axioms BTEE + Amenability is given the name The Wholeness
Axiom or WA.79 We have the following result:
79
To be precise, the Wholeness Axiom is BTEE + Separation_j, where Separation_j is the usual Separation axiom,
applied to j-formulas. Amenability is a consequence of Separation_j. In the literature, BTEE + Amenability is
denoted WA_0 to indicate it is slightly weaker than WA. Nevertheless, it has been shown that all the known large
cardinal consequences of WA can also be shown to be consequences of WA_0. Therefore, in this paper, we have
emphasized the more intuitively appealing Amenability axiom.
Theorem 70 (Consequences of the Wholeness Axiom). [6] 80 Working in ZFC + WA, let j :
V → V denote the WA-embedding and let κ denote the least ordinal moved by j. Then j is a
Dedekind self-map (not a class or a set) that is BSP and has critical point κ. Moreover, κ has
“virtually all” the known large cardinal properties; in particular, κ is super-n-huge for every n
(and is in fact the κth such cardinal).
Using WA, we are in a position to see even more clearly the extent to which the notion of a
Dedekind self-map points the way to a deeper understanding of the origin of large cardinals. We
once again refer to the Properties of Set Dedekind Self-Maps (A)–(D) mentioned earlier (p. 133).
As we discussed for MUA embeddings, any WA-embedding will be a clear generalization of
(A) and most of (B). In particular, relative to (B), the critical point κ in this case has extremely
strong large cardinal properties; indeed, κ has virtually all known large cardinal properties.81
There are also natural analogues to (C)–(D), which we discuss next. These are very similar
to those described in our analysis of ZFC + BTEE + MUA, except that we are now able to obtain
a blueprint for all sets in the universe rather than just for all subsets of Vκ . The steps of reasoning
are very similar to the MUA case, so we just outline the results.
In place of elementary embeddings derived from a normal measure, we use, in the present
context, extendible embeddings. To begin, then, we discuss the notion of an extendible cardinal
a bit further. A perusal of our list of large cardinals given earlier (p. 126) shows that extendible
cardinals are among the very strongest large cardinals in the list. A cardinal κ is extendible if, for
every η > κ, there is an elementary embedding i : Vη → Vξ —called an extendible embedding—
having critical point κ. Intuitively, extendible cardinals arise from a WA-embedding (and its
iterates) by restriction: Certainly, for every η, if κ is the critical point of j : V → V, where j is a
WA-embedding and κ < η < j(κ), it follows that j ↾ Vη is an extendible embedding with critical
point κ. Likewise, for any η with j(κ) ≤ η < j(j(κ)), (j ◦ j) ↾ Vη is another extendible embedding with
critical point κ, and so forth.
Associated with any extendible embedding i : Vη → Vξ , with critical point κ and η ≥ κ + 1,
is a particular normal measure Ui defined, exactly as for the theory ZFC + BTEE + MUA, by
Ui = {X ⊆ κ | κ ∈ i(X)}.
Using the notion of extendible elementary embeddings, which, as we have seen, are simply
derived embeddings from a WA-embedding j : V → V, one can define from j and its critical
point κ a blueprint ℓ : Vκ → Vκ for all sets in the universe; ℓ will turn out to be, as before, a
80
The theory ZFC + WA provides examples of phenomena that were alluded to earlier in the paper. First of
all, any WA-embedding j : V → V , viewed as a collection of ordered pairs, is an example of a subcollection of V
that is neither a set nor a class. (In fact, the range of j is also an example of this kind.) The question of whether
such entities could meaningfully exist was raised on p. 65. A second interesting phenomenon arises because of the
fact (which we do not prove here) that the sequence κ, j(κ), j(j(κ)), . . . , j n (κ), . . . has no upper bound in ON. It
follows that the function f defined on ω by f (n) = j n (κ) is a member of V <V but not a member of V <V ∩ V .
This is because the range of f is unbounded in V , so f cannot be a set. We asked (p. 44) whether a function with
these properties could exist. Note that even though the domain of f is a set, and the collection of ordered pairs
that determine f is only countably infinite, it is not possible to view f as a set since its range is unbounded in the
universe.
81
There are a few exceptions. WA does not produce the very strongest large cardinals; there are axioms, notably
I1 , I2 , I3 , that have greater consistency strength, and so the large cardinals they produce are, consistencywise,
stronger than those arising from WA.
co-Dedekind self-map. We will once again call ℓ a blueprint self-map. We will show that, for
every set x ∈ V, there is an extendible embedding i such that i(ℓ)(κ) = x. In this way, every set
in the universe can be seen as arising from or being generated by the interplay of κ, j and ℓ.
The essence of the construction of ℓ is a Laver function f : κ → Vκ. A Laver function82 is a
function f : κ → Vκ with the property that for any set x, there is an extendible embedding i such
that i(f )(κ) = x. This definition is in contrast with that for an X-Laver function (in particular,
a Vκ+1 -Laver function, as discussed in connection with ZFC+BTEE + MUA) which restricts the
possible values of x to the set X.
The Wholeness Axiom guarantees the existence of a Laver function:
Theorem 71. [6](WA) Suppose j : V → V is a WA-embedding with critical point κ. Then there
is a Laver function f : κ → Vκ .
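Proof. Let φ(g, x) be the formula
\[
\exists \alpha\, \Bigl[ \alpha \text{ is a cardinal} \;\wedge\; g : \alpha \to V_{\alpha} \;\wedge\; \forall \eta\, \forall \zeta\, \forall i : V_{\eta} \to V_{\zeta}\, \bigl[ \bigl( \mathrm{elem}(i, \alpha) \wedge \mathrm{rank}(x) < \eta < i(\alpha) < \zeta \bigr) \to i(g)(\alpha) \neq x \bigr] \Bigr],
\]
where elem(i, α) asserts “i is an elementary embedding with critical point α.”
The formula φ(g, x) says that g is not Laver, with witness x. Define f : κ → Vκ by
\[
f(\alpha) =
\begin{cases}
\emptyset & \text{if } f \restriction \alpha \text{ is Laver at } \alpha \text{ or } \alpha \text{ is not a cardinal,} \\
x & \text{otherwise, where } x \text{ satisfies } \varphi(f \restriction \alpha, x).
\end{cases}
\]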
Let D = U_j be the normal ultrafilter over κ that is derived from j; that is:
D = {X ⊆ κ | κ ∈ j(X)}.
By Amenability, D is a set, since D = {X ⊆ κ | κ ∈ g(X)}, where g = j ↾ P(κ).
Define sets S1 and S2 by
S1 = {α < κ | f ↾ α is Laver at α}
S2 = {α < κ | φ(f ↾ α, f(α))}.
Clearly, S1 ∪ S2 ∈ D. To complete the proof, it suffices to prove that S1 ∈ D, and for this, it
suffices to show S2 ∉ D.
Toward a contradiction, suppose S2 ∈ D. Then φ(f, j(f)(κ)) holds in V. Let x = j(f)(κ).
Pick η so that rank(x) < η < j(κ). Let i = j ↾ Vη : Vη → Vζ, where ζ = j(η). By Amenability, i is
a set, and is an elementary embedding with critical point κ. Clearly, in V, rank(x) < η < i(κ) < ζ
and i(f)(κ) = x, contradicting the fact that φ(f, j(f)(κ)) holds in V. Therefore S2 ∉ D, as required.
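The proof passes quickly from "S2 ∈ D" to "φ(f, j(f)(κ)) holds in V". The following sketch spells out one way to read that step. It uses only the definition of D, the elementarity of j for ∈-formulas (φ is such a formula), and the fact that j fixes every element of Vκ.

```latex
% Sketch of the step "S_2 \in D implies \varphi(f, j(f)(\kappa))".
\begin{align*}
S_2 \in D
  &\iff \kappa \in j(S_2)
      && \text{definition of } D, \\
  &\iff \varphi\bigl(j(f) \restriction \kappa,\; j(f)(\kappa)\bigr)
      && \text{elementarity, applied to the definition of } S_2, \\
  &\iff \varphi\bigl(f,\; j(f)(\kappa)\bigr)
      && \text{since } j(f) \restriction \kappa = f.
\end{align*}
% The last identity: for \alpha < \kappa we have j(\alpha) = \alpha and f(\alpha) \in V_\kappa, so
% j(f)(\alpha) = j(f)(j(\alpha)) = j(f(\alpha)) = f(\alpha).
```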
We can now state the main fact about ℓ:
82
In the literature, it is called an extendible Laver function. This notion first appeared in [6] where the existence
of such a function was proved to follow from (and to be equivalent to) the existence of an extendible cardinal.
Theorem 72 (Existence of Blueprint Self-Maps). (WA) Suppose j : V → V is a WA-embedding
with critical point κ. Then there is a blueprint self-map Vκ → Vκ; that is, there is an ℓ : Vκ → Vκ
such that for any set x, there is an extendible elementary embedding i : Vη → Vξ with critical
point κ and η ≥ κ + 1 such that i(ℓ)(κ) = x. Moreover, ℓ is a co-Dedekind self-map.
The proof is exactly the same as the one given for the MUA case, replacing canonical embeddings iU with extendible embeddings i.
For (D) in the Properties list, we show that there is, just as in the MUA case, a natural dual
to ℓ, which we will once again denote ℓ^op, which sends every sufficiently large set back to the
“point” κ. The precise statement is given in the following theorem:
Theorem 73. (WA) Let j : V → V be a WA-embedding with critical point κ. For each blueprint
self-map ℓ : Vκ → Vκ, there is a Dedekind self-map ℓ^op : Vκ → Vκ with the following properties:
(1) For every x ∉ Vκ, there is an extendible elementary embedding i : Vη → Vξ with critical
point κ and η ≥ κ + 1 such that
i(ℓ^op)(x) = κ.
(2) ℓ^op is a section of ℓ; in particular, ℓ ◦ ℓ^op = id_{Vκ}.
Again, the proofs are essentially identical to those given in the MUA case, replacing embeddings iU obtained from a normal measure with extendible embeddings. In this case, we are not
restricted to subsets of Vκ , as we were in the MUA case, but the reasoning is the same since now
we have a (full) Laver function f that gives access to sets of arbitrarily large rank.
As before, we do not claim that there is an i for which i(ℓ^op)(x) = κ when x ∈ Vκ. Indeed,
as before, there is no way to define a section s of ℓ so that this is true, and the proof of this is
identical to the one given in the previous section.
We turn now to a more detailed examination of our blueprint for the universe under WA.
Remark 21 (The Blueprint Coder E for V ). We describe now in more detail the blueprint coder
for the blueprint of V , given to us by WA. Suppose j : V → V is a WA-embedding, given to us
by a model of ZFC + WA, with critical point κ. We show that the triple (ℓ, κ, E), where E is
the class of all extendible embeddings having critical point κ, is a blueprint for V, and also that
(ℓ, ℓ^op, κ, E) is a strong blueprint for V − Vκ. We refer to Definition 3.
The fact that ℓ is a co-Dedekind self-map satisfies (1), and the existence of the critical point
κ satisfies (2). Parts (3)(a)–(c) hold just as in the MUA case, where B = Vκ. For (4), it is
clear that the Laver function, on the basis of which both ℓ and ℓ^op are defined, is defined from
E, j, κ. And we have already verified condition (5) with Theorem 72. What remains to be checked
are the two requirements in (3): that E consists of “structure-preserving” functionals—having
essentially the same structure-preserving characteristics as j itself—and that E is “compatible
with” j. As we have seen, the extendible elementary embeddings with critical point κ, which
comprise E, can be seen as approximations to the given WA-embedding j; this takes care of
the “structure-preserving” requirement. To see that E is compatible with j, we again—as with
MUA—defer to the notion of compatibility developed in [6]. In the present context, this notion
of compatibility can be described as follows: Given a WA-embedding j : V → V with critical
point κ, for each β > κ, there is an extendible embedding i : Vβ → Vη ∈ E with critical point κ
such that j ↾ Vβ = i.83
To verify that (ℓ, ℓ^op, κ, E) is a strong blueprint for V − Vκ, we need to verify points (A)–(D)
in Definition 3. This verification was done in Theorem 73.
24.1 Restrictions of a WA-Embedding and Its Critical Sequence
In our analysis of the theory ZFC + BTEE + MUA, we showed that most of the interesting
properties of Dedekind self-maps, which we listed in the section “Properties of Set Dedekind
Self-Maps” (p. 133), had natural generalizations to MUA-embeddings j : V → V . However, one
of the properties in the original Properties list had to do with the critical sequence of the self-map and its emergence on the basis of successive restrictions of the original self-map. We listed
several points (p. 142) that show that these particular characteristics of Dedekind self-maps do
not generalize well to the context of MUA embeddings.
The situation with WA is quite different. We indicate here how the stronger theory ZFC+WA
corrects the “deficiencies” we encountered in the theory ZFC + BTEE + MUA. One technical
point that cannot be ignored, however, is that the “weak” version of WA that we have been using
so far will not be adequate for our discussion in this section. The version of WA that we have
been using, and that is usually denoted WA_0, is BTEE + Amenability; the “full” version of WA,
however, is BTEE + Separation_j. The results concerning WA that we have discussed so far can in
every case be carried out in the theory ZFC + WA_0. It is known [7], however, that our discussion
below about indiscernibility of the terms of the critical sequence and regarding the self-application
operator · on j cannot be carried out in ZFC + WA_0. Therefore, for the remainder of this section,
our base theory will be the “full” ZFC + WA.
We recall that, given a Dedekind self-map j : A → A with critical point a, the terms of j’s
83
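We continue the discussion that we began in the footnote on p. 139, concerning compatibility, as it was developed
in [6], and as it pertains to the class of extendible elementary embeddings. In [6], the class of extendible embeddings
was shown to be captured by the suitable formula:
\[
\theta_{\mathrm{ext}}(i, \kappa, \beta, M):\quad \exists \delta > 0\ \exists \zeta\ \bigl[\beta = \kappa + \delta \wedge M = V_{\zeta} \wedge i : V_{\beta} \to M \text{ is elementary with critical point } \kappa \wedge \beta < i(\kappa) < \zeta\bigr].
\]
Applying the definition of compatibility to the class $E_{\kappa}^{\theta_{\mathrm{ext}}}$ leads to the following: Suppose $\kappa < \lambda < \beta$, and
$i_{\beta} : V_{\beta} \to V_{\eta} \in E_{\kappa}^{\theta_{\mathrm{ext}}}$. Then compatibility of $i_{\beta}$ with $j$ up to $V_{\lambda}$ simply means that $j \restriction V_{\lambda} = i_{\beta} \restriction V_{\lambda}$. Then, to
say that $E_{\kappa}^{\theta_{\mathrm{ext}}}$ is compatible with $j$ means that for each $\lambda < j(\kappa)$ there is a $\beta > \lambda$ and $i : V_{\beta} \to M \in E_{\kappa}^{\theta_{\mathrm{ext}}}$ that
is compatible with $j \restriction V_{\beta}$ up to $V_{\lambda}$. To show this requirement is met, the fact that for each $\beta > \kappa$, there is an
extendible embedding $i : V_{\beta} \to V_{\eta} \in E_{\kappa}^{\theta_{\mathrm{ext}}}$ such that $j \restriction V_{\beta} = i$ suffices. (See [6, Theorem 4.33].)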
critical sequence a, j(a), j(j(a)), . . . arose as critical points of successive restrictions of j:
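\[
\begin{array}{lll}
A_0 = A & j_0 = j : A \to A & \mathrm{crit}(j_0) = a \\
A_1 = j[A_0] & j_1 = j \restriction A_1 & \mathrm{crit}(j_1) = j(a) \\
A_{n+1} = j[A_n] & j_{n+1} = j \restriction A_{n+1} & \mathrm{crit}(j_{n+1}) = j^{n+1}(a).
\end{array}
\]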
Something similar occurs when we consider a certain sequence of restrictions of a given WA-embedding j : V → V with critical point κ. We observed in our study of ZFC + BTEE + MUA
that j ↾ Vκ : Vκ → Vj(κ) is an elementary embedding, but we were unable to consider restrictions
like j ↾ Vj(κ), j ↾ Vj(j(κ)), . . . because the theory was not strong enough to admit such restrictions
as sets in the universe. In the theory ZFC + WA, we no longer have this limitation and we are now
able to observe that each of these restricted embeddings serves to bring to light the next term in
the critical sequence κ, j(κ), j(j(κ)), . . ., as we observed in the case of set Dedekind self-maps. In
particular, the observation that j ↾ Vκ : Vκ → j(Vκ) = Vj(κ) is an elementary embedding brings
to light the image j(κ) of κ. This is analogous to the discovery of the critical point j(a) for j ↾ B
obtained by restricting j : A → A to its range: j ↾ B : B → B, with B = j[A].
If we now restrict j to the codomain of j ↾ Vκ, we obtain (the set) j ↾ Vj(κ). Elementarity
tells us that, for any x ∈ Vj(κ), j(x) ∈ j(Vj(κ)) = Vj(j(κ)). In this way, the next term of j’s
critical sequence appears, namely, j(j(κ)). In general, the critical sequence for j can be seen to
arise as the sequence of successive ranks of the codomains obtained by considering restrictions
j ↾ Vκ, j ↾ Vj(κ), j ↾ Vj(j(κ)), and so forth.
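The general step described in this paragraph can be written out as follows. This is a sketch only; the abbreviations κ_0 = κ and κ_{n+1} = j(κ_n) are assumed here for readability.

```latex
% Sketch: the (n+1)-st term of the critical sequence read off from the n-th restriction.
% Notation (assumed in this sketch): \kappa_0 = \kappa, \kappa_{n+1} = j(\kappa_n).
\begin{align*}
j \restriction V_{\kappa_n} &: V_{\kappa_n} \to j(V_{\kappa_n}), \\
j(V_{\kappa_n}) &= V_{j(\kappa_n)} = V_{\kappa_{n+1}} && \text{by elementarity of } j, \\
\operatorname{rank}\bigl(j(V_{\kappa_n})\bigr) &= \kappa_{n+1}.
\end{align*}
```

Taking n = 0 and n = 1 recovers the two cases treated in the text, with codomains of rank j(κ) and j(j(κ)).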
The critical sequence of j that we obtain in the theory ZFC + WA must be unbounded in the
universe: As we mentioned at the end of Section 23, any theory which includes both Amenability
and the statement that the critical sequence is bounded must be inconsistent.
A surprising fact about the terms of the critical sequence is that they are indistinguishable
from each other on the basis of properties formulated in the language {∈} (that is, formulas
that do not include the symbol j). More formally, the critical sequence ⟨κ, j(κ), j(j(κ)), . . .⟩ is a
j-class of ∈-indiscernibles [6, Theorem 3.18]. This means that, for any ∈-formula φ(x1, . . . , xm)
and any two finite increasing sequences of integers i1 < i2 < . . . < im, k1 < k2 < . . . < km,
\[
V \models \varphi[j^{i_1}(\kappa), \ldots, j^{i_m}(\kappa)] \iff V \models \varphi[j^{k_1}(\kappa), \ldots, j^{k_m}(\kappa)].
\]
One consequence is that each j^n(κ) has all the same large cardinal properties as κ itself.
Using this observation, one can show that “almost all” cardinals in the universe are super-n-huge
for every n ∈ ω: Given m ∈ ω, let Xm denote the set of all cardinals α < j^m(κ) that are super-n-huge cardinals for every n. By indiscernibility, j^m(κ) is super-n-huge for every n (since κ has
this property), and by indiscernibility again, j^m(κ) admits a normal measure Dm that contains Xm
(since κ has this property relative to X0). Finally, let us say that for any class C of cardinals,
almost all cardinals belong to C if, for all but finitely many m ∈ ω, C ∩ j^m(κ) ∈ Dm. It follows
that if C = {α | α is super-n-huge for every n}, then almost all cardinals belong to C.
We consider next another sequence of restrictions of j that also leads to the critical sequence
κ, j(κ), j(j(κ)), . . .. First, we can apply j to the set function j ↾ Vκ : Vκ ↪ Vj(κ) to obtain
j(j ↾ Vκ) : Vj(κ) → Vj(j(κ)).
By elementarity of j, we expect j(j ↾ Vκ) to be an inclusion map and a restriction of some elementary embedding, but which one? Formally applying j suggests that the new map is j(j) ↾ j(Vκ),
but the meaning of j(j) is not clear (certainly j itself is not in the domain of j).
We can address the problem in the following way. Let α > κ + 1. Let f = j ↾ Vα. Certainly
j ↾ Vκ = f ↾ Vκ.
Now we can apply j:
j(j ↾ Vκ) = j(f ↾ Vκ) = j(f) ↾ Vj(κ).
This example motivates the following definition:
Definition 11. Suppose i, k : V → V are elementary embeddings. Then
\[
i \cdot k = \bigcup_{\alpha \in \mathrm{ON}} i(k \restriction V_{\alpha}).
\]
The operator · is called application. If i, k are elementary embeddings V → V, then i · k
is also an elementary embedding. We shall often write ik for i · k. Note that j · j gives precise
expression to the intuitive notation j(j). In particular, we may now write:
j(j ↾ Vκ) = jj ↾ Vj(κ).
Moreover, by elementarity of j, we have:
jj ↾ Vj(κ) : Vj(κ) ↪ Vj(j(κ)).
One may apply j in this way repeatedly (for instance, the next iteration yields the fact that
j(j(j ↾ Vκ)) is the inclusion map, describing the fact that Vj(j(κ)) ≺ Vj(j(j(κ)))). This sequence of
steps leads to the conclusion that there is an elementary chain of elementary submodels of V [6,
Proposition 3.12]:
Vκ ≺ Vj(κ) ≺ . . . ≺ V_{j^n(κ)} ≺ . . . ≺ V.
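The set-sized links of this chain follow a simple induction, sketched below with the assumed abbreviations κ_0 = κ and κ_{n+1} = j(κ_n). The sketch covers only the links between consecutive terms; the final step to V relies on the unboundedness of the critical sequence and on the cited proposition, and is not reproduced here.

```latex
% Sketch of the induction behind the set-sized part of the chain.
% Notation (assumed in this sketch): \kappa_0 = \kappa, \kappa_{n+1} = j(\kappa_n).
\begin{align*}
&V_{\kappa_0} \prec V_{\kappa_1}
   && \text{since } j \restriction V_{\kappa} \text{ is the inclusion map and is elementary,} \\
&V_{\kappa_n} \prec V_{\kappa_{n+1}} \;\Longrightarrow\; V_{j(\kappa_n)} \prec V_{j(\kappa_{n+1})}
   && \text{applying } j \text{ to the statement on the left,} \\
&\text{that is,}\quad V_{\kappa_{n+1}} \prec V_{\kappa_{n+2}}.
\end{align*}
```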
This statement provides us with one other powerful consequence: Because Vκ ≺ V , we may
conclude that every first-order sentence that holds in V must also hold in Vκ . Moreover, since
κ = |Vκ|, this tells us that the “point” κ is a set representative of the wholeness V —intuitively,
κ can declare,84
“I am the totality.”
84
To make this point more concretely, one could define a model (κ, E) of set theory as follows: Let g : κ → Vκ
be a bijection. Define E on κ by:
α E β ⇔ g(α) ∈ g(β).
Then i : κ → V, defined by i = incl ◦ g (where incl : Vκ ↪ V is the inclusion map), is an elementary embedding;
in particular, for every sentence σ in the language {∈}, (κ, E) |= σ if and only if (V, ∈) |= σ.
In fact, by indiscernibility, each term of the critical sequence for j “embodies totality”
in this way. This observation allows us to form a rather special blueprint for ω: Let S =
{κ, j(κ), j(j(κ)), . . .} denote the j-class of critical points of j. Adapting the proof of Theorem 20(3) in minor ways, one may show (ω, s, 0) and (S, j ↾ S, κ) are Dedekind self-map isomorphic; in particular, there is a unique j-bijection F : ω → S that makes the following diagram commutative
(adapting the diagram from Theorem 40 slightly):
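\[
\begin{array}{ccc}
\omega & \xrightarrow{\ s\ } & \omega \\
{\scriptstyle F}\downarrow & & \downarrow{\scriptstyle F} \\
S & \xrightarrow{\ j \restriction S\ } & S
\end{array}
\]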
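Commutativity of the square amounts to a recursive description of F. The following sketch writes this out, taking s to be the successor function on ω as in the triple (ω, s, 0).

```latex
% Sketch: the bijection F and the commutativity condition, written out.
\begin{align*}
F(0) &= \kappa, \\
F(s(n)) &= j\bigl(F(n)\bigr) \qquad \text{for all } n \in \omega, \\
\text{so that}\quad F(n) &= j^{n}(\kappa) \quad\text{and}\quad F \circ s = (j \restriction S) \circ F.
\end{align*}
```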
It follows that S is a blueprint for ω of a rather special kind. Unlike the ordinary natural
numbers, each element of S, as we have been saying, “embodies the totality.” Though members
of S are distinct and correspond to distinct natural numbers, each stands for totality and, at that
level, each is indistinguishable from the others.85
One of the motivating themes of this paper has been to consider the set of natural numbers in
a different light—as precipitations of transformational dynamics of an unbounded field, embodied
in the concept of a Dedekind self-map. One philosophical motivation for this change of viewpoint
is the desire to recognize, in accord with points made by Maharishi and other early philosophers
(Section 2), the natural numbers as different on the surface but fundamentally the same, being
in each case an individual expression of wholeness. In Maharishi’s approach, this viewpoint
is expressed by declaring that each natural number is an expression of the Absolute Number.
Viewing the natural numbers as arising from the Dedekind self-map (S, j ↾ S, κ) expresses this
point of view, in two ways. First, any initial Dedekind self-map gives expression to the idea
that individual natural numbers are precipitations of an underlying field (represented by the self-map). Second, each natural number arising from this particular self-map (S, j ↾ S, κ) arises as the
(Mostowski) collapse of “wholeness” (represented by the elements of S) to a particular value.86
We summarize the observations of this section in the following theorem.
Theorem 74. Suppose j : V → V is a WA-embedding with critical point κ.
(1) The critical sequence of j, ⟨κ, j(κ), j(j(κ)), . . .⟩, is unbounded in ON and is a j-class of
∈-indiscernibles.
(2) Vκ ≺ Vj(κ) ≺ . . . ≺ V_{j^n(κ)} ≺ . . . ≺ V.
(3) Vκ ≺ V . Moreover, there exist a binary relation E on κ and an elementary embedding
i : (κ, E) → (V, ∈). In particular, for every sentence σ in the language {∈}, (V, ∈) |= σ if
and only if (κ, E) |= σ.
85
From the Vedic perspective, Maharishi remarks, “. . . every point of consciousness is infinitely correlated with
every other point. . . [28, p. 64].”
86
Maharishi remarks, “At the door of the Transcendent, the finite numbers begin to tap the unlimited reservoir
of the Absolute Number, rendering every finite value on the ground of the infinite—the Absolute” [25, p. 633].
(4) Let S = {κ, j(κ), j(j(κ)), . . .}. Then (S, j ↾ S, κ) is Dedekind self-map isomorphic to (ω, s, 0),
and is therefore a blueprint of the set of natural numbers.
24.2 A Connection Between the Critical Sequence of a WA-Embedding and Mathematical Physics
We have seen that the critical sequence S of a WA-embedding j : V → V with critical point
κ provides a blueprint for ω in terms of mathematical representatives of wholeness (namely,
κ, j(κ), j(j(κ)), . . .). This result gives expression to the insight that the expressed value of the
natural numbers has its reality in the unmanifest level of the natural numbers. Maharishi’s
perspective on this point extends considerably further: Not only is the reality of the expressed
value of numbers to be found within the unmanifest, but in fact the reality of the universe itself—
of the space-time continuum, of every particle of manifest existence—is also to be found at the
unmanifest level.
All the verses of Veda are within AK. . .all the Devas (the administering intelligence
of the universe) are lively within AK—the universe is lively within AK—the entire
dynamism of the Veda and Vedic Literature, and the corresponding expression of the
Veda and Vedic Literature in terms of the physical expression within the unmanifest
structure of self-referral consciousness, presents the self-referral ocean of consciousness
as the invincible Cosmic Catalyst of the entire ever-expanding universe. [25, p. 545]
The notion of time comes into being with the phenomenon of sequence arising in
the self-referral state of Samhita of Rishi, Devata, and Chhandas—the unmanifest
continuum. The holistic value of Natural Law—Samhita—has within it the notion of
time and space in the realization of itself—Rishi, Devata, Chhandas—the realization
of the three-in-one structure of its unified nature. [25, p. 535]
We describe here how the “absolute” blueprint S for ω that we located in the previous
section can be elaborated further to produce a separable Hilbert space—the sort of mathematical
object that serves as a mathematical foundation for development of quantum mechanics. In this
paper, we have not provided physics applications that make use of the Hilbert space we define.
Our intention is just to show that the etheric realms of existence that are pointed to by the
Wholeness Axiom can be applied to provide a new account in “absolute terms” not only of the
set of natural numbers, but also of structures that are used to understand the physical world—in
this case, a Hilbert space, suitable for use in an application of quantum mechanics.
Recall that the j-class S is obtained by repeated applications of j to its critical point κ. It
should be pointed out that all of the obvious elementary embeddings V → V definable from j
that we have discussed in earlier sections—such as j ◦ j, j ◦ (j ◦ j), . . .—also have κ as their critical
point. We now discuss how applying j to itself (using the application operator ·) in all possible
ways produces embeddings which, collectively, have an infinite number of other critical points.
For instance, the critical point of j · j turns out to be j(κ). (Here is the quick proof: Apply j
to the true sentence “the critical point of j is κ” to obtain “the critical point of j · j is j(κ).”
A more careful proof is given in Proposition 76.87) As we will see, the collection A_j of all the
87
Here, when applying j to “the critical point of j is κ,” we resisted the temptation of writing “the critical point
of j(j) is j(κ),” since, as discussed earlier, ‘j(j)’ makes no sense; in place of j(j), we wrote j · j since j · j can be
understood as playing the same role as j(j) in elementary embedding arguments.
In this footnote, we clarify the connection between j ·j and the intuitively appealing notation j(j). The technique
we discuss is valid in the context of elementary embeddings i : Vλ → Vλ , with critical point < λ, and for which
λ is a limit ordinal. Such embeddings are reflections of elementary embeddings V → V in the realm of sets, and
have essentially the same characteristics as the embeddings V → V . However, embeddings Vλ → Vλ provide a
special convenience because of the fact that we can explore layers in the universe above Vλ —in particular, the sets
belonging to Vλ+1 . The technique we describe should therefore be viewed simply as a heuristic for our study of
elementary embeddings V → V .
We begin with the observation that every elementary embedding i : Vλ → Vλ can be lifted to a “partially”
elementary embedding î : Vλ+1 → Vλ+1 . As a function, î is defined on the new sets in Vλ+1 (namely, the subsets
of Vλ ) by
\[
\hat{\imath}(X) = \bigcup_{\alpha < \lambda} i(X \cap V_{\alpha}).
\]
Now î is not exactly an elementary embedding on Vλ+1 , but it is elementary relative to all bounded formulas. We
take a moment to discuss bounded formulas. Here is an example of a bounded formula:
∀x ∈ Vλ ∃y ∈ Vλ (x ∈ y).
Notice that in this formula, all quantified variables (“for all x”, “there exists y”) are bound to Vλ . The formula
says that for every element of Vλ , there is another element of Vλ which contains the first as an element. This is a
true statement, and it is bounded relative to Vλ+1 because every quantified variable is bound to a set (in this case,
Vλ ) that is a member of Vλ+1 . Because it is bounded like this, this statement is also true relative to Vλ+1 . Vλ+1
is not a model of set theory, but it is an ∈-structure, and does satisfy some of the ZFC axioms. It is not hard to
prove that the statement above is true not only in the model V , but also in the model Vλ+1 .
Here is the same formula without the bounds; we’ll call this formula (for now) ψ:
ψ : ∀x ∃y (x ∈ y).
This statement says that any set is contained in another set. It is certainly true in V . But it is not true in Vλ+1 .
For instance, if x = Vλ , it is not true that there is another set y in Vλ+1 for which x ∈ y.
The first of the statements above is called the relativization of the second formula ψ to Vλ. The first formula is
often denoted ψ^{Vλ}. ψ^{Vλ} is true in both V and Vλ+1 whereas ψ is true in V but not Vλ+1.
These considerations lead to the following.
Fact 1. Suppose i : Vλ → Vλ is elementary. Let î be the extension of i to Vλ+1 described above. Then for any
formula φ(x1, x2, . . . , xn) and any a1, a2, . . . , an ∈ Vλ+1,
(a) φ^{Vλ}[a1, a2, . . . , an] is true if and only if it is true in Vλ+1, and
(b) φ^{Vλ}[a1, a2, . . . , an] iff φ^{Vλ}[î(a1), î(a2), . . . , î(an)].
Fact 1(b) is a precise way of saying that î is elementary relative to certain types of formulas. All this effort
allows us to view the application operation in a more comfortable way:
Fact 2. For any nontrivial elementary embeddings i, k : Vλ → Vλ ,
i · k = î(k).
î(k) makes sense here because k ⊆ Vλ . For this reason, one sometimes writes i(k) instead of i · k, as a convenient
notation, even though it is not correct to do so.
To illustrate how this approach works, we prove a slight generalization of the fact mentioned in the main text,
that crit(j · j) = j(κ):
elementary embeddings obtained from j through self-application has an internal structure that
lends itself to analytical methods. In particular, Aj admits a natural metric that turns it into a
dense-in-itself metric space. The completion Āj of Aj is then seen to be a perfect Polish space,
which is a traditional starting point for defining a complex Hilbert space. This kind of Hilbert
space can be used as a mathematical foundation for development of quantum mechanics. This
path into mathematical foundations of physics has the advantage that the underlying space Āj
is composed of the “absolute elements” provided by the Wholeness Axiom, or limits of sequences
of such embeddings.
We begin once again with a fixed WA-embedding j : V → V with critical point κ. For any
n ≥ 1, let κ_n = j^n(κ), and let κ_0 = κ.
The next proposition collects together a few well-known properties of this application operation.
Proposition 76.
(1) crit(j · j) = j(κ).
(2) For all n ≥ 1, (j · j)(κ_n) = κ_{n+1}.
(3) (j · j) ◦ j = j ◦ j.
Proof of (1). Let α > κ. Let i = j ↾ Vα. Since crit(i) = κ, the following is true in V:
∀β < κ (i(β) = β) and i(κ) > κ.
Proposition 75. For any nontrivial elementary embeddings i, k : Vλ → Vλ ,
crit(i · k) = i(crit(k)).
This proposition says that to find the critical point of the embedding i · k, simply apply i to the critical point
of k. We are not assuming that the critical point of either embedding is κ, or that the critical points of the two
embeddings are the same. In the special case in which i = k = j and crit(k) = κ, the result in the main text
follows: crit(j · j) = j(κ).
Here is the logic: Let α = crit(k). The following formula states that α is the critical point of k:
k(α) ≠ α and for all β < α, k(β) = β.
In fact, this is a formula of the form φ(x, y), defined above, with the substitutions x = k and y = α. There is just
one quantified variable: ∀β < α. This could be written equivalently as ∀β ∈ Vλ (β < α −→ . . . ) (since α ∈ Vλ ), so
the quantifier is bound to Vλ .
So, φ(k, α) is a true statement in V . Since it is bounded in Vλ+1 (in the sense above), it is also true in Vλ+1 .
Now we apply î to φ(k, α), and we may therefore conclude that φ(î(k), î(α)) holds in Vλ+1 , since î is elementary
relative to such a formula. Again, by Fact 1(a), φ(î(k), î(α)) must also hold in V .
Finally, let’s consider what the formula φ(î(k), î(α)) actually says. Returning to the informal form of the formula,
we have
î(k)(î(α)) ≠ î(α) and for all β < î(α), î(k)(β) = β.
In other words, the critical point of î(k) is î(α). Now î(α) = i(α) = i(crit(k)), so we have shown that
crit(i · k) = crit(î(k)) = i(crit(k)).
By elementarity of j, the following also holds in V :
\[ \forall \beta < j(\kappa)\ \bigl( j(i)(\beta) = \beta \bigr) \tag{+} \]
and
\[ j(i)(j(\kappa)) > j(\kappa). \tag{++} \]
Since (+) and (++) hold for all large enough α, we conclude that both hold true if j(i) is
replaced by j · j. Therefore: crit(j · j) = j(κ).
Proof of (2). Let α > κ_{n−1}. Let i = j ↾ Vα. Applying j to the formula
i(κ_{n−1}) = κ_n
and noting that j(κ_{n−1}) = κ_n yields
\[ j(i)(\kappa_n) = j(i)(j(\kappa_{n-1})) = j(\kappa_n). \tag{+} \]
Since κ_{n−1} ∈ dom i, κ_n ∈ dom j(i), and so j(i) and j · j agree on κ_n. We have from (+):
(j · j)(κ_n) = j(i)(κ_n) = j(κ_n) = κ_{n+1}.
Proof of (3). Let x be a set and α an ordinal such that x ∈ Vα. Let i = j ↾ Vα and let y = j(x).
Then, applying j to the formula
i(x) = y
yields
(j · j)(j(x)) = j(i)(j(x)) = j(y) = (j ◦ j)(x),
as required. (j was applied in the middle step.)
For any nontrivial elementary embedding k : V → V (whose critical point may or may not
be κ), define k^{[n]} and k_{[n]} inductively by:
\[ k^{[1]} = k_{[1]} = k, \qquad k^{[2]} = k_{[2]} = k \cdot k, \]
\[ k^{[n+1]} = k \cdot k^{[n]}, \qquad k_{[n+1]} = k_{[n]} \cdot k. \]
Fact 0. crit(j^{[n+1]}) = κ_n.
Proof. By induction on n.
We wish to define a collection Aj consisting of all the elementary embeddings V → V
obtainable by applying j to itself finitely many times. To do this formally, we first define j-terms
to be all terms obtainable by finitely many applications of the following rules:
(j) is a j-term
if i, k are j-terms, (i · k) is a j-term
Then we define Aj to be the collection of j-terms. We also define
crit(Aj ) = {γ | γ is the critical point of some k in Aj }.
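The inductive definition of j-terms can be made concrete with a small sketch. The Python below is an illustration only and is not part of the formal development: it models a j-term purely syntactically, as the string "j" or a pair (left, right) standing for (left · right), and the helper names terms_with_leaves, right_power, left_power and show are hypothetical. It enumerates the terms built from a given number of occurrences of j and builds the right-nested and left-nested iterated products defined above. Distinct terms may of course denote the same embedding, so these counts bound the number of embeddings from above rather than compute it.

```python
from functools import lru_cache

J = "j"  # the generator; a compound term (i . k) is modelled as the pair (i, k)


@lru_cache(maxsize=None)
def terms_with_leaves(n):
    """All j-terms containing exactly n occurrences of j."""
    if n == 1:
        return (J,)
    out = []
    for left_size in range(1, n):
        for left in terms_with_leaves(left_size):
            for right in terms_with_leaves(n - left_size):
                out.append((left, right))
    return tuple(out)


def right_power(n):
    """The right-nested product j . (j . (... . j)) with n occurrences of j."""
    term = J
    for _ in range(n - 1):
        term = (J, term)
    return term


def left_power(n):
    """The left-nested product ((j . j) . ...) . j with n occurrences of j."""
    term = J
    for _ in range(n - 1):
        term = (term, J)
    return term


def show(term):
    """Render a term with explicit parentheses."""
    if term == J:
        return "j"
    return "(" + show(term[0]) + "·" + show(term[1]) + ")"


counts = [len(terms_with_leaves(n)) for n in range(1, 7)]
print(counts)                # the Catalan numbers 1, 1, 2, 5, 14, 42
print(show(right_power(3)))  # (j·(j·j))
print(show(left_power(3)))   # ((j·j)·j)
```

The term ((j · j) · j) · (j · j) discussed in footnote 89 is one of the 14 terms with five occurrences of j.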
A fact that we have already assumed without proof (see p. 152) is that all elements of Aj
are elementary embeddings.88
Proposition 77. Let id be the identity function (on V). Then id ∉ Aj.
Proof. We proceed by using the inductive definition of Aj. Certainly j ≠ id. Assuming neither
i nor k equals id, we show i · k ≠ id. For this we may invoke Proposition 75: Since k ≠ id, k has
a critical point, and so by the Proposition, i(crit(k)) = crit(i · k). Since crit(i · k) exists, i · k ≠ id.
In the literature, it is observed that Aj is a left-distributive (LD) algebra (usually, though,
nontrivial elementary embeddings Vλ → Vλ are considered, where λ is a limit ordinal). Aj is the
LD subalgebra generated by j (and this can be used as the formal definition). See [12, 22].
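The left-distributive law a · (b · c) = (a · b) · (a · c) can be checked mechanically on small finite examples. As an illustration only, the sketch below uses Laver tables, a standard family of finite left-distributive structures on {1, . . . , 2^n} generated by the single element 1. They are not discussed in the text above and are swapped in here purely because they are finite and easy to compute; the function names laver_table and is_left_distributive are hypothetical.

```python
def laver_table(n):
    """Return T with T[p][q] = p * q in the Laver table on {1, ..., 2**n}.

    Defining rules: p * 1 = p + 1 (with 2**n * 1 wrapping around to 1) and
    p * (q + 1) = (p * q) * (p + 1).
    """
    size = 2 ** n
    T = [[0] * (size + 1) for _ in range(size + 1)]  # index 0 is unused
    for q in range(1, size + 1):
        T[size][q] = q  # the top element acts as a left identity
    # Rows are filled from the top down, because p * q > p whenever p < size.
    for p in range(size - 1, 0, -1):
        T[p][1] = p + 1
        for q in range(1, size):
            T[p][q + 1] = T[T[p][q]][p + 1]
    return T


def is_left_distributive(T):
    """Check a * (b * c) == (a * b) * (a * c) for all elements of the table."""
    elems = range(1, len(T))
    return all(
        T[a][T[b][c]] == T[T[a][b]][T[a][c]]
        for a in elems for b in elems for c in elems
    )


A2 = laver_table(2)
rows_A2 = [A2[p][1:] for p in range(1, 5)]
print(rows_A2)
print(is_left_distributive(laver_table(3)))
```

Like Aj, each Laver table is generated by one element under a left-distributive operation; unlike Aj, it is finite, so it is not free.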
Fact 0 tells us that the familiar elements of the critical sequence of j all turn out to be
critical points of appropriately defined embeddings; therefore, crit(Aj ) must be infinite. A natural
question is whether there is an embedding in Aj whose critical point is not a term in the critical
sequence for j. It turns out that there are many such embeddings; the least such critical point
lies between κ2 and κ3 .89
88
In fact, if k ∈ Aj , then k is a WA-embedding; that is, (V, ∈, k) is a model of ZFC + WA. Here, by “WA,” we
mean the stronger form: WA = BTEE + Separation j .
89
For the interested reader, we exhibit an embedding in Aj whose critical point γ is not a term in the critical
sequence for j. It can be shown that crit(Aj ) ∩ κ3 = {κ0 < κ1 < κ2 < γ}, so that γ is the least in crit(Aj ) that
does not belong to the critical sequence.
We prove that ((j · j) · j) · (j · j) has a critical point γ strictly between κ2 and κ3. As we have already shown,
crit(j · j) = κ1 < κ2. We write:
crit(j · j) < κ2. (∗)
Now we apply the map (j · j) · j to the formula (∗), suppressing the · for readability:
((jj)j)(crit(jj)) < ((jj)j)(κ2).
By Proposition 75:
crit(((jj)j)(jj)) < ((jj)j)(κ2).
We compute ((jj)j)(κ2): Notice that jj(κ1) = ((jj) ◦ j)(κ0) = (j ◦ j)(κ0) = κ2. Likewise, jj(κ2) = (j ◦ j)(κ1) = κ3.
So we have
((jj)j)(κ2) = ((jj)j)(jj(κ1)) = jj(j(κ1)) = jj(κ2) = κ3.
Combining the last two displays, we have
crit(((jj)j)(jj)) < κ3.
Let γ = crit(((jj)j)(jj)); we have shown so far that γ < κ3. Next we show that γ > κ2. Again, since crit(jj) = κ1,
it follows that
crit(jj) > κ0. (∗∗)
Fact 1 [22]. crit(Aj ) has order-type ω. In fact, for each n ∈ ω, crit(Aj ) ∩ [κn , κn+1 ) is finite.
Also, for each n > 2, crit(Aj ) ∩ (κn , κn+1 ) is nonempty.
Fact 1 tells us that the critical sequence S of j is a proper subcollection of crit(Aj ). The fact
that crit(Aj ) is well-ordered of type ω implies there is a successor function sj : crit(Aj ) → crit(Aj ),
defined by
sj (α) = least β ∈ crit(Aj ) such that β > α.
It is easy to see that (crit(Aj ), sj , κ) is an initial Dedekind self-map.
We state next (without proof) a remarkable result due to Laver. It tells us that any element
of Aj is completely determined by its behavior on crit(Aj).
Fact 2 [22]. If i, k ∈ Aj and i ↾ crit(Aj) = k ↾ crit(Aj), then i = k.
Lemma 78. Suppose k ∈ Aj and γ ∈ crit(Aj). Then k(γ) ∈ crit(Aj).
Proof. We need to find i ∈ Aj such that crit(i) = k(γ). Let ℓ ∈ Aj be such that γ = crit(ℓ). Let
i = k · ℓ. Then, by Proposition 75,
crit(i) = crit(k · ℓ) = k(crit(ℓ)) = k(γ).
Using Fact 1, we define e : ω → crit(Aj ) to be the unique increasing enumeration of crit(Aj ).
For every pair (i, k) from Aj for which i ≠ k, let m_{i,k} be the least n ∈ ω such that i(e(n)) ≠
k(e(n)). By Fact 2, m_{i,k} exists. Define d : Aj × Aj → R by
\[ d(i,k) = \begin{cases} 0 & \text{if } i = k, \\ 1/(m_{i,k}+1) & \text{otherwise.} \end{cases} \]
Proposition 79. d is a metric on Aj .
Proof. We’ll prove that the triangle inequality holds; the other properties of a metric are
immediate. Let x, y, z ∈ Aj . We assume that x, y, z are all different (the other cases are easy). Let
We apply (jj)j to (∗∗); applying Proposition 75 as well, we have:
γ = crit(((jj)j)(jj)) > ((jj)j)(κ0). (∗∗∗)
Next, we compute ((jj)j)(κ0): Let y = j(κ0). Applying jj, we get
(jj)(y) = ((jj)j)((jj)(κ0)) = ((jj)j)(κ0),
where the last equality holds because crit(jj) = κ1 > κ0, so that (jj)(κ0) = κ0.
Since y = j(κ0), (jj)(y) = (jj ◦ j)(κ0) = (j ◦ j)(κ0) = κ2, and this leads to:
κ2 = ((jj)j)(κ0).
Plugging into (∗∗∗) yields:
γ = crit(((jj)j)(jj)) > κ2.
The conclusion is that
κ2 < γ = crit(((jj)j)(jj)) < κ3,
and so we have exhibited a member γ of crit(Aj) that does not lie in the critical sequence of j.
x′, y′, z′ denote their corresponding restrictions to crit(Aj). Let m be least such that x′(e(m)) ≠
y′(e(m)) and let n be least such that y′(e(n)) ≠ z′(e(n)). Let r be least for which x′(e(r)) ≠
z′(e(r)). We show that r ≥ min{m, n}. Suppose s < min{m, n}. Then
x′(e(s)) = y′(e(s)) and y′(e(s)) = z′(e(s)),
so x′(e(s)) = z′(e(s)). Therefore, x′ and z′ do not differ on any s < min{m, n}, whence r ≥
min{m, n}.
To complete the proof, we must show that
\[ \frac{1}{m+1} + \frac{1}{n+1} \ge \frac{1}{r+1}. \]
Without loss of generality, assume m ≤ n. Then, since r ≥ m,
\[ \frac{1}{r+1} \le \frac{1}{m+1} \le \frac{1}{m+1} + \frac{1}{n+1}, \]
as required.
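The argument can be sanity-checked by brute force on a finite stand-in. The sketch below is an illustration under stated assumptions, not a computation with actual embeddings: it replaces elements of Aj by finite sequences of integers (hypothetical data) and uses the same "first point of difference" distance. Besides the triangle inequality it also checks the stronger ultrametric inequality, which is what the observation r ≥ min{m, n} in the proof amounts to.

```python
from fractions import Fraction
from itertools import product


def first_difference_metric(x, y):
    """0 if x == y, else 1/(m+1), where m is the first index at which x and y differ."""
    for m, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return Fraction(1, m + 1)
    return Fraction(0)


# Hypothetical stand-ins for elements of A_j: all sequences of length 3 over {0, 1, 2}.
points = list(product(range(3), repeat=3))

triangle_ok = all(
    first_difference_metric(x, z)
    <= first_difference_metric(x, y) + first_difference_metric(y, z)
    for x in points for y in points for z in points
)

ultrametric_ok = all(
    first_difference_metric(x, z)
    <= max(first_difference_metric(x, y), first_difference_metric(y, z))
    for x in points for y in points for z in points
)

print(triangle_ok, ultrametric_ok)
```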
Definition. A metric space (X, ρ) is dense-in-itself if for every x ∈ X, there is a sequence
⟨y_n⟩_{n∈ω}, where each y_n is different from x, which converges to x (i.e., the sequence ⟨ρ(x, y_n)⟩_n
converges to 0).
Proposition 80. (Aj , d) is dense-in-itself.
Proof. Let k ∈ Aj. We define a sequence ⟨i_n⟩_{n∈ω} of elements of Aj, all different from k, such
that i_n → k. To define i_n, let α = e(n) ∈ crit(Aj). Let m_n be large enough so that κ_{m_n} > k(α);
it follows that κ_{m_n} > α. Define
i_n = j^{[m_n+3]} · k.
Claim 1. For each n, i_n ↾ ⟨e(0), e(1), . . . , e(n)⟩ = k ↾ ⟨e(0), e(1), . . . , e(n)⟩. Therefore, i_n → k.
Proof of Claim 1. Let β ∈ crit(Aj), β ≤ α. Let y = k(β). Applying j^{[m_n+3]} to the statement
k applied to β = y
yields
j^{[m_n+3]} · k applied to j^{[m_n+3]}(β) = j^{[m_n+3]}(y).
But j^{[m_n+3]}(β) = β and j^{[m_n+3]}(y) = j^{[m_n+3]}(k(β)) = k(β), since κ_{m_n} > k(β) ≥ β and, by Fact 0,
crit(j^{[m_n+3]}) = κ_{m_n+2} > κ_{m_n}. We therefore have:
j^{[m_n+3]} · k applied to β = k(β).
The result follows.
Claim 2. For each n, i_n ≠ k.
Proof of Claim 2.
Case I. crit(k) ≥ κ_{m_n+2}. We show i_n and k have different critical points in this case:
crit(i_n) = crit(j^{[m_n+3]} · k) = j^{[m_n+3]}(crit(k)) > crit(k),
as required.
Case II. crit(k) < κ_{m_n+2}. By Fact 1, crit(Aj) ∩ [crit(k), κ_{m_n+2}) is finite. Let β be the largest
element in this finite set. Since k(β) ∈ crit(Aj) (by Lemma 78), k(β) > β (since β ≥ crit(k)),
and β is the largest element of crit(Aj) ∩ [crit(k), κ_{m_n+2}), it follows that k(β) ≥ κ_{m_n+2}. Applying
j^{[m_n+3]} to the statement
k applied to β is k(β)
yields
j^{[m_n+3]} · k applied to j^{[m_n+3]}(β) is j^{[m_n+3]}(k(β)).
But since j^{[m_n+3]}(β) = β and k(β) ≥ κ_{m_n+2},
(j^{[m_n+3]} · k)(β) > k(β).
Therefore, i_n(β) ≠ k(β).
Proposition 80 will be useful when we look at the completion of (Aj , d); it will tell us
that the completion must be a perfect Polish space [21, Theorem 1] and must therefore have
cardinality 2^{ℵ0} (see for example [31, p. 66]). Before discussing this point, we will describe the
completion of (Aj , d) in some detail. We will consider three ways of performing the construction.
The least interesting way is to form equivalence classes of Cauchy sequences of members of Aj .
This approach is certainly sufficient to provide a complete metric space in which Aj is densely
embedded. But the elements of the space are not especially useful.
A second approach to forming the completion—and we will use this approach to produce
the completion in two other ways—is to embed (Aj , d) into a complete metric space and take the
closure of the image. The resulting set, being a closed subspace of a complete metric space, must
itself be a complete metric space.
The all-purpose complete metric space that is often used in this strategy is obtained by
defining a certain metric on the set of all functions from the given space into R that are bounded
and continuous. In the present setting, let Y = {f : Aj → R | f is bounded and continuous}.
Define a metric σ on Y by
σ(f, g) = sup{|f (i) − g(i)| : i ∈ Aj }.
Now we embed (Aj , d) into (Y, σ) using the following mapping H: For each k ∈ Aj obtain an
fk ∈ Y defined as follows:
fk (i) = d(i, k) − d(i, j).
One shows that each fk is bounded and continuous (relative to the metric d). One also shows
that H is an isometry with respect to d and σ; that is, H is one-one and for all ℓ, k ∈ Aj,
σ(f_ℓ, f_k) = d(ℓ, k).
We outline the proofs:
Claim A. For each k ∈ Aj , fk is bounded and continuous.
Proof. It is easy to verify that range(f_k) is a subset of [−1, 1], whence f_k is bounded. To prove
continuity we show that for each m ∈ ω, if d(i, i′) < 1/(2m+2), then |f_k(i) − f_k(i′)| < 1/(m+1):
\begin{align*}
|f_k(i) - f_k(i')| &= |d(i,k) - d(i,j) - d(i',k) + d(i',j)| \\
&= |(d(i,k) - d(i',k)) + (d(i',j) - d(i,j))| \\
&\le |d(i,k) - d(i',k)| + |d(i',j) - d(i,j)| \\
&\le d(i,i') + d(i,i') \\
&\le 2\,d(i,i') \\
&< \frac{1}{m+1}.
\end{align*}
Claim B. H is one-one.
Proof. Suppose f_k = f_ℓ. We show that k = ℓ by showing that d(k, ℓ) = 0. Since f_k = f_ℓ, the
two functions agree at ℓ ∈ Aj. We have
0 = f_k(ℓ) − f_ℓ(ℓ) = (d(ℓ, k) − d(ℓ, j)) − (d(ℓ, ℓ) − d(ℓ, j)) = d(ℓ, k) − d(ℓ, ℓ) = d(ℓ, k).
Claim C. For every k, ℓ ∈ Aj, σ(f_ℓ, f_k) = d(ℓ, k).
Proof. First notice that by the triangle inequality, for any i ∈ Aj,
d(k, ℓ) ≥ |d(i, k) − d(i, ℓ)|,
and so the supremum of the set {|d(i, k) − d(i, ℓ)| : i ∈ Aj} is achieved when i = ℓ. Therefore, we
have:
\begin{align*}
\sigma(f_k, f_\ell) &= \sup\{|f_k(i) - f_\ell(i)| : i \in A_j\} \\
&= \sup\{|d(i,k) - d(i,j) + d(i,j) - d(i,\ell)| : i \in A_j\} \\
&= \sup\{|d(i,k) - d(i,\ell)| : i \in A_j\} \\
&= |d(\ell,k) - d(\ell,\ell)| \\
&= d(\ell,k).
\end{align*}
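Claims A through C can be checked numerically on a finite metric space. The sketch below is a toy illustration, assuming a hypothetical finite space of 0-1 sequences in place of Aj and an arbitrarily chosen base point in place of j. It tabulates each function i ↦ d(i, k) − d(i, base) and confirms that the sup-distance between two such functions equals the original distance, and that the values stay in [−1, 1].

```python
from fractions import Fraction
from itertools import product


def d(x, y):
    """First-difference metric: 0 if equal, else 1/(m+1) at the first differing index m."""
    for m, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return Fraction(1, m + 1)
    return Fraction(0)


points = list(product(range(2), repeat=3))  # hypothetical stand-in for A_j
base = points[0]                            # plays the role of the fixed element j


def f(k):
    """Tabulate f_k(i) = d(i, k) - d(i, base) on the finite space."""
    return {i: d(i, k) - d(i, base) for i in points}


def sup_distance(g, h):
    """The sup metric between two tabulated functions."""
    return max(abs(g[i] - h[i]) for i in points)


isometry_ok = all(
    sup_distance(f(k), f(l)) == d(k, l) for k in points for l in points
)
bounded_ok = all(-1 <= v <= 1 for k in points for v in f(k).values())

print(isometry_ok, bounded_ok)
```

On a finite space the supremum is a maximum, which is why `max` suffices here; for Aj itself the supremum is still attained, at i = ℓ, as the proof of Claim C shows.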
These facts tell us that for each k ∈ Aj , fk is the representative of k in the space Y .
In this way, we have identified each k ∈ Aj with a bounded continuous real-valued function.
Unfortunately, in representing k by fk , one loses much information about the critical point and
the values of k at crit(Aj ). In particular, given j and e, one cannot typically compute the critical
point of k from fk without testing fk against arbitrarily many elements of Aj . For this reason,
we will make use of an alternative approach.
Another space in which to embed (Aj, d) is the space ω^ω of all functions from ω to ω. This
space is extremely useful because of its close relationship to R and several of its subspaces. A
subspace of ω^ω that will be important for our work is the set ω^{↑ω} of strictly increasing functions
from ω to ω. We record some useful facts about ω^ω and ω^{↑ω}, and then describe the embedding
of (Aj, d).
Theorem 81 (ω^ω).
(1) Define ρ : ω^ω × ω^ω → R by
\[ \rho(f,g) = \begin{cases} 0 & \text{if } f = g, \\ 1/(m+1) & \text{if } f \ne g \text{ and } m \text{ is least for which } f(m) \ne g(m). \end{cases} \]
Then (ω^ω, ρ) is a complete metric space. The induced topology has a base consisting of sets
of the form N_p = {f : ω → ω | f ⊃ p}, where p is a finite sequence of elements of ω.
(2) ω^ω, with the topology induced by the metric ρ, is homeomorphic to both P, the set of irrationals (with the topology induced by R), and P[0,1] = P ∩ [0, 1].
(3) There is a continuous bijection τ : ω^ω → [0, 1) that is “measure-preserving” in the following
sense: Let µ be the product measure on ω^ω, obtained from the measure on ω that assigns
the value 2^{−n−1} to each n ∈ ω, and let λ denote Lebesgue measure on [0, 1). Then for any
µ-measurable set X in ω^ω, τ(X) is λ-measurable in [0, 1) and µ(X) = λ(τ(X)).
(4) The subspace ω^{↑ω} of strictly increasing functions is closed in ω^ω.
Remark 22. The results mentioned in Theorem 81 are well-known [4, 30] so we omit the proofs,
but include a few remarks about them. The proof of (1) is similar to Proposition 79. For (2), we
give the construction that underlies the proof; we describe the constructions that demonstrate
ω^ω is homeomorphic to P and also to P[0,1], simultaneously. We begin by observing that any
bijection v : ω → Z (Z is the set of integers) induces a homeomorphism v̄ : ω^ω → Z^ω, defined by
v̄(f)(n) = v(f(n)).
Now build a collection {I_s | s ∈ Z^{<ω}} of open intervals in R (or [0, 1]) as follows: For each n ∈ Z,
let I_⟨n⟩ = (n, n + 1) (or, in the case of [0, 1], let the I_⟨n⟩ be the following subintervals of (0, 1)):
\[ \ldots,\ \left(\tfrac{1}{16},\tfrac{1}{8}\right),\ \left(\tfrac{1}{8},\tfrac{1}{4}\right),\ \left(\tfrac{1}{4},\tfrac{3}{4}\right),\ \left(\tfrac{3}{4},\tfrac{7}{8}\right),\ \left(\tfrac{7}{8},\tfrac{15}{16}\right),\ \ldots \]
For any s ∈ Z^{<ω}, assuming I_s has been defined (for either construction), let {I_{s⌢n} | n ∈ Z} be a
disjoint family of open subintervals of I_s, ordered like Z and lying next to each other: The right
endpoint of I_{s⌢n} is the left endpoint of I_{s⌢(n+1)}, with union dense in I_s and each having diameter
≤ half the diameter of I_s. Now define τ : Z^ω → R (or τ : Z^ω → (0, 1)) by
\[ \{\tau(g)\} = \bigcap \{ I_{g \restriction n} \mid n \in \omega \}. \]
Then τ is a homeomorphism Z^ω → R − H (or Z^ω → (0, 1) − H), where H is a countable dense
set (consisting of the endpoints of the I_s’s). If Z^ω is given the lexicographic ordering, then τ
is order-preserving. Since H is order-isomorphic to Q (and to Q ∩ (0, 1)), there is an order
isomorphism R → R (or (0, 1) → (0, 1)) taking H to Q (or H to Q ∩ (0, 1)). We have exhibited
homeomorphisms Z^ω → R − Q = P and Z^ω → P[0,1].
For (3), we first describe τ : ω^ω → [0, 1): Given f : ω → ω, τ(f) is the real number x ∈ [0, 1)
obtained as follows: We first specify a sequence s of 0s and 1s, coded by f: start with f(0) ones,
then append a 0, then f(1) ones, then a 0, and so forth. Now define x by
\[ x = \sum_{i=1}^{\infty} \frac{s(i)}{2^i}. \]
It is not hard to see that τ is a continuous bijection; however, τ^{−1} fails to be continuous at precisely
the dyadic rationals. Let µ be the product measure on ω^ω, where we first form the measure space
based on ω^ω: For each j ∈ ω, let E_j be the measure space (ω, P(ω), ν), with ν(n) = 2^{−n−1}.
The product measure space is (ω^ω, B, µ) where B is the smallest σ-algebra containing subsets
A ⊆ ω × ω × . . . = ω^ω of the form A_0 × A_1 × · · · where all but finitely many of the A_i are equal
to ω, and µ(A) = ∏_j ν(A_j).
A couple of simple examples will clarify the relationship between µ and Lebesgue measure λ
on [0, 1). We consider how µ acts on a couple of basic open sets of ω^ω. Let p = ⟨0⟩; we compute
µ(N_p): Since, as a product of components, N_p has only one component different from ω, namely
{0}, then µ(N_p) = 2^{−0−1} · 1 · 1 · . . . = 1/2. Now τ(0) = 0, and it is easy to see that τ(N_p) = [0, 1/2).
Thus, λ(τ(N_p)) = 1/2 = µ(N_p). Likewise,
\[ \mu(N_{\langle 1 \rangle}) = 2^{-2} = \lambda\bigl(\bigl[\tfrac{1}{2}, \tfrac{3}{4}\bigr)\bigr) = \lambda(\tau(N_{\langle 1 \rangle})). \]
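The two worked examples extend to any basic open set, and the pattern is easy to tabulate. The sketch below is illustrative only and rests on one assumption that should be checked against the construction: that the image of N_p under τ is the half-open dyadic interval determined by the 0-1 block coding of the finite sequence p. The function names bits_of_prefix, image_interval and mu_of_basic_set are hypothetical.

```python
from fractions import Fraction


def bits_of_prefix(p):
    """Code a finite sequence p as bits: p[0] ones, a 0, p[1] ones, a 0, and so on."""
    bits = []
    for value in p:
        bits.extend([1] * value)
        bits.append(0)
    return bits


def image_interval(p):
    """Endpoints (lo, hi) of the half-open interval [lo, hi) assumed to equal tau(N_p)."""
    bits = bits_of_prefix(p)
    lo = sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))
    return lo, lo + Fraction(1, 2 ** len(bits))


def mu_of_basic_set(p):
    """Product measure of N_p: the product of 2^(-n-1) over the entries n of p."""
    result = Fraction(1)
    for value in p:
        result *= Fraction(1, 2 ** (value + 1))
    return result


samples = [(0,), (1,), (2, 0), (0, 3, 1)]
for p in samples:
    lo, hi = image_interval(p)
    # Lebesgue measure of the interval versus the product measure of N_p
    print(p, lo, hi, hi - lo == mu_of_basic_set(p))
```

Both quantities equal 2 raised to minus the length of the coded bit string, which is the reason the lengths agree in every case.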
We give a proof of (4): Suppose f ∈ ω^ω − ω^{↑ω}, and let n be least such that f(n) ≥ f(n + 1).
Let p = f ↾ (n + 2). Then f ∈ N_p and N_p ∩ ω^{↑ω} = ∅. Therefore, whenever an element f of ω^ω does
not belong to ω^{↑ω}, there is an open set containing f and missing ω^{↑ω}; hence, ω^{↑ω} is closed.
Next, we embed (Aj, d) into (ω^ω, ρ) by mapping each k ∈ Aj to the function f_k : ω → ω
defined by:
f_k(n) = e^{−1}(k(e(n))). (∗∗)
This is well defined because, by Lemma 78, k(e(n)) ∈ crit(Aj) for every n.
One shows easily that the map F : k ↦ f_k is an isometry: Let i ≠ k in Aj. Notice that for
each m, i(e(m)) ≠ k(e(m)) if and only if f_i(m) ≠ f_k(m). So
ρ(f_i, f_k) = 1/(m + 1) where m is least for which f_i(m) ≠ f_k(m)
= 1/(m + 1) where m is least for which i(e(m)) ≠ k(e(m))
= d(i, k).
We also observe here that each f_k is strictly increasing: Certainly any k ∈ Aj is strictly increasing.
Suppose m < n ∈ ω. Since both k and e are strictly increasing,
f_k(m) < f_k(n) ⇔ e^{−1}(k(e(m))) < e^{−1}(k(e(n)))
⇔ k(e(m)) < k(e(n))
⇔ e(m) < e(n)
⇔ m < n.
Therefore, the image B of (Aj, d) under F is a representation of Aj by increasing functions
ω → ω.90 Let Āj be the closure of B in ω^ω. Let d̄ be the restriction of ρ to Āj × Āj. Then (Āj, d̄)
is (a third form of) the completion of (Aj, d).
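The passage from k to f_k is a relabelling: each critical point is replaced by its position in the enumeration e. The toy sketch below shows the mechanics on invented data. The list C and the map k are hypothetical and do not come from any actual embedding; they merely have the right shape (an increasing enumeration, and a strictly increasing map that fixes an initial segment and then moves everything up).

```python
# Hypothetical toy data: a finite initial segment standing in for crit(A_j),
# listed in increasing order, and a strictly increasing partial map k on it.
C = [3, 5, 8, 13, 21, 34, 55]
k = {3: 3, 5: 5, 8: 13, 13: 21, 21: 34, 34: 55}  # fixes 3 and 5, first moves 8


def e(n):
    """The increasing enumeration: e(n) is the n-th element of C."""
    return C[n]


def e_inverse(value):
    """Position of a value in the enumeration."""
    return C.index(value)


def f_k(n):
    """The relabelled map f_k(n) = e^{-1}(k(e(n)))."""
    return e_inverse(k[e(n)])


values = [f_k(n) for n in range(len(k))]
least_moved = next(n for n, v in enumerate(values) if v != n)

print(values)       # [0, 1, 3, 4, 5, 6]
print(least_moved)  # 2, the position of 8, the least element moved by k
```

The least index moved by f_k is the position of the least point moved by k, which is the sense in which this representation keeps track of critical points.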
A complete metric space is separable if it has a countable dense set. The completion X̄ of
any countable metric space X is always separable since X is dense in its completion. A point x
in a metric space X is said to be an isolated point if {x} is an open set in X. A Polish space is
a complete separable metric space. A metric space is perfect if it has no isolated points. (Both
Euclidean spaces and Hilbert spaces are perfect.) The first part of the following proposition is
well-known:
Proposition 82. [21] The completion of a countable dense-in-itself metric space is a perfect
Polish space. Therefore (Āj, d̄) is a perfect Polish space.
Below, we list some additional properties of (Āj , d̄). These require a few preliminary definitions. Let Y be a Polish space. A subset X of Y is nowhere dense if, for every open subset U of
Y , there is an open set V ⊆ U such that V ∩ X = ∅. A set X ⊆ Y has universal measure zero if,
for every diffused (takes singletons to 0) Borel measure µ on Y , there is a µ-measure zero Borel
set in Y that includes X. X has Marczewski measure zero if, for every perfect subset P ⊆ Y ,
there is a perfect subset Q ⊆ P such that Q ∩ X = ∅.
Theorem 83.
(1) All elements of Āj are strictly increasing; that is, Āj ⊆ ω^{↑ω}.
(2) The set ω^{↑ω} is nowhere dense in ω^ω; thus, in particular, Āj is nowhere dense in ω^ω.
(3) There exists a finite diffused Borel measure µ on ω^ω such that µ(Āj) > 0 (in other words,
Āj does not have universal measure zero).
(4) (Āj, d̄) is closed and bounded in (ω^ω, ρ) but not compact. Indeed, we can find an infinite
subset X = {i_1, i_2, . . .} of Aj such that F[X] has no limit point in Āj (indeed, not even in
ω^ω).
(5) There is a real number c > 0 and an infinite decreasing chain F_0 ⊇ F_1 ⊇ . . . of nonempty
closed subsets of Āj for which ⟨δ(F_n)⟩ → c and yet ⋂_{n≥0} F_n = ∅.
Proof of (1). Suppose ⟨i_n⟩_n → k, where k ∈ Āj and each i_n belongs to Aj. As we observed
earlier, each i_n is strictly increasing. Let m ∈ ω. It suffices to show k(m) < k(m + 1). Let n_0 ∈ ω
be large enough so that
for all n ≥ n_0, i_n ↾ (m + 2) = k ↾ (m + 2). (+)
90
Dougherty-Jech [12] discuss B starting from first principles. A subset A of ω^ω of strictly increasing functions
is an embedding algebra if A is equipped with a binary operation · so that, for all f, g, h ∈ A,
(1) f · (g · h) = (f · g) · (f · h) (self-distributivity)
(2) if cr(f) denotes the least integer moved by f (whenever there is one), then cr(f · g) = f(cr(g)).
The set B is naturally an embedding algebra: Define · on B by f_i · f_k = F(i · k). Note that for any f_k ∈ B,
cr(f_k) = e^{−1}(crit(k)). With these definitions, (B, ·) can be shown to be an embedding algebra.
Since i_n is increasing, i_n(m) < i_n(m + 1). It follows by (+) that k(m) < k(m + 1), as required.
Proof of (2). The basic open sets of ω^ω are of the form N_p = {f : ω → ω | f ⊃ p}, where p is
a function p : n → ω, for some n ∈ ω. Given such a basic open set N_p, we exhibit q for which
N_q ⊆ N_p and N_q misses Āj. If p is not strictly increasing, then we let q = p; this works because
all elements of Āj are strictly increasing. If p is strictly increasing, let q = p⌢⟨1, 0⟩. Clearly q is
not strictly increasing, so N_q misses Āj, yet is a subset of N_p.
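The choice of q in this proof is completely explicit, so it can be written as a two-line function. The sketch below is an illustration with hypothetical names (is_strictly_increasing, escaping_extension), representing a finite sequence p as a Python tuple.

```python
def is_strictly_increasing(p):
    """True if each entry of the finite sequence p is smaller than the next."""
    return all(a < b for a, b in zip(p, p[1:]))


def escaping_extension(p):
    """Return q extending p such that no strictly increasing function extends q."""
    p = tuple(p)
    if not is_strictly_increasing(p):
        return p            # p itself already rules out increasing extensions
    return p + (1, 0)       # appending 1, 0 forces a descent


q1 = escaping_extension((0, 2, 5))
q2 = escaping_extension((4, 4, 7))
print(q1, q2)
```

Every function extending q also extends p, so N_q ⊆ N_p, and since q contains a descent, N_q contains no strictly increasing function.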
Proof of (3). It suffices to show that Āj is not universal measure zero. But this follows from
the fact (see [30]) that every universal measure zero set has Marczewski measure zero (but no
perfect set could have Marczewski measure zero).
Proof of (4). Recall that for each k ∈ Aj, f_k : ω → ω is defined as in (∗∗) by f_k(m) =
e^{−1}(k(e(m))). First we note that the ρ-diameter of ω^ω is 1:
sup{ρ(f, g) | f, g ∈ ω^ω} = 1.
Thus, in particular, the space Āj is bounded.
To show Āj is not compact, we will make use of some known results about Aj. Let us say
that, for any limit ordinal γ and any i, k ∈ Aj, $i \overset{\gamma}{=} k$ if for all x ∈ Vγ, i(x) ∩ Vγ = k(x) ∩ Vγ; it
can be shown [11] that $\overset{\gamma}{=}$ is an equivalence relation on Aj. We refer the reader to [11] for proofs
of the following two facts:
Fact A. [11, Section XII, Lemma 4.4]. Suppose $i \overset{\gamma}{=} k$ and α < γ. Then if i(α) < γ, we have that
i(α) = k(α).
Fact B. [11, Section XII, Lemma 4.13]. Let j : V → V be a WA-embedding with critical point κ.
Let n ∈ ω. Then there is i ∈ Aj such that
\[ i \overset{\kappa_{n+1}}{=} j^n. \]
Claim. For each n ∈ ω there is i_n ∈ Aj such that crit(i_n) = κ and i_n(κ) = κ_n.
Proof. Given n ∈ ω, using Fact B, we obtain i_n ∈ Aj such that $i_n \overset{\kappa_{n+1}}{=} j^n$. Since κ < κ_{n+1} and
j^n(κ) < κ_{n+1}, by Fact A, we conclude that i_n(κ) = j^n(κ).
Using the notation from the Claim, let X = {f_{i_n} | n ∈ ω}. Notice that for each n,
f_{i_n}(0) = e^{−1}(i_n(κ)) = e^{−1}(κ_n), so f_{i_n} ∈ N_⟨r_n⟩, where r_n = e^{−1}(κ_n); and for m ≠ n we have
r_m ≠ r_n, so N_⟨r_n⟩ ∩ N_⟨r_m⟩ = ∅. It follows that X is infinite. To see X has no limit point (even
in ω^ω), let g ∈ ω^ω, and let r = g(0). Then g ∈ N_⟨r⟩ and X ∩ N_⟨r⟩ contains at most one element. This shows g is not
a limit point of X, since in a metric space, a limit point of a set Y has the property that every
open set containing the limit point contains infinitely many points of Y.
Proof of (5). Let X = {f_{i_n} | n = 1, 2, . . .}, from part (4). Let F_0 = X and for each n ≥ 1,
let F_n = X − {f_{i_1}, . . . , f_{i_n}}. Each F_n is closed and F_0 ⊇ F_1 ⊇ . . .. Since for each n, f_{i_n} ∉ F_n, it
follows that ⋂_{n≥0} F_n = ∅. Also, since d̄(f_{i_m}, f_{i_n}) = 1 whenever m ≠ n, it follows that δ(F_n) = 1
for each n and so ⟨δ(F_n)⟩_n → 1. This completes the proof that the sets F_0, F_1, F_2, . . . have the
required properties.
Remark 23. Part (4) shows that even though Āj is closed and nowhere dense in ω^ω, Āj is
not homeomorphic to the Cantor set (since the Cantor set is compact). Part (5) shows Āj is
somewhat pathological; the result of (5) contrasts with the fact that, since Āj is a complete metric
space, it has the property that whenever F_0 ⊇ F_1 ⊇ . . . are nonempty closed subsets with ⟨δ(F_n)⟩_n → 0,
then ⋂_{n≥0} F_n ≠ ∅.
This way of representing the completion of Aj , as in (∗∗), allows us to preserve information
about the behavior of elements of Aj on crit(Aj ). For instance, if crit(k) = e(m), we can discover
this fact by examining f_k: Look for the least r for which f_k(r) ≠ r; this r must equal m. In
addition, if k(e(m)) = e(n), we can discover this fact by applying fk to m; it must be the case
that fk (m) = n. This feature of strong preservation of structure is in sharp contrast to the
characteristic of the space of bounded continuous real-valued functions on Aj in which, as we
have seen, very little of this structure is preserved.
This representation of elements of Aj also allows us to understand the new elements of Āj
as other functions from ω to ω, that is, as sequences of natural numbers.
Example 11. We show how a limit of a sequence of elements in Aj is realized as a function
ω → ω in Āj − Aj. Consider the sequence ⟨j^{[m]} : m ≥ 1⟩. For each m ∈ ω, let r_m be such that
e(r_m) = κ_m, which is the critical point of j^{[m+1]}. Under F, j^{[m+1]} is mapped to a function f_{m+1}
where f_{m+1} ↾ r_m = id ↾ r_m, and f_{m+1}(r_m) > r_m. So ρ(f_{m+1}, id) = 1/(r_m + 1), and it follows that
⟨f_m⟩_m → id. As above, if we let B = F″Aj, I claim that id ∈ Āj − B: Suppose instead that it
lies in B. Then there is an ℓ ∈ Aj whose restriction to crit(Aj) is id = id_C, where C = crit(Aj).
But this is impossible because, as we showed earlier, id ∉ Aj.
Finally, we outline the steps one might typically take to make use of our perfect Polish space
Āj in applications to quantum physics. We begin with a complex-valued Borel measure µ on Āj ;
it would suffice to use the measure guaranteed by Theorem 83(3).
As a first step toward defining the Hilbert space L²(Āj, µ), we define:
V = {f : Āj → C | |f|² is µ-integrable over Āj},
where C is the set of complex numbers. Defining addition and scalar multiplication pointwise
turns V into a vector space. A quasi-inner product can be defined by
\[ \langle f_1, f_2 \rangle = \int_{\bar{A}_j} f_1 f_2^{*}\, d\mu. \tag{$*$} \]
As usual in this construction, a property that is needed, but that we do not have, is
⟨f, f⟩ = 0 ⇔ f = 0.
This difficulty is corrected by introducing an equivalence relation ≈ on V by
f_1 ≈ f_2 iff |f_1 − f_2| = 0 (µ-a.e.) on Āj.
For each f ∈ V, let [f] denote the ≈-equivalence class containing f. Define V̄ = {[f] | f ∈ V}.
One shows that the inner product computation (∗) does not depend on the choice of representatives,
and that addition and scalar multiplication are well-defined. Now, we have properly defined
L²(Āj, µ) by the triple V̄, µ and this inner product defined on equivalence classes. Because Āj
is a Polish space, it follows that L²(Āj, µ) is a separable Hilbert space. We have now the
mathematical setting for performing a quantum-mechanical analysis of some system behaving within
Āj.
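The reason for passing to equivalence classes can be seen in a finite toy model. The sketch below is only an analogy and makes two simplifying assumptions that are not in the text: the underlying space is a four-point set rather than Āj, and the measure is given by nonnegative weights (hypothetical values), one of which is zero. Two functions that differ only on the weight-zero point are distinct as functions, yet the quasi-inner product cannot tell them apart.

```python
# Hypothetical finite measure on the points 0, 1, 2, 3; the last point is a null set.
weights = [0.5, 0.25, 0.25, 0.0]


def inner(f1, f2):
    """Discrete analogue of the integral of f1 times the conjugate of f2."""
    return sum(a * b.conjugate() * w for a, b, w in zip(f1, f2, weights))


f = [1 + 1j, 2j, -1, 5]
g = [1 + 1j, 2j, -1, -7j]   # differs from f only at the null point
diff = [a - b for a, b in zip(f, g)]

print(f != g)              # the functions are different
print(inner(diff, diff))   # but their difference has quasi-norm zero
print(inner(f, f), inner(g, g))
```

Identifying f and g, as the relation ≈ does, is exactly what turns the quasi-inner product into a genuine inner product.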
We began this section by elaborating on the dynamics of a WA-embedding j : V → V . By
studying the self-application operation · that allows us to combine WA-embeddings V → V to
produce others, we discovered an infinity of such embeddings—forming the collection Aj —and
a countable j-class crit(Aj ) of critical points that determine the behavior of the embeddings
belonging to Aj . Having observed that Aj is a dense-in-itself metric space, we took the next
logical mathematical step and obtained the completion Āj of Aj , which turned out to be a
nowhere dense perfect Polish space, a space suitable for building a canonical Hilbert space X =
L²(Āj, µ). Although we do not have a particular physical application in mind for this formalism,
we have taken a step in the direction of forging a theoretical link between, on the one hand, our
highly abstract mathematical model of wholeness and its dynamics, and, on the other hand,
the mathematical formalism (that of separable Hilbert spaces) for studying the physical universe
through the eyes of quantum mechanics. Such a link parallels, we feel, the real-life link that joins
the unmanifest dynamics of pure consciousness to the phenomena of manifest existence.
We close this section with a technical question that must not be overlooked, but which we
have neglected so far, namely: How can we provide a proper formal treatment of the collection
Aj as a j-class when each i ∈ Aj is itself already a j-class? Indeed, the collection Aj itself is
too big to be a j-class; it doesn’t “fit” in the universe V . There are several ways to handle this
situation. An elegant solution would be to make use of the following theorem due to Laver (see
[12]):
Theorem 84. (Aj , ·) is, up to isomorphism, the unique, free, monogenic, left-distributive algebra.
Left-distributivity asserts that i · (k · ℓ) = (i · k) · (i · ℓ); Aj satisfies this property [11]. It is
monogenic because the algebra is generated by a single element, namely, j. Because of freeness,
this algebra is unique up to isomorphism. But there are examples of such algebras in the realm
of sets. One example that has been studied extensively is found in the braid group B∞ with
infinitely many generators b1 , b2 , . . .[11]. One can define an operation · on B∞ that makes B∞
a left-distributive algebra. It follows that if we form the left-distributive subalgebra B1 = hb1 i
of B∞ , then B1 is an example of a free (monogenic) left-distributive algebra. Therefore, the
interesting applications of Aj that we have in mind could be carried out using B1 in its place.91
91
We describe here a direct approach to coding elements of Aj so that Aj may be treated as if it were a j-class.
We begin by recalling that the elements of Aj can be obtained by forming the closure of the following statements
which define “elements” of Aj:
(i) (j) is an element
(ii) if k, ℓ are elements, (k · ℓ) is an element.
This finitary definition can be abstracted as a notion of a set of expressions, defined independently of the notion of
elementary embeddings, and then any particular i ∈ Aj can be viewed as a realization of an expression, obtained
by substituting suitable restrictions of j for each element in the expression. This footnote is an elaboration of most
of the details of this approach.
We begin with an abstract definition of the set Expr of expressions: Start with an alphabet A = {a}, with a
single element a. Expr is the smallest set closed under the following rules:
(i) (a) is an expression
(ii) if b, c are expressions, (bc) is an expression
The length of an expression ex, denoted |ex|, is the number of occurrences of a in ex. An indexed expression is
an expression in which each of the integers 1, 2, . . . , |ex| is assigned to an occurrence of a in ex, in descending order,
scanning ex from left to right. An indexing of an expression ex is the result of assigning integers to occurrences of
a so that ex becomes an indexed expression. We shall denote the assignment of an integer i to an occurrence a in
an expression by ai .
For example, an indexing of the expression (((a)(a))(((a)(a))(a)))(a) is
(((a6 )(a5 ))(((a4 )(a3 ))(a2 )))(a1 ).
One can show by induction that every expression has a unique indexing: Certainly (a) has the unique indexing
(a1 ). Assuming expressions b and c have unique indexings, obtain an indexing for (bc) by increasing each index in
b by the amount |c|. Since |bc| is a fixed size and indices must decrease from left to right, the indexing is unique.
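The definitions of expression, length, and indexing can be sketched in Python as follows. This is only an illustration under our own encoding assumptions: the atomic expression (a) is modelled by the string "a", a compound expression (bc) by the pair (b, c), and the names length, index, and show are hypothetical helpers, not notation from the text.

```python
# Expressions over the one-letter alphabet {a}, modelled as nested tuples:
#   "a"     stands for the atomic expression (a)
#   (b, c)  stands for the compound expression (bc)

def length(ex):
    """Number of occurrences of a in ex, that is, |ex|."""
    if ex == "a":
        return 1
    b, c = ex
    return length(b) + length(c)


def index(ex, offset=0):
    """Return ex with each occurrence of a replaced by its index.

    Indices run |ex|, ..., 2, 1 from left to right. The offset is added to
    every index; this is how the left factor b of (bc) is shifted by |c|.
    """
    if ex == "a":
        return 1 + offset
    b, c = ex
    return (index(b, offset + length(c)), index(c, offset))


def show(indexed):
    """Render an indexed expression with subscripts written inline."""
    if isinstance(indexed, int):
        return f"(a{indexed})"
    b, c = indexed
    return f"({show(b)}{show(c)})"


# The example expression (((a)(a))(((a)(a))(a)))(a) from the text.
example = ((("a", "a"), (("a", "a"), "a")), "a")
print(show(index(example)))
```

Read from left to right, the indices produced for the example are 6, 5, 4, 3, 2, 1, in agreement with the indexing displayed above. Because index is a function of the expression alone, the sketch also reflects the uniqueness claim.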
To represent a computation of i(x) for some i ∈ Aj , our plan is to substitute into each indexed expression a term
of the form j ↾ Vα , for α large enough. The following lemma tells us what “large enough” needs to be.
Lemma 85. For each n ∈ ω, j ↾ V_{κ_n+ω·n} ⊆ V_{κ_{n+1}+ω·n}.
Proof. For each x ∈ V_{κ_n+ω·n},
j(x) ∈ j(V_{κ_n+ω·n}) = V_{κ_{n+1}+ω·n}.
Clearly, then, all ordered pairs of the form (x, j(x)), for x ∈ V_{κ_n+ω·n}, belong to V_{κ_{n+1}+ω·n}.
Next, we define a function that will facilitate our intended substitutions, in light of the Lemma.
Definition 12. Define t : V → ω by
t(z) = least m ∈ ω such that rank z < κm
We define a substitution function Sub : (indexed expressions) → V , so that for each ex, z, Sub(ex, z) is obtained
in the following way: For each ai in ex, substitute j ↾ V_{κ_{t(z)+i}+ω·i} for ai , and consider concatenations of these
restrictions of j to be functional application.
For example, let ex = (a3 )((a2 )(a1 )) be an indexed expression, and let z = κ + κ. Then t(z) = 1 since κ1 = j(κ)
is the least cardinal in the critical sequence above z. We have:
Sub(ex, z) = (j ↾ V_{κ_{t(z)+3}+ω·3}) ( (j ↾ V_{κ_{t(z)+2}+ω·2}) (j ↾ V_{κ_{t(z)+1}+ω}) )
= (j ↾ V_{κ_4+ω·3}) ( (j ↾ V_{κ_3+ω·2}) (j ↾ V_{κ_2+ω}) ).
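The bookkeeping in this substitution can be sketched symbolically in Python. The embedding j itself is of course not computable, so the sketch only produces the string of restrictions that Sub prescribes. It assumes an indexed expression is given as nested pairs of integers (an integer i standing for ai), takes the value t(z) as an input, and writes "j|V[...]" for the restriction of j; the function name substitute is our own.

```python
def substitute(indexed, t):
    """Return the term prescribed by Sub, written out as a string.

    indexed: an integer i (standing for the indexed letter a_i) or a pair
             (b, c) of indexed expressions (standing for b applied to c).
    t:       the value t(z) for the argument z under consideration.
    """
    if isinstance(indexed, int):
        i = indexed
        # a_i is replaced by the restriction of j to V at level kappa_(t+i) + omega*i.
        return f"j|V[κ_{t + i} + ω·{i}]"
    b, c = indexed
    # Concatenation (bc) is read as functional application of b to c.
    return f"{substitute(b, t)}({substitute(c, t)})"


# The example from the text: ex = (a3)((a2)(a1)) and t(z) = 1.
print(substitute((3, (2, 1)), 1))
```

With t(z) = 1 the subscripts of κ come out as 4, 3, 2, matching the second line of the displayed computation.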
Now let A′j = {Sub(ex, z) | ex is an indexed expression and z ∈ V }. Note that the range of each Sub(ex, z) is
bounded in V . Therefore, A′j is a j-class, and provides us with (a j-class of) codes for Aj in the following way: For
each i ∈ Aj , let ex be an expression realized by i. Then for any z ∈ V , i(z) can be computed by Sub(ex, z)(z). (It
can be shown that this computation does not depend on the representing expression ex; see [11].)
The class A′j of codes is good enough to provide us with j-class versions of the collections/maps:
(a) crit(Aj )
(b) the increasing enumeration e : ω → crit(Aj )
(c) the operator mi,k defined to be the least n for which i(e(n)) ≠ k(e(n)).
(d) the metric d : Aj × Aj → R.
For (a),
crit(Aj ) = {α | ∃ex, z (z is an ordinal and Sub(ex, z)(z) ≠ z and ∀δ < z (Sub(ex, z)(δ) = δ))}.
Parts (b)–(d) follow easily from (a).
Let us also remark that the seemingly larger completion Āj of Aj is not difficult to formalize
since it has been formally defined as a subspace of ω^ω.
24.3 The WA Blueprint and the Ancient View of Blueprints
A theme that we have traced in this article from our initial study of the notion of infinite set, by
way of Dedekind self-maps, through strong forms of Dedekind self-maps of the universe, including
the theories ZFC+BTEE+MUA and ZFC+WA, is the idea of a blueprint. We think of a blueprint
as containing the essence, the knowledge, and the vitality of all that is shaped from the blueprint.
The blueprint is the link between the primary intelligence, out of which the blueprint is woven,
and the diversity of form to which the blueprint gives rise.
There are many instances in the ancient wisdom of this idea of a blueprint, emerging from
the Source, and giving rise to everything else. A well-known example is Plato’s world of forms.
Plato’s forms are fundamental archetypes or templates on the basis of which the changeable
manifest world is constructed. Plato cites as examples of forms the form of virtue and geometric
forms, such as a circle. The form of virtue is to be understood as the archetype by which we
recognize a great variety of behaviors—which could arise in an endless variety of contexts—as
instances of virtue. Likewise, geometric forms, like a circle, represent archetypes of another kind:
For instance, though no one has ever seen in the physical world a perfect circle, it is by virtue of
the form of a circle that the physical approximations to a circle that we encounter are recognized
as approximations of a circle.
In Plato’s dialogue, The Phaedo, he makes it clear that the forms are unchangeable, eternal,
beyond sense perception (but knowable), of divine nature, and the cause of the multiplicity of
beings, each shaped according to the parent form’s nature. Although there is a vast diversity of
forms, Plato makes clear in The Sophist that the forms are subsumed under the highest form:
the one being. By virtue of “participating” in the one being, each form is existent and is a
“one” on its own, and all forms collectively form a unity, a oneness of being. The Sophist also
makes clear a distinction between one being and the one; in Plato’s view, the one is devoid of
multiplicity, while the one being is the interface between one and many.92 From the one being
emerges, according to Plato, the intelligible realm, which has a threefold division, described in
the Philebus as bound, infinite, mixed, and also as symmetry, truth, beauty;93 and described by
the Neoplatonist Simplicius, in his commentary94 on Aristotle’s Physics, as intellect, intellection,
intelligible. Simplicius remarks95 that these three, when considered as the constituents of the one
92
In The Sophist, we read, “According to right reason, that which is truly one should be. . . entirely without
parts.. . . For, since in a certain respect being is passive to the one, it does not appear to be the same as the
one” [35, Vol. 3, pp. 246-7]. On this passage, Thomas Taylor comments, “Plato here dividing the one and being
from each other, and showing that the conception of the one is different from that of being, evinces that what is
most properly and primarily one is exempt from the one being. For the one being does not abide purely in an
unmultiplied and uniform hyparxis. But the one withdraws itself from all addition, since by adding any thing to
it you diminish its supreme and ineffable union.”[35, Vol. 3, p. 243].
93
See Thomas Taylor’s discussion of this division in his notes on the Parmenides [35, Vol. 3, pp. 162-3].
94
See [35, Vol. 3, pp. 245–7].
95
Simplicius says, “It remains, therefore, that the Parmenidean one being must be the intelligible, the cause of
all things, and hence it is intellect and intellection, in which all things are unitedly and contractedly comprehended
according to one union. . .” The use of these variations of the word “intellect” can easily be misinterpreted, though
they are commonly encountered translations of the terminology used by Plato and the Neoplatonists. Thomas
being, are identical with one another.
This emergence of three within the one being is, according to Neoplatonic doctrines, further
elaborated in groups of three, forming a vast hierarchy of intelligence, remaining all the while
unified with its source. See [32].
In a different direction, in the Sophist, Plato enumerates five genera of being—essence,
sameness, difference, permanence, and motion. Again, these are to be understood as being no
different from being itself, but provide rather a texture for grasping the purely abstract field of
being.
In the Timaeus, Plato discusses how the manifest universe is shaped by the world of forms.
In that dialogue, he explains how the demiurge, a fundamental principle of intelligence, molds
“the Receptacle”—which may be understood as pure emptiness—using the forms as a template.
In the process, this intelligence intends that “all things should come as near as possible to being
like himself” [5, Part I, p. 217].
Summarizing these points, we see in Plato’s vision that the ultimate nature of things is unity,
which has an entirely unexpressed aspect that Plato calls the One, and an aspect that interfaces
with and gives rise to multiplicity, which he calls the one being. In particular, the one being gives
rise to a threefold division intellect, intellection, intelligible, which in turn unfold into additional
triads and ultimately unfold as the world of forms. These divisions—the one being, the threefold
division, the triads, and the realm of forms—are to be understood as one reality, in more or less
elaborated form. Each can be viewed, in the sense we intend in this paper, as a blueprint for
manifest existence. Then, manifest existence arises as an image of this blueprint, architected by
a fundamental principle of intelligence.
Another approach that considers manifest existence to be an expression of a fundamental
blueprint is given in Maharishi Mahesh Yogi’s discussion of the Vedic literature. In the Vedic
conception, the first step in the move of wholeness—the ultimate totality—within itself involves
a gathering of all its unbounded potential into a single focal point; this point of focus is said to
be a point of “infinite dynamism,” teeming with the tremendous power and possibilities of the
Taylor hastens to distinguish ordinary discursive reasoning, which arises from the dianoetic power of the soul, from
the “intellection” described by Simplicius, which arises from the noetic power of the soul:
The rational and gnostic power of the soul receive a triple division: for one of these is opinion, another,
the dianoetic power, and another intellect. Opinion therefore is conversant with the universal in
sensibles, which also it knows, as that every man is a biped, and that all color is the object of sight.
It likewise knows the conclusions of the dianoetic energy, but it does not know them scientifically.
For it knows that the soul is immortal, but is ignorant why it is so, because this is the province of the
dianoetic power.. . . For, the dianoetic power, having collected by a syllogistic process that the soul is
immortal, opinion receiving this conclusion knows this alone, that it is immortal. But the dianoetic
power. . .[makes] a transition from propositions to conclusions.. . . But the province of the intellect is
to dart itself, as it were, to things themselves by simple projections, like the emissions of visual rays,
and by an energy superior to demonstration. And in this respect intellect is similar to the sense of
sight, which by simple intuition knows the objects which present themselves to its view. [35, Vol. 1,
pp. 353–4]
Absolute.96 This focal point arises in the collapse of unboundedness;97 this collapse, according
to Maharishi, is captured in the first syllable, AK, of Rik Veda, where the letter ‘A’ embodies
the unbounded value (pronounced with a fully open throat) and ‘K’ embodies the point value
(pronounced with full restriction of the throat). In this way, the letter ‘K’ represents this focal
point of all possibilities and infinite dynamism.98
In the Vedic conception, this focal point begins to expand, step by step, into the fundamental frequencies of which the manifest universe is composed. Collectively, these frequencies
are known as the Veda and form a blueprint for the manifest universe. The term Veda is translated as “knowledge”; these fundamental frequencies represent the self-knowledge that emerges
sequentially within the pure wakefulness of wholeness.99
In the following passage, the Rik Veda itself describes its own unfoldment from its first
syllable AK [8, pp. 180–181]:
ṛicho akshare parame vyoman
yasmindevā adhi vishve nisheduḥ
The verses of the Veda exist in the collapse of fullness in the transcendental field, in
which reside the fundamental frequencies responsible for the whole manifest universe.
–Ṛk Veda I.164.39
The expression “verses” in the above passage is also sometimes translated “frequencies.” The
quotation indicates how, within wholeness, the “transcendental field,” the basic frequencies of
the universe emerge in the collapse of the unbounded fullness of the Absolute.
The fact that this collapse is a gathering of potential to a point is important and is brought
out by a basic feature in the structure of Vedic Sanskrit: The structure of the Vedic literature
and the Vedic language is such that the first word in a verse, hymn, or longer section in the Veda
contains in seed form the total content of this verse, hymn or section. Moreover, the sound of
96
Like Plato, Maharishi also considers the ultimate wholeness to have two aspects: one that is entirely unexpressed, which he calls singularity or pure Samhita (“Samhita” can be translated as “unity”), and the other, Samhita of rishi,
devata, and chhandas (that is, knower, knowing, and known in their unified state). In our discussion here, we focus
on the latter since it is this aspect that provides a blueprint for the universe. In the following quotation, Maharishi
describes this distinction between the two aspects of the ultimate and how these two aspects of wholeness are
related to each other:
Veda equates with the unmanifest self-referral intelligence of Samhita, which conceives of the three
qualities of rishi, devata, and chhandas within its own self-referral singularity—singularity finds diversity within its structure. [28, p. 61]
97
On this point, Maharishi remarks, “It is interesting to observe that A in its continuum stands for the continuum
of silence, and the collapse of this infinite value onto its point value generates dynamism within the nature of silence,
and this silent dynamism is the lively home of all the Laws of Nature from where specific Laws of Nature emerge
within the wholeness of the total potential of Natural Law in A.” [25, p. 620]
98
Maharishi remarks, “Vedic Science identifies ‘K’ as the point of all possibilities . . . from which the whole
sequential progression of the Samhita emerges simultaneously. This locates in ‘K’ the infinite creative potential of
nature that in one stroke can give expression to the infinite diversity of creation” [29].
99
In a quote from Maharishi cited in [36, p. 218], we read that the Veda is “a beautiful, sequentially available
script of nature in its own unmanifest state, eternally functioning within itself, and, on that basis of self-interaction,
creating the whole universe and governing it.”
this first word—indeed, even the first syllable—is said to contain the fundamental impulses or
frequencies that structure the content of the entire verse, hymn or section. Maharishi describes
this structure of the Veda as a “self-created commentary” (Apaurusheya Bhasya), since later
stages of unfoldment express in ever greater detail the totality of knowledge inherent in the
earlier stages [25, pp. 495–500].
Not only does the Veda give rise to every detail of manifest creation, but it does so, according
to Maharishi, by virtue of its nature as simultaneously infinitely dynamic and infinitely silent; all
opposite values find their lively integration within this field.100 This high degree of integration
is due to the fact that at each stage of unfoldment, the Veda remains completely self-referral
and united with itself; the parts of expression never dominate the original underlying wholeness.
This integration within the Veda is responsible for all unity and coherence displayed in manifest
existence.
How then does the Veda, as blueprint of the universe, actually give rise to the universe?
According to Maharishi Vedic Science, the transformational dynamics by which the universe
“emerges” from the Veda is a principle of intelligence within the nature of pure consciousness
itself, called vivarta. “The principle of vivarta makes . . . the Veda and Vedic Literature appear
as Vishwa [the universe]” [25, pp. 377, 589]. So, in reality, the universe is not produced by the
Veda; rather, by a power embedded in intelligence itself, the Veda appears as the universe.101
Summarizing the discussion so far, we list some of the key characteristics of the Veda, as a
blueprint for manifest existence; we compare these points with the points we extracted earlier
from Plato’s philosophy.
(1) Every detail of the manifest universe arises from Veda; this manifestation or appearance
is by virtue of an aspect of intelligence—vivarta—inherent in pure consciousness itself. In
Plato, likewise, the world of forms provides a template for the manifest world, which arises
by virtue of the dynamics of intelligence embodied in the demiurge.
(2) Maharishi’s Apaurusheya Bhasya asserts that knowledge of the whole of Veda is present
in expanding packets of knowledge, in increasingly elaborated forms: From the first letter,
to the first syllable, to the first pada, to the first richa, to the first mandala, to the entire
Veda. Each contains the totality of knowledge. In Plato, also, we see that the world of
forms is appreciated in different degrees of elaboration—first as one being, then as the triad
intellect, intellection, intelligible, and ultimately as the full range of eternal forms. Each
stage of this elaboration is understood to be fundamentally the same as all the others,
differing only in degree of elaboration.
(3) The Veda embodies, in its unfoldment, simultaneously infinite silence and infinite dynamism. Recall also that two of the genera of being in Plato’s account of forms are permanence and motion, which resemble closely the Vedic notions.
(4) Everything that emerges from the Veda arises in the collapse of fullness to a point. It is
not clear from Plato’s writings that he held a similar view, but Neoplatonist commentators
100
See for example [25, p. 566].
101
Maharishi remarks, “Vishwa (universe) is not the parinam (result) of Veda, it is the vivarta of Veda” [25, p.
589].
observe [35] that inspiration for Plato’s doctrine of forms (or “ideas” as they are sometimes
called) came in part from the Chaldean oracles. Proclus in particular traces Plato’s forms
to a particular poem from the Chaldeans in which it is stated, “The intellect of the Father
made a crashing noise, understanding with unwearied counsel omniform ideas. But with
winged speed they leaped forth from one fountain: for both the counsel and the end were
from one Father” [35, Vol. 3, p. 20]. This “crashing” has been interpreted as the first
sprouting of the world of forms, and perhaps parallels the Vedic conception of “collapse.”
We now turn to a summary of characteristic features of a Laver sequence, derived from
ZFC + WA, as a blueprint for the sets in V . The numbering of the points below is intended to
correspond to the numbering in the list above.
In the points below, we have taken certain liberties in our mathematical use of the term
“blueprint.” Formally, a blueprint for V has been defined to be a triple (ℓ, κ, E). However, ℓ was
defined on the basis of a Laver sequence f : κ → Vκ . For this discussion, therefore, it will be
convenient at times to refer to the Laver sequence f , rather than ℓ itself, as the blueprint for V .
(1) As we already discussed (Remark 21), every set in the universe arises from the blueprint
(ℓ, κ, E), formed through the interaction of a WA-embedding j : V → V with its critical
point κ. Moreover, the “decoding” of this blueprint is by virtue of mappings that resemble
j itself (reminiscent of the demiurge transforming the world of forms into the universe, in
such a way that “all things should come as near as possible to being like himself”).
(2) Maharishi’s Apaurusheya Bhasya, and the sequence of elaborations of Plato’s one being, have
a parallel in the structure of a Laver sequence f : κ → Vκ under WA: For almost all α < κ,
f ↾ α is itself Laver, another blueprint for the universe.
Proposition 86. Suppose j : V → V is a WA-embedding with critical point κ and let E be
the class of extendible embeddings with critical point κ.
(A) Let D = {X ⊆ κ | κ ∈ j(X)}. Let f : κ → Vκ be a Laver sequence at κ. Then
(+) {α < κ | f ↾ α is Laver at α} ∈ D.
(B) There is a normal measure U on κ such that if X is defined by
X = {α < κ | f (α) = f ↾ α and f ↾ α is Laver at α},
then X ∈ U . In other words, for almost all α, the αth term of f is f ↾ α, which is itself
a Laver sequence at α.
Proof. Notice that whenever i ∈ E or i = j, i(f ) ↾ κ = f : For each α < κ,
i(f )(α) = i(f )(i(α)) = i(f (α)) = f (α).
The last step follows because f (α) ∈ Vκ .
Proof of (A). Applying j to the set (+), it is clear that
κ ∈ {α < j(κ) | j(f ) ↾ α is Laver at α}
since j(f ) ↾ κ = f .
Proof of (B). Let i ∈ E be such that i(f )(κ) = f and let U = {X ⊆ κ | κ ∈ i(X)}. Then
the result follows from the fact that
κ ∈ {α < i(κ) | i(f ) ↾ α = i(f )(α) and i(f )(α) is Laver at α}.
(3) The fact that the Veda simultaneously incorporates infinite dynamism and infinite silence,
and that Plato’s one being embodies at once permanency and motion, find a parallel in the
structure of a Laver sequence f in the following ways: Almost all terms of f are “silent,”
and, at the same time, almost all terms are “dynamic.” Relative to a Laver function f ,
we interpret “silence” or “permanency” at a point α to mean that f does not move α;
that is, f (α) = α. And we consider “dynamism” or “motion” to be represented by rapidly
growing behavior; if a function dominates a known fast-growing function (like a generalized
exponential), we consider that such a function exhibits a high degree of dynamism.
Proposition 87. Suppose j : V → V is a WA-embedding with critical point κ. Let f : κ →
Vκ be a Laver sequence at κ. Then
(A) There is a normal measure U on κ that contains the set {α < κ | f (α) = α}. Therefore,
f is “silent” almost everywhere (relative to U ).
(B) Consider the following generalized exponential function A(α, β), for α, β below κ, defined by the following clauses:
A(α, 0) = 2^α
A(α, γ + 1) = 2^{A(α,γ)}
A(α, λ) = sup{A(α, γ) | γ < λ} (λ a limit)
Then there is a normal measure U on κ such that the set
{ρ < κ | f (ρ) is a cardinal and f (ρ) > A(ρ, ρ)}
belongs to U . Therefore, f is “highly dynamic” almost everywhere (relative to U ).
(C) Assuming102 no Laver sequence at κ is definable in (Vκ , ∈) with parameters,103 there
is a Laver sequence f : κ → Vκ such that for each h : κ → κ definable in Vκ ,
{α | f (α) is a cardinal and f (α) > h(α)}
102
Assuming the existence of an I3 -embedding is consistent with ZFC, the theory ZFC + WA +
“No Laver sequence at κ is definable in Vκ ” is also consistent. This is proven in [6]. The idea is: start with
an I3 -embedding, which directly provides a transitive model of ZFC + WA; then use a modified reverse Easton
forcing to obtain a model of ZFC + WA + V = HOD. In that model, no Laver sequence is definable in Vκ with
parameters.
103
A function q : κ → Vκ is definable in (Vκ , ∈) with parameters if there are sets a1 , . . . , am ∈ Vκ and a formula
ψ(x, y, z, u1 , . . . , um ) such that for all sets X, Y, Z ∈ Vκ
Z = q(X, Y ) if and only if ψ(X, Y, Z, a1 , . . . , am ).
An easy counting argument shows that there are precisely κ functions κ → κ that are definable in (Vκ , ∈) with
parameters; therefore, they can be enumerated as {hξ | ξ < κ}, as discussed in the proof of this result.
belongs to D. In other words, f dominates, on a large set, every definable function
κ → κ.
Proof of (A). Let i ∈ E be such that i(f )(κ) = κ. Let U = {X ⊆ κ | κ ∈ i(X)}. Let
X = {α < κ | f (α) = α}. We claim that X ∈ U , which will complete the proof. We observe
that
κ ∈ i(X) = {α < i(κ) | i(f )(α) = α},
from which it follows that X ∈ U .
Proof of (B). Let i ∈ E be such that i(f )(κ) = 2^{A(κ,κ)}. Let X = {α < κ | f (α) = 2^{A(α,α)}}.
Since A is definable, i(X) = {α < i(κ) | i(f )(α) = 2^{A(α,α)}}, and clearly κ ∈ i(X), so X ∈ U ,
where U = {Z ⊆ κ | κ ∈ i(Z)}. Now let Y = {α < κ | f (α) is a cardinal and f (α) > A(α, α)}.
Clearly X ⊆ Y , so Y ∈ U , as required.
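To get a feel for how quickly the function A grows, here is a Python sketch of its finite analog, restricted to natural-number arguments. This restriction is our own simplifying assumption: for finite arguments only the zero and successor clauses apply, and the limit clause never comes into play.

```python
def A(alpha, beta):
    """Finite analog of the generalized exponential, for natural numbers only.

    A(alpha, 0) = 2**alpha and A(alpha, gamma + 1) = 2**A(alpha, gamma).
    The limit clause of the definition is not needed for finite arguments.
    """
    value = 2 ** alpha
    for _ in range(beta):
        value = 2 ** value
    return value


# Diagonal values A(n, n) for n = 0, 1, 2.
print([A(n, n) for n in range(3)])  # [1, 4, 65536]
```

Already A(3, 3) is a tower of exponentials too large to write out, which gives some indication of the sense of "dynamism" intended when f(ρ) is required to exceed A(ρ, ρ).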
Proof of (C). To ensure that α ↦ f (α) dominates, on a D-measure 1 set, every function
κ → κ that is definable in Vκ (with parameters), we define the sequence t of sets as follows:
Let ⟨hξ : ξ < κ⟩ enumerate the functions κ → κ which are definable in Vκ . Define a function
t : κ → κ by
t(α) = sup{hξ (α) : ξ < α} + 1.
Let φ(g, x) be the formula
φ(g, x) : ∃α [ α is a cardinal ∧ g : α → Vα ∧ ∀η ∀ζ ∀i : Vη → Vζ
( (elem(i, α) ∧ rank(x) < η < i(α) < ζ) → i(g)(α) ≠ x ) ],
where elem(i, α) asserts “i is an elementary embedding with critical point α.”
As in our earlier construction of a Laver sequence, the formula φ(g, x) says that g is not
Laver, with witness x. Define f : κ → Vκ by
f (α) = t(α), if f ↾ α is Laver at α;
f (α) = ∅, if α is not a cardinal;
f (α) = x, otherwise, where x satisfies φ(f ↾ α, x).
Let D = Uj be the normal ultrafilter over κ that is derived from j; that is:
D = {X ⊆ κ | κ ∈ j(X)}.
By Amenability, D is a set, since D = {X ⊆ κ | κ ∈ h(X)}, where h = j ↾ P(κ). As in
our previous argument, define sets S1 and S2 by
S1 = {α < κ | f ↾ α is Laver at α}
S2 = {α < κ | φ(f ↾ α, f (α))}.
Clearly, S1 ∪ S2 ∈ D. We show that S1 ∈ D, and for this, it suffices to show S2 ∉ D.
Toward a contradiction, suppose S2 ∈ D. Then, φ(f, j(f )(κ)) holds in V . Let x = j(f )(κ).
Pick η so that rank(x) < η < j(κ). Let i = j ↾ Vη : Vη → Vζ , where ζ = j(η). By
Amenability, i is a set, and is an elementary embedding with critical point κ. Clearly, in
V , rank(x) < η < i(κ) < ζ and i(f )(κ) = x, contradicting the fact that φ(f, j(f )(κ)) holds
in V . Therefore S2 ∉ D, as required.
Finally, we prove that f has the required additional property. Suppose h : κ → κ is a
function that is definable in Vκ with parameters a1 , . . ., am ∈ Vκ . Let
T = {α | α is a cardinal, f (α) is a cardinal, and h(α) < f (α)}.
Let χ < κ be such that h = hχ (this is possible since the hξ collectively enumerate all
functions κ → κ that are definable from parameters in Vκ ). Suppose α ∈ S1 ∩ (χ, κ). Then
f (α) = t(α) = sup{hξ (α) : ξ < α} + 1 ≥ hχ (α) + 1 > hχ (α).
Therefore T ⊇ S1 ∩ (χ, κ) ∈ D, and so T ∈ D. We have shown that f dominates each
definable function κ → κ on a set of D-normal measure 1, as required.
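The diagonal construction of t can be illustrated with a finite analog in Python, with the natural numbers standing in for κ and a finite list of functions standing in for the enumeration of definable functions. These substitutions, the sample functions, and the name diagonal_bound are our own assumptions, made only to show the mechanism: the function numbered k is dominated at every argument beyond k.

```python
def diagonal_bound(functions):
    """Return t with t(n) = max{h_k(n) : k < n} + 1 (the max of no values is 0)."""
    def t(n):
        earlier = [h(n) for h in functions[:n]]
        return max(earlier, default=0) + 1
    return t


# A hypothetical enumeration h_0, h_1, h_2, h_3 of functions on the naturals.
hs = [
    lambda n: n,
    lambda n: n * n,
    lambda n: 2 ** n,
    lambda n: n ** 3 + 7,
]
t = diagonal_bound(hs)

# For each k, t exceeds h_k at every argument n > k.
dominates = all(t(n) > hs[k](n) for k in range(len(hs)) for n in range(k + 1, 50))
print(dominates)  # True
```

The argument in the proof has the same shape: for α in S1 beyond the index χ of h, the value f(α) = t(α) exceeds hχ(α).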
(4) Just as everything in existence emerges from the collapsing dynamics of the Veda, so likewise
every set in V arises from the collapsing dynamics of the blueprint (ℓ, κ, E). We indicate
here some highly “self-referral” dynamics that are responsible for the emergence of several
particularly interesting sets: The critical point κ, the Laver sequence f itself, and each
extendible elementary embedding e : Vη → Vξ with critical point κ. As usual, j : V → V
denotes a WA-embedding with critical point κ and f : κ → Vκ is a Laver sequence at κ,
and E is the class of extendible embeddings with critical point κ.
(A) Collapse giving rise to κ. Let i ∈ E with i(f )(κ) = κ. As we showed before, such an i
forces it to be the case that for some normal measure U on κ, the set {α < κ | f (α) = α}
belongs to U .
(B) Collapse giving rise to f itself. Let i ∈ E with i(f )(κ) = f . Then there must be a
normal measure U on κ (namely, {X ⊆ κ | κ ∈ i(X)}) so that the set {α < κ | f (α) =
f ↾ α} belongs to U . This follows from the fact that f = i(f ) ↾ κ, as we observed
earlier. This phenomenon, that for so many α we have f (α) = f ↾ α, is reminiscent of
the fact that the Veda is structured in self-referral loops;104 almost every value of f is
structured in terms of the wholeness of the Laver sequence.
(C) Collapse giving rise to each extendible embedding e : Vη → Vξ . Let i ∈ E be such that
i(f )(κ) = e. Then there is a normal measure U on κ such that
{α < κ | f (α) is a nontrivial elementary embedding of the form Vη → Vξ }
belongs to U . From this point of view, the range of f contains almost exclusively
extendible embeddings! This is evidence of another kind of dynamism inherent in the
blueprint.
104
See [28, pp. 63–64].
25 Conclusion
We began our discussion of the set of natural numbers with a brief look at the viewpoint of the
ancients. We found that ancient traditions of knowledge have recognized the natural numbers as
having a common source, a unified foundation, whose internal dynamics lead to the sequential
emergence of these numbers, while at the same time remaining connected to their source. We
also recalled the Maharishi Vedic Science admonition that failure to view the natural numbers as
connected to their source marks the beginning of ignorance. Maharishi’s name for the self-referral
dynamics at the source of all the natural numbers is the Absolute Number.
We went on to observe that the usual way of asserting the existence of the set N of natural
numbers axiomatically, in the foundations of mathematics, tends to disregard the question of the
possible “origin” of this set, and instead, simply asserts that it exists. We suggested that failure
to view N as originating from something more fundamental has led to unnecessary limitations on
the mathematician’s intuition regarding the infinite in mathematics. This difficulty, we argued,
becomes apparent when one attempts to address the Problem of Large Cardinals.
Trying to determine which of the known large cardinals should be taken to be “real” entities
in the universe of sets requires a deeper intuition about the nature of the mathematical infinite.
If one attempts to acquire the necessary insight by examining more deeply the foundational ZFC
axioms, one is led to the Axiom of Infinity, which, in simply asserting that the set N exists, tells
us very little about which large cardinals exist.
We have suggested, in accord with the ancient view of the matter, that the natural numbers
should be seen as emerging from self-referral dynamics of an unbounded field, and that it is these
dynamics that should be declared by the Axiom of Infinity to exist.
Moreover, we have shown that this philosophy is easy to implement: Our New Axiom of Infinity states that a Dedekind self-map—a map from a set to itself that is 1-1 but not onto—exists.
The axiom emphasizes the existence of transformational dynamics rather than the existence of
discrete values. We then went to considerable lengths to show how the sequence of natural numbers can be derived from this new axiom, without any reliance on the natural numbers in the
process.
We then were in a position to verify whether this new perspective could shed more light on
questions about the mathematical infinite than the old perspective. It seemed natural, considering
the formulation of the New Axiom of Infinity, to ask whether large cardinals, like the sequence
of natural numbers, might be understood as emerging from some kind of Dedekind self-map; we
suggested looking at such self-maps that were themselves proper classes.
We found that proper class Dedekind self-maps are weaker than Dedekind self-maps that are
sets, and do not imply the existence of infinite sets. We showed however that slight strengthenings
of the properties of the proper class maps did allow us to derive infinite sets, and we argued
that the strategies for strengthening the maps in these ways could serve as further guidance
in formulating axioms to account for large cardinals. Moreover, the strategy of strengthening
preservation properties was inspired by an examination of the characteristics of set Dedekind
self-maps themselves: Dedekind self-maps always preserve the character of their domain—the
image B of a Dedekind self-map j : A → A is itself Dedekind-infinite—and, in turn, j induces
another such map, obtained by restricting j to B.
Our efforts to find appropriate preservation properties that a Dedekind self-map j : V → V
might have, which would give rise to an infinite set, led to the following results: Existence of an
infinite set is derivable from each of the following.
(1) A Dedekind self-map j : V → V that preserves disjoint unions, the empty set, and singletons.
(2) A Dedekind self-map j : V → V with critical point a with the following properties:
(a) j preserves disjoint unions, intersections, terminal objects, and the empty set.
(b) a ∈ j(A) is a universal element for j, for some set A.
(c) The set {a} is also a critical point of j.
(3) A self-map j : V → V with a strong critical point that preserves finite coproducts and
terminal objects.
Moreover, we showed that these properties, in each case, are consistent by giving examples
under a suitable axiom of infinity. (For (2) and (3), we could provide an example assuming
only the existence of ω. For (1), our only example was a model of ZFC + BTEE—the example
given was a nontrivial elementary embedding j : L → L; this example, however, requires large
cardinals.105)
In another direction, we also observed that an essential characteristic of a Dedekind self-map
j : A → A is the presence of a critical point a. We found that the sequence of natural numbers
could be computed through interaction between j and its critical point; indeed, we discussed in
some detail the sense in which the sequence {a, j(a), j(j(a)), . . .} forms a blueprint for the set
ω of natural numbers, developing a reasonably precise notion of blueprint. The significance of
the critical point in the study of Dedekind self-maps suggested the possibility that large cardinals
themselves could arise from the interaction between a proper class Dedekind self-map with its
critical point, and that some parallel notion of “blueprint” for large sections of the universe should
be definable.
We saw an example of the role of the critical point for a suitably strong j : V → V in the
construction of the Lawvere functor j : V → V, j = G ◦ F. In that construction, we found that
the notion of the infinite (in the form of a Dedekind-infinite set and corresponding Dedekind
self-map) first emerges as the image of the critical point of j: j(1) = X1 = {η0, f1(η0), f1²(η0), . . .}
(where f1 = F(1)).
Using these tools for generalization, we showed how certain combinations of preservation
properties belonging to a Dedekind self-map j : V → V had large cardinal consequences. For
instance, if j : V → V is a Dedekind self-map with critical point κ, κ must be inaccessible if the
following properties hold:
(A) j preserves ∈
(B) j preserves ordinals
(C) j is BSP (Definition 6, p. 88)
(D) j preserves countable unions
(E) j preserves the empty set
(F) j preserves singletons
(G) j preserves unboundedness
(H) j preserves power sets.
Footnote 105: It is unlikely that a model of (1) actually requires large cardinals, but we do not have a proof at this time.
Moreover, we showed that these properties are consistent by showing that for any model
⟨V, ∈, j⟩ of ZFC + BTEE, these properties are satisfied by j (once again, an example of such a
model is given by any nontrivial elementary embedding j : L → L).106
Another result in this direction was that, assuming j : V → V is an (essentially Dedekind)
functor with a strong critical point and a weakly universal element a ∈ j(A) for j, for some set A,
then, if j has the additional preservation properties listed below, there must exist a nonprincipal
ω1-complete ultrafilter on A; in fact, there must exist a measurable cardinal ≤ |A|.
(A) j preserves countable disjoint unions
(B) j preserves intersections
(C) j preserves equalizers
(D) j preserves the empty set
(E) j preserves terminal objects.
Consistency of these properties follows from the existence of a canonical model that can be
built assuming the existence of a κ-complete nonprincipal ultrafilter on some uncountable cardinal
κ; in other words, assuming existence of a measurable cardinal is enough to produce a model of
all the preservation properties just mentioned. These latter results were improved slightly and
expressed more elegantly by Trnková and Blass, who showed that there is a measurable cardinal if
and only if there is an exact functor j : V → V having a strong critical point.
To form still stronger statements, using these same tools, we strengthened the preservation
properties even further. We considered stronger models ⟨V, ∈, j⟩ of ZFC + BTEE. For such
models, because of limitative results due to Kunen, we observed that j cannot be definable in V;
however, it was observed that the formulation of these axioms using an extended language of set
theory in which there is an extralogical symbol j averts the Kunen inconsistency.
Recall that BTEE is a collection of axioms that collectively assert that j is an elementary
embedding V → V with a least ordinal moved, denoted κ. Such embeddings are necessarily
Dedekind self-maps, but are neither sets nor proper classes; they resemble proper classes because
they have values arbitrarily high in V, but they are not defined by a single ∈-formula.
Footnote 106: Existence of such a model requires considerably more than an inaccessible (more than an ineffable cardinal is
required). We do not know at this time how to build such a model assuming only a single inaccessible.
We observed that the critical point κ of a BTEE embedding j : V → V is weakly compact;
however, the theory ZFC + BTEE is not strong enough to derive a measurable cardinal; in
particular, κ is not measurable. The natural choice for defining a normal measure on κ in the
theory ZFC + BTEE, namely,
(∗) U = {X ⊆ κ | κ ∈ j(X)},
is not available in this theory (that is, it is not a set in any model of the theory). Having seen
by the Blass-Trnková result the naturalness of a measurable cardinal, and also having seen the
naturalness of the notion of “nonprincipal ultrafilter” on a set A, we introduced the postulate
MUA, which asserts that this naturally defined normal measure U, as defined in (∗), exists.
The theory ZFC + BTEE + MUA renders the critical point κ of the embedding j : V → V
a very strong kind of measurable cardinal. In addition to the fact that j exhibits the strongest
possible preservation properties and strong large cardinal properties at its critical point, the
theory is now strong enough to exhibit the beginnings of an interesting blueprint: From ZFC +
BTEE + MUA, we can derive the existence of a blueprint (ℓ, κ, E) for V_{κ+1}, where E consists of
ultrapower embeddings i_U : V_{κ+1} → M, having critical point κ.
Still motivated by the properties that we originally found present in a set Dedekind self-map
(namely, the existence of a blueprint for ω), we sought to strengthen the theory ZFC +
BTEE + MUA even further. One objective was to see whether it is possible to have a j : V → V
that yields a blueprint of all of V rather than just an initial part of V. The direction for
strengthening the theory further appeared when we attempted to generalize another feature of
a set Dedekind self-map, namely, that many of the “interesting things” about the self-map arise
when we consider successive restrictions. In the case of a Dedekind self-map j : A → A with
critical point a, the sequence a, j(a), j(j(a)), . . . arose from a consideration of restrictions of j to
successive ranges: First j_B = j ↾ B : B → B, where B = ran j; then j_C = j ↾ C : C → C, where
C = ran j_B, and so forth. A critical point of j_B is j(a); a critical point of j_C is j(j(a)), and so
on.
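The pattern of successive restrictions can be tried out on a concrete map. The following is a minimal sketch, not taken from the paper: the map j(n) = 2n + 1 on the natural numbers and the helper names `critical_sequence` and `in_nth_range` are illustrative choices, and "critical point" is read here as an element lying outside the range of the map.

```python
from itertools import islice


def j(n: int) -> int:
    """Illustrative Dedekind self-map on the naturals: 1-1, but 0 is never a value."""
    return 2 * n + 1


def critical_sequence(f, a):
    """Lazily yield a, f(a), f(f(a)), ..."""
    x = a
    while True:
        yield x
        x = f(x)


def in_nth_range(x: int, n: int) -> bool:
    """Decide whether x lies in ran(j^n), i.e. x = j^n(y) for some natural y.

    For this particular j, one inverse step sends an odd x to (x - 1) // 2;
    even numbers have no preimage.
    """
    for _ in range(n):
        if x % 2 == 0:
            return False
        x = (x - 1) // 2
    return True


# First few terms of the critical sequence starting at the critical point 0.
first_five = list(islice(critical_sequence(j, 0), 5))

# The n-th term lies in ran(j^n) but not in ran(j^(n+1)):
# it is a critical point of j restricted to the n-th successive range.
checks = [in_nth_range(x, n) and not in_nth_range(x, n + 1)
          for n, x in enumerate(first_five)]
```

With B = ran j and C = ran j_B = ran(j∘j), the entries of `checks` record that j(0) is outside the range of j_B and j(j(0)) is outside the range of j_C, matching the description above for this one example.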
When we attempt a similar sequence of steps in a model (V, ∈, j) of ZFC + BTEE + MUA,
we immediately discover the limits of the theory; while we can define the restriction j ↾ Vκ :
Vκ ↪ Vj(κ) in the theory, we cannot take the next step suggested by the set Dedekind self-map
dynamics: The restriction j ↾ Vj(κ) is not a set in this theory; the reason is that existence of such
a map (as a set in the universe) implies large cardinals that are stronger than the theory itself
allows. Indeed, asserting the existence of restrictions of j to various power sets of κ and j(κ)
results in a number of extremely strong theories, strong enough to derive existence of some of the
strongest large cardinals; for instance, the theory ZFC + BTEE + ∃z (z = j ↾ P(P(j(κ)))) proves
that κ is the κth huge cardinal.
We also discovered where the limit of generalization in this direction lies: We cannot have
a consistent theory that allows j ↾ λ to be a set where λ lies above the critical sequence κ, j(κ),
j(j(κ)), . . . . This tells us that to carry this direction of generalization as far as possible, we require
(if possible) that the critical sequence κ, j(κ), j(j(κ)), . . . is unbounded in the class of ordinals.
Following this line of reasoning, we were led to the Amenability Axiom, which states that
the BTEE-embedding j should have the additional property that for any set X, j ↾ X is also a
set; again, this theory has a chance to be consistent only if the critical sequence of j is unbounded
in ON.107
Footnote 107: It turns out that the theory ZFC + BTEE + Amenability is indeed consistent, relative to another very strong
large cardinal notion, called I3. In any such model, the critical sequence of j is indeed unbounded in the ordinals
of the model.
The axioms BTEE together with the Amenability Axiom are known collectively as the (weak)
Wholeness Axiom, or WA. We indicated earlier that the theory ZFC + WA is enough to imply
existence of virtually all large cardinals; moreover, if j : V → V is a WA-embedding with critical
point κ, we found that the “place of origin” of all large cardinal properties is κ itself, so, in this
sense, our earlier observation that the natural numbers emerge from dynamics of a Dedekind
self-map interacting with its critical point is realized on the larger scale of emergence of large cardinals.
Moreover, in the theory ZFC + WA, we also showed that our objective regarding construction
of a blueprint for all sets is finally achieved; we exhibited a Laver sequence f : κ → Vκ (based
on extendible embeddings) for V, and showed how existence of f leads to a blueprint (ℓ, κ, E) for
V itself, where κ is the critical point of the WA-embedding j and E consists of all extendible
embeddings with critical point κ.
We also showed that the theory ZFC + WA (in its stronger form, where WA = BTEE +
Separation_j) is strong enough to include as sets all restrictions of j of the form
j ↾ Vκ, j ↾ Vj(κ), j ↾ Vj(j(κ)), . . . ,
providing an alternative way to generate the critical sequence κ, j(κ), j(j(κ)), . . . and also
providing a satisfying generalization of the restriction properties of set Dedekind self-maps.
We also showed that another sequence of restrictions of a WA-embedding j : V → V is
possible in the theory ZFC + WA, namely, the sequence
\[
\begin{array}{lcl}
j \restriction V_\kappa & \equiv & V_\kappa \prec V_{j(\kappa)} \\
j(j \restriction V_\kappa) = (j \cdot j) \restriction V_{j(\kappa)} : V_{j(\kappa)} \hookrightarrow V_{j(j(\kappa))} & \equiv & V_{j(\kappa)} \prec V_{j(j(\kappa))} \\
j((j \cdot j) \restriction V_{j(\kappa)}) = j^{[3]} \restriction V_{j(j(\kappa))} : V_{j(j(\kappa))} \hookrightarrow V_{j(j(j(\kappa)))} & \equiv & V_{j(j(\kappa))} \prec V_{j(j(j(\kappa)))} \\
& \vdots &
\end{array}
\]
leading to the remarkable fact that Vκ ≺ Vj(κ) ≺ · · · ≺ V.
An easy proof then demonstrated that the critical sequence S of a WA-embedding j : V → V
with critical point κ turns out to be yet another blueprint for ω, just as in the case of (ordinary)
set Dedekind self-maps, except in this case, all elements of the blueprint have “full knowledge”
of the universe V itself since all statements σ (in the language {∈}) hold in V if and only if they
hold in any one of the models V_{j^n(κ)}. In this way, the elements of S are like Maharishi’s Absolute
Number.
Studying S further, we were led to a deeper investigation of the application operation · on the
collection A_j of elementary embeddings V → V with fixed critical point κ, obtained by applying
j to itself finitely many times in all possible ways. This investigation led to the discovery that
A_j admits a natural metric that renders it a dense-in-itself metric space, whose completion Ā_j in
ω^ω forms a nowhere dense, perfect Polish space, which admits a finite, non-zero, diffused Borel
measure. We arrived at the conclusion that we can obtain from Ā_j a reasonably natural Hilbert
space L²(Ā_j, µ), which could in principle be used in the context of quantum mechanics. This
discovery shows how, having discovered a blueprint for ω consisting of all “absolute” elements,
we could expand this discovery to include a rather concrete mathematical object, also composed
entirely of “absolute” elements.
Having arrived at a reasonable generalization of the characteristics of a Dedekind self-map
to give an account of large cardinals, we are in a position to comment on the naturalness of
this solution. In the first place, the path we have taken, starting from a set Dedekind self-map
i : A → A, extending to self-maps V → V and incrementally strengthening them by gradually
introducing generalizations of characteristics of i, already provides considerable evidence for the
naturalness of the theory ZFC + WA.
Arguing from another perspective, as in Section 19, one can use as a criterion for naturalness a
parallel criterion for existence of an infinite set. We suggested in that section that the formulation
There is an infinite set if and only if the forgetful functor SM → Set has a left adjoint
illustrates naturalness of infinite sets in a particularly striking way, since it shows how the concept
of an infinite set arises in the same way as so many other mathematical structures have arisen:
from an adjoint situation, or equivalently, from a monad j : V → V . We suggested in that
discussion (p. 118) that if it turned out that large cardinals also arise from a “monad-like”
Dedekind self-map j : V → V , appearing first in the vicinity of the least critical point of j, then
we would have similar evidence of the naturalness of large cardinals. The j : V → V that arises
from the Wholeness Axiom satisfies these criteria to a large extent, though not completely. While
it is unlikely that j itself is a monad (for example, it is known that κ ∈ j(κ) is not a universal
element of j), large cardinals do arise in the vicinity of its critical point, and the existence of a
Laver sequence seems to be a reasonable strengthening of the idea of a universal element: One
can show, for example, that, if f is a Laver function obtained from j, then
V = {i(f)(κ) | i ∈ E},
where E is the class of extendible embeddings with critical point κ.
In this sense, then, the j : V → V provided by the Wholeness Axiom can be viewed as natural
not just because of the natural process of generalization from existence of Dedekind self-maps
that was employed, but also because it reflects to a large extent the highly natural dynamics of
the adjunction relationship, by which the very notion of “infinite set” can be seen to emerge in
mathematics.
We began this paper by proposing a new Axiom of Infinity, that would reflect some principles
of ancient and modern wisdom regarding the possible “origin” of the set of natural numbers. We
suggested that by viewing the concept of an infinite set, particularly the set of natural numbers,
as fundamentally the self-referral dynamics of an unbounded field, rather than a bag of discrete
points, we would be able to approach questions about the infinite in mathematics, like the Problem
of Large Cardinals, with more clarity and a stronger intuition. By now, we have demonstrated
our claim reasonably well, and in so doing, justified to a large degree this new viewpoint about
the “underlying reality” of the set of natural numbers.
We summarize our conclusions in the following short list of assertions:
(1) The infinite in mathematics is, fundamentally, the self-referral dynamics of an unbounded
field.
(2) The set of natural numbers is a side effect of the dynamics of a Dedekind self-map.
(3) Large cardinals are justifiable entities in mathematics because they arise through the same
natural means and self-referral dynamics as the set of natural numbers itself.
26 Appendix I: A First Look at How 1 Emerges from 0
We have discussed in this paper how notions of the mathematical infinite—from the bare existence
of an infinite set to large cardinals—arise from the dynamics of Dedekind self-maps, which can
be seen as mathematical representatives of the self-referral dynamics of an unbounded field. In
particular, we have seen how the “Infinite” emerges in the hereditarily finite world HF by the
introduction of a set Dedekind self-map, and how large cardinals can be seen to emerge into the
world of ZFC by way of class Dedekind self-maps with strong preservation properties. In both
cases, Dedekind self-maps can be seen as the catalysts that drive expansion from ZFC − Infinity
to ZFC and from ZFC to ZFC + large cardinals.
With these insights in mind, we venture into a rather speculative topic: How does the empty
set ever give rise to a nonempty set? How does 0 give rise to 1? How does something arise
from nothing? In the set theoretic universe V , to answer this question is to shed light on the first
sprouting of the unmanifest into manifestation. Our proposal is to view the transition from 0 to 1
as being of the same character as the expansions described in the previous paragraph; indeed, that
this fundamental transformation, from 0 to 1, is then reflected outwardly as the more elaborate
transformations from ZFC − Infinity to ZFC and from ZFC to ZFC + large cardinals. This
theme of unfoldment reflects, in the world of mathematics, the analysis of the structure of pure
consciousness as described in Maharishi’s Apaurusheya Bhasya: The transformational dynamics
that are found in the sequential unfoldment of the Ved, and ultimately of the creation itself, are
to be traced back to the very first transformation—the transformation from ‘A’ to ‘K’. In this
collapse of unboundedness to a point is found the seed of all transformations that emerge from
it. Likewise, we hope to locate within this first stage of unfoldment—from 0 to 1—the dynamics
at work in the larger-scale transitions within the dynamics of V .
As a first naïve attempt at answering this question, we consider whether there could exist
anything like a Dedekind self-map from 0 to 0. We can certainly get started by examining any
map j : ∅ → ∅. But there is only one such map, namely, the empty map, which is in fact simply
∅ itself. In particular, since domain, codomain, and map are all empty, there can be no critical
point of such a map, no seed that can germinate into something that is not empty. j : ∅ → ∅
represents unmovable silence, a stillness that never changes, a state of unperturbable balance and
symmetry. How does the emptiness of the empty set develop into concrete, bounded, nonempty,
individual sets?
Maharishi remarks [27, p. 286], “...the [empty] set is the self-perpetuating source of all
sets;” in the same sentence, he says, “We have seen in mathematics that the consciousness of the
mathematician is the source of all order....”. He also remarks [27, p. 294],
In order to bring about the transformation of anything into anything, the intelligence
of the [mathematician] must operate from the transcendental level, which is the basic
level of intelligence of every object in creation.
These points remind us of something that is in a sense obvious: The impulse of intelligence that drives the empty map j : ∅ → ∅ to expand into the universe is the impulse of the
transcendental level of the consciousness of the mathematician. The empty set is the unshaped
all-possibilities source of all sets, but the impulse for this emptiness to take shape as individual
sets does not arise from the objective form on its own.108 It is consciousness itself that is responsible for locating and bringing into expression the potential that is locked in the empty set. In
this way, the empty set can be seen as the meeting point between the objective world of sets and
mathematics and the subjective reality of consciousness.
The impulse of consciousness that causes the sprouting of j : ∅ → ∅, we suggest, finds
expression in what we could call unmanifest flows—transformational dynamics of wholeness that
are realized as proper class maps, like the global successor function s and the power set operator
P. We discovered that maps of this type do not have much strength on their own—for instance,
they do not imply the existence even of an infinite set—but they provide a context in which
manifestation occurs.
To show how all-pervasive proper class maps really are, one can prove, for example, that
every (set) Dedekind self-map is obtained from a proper class Dedekind self-map by restriction:
Let f : A → A be a Dedekind self-map with critical point a. Define f̄ : V → V by
\[
\bar f(x) =
\begin{cases}
f(x) & \text{if } x \in A \\
x & \text{if } x \notin A.
\end{cases}
\]
Certainly f̄ is a proper class map with the property that f = f̄ ↾ A. But now, even though
this is the case, it is not generally true, as we have already seen, that for a given proper class
Dedekind self-map j : V → V with critical point a there is any set A at all that contains a
and is j-closed; in other words, none of the restrictions of j to sets may turn out to be Dedekind
self-maps. The conclusion is that the notion of a Dedekind self-map that is a proper class is what
we would call an unmanifest flow, creating a possibility but not quite able to manifest on its own.
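The assertion that the extended map is again a Dedekind self-map is quick to verify. A sketch of the check follows, writing $\bar f$ for the extension of f to V defined above (the bar is notation assumed here to keep the two maps apart) and reading "critical point" as an element outside the range, as in the set case:

```latex
\textbf{Injectivity.} Let $x \neq y$ in $V$.
\begin{itemize}
  \item If $x, y \in A$, then $\bar f(x) = f(x) \neq f(y) = \bar f(y)$, since $f$ is 1-1.
  \item If $x, y \notin A$, then $\bar f(x) = x \neq y = \bar f(y)$.
  \item If $x \in A$ and $y \notin A$, then $\bar f(x) = f(x) \in A$ while $\bar f(y) = y \notin A$.
\end{itemize}

\textbf{Not onto.} Suppose $\bar f(x) = a$. If $x \in A$, then $f(x) = a$,
contradicting $a \notin \operatorname{ran} f$. If $x \notin A$, then
$x = \bar f(x) = a \in A$, a contradiction. Hence $a \notin \operatorname{ran} \bar f$,
and $a$ is a critical point of $\bar f$.

\textbf{Restriction.} For $x \in A$, $\bar f(x) = f(x)$, so $\bar f \restriction A = f$.
```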
We speculate that two of these unmanifest flows are primary, and work together to “form
the universe” through their interaction, corresponding to two fundamental principles underlying
all expression from the unmanifest: existence and intelligence. The flows we have in mind are the
global successor function s and the power set operation. Each provides a sequential unfoldment
of sets, but according to different themes.
The power set operation P draws out from any set argument A all possibilities from the set,
giving rise to the set of all subsets of the set A. Overall, P is what gives rise to the raw material
of mathematics — all structures that are used in mathematics arise from the work of P. In this
sense, P gives expression to the principle of existence.
The successor operation s, applied to any set, points in the direction of the “next step.” This
directional character of s is especially meaningful in the context of the class ON of ordinals, as
it provides a dynamic expression of the well-ordered property of ON: For each ordinal α, s(α) is
equal to the smallest ordinal that is greater than α. For this reason, we think of s as expressing
the principle of intelligence in the unfoldment from the unmanifest.
Footnote 108: Compare with this remark from Maharishi [27, p. 296]: “What is lacking in the objective approach of science, where the whole knowledge is truthfully contained in one little symbol, is that the symbolic expression of
total knowledge . . . does not by itself evolve into the self-referral quality—the self referral subjectivity of the
mathematician”.
These two operations, s and P, follow identical steps at early stages. The critical step for
each, from a philosophical point of view, is the move from 0 to 1. This is the transformation
from 0 to {0}. Here, the value of emptiness in a sense awakens to itself to see itself as an object
of awareness. Seeing itself as an object means viewing the boundless in terms of a boundary:
in this case, as a singleton set whose only element is itself. This is the crucial step, according
to Maharishi, in the unfoldment of pure consciousness: consciousness becoming conscious, and
becoming conscious of itself. This vital step, in the mathematical context, is performed by the
unmanifest dynamics represented by the global successor operator and the power set operator
simultaneously.
Once 1 emerges, each operator is applied to produce the number 2. Then, as Maharishi has
described in his Vedic Science, the number 3 emerges; 3 is the number that gives expression to
the relationship between the two, between existence and intelligence; it is the number that gives
expression to the flow between these, and to the awareness that one has of the other. Maharishi
identifies the number 3 with the fundamental principles of rishi, devata, and chhandas—knower,
process of knowing, and the known. In a similar way, we find that 3 marks the point at which
the operations of s and P begin to be different: s(2) = 3, whereas P(2) = {0, 1, 2, {1}}, a new
set that is not an ordinal. The operators continue thereafter to unfold along different paths, but
there is a close relationship between them. This relationship can be expressed in the following
commutative diagram:
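\[
\begin{array}{ccc}
\mathrm{ON} & \xrightarrow{\ s\ } & \mathrm{ON} \\
{\scriptstyle T}\big\downarrow & & \big\downarrow{\scriptstyle T} \\
V & \xrightarrow{\ P\ } & V
\end{array}
\]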
where T is the map ON → V defined in the remarks following Theorem 33, where it is shown to
satisfy T(α) = Vα for every ordinal α. In particular,
P(T(α)) = P(Vα ) = Vα+1 = T(s(α)).
In this way, the impulse for sequential unfoldment, embodied in s, is combined with the expansive
power of the set-generator P to produce a sequence of stages V0, V1, . . . , Vα, . . ., providing us with
an orderly unfoldment of all sets in the universe.109
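The early agreement of s and P, their divergence at 2, and the first few stages of the hierarchy can be checked mechanically on hereditarily finite sets. The following is a small sketch, not part of the paper: sets are modelled as Python `frozenset` values, and the function names `successor`, `power_set`, `ordinal`, and `stage` are illustrative.

```python
from itertools import combinations


def successor(x: frozenset) -> frozenset:
    """Global successor s(x) = x ∪ {x}."""
    return x | frozenset([x])


def power_set(x: frozenset) -> frozenset:
    """Power set P(x): the set of all subsets of x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def ordinal(n: int) -> frozenset:
    """Finite von Neumann ordinal n, obtained by applying s to the empty set n times."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def stage(n: int) -> frozenset:
    """T(n) = V_n, obtained by applying P to V_0 = ∅ n times."""
    v = frozenset()
    for _ in range(n):
        v = power_set(v)
    return v


zero, one, two, three = (ordinal(k) for k in range(4))

# s and P agree on 0 and on 1, and part ways on 2.
agree_on_0 = successor(zero) == power_set(zero) == one
agree_on_1 = successor(one) == power_set(one) == two
differ_on_2 = successor(two) == three and power_set(two) != three

# Sizes of V_0, ..., V_4.
stage_sizes = [len(stage(n)) for n in range(5)]
```

In this model `power_set(two)` is the four-element set {0, 1, 2, {1}} mentioned above, and each finite ordinal n first appears at stage n + 1, which is one way to see the two operators tracking each other through T.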
Footnote 109: In these points we have emphasized the active side of the unfoldment of the stages in the construction of V. In
reality, the development of V0, V1, . . . , Vα, . . . involves a mix of active and silent steps. Active steps are performed
at successor ordinals, having the form β + 1. Stages of the form V_{β+1} are obtained by applying the power set
operator to the previous stage: V_{β+1} = P(Vβ). However, whenever α is an infinite limit ordinal, Vα is obtained by
collecting together all stages that have been built so far; it is a restful phase in which no new sets are produced.
In this case, Vα = ⋃_{β<α} Vβ. We can represent the silent phase of construction with a diagram as follows: Let LIM
denote the class of limit ordinals. Let H : LIM → V be defined by H(α) = T ↾ α. And let S : V → V be defined
by
\[
S(x) =
\begin{cases}
\bigcup \operatorname{ran} x & \text{if } x \text{ is a function} \\
\emptyset & \text{otherwise.}
\end{cases}
\]
The silent phase of the construction is then represented by the following commutative diagram:
\[
\begin{array}{ccc}
\mathrm{LIM} & \xrightarrow{\ \mathrm{id}\ } & \mathrm{LIM} \\
{\scriptstyle H}\big\downarrow & & \big\downarrow{\scriptstyle T} \\
V & \xrightarrow{\ S\ } & V
\end{array}
\]
Here, the action of the diagram is driven by the identity transformation id : LIM → LIM rather than by s as was
done for successor stages.
What we have suggested here is that 1 arises from 0 because of the confluence of two unmanifest
tendencies operating in the “unmanifest context” in which 0 originally appears. This
unmanifest context, and the impulses that arise within it, we recognize as the expression of the
transcendental level of the consciousness of the mathematician: 1 cannot arise from 0 without the
involvement of the subjective reality of consciousness. We speculated that among such unmanifest
flows, two are primary: The power set operator P and the global successor operator s. The
power set operator elicits from 0 its hidden value as a storehouse for all sets. The global successor
operator elicits from 0 its capacity to set things in order through sequential unfoldment. Each
operating on 0 in its own way, 0 becomes as if aware of itself and discovers itself to be {0}, or 1.
Then, as 1 gives rise to 2, these operators begin separate but closely related lines of unfoldment,
giving rise to the ordered unfoldment of all sets.
Finally, we return to the hypothesis that we began with: That the dynamics of emergence of 1
from 0 should contain in seed form the dynamics of the larger-scale transitions in the unfoldment
of all sets, such as the emergence of an infinite set in a ZFC − Infinity world. The dynamics
involved in the emergence of 1 from 0 entail an objectification of the pure subjectivity represented
by ∅, so that this emptiness is appreciated in terms of an objective, single-element set {∅}. This
transformation occurs by introducing what we have called unmanifest flows, or proper class maps.
In particular,
s(∅) = {∅} = P(∅).
In a similar vein, we have seen that the notion of an infinite set also arises as a kind of
objectification of a potential. Without explicitly introducing the Axiom of Infinity, the sequences
∅, s(∅), s²(∅), . . . and ∅, P(∅), P²(∅), . . . cannot be seen as sets, but rather as proper class
sequences, representing in a sense unmanifest possibilities. This potential for infinity is realized by
declaring (or, in a parallel way, in the subjective world, recognizing) that these class maps have
a closure point: There is a set A such that s ↾ A : A → A and P ↾ A : A → A. The recognition
of something infinite is at the same time a bringing into concrete realization of certain unmanifest
flows or possibilities. Something similar happens in taking the step of including the Wholeness
Axiom as a fundamental axiom of set theory; this step corresponds on the subjective level to a
certain kind of awakening to the reality that self-referral dynamics of consciousness are in fact
responsible for the emergence of everything. The truth of WA is recognized in acknowledging
the presence of a transformation j : V → V, a transformation of wholeness, that preserves all
first-order properties but at the same time remains connected to every point (j ↾ X is a set for
every set X). In this way, we see that emergence of 1 from 0, of infinite from finite, and of large
cardinals from the ordinary world of sets occurs as a realization or manifestation of a possibility
that is evident at an undeveloped stage: The empty set contains the possibility of 1; the world
HF of hereditarily finite sets recognizes a potential infinity; and in the standard universe, the
large cardinal nature of V itself, as well as its natural transformations V → V that accompany its
construction, points to the potential for existence of large cardinals. In every case, these
potentials are realized objectively in the mathematics by declaring that certain proper class functions
exhibit certain new (but natural) characteristics; this objective declaration corresponds on the
subjective level to a certain realization about the nature of one’s own consciousness.
27 Appendix II: Unfoldment of the Universe Within the Unmanifest
In Appendix I, we attempted to give an account of how 1 arises from 0—the mathematical
analogue to the philosophical question of how “something” arises from “nothing.” We observed
that the transformation of 0 into 1 occurs by virtue of class operators, like the power set operator
and the global successor function. These, we argued, live in the unmanifest context for creation
and provide the hidden impetus for unfoldment.
From the point of view of giving an account of the universe from first principles, there
is something unsatisfactory about the approach taken there. From the ancient point of view,
everything arises from the dynamics of pure consciousness, the dynamics of one fundamental field.
Nothing from the outside is needed to stimulate these dynamics. In the case of mathematics,
however, the empty set really has nothing in it that would cause it to expand into other sets.
While it is true that, by virtue of the construction of the cumulative hierarchy V , each set in
V can be appreciated as a pattern involving set braces ({, }) and the empty set, these complex
patterns cannot be said to arise from dynamics that are somehow “inherent in” the empty set.
In this Appendix, we describe another approach, which does account for the universe in
a way that parallels more closely the ancient view. We begin by asking for a solution to a
set of equations that express essential characteristics of the field of pure consciousness. These
essential characteristics are sufficient, from the Vedic point of view, to give an account of the
origin of everything within the field of consciousness. We then ask whether, within the realm of
mathematics itself, there is a solution to these equations. We write these equations first and then
explain their significance:
(+) x^x = x = {x}.
The notation x^x is intended to mean the set of all transformations from x to x, in the usual
way. A solution Ω to (+) would have the following characteristics:
(1) Ω consists precisely of all possible transformations of itself to itself.
(2) One of the transformations of Ω to itself is Ω itself: Ω : Ω → Ω (since Ω ∈ Ω).
(3) Ω and its collapse to a point {Ω} are one and the same.
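It may help to see why the second equality in (+) already carries the first. The following is a sketch under assumptions not stated in this passage: ordered pairs are taken to be Kuratowski pairs, (a, b) = {{a}, {a, b}}, and a function is identified with the set of its ordered pairs. The paper's own argument may proceed differently.

```latex
Assume $\Omega = \{\Omega\}$. Then
\begin{align*}
(\Omega, \Omega) &= \{\{\Omega\}, \{\Omega, \Omega\}\} = \{\{\Omega\}\} = \{\Omega\} = \Omega, \\
\Omega \times \Omega &= \{(\Omega, \Omega)\} = \{\Omega\} = \Omega .
\end{align*}
The only function from $\Omega = \{\Omega\}$ to itself is
$g = \{(\Omega, \Omega)\} = \{\Omega\} = \Omega$, so
\[
\Omega^{\Omega} = \{g\} = \{\Omega\} = \Omega ,
\]
and $g(\Omega) = \Omega$ says exactly that $\Omega : \Omega \to \Omega$ sends $\Omega$ to $\Omega$.
```

Under these conventions, any set satisfying Ω = {Ω} satisfies all of (+); whether such a set exists, and whether it is unique, is a separate matter that depends on the axioms adopted beyond ZFC without Foundation.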
In this Appendix, we show there is a natural (and unique) mathematical solution Ω to (+),
using a slight “expansion” of the usual ZFC axioms. We show also how it is possible to derive all
sets in the universe from this one point Ω, and at the same time, how all sets naturally collapse
back to Ω. Moreover, the entire mathematical landscape will be seen, on this view, to be nothing
but patterns and permutations of this one “reality” Ω.
Properties (1)–(3) tell us that Ω—or, equivalently, the ‘x’ of equations (+)—exhibits the
essential characteristics of pure consciousness itself, as described by Maharishi [25]: It is single
and undifferentiated (like Ω) but, from a certain point of view, has internal dynamics that arise
from the fact that, being consciousness, it assumes the roles of knower, known, and process of
knowing. In the role of process of knowing, it transforms itself within itself. This is seen in the
case of Ω from the fact that Ω ∈ Ω^Ω, which implies that Ω : Ω → Ω. Also, in the role of knower,
it is the witness of all such transformations; it contains within itself all possible transformations
that could arise within it. This role as knower is expressed by the fact that Ω = Ω^Ω. It also
assumes the role of a point within itself (Ω = {Ω}) and in this way becomes the object of its
own perception, its own process of knowing. In addition, its dynamics can be appreciated as an
expression of the field of Maharishi’s Vedic Mathematics—the dynamics by which, behind the
scenes, Nature carries out its administration of the universe. This can be seen by the fact that
Ω is itself equal to the evaluation map eval : Ω^Ω × Ω → Ω defined by eval(f, x) = f(x). The
evaluation map expresses the way in which Natural Law is applied to each point in existence to
carry it forward to the next stage of its unfoldment.
Ω As Samhita of Rishi, Devata, and Chhandas, and Administrator of the Cosmos
(1) Rishi: Ω = Ω^Ω
(2) Devata: Ω : Ω → Ω
(3) Chhandas: Ω = {Ω}
(4) Structuring Dynamics of Creation:
Ω = eval : Ω^Ω × Ω → Ω : (Ω, Ω) ↦ Ω.
We demonstrate the fact that
Ω = eval
in the following Proposition. We take as our background theory the ZFC axioms without the
Axiom of Foundation (denoted ZFC−), together with the assumption that Ω is indeed a solution
to (+). To draw further conclusions, we will assume somewhat more later on.
Proposition 88. (ZFC−) Suppose Ω is a solution to the equations (+).
(A) Ω ∈ Ω^Ω, so we may write Ω : Ω → Ω.
(B) Ω = (Ω, Ω).
(C) Ω = Ω × Ω = Ω^Ω × Ω.
(D) For all x ∈ Ω, Ω(x) = Ω (using the representation of Ω in (A)).
(E) Ω = eval.
Proof of (A). This follows from the fact that Ω ∈ Ω = Ω^Ω.
Proof of (B). We have the following derivation (using the fact that, by definition, for any sets
x, y, (x, y) = {{x}, {x, y}}).
(Ω, Ω) = {{Ω}, {Ω, Ω}}
= {{Ω}, {Ω}}
= {{Ω}}
= {Ω}
= Ω.
Note that the proof of (B) makes use only of the fact that Ω = {Ω} (and not the fact that Ω = Ω^Ω).
Proof of (C). Note that Ω × Ω = Ω^Ω × Ω since Ω = Ω^Ω. For the first equality, we have:
Ω × Ω = {(x, y) | x ∈ Ω and y ∈ Ω}
= {(Ω, Ω)}
= {Ω} by (B)
= Ω.
Proof of (D). Since Ω = {Ω}, the only element of Ω is Ω. Therefore, it suffices to show that
Ω(Ω) = Ω. But this is equivalent to the assertion that (Ω, Ω) ∈ Ω, and this follows from (B) and
the fact that Ω ∈ Ω.
Proof of (E). The set-theoretic definition of eval is:
eval = {(x, y, z) ∈ Ω^Ω × Ω × Ω | x(y) = z}.
For any (x, y, z) ∈ eval, the fact that (x, y) ∈ Ω^Ω × Ω implies that (x, y) = (Ω, Ω); and the fact
that z ∈ Ω implies z = Ω. Therefore, the only element of Ω^Ω × Ω × Ω is (Ω, Ω, Ω), and this
element does satisfy x(y) = z (as shown in (D)). Reading the triple as ((Ω, Ω), Ω), part (B) gives
(Ω, Ω, Ω) = (Ω, Ω). Therefore,
eval = {(Ω, Ω, Ω)} = {(Ω, Ω)} = {Ω} = Ω,
as required.
The Proposition has been established in the theory ZFC− together with (+), but we have
not yet demonstrated that this theory is consistent. We handle this issue by showing that (+) is
derivable from the theory ZFC− + AFA, where AFA denotes Aczel’s Anti-Foundation Axiom. It
is known that if ZFC is consistent, so is the theory ZFC− + AFA. Later in this appendix, we will
discuss interesting points about the proof of this fact.
We give a quick introduction to this axiom AFA so that we can use it to gain additional
insights into this model of pure consciousness that we have proposed. A (directed) graph G is
a pair (M, E) consisting of a set M of vertices and a set E of edges. Edges are represented by
pairs of vertices.[110] So, if u, v are vertices in a graph G and G has an edge from u to v, this edge
is denoted (u, v). We also write u → v to indicate that (u, v) ∈ E.
[110] We assume that our graphs do not admit parallel edges; i.e., each pair of vertices has at most one edge incident
to both vertices.
A pointed graph is a graph with a designated vertex, called its point. When we draw pointed
graphs, the point of the graph is the top vertex (whenever that makes sense) and descendants
of the point evolve downward. (See examples below.) A pointed graph (G, p) = (M, E, p) is
accessible if for all v ∈ M there is a path from p to v in G. Accessible pointed graphs are referred
to by the acronym apg. If, for all v, there is exactly one path from p to v, then G is called a
tree. A decoration of a graph is an assignment of a set to each node of the graph in such a way
that the elements of the set assigned to a node are exactly the sets assigned to the children of that node. In
symbols, a decoration of G = (M, E) is a map d : M → V such that, for every vertex v of G,
d(v) = {d(w) | v → w}.
A picture of a set X is an apg that has a decoration in which X is assigned to the point. A
graph is well-founded if it has no infinite path. The following theorem does not require AFA; it
follows from ZFC:
Theorem 89.
(A) Every well-founded graph has a unique decoration.
(B) Every well-founded apg is a picture of a unique set.
(C) Every set has a picture.
Here are some examples of well-founded apgs; each apg shown is a picture of the set that
labels it. The first apg has no edges; this means that the set it represents can have no elements.
Accordingly, the unique set that is represented by this apg is the empty set ∅. The designated
point for the second apg has just one child, which, in turn, has no children; accordingly, the set it
represents is {∅}, the set whose only element is ∅. Similarly, the third apg shown represents the
set having as children the empty set and the set whose only element is the empty set, namely,
{∅, {∅}}.
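As a minimal sketch of how a well-founded graph determines its decoration, the following Python fragment computes d(v) as the set of the values d(w) over all children w of v, modelling sets as nested frozensets. The vertex names p, q, r and the edge list chosen for the third graph are illustrative assumptions, not taken from the figures.

```python
from functools import lru_cache

def decorate(edges, vertices):
    """Decorate a well-founded graph: d(v) is the set of d(w) for children w of v.

    `edges` is a set of (parent, child) pairs. The recursion terminates only
    because the graph is assumed to have no infinite path.
    """
    children = {v: [] for v in vertices}
    for parent, child in edges:
        children[parent].append(child)

    @lru_cache(maxsize=None)
    def d(v):
        return frozenset(d(w) for w in children[v])

    return {v: d(v) for v in vertices}

EMPTY = frozenset()

# A single vertex with no edges pictures the empty set.
g1 = decorate(set(), ["p"])
# The point has one childless child, so it pictures {∅}.
g2 = decorate({("p", "q")}, ["p", "q"])
# One possible apg for {∅, {∅}} (hypothetical edge list).
g3 = decorate({("p", "q"), ("p", "r"), ("r", "q")}, ["p", "q", "r"])

print(g1["p"] == EMPTY)
print(g2["p"] == frozenset({EMPTY}))
print(g3["p"] == frozenset({EMPTY, frozenset({EMPTY})}))
```

The recursion would not terminate on a graph with a cycle, which is one way to see why non-well-founded graphs need a separate axiom to be decorated.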
These examples illustrate Theorem 89(A); in these simple cases, it is easy to see that there
is only one way to decorate the vertices of the given apgs with sets. A reasonable generalization
of this theorem to all possible apgs is the Anti-Foundation Axiom:
The Anti-Foundation Axiom (AFA). Every graph has a unique decoration.
An immediate consequence is the following:
Proposition 90 (Uniqueness Theorem). Every apg is a picture of a unique set.
A consequence of the Uniqueness Theorem is that the following apg uniquely determines a
set:
The unique way to decorate this graph is with a set whose only element is itself; such sets
can be shown not to exist in any universe built from the standard ZFC axioms; in particular,
existence of such a set contradicts the Axiom of Foundation. For this reason, we must consider
this new axiom AFA only in the absence of the Axiom of Foundation. In the theory ZFC− + AFA,
the Uniqueness Theorem does indeed hold true. The unique set pictured by this single-loop graph
is usually denoted Ω.
Single-loop Graph with Unique Decoration Ω
We observe that, with regard to the single-loop graph shown above, AFA tells us two things:
First, that the graph is a picture for some set; and second, that there is only one such set. This
latter point is as important as the first. Without AFA, even if we assume that the single-loop
graph above can be decorated with a set X = {X}, there is no guarantee that it is unique (there
could be X and Y such that X = {X} and Y = {Y} and X ≠ Y).
We have now used the symbol Ω in two different ways: as a solution to (+) and as the unique
decoration for the single-loop graph. We show now that this ambiguous usage is justified.
Theorem 91. (ZFC− + AFA). Let Ω be the unique decoration of the loop graph shown above.
Then Ω is the unique solution to the equations (+); that is, Ω is the unique set satisfying Ω = {Ω}
and Ω = Ω^Ω.
Proof. The fact that Ω = {Ω} follows from the structure of the apg for Ω: certainly the
designated point of the single-loop graph is a child of itself, so it follows Ω ∈ Ω. But it is also
clear that the designated point is its only child. Therefore, Ω is the only element of Ω, and so
Ω = {Ω}.
To see that Ω = Ω^Ω, note that since Ω is the singleton set {Ω}, there is exactly one map
f : Ω → Ω, namely, the function defined by f(Ω) = Ω. The function f can be represented as a
set of ordered pairs, namely f = {(Ω, Ω)}. As we argued before (Proposition 88(B)), Ω = (Ω, Ω).
It follows that f = {Ω} = Ω. Therefore, Ω^Ω = {f} = {Ω} = Ω, as required.
To see that Ω is the unique solution to the equations (+), note that any solution to these
equations—even to the single equation x = {x}—is a decoration of the single-loop apg pictured
above. By AFA, there is only one such decoration. Therefore, there is only one solution to (+).
The Theorem, together with the fact that ZFC− + AFA is consistent whenever ZFC is
consistent, shows that ZFC− is consistent with the equations (+), and so our earlier work in this
section is fully legitimized.
Before embarking on a discussion about how all real sets can be seen to arise from, and
return to, the single non-well-founded set Ω, we spend some time exploring how a ZFC− + AFA
universe is built.
Building a ZFC− + AFA Universe
Let V denote the usual universe of sets—a model of ZFC—as we have discussed in the main
part of this paper. One can build, within V, a model V̂ of ZFC− + AFA. Moreover, any such
model will have a well-founded part WF = WF^V̂ (consisting of all the well-founded sets in V̂)
that is isomorphic to the original ZFC model V: WF ≅ V.
In this way, we obtain the intuitive picture that a ZFC− + AFA universe is an expansion of
the standard cumulative hierarchy of well-founded sets—an expansion consisting of well-founded
sets together with ideal elements, which in the present context are the non-well-founded sets, like
Ω.
We take a moment to describe, at a high level, how a universe for ZFC− + AFA can be
constructed. We begin with the usual well-founded universe V of ZFC. Roughly speaking, we
wish to think of the sets of our new universe as being precisely the apgs that live in V . This
isn’t quite right though because different (nonisomorphic) apgs can picture the same set, even
in the well-founded case. For example, the following pictures of the (well-founded) set {0, 1} are
nonisomorphic:
This situation is similar to the construction of the reals from the rationals—a first try is to
declare that a real is a Cauchy sequence of rationals; the problem here also is that many Cauchy
sequences converge to the same real, even Cauchy sequences that converge to a rational. The
solution is to form a quotient by an appropriate equivalence relation. We wish to say that two
apgs are equivalent if they picture the same set, but, since we have not yet constructed all the
sets of our new universe, this statement of the relation is not formally correct (likewise, one would
like to say that two Cauchy sequences are equivalent if and only if they converge to the same real,
but this definition, for the same reason, is not formally correct). To capture this idea without
assuming existence of the non-well-founded sets we are trying to build, the necessary equivalence
relation is formulated in another way.
We give a brief description of the required equivalence relation. Suppose G = (M_G, E_G) and H =
(M_H, E_H) are graphs. Given a relation R ⊆ M_G × M_H, define the relation R+ ⊆ M_G × M_H as follows: for each
a ∈ M_G and b ∈ M_H, a R+ b if and only if both of the following hold:
(i) whenever a → x there is y ∈ M_H with b → y and x R y;
(ii) whenever b → y there is x ∈ M_G with a → x and x R y.
A bisimulation for G, H is any relation R ⊆ M_G × M_H such that R ⊆ R+.
The equivalence relation that we will need is a maximum bisimulation; we will discuss this
concept after giving an example.
Example 1. Consider the two apgs mentioned earlier that picture the set {0, 1}; for this example,
we will call them G and H.
We have labelled these graphs differently here to emphasize the fact that bisimulation relations make sense for any kind of directed graph, not just apgs. We define three bisimulations for
G, H. The first is a trivial bisimulation, which is obtained by declaring that childless nodes are
related to each other:
R1 = {(b, f ), (d, f )}.
The conditions for a bisimulation are satisfied vacuously because none of the vertices in the
relation have children.
The second bisimulation includes vertices that do have children:
R2 = {(c, g), (d, f ), (b, f )}.
The new pair (c, g) can be included because the respective children of c and g occur as pairs also.
Notice that R2′ = {(c, f), (d, f), (b, f)} is not a bisimulation since, although c → d, there is no d′
in H such that f → d′. Also, R2″ = {(c, g), (b, f)} is not a bisimulation because, though g → f in
H, there is no corresponding child of c that is paired with f in the relation.
The third bisimulation includes all the vertices.
R3 = {(a, e), (c, g), (d, f ), (b, f )}.
It is not hard to see that there is only one bisimulation that includes all the vertices; it
is called the total bisimulation, or the maximum bisimulation. Two graphs that admit such a
bisimulation are called bisimilar. Considering these graphs as apgs (with respective designated
points a and e), it is apparent in this example that, although the graphs are not isomorphic, the
membership structures that they specify are the same; it is clear in this case that these apgs must
picture the same set.
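The relations in Example 1 can be checked mechanically. The sketch below assumes the edge sets G: a → b, a → c, c → d and H: e → f, e → g, g → f, which are inferred from the discussion in the example rather than read off the figures. The function names are hypothetical. The maximum bisimulation is computed as a greatest fixed point: start from all pairs and repeatedly discard pairs that violate conditions (i) or (ii).

```python
def children(edges, v):
    """Return the set of children of vertex v."""
    return {w for (u, w) in edges if u == v}

def respects(pair, R, EG, EH):
    """Check conditions (i) and (ii) for a single pair (a, b) relative to R."""
    a, b = pair
    ca, cb = children(EG, a), children(EH, b)
    forth = all(any((x, y) in R for y in cb) for x in ca)
    back = all(any((x, y) in R for x in ca) for y in cb)
    return forth and back

def is_bisimulation(R, EG, EH):
    """R is a bisimulation when every pair in R satisfies (i) and (ii)."""
    return all(respects(p, R, EG, EH) for p in R)

def maximum_bisimulation(MG, EG, MH, EH):
    """Greatest fixed point: prune violating pairs until nothing changes."""
    R = {(a, b) for a in MG for b in MH}
    while True:
        keep = {p for p in R if respects(p, R, EG, EH)}
        if keep == R:
            return R
        R = keep

# Edge sets inferred from the text of Example 1 (an assumption).
MG, EG = {"a", "b", "c", "d"}, {("a", "b"), ("a", "c"), ("c", "d")}
MH, EH = {"e", "f", "g"}, {("e", "f"), ("e", "g"), ("g", "f")}

R1 = {("b", "f"), ("d", "f")}
R2 = {("c", "g"), ("d", "f"), ("b", "f")}
R3 = {("a", "e"), ("c", "g"), ("d", "f"), ("b", "f")}
R2_prime = {("c", "f"), ("d", "f"), ("b", "f")}
R2_double_prime = {("c", "g"), ("b", "f")}

print(is_bisimulation(R1, EG, EH), is_bisimulation(R2, EG, EH))
print(is_bisimulation(R2_prime, EG, EH), is_bisimulation(R2_double_prime, EG, EH))
print(maximum_bisimulation(MG, EG, MH, EH) == R3)
```

Under these assumed edges, the pruning procedure returns exactly R3, which matches the claim that the bisimulation covering all vertices is the maximum one.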
Bisimilarity can be shown to be an equivalence relation on directed graphs. Moreover, the
discussion in the example gives some idea about why this equivalence relation is the one we
are seeking: When we consider apgs as displays of potential membership structure, it seems
intuitively clear that whenever apgs are bisimilar, the membership structure of both apgs is the
same, so they should picture the same set.
Using this equivalence relation ≡ on apgs, we wish to form the collection of resulting equivalence classes. Since typically each equivalence class will itself be a proper class, we use a familiar
technique (known as Scott’s trick) to reduce their size: we take a representative r from each such
class having least rank and we let [r] denote the set of all apgs equivalent to r and having the
same rank as r; and finally, we let V̂ denote the collection of all such reduced equivalence classes
[r].
We have described the “sets” of our new universe. We also need to specify the “membership
relation.” Suppose [r], [s] ∈ V̂. Since r, s are pointed graphs, we may write r = (G, p_G) and
s = (H, p_H), where p_G and p_H are the designated points of the apgs. We declare that [r] is
a “member of” [s] if there is a vertex q in H with p_H → q such that the sub-apg H_q of H
determined by q (whose vertex set is {h ∈ M_H | there is a path in H from q to h} and whose point is q) is equivalent
to G:
∃q ∈ M_H (p_H → q and H_q ≡ G).
We give a simple example:
Example 2. Consider the following apgs:
We observe that [G] is an “element” of [K], according to our new definition. We show that
G is equivalent to a sub-apg of K whose designated point is a child of u. Here, there is only one
way for this to happen since u has only one child, namely, e. Clearly, the sub-apg of K whose
designated point is e is precisely the apg H of Example 1, which, as we have seen, is indeed
equivalent to G. Therefore, [G]“∈”[K]. Of course, this is what we expect since G is a picture of
{0, 1} and K is a picture of {{0, 1}}.
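The membership test of Example 2 can be sketched in the same style. The code below treats two apgs as equivalent when their points are related by the maximum bisimulation between the parts reachable from those points. The edge sets for G and K are inferred from the text (K is assumed to be u → e followed by a copy of H), and all function names are hypothetical.

```python
def children(E, v):
    """Return the set of children of vertex v."""
    return {w for (u, w) in E if u == v}

def reachable(E, q):
    """Vertices of the sub-apg determined by q: everything reachable from q."""
    seen, stack = {q}, [q]
    while stack:
        v = stack.pop()
        for w in children(E, v):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def matches(a, b, R, EG, EH):
    """Back-and-forth conditions (i) and (ii) for the pair (a, b)."""
    ca, cb = children(EG, a), children(EH, b)
    return (all(any((x, y) in R for y in cb) for x in ca)
            and all(any((x, y) in R for x in ca) for y in cb))

def bisimilar(EG, pG, EH, pH):
    """True when the points are related by the maximum bisimulation."""
    R = {(a, b) for a in reachable(EG, pG) for b in reachable(EH, pH)}
    while True:
        keep = {(a, b) for (a, b) in R if matches(a, b, R, EG, EH)}
        if keep == R:
            return (pG, pH) in R
        R = keep

def is_member(EG, pG, EH, pH):
    """[G] is a member of [H] when G is equivalent to H_q for some child q of p_H."""
    return any(bisimilar(EG, pG, EH, q) for q in children(EH, pH))

# Edge sets inferred from Examples 1 and 2 (an assumption).
EG = {("a", "b"), ("a", "c"), ("c", "d")}              # pictures {0, 1}
EK = {("u", "e"), ("e", "f"), ("e", "g"), ("g", "f")}  # pictures {{0, 1}}

print(is_member(EG, "a", EK, "u"))
print(is_member(EK, "u", EG, "a"))
```

The first call reports membership and the second does not, which agrees with {0, 1} ∈ {{0, 1}} and {{0, 1}} ∉ {0, 1}.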
It can be shown that the equivalence classes belonging to V̂ that contain well-founded trees
correspond exactly to the sets in the original universe V; more precisely, if we let WF = {[r] ∈
V̂ | r is well-founded}, then V ≅ WF.
The Ideal Elements of V̂ As Solutions to Equations
This viewpoint that a ZFC− + AFA universe is an expansion of the usual universe V of
well-founded sets by ideal elements is familiar from other areas of mathematics, such as the
expansion of the reals to the complex numbers, through the introduction of an “ideal” solution
to the equation x^2 + 1 = 0. In fact, one can also view the expansion from V to V̂ as arising from
the introduction of “ideal” solutions to—in this case—classes of equations. One such equation, as
we have seen, is x = {x}. Another is x = (0, x). An example of a small system of such equations
is:
x = {0, x, y}
y = {{x}}
We can give a precise formulation of the relevant equations as follows: By analogy with the
expansion from R to C, we need to introduce indeterminate elements. To take the step from
R to C, we first need to obtain the domain R[x] of polynomials in the indeterminate x; then
x^2 + 1 ∈ R[x] is an expression for which we seek a root; moreover, any root will be an expression
that does not implicitly contain the indeterminate x. Likewise, we will expand V with a class
X of indeterminates, one for each set in V: X = {x_a : a ∈ V}.[111] And the “polynomials” we
obtain, which we will call complex sets, are sets built up from other sets together with elements
of X. As a simple example, consider the complex set A(x_a, x_b) = {0, {1, x_a}, (x_b, 2)}, where
x_a, x_b ∈ X. One of the equations that we wish to be able to solve is
x_a = A(x_a, x_b).
A solution to such an equation will be a set whose build-up does not contain any of the elements
of X; such a set is called a pure set.
[111] More formally, we are re-building the universe starting with atoms or urelements at the 0th stage.
The Solution Lemma, which is equivalent in ZFC− to AFA, gives a precise statement of
classes of equations that we wish to consider, and asserts that any such system of equations
always has a unique solution.
Theorem 92 (Solution Lemma). Suppose A_x is a complex set, for each x ∈ X. Then the system
of equations
(∗∗)   x = A_x   (x ∈ X)
has a unique solution; that is, there is a unique family {b_x | x ∈ X} of pure sets such that, if
for each x ∈ X we let B_x denote the set obtained from A_x by replacing every occurrence of each
indeterminate x_a in A_x with b_{x_a}, then, for each x ∈ X,
b_x = B_x.
Therefore, a ZFC− + AFA universe is obtained as a universe that includes the well-founded
sets and provides unique solutions to all equations of the form (∗∗).
We now return to our discussion about the relationship between Ω and all sets in the standard universe V .
Ω As the Underlying Reality of All Sets
In the first part of this Appendix, we suggested that a mathematical model of pure consciousness, as the ultimate reality, unfolding within itself and knowing itself and itself alone, could be
obtained by finding a solution to the equations
x^x = x = {x}.
We found that, although no such solution can be found in the standard universe of sets, there is
one, and only one, solution (the set Ω) if we expand the universe with ideal elements (non-well-founded sets) by working in the theory ZFC− + AFA.
Next, we will indicate how the dynamics inherent in this solution Ω give rise to an “unmanifest” version of all mathematical objects. We will see how this unmanifest “cosmic counterpart”
to each mathematical object behaves exactly as the mathematical object itself does, but the
transcendental reality of the mathematical object in its cosmic counterpart is not lost in the
expression of its details.
[111, continued] Urelements are objects, different from the empty set, that have no elements.
Even without introducing the set Ω, the standard universe V can already be seen as an
expression of a single reality: the empty set. As we have seen, V is built in stages V_0, V_1, V_2, . . .,
where
V_0 = ∅, V_1 = P(V_0) = {∅}, V_2 = P(V_1) = {∅, {∅}}, . . . [112]
Notice that in these first few stages, every set can be seen as a pattern of set braces ({, }) and
the empty set ∅.
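The first few stages can be generated directly. The following sketch models hereditarily finite sets as nested frozensets and builds each successor stage as the power set of the previous one; the helper names powerset and stages are illustrative, and limit stages are outside the scope of the example.

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def stages(n):
    """Return the list [V_0, ..., V_n] with V_0 = ∅ and V_{k+1} = P(V_k)."""
    V = [frozenset()]
    for _ in range(n):
        V.append(powerset(V[-1]))
    return V

V = stages(4)
print([len(v) for v in V])  # → [0, 1, 2, 4, 16]
```

Every element produced this way is a nesting of frozensets around the empty frozenset, which mirrors the observation that each set in these stages is a pattern of braces and ∅.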
The universe V is obtained as the union of all these stages:
V = ⋃_{α ∈ ON} V_α = V_0 ∪ V_1 ∪ V_2 ∪ · · ·
The observation we made regarding the first few stages continues to hold throughout the construction: Each stage consists of sets which can be seen as just patterns of set braces ({, }) and
the empty set ∅. Moreover, the Axiom of Foundation guarantees that, for any set A ∈ V , there
is a finite sequence of sets a_0, a_1, . . . , a_m satisfying:
(#)   a_0 = ∅, a_m = A, and a_0 ∈ a_1 ∈ · · · ∈ a_m.
In this way, every set can be seen as being “made of” the empty set, and it is always possible to
trace each set back to its “origin” in ∅.
These facts already provide a profound analogy to the ancient wisdom concerning the structure of the universe: Everything is, from this point of view, an expression or precipitation of
one reality, pure consciousness, and everything has, deep within its nature, the full value of pure
consciousness.
Another insight from the ancient perspective, which the present analogy fails to capture, is
that everything is “nothing but” pure consciousness; that everything is nothing but the dynamics
of pure consciousness. The reality is one; differences arise as a point of view, a way of looking at
or conceiving, this one reality.
We do not see a parallel to this profound insight and potential realization as we consider
the dynamics of the empty set and set formation. What we find instead is that, even though the
empty set is at the core of every set, differences among sets in the universe dominate. The entire
enterprise of modern mathematics relies on the fact that sets that do not have precisely the same
elements are different sets. In no mathematical sense can it be said, for example, that {1, 2} and
{0, 4, 9} are “the same.” The fact that these sets have a common source in the empty set is a
hidden reality, not a “living” reality.
The reason that this fundamental unity is not seen “on the surface” of mathematics is, we
suggest, that even the dynamics of ∅ are hidden from view: How does the
empty set actually “give rise” to anything? Even to the set {∅}? Although the very concept of set
formation naturally leads us from ∅ to {∅}, this concept is not in any mathematical way present
in ∅; it is, rather, added “from the outside” to propel ∅ to {∅}. This kind of transformation is
in sharp contrast to the dynamics of pure consciousness, from the ancient perspective. On that
view, the dynamics that propel pure consciousness to go through its inner transformations are the
dynamics of consciousness alone; indeed, they are nothing but consciousness. Nothing “outside”
of pure consciousness impels it to move or to know itself. Indeed, from this point of view, there
is nothing else.
[112] For this discussion, we have suppressed the fact that limit stages of the construction are obtained by forming
the union of all previous stages; the logic of the discussion remains the same whether or not this technical point is
included.
A more suitable mathematical analogue to pure consciousness and its dynamics is the set
Ω, which we have introduced in this appendix. The difference between the empty set and Ω is
captured nicely in contrasting their respective apgs:
In the single-loop graph, we see a picture (and apg) of a self-relationship, an inherent dynamism between the vertex and itself. Moreover, the starting point, the target, and the relationship between them are, when realized as a set, one and the same: We have Ω : Ω → Ω,
with Ω(Ω) = Ω; indeed, Ω(x) = Ω for every x ∈ Ω. In this way, Ω succeeds as a model of pure
consciousness in ways that ∅ cannot.
One way to view the single-loop graph that pictures Ω is as a kind of refinement of the
single-vertex graph that pictures ∅, in the sense that the self-referral dynamics that one may
imagine are “hidden” deep within the empty set have been brought into plain view. The edge
from the single vertex to itself that is added to the single-vertex graph can be seen as symbolic
of “re-connecting” the point to itself.
This viewpoint provides a new way of viewing all sets in the standard ZFC universe V . In V ,
all sets are seen as distinct and unrelated, even though the “core” of every set is always simply
the empty set (as we discussed earlier). We can use our insights about the relationship between
∅ and Ω to explicitly re-connect every set to its “source” in the following way.
First, let us observe that every well-founded set can be pictured by an apg that has exactly
one childless vertex, and in every case, this childless vertex is decorated with the empty set.
This is true because, for every set A, by the Axiom of Foundation (as we have observed), every
maximal ∈-chain starting at A is finite and terminates in ∅, as in the display (#) mentioned
earlier and reproduced here:
∅ ∈ . . . ∈ a_i ∈ . . . ∈ A.
Therefore, as in the different apg pictures of the set {0, 1} discussed earlier, all edges pointing to
∅ can be directed to a single node decorated with ∅, and this node is necessarily childless (since ∅
has no elements). We shall call any such picture of a well-founded set a canonical picture of the
set.
Then, to reconnect any well-founded set A to its source, we can simply add one edge to a
canonical picture from the vertex decorated with ∅ to the designated point, decorated with A.
We will call the edge that is added in this way the reconnecting edge; notice that there is, for
any canonical picture of a well-founded set, just one reconnecting edge. Here is an example:
In the example, we begin with a canonical picture of the well-founded set {0, 1}; recall that
0 = ∅ and 1 = {∅}, so the graph on the left is also a canonical picture of {∅, {∅}}. Its unique
childless node is located in the lower left of the picture, labelled by 0 = ∅. The middle graph
shows what happens when we reconnect this node to its source by adding an edge from 0 to
{0, 1}. The decoration will necessarily change because of the addition of the reconnecting edge
(so no decoration is shown in the middle graph). Finally, in the third apg, we attempt to decorate
the graph in the middle with some set. Certainly Ω can be used, as one may easily verify. But
now because every apg has a unique decoration, the only way to decorate this middle graph is
by placing Ω at every vertex.
What the example shows is that, by adding the reconnecting edge to a canonical apg of a
well-founded set, everything about the set, including its elements and internal relationships, reveals
itself to be nothing but Ω. This captures in a mathematical way the vision of Vedanta, the
reality of Brahman Consciousness according to Maharishi Mahesh Yogi, in which every object
is recognized to be nothing but pure consciousness. Though differences remain, what dominates
perception is the reality of pure consciousness in every detail.
This same approach can be taken for every well-founded set, as the following theorem shows:
Theorem 93 (The Ω-Theorem). An apg is a picture of Ω if and only if every vertex of the apg
has a child.
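The Ω-Theorem can be illustrated on the reconnected picture of {0, 1}. The sketch below assumes a canonical picture with vertices top, one, zero and edges top → zero, top → one, one → zero, and a reconnecting edge zero → top; these names are illustrative. It compares the criterion "every vertex has a child" with bisimilarity to the single-loop graph, used here as a stand-in for "is a picture of Ω".

```python
def children(E, v):
    """Return the set of children of vertex v."""
    return {w for (u, w) in E if u == v}

def every_vertex_has_child(M, E):
    """The criterion stated in the Ω-Theorem."""
    return all(children(E, v) for v in M)

def pictures_omega(M, E, point):
    """True when `point` is bisimilar to the vertex of the single-loop graph."""
    loop_vertex, loop_edges = "omega", {("omega", "omega")}
    R = {(v, loop_vertex) for v in M}
    while True:
        keep = set()
        for (a, b) in R:
            ca, cb = children(E, a), children(loop_edges, b)
            forth = all(any((x, y) in R for y in cb) for x in ca)
            back = all(any((x, y) in R for x in ca) for y in cb)
            if forth and back:
                keep.add((a, b))
        if keep == R:
            return (point, loop_vertex) in R
        R = keep

# Canonical picture of {0, 1} (assumed vertex names).
M = {"top", "one", "zero"}
canonical = {("top", "zero"), ("top", "one"), ("one", "zero")}
# The same graph with the reconnecting edge from the ∅-vertex to the point.
reconnected = canonical | {("zero", "top")}

print(every_vertex_has_child(M, canonical), pictures_omega(M, canonical, "top"))
print(every_vertex_has_child(M, reconnected), pictures_omega(M, reconnected, "top"))
```

Both tests fail on the canonical picture and both succeed once the reconnecting edge is added, so the two criteria agree on this example.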
By the Ω-Theorem, the “fundamental reality” underlying each well-founded set can be discovered by adding a single reconnecting edge to its canonical picture, from the vertex labelled
with 0 to the designated point of the apg.
We observe that, in particular, one can apply this procedure to every stage Vα of the universe,
and so the “reality” of every stage of V is seen to be Ω itself. Finally, one may form the union of
all stages, as one does to obtain V itself, and what one gets after adding the reconnecting edge
to canonical apgs of each Vα and then forming the union of all these refined stages is, once again,
Ω. One thereby sees that the “reality” of the entire universe is nothing other than Ω.
Returning to our example for a moment, notice also that by removing the reconnecting
edge, the original apg, and hence the original set, comes back into view. For example, removing
the reconnecting edge in the example just displayed, Ω is transformed back to the set {0, 1}.
This same maneuver can be applied to any canonical apg that has a reconnecting edge. If we
form a collection U of all apgs having a single vertex decorated with ∅ and for which there is a
reconnecting edge, one could say that every well-founded set “arises from” the act of removing
the reconnecting edge and thereby breaking the apg’s (and the set’s) connection to its source.
This is a kind of “symmetry-breaking” by which the “perfectly balanced” state of Ω is disrupted
through the removal of one edge from one of Ω’s representations as an apg. In this sense, Ω is
seen as the source of all sets, arising from a broken symmetry in the perfectly balanced state
represented by Ω.
These observations suggest that all sets have their fundamental underlying reality in Ω; that
the viewpoint that sets are discrete unconnected expressions is a mirage, just a peculiar way of
looking at the one reality Ω. One could say that the final truth about sets is that each is a unique
expression of one reality, but both the “essence” of the set and its individual characteristics
originate in the dynamics of the ideal set Ω. Unlike the empty set, which is in a sense the “core”
of every set, Ω embodies the dynamics of each set, at an unmanifest level. As an example, the
set (∅, {∅}), after adding the reconnecting edge to its canonical apg, becomes (Ω, {Ω}). This
modified version is built up from Ω in exactly the same way the original set was built up from the
empty set, and so retains all the unique features of the original set, except that the new version
is nothing other than Ω: (Ω, {Ω}) = Ω. All the differences that distinguished the original set
have become, as if, transparent, revealing the true nature of the set as Ω. In this sense, Ω plays
the role of Maharishi’s Absolute Number, as described at the beginning of this paper, revealing
the ultimate nature of all numbers as well as all sets, without obliterating the differences that
distinguish each from another.[113]
[113] See p. 5.
References
[1] P. Aczel. Non-Well-Founded Sets. CSLI Lecture Notes, #14. Center for the Study of
Language and Information. 1988.
[2] S. Awodey. Category Theory, Second Edition. Oxford University Press. 2011.
[3] A. Blass. Exact Functors and Measurable Cardinals. Pacific Journal of Mathematics. (1976)
63(2), pp. 335–346.
[4] A. Blass. Combinatorial cardinal characteristics of the continuum. In Handbook of Set Theory.
Eds. A. Kanamori, M. Foreman. Springer. 2010. pp. 393–489.
[5] S.J. Copleston. History of philosophy, volume 1: Greece & Rome. Image Books. 1962.
[6] P. Corazza. The Wholeness Axiom and Laver Sequences. Annals of Pure and Applied Logic.
(2000) 105, pp. 157–260.
[7] P. Corazza. The Spectrum of Elementary Embeddings j : V → V. Annals of Pure and Applied
Logic. (2006) 139, pp. 327–399.
[8] P. Corazza. The Wholeness Axiom. In Consciousness-Based Education: A Foundation for
Teaching and Learning in the Academic Disciplines, Volume 5: Consciousness-Based Education in Mathematics, Part I. Eds. P. Corazza, A. Dow, D. Llewelyn, C. Pearson. MUM Press.
2011.
[9] P. Corazza. Vedic Wholeness and the Mathematical Universe: Maharishi Vedic Science As a
Tool for Research in the Foundations of Mathematics. In Consciousness-Based Education: A
Foundation for Teaching and Learning in the Academic Disciplines, Volume 5: Consciousness-Based Education in Mathematics, Part II. Eds. P. Corazza, A. Dow, D. Llewelyn, C. Pearson.
MUM Press. 2011.
[10] P. Corazza. The Axiom of Infinity and Transformations j : V → V . The Bulletin of Symbolic
Logic. (2010) 16, pp. 37–84.
[11] P. Dehornoy. Braids and Self-Distributivity. Birkhäuser. 2000.
[12] R. Dougherty, T. Jech. Finite Left-Distributive Algebras and Embedding Algebras. Advances
in Mathematics. (1997) 130, pp. 201–241.
[13] F. D’Olivet, N.L. Redfield. Golden Verses of Pythagoras (English and French Edition).
Kessinger Publishing, LLC. 1992.
[14] F. Drake. Set theory: An introduction to large cardinals. American Elsevier Publishing Co.,
New York, 1974.
[15] A. Enayat, J. Schmerl, A. Visser. ω-Models of Finite Set Theory. In Set Theory, Arithmetic,
and Foundations of Mathematics: Theorems, Philosophies. Eds. J. Kennedy, R. Kossak.
Cambridge University Press. United Kingdom. pp. 43–65.
[16] G. Feng, J. English. Tao Te Ching, 25th Anniversary Edition. Vintage. 1997.
http://www.duhtao.com/translations.html
[17] M. Hallett. Cantorian Set Theory and the Limitation of Size. Clarendon Press. 1988.
[18] K. Hrbacek, T. Jech. Introduction to set theory. Marcel Dekker, New York, 1999.
[19] T. Jech. Set Theory: The Third Millennium Edition, Revised and Expanded. Springer-Verlag.
Berlin, Heidelberg, New York. 2003.
[20] K. Kunen. Elementary Embeddings and Infinitary Combinatorics. The Journal of Symbolic
Logic. (1971) 36, pp. 407–413.
[21] K. Kuratowski. Topology, Volume 1. Academic Press. 1966.
[22] R. Laver. On the Algebra of Elementary Embeddings from a Rank into Itself. Advances in
Mathematics. (1995) 110, pp. 334–346.
[23] W. Lawvere. Adjointness in Foundations. Dialectica. (1969) 23, pp. 281–296.
[24] W. Lawvere. Sets for Mathematics. Cambridge University Press. 2003.
[25] Maharishi Mahesh Yogi. Maharishi’s Absolute Theory of Defence. Maharishi Vedic University
Press. Vlodrop, The Netherlands. 1996.
[26] S. Mac Lane. Categories for the Working Mathematician, Second Edition. Springer. 1998.
[27] Maharishi Mahesh Yogi. Maharishi’s Absolute Theory of Government. Maharishi Prakashan.
India. 1995.
[28] Maharishi Mahesh Yogi. Vedic Knowledge for Everyone. Maharishi Vedic University Press.
Vlodrop, The Netherlands. 1994.
[29] Maharishi Mahesh Yogi. Maharishi Vedic University, Bulletin for 1986-1988. MERU Press.
West Germany. 1986.
[30] A. Miller. Special Subsets of the Real Line. In Handbook of Set-Theoretic Topology. Eds.
K. Kunen, J.E. Vaughan. North-Holland. 1984.
[31] Y. Moschovakis. Descriptive Set Theory. North-Holland. 1980.
[32] G.R.S. Mead. Orpheus. John W. Watkins. London. 1965.
[33] R. Rosebrugh, R.J. Wood. An Adjoint Characterization of the Category of Sets. Proceedings
of the American Mathematical Society. (1994) 122, pp. 409–413.
[34] S. Mac Lane. Categories for the Working Mathematician. Springer-Verlag. New York.
1971.
[35] T. Taylor. The Works of Plato. AMS Press. New York. 1972.
[36] R.K. Wallace. The Physiology of Consciousness. Maharishi International University Press.
Fairfield, IA. 1993.
[37] H. Weber. Leopold Kronecker. Mathematische Annalen. (1893) 43, pp. 1–25.