Foundations of Arithmetic
Marcus Tornea
January 24, 2021

1 Introduction

The idea of theorems is central to all known mathematics; alongside definitions, they are what develops any branch of mathematics into an intricate system worthy of the name. To illustrate, some theorems we know at present exist only to support the proofs of more complicated and meaningful theorems. However, anyone willing to trace this development backwards may encounter a fundamental problem of somewhat philosophical significance. Theorems are, by definition, statements for which there exists a proof, but proofs themselves are based on other theorems previously obtained. In other words, even before we prove the first theorem, there are certain mathematical truths that we must already know, which is problematic for the idea of there being a first theorem at all. Worse still, searching for a proof of every mathematical truth we encounter is bound to be an endless regress of proofs within proofs.

The only obvious solution to this problem is to accept certain mathematical statements without proof, which, by our conscious decision, shall serve as the basis of all mathematical reasoning. These statements are called axioms, or as you may have already heard, postulates. Essentially all of mathematics rests on a particular group of axioms, though there are other axiom groups convenient only for a single branch of mathematics, should one be interested in studying only that branch. In particular, we shall present an axiom system that suffices to lay foundations only for arithmetic over the natural numbers. We call this Peano Arithmetic.

Basic truths about arithmetic, such as the associative, commutative, and distributive laws of addition and multiplication, are so rudimentary that we almost never notice their use underlying our dealings with high school algebra.
A few empirical examples already suffice to convince us that these laws hold. However, we may not so far have seen their actual proofs, as we deem them too basic for our attention. Now is the time.

2 Axioms

Below we present the current formulation of Peano Arithmetic, beginning with a few prerequisites. We must know beforehand the range of entities that the axioms refer to in the first place, among which is the constant 0. We do not know what 0 is (at least for now, we will pretend so). It is merely an arbitrary symbol that names some entity in the universe of discourse for the axiom system at hand.

[Figure: an arbitrary universe of discourse containing the entities Mario, λ, α, β, 0, Luigi, ψ, D, and C.]

A universe of discourse is, in simple terms, the implicit range of entities that we intend to talk about, or study. It provides the setting in which our logical arguments take place. For someone interested in studying the algebraic properties of real numbers, the universe of discourse could be ℝ, for instance. Generally, a universe of discourse could be any collection of entities, as long as it does not violate any axioms established so far. For instance, the set of natural numbers ℕ cannot be the universe of discourse for any talk about the real numbers, since ℕ does not satisfy the law of additive inverses, which holds for ℝ. But for our current situation, we have not yet introduced any axioms for Peano Arithmetic, so the associated universe of discourse could be any random collection at the moment, just like the figure above. Only as we introduce new axioms do the possible universes become more constrained. We will soon discover that the universe of Peano Arithmetic is ℕ (at least by convention).

Other than 0, there are three more symbols in the language of Peano Arithmetic, all of which denote operations: the operation symbols + and ·, which will soon resemble addition and multiplication of two entities, and another symbol S called the successor operation, which will resemble an “add one” function; i.e.
S(0) = 1, S(1) = 2, and so on. We do not know this yet, however, so you may freely dismiss the remark just now; the symbols 0, +, ·, and S are undefined and will remain so, and only by introducing new axioms do we establish their intended properties. Do note that since +, ·, and S are operations, which dictate relationships between entities in the universe of discourse, they are not themselves entities in the universe. All that said, we are now ready to introduce the first axiom.

Axiom 2.1. For all n, S(n) ≠ 0.

By “for all n”, we are referring to all entities in the universe, over which the variable n ranges. Also note that the outputs of the operations are themselves entities in the universe; therefore S(0), an output of S, is some entity in this universe, which we know by the axiom is not equal to 0. Likewise, S(S(0)) is an entity not equal to 0, as is S(S(S(0))), and so on. However, this axiom does not prevent us from stipulating that S(S(0)) = S(0). Note that just as we can configure the universe of discourse to contain whatever we want as long as no axioms are violated, we can also configure the operations to behave however we want under the same condition. If we stipulate the above and apply S to both sides of the equation, we obtain S(S(S(0))) = S(S(0)), but since S(S(0)) = S(0) as stipulated, we have S(S(S(0))) = S(0). One can demonstrate that S(S(S(S(0)))) is also equal to S(0), and the same occurs no matter how many times S is applied. This means that as far as our axioms are concerned, our universe of discourse can contain as few as two distinct objects, namely 0 and S(0), and be a valid universe nonetheless.

[Figure: a universe containing only the two entities 0 and S(0).]

Such a universe is trivial and insufficient to model our goal of arithmetic over the natural numbers; we want this universe to contain infinitely many distinct entities to reflect the infinite set of natural numbers, so we need to introduce more axioms.

Axiom 2.2.
For all m and n, if m ≠ n, then S(m) ≠ S(n).

Now, this axiom tells us that distinct inputs to S yield distinct outputs, so since S(0) and 0 are distinct by the first axiom, so are S(S(0)) and S(0); thus we can no longer make the problematic stipulation that we made before, a promising sign. Moreover, since S(S(0)) and 0 are also distinct by Axiom 2.1, the universe must now contain at the very least three mutually distinct objects: 0, S(0), and S(S(0)). A similar argument shows that S(S(S(0))) is also distinct from the rest: we have S(S(S(0))) ≠ 0 by Axiom 2.1; moreover, since S(S(0)) ≠ 0, we have S(S(S(0))) ≠ S(0), and since S(S(0)) ≠ S(0) as well, we have S(S(S(0))) ≠ S(S(0)). Now our universe contains at the very least four mutually distinct objects. In fact, we can keep going, showing that the next application of S yields an entity distinct from the ones so far obtained. But there is no limit to how many times one could successively apply S, so one can keep asserting the existence of one new distinct entity after another. The addition of this axiom has just spawned an infinite bounty of mutually distinct entities in our universe, just as promised.

[Figure: a universe containing the mutually distinct entities 0, S(0), S(S(0)), S(S(S(0))), S(S(S(S(0)))), and so on.]

We now casually borrow the Arabic symbols 1, 2, 3, 4, 5, 6, 7, 8, and 9 as shorthands to make our notation concise:

1 := S(0)
2 := S(S(0)) = S(1)
3 := S(S(S(0))) = S(2)
4 := S(3)
5 := S(4)
6 := S(5)
7 := S(6)
8 := S(7)
9 := S(8).

Our notational convention may also adopt some kind of “overflow” system so that 10 := S(9), 100 := S(99), and so on. Soon, we will be able to name our universe of discourse the set of natural numbers. It appears, however, that something is hindering us from doing so. Recall that the universe of discourse can contain whatever we want and S can behave however we wish, as long as no axioms so far established are violated. Consider the following universe:

[Figure: the chain 0, 1, 2, 3, 4, and so on, together with two extra entities A and B.]

namely the previous one with two new distinct objects A and B that are different from the rest, and such that S(0) = 1, S(1) = 2, S(2) = 3 and so on as usual, but additionally, S(A) = B and S(B) = A. In this setup, one can check that both axioms still hold. Moreover, other than the “chain” formed by the objects 0, 1, 2, 3, ... with the S operation “linking” them together, the objects A and B form a closed loop that S cycles around, one that is completely separate from the chain linking 0, 1, 2, 3, .... Now we have the opposite of the problem we had before: nothing stops our universe from containing excess entities other than the ones we intended. Clearly, more axioms are needed to restrict our universe to contain only the chain.

Before that, we need some technical prerequisites. You will encounter some notation that reads “P(m, n₁, ..., nₖ)”. For our purposes, we will interpret this as an abbreviation for an equation in which the variables m, n₁, ..., nₖ appear, and for which entities can be substituted as we wish. For example, we can define P(m, n) as a convenient shorthand for “m + n = n + m”, and thus P(0, k) and P(S(m), n) become shorthands for 0 + k = k + 0 and S(m) + n = n + S(m), respectively. This convention is only for our own purposes. In truth, P(m, n₁, ..., nₖ) can stand for a larger class of statements (called “well-formed formulas”) of which equations are merely a subclass. However, this document does not wish to overwhelm the reader, as we will not encounter these more general formulas herein, even in later sections. Still, we shall introduce the following principle in its most general form.

Axiom Schema 2.3 (Principle of Induction). If P(m, n₁, ..., nₖ) is a well-formed formula, then the following is an axiom: for all n₁, ..., and nₖ, if
  1. P(0, n₁, ..., nₖ), and
  2. for all m, if P(m, n₁, ..., nₖ) then P(S(m), n₁, ..., nₖ),
then for all m, P(m, n₁, ...
, nₖ).

In other words, if 0 satisfies a property, and S(m) satisfies it whenever m does, then all entities m in the universe satisfy the property. Notice that it is an axiom schema, that is, a scheme for generating multiple axioms of a common format. For each equation P(m, n₁, ..., nₖ), there is one induction axiom. Reflecting the unlimited number of ways to construct an equation, there are infinitely many induction axioms, of which two examples are given below. Pay attention to how the following axioms are derived from the general axiom schema.

If
  1. 0 + 0 = 0, and
  2. for all m, if 0 + m = m, then 0 + S(m) = S(m),
then for all m, 0 + m = m.

In this example, we constructed an induction axiom relative to the equation P(m) : 0 + m = m. Below we construct another one, associated with the equation P(k, m, n) : m + (n + k) = (m + n) + k instead:

For all m and n, if
  1. m + (n + 0) = (m + n) + 0, and
  2. for all k, if m + (n + k) = (m + n) + k, then m + (n + S(k)) = (m + n) + S(k),
then for all k, m + (n + k) = (m + n) + k.

These induction axioms will appear in later sections. Those already familiar with the process of induction may be able to identify the induction basis, the induction hypothesis, and the induction step within these examples.

Now, the motivation behind this principle is that since 0, 1, 2, 3, ... are linked by S in a single chain, i.e. S(0) = 1, S(1) = 2, S(2) = 3, and so on, then if 0 satisfies the property, so does S(0) = 1, and since 1 satisfies it, so does S(1) = 2, and so on, establishing the property for each entity in the entire chain in a sort of “domino effect”. However, if the universe of discourse is like the previous one, with an excess pair A and B around which S cycles in isolation from everything else, then this pair will be untouched by the domino effect, and thus A and B cannot be guaranteed to satisfy the property.
But this contradicts the supposed conclusion of the induction, that the property is certainly true for all entities m. Therefore, such a universe would violate the principle of induction. Essentially, the principle of induction eliminates outliers like A and B from the universe of discourse, thereby leaving only the entities 0, 1, 2, 3, ..., sequenced by S. We now call this universe the set of natural numbers, and the “entities” shall accordingly be referred to as natural numbers.

0, 1, 2, 3, 4, 5, 6, 7, ...

Recall that S is named the successor operation. Its role of succession is fully realized at this point. We are now prepared to axiomatize arithmetic by means of it.

Axiom 2.4. For all n, n + 0 = n.

Axiom 2.5. For all m and n, m + S(n) = S(m + n).

The addition operator is described fully by the above two axioms. That is, all the information we need to prove all the properties of addition over the natural numbers is encapsulated in these two concise axioms. We now finally arrive at our first theorem:

Theorem 2.6. 1 + 1 = 2.

Proof.
  1 + 1 = 1 + S(0)
        = S(1 + 0)      by Axiom 2.5
        = S(1)          by Axiom 2.4
        = 2.

Another similar theorem below demonstrates the recursive nature of the axioms of addition. Do you notice that they dictate a kind of recurring algorithm for computing sums?

Theorem 2.7. 3 + 4 = 7.

Proof.
  3 + 4 = 3 + S(3)
        = S(3 + 3)
        = S(3 + S(2))
        = S(S(3 + 2))
        = S(S(3 + S(1)))
        = S(S(S(3 + 1)))
        = S(S(S(3 + S(0))))
        = S(S(S(S(3 + 0))))
        = S(S(S(S(3))))
        = S(S(S(4)))
        = S(S(5))
        = S(6)
        = 7.

You are invited to identify which axioms were used for which lines of the proof. We can now give the awaited proofs of the associative and commutative laws of addition.

Theorem 2.8. For all k, m, and n, k + (m + n) = (k + m) + n.

Proof. Let P(n, k, m) be an abbreviation that stands for “k + (m + n) = (k + m) + n”. According to the axiom schema of induction, the following is an axiom: for all k and m, if
  1.
k + (m + 0) = (k + m) + 0, and
  2. for all n, k + (m + n) = (k + m) + n implies that k + (m + S(n)) = (k + m) + S(n),
then for all n, k + (m + n) = (k + m) + n.

Let k and m be natural numbers. By the first axiom of addition (Axiom 2.4), k + (m + 0) = k + m = (k + m) + 0, which proves the first condition of the induction axiom (those familiar with induction would call this the induction basis). Now, let n be a natural number such that k + (m + n) = (k + m) + n. This is the induction hypothesis. We have:

  k + (m + S(n)) = k + S(m + n)        by Axiom 2.5
                 = S(k + (m + n))      by Axiom 2.5
                 = S((k + m) + n)      by the induction hypothesis
                 = (k + m) + S(n)      by Axiom 2.5 backwards.

This proves the second condition of the induction axiom (the induction step). As a result, for all n, k + (m + n) = (k + m) + n.

Theorem 2.9. For all m and n, m + n = n + m.

Proof. We will use induction twice in this argument. First, we want to prove the following claim:

Claim. For all m, 0 + m = m.

If P(m) stands for “0 + m = m”, we obtain an induction axiom as follows: if
  1. 0 + 0 = 0, and
  2. for all m, 0 + m = m implies that 0 + S(m) = S(m),
then for all m, 0 + m = m.

Notice that there is no “for all” statement at the beginning. This is because P(m) has no variables available for substitution other than m, unlike all the examples we have shown so far. In such cases, the prefix is omitted. The induction basis is an instance of Axiom 2.4. Now let m be such that 0 + m = m (the induction hypothesis). We have:

  0 + S(m) = S(0 + m)      by Axiom 2.5
           = S(m)          by the induction hypothesis

thereby proving the induction step, and so proving the claim.

Now let P(n, m) stand for “m + n = n + m”; the following becomes an axiom:

Axiom 2.10. For all n, n · 0 = 0.

Axiom 2.11. For all m and n, m · S(n) = m · n + m.
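
Read as rewrite rules, Axioms 2.4, 2.5, 2.10, and 2.11 describe a recursive procedure for computing sums and products. The following Python sketch is one possible illustration of that reading, not part of the axiom system itself. It assumes a model in which 0 is the empty tuple and S(n) wraps n in a one-element tuple; the names ZERO, S, add, mul, numeral, and value are hypothetical helpers chosen only for this example.

```python
# Illustrative model only (an assumption, not from the text):
# 0 is the empty tuple, and S(n) wraps n in a one-element tuple.
ZERO = ()


def S(n):
    """Successor: build a new term from n."""
    return (n,)


def add(m, n):
    """Addition, recursing on the second argument."""
    if n == ZERO:
        return m                     # Axiom 2.4: m + 0 = m
    (pred,) = n                      # n has the form S(pred)
    return S(add(m, pred))           # Axiom 2.5: m + S(pred) = S(m + pred)


def mul(m, n):
    """Multiplication, recursing on the second argument."""
    if n == ZERO:
        return ZERO                  # Axiom 2.10: m · 0 = 0
    (pred,) = n                      # n has the form S(pred)
    return add(mul(m, pred), m)      # Axiom 2.11: m · S(pred) = m · pred + m


def numeral(k):
    """Build the term S(S(...S(0)...)) with k applications of S."""
    term = ZERO
    for _ in range(k):
        term = S(term)
    return term


def value(term):
    """Count how many times S was applied to 0 to build the term."""
    count = 0
    while term != ZERO:
        (term,) = term
        count += 1
    return count


assert add(numeral(1), numeral(1)) == numeral(2)   # Theorem 2.6
assert add(numeral(3), numeral(4)) == numeral(7)   # Theorem 2.7
print(value(mul(numeral(2), numeral(3))))          # prints 6
```

Each recursive call in add peels one S off the second argument, which mirrors the line-by-line unfolding in the proof of Theorem 2.7. Note that running such a procedure only checks individual instances; general statements such as Theorem 2.8 still require the principle of induction.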