Set Theory Definitions A set is an unordered, duplication-free collection of values (values = strings, numbers, functions, objects, sets, whatever). Examples: {0, 1, 2, 3, 5, 8, 13} {} {a, e, i, o, u} {{}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}} Unordered means order doesn't matter: {1, 2, 3} = {1, 3, 2} = {2, 1, 3} = etc. Duplication-free means that duplicated elements are ignored: {1, 1, 2, 3} = {1, 2, 3} Suppose S = {1, 2, 3}, then 1, 2, and 3 are the members of S. Equivalently, S contains 1, 2, and 3. Notation: 1 S && 2 S && 3 S && 4 S We use a special symbol to denote the empty set: {} = The cardinality or size of a set is the number of members: |S| = 3 || = 0 A multiset (or bag) is an unordered collection of values. Notation: S = [1, 2, 3] = multiset containing 1, 2, && 3 (Just made up [].) Duplicates are not ignored in multisets: [1, 1, 2, 3] != [1, 2, 3] A list (or sequence) is an ordered collection of values. Duplicates are allowed and order is important. For example: (1, 2, 3) != (3, 2, 1) (1, 1, 2, 3) != (1, 2, 3) Each element in a list has a unique position. The first element has position 0. For example, suppose L = (a, e, i, o, u), then L[0] = a, L[1] = e, etc. More notation: In general, there are two ways to describe a set. Set lister notation: A = {1, 2, 3} Set builder notation: A = {x | x == 1 || x == 2 || x == 3} In English: A = the set of all x such that x equals 1 or x equals 2 or x equals 3. Set lister notation is convenient for small sets. We sometimes stretch it to describe infinite sets by using ellipses: EVEN = {0, 2, 4, 6, ... } Set builder notation is more formal since any property can come after the "such that" pipe: PRIME = {x | 1 < x && x = a * b implies a = x || b = x} Operations Assume A and B are sets, then so are: A B = {x | x A || x B} A X B = {(x, y) | x A && y B} Ak = A X A X A ... X A (k-times) A – B = {x | x A && x B} A B = {x | x A && x B} P(A) = {x | x A} A -> B = {f | f is a function && domain(f) A && range(f) B} A is a subset of B if every member of A is also a member of B. Notation: A B A equals B (A = B) if A B && B A Note: For each set A, A It is conventional to write f A->B as f: A->B. Famous Sets NAT = {0, 1, 2, ... } INT = {0, 1, -1, 2, -2, ... } RAT = {a/b | a, b INT && 0 < b} REAL = all possible magnitudes and their negatives STRING = all strings of char BOOLEAN = {true, false} Infinities Yes, there is more than one infinity. In fact, there are infinity infinities, some more infiniter than others! Theorem: For every set A, |P(A)| = 2|A| Proof: By induction on |A|. Induction is a cool way to prove things. Here's the idea. Start by defining: W = { n NAT | if |A| = n, then |P(A)| = 2n} In other words: W = { n NAT | the theorem is true if |A| = n} Our goal will be to prove our theorem by proving W = NAT. We do this by showing: (i) 0 W (ii) if n - 1 W, then n W In other words, W contains the increment of any member by (ii), and since 0 W by (i), then 1 W. Applying (ii) again we conclude that 2 W, and so on. The cool feature of induction is that (ii) allows us to assume n – 1 W. This is a powerful assumption. In essence a proof by induction is analogous to a recursive algorithm. proof of (i): If |A| = 0, then A = and P(A) = {} and |P(A)| = 1 = 20. proof of (ii): Assume |A| = n > 0. Pick a random element a A. Let B = A – {a}. Then |P(B)| = 2n1 by induction. Of course P(B) P(A). P(A) – P(B) = {x {a} | x B}, so |P(A – B)| = |P(B)| and therefore |P(A)| = 2 * 2n-1 = 2n. QED. Assume A and B are infinite sets. What does it mean to say that |A| = |B|? Definition |A| = |B| if there is an f: A -> B such that: x != y implies f(x) != f(y) (f is one-to-one) for every b: B there is an a: A such that f(a) = b (f is onto) A one-to-one onto function is called a bijection. Note that if f is a bijection, then so is f-1: B -> A, where f(f-1(a)) = a. Note, using this definition, it's possible that a proper subset can have the same cardinality as the superset. Theorem: |NAT| = |EVEN| Proof: This is true because the function f(n) = 2 * n is a bijection. QED. Theorem: If A = {n NAT | n = 2k for some k}, then |A| = |NAT| Proof: Left as an exercise. Theorem (Euclid): If A = {n NAT | n is prime}, then |A| = |NAT| Proof: Clearly |A| <= |NAT|. If |A| < |NAT|, then A = {p0, p1, p2, ..., pn}, where 2 = p0 < p1 < p2 < ... < pn . Let k = (p0*p1*p2*...*pn), and let p = k + 1. Assume p = r * pi for some i and r, then p = pi*s + 1 = pi * r, so pi = 1/(r – s). This is impossible if pi is a NAT larger than 1, hence p is a prime larger than all of the primes assumed to exist. QED. Theorem: |A x B| <= |A| * |B| Proof: Left as an exercise. Theorem: |NAT| = |INT| = |RAT| = |STRING| Proof: Left as an exercise. Theorem: |NAT x NAT| = |NAT| Proof: Left as an exercise. Theorem: |A B| <= |A| + |B| Proof: Left as an exercise. Theorem: A B implies |A| <= |B| Proof: Left as an exercise. Theorem: If A NAT and A is finite (i.e., |A| < |NAT|), then |NAT – A| = |NAT| Proof: Left as an exercise. Theorem: If A = all binary strings, then |A| = |NAT| Proof: Left as an exercise. Theorem: If A = all valid C programs, then |A| = |NAT| Proof: Left as an exercise. Theorem: If A = all protons in the universe, then |A| <= |NAT| Proof: Left as an exercise. Theorem (Cantor): |NAT| < |P(NAT)| Proof: Clearly |Nat| <= |P(NAT)|. Now assume there is a bijection f: NAT -> P(NAT). Let D = {n: NAT | n f(n)}. Since f is onto, f(m) = D for some m. Is m a member of D? If yes, then m f(m), but this implies m D. On the other hand, if m f(m), then by definition m D, i.e., m f(m). A contradiction either way! QED. Theorem: |A| < |P(A)| for any set A. Proof: Left as an exercise. Notation: Even though |A| may be infinite, we still use exponentiation notation to denote the cardinality of its power set: |P(A)| = 2|A|. We also write |P(A)| = |A|+. So for all A, |A| < 2|A| = |A|+. Theorem: if 2 <= |B|, then |A| < |A->B| Proof: Note that |P(A)| = |A->BOOLEAN| and D = {n: NAT | f(n) = false}. More generally, assume f: A -> (A->B) is a bijection. Let g:A->B be defined by g(a) != f(a)(a). This should be possible since B contains more than one element. Now assume g = f(k) for some k A. then g(k) != f(k)(k) = g(k), a contradiction! QED. Theorem: |NAT| < |REAL| Proof: The proof depends on a famous bijection from NAT->NAT to REAL. So, we have an infinity of infinities: |NAT| < |P(NAT)| < |P(P(NAT))| < ... Notation: |NAT| = 0 0 is called countable infinity since any set with this cardinality can be put in a one-to-one correspondence with NAT, i.e., it can be counted. The next infinity after 0 is 1, and so on: 0 < 1 < 2 < ... < 0 < ... Generalized Continuum Hypothesis: i+1 = i+. This is a hypothesis because it is known that in some models of set theory it is true, while it is false in others. For example, in some weird models of set theory 0+ = 2.