Chapter 5 Sets, Etc.

A set is a collection of distinguishable objects, called its members or elements.
x ∈ S ≡ x is a member of S
∅ ≡ the empty set
Z = {…, −2, −1, 0, 1, 2, …} ≡ the set of integers
N = {0, 1, 2, …} ≡ the set of natural numbers
R ≡ the set of real numbers
A ⊆ B ≡ A is a subset of B
A ⊂ B ≡ A is a proper subset of B
A = B iff A ⊆ B and B ⊆ A

Laws of set operations:
Idempotency: A ∩ A = A, A ∪ A = A
Commutativity: A ∩ B = B ∩ A, A ∪ B = B ∪ A
Associativity: (A ∩ B) ∩ C = A ∩ (B ∩ C), (A ∪ B) ∪ C = A ∪ (B ∪ C)
Distributivity: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
DeMorgan's laws: A − (B ∩ C) = (A − B) ∪ (A − C), A − (B ∪ C) = (A − B) ∩ (A − C)

When all the sets under consideration are subsets of some larger set U, called the universe, we can define the complement of a set A as Ā = U − A.

A partition of a set S is a collection {S1, S2, …, Sn} of nonempty subsets of S s.t.
(1) Si ∩ Sj = ∅ for all i ≠ j
(2) S1 ∪ S2 ∪ … ∪ Sn = S

An infinite set that can be put into a one-to-one correspondence with N is called a countable set.
e.g. N is countable. Z is countable, via the bijection
f(k) = 2k − 1 if k > 0
f(k) = −2k if k ≤ 0

Inclusion-exclusion: |A ∪ B| = |A| + |B| − |A ∩ B|

The set of all subsets of a set S, denoted 2^S, is called the power set of S.
e.g. 2^{a, b} = {∅, {a}, {b}, {a, b}}

An ordered pair of 2 elements a and b is denoted (a, b), and can be defined formally as (a, b) = {a, {a, b}}. Thus (a, b) ≠ (b, a) when a ≠ b.

The Cartesian product of sets A and B, denoted A × B, is A × B = {(a, b) : a ∈ A and b ∈ B}.
The Cartesian product of n sets A1, A2, …, An is the set of n-tuples
A1 × A2 × … × An = {(a1, a2, …, an) : ai ∈ Ai, i = 1, 2, …, n}

Relations
A binary relation R on 2 sets A and B is a subset of the Cartesian product A × B.
If (a, b) ∈ R, we sometimes write aRb.
R is a binary relation on A if R is a subset of A × A.
e.g. "≤" is a binary relation on N: {(a, b) : a, b ∈ N and a ≤ b}

Properties of a binary relation R on A:
R is reflexive if aRa for all a ∈ A
R is symmetric if aRb implies bRa
R is transitive if aRb and bRc imply aRc
R is an equivalence relation if R is reflexive, symmetric, and transitive.
The equivalence class of a ∈ A is {b ∈ A : aRb}
e.g.
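These properties can be tested exhaustively on a finite carrier. A minimal Python sketch, using the parity relation of the example that follows (the helper names `related` and `classes` are ours, not from the text):

```python
def related(a, b):
    """aRb iff a + b is even (the parity relation on N)."""
    return (a + b) % 2 == 0

U = range(20)

# reflexive, symmetric, transitive => equivalence relation
assert all(related(a, a) for a in U)
assert all(related(b, a) for a in U for b in U if related(a, b))
assert all(related(a, c)
           for a in U for b in U for c in U
           if related(a, b) and related(b, c))

# the equivalence classes partition U: here, the evens and the odds
classes = {frozenset(b for b in U if related(a, b)) for a in U}
assert len(classes) == 2
```

The two frozensets in `classes` are exactly the two equivalence classes discussed next.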
R = {(a, b) : a, b ∈ N and a + b is even}
R forms 2 equivalence classes: the odd naturals {1, 3, 5, …} and the even naturals {0, 2, 4, 6, …}.
An equivalence relation is the same as a partition: the equivalence classes partition the set.

R is antisymmetric if aRb and bRa imply a = b.
A relation that is reflexive, antisymmetric, and transitive is a partial order.
e.g. the "is a descendant of" relation on people is a partial order (counting every person as a descendant of himself).
A partial order R on a set A is total if for all a, b ∈ A, we have aRb or bRa.
e.g. "≤" on N is a total order.

Exercises 5.2-2, 5.2-5

Functions
A function f is a binary relation on A and B s.t. for all a ∈ A, there exists precisely one b ∈ B s.t. (a, b) ∈ f. A is the domain of f and B is the codomain. If (a, b) ∈ f, we write b = f(a).
e.g. f = {(a, b) : a ∈ N and b = a mod 2} is a function f : N → {0, 1}
e.g. g = {(a, b) : a, b ∈ N and a + b is even} is NOT a function. Why? (Each a is related to infinitely many b, not precisely one.)
When the domain of a function f is a Cartesian product, f : A1 × A2 × … × An → B, we write f(a1, a2, …, an) = b instead of f((a1, a2, …, an)) = b.
The range of f is defined as {b ∈ B : b = f(a) for some a ∈ A}.

Properties
A function f is a surjection if the range of f equals the codomain of f.
A function f is an injection if f(a) ≠ f(b) for all a, b ∈ A with a ≠ b.
A function is a bijection, also called a one-to-one correspondence, if it is both injective and surjective.
e.g. f(n) = (−1)^n ⌈n/2⌉ is a bijection from N to Z.
When a function f is bijective, its inverse f⁻¹ satisfies f⁻¹(b) = a iff f(a) = b.
e.g. for f(n) = (−1)^n ⌈n/2⌉, the inverse is
f⁻¹(m) = 2m if m ≥ 0
f⁻¹(m) = −2m − 1 if m < 0
Q: for
f(k) = 2k − 1 if k > 0
f(k) = −2k if k ≤ 0
what is f⁻¹?

Graph
A directed graph G is an ordered pair (V, E), where V is a finite set and E is a binary relation on V.
In an undirected graph, E is a set of 2-subsets of V (i.e. E consists of unordered pairs).
In a directed graph, we say an edge (u, v) is incident from u and incident to v. We also say v is adjacent to u.
For an undirected graph (V, E), we say an edge (u, v) ∈ E is incident on u and v.
The degree of a vertex in an undirected graph is the # of edges incident on it. For a vertex in a directed graph, we distinguish its in-degree and out-degree.
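The bijection f(n) = (−1)^n ⌈n/2⌉ and its inverse can be checked exhaustively on a finite range. A small Python sketch (the names `f` and `f_inv` are ours):

```python
from math import ceil

def f(n):
    """Bijection N -> Z: 0, -1, 1, -2, 2, ..."""
    return (-1) ** n * ceil(n / 2)

def f_inv(m):
    """Inverse Z -> N: 2m if m >= 0, else -2m - 1."""
    return 2 * m if m >= 0 else -2 * m - 1

assert [f(n) for n in range(5)] == [0, -1, 1, -2, 2]
assert all(f_inv(f(n)) == n for n in range(1000))       # f_inv . f = identity on N
assert all(f(f_inv(m)) == m for m in range(-500, 500))  # f . f_inv = identity on Z
```

The same pattern of round-trip checks answers the Q above: guess a candidate inverse and verify both compositions are the identity.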
A path of length k is a sequence ⟨v0, v1, …, vk⟩ s.t. (v_{i−1}, v_i) ∈ E for i = 1, …, k. If there is a path from u to v, we say v is reachable from u. A simple path is one with no vertex repeated in its sequence. A cycle is a path ⟨v0, v1, …, vk⟩ s.t. v0 = vk. A cycle ⟨v0, v1, …, vk⟩ is simple if v1, v2, …, vk are distinct. A graph with no cycles is acyclic.
An undirected graph is connected if every pair of vertices is connected by a path. A directed graph is strongly connected if every 2 vertices are reachable from each other.
2 graphs G = (V, E) and G' = (V', E') are isomorphic if there exists a bijection f : V → V' s.t. (u, v) ∈ E iff (f(u), f(v)) ∈ E'. See Fig. 5.3 on p.89.
A graph G' = (V', E') is a subgraph of G = (V, E) if V' ⊆ V and E' ⊆ E.
A complete graph is an undirected graph in which every pair of vertices is adjacent.
A bipartite graph is an undirected graph G = (V, E) in which V can be partitioned into 2 sets V1 and V2 s.t. every edge (u, v) ∈ E has u ∈ V1 and v ∈ V2, or u ∈ V2 and v ∈ V1.
A multigraph is like an undirected graph, but it allows multiple edges between vertices and self-loops.
A hypergraph is like an undirected graph, but each edge is allowed to connect more than 2 vertices.

Exercise 5.4-7 on page 90

Chapter 6. Counting and Probability

(We write C(n, k) for the binomial coefficient "n choose k".)
Q: Argue that C(n, k) = C(n − 1, k) + C(n − 1, k − 1)
Q: Argue that C(n, k) C(k, j) = C(n, j) C(n − j, k − j)

Def. A sample space is a set of elementary events, each of which is a possible outcome of an experiment.
e.g. Flipping 2 distinguishable coins: S = {HH, HT, TH, TT}
Def. An event is a subset of the sample space S.
Def. A probability distribution Pr{} on a sample space S is a mapping from events of S to real numbers s.t.
1. Pr{A} ≥ 0 for every event A
2. Pr{S} = 1
3. Pr{A ∪ B} = Pr{A} + Pr{B} for any 2 mutually exclusive events A and B
Def. A probability distribution is discrete if it is defined over a finite or countably infinite sample space.
We often deal with the uniform probability distribution, that is, Pr{s} = 1/|S| for each s ∈ S. In that case, we say we pick an element of S at random.
e.g.
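Two quick Python checks for this part, sketched under our own helper names: the counting identities in the Q's (Pascal's rule, and the "committee-chair" identity C(n, k) C(k, j) = C(n, j) C(n − j, k − j)), and the three probability axioms for a uniform distribution over coin flips:

```python
from math import comb
from itertools import product
from fractions import Fraction

# Pascal's rule and the committee-chair identity, verified numerically
for n in range(1, 13):
    for k in range(1, n):
        assert comb(n, k) == comb(n - 1, k) + comb(n - 1, k - 1)
        for j in range(k + 1):
            assert comb(n, k) * comb(k, j) == comb(n, j) * comb(n - j, k - j)

# Uniform distribution over S = {H, T}^3 satisfies the three axioms
S = list(product("HT", repeat=3))
Pr = {s: Fraction(1, len(S)) for s in S}     # each outcome gets 1/|S|

def prob(event):                             # Pr of an event (a subset of S)
    return sum(Pr[s] for s in event)

A = [s for s in S if s[0] == "H"]            # "first flip is heads"
B = [s for s in S if s[0] == "T"]            # mutually exclusive with A
assert all(p >= 0 for p in Pr.values())      # axiom 1: nonnegativity
assert prob(S) == 1                          # axiom 2: total probability
assert prob(A + B) == prob(A) + prob(B)      # axiom 3: additivity on disjoint events
```

Using `Fraction` keeps every probability exact, so the axioms are checked with equality rather than floating-point tolerance.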
Flipping a fair coin n times: S = {H, T}^n
Def. The continuous uniform probability distribution is defined over a closed interval [a, b] of reals, where a < b, s.t. Pr{[c, d]} = (d − c)/(b − a) for any interval [c, d] with a ≤ c ≤ d ≤ b.

A discrete random variable X is a function from a finite or countably infinite sample space S to the real numbers.
e.g. Flipping a coin 3 times: S = {H, T}^3, X ≡ # of heads that appeared:
X(HHH) = 3; X(HHT) = X(HTH) = X(THH) = 2; X(HTT) = X(THT) = X(TTH) = 1; X(TTT) = 0
We define "X = x" as the event {s ∈ S : X(s) = x}. That is, X = 2 is equivalent to {HHT, HTH, THH}, so f(2) = Pr{X = 2} = 3/8.
In general, for n flips, f(x) = Pr{X = x} = C(n, x) (1/2)^x (1/2)^{n−x}; f(x) is the probability density function of X.
e.g. Tossing a pair of 6-sided dice, X ≡ the maximum of the 2 values.
S consists of the 36 pairs (1, 1), (1, 2), …, (6, 6), and X takes values 1 through 6.
Pr{X = 3} = Pr{(1, 3) ∪ (2, 3) ∪ (3, 3) ∪ (3, 2) ∪ (3, 1)} = 5/36
We can define several random variables on the same sample space.
Def. Two random variables X and Y are independent if Pr{X = x | Y = y} = Pr{X = x}, i.e. Pr{X = x and Y = y} = Pr{X = x} Pr{Y = y}.
Def. The expected value of X is E[X] = Σ_x x · Pr{X = x}
e.g. Tossing a fair coin 4 times, X ≡ # of heads that came up:
E[X] = 0·Pr{X = 0} + 1·Pr{X = 1} + 2·Pr{X = 2} + 3·Pr{X = 3} + 4·Pr{X = 4}
= (0·C(4, 0) + 1·C(4, 1) + 2·C(4, 2) + 3·C(4, 3) + 4·C(4, 4)) / 2^4
= (0 + 4 + 12 + 12 + 4) / 16 = 2
Properties of expectation:
E[X + Y] = E[X] + E[Y]
E[g(X)] = Σ_x g(x) Pr{X = x}
E[aX] = a E[X]
E[aX + Y] = a E[X] + E[Y]
For any 2 independent random variables X and Y: E[XY] = E[X] E[Y]
Def. The variance of X is
Var[X] = E[(X − E[X])²]
= E[X² − 2X·E[X] + E²[X]]
= E[X²] − 2E[X]·E[X] + E²[X]   (since E[X] is a constant)
= E[X²] − E²[X]
When X and Y are independent: Var[X + Y] = Var[X] + Var[Y]

Bayes's Theorem
Pr{A | B} = Pr{A} Pr{B | A} / Pr{B} = Pr{A} Pr{B | A} / (Pr{A} Pr{B | A} + Pr{Ā} Pr{B | Ā})
Ex. Given a fair coin and a biased coin that always comes up heads. Suppose you chose one coin at random, tossed it twice, and it came up heads twice. What is the probability that it is the biased coin?
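Two of the computations in this part can be verified exactly in Python: the expected number of heads in 4 fair flips (E[X] = 2, and Var[X] = 1 for the same experiment), and the biased-coin posterior of the Bayes example just posed. A minimal sketch; the variable names are ours:

```python
from itertools import product
from fractions import Fraction

# E[X] and Var[X] for X = # of heads in 4 fair coin flips, by brute force
outcomes = list(product("HT", repeat=4))              # 16 equally likely outcomes
p = Fraction(1, len(outcomes))
E  = sum(s.count("H") * p for s in outcomes)          # E[X]
E2 = sum(s.count("H") ** 2 * p for s in outcomes)     # E[X^2]
assert E == 2                                         # matches the sum above
assert E2 - E ** 2 == 1                               # Var[X] = E[X^2] - E^2[X]

# Bayes: one fair coin, one always-heads coin; two tosses both came up heads
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {"fair": Fraction(1, 4), "biased": Fraction(1)}   # Pr{2 heads | coin}
evidence = sum(prior[c] * likelihood[c] for c in prior)        # Pr{2 heads}
posterior_biased = prior["biased"] * likelihood["biased"] / evidence
assert posterior_biased == Fraction(4, 5)
```

The last assertion anticipates the hand derivation worked out next.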
Let X = 1 be the event of choosing the biased coin, X = 0 the event of choosing the fair coin, and Y the # of heads that came up (here Y = 2).
Pr(X = 0) = Pr(X = 1) = 1/2
Pr(Y = 2 | X = 1) = 1
Pr(Y = 2 | X = 0) = 1/4
Pr(X = 1 | Y = 2) = Pr(X = 1) Pr(Y = 2 | X = 1) / Pr(Y = 2)
= Pr(X = 1) Pr(Y = 2 | X = 1) / (Pr(X = 0) Pr(Y = 2 | X = 0) + Pr(X = 1) Pr(Y = 2 | X = 1))
= (1/2 · 1) / (1/2 · 1/4 + 1/2 · 1)
= (1/2) / (5/8)
= 4/5

Def. A Bernoulli trial is an experiment with outcome "success" of probability p, and outcome "failure" of probability q = 1 − p.

Q: We have a sequence of Bernoulli trials. How many trials do we need for the 1st success?
Def. A geometric distribution g(p) is Pr{X = k} = q^{k−1} p; see Fig. 6.1 for its graphical representation. E[X] = 1/p.

We have n Bernoulli trials; how many successes occur in the n trials?
Def. A binomial distribution b(k; n, p) is Pr{X = k} = C(n, k) p^k (1 − p)^{n−k}, and Σ_{k=0}^{n} b(k; n, p) = 1; see Fig. 6.2 for its graphical representation.
E[X] = Σ_{i=1}^{n} E[X_i] = np, where X_i indicates a success in the ith trial
Var[X] = Σ_{i=1}^{n} Var[X_i] = Σ_{i=1}^{n} pq = npq

Theorem: Pr{X ≥ k} ≤ C(n, k) p^k, where X ≡ # of successes in n Bernoulli trials. (Proof: see p.121.)
Corollary: Pr{X ≤ k} ≡ the probability of at most k successes ≡ the probability of at least n − k failures ≤ C(n, n − k) (1 − p)^{n−k} = C(n, k) (1 − p)^{n−k}

Probability analysis:
1. The birthday paradox: How many people must be in a room before there is a good chance (> 1/2) that two of them were born on the same day?
Suppose there are k people, with birthdays b1, b2, …, bk, and there are n (= 365) days in a year.
Pr{b_i = b_j} = 1/n for i ≠ j
A_i ≡ the event that person (i + 1)'s birthday differs from person j's birthday for all j ≤ i
B_k ≡ A_1 ∩ A_2 ∩ … ∩ A_{k−1} ≡ the event that all k people have different birthdays
Pr{B_k} = Pr{B_{k−1}} Pr{A_{k−1} | B_{k−1}}
= Pr{B_1} Pr{A_1 | B_1} Pr{A_2 | B_2} … Pr{A_{k−1} | B_{k−1}}
= 1 · ((n − 1)/n) ((n − 2)/n) … ((n − k + 1)/n)
= 1 · (1 − 1/n) (1 − 2/n) … (1 − (k − 1)/n)
Since 1 + x ≤ e^x,
Pr{B_k} ≤ e^{−1/n} e^{−2/n} e^{−3/n} … e^{−(k−1)/n} = e^{−k(k−1)/2n}
To let Pr{B_k} ≤ 1/2:
−k(k − 1)/2n ≤ ln(1/2), i.e. k(k − 1) ≥ 2n ln 2; for n = 365, k ≥ 23.

2. Toss identical balls into b bins
Suppose you toss n balls; how many balls fall in a given bin?
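Both results in this part, the birthday threshold k = 23 and the expected n/b balls per bin, can be checked exactly with small Python computations (the helper names are ours):

```python
from math import comb
from fractions import Fraction

# (1) Birthday paradox, exactly: Pr{B_k} = prod_{i < k} (n - i)/n
def prob_all_distinct(k, n=365):
    """Probability that k people all have different birthdays."""
    p = Fraction(1)
    for i in range(k):
        p *= Fraction(n - i, n)
    return p

assert prob_all_distinct(22) > Fraction(1, 2)   # 22 people: not yet a good chance
assert prob_all_distinct(23) < Fraction(1, 2)   # 23 people: Pr{all distinct} < 1/2

# (2) Balls into bins: E[# balls in a fixed bin] under b(k; n, 1/b)
def expected_in_bin(n, b):
    p = Fraction(1, b)
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

assert expected_in_bin(12, 4) == 3              # equals n/b
```

Note the exact product gives k = 23 without the 1 + x ≤ e^x approximation, confirming that the bound loses nothing here.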
Each ball lands in the given bin with probability 1/b, so the expected number of balls in it is n/b.

How many balls must one toss until every bin contains at least one ball?
Call a toss that lands in a previously empty bin a "hit"; the tosses form runs, each run ending with a hit:
hit, (misses), hit, (misses), hit, …
X_i ≡ # of balls you toss after hit i until you reach hit i + 1
E[X_0] = 1
E[X_1] = b/(b − 1)
E[X_i] = b/(b − i)   (each toss hits one of the b − i still-empty bins with probability (b − i)/b)
E[X_{b−1}] = b
E[X] = Σ_{i=0}^{b−1} E[X_i] = b(1 + 1/2 + 1/3 + … + 1/b) = b(ln b + O(1))

Exercise 6-2 on page 133

Z-transform
F(Z) = Σ_{n=0}^{∞} f_n Z^n, where f_n = Pr[X = n]
ex. f_n = 1/n! gives F(Z) = Σ_{n=0}^{∞} Z^n / n! = e^Z
F(Z) = E[Z^X]
F′(Z)|_{Z=1} = Σ_{n=0}^{∞} n Z^{n−1} f_n |_{Z=1} = E[X]
F″(Z)|_{Z=1} = Σ_{n=0}^{∞} n(n − 1) Z^{n−2} f_n |_{Z=1} = Σ_{n=0}^{∞} (n² − n) f_n = E[X² − X]
σ_X² = E[X²] − E²[X] = F″(1) + F′(1) − (F′(1))²
Let X = X_1 + X_2 + … + X_n, where the X_i are independent. Then
F_X(Z) = E[Z^{X_1 + X_2 + … + X_n}] = E[Z^{X_1}] E[Z^{X_2}] … E[Z^{X_n}] = F_{X_1}(Z) F_{X_2}(Z) … F_{X_n}(Z)
e.g. Poisson distribution: f_k = e^{−λ} λ^k / k! for k ≥ 0
F_{X_1}(Z) = Σ_{k=0}^{∞} e^{−λ_1} (λ_1 Z)^k / k! = e^{λ_1(Z − 1)}
F_X(Z) = e^{(λ_1 + λ_2 + … + λ_n)(Z − 1)}
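The expected number of tosses to fill every bin, b·(1 + 1/2 + … + 1/b), is the coupon collector's expectation; it can be computed exactly and compared against b ln b. A short Python sketch (the function name is ours):

```python
from fractions import Fraction
from math import log

def expected_tosses(b):
    """E[X] = sum_{i=0}^{b-1} b/(b - i) = b * H_b, exactly."""
    return sum(Fraction(b, b - i) for i in range(b))

assert expected_tosses(1) == 1          # one bin: the first toss fills it
assert expected_tosses(2) == 3          # 1 + 2 tosses in expectation

b = 100
exact = float(expected_tosses(b))
assert abs(exact - b * log(b)) < b      # b * H_b = b ln b + O(b)
```

For b = 100 the exact value is a bit above b ln b, consistent with H_b = ln b + O(1).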