1. COMBINATORIAL ANALYSIS

1.1 Counting Principles
1. Theorem (the basic principle of counting): If the set E contains n elements and the set F contains m elements, there are nm ways in which we can choose, first, an element of E and then an element of F.
2. Theorem (the generalized basic principle of counting): If r experiments that are to be performed are such that the first one may result in any of $n_1$ possible outcomes, and if for each of these $n_1$ possible outcomes there are $n_2$ possible outcomes of the second experiment, and if for each of the possible outcomes of the first two experiments there are $n_3$ possible outcomes of the third experiment, and so on, then there is a total of $n_1 n_2 \cdots n_r$ possible outcomes of the r experiments.
3. Theorem: A set with n elements has $2^n$ subsets.
4. Tree diagrams

1.2 Permutations
1. Permutation: there are $n!$ permutations of n distinct objects. The number of permutations of n things taken r at a time is $P_r^n = \frac{n!}{(n-r)!}$.
2. Theorem: The number of distinguishable permutations of n objects of k different types, where $n_1$ are alike, $n_2$ are alike, ..., $n_k$ are alike and $n = n_1 + n_2 + \cdots + n_k$, is $\frac{n!}{n_1!\,n_2!\cdots n_k!}$.

1.3 Combinations
1. Combination: the number of combinations of n things taken r at a time (the combinatorial coefficient, or binomial coefficient) is $C_r^n = \frac{n!}{r!\,(n-r)!}$.
2. Binomial theorem: $(x+y)^n = \sum_{i=0}^{n} C_i^n x^i y^{n-i}$.
3. Multinomial expansion: In the expansion of $(x_1 + x_2 + \cdots + x_k)^n$, the coefficient of the term $x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$, where $n_1 + n_2 + \cdots + n_k = n$, is $\frac{n!}{n_1!\,n_2!\cdots n_k!}$. Therefore, $(x_1 + x_2 + \cdots + x_k)^n = \sum \frac{n!}{n_1!\,n_2!\cdots n_k!} x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$. Note that the sum is taken over all nonnegative integers $n_1, n_2, \ldots, n_k$ such that $n_1 + n_2 + \cdots + n_k = n$.

1.4 The Number of Integer Solutions of Equations
1. There are $C_{r-1}^{n-1}$ distinct positive integer-valued vectors $(x_1, x_2, \ldots, x_r)$ satisfying $x_1 + x_2 + \cdots + x_r = n$, $x_i > 0$, $i = 1, 2, \ldots, r$.
2. There are $C_{r-1}^{n+r-1}$ distinct nonnegative integer-valued vectors $(x_1, x_2, \ldots, x_r)$ satisfying $x_1 + x_2 + \cdots + x_r = n$.

2. AXIOMS OF PROBABILITY

2.1 Sample Space and Events
1. Set theory concepts: set, element, roster method, rule method, subset, null set (empty set).
2. Complement: The complement of an event A with respect to S is the subset of all elements of S that are not in A. We denote the complement of A by A' (or $A^c$).
3. Intersection: The intersection of two events A and B, denoted by $A \cap B$, is the event containing all elements that are common to A and B. Two events A and B are mutually exclusive, or disjoint, if $A \cap B = \varnothing$, that is, if A and B have no elements in common.
4. Union: The union of two events A and B, denoted by $A \cup B$, is the event containing all the elements that belong to A or B or both.
5. Venn diagrams
6. Sample space of an experiment: the set of all possible outcomes (points).
7. Events: subsets of the sample space. The impossible event (impossibility) is $\varnothing$; the sure event (certainty) is S.
8. DeMorgan's laws: $\bigl(\bigcup_{i=1}^{n} E_i\bigr)^c = \bigcap_{i=1}^{n} E_i^c$ and $\bigl(\bigcap_{i=1}^{n} E_i\bigr)^c = \bigcup_{i=1}^{n} E_i^c$.

2.2 Axioms of Probability
1. Probability axioms: (1) $0 \le P(A) \le 1$; (2) $P(S) = 1$; (3) $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$ if $A_1, A_2, \ldots$ is a sequence of mutually exclusive events.
2. Equally likely outcomes: the probabilities of the single-element events are all equal.

2.3 Basic Theorems
1. (1) $0 \le P(A) \le 1$;
(2) for equally likely outcomes, $P(A) = \frac{N(A)}{N(S)}$;
(3) complementary events: $P(A^c) = 1 - P(A)$;
(4) inclusion-exclusion principle: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, with the analogous alternating sum for $P(A_1 \cup A_2 \cup \cdots \cup A_n)$;
(5) if $A_1, A_2, \ldots, A_n$ is a partition of the sample space S, then $P(A_1) + P(A_2) + \cdots + P(A_n) = P(S) = 1$;
(6) if A and A' are complementary events, then $P(A) + P(A') = 1$.
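The counting formulas of Chapter 1 are easy to sanity-check numerically. A minimal sketch using only Python's standard library (the MISSISSIPPI example and the parameter choices are mine, added for illustration):

```python
import math

n, r = 10, 4

# P^n_r = n! / (n-r)!  -- permutations of n things taken r at a time
assert math.perm(n, r) == math.factorial(n) // math.factorial(n - r)

# C^n_r = n! / (r! (n-r)!)  -- the binomial coefficient
assert math.comb(n, r) == math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Binomial theorem check: (x + y)^n = sum_i C^n_i x^i y^(n-i)
x, y = 3, 5
assert (x + y) ** n == sum(math.comb(n, i) * x**i * y ** (n - i) for i in range(n + 1))

# Distinguishable permutations of MISSISSIPPI: 11! / (1! 4! 4! 2!)
word = "MISSISSIPPI"
coeff = math.factorial(len(word))
for c in set(word):
    coeff //= math.factorial(word.count(c))
print(coeff)  # 34650

# Nonnegative integer solutions of x1 + ... + xr = n: C(n+r-1, r-1)
print(math.comb(n + r - 1, r - 1))  # 286 for n=10, r=4
```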
3. CONDITIONAL PROBABILITY AND INDEPENDENCE

3.1 Conditional Probability
1. Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.
2. If in an experiment the events A and B can both occur, then $P(A \cap B) = P(A)P(B \mid A) = P(B)P(A \mid B)$; likewise $P(A \cap B \cap C) = P(A \cap B)P(C \mid A \cap B) = P(A)P(B \mid A)P(C \mid A \cap B)$. The multiplication rule: $P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1})$.
3. Partition: Let $B_1, B_2, \ldots, B_n$ be a set of nonempty subsets of the sample space S of an experiment. If the events $B_1, B_2, \ldots, B_n$ are mutually exclusive and $B_1 \cup B_2 \cup \cdots \cup B_n = S$, the set $\{B_1, B_2, \ldots, B_n\}$ is called a partition of S.
4. Theorem of total probability: If $B_1, B_2, \ldots$ is a partition of S, and A is any event, then $P(A) = \sum_i P(A \mid B_i) P(B_i)$.
5. Bayes' theorem: If $B_1, B_2, \ldots$ is a partition of S, and A is any event, then $P(B_i \mid A) = \frac{P(B_i)P(A \mid B_i)}{P(A)} = \frac{P(B_i)P(A \mid B_i)}{\sum_j P(B_j)P(A \mid B_j)}$.

3.2 Independence
1. Independent events: A and B are independent events if $P(A \cap B) = P(A)P(B)$.
2. Theorem: If A and B are independent, then A and $B^c$ are independent, and $A^c$ and $B^c$ are independent.
3. The events A, B, and C are called independent if $P(A \cap B) = P(A)P(B)$, $P(A \cap C) = P(A)P(C)$, $P(B \cap C) = P(B)P(C)$, and $P(A \cap B \cap C) = P(A)P(B)P(C)$. If A, B, and C are independent events, we say that $\{A, B, C\}$ is an independent set of events.
4. The set of events $A_1, A_2, \ldots, A_n$ is called independent if for every subset $\{A_{i_1}, A_{i_2}, \ldots, A_{i_k}\}$, $k \ge 2$, of $A_1, A_2, \ldots, A_n$, we have $P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1})P(A_{i_2}) \cdots P(A_{i_k})$.

4. DISTRIBUTION FUNCTIONS AND DISCRETE RANDOM VARIABLES

4.1 Random Variables
1. Random variable X: Let S be the sample space of an experiment. A real-valued function $X : S \to \mathbb{R}$ is called a random variable of the experiment if, for each interval $I \subseteq \mathbb{R}$, $\{s : X(s) \in I\}$ is an event.
2. Probability function $p_X(x)$: (a) $p_X(x) \ge 0$ if $x \in R_X$ (the range of X); (b) $p_X(x) = 0$ if $x \notin R_X$; (c) $\sum_{x \in R_X} p_X(x) = 1$.

4.2 Distribution Functions
1. Cumulative distribution function (cdf): $F_X(t) = P(X \le t)$, $-\infty < t < \infty$.
2. $F_X(t)$ is non-decreasing; $0 \le F_X(t) \le 1$; $\lim_{t \to -\infty} F_X(t) = 0$ and $\lim_{t \to \infty} F_X(t) = 1$.
3. If $c < d$, then $F_X(c) \le F_X(d)$; $P(c < X \le d) = F_X(d) - F_X(c)$; $P(X > c) = 1 - F_X(c)$.
4. The cdf of a discrete random variable is a step function.

4.3 Expectations of Discrete Random Variables
1. Expected value (mean value, average value, or expectation) of a random variable X: $E(X) = \mu_X = \sum_{x \in R_X} x\,p_X(x)$.
2. Let g be a real-valued function. Then g(X) is a random variable with $E[g(X)] = \sum_{x \in R_X} g(x)\,p_X(x)$.
3. Let $g_1, g_2, \ldots, g_n$ be real-valued functions, and let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be real numbers. Then $E[\alpha_1 g_1(X) + \alpha_2 g_2(X) + \cdots + \alpha_n g_n(X)] = \alpha_1 E[g_1(X)] + \alpha_2 E[g_2(X)] + \cdots + \alpha_n E[g_n(X)]$.

4.4 Variances of Discrete Random Variables
1. The variance of a random variable is the average squared distance between X and its mean $\mu_X$: $\mathrm{Var}[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$ (a numeric check appears after this section).
2. Standard deviation: $\sigma_X = \sqrt{\mathrm{Var}[X]}$.
3. Let X be a discrete random variable; then $\mathrm{Var}[X] = 0$ if and only if X is a constant with probability 1.
4. Let X be a discrete random variable; then for constants a and b: $\mathrm{Var}[aX + b] = a^2\,\mathrm{Var}[X]$ and $\sigma_{aX+b} = |a|\,\sigma_X$.
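As a sanity check of Sections 4.3 and 4.4, the following minimal sketch computes the mean and variance of a discrete random variable directly from its probability function and verifies $\mathrm{Var}[aX+b] = a^2\,\mathrm{Var}[X]$ (the pmf below is hypothetical, chosen only for illustration):

```python
# Hypothetical probability function of a discrete random variable X.
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}
assert abs(sum(pmf.values()) - 1.0) < 1e-9  # (c): probabilities sum to 1

# E(X) = sum over the range of x * p_X(x)
mean = sum(x * p for x, p in pmf.items())

# Var[X] = E[X^2] - mu^2
second_moment = sum(x**2 * p for x, p in pmf.items())
var = second_moment - mean**2
print(mean, var)  # 1.7 and 0.81

# Var[aX + b] = a^2 Var[X]: the pmf of aX + b assigns the same
# probabilities to the transformed values a*x + b.
a, b = 3, 5
pmf_y = {a * x + b: p for x, p in pmf.items()}
mean_y = sum(y * p for y, p in pmf_y.items())
var_y = sum(y**2 * p for y, p in pmf_y.items()) - mean_y**2
assert abs(var_y - a**2 * var) < 1e-9
```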
5. SPECIAL DISCRETE DISTRIBUTIONS

5.1 Bernoulli and Binomial Random Variables
1. Bernoulli trial: an experiment with two possible outcomes. A Bernoulli random variable X with parameter p, where p is the probability of a success, has probability function
$p_X(x) = p^x (1-p)^{1-x} = p^x q^{1-x}$, for $x \in R_X = \{0, 1\}$.
Expected value: $E[X] = \mu_X = p$; variance: $\mathrm{Var}[X] = \sigma_X^2 = p(1-p) = pq$.
Example: In a throw of a fair die, call obtaining 4 or 6 a success and obtaining 1, 2, 3, or 5 a failure.
2. Binomial distribution: the number of successes in n repeated, independent Bernoulli trials. A binomial random variable Y with parameters n and p has probability function
$p_Y(y) = C_y^n p^y q^{n-y}$, for $y \in R_Y = \{0, 1, 2, \ldots, n\}$.
Expected value: $E(Y) = \mu_Y = np$; variance: $\mathrm{Var}[Y] = \sigma_Y^2 = npq$.
Example: A restaurant serves 8 entrees of fish, 12 of beef, and 10 of poultry. If customers select from these entrees randomly, what is the probability that two of the next four customers order fish entrees?

5.2 Multinomial Random Variables
1. Multinomial trial: an experiment with $k \ge 2$ different possible outcomes.
2. Multinomial distribution: n independent multinomial trials. The multinomial random variables $X_1, X_2, \ldots, X_k$ with parameters $n, p_1, p_2, \ldots, p_k$, where $X_i$ is the number of outcomes of the ith type, satisfy $\sum_{i=1}^{k} x_i = n$, $\sum_{i=1}^{k} p_i = 1$, and $R_{X_i} = \{0, 1, \ldots, n\}$. The joint probability function is
$p_{X_1, X_2, \ldots, X_k}(x_1, x_2, \ldots, x_k) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$, for $(x_1, x_2, \ldots, x_k) \in R_{X_1, X_2, \ldots, X_k}$.
Example: Draw 15 balls with replacement from a box containing 20 red, 10 white, 30 black, and 50 green. What is the probability of 7R, 2W, 4B, 2G?

5.3 Geometric Distribution
The trial number of the first success in a sequence of independent Bernoulli trials. A geometric random variable N with parameter p has probability function
$p_N(n) = pq^{n-1}$, for $n \in R_N = \{1, 2, 3, \ldots\}$.
Geometric series: $\sum_{i=0}^{\infty} q^i = 1 + q + q^2 + \cdots = \frac{1}{1-q}$, if $|q| < 1$.
Expected value: $E(N) = \mu_N = \frac{1}{p}$; variance: $\mathrm{Var}[N] = \sigma_N^2 = \frac{q}{p^2}$.
Example: From an ordinary deck of 52 cards we draw cards at random, with replacement, and successively until an ace is drawn. What is the probability that at least 10 draws are needed?
Memoryless property of geometric random variables: In successive independent Bernoulli trials, the probability that the next n outcomes are all failures does not change if we are given that the previous m successive outcomes were all failures: $P(N > n + m \mid N > m) = P(N > n)$.

5.4 Negative Binomial Distribution
The trial number of the rth success. A negative binomial random variable $N_r$ with parameters r and p has probability function
$P_{N_2}(n) = p \cdot C_1^{n-1} p q^{n-2} = (n-1)p^2 q^{n-2}$, for $n \in R_{N_2} = \{2, 3, 4, \ldots\}$;
$P_{N_r}(n) = C_{r-1}^{n-1} p^r q^{n-r}$, for $n \in R_{N_r} = \{r, r+1, r+2, \ldots\}$.
Expected value: $E(N_r) = \mu_{N_r} = \frac{r}{p}$; variance: $\mathrm{Var}[N_r] = \sigma_{N_r}^2 = \frac{rq}{p^2}$.
Example: Sharon and Ann play a series of backgammon games until one of them wins five games. Suppose that the games are independent and the probability that Sharon wins a game is 0.58. Find the probability that the series ends in seven games.

5.5 Poisson Distribution
1. The Poisson probability function: $P_K(k) = \frac{e^{-\lambda}\lambda^k}{k!}$, for $k = 0, 1, 2, 3, \ldots$. A Poisson random variable K has parameter $\lambda$; expected value: $E(K) = \mu_K = \lambda$; variance: $\mathrm{Var}[K] = \sigma_K^2 = \lambda$.
2. The Poisson approximation to the binomial: If X is a binomial random variable with parameters n and $p = \lambda/n$, then $\lim_{n \to \infty} P_X(x) = \frac{e^{-\lambda}\lambda^x}{x!}$, for $x \in R_X = \{0, 1, 2, \ldots\}$.
Example (application of the Poisson to the number of successes in Bernoulli trials and the number of arrivals in a time period): Your record as a typist shows that you make an average of 3 mistakes per page. What is the probability that you make 10 mistakes on page 437?
3. Poisson processes
Example: Suppose that children are born at a Poisson rate of five per day in a certain hospital. What is the probability that at least two babies are born during the next six hours?
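Each worked example in this chapter reduces to plugging parameters into the probability functions above. A minimal sketch computing all of them with the standard library (a check under my reading of the examples' parameters; approximate results noted in comments):

```python
from math import comb, exp, factorial

# Binomial (5.1): P(two of the next four customers order fish), p = 8/30
p = 8 / 30
print(comb(4, 2) * p**2 * (1 - p) ** 2)           # ~0.229

# Multinomial (5.2): 15 draws with replacement from 110 balls,
# P(7R, 2W, 4B, 2G) with p = 20/110, 10/110, 30/110, 50/110
counts, probs = [7, 2, 4, 2], [2 / 11, 1 / 11, 3 / 11, 5 / 11]
term = factorial(15)
for c in counts:
    term //= factorial(c)                          # multinomial coefficient
for c, pr in zip(counts, probs):
    term *= pr**c
print(term)                                        # ~0.00017

# Geometric (5.3): P(at least 10 draws to get an ace) = P(N > 9) = q^9
print((48 / 52) ** 9)                              # ~0.487

# Negative binomial (5.4): P(the series ends in game seven)
#   = P(Sharon's 5th win is game 7) + P(Ann's 5th win is game 7)
ps = 0.58
print(comb(6, 4) * ps**5 * (1 - ps) ** 2
      + comb(6, 4) * (1 - ps) ** 5 * ps**2)        # ~0.240

# Poisson (5.5): P(10 mistakes on a page), lambda = 3 mistakes per page
lam = 3
print(exp(-lam) * lam**10 / factorial(10))         # ~0.0008

# Poisson process: births at rate 5/day, so lambda = 5/4 per 6 hours;
# P(at least 2 births) = 1 - P(0) - P(1)
lam = 5 / 4
print(1 - exp(-lam) * (1 + lam))                   # ~0.355
```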
6. CONTINUOUS RANDOM VARIABLES

6.1 Probability Density Functions
1. Densities
2. Probability density function (pdf) of a continuous random variable X: a function $f_X(x)$ with
$f_X(x) \ge 0$ for all x; $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$; $P(a \le X \le b) = \int_a^b f_X(x)\,dx$;
$P(X \le b) = \int_{-\infty}^{b} f_X(x)\,dx = F_X(b)$; $P(X \ge a) = \int_a^{\infty} f_X(x)\,dx$;
$f_X(-\infty) = f_X(\infty) = 0$; $P(X = x) = 0$; $P(x \le X \le x + dx) \approx f_X(x)\,dx$.
Example: Experience has shown that while walking in a certain park, the time X, in minutes, between seeing two people smoking has a density function of the form $f_X(x) = \lambda x e^{-x}$, $x \ge 0$. (a) Calculate the value of $\lambda$. (b) Find the probability distribution function of X. (c) What is the probability that Jeff, who has just seen a person smoking, will see another person smoking in 2 to 5 minutes?

6.2 Cumulative Distribution Functions (cdf)
1. $F_X(t) = P(X \le t) = \sum_{x \le t} p_X(x)$ for a discrete random variable X; $F_X(t) = \int_{-\infty}^{t} f_X(x)\,dx$ for a continuous random variable X.
2. The cdf of a discrete random variable is a step function; the cdf of a continuous random variable is a continuous function.
3. The probability function of a discrete random variable is the size of the jump in $F_X(t)$; the pdf of a continuous random variable is $f_X(t) = \frac{d}{dt}F_X(t)$.

6.3 Expectations and Variances
1. Definition: If X is a continuous random variable with pdf $f_X(x)$, the expected value of X is defined by $E[X] = \mu_X = \int_{-\infty}^{\infty} x f_X(x)\,dx$.
Example: In a group of adult males, the difference between the uric acid value and 6, the standard value, is a random variable X with the pdf $f_X(x) = \frac{27}{490}(3x^2 - 2x)$ for $2/3 \le x \le 3$. Calculate the mean of these differences for the group.
2. Theorem: Let X be a continuous random variable with pdf $f_X(x)$; then for any function h, $E[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx$.
3. Corollary: Let X be a continuous random variable with pdf $f_X(x)$. Let $h_1, h_2, \ldots, h_n$ be real-valued functions and $\alpha_1, \alpha_2, \ldots, \alpha_n$ be real numbers. Then $E[\alpha_1 h_1(X) + \alpha_2 h_2(X) + \cdots + \alpha_n h_n(X)] = \alpha_1 E[h_1(X)] + \alpha_2 E[h_2(X)] + \cdots + \alpha_n E[h_n(X)]$.
4. Definition: If X is a continuous random variable with $E[X] = \mu_X$, then $\mathrm{Var}[X]$ and $\sigma_X$, called the variance and standard deviation of X, respectively, are defined by $\mathrm{Var}[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$ and $\sigma_X = \sqrt{\mathrm{Var}[X]}$.
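Both chapter examples have closed-form antiderivatives, so they make a quick numeric check. A minimal sketch (the antiderivatives in the comments come from the usual integration by parts and power rule):

```python
from math import exp

# Smoking example (6.1): f(x) = x e^{-x} for x >= 0 already integrates
# to 1, so lambda = 1. Integration by parts gives F(t) = 1 - (1 + t) e^{-t}.
def F(t):
    return 1 - (1 + t) * exp(-t)

print(F(5) - F(2))                    # P(2 <= X <= 5) ~ 0.366

# Uric acid example (6.3): E[X] = (27/490) * integral of x(3x^2 - 2x)
# over [2/3, 3]; an antiderivative of 3x^3 - 2x^2 is (3/4)x^4 - (2/3)x^3.
def G(x):
    return (3 / 4) * x**4 - (2 / 3) * x**3

print(27 / 490 * (G(3) - G(2 / 3)))   # mean difference ~ 2.36
```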
7. SPECIAL CONTINUOUS DISTRIBUTIONS

7.1 Uniform Random Variable
1. Density of a uniformly distributed random variable: $f_X(x) = \frac{1}{b-a}$, for $a < x < b$.
2. The cdf of a uniformly distributed random variable: $F_X(x) = 0$ if $x < a$; $F_X(x) = \frac{x-a}{b-a}$ if $a \le x < b$; $F_X(x) = 1$ if $x \ge b$.
3. The expected value and variance of the uniform random variable: $\mu_X = E[X] = \frac{a+b}{2}$; $\mathrm{Var}[X] = \sigma_X^2 = \frac{(b-a)^2}{12}$.
Example: Starting at 5:00 A.M., every half hour there is a flight from San Francisco airport to Los Angeles International airport. Suppose that none of these planes is completely sold out and that they always have room for passengers. A person who wants to fly to L.A. arrives at the airport at a random time between 8:45 A.M. and 9:45 A.M. Find the probability that she waits (a) at most 10 minutes; (b) at least 15 minutes.

7.2 The Exponential Distribution
1. The exponential probability law with parameter $\lambda$: $F_{T_1}(t) = 1 - e^{-\lambda t}$, $f_{T_1}(t) = \lambda e^{-\lambda t}$, for $t \ge 0$. $T_1$ is the time of occurrence of the first event in a Poisson process with parameter $\lambda$, starting at an arbitrary origin $t = 0$: $X_t = 0 \iff T_1 > t$, where $X_t$, the number of events to occur in $[0, t]$, is Poisson with parameter $\lambda t$.
2. The expected value and variance of the exponential random variable: $\mu_{T_1} = E[T_1] = \frac{1}{\lambda}$; $\mathrm{Var}[T_1] = \sigma_{T_1}^2 = \frac{1}{\lambda^2}$.
Example: Suppose that every three months, on average, an earthquake occurs in California. What is the probability that the next earthquake occurs after three but before seven months?
3. Memoryless feature of the exponential: $P(T_1 > a + b \mid T_1 > b) = P(T_1 > a)$, where a and b are any two positive constants.

7.3 The Erlang Distribution
1. The cdf and pdf for $T_2$: $F_{T_2}(t) = 1 - (1 + \lambda t)e^{-\lambda t}$, $f_{T_2}(t) = \lambda^2 t e^{-\lambda t}$, for $t \ge 0$. $T_2$ is the time of occurrence of the second event in a Poisson process with parameter $\lambda$, starting at an arbitrary origin $t = 0$: $X_t \le 1 \iff T_2 > t$, where $X_t$, the number of events to occur in $[0, t]$, is Poisson with parameter $\lambda t$.
2. The Erlang probability law with parameters r and $\lambda$: $F_{T_r}(t) = \sum_{k=r}^{\infty} \frac{(\lambda t)^k}{k!} e^{-\lambda t}$, $f_{T_r}(t) = \frac{\lambda^r t^{r-1}}{(r-1)!} e^{-\lambda t}$, for $t \ge 0$. $T_r$ is the time of occurrence of the rth event in a Poisson process with parameter $\lambda$, starting at an arbitrary origin $t = 0$: $X_t \le r - 1 \iff T_r > t$, where $X_t$ is the number of events to occur in $[0, t]$.
3. The expected value and variance of the Erlang random variable: $\mu_{T_r} = E[T_r] = \frac{r}{\lambda}$; $\mathrm{Var}[T_r] = \sigma_{T_r}^2 = \frac{r}{\lambda^2}$.
Example: Suppose that, on average, the number of α-particles emitted from a radioactive substance is four every second. What is the probability that it takes at least 2 seconds before the next two α-particles are emitted?

7.4 The Gamma Distribution
1. The gamma probability law with parameters n and $\lambda$: $f_U(u) = \frac{\lambda^n u^{n-1}}{\Gamma(n)} e^{-\lambda u}$, for $u \ge 0$, $n > 0$, $\lambda > 0$.
Gamma function: $\Gamma(n) = \int_0^{\infty} x^{n-1} e^{-x}\,dx$, $n > 0$; $\Gamma(n) = (n-1)\Gamma(n-1)$; $\Gamma(n) = (n-1)!$ if n is a positive integer. The Erlang random variable is a particular case of a gamma random variable, where n is restricted to the integer values $r = 1, 2, 3, \ldots$.
2. The expected value and variance of the gamma random variable: $E[U] = \mu_U = \frac{n}{\lambda}$; $\mathrm{Var}[U] = \sigma_U^2 = \frac{n}{\lambda^2}$.

7.5 The Normal (Gaussian) Distribution
1. The normal probability law with parameters $\mu$ and $\sigma$: $f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$, for $-\infty < x < \infty$, $\sigma > 0$.
2. The expected value and variance of the normal random variable: $\mu_X = E[X] = \mu$; $\mathrm{Var}[X] = \sigma_X^2 = \sigma^2$.
3. Standard normal random variable (the unit normal): $f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$, with $\mu = 0$, $\sigma = 1$.
4. New random variables: If X is normal with mean $\mu$ and variance $\sigma^2$, then $X + b$ is normal with mean $\mu + b$ and variance $\sigma^2$; $aX$ is normal with mean $a\mu$ and variance $a^2\sigma^2$; $aX + b$ is normal with mean $a\mu + b$ and variance $a^2\sigma^2$.
5. Switching from a non-unit normal to the unit normal: $Z = \frac{X - \mu}{\sigma}$, with $\mu_Z = 0$, $\sigma_Z = 1$.
Example: Suppose that height X is normal with $\mu = 66$ inches and $\sigma^2 = 9$. You can find the probability that a person picked at random is under 6 feet using only the table for the standard normal.
6. The probability that a normal random variable is within k standard deviations of the mean: $P(\mu - k\sigma \le X \le \mu + k\sigma) = P(-k \le X^* \le k)$, where $X^*$ is the unit normal.
7. The normal approximation to the binomial: If X is binomial with parameters n and p, where n is large, then X is approximately normal with $\mu = np$, $\sigma^2 = npq$.
Example: A coin has $P(\text{heads}) = 0.3$. Toss the coin 1000 times, so that the expected number of heads is 300. Find the probability that the number of heads is 400 or more.

7.6 The Election Problem

4.6 Functions of a Random Variable
1. The distribution function (cdf) method: If X is a random variable with cdf $F_X(t)$ and g is a monotonic function on $R_X$, then $W = g(X)$ has cdf
$F_W(t) = F_X(g^{-1}(t))$, if g is increasing;
$F_W(t) = 1 - \bigl(F_X(g^{-1}(t)) - P(X = g^{-1}(t))\bigr)$, if g is decreasing.
2. The density function (pdf) method: $f_W(t) = f_X(g^{-1}(t)) \left|\frac{d g^{-1}(t)}{dt}\right|$.

4.7 Simulating a Random Variable
1. Simulating a continuous random variable (see the sketch below)
2. Simulating a discrete random variable
3. Simulating a mixed random variable
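Section 4.7 names the methods without detail. For a continuous random variable the standard tool is the inverse-transform method: if U is uniform on (0, 1) and F is a continuous, invertible cdf, then $X = F^{-1}(U)$ has cdf F. A minimal sketch for the exponential law of Section 7.2 (the rate and sample size are arbitrary choices of mine):

```python
import random
from math import log

def simulate_exponential(lam: float) -> float:
    """Inverse-transform method: F(t) = 1 - e^{-lam t}, so F^{-1}(u) = -ln(1-u)/lam."""
    u = random.random()  # U uniform on [0, 1)
    return -log(1 - u) / lam

lam = 2.0
sample = [simulate_exponential(lam) for _ in range(100_000)]
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / len(sample)
print(mean, 1 / lam)     # should be close: E[T1] = 1/lam
print(var, 1 / lam**2)   # should be close: Var[T1] = 1/lam^2
```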
8. JOINTLY DISTRIBUTED RANDOM VARIABLES

8.1 Joint Densities
1. Jointly distributed random variables: two or more random variables are jointly distributed if their observed values are simultaneously determined by the same random mechanism.
2. Discrete random variables $X_1, X_2$: the joint probability function $p_{X_1,X_2}(x_1, x_2)$ satisfies $p_{X_1,X_2}(x_1, x_2) \ge 0$ for all $x_1, x_2$ and $\sum_{R} p_{X_1,X_2}(x_1, x_2) = 1$. Continuous random variables $X_1, X_2$: the joint pdf $f_{X_1,X_2}(x_1, x_2)$ satisfies $f_{X_1,X_2}(x_1, x_2) \ge 0$ for all $x_1, x_2$ and $\iint_R f_{X_1,X_2}(x_1, x_2)\,dx_1\,dx_2 = 1$; probabilities are given by integrating $f_{X_1,X_2}(x_1, x_2)$ over the region in question.
3. The joint density of independent random variables: the random variables X and Y are independent if and only if $p_{X,Y}(x, y) = p_X(x)p_Y(y)$ when X, Y are discrete, and $f_{X,Y}(x, y) = f_X(x)f_Y(y)$ when X, Y are continuous.
4. Uniform joint densities: if X and Y are jointly uniform on a region, the joint pdf is $f_{X,Y}(x, y) = \frac{1}{\text{area of region}}$, for (x, y) in the region. For independent uniform random variables, $P(\text{event}) = \frac{\text{favorable area}}{\text{total area}}$.

8.2 Marginal Densities
1. For discrete random variables $X_1, X_2$ with probability function $p_{X_1,X_2}(x_1, x_2)$ and range $R_{X_1,X_2}$, the marginal probability function for $X_1$ is $p_{X_1}(x_1) = \sum_{x_2} p_{X_1,X_2}(x_1, x_2)$ for $x_1 \in R_{X_1}$. $R_{X_1}$ is the marginal range for $X_1$: the set of first elements of $(x_1, x_2) \in R_{X_1,X_2}$.
2. For continuous random variables $X_1, X_2$ with pdf $f_{X_1,X_2}(x_1, x_2)$ and range $R_{X_1,X_2}$, the marginal pdf for $X_1$ is $f_{X_1}(x_1) = \int_{x_2} f_{X_1,X_2}(x_1, x_2)\,dx_2$ for $x_1 \in R_{X_1}$. $R_{X_1}$ is the marginal range for $X_1$: the set of first elements of $(x_1, x_2) \in R_{X_1,X_2}$.

8.3 Functions of Several Random Variables

8.4 Sums of Independent Random Variables
1. The sum of independent binomials with a common p: if $X_1$ has parameters $n_1, p$ and $X_2$ has parameters $n_2, p$, then $X_1 + X_2$ is binomial with parameters $n_1 + n_2, p$.
2. The sum of independent Poissons: if $X_1$ has parameter $\lambda_1$ and $X_2$ has parameter $\lambda_2$, then $X_1 + X_2$ is Poisson with parameter $\lambda_1 + \lambda_2$.
3. The sum of independent exponentials with a common $\lambda$: if $X_1, \ldots, X_n$ each have parameter $\lambda$, then $X_1 + \cdots + X_n$ has a gamma (Erlang) distribution with parameters n, $\lambda$.
4. The density of the sum of two arbitrary independent random variables is the convolution of the individual pdfs: $f_{X+Y}(z) = \int_{R_X} f_X(x)f_Y(z-x)\,dx = \int_{R_Y} f_X(z-y)f_Y(y)\,dy$ (see the discrete sketch after Section 9.2).
5. The sum of independent normals: if $X_1$ has mean $\mu_1$ and variance $\sigma_1^2$ and $X_2$ has mean $\mu_2$ and variance $\sigma_2^2$, then $X_1 + X_2$ is normal with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.

9. EXPECTATION

9.1 Expectation of a Random Variable
1. Definition of expected value (continuous): $\mu_X = E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$.
2. Mean of a uniformly distributed random variable: $\mu_X = E[X] = \frac{a+b}{2}$, if X is uniform on [a, b].
3. The expected value of g(X), some function of X: $E[g(X)] = \sum_x g(x)P(X = x)$ if X is discrete; $E[g(X)] = \int_{-\infty}^{\infty} g(x)f_X(x)\,dx$ if X is continuous. The expected value of g(X, Y), some function of X and Y: $E[g(X, Y)] = \iint_{\text{plane}} g(x, y) f_{X,Y}(x, y)\,dA$.
4. Properties of expectation:
i. $E[k] = k$ (constant k);
ii. $E[X + Y] = E[X] + E[Y]$;
iii. $E[kX] = kE[X]$ (constant k);
iv. if X and Y are independent, then $E[XY] = E[X]E[Y]$.
5. The expected value of the normal random variable: $E[X] = \mu$.

9.2 Variance
1. Definition of variance (continuous): $\mathrm{Var}[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$.
2. The variance of the Erlang random variable: $\mathrm{Var}[T_r] = \frac{r}{\lambda^2}$.
3. The variance of the gamma random variable: $\mathrm{Var}[U] = \frac{n}{\lambda^2}$.
4. The variance of the normal random variable: $\mathrm{Var}[X] = \sigma^2$.
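Item 4 of Section 8.4 has an exact discrete analogue, $p_{X+Y}(z) = \sum_x p_X(x)\,p_Y(z-x)$, which is easy to exercise numerically. A minimal sketch with two independent fair dice (my example, not from the notes):

```python
from collections import defaultdict

def convolve(p_x: dict, p_y: dict) -> dict:
    """pmf of X + Y for independent X, Y: p_{X+Y}(z) = sum_x p_X(x) p_Y(z - x)."""
    p_z = defaultdict(float)
    for x, px in p_x.items():
        for y, py in p_y.items():
            p_z[x + y] += px * py
    return dict(p_z)

die = {k: 1 / 6 for k in range(1, 7)}  # pmf of one fair die
total = convolve(die, die)
print(total[7])                         # 6/36 ~ 0.167, the most likely sum
print(sum(total.values()))              # ~1.0: a valid pmf
```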
5.2 Conditional Distributions and Independence
1. Definition 5.3: Conditional probability function
2. Definition 5.4: Conditional pdf
3. Definition 5.5: Independence
4. Definition 5.6

5.3 Multinomial and Bivariate Normal Probability Laws
1. Multinomial trial
2. Theorem 5.3: Multinomial probability function
3. Bivariate normal probability law
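The notes stop at naming the bivariate normal probability law; for reference, here is a minimal sketch of its standard density (the function name and check values are mine):

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Density of the bivariate normal law with means mu, standard
    deviations sigma, and correlation coefficient rho (|rho| < 1)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2))
    return exp(-q) / (2 * pi * sigma_x * sigma_y * sqrt(1 - rho**2))

# Sanity check at the mean with rho = 0: the density factors into the
# product of two standard normal peaks, giving 1/(2*pi)
print(bivariate_normal_pdf(0, 0, 0, 0, 1, 1, 0.0))  # ~0.159 = 1/(2*pi)
```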