Counting Techniques and Probability

Introduction
- Brief historical remarks
The Counting Principle, Permutations and Combinations
- The factorial notation
- Tree diagrams and the fundamental counting principle
- Permutations and combinations
The Language of Probability and Basic Properties
- Terminology and properties
- Using counting techniques to find probabilities
- The complement rule
The Addition Rule
Random Variables and Discrete Probability Distributions
- Definitions and examples
- Expected value

§1 Introduction

Probability is the branch of mathematics that studies the outcomes of random phenomena and gives a measure of the likelihood that a given event is realized. A phenomenon is called random if individual outcomes are uncertain but the long-term pattern of many outcomes is predictable. The modern theory of probability is the most important tool for statisticians and statistical inference.

Historically, human interest in chance has always been associated with games of chance and gambling which, in one form or another, were present in almost every known civilization. Through such games, humans developed intuition for the frequency of occurrence of certain events and used this intuition to make some predictions. But the beginnings of the scholarly study of random events go back only to the 15th century AD. Italian scholars like Tartaglia (1499 – 1557), Cardano (1501 – 1576) and Galileo (1564 – 1642) were probably among the first mathematicians who made probabilistic calculations of the outcomes of games of chance. The well-known French mathematicians Pascal (1623 – 1662) and Fermat (1601 – 1665) also conducted studies of gambling games. A further systematic theory was developed by Jakob Bernoulli (~1718) and Pierre-Simon de Laplace (~1812). Their theory was utilized in astronomy when studying the errors of measurements.

In this handout we will touch on the basics of discrete probability.
We will cover some counting rules that will be useful in calculating probabilities. We'll also introduce the concept of the expected value of a probability distribution and show how to use it to make inferences about the long-run outcome of a random experiment.

§2 The Counting Principle, Permutations and Combinations

2.1 The Factorial Notation

Let n be a natural number. The product of the first n positive integers is called n-factorial and is denoted by n!. In other words:

Definition (The factorial notation) For any positive integer n,
n! = (1)(2) … (n – 1)(n)
By convention, we set 0! = 1.

Example 1
3! = (1)(2)(3) = 6
4! = (1)(2)(3)(4) = 24
5! = (1)(2)(3)(4)(5) = 120
6! = (1)(2)(3)(4)(5)(6) = 720

Note that, in general, we have
n! = (n – 1)! n = (n – 2)! (n – 1) n = (n – 3)! (n – 2)(n – 1) n = … etc.

Exercises
1) Find the following:
2) For the following pairs n and k, find the value of
a) n = 8 and k = 5
b) n = 8 and k = 3

2.2 The Fundamental Counting Principle

If a process consists of k parts such that part number i can be performed in n_i ways (for each i = 1, 2, … , k), then there are exactly n_1 × n_2 × … × n_k ways to complete the process.

Example 2 Suppose that we want to construct all the “words” (meaningful or otherwise) of 3 distinct characters from the letters {A, C, T}. How many are there?

A systematic listing produces the following six such words, namely,
ACT, ATC, CAT, CTA, TAC, TCA
A tree diagram of the choices confirms that we did not miss any word. It also indicates where the number 6 comes from. The first letter can be any of the three letters; hence it can be chosen in 3 ways, resulting in the initial 3 branches. For every choice of a first letter, the second letter can be chosen in 2 ways (since it is not allowed to equal the first one). So each of the initial 3 branches produces two branches, resulting in a total of 3 × 2 = 6 branches. Given the first two letters, there is only one choice for the last letter.
Hence each of the six branches of the second stage produces a single branch for the third stage, resulting in a total of 6 × 1 = 6 branches.

Note: to get just the number of “words”, think of it as having three tasks to do: choose a first character, followed by a second character, and then a third. The first task can be done in three ways, the second in two ways, and the last in only one way. So, by the Fundamental Counting Principle, the words can be chosen in 3 × 2 × 1 = 6 ways.

Example 3 How many 3-digit numbers are there whose digits are each either 1 or 2?

Again we use a tree diagram to come up with such numbers. Eight are possible:
111, 112, 121, 122, 211, 212, 221, 222
Here the total number of branches realized in the final stage is 2 × 2 × 2 = 8, since at each stage we can choose the digit in two ways, 1 or 2.

Exercises
3) An electric circuit contains 4 switches that can be toggled “on/off”. For safety reasons, the first and third switches are not allowed to be “on” simultaneously. What is the number of possible modes of the circuit switches? Draw a suitable tree diagram to support your answer.
4) In how many different ways can four people {A, B, C, D} stand in line if B and C insist on standing next to one another? Use the Fundamental Counting Principle to determine the number of such arrangements. Hint: Think of B and C as “tied” together (in how many ways can this be done?) and then think of arranging A, D, and the BC “cluster” in a line.
5) In how many different ways can four people {A, B, C, D} stand in line if B and C refuse to stand next to one another? Hint: Again think of B and C as a cluster, which now has to be placed in non-consecutive positions. How many different positions are there for the “cluster”? In how many ways can the remaining two letters be placed?

Example 4 A five-character, case-insensitive alphanumeric password is chosen so that the first character is not a digit. How many such passwords are possible?

The first character is a letter (A-Z) and can be chosen in 26 ways.
Each of the remaining four characters can be either a letter (A-Z) or a digit (0-9), hence can be chosen in 36 ways. By the counting principle, there are 26 × 36 × 36 × 36 × 36 = 43,670,016 such passwords.

Exercises
6) If, in the previous example, we additionally require that all characters be distinct, how many possible passwords do we end up with?
7) A six-faced die is rolled 3 times. How many distinct outcomes are there if the order is relevant?
8) In the previous problem, how many distinct outcomes are there if the order is irrelevant?

2.3 Permutations and Combinations

2.3.1. Permutations

By the counting principle we have the following: if A1, A2, A3, … , An-1, An are n distinct objects, then the number of distinct ways (or sequences, or arrangements, or permutations) of listing these objects in different orders equals n!.

In Example 2 above, we listed all the possible arrangements of 3 distinct characters from the letters {A, C, T} and we have seen (by the counting principle) that there are 3! = 3 × 2 × 1 = 6 such arrangements. The order in which the characters are listed is of course relevant in this case.

Now, suppose we have 6 distinct objects and we want to count the number of ways we can arrange three of them in a sequence. The first object of the sequence can be chosen to be any of the 6 objects, the second is chosen to be any of the remaining 5, and the third can be any of the 4 objects left. So, there are 6 × 5 × 4 = 120 arrangements of 6 objects taken 3 at a time. More generally, we have

Definition (Permutations) An arrangement, or ordered listing, of k objects taken from n distinct objects is called a permutation.

Permutations Formula The number of different permutations of n distinct objects taken k at a time equals
nPk = n (n – 1) (n – 2) … (n – k + 1) = n! / (n – k)!

We stress that when talking about permutations, the order is important. For example, the two listings (A B C) and (B A C) of three letters taken from the 26 letters of the alphabet are considered different permutations.
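The permutation counts above can be checked by brute force. The following is a minimal Python sketch, not part of the original handout; the helper name count_permutations is hypothetical, and math.perm assumes Python 3.8 or later. It lists the orderings of the letters A, C, T and counts the arrangements of 6 objects taken 3 at a time in three different ways.

```python
from itertools import permutations
from math import factorial, perm  # math.perm assumes Python 3.8+


def count_permutations(n, k):
    """Return nPk = n! / (n - k)!, the number of ordered listings
    of k objects taken from n distinct objects (hypothetical helper)."""
    return factorial(n) // factorial(n - k)


# All orderings of the three letters A, C, T.
words = ["".join(p) for p in permutations("ACT")]
print(words)  # ['ACT', 'ATC', 'CAT', 'CTA', 'TAC', 'TCA']

# 6 distinct objects taken 3 at a time, counted three ways.
by_formula = count_permutations(6, 3)               # n! / (n - k)!
by_library = perm(6, 3)                             # standard-library count
by_listing = len(list(permutations(range(6), 3)))   # explicit enumeration
print(by_formula, by_library, by_listing)  # 120 120 120
```

Explicit enumeration is only practical for small n and k; the formula gives the same count without listing anything.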
Example 5 Fifteen contestants compete in a game show where the first- through fifth-place winners take $100, $80, $60, $40, and $20, respectively. How many possible different outcomes are there for the game?

An outcome is five winners taken from the 15 contestants. Since the prizes are different for the different places, order is relevant in this situation, and each outcome is a permutation of 5 people taken from 15. So, there are
15P5 = 15 × 14 × 13 × 12 × 11 = 360,360
outcomes.

2.3.2. Combinations

Example 6 How many different poker hands are there?

A poker hand consists of five cards. The order in which the player holds the cards is irrelevant. So, for example, the hands (10♠, 10♣, K♥, K♦, 8♥) and (10♠, K♥, 8♥, 10♣, K♦) count as the same hand. Thus, the number of different 5-card poker hands taken from a regular deck of 52 cards will not coincide with 52P5, the number of permutations of 5 objects taken from 52. The number of poker hands will of course be smaller. By the counting principle, there are 5! = 120 ways a player can arrange the five cards (K♥, K♦, 10♠, 10♣, 8♥) in his/her hand. This also holds true for any other set of 5 cards. Therefore, we conclude that the number of different 5-card poker hands is
52P5 / 5! = 311,875,200 / 120 = 2,598,960
These hands (or sets of cards where order is irrelevant) are called combinations. More generally,

Definition (Combinations) A set of k objects chosen from n distinct objects without regard to the order in which they appear is called a combination.

As in the example of poker hands, we can justify the following formula for the number of combinations (in terms of nPk).

Combinations Formula The number of different combinations of k objects chosen from n distinct objects equals
nCk = nPk / k! = n! / (k! (n – k)!)

Exercises
9) How many different 4-member committees can be formed in a department of 11 people?
10) In a “powerball” lottery, a player selects 4 numbers (order is not important) between 1 and 50 plus one number between 1 and 45 designated as the powerball number.
The powerball number is allowed to be the same as one of the 4 other numbers. How many choices does the player have?
11) In the previous problem, how many different choices does the player have if the order of the four non-powerball numbers is important?
12) John sees 10 books in a bookstore, all equally priced, that he would like to buy, but he has enough money to buy only four of them. How many different selections is he facing?

2.3.3. Counting with non-distinct objects

Example 7 How many different 5-letter sequences can be made from the letters in the word “Hanna”?

Recall that 5 distinct characters have 5! = 120 permutations, by the permutation formula. But since the letters of Hanna are not all distinct, they will not give as many different sequences. The 5 letters in Hanna are made of 3 distinct characters: two a's, two n's, and one h. A permutation of the five letters (h, a, n, n, a) will not change if two identical letters switch places. For instance, the permutation “ahnan” remains the same if one interchanges the first and fourth characters (the a's), nor will it change if one permutes the two n's among themselves. The number of permutations that end up only interchanging identical letters is 2! 2! 1! = 4. Thus among the 5! permutations of 5 characters, there are only 5! / (2! 2! 1!) = 30 distinct sequences. In general, we have

Permutations Formula (for non-distinct objects) If there are n objects with n1 alike, n2 alike, n3 alike, … , nk alike, then the number of distinct permutations of these n objects equals
n! / (n1! n2! n3! … nk!)

Exercises
13) A fair coin is tossed 10 times and the sequence of heads and tails is registered. Among all possible sequences, what is the percentage of the sequences with exactly 3 heads?
14) In the experiment of the previous problem, how many of the sequences have at least 3 heads?
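The counts in Examples 6 and 7 can be verified numerically. The sketch below is an illustration, not part of the original handout; the function name distinct_arrangements is hypothetical, and math.comb assumes Python 3.8 or later. It applies the formula for non-distinct objects to the letters of "hanna" and compares the result with a brute-force listing.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial  # math.comb assumes Python 3.8+


def distinct_arrangements(word):
    """Return n! / (n1! n2! ... nk!), where n1, ..., nk are the
    multiplicities of the repeated letters (hypothetical helper)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # divide out swaps of identical letters
    return total


# Example 6: 5-card hands from a 52-card deck, order irrelevant.
poker_hands = comb(52, 5)
print(poker_hands)  # 2598960

# Example 7: distinct orderings of the letters h, a, n, n, a.
by_formula = distinct_arrangements("hanna")
by_listing = len(set(permutations("hanna")))  # duplicates collapse in the set
print(by_formula, by_listing)  # 30 30
```

The set-based listing generates all 5! = 120 orderings and discards repeats, which is exactly the overcounting that the formula divides out.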
§3 The Language of Probability and Basic Properties

3.1 Random experiment, events and sample spaces

Definition (Random Experiment) A (discrete) random experiment (or random phenomenon) is a process whose possible outcomes are known, but whose outcome in any particular instance is random and cannot be determined in advance.

The simple experiment of tossing a coin has only two possible outcomes (heads or tails), but at any one instance of tossing a coin, we cannot tell ahead of time what the outcome is going to be. Similarly, rolling a die can produce one of six possible outcomes, but which number will show up at any given instance is random.

Definition (Sample Space, outcomes, and events) A sample space of a random experiment is a set (or collection) that consists of all the possible outcomes of the experiment. An event is a collection (possibly empty) of outcomes of the experiment, i.e. it is a subset of the sample space. A simple event is one that contains only one element of the sample space. Let A be an event. Denote by Ā the event consisting of all outcomes in the sample space that do not belong to A. The event Ā is called the complement of A.

The sample space of the experiment of tossing a coin is the two-element set {heads, tails}, while that of the experiment of rolling a die is {1, 2, 3, 4, 5, 6}. The subset {2, 4, 6} is the event of rolling a multiple of two. A sample space for the experiment of tossing a coin followed by rolling a die is {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}.

Exercises
15) List the sample space of the experiment consisting of tossing a coin and rolling a die. Do in class.
16) Describe a possible sample space for the experiment of randomly drawing two cards without replacement. What is the size of the sample space? List the event A = “Both cards are red and higher than a Jack”. Count the size of the event B = “At least one card is an Ace”.
17) List a possible sample space for the experiment consisting of rolling two dice.
What is the size of the event “the sum is 7”? Do in class.
18) For the experiment in the previous question, for every x = 1, 2, 3, … etc., list the outcomes and the number of elements in the event A(x) = “the sum is x”. Do in class.

3.2 Discrete Probability

The theory of (discrete) probability enables us to assign a quantitative measure to the likelihood (or chance) that a particular outcome of a random experiment will materialize. A numeric value, called the probability of the event, is assigned to each event in proportion to the likelihood that the event will be realized in any instance of the experiment. When the experiment is repeated a large number of times, the percentage of times the event occurs approaches this assigned probability value.

Definition (Probability) A probability for a random experiment is a rule that assigns to each event A of the sample space a number P(A). The assigned values must satisfy the following conditions:
- 0 ≤ P(A) ≤ 1,
- If A and B are disjoint events, P(A or B) = P(A) + P(B),
where the word disjoint means that no simple outcome can satisfy both events simultaneously.

This assignment of probability has the following intuitive properties:
- For any event or outcome, the probability is a number between 0 and 1.
- The probability of an impossible event is 0.
- For two events A and B, the probability P(A or B) that one or the other occurs is less than or equal to the sum P(A) + P(B) of the individual probabilities of the events.
- The probability of the sample space is 1.
For a finite probability space, these rules guarantee that the probabilities of the individual simple outcomes must add up to 1.

3.2.1. Sample spaces with equally likely outcomes

Intuitively speaking, the two events (heads and tails) in the experiment of tossing a fair coin are equally likely to be realized. So it seems justified to give each event a probability of 0.5 (= 50%).
Likewise, in the experiment of rolling a well-balanced die, the six numbers are equally likely to turn up. So it is fair to take the probability of each of these events to be 1/6. In general we have the following rule.

Rule for probabilities in a space of equally likely events If all the simple outcomes in a (finite) sample space are equally likely, the probability P(A) of an event A is computed by
P(A) = (number of outcomes in A) / (number of outcomes in the sample space)

3.3 Using counting techniques to find probabilities

Example 8 Two cards are drawn without replacement from a well-shuffled regular deck of cards. What is the probability that both are aces?

(Using Permutations) Take the sample space to be the permutations of 52 cards taken two at a time. The size of this sample space is 52P2 = 52 × 51 = 2652. Since there are 4 aces in a deck, the number of permutations consisting of two aces is 4P2 = 4 × 3 = 12. So,
P(drawing 2 aces) = 12/2652 = 1/221

The above computation could also be done using combinations. This changes the sample space and the outcomes in the event, but the probability remains the same (of course). Do in class.

The last example demonstrates a “simple multiplication” rule for probabilities. This is seen as follows. The probability of drawing an ace as the first card is 4/52, since there are 4 aces in a deck of 52 cards. Without replacement, the deck now has 51 cards, only three of which are aces. So, the probability of drawing an ace again is 3/51. The probability we computed is the same as the product
P(getting an ace on first draw) × P(getting an ace on second draw given the first was an ace) = (4/52) × (3/51) = 1/221

Exercises
19) Two dice were rolled. What is the probability that their sum is 5? That it is 7? That it is 9?
20) Five cards were randomly drawn. What is the probability that none of them is an ace? Answer once assuming replacement and once without replacement.
21) Five cards were randomly drawn without replacement. What is the probability that they all have the same suit?
22) Four students {A, B, C, D} stood in line randomly (one behind the other).
What is the probability that A is right in front of C?

3.3.1. The Complement Rule

When computing the probability of an event A, it is sometimes easier to compute the probability that the event A does not occur; in other words, the probability P(Ā) of the complementary event Ā. The two probabilities are related by
P(Ā) + P(A) = 1

Example 9 A coin is tossed 20 times and the outcome of each toss is registered. What is the probability that at least one heads shows up?

Let A denote the event “getting at least one heads”. This event includes the 20 (disjoint) sub-events A(x) = “getting x heads”, where x = 1, 2, 3, 4, … , 20. Computing the probabilities of all these events takes time. It is easier to compute P(Ā), as Ā is the event “getting no heads”. For a fair coin, exactly one of the 2^20 equally likely sequences has no heads, so P(Ā) = 1/2^20 and P(A) = 1 – 1/2^20.

3.3.2. The Addition Rule

If A and B are two events, we would like to find the probability that at least one of them occurs, i.e. the probability of the event “A or B”, also called the sum of A and B. If we count the outcomes in A and add to them the count of the outcomes in B, we will be counting the outcomes common to the two events twice. A Venn diagram illustrates this. So, P(“A or B”) need not equal P(A) + P(B). The correct formula is
P(“A or B”) = P(A) + P(B) – P(A ∩ B)
where A ∩ B is the event that A and B both occur simultaneously. If A ∩ B = ∅, then A and B are mutually exclusive, in which case the formula simplifies to P(“A or B”) = P(A) + P(B).

Exercises
23) The table gives a breakdown of the Titanic passengers by different categories.

            Men    Women   Boys   Girls
Survived    332    318     29     27
Died        1360   104     35     18

a) If a passenger is randomly selected, what is the probability of getting a woman or a child?
b) If a passenger is randomly selected, what is the probability of getting a man or someone who survived?
c) If a passenger is randomly selected, what is the probability of getting a boy who died?
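The complement and addition rules of this section can be checked on a small sample space. The sketch below is an illustration under stated assumptions, not part of the original handout: it assumes two fair dice and a fair coin, and the events in_a ("the first die shows 6") and in_b ("the sum is at least 10") are hypothetical choices made only to have two overlapping events.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely ordered pairs from two fair dice.
space = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) = (outcomes in the event) / (outcomes in the sample space)."""
    favorable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favorable), len(space))


def in_a(outcome):
    """Hypothetical event A: the first die shows 6."""
    return outcome[0] == 6


def in_b(outcome):
    """Hypothetical event B: the sum of the dice is at least 10."""
    return outcome[0] + outcome[1] >= 10


# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_or = prob(lambda o: in_a(o) or in_b(o))
p_rule = prob(in_a) + prob(in_b) - prob(lambda o: in_a(o) and in_b(o))
print(p_or, p_rule)  # 1/4 1/4

# Complement rule, as in Example 9 (fair coin assumed):
# P(at least one heads in 20 tosses) = 1 - P(no heads).
p_no_heads = Fraction(1, 2) ** 20
p_at_least_one = 1 - p_no_heads
print(p_at_least_one)  # 1048575/1048576
```

Exact fractions are used instead of floating-point numbers so that the two sides of each rule can be compared for equality.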
§4 Random Variables and Probability Distribution

4.1 Definitions and Examples

To be able to analyze probabilities of events quantitatively, we often associate numeric values with the outcomes and events. A (random) variable is a real variable that assumes these values.

For example, consider the experiment of rolling two dice. Each instance of this experiment produces a pair of values. There are 36 possible, equally likely outcomes. If we are playing a game where certain consequences depend only on the sum of the values on the dice, one needs to introduce a quantity (variable) representing the sum. Calling this variable x, we realize that x can take the values 2 to 12. Those outcomes for the variable x do not have equal probability. For instance, P(x = 2) = 1/36, as there is a single pair, (1, 1), where the sum of the dice is 2. On the other hand, P(x = 4) is 3/36. The following table gives the probability of each value of x. It is called a probability distribution (or a probability model) for the random variable x.

x      2     3     4     5     6     7     8     9     10    11    12
P(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

The probability function P(x) has the properties listed in the following definition.

Definition (Probability Models) A function P(x) of a random variable x is a probability distribution if it has the properties:
- For each x, 0 ≤ P(x) ≤ 1.
- Σ P(x) = 1, where the sum is taken over all values of x.

Exercises
24) In a gambling game, the player is paid $3 if s/he draws a jack or a queen and $5 if s/he draws a king or an ace. Otherwise, the player pays the casino the bet amount of $2.00. Find the probability model representing the amount of money the player wins (negative number for a loss!).
25) An American roulette wheel has 38 slots numbered 0, 00, and 1 to 36. The ball is equally likely to rest in any of these slots when the wheel is spun. One way to place a bet is to bet that the ball will rest on a multiple of 3.
Joe places a $1 bet that pays out $3 if a multiple of 3 comes up. Set up the probability model for the game.

4.2 Expected Values

Consider any of the experiments and distribution functions arising in the exercises above. Suppose a group of gamblers (or, equivalently, a computer simulation of the game) repeats the experiment a large number of times, recording each time the value of the variable x (which represents winnings or losses here) and tracking the mean (average) of the x-values realized. This mean will approach a particular value, which we call the expected value of the probability model. The expected value of a probability model is denoted by μ. The (theoretical) expected value can be calculated by the following formula.

Calculating the expected value (mean) of a probability model Suppose that the possible outcomes s1, s2, s3, ... , sk in the sample space are numbers. Suppose that the probability of the outcome sj is pj. The expected value (mean) of the probability distribution is
μ = s1 p1 + s2 p2 + ... + sk pk

Example 10 A teacher realized from experience that the probability that a student gets a grade x = 0, 1, 2, 3, or 4 on his 4-point quizzes follows the following model.

Grade         0      1      2      3      4
Probability   0.10   0.15   0.30   0.30   0.15

Calculate the expected value of this model.

The mean is
μ = (0)(0.1) + (1)(0.15) + (2)(0.3) + (3)(0.3) + (4)(0.15) = 2.25

Exercises
26) In the probability model obtained in exercise 24 above,
a) Find the mean of this model. Does the game favor the casino or the player? Explain.
b) What should the amount of the bet be so that the game becomes fair?
27) In the probability model obtained in exercise 25 above,
a) What is Joe's probability of winning?
b) What are Joe's mean winnings for one play, taking into account the $1 cost of each play?
c) Joe plays roulette every day for years. What can we expect as a result?
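The expected value formula can be applied mechanically once a probability model is written down. The sketch below is an illustration, not part of the original handout; the function name expected_value and the dictionary layout {value: probability} are assumptions made for this example. It reproduces the mean of Example 10 and also computes the mean of the sum of two fair dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def expected_value(model):
    """Return s1*p1 + s2*p2 + ... + sk*pk for a model given as
    {value: probability} (hypothetical helper)."""
    if sum(model.values()) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(value * p for value, p in model.items())


# Example 10: grades on a 4-point quiz (exact fractions avoid rounding).
grades = {
    0: Fraction("0.10"),
    1: Fraction("0.15"),
    2: Fraction("0.30"),
    3: Fraction("0.30"),
    4: Fraction("0.15"),
}
print(float(expected_value(grades)))  # 2.25

# Probability model for the sum x of two fair dice, built by enumeration.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dice_sum = {x: Fraction(c, 36) for x, c in counts.items()}
print(dice_sum[4], expected_value(dice_sum))  # 1/12 7
```

The check inside expected_value enforces the second property of a probability model, namely that the probabilities add up to 1.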