COUNTING TECHNIQUES

Multiplication Rules - If an operation can be performed in n1 ways, and if for each of these a second operation can be performed in n2 ways, and for each of the first two a third operation can be performed in n3 ways, and so forth, then the sequence of k operations can be performed in n1 · n2 · · · nk ways.

Example: The design for a website is to use one of the four colors, a font from among three, and three different positions for an image. Calculate the number of web designs possible.

Solution: From the multiplication rule, 4 × 3 × 3 = 36 web designs are possible.

Permutation –

1. Permutation of Distinct Elements - The number of permutations of n different elements is n!, where n! = n(n − 1)(n − 2) · · · 3 · 2 · 1.

Example: A food company has four different recipes for a potential new product and wishes to compare them through consumer taste tests. In these tests, a participant is given the four types of food to taste in a random order and is asked to rank various aspects of their taste. How many different rankings of the four types of food are possible?

Solution: The four types of food are to be ranked (ordered sequence) by a participant. A permutation of the four types gives 4! = 4 × 3 × 2 × 1 = 24 different rankings.

2. Permutation of Subsets - The number of permutations of n distinct objects taken r at a time is nPr = n! / (n − r)!

Example: In one year, three awards (research, teaching, and service) will be given to a class of 25 graduate students in a statistics department. If each student can receive at most one award, how many possible selections are there?

Solution: Since the awards are distinguishable, it is a permutation problem. The total number of sample points is 25P3 = 25! / (25 − 3)! = 25! / 22! = 25 · 24 · 23 = 13,800.

3. Circular Permutation - The number of permutations of n objects arranged in a circle is (n − 1)!

Example: Eight people are to be seated around a dining table.
How many different arrangements are possible a) with no restrictions; b) if two people insist on sitting next to each other?

Solutions:
a) The total number of arrangements is (8 − 1)! = 7! = 5040.
b) We solve this in several steps. The first step is to count the two people as one group. There are then 7 units (6 individuals and 1 group) to be arranged around the table. According to the circular permutation rule, there are (7 − 1)! = 6! = 720 different arrangements. The second step is to count the number of arrangements of the people within the group, which is 2! = 2. According to the multiplication rule, the total number of arrangements is 720 × 2 = 1440.

4. Permutation of Similar Objects - The number of permutations of n = n1 + n2 + · · · + nr objects of which n1 are of one type, n2 are of a second type, . . ., and nr are of an rth type is n! / (n1! n2! · · · nr!).

Example: Code 39 is a common bar code system that consists of narrow and wide bars (black) separated by either wide or narrow spaces (white). Each character contains nine elements (five bars and four spaces). The code for a character starts and ends with a bar (either narrow or wide), and a (white) space appears between consecutive bars. The original specification (since revised) used exactly two wide bars and one wide space in each character. For example, if b and B denote narrow and wide (black) bars, respectively, and w and W denote narrow and wide (white) spaces, a valid character is bwBwBWbwb (the number 6). One character is held back as a start and stop delimiter. How many other characters can be coded by this system? Can you explain the name of the system?

Solution: The four white spaces occur between the five black bars. In the first step, focus on the bars. The number of permutations of five black bars when two are B and three are b is 5! / (2! 3!) = 10. In the second step, consider the white spaces. A code has three narrow spaces w and one wide space W, so there are 4! / (3! 1!)
= 4 possible locations for the wide space. Therefore, the number of possible codes is 10 × 4 = 40. If one code is held back as a start/stop delimiter, then 39 other characters can be coded by this system (and the name comes from this result).

Partitions – The number of ways of partitioning a set of n objects into r cells with n1 elements in the first cell, n2 elements in the second, and so forth, is $\binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}$.

Example: In how many ways can 7 graduate students be assigned to 1 triple and 2 double hotel rooms during a conference?

Solution: The total number of possible partitions is $\binom{7}{3, 2, 2} = \frac{7!}{3!\, 2!\, 2!} = 210$.

Combination – The number of combinations of n distinct objects taken r at a time is nCr = n! / (r! (n − r)!).

Example: A mother-participant samples eight food products and is asked to pick the best, the second best, and the third best. She buys the three products she likes best to take home as pasalubong (a take-home gift) for the family. a) How many different rankings are possible? b) How many different pasalubong are possible?

Solution:
a) Only three of the eight products are ranked. Thus, the number of ways to rank them is 8P3 = 8! / 5! = 336.
b) Since her three best choices are now reclassified as pasalubong, the other 5 products are considered not pasalubong. This is a partition into two cells, or a combination. The number of ways of choosing a pasalubong of three items is 8C3 = 8! / (3! 5!) = 56.

DISCRETE RANDOM VARIABLES AND THEIR PROBABILITY DISTRIBUTIONS

A random variable is a function that assigns a real number to each outcome in the sample space of a random experiment.

Example: Two balls are drawn in succession without replacement from an urn containing 4 red balls and 3 black balls. If the random variable Y is the number of red balls drawn, list the possible outcomes and the corresponding values y of Y.

Solution:

Outcome   y
RR        2
RB        1
BR        1
BB        0

Probability Mass Functions – For a discrete random variable X with possible values x1, x2, . . .
, xn, a probability mass function is a function f(x) such that
(1) $f(x_i) \ge 0$
(2) $\sum_{i=1}^{n} f(x_i) = 1$
(3) $f(x_i) = P[X = x_i]$

Example: Let the random variable Y denote the number of semiconductor wafers that need to be analyzed in order to detect a large particle of contamination. Assume that the probability that a wafer contains a large particle is 0.01 and that the wafers are independent. Determine the probability distribution of Y.

Solution: Let c denote a wafer in which a large particle is present, and let a denote a wafer in which it is absent. The sample space of the experiment is infinite, and it can be represented as all possible sequences that start with a string of a’s and end with c. That is, S = {c, ac, aac, aaac, aaaac, . . .}. Consider a few cases. We have P[Y = 1] = P[c] = 0.01. Also, using the independence assumption, P[Y = 2] = P[{ac}] = 0.99(0.01) = 0.0099. Continuing in the same way, $P[Y = y] = (0.99)^{y-1}(0.01)$ for y = 1, 2, 3, . . .

CUMULATIVE DISTRIBUTION FUNCTIONS

The cumulative distribution function of a discrete random variable X, denoted as F(x), is $F(x) = P[X \le x] = \sum_{x_i \le x} P[X = x_i]$. The cumulative distribution function F(x) of a discrete random variable satisfies the following properties:
(1) $F(x) = \sum_{x_i \le x} f(x_i)$
(2) $0 \le F(x) \le 1$
(3) If a ≤ b, then F(a) ≤ F(b).

Example: Determine the probability mass function of X from the cumulative distribution function

$F(x) = \begin{cases} 0, & x < -2 \\ 0.2, & -2 \le x < 0 \\ 0.7, & 0 \le x < 2 \\ 1, & 2 \le x \end{cases}$

Solution: The domain of the probability mass function is the set of included endpoints of the intervals, xi = −2, 0, 2. The value of f(x) at each xi is determined by f(xi) = F(xi) − F(xi−1) for i = 2, 3, and f(x1) is taken to be equal to F(x1).

f(x1) = f(−2) = F(x1) = F(−2) = 0.2
f(x2) = f(0) = F(x2) − F(x1) = F(0) − F(−2) = 0.7 − 0.2 = 0.5
f(x3) = f(2) = F(x3) − F(x2) = F(2) − F(0) = 1 − 0.7 = 0.3

Therefore,

$f(x) = \begin{cases} 0.2, & x = -2 \\ 0.5, & x = 0 \\ 0.3, & x = 2 \end{cases}$

Each f(xi) is the difference between the values of F(x) on consecutive subintervals, and the xi’s are the left endpoints of the subintervals.
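The subtraction of consecutive cumulative values used in the solution above can be sketched in Python. This is a minimal illustration and not part of the original notes: the function name pmf_from_cdf and its arguments are assumptions made for this sketch, and it assumes the jump points are supplied in increasing order.

```python
def pmf_from_cdf(jump_points, cdf_values):
    """Recover a discrete PMF from the CDF values at its jump points.

    jump_points: the x_i where F(x) jumps, in increasing order (assumed)
    cdf_values:  F(x_i) at each of those points
    """
    pmf = {}
    previous = 0.0  # F(x) is 0 to the left of the first jump point
    for x, cdf_at_x in zip(jump_points, cdf_values):
        # f(x_i) = F(x_i) - F(x_{i-1}); rounding hides floating-point noise
        pmf[x] = round(cdf_at_x - previous, 10)
        previous = cdf_at_x
    return pmf


# Jump points and CDF values taken from the example above
pmf = pmf_from_cdf([-2, 0, 2], [0.2, 0.7, 1.0])
print(pmf)
```

The probabilities returned should agree with the hand computation, and they should add up to 1, which is a quick check that the last cumulative value was entered correctly.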
EXPECTED VALUES OF RANDOM VARIABLES

The mean or expected value of the discrete random variable X with probability mass function f(x), denoted as µX or E[X], is $\mu_X = E[X] = \sum_{\text{all } x} x f(x)$.

Example: A salesperson for a medical device company has two appointments on a given day. At the first appointment, he believes that he has a 70% chance to make the deal, from which he can earn $1000 commission if successful. On the other hand, he thinks he only has a 40% chance to make the deal at the second appointment, from which, if successful, he can make $1500. What is his expected commission based on his own probability belief? Assume that the appointment results are independent of each other.

Solution: Let Y denote the total commission of the salesperson from the two appointments. The table below summarizes his total commission and the associated probabilities.

Outcome                     Commission y   Probability
Both deals made             2500           (0.7)(0.4) = 0.28
Only the first deal made    1000           (0.7)(0.6) = 0.42
Only the second deal made   1500           (0.3)(0.4) = 0.12
No deal made                0              (0.3)(0.6) = 0.18

His expected commission is µ = E[Y] = 2500(0.28) + 1000(0.42) + 1500(0.12) + 0(0.18) = 1300.

THE BINOMIAL DISTRIBUTION

Many processes can be thought of as consisting of a sequence of Bernoulli trials, such as the repeated tossing of a coin or the repeated examination of objects to determine whether or not they are defective. In such cases, a random variable of interest is the number of successes obtained within a fixed number of trials n, where a success is defined in an appropriate manner. Such a random variable is called a binomial random variable.

If the binomial random variable X is the number of trials that result in a success in a Bernoulli process having n trials, each with success probability p, the probability mass function of X is $f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$, x = 0, 1, 2, . . . , n. The probability distribution of X is called a binomial distribution, and it is probably the most important of all discrete probability distributions.
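The binomial probability mass function can be evaluated directly from its formula. The sketch below is only an illustration: the function name binomial_pmf and the values n = 4, p = 0.5 (four tosses of a fair coin) are assumptions chosen for this example and do not come from the notes. It also checks that the probabilities sum to 1 and computes the mean from the definition of expected value given earlier.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P[X = x] for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative values (assumed): four tosses of a fair coin
n, p = 4, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probabilities)  # should be 1 for any valid PMF
mean = sum(x * f for x, f in enumerate(probabilities))  # E[X] = sum of x * f(x)

print(probabilities)
print(total, mean)
```

Using math.comb keeps the binomial coefficient exact for integer arguments, so only the powers of p and 1 − p introduce floating-point rounding.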
The mean µ and variance σ² of the binomial random variable X with parameters n and p, the number of trials and the probability of a success, respectively, are µ = E[X] = np and σ² = V[X] = np(1 − p).

Example: Each sample of water has a 10% chance of containing a particular organic pollutant. Assume that the samples are independent with regard to the presence of the pollutant. a) Find the probability that in the next 18 samples, exactly 2 contain the pollutant. b) Find the probability that 3 to 5 of the next 20 samples contain the pollutant. c) Find the mean and standard deviation of the number of samples that contain the pollutant among 16 samples.

Solutions:
a) Let X be the number of samples that contain the pollutant in the next 18 samples analyzed. X is a binomial random variable with p = 0.1 and n = 18. Therefore, $P[X = 2] = \binom{18}{2}(0.1)^2(0.9)^{16} = 0.2835$.
b) Here X is binomial with p = 0.1 and n = 20, and the required probability is P[3 ≤ X ≤ 5].
$P[3 \le X \le 5] = P[X = 3] + P[X = 4] + P[X = 5] = \binom{20}{3}(0.1)^3(0.9)^{17} + \binom{20}{4}(0.1)^4(0.9)^{16} + \binom{20}{5}(0.1)^5(0.9)^{15} = 0.3118$
c) With n = 16, µ = np = 16(0.1) = 1.6 and $\sigma = \sqrt{\sigma^2} = \sqrt{np(1 - p)} = \sqrt{16(0.1)(0.9)} = 1.2$.

THE POISSON DISTRIBUTION

The number X of outcomes occurring during a Poisson experiment is called a Poisson random variable, and its probability distribution is called the Poisson distribution. The probability mass function of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is $f(x) = \frac{e^{-\lambda t}(\lambda t)^x}{x!}$, x = 0, 1, 2, . . ., where λ is the average number of outcomes per unit time, distance, area, or volume.

Example: On average, ten oil tankers arrive each day at a certain port. The facilities at the port can handle at most 15 tankers per day. a) What is the probability that 8 oil tankers arrive on a given day? b) What is the probability that on a given day tankers have to be turned away?

Solution:
a) We are given λ = 10 (oil tankers per day), so we take t = 1 (day). $f(8) = \frac{e^{-10}\, 10^8}{8!}$
= 0.1126
b) Tankers will be turned away if the number of tankers exceeds the port’s capacity of 15. Thus, the probability we seek is P[X > 15].
$P[X > 15] = 1 - P[X \le 15] = 1 - \sum_{x=0}^{15} \frac{e^{-10}\, 10^x}{x!} = 1 - 0.9513 = 0.0487$

CONTINUOUS RANDOM VARIABLES AND THEIR PROBABILITY DISTRIBUTION

Continuous Probability Distributions - Physical quantities such as time, length, area, temperature, pressure, load, intensity, etc., when they need to be described probabilistically, are modeled by continuous random variables. A continuous random variable is a function whose range is an interval of real numbers. When a sample space has an uncountably infinite number of sample points, the associated random variable is continuous, with its values distributed over one or more intervals on the real number line.

The function f(x) is a probability density function of the continuous random variable X defined over the set of real numbers if
a) $f(x) \ge 0$ for all real x;
b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$;
c) $P[a \le X \le b] = \int_{a}^{b} f(x)\,dx$.

Example: Consider the function

$f(x) = \begin{cases} \frac{1}{3}x^2, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$

a) Show that it is a probability density function of some continuous random variable X. b) Determine P[0 < X ≤ 1].

Solution:
a) We show that Properties a) and b) are satisfied.
1) Clearly, $\frac{1}{3}x^2 \ge 0$ for every real number x, so f(x) ≥ 0.
2) We must show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$: $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{1}{3}x^2\,dx = \left[\frac{x^3}{9}\right]_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1$.
b) $P[0 < X \le 1] = \int_{0}^{1} \frac{1}{3}x^2\,dx = \frac{1}{9}$

If X is a continuous random variable with probability density function f(x), the cumulative distribution function F(x) is defined as $F(x) = P[X \le x] = \int_{-\infty}^{x} f(t)\,dt$. The cumulative distribution function has the following properties:
1) $\lim_{x \to -\infty} F(x) = 0$
2) $\lim_{x \to +\infty} F(x) = 1$
3) $P[a < X \le b] = F(b) - F(a)$

Example: Suppose that for some continuous random variable X, $F(x) = \frac{1}{80}(x^4 - 1)$ for 1 ≤ x ≤ 3. a) What is the probability that X assumes a value between 1.2 and 2.6? b) Find the density function and use it to compute P[1.2 < X < 2.6].

Solution:
a) We apply Property 3 to compute P[1.2 < X < 2.6].
$P[1.2 < X < 2.6] = F(2.6) - F(1.2) = \frac{1}{80}(2.6^4 - 1) - \frac{1}{80}(1.2^4 - 1) = 0.5453$
b) $f(x) = F'(x) = \frac{1}{80}(4x^3) = \frac{1}{20}x^3$ for 1 ≤ x ≤ 3, so $P[1.2 < X < 2.6] = \int_{1.2}^{2.6} \frac{1}{20}x^3\,dx = 0.5453$, which agrees with part a).

EXPECTED VALUES OF CONTINUOUS RANDOM VARIABLES

Let X be a continuous random variable with density f(x). The mean or expected value of X, denoted µX or E[X], and the variance of X, denoted σ²X or V[X], are defined as

$\mu_X = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$ and $\sigma_X^2 = V[X] = \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx$.

Example: Consider the function

$f(x) = \begin{cases} \frac{1}{3}x^2, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$

Compute the mean and variance of the random variable.

Solution: The mean is $\mu = \int_{-1}^{2} x \cdot \frac{1}{3}x^2\,dx = \frac{1}{3}\int_{-1}^{2} x^3\,dx = \frac{5}{4}$. Using the equivalent form σ² = E[X²] − µ² of the variance, $\sigma^2 = \int_{-1}^{2} x^2 \cdot \frac{1}{3}x^2\,dx - \mu^2 = \frac{11}{5} - \frac{25}{16} = \frac{51}{80}$.

NORMAL DISTRIBUTION

The most widely used model for a continuous measurement is the normal random variable, and its distribution, the normal distribution, is the most important continuous probability distribution.

Example: The time X until recharge for a battery in a laptop computer under common conditions is normally distributed with µ = 260 minutes and σ = 50 minutes. Find the probability that a fully charged laptop lasts a) anywhere from 3 to 4 hours; b) longer than 3 hours; c) less than 270 minutes; d) longer than 300 minutes.

Solution:
a) $P[180 < X < 240] = \int_{180}^{240} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x - 260)^2}{2 \cdot 50^2}}\,dx = 0.2898$
b) We use the symmetric property of the curve, P[−∞ < X < µ] = P[µ < X < ∞] = 0.5.
Writing $f(x) = \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x - 260)^2}{2 \cdot 50^2}}$ for the density of X, we split each integral at µ = 260:
$P[X > 180] = \int_{180}^{\infty} f(x)\,dx = \int_{180}^{260} f(x)\,dx + \int_{260}^{\infty} f(x)\,dx = 0.9452$
c) $P[X < 270] = \int_{-\infty}^{270} f(x)\,dx = \int_{-\infty}^{260} f(x)\,dx + \int_{260}^{270} f(x)\,dx = 0.5793$
d) $P[X > 300] = \int_{300}^{\infty} f(x)\,dx = \int_{260}^{\infty} f(x)\,dx - \int_{260}^{300} f(x)\,dx = 0.2119$

EXPONENTIAL DISTRIBUTION

The random variable X, the distance between successive events from a Poisson process with mean number of events λ > 0 per unit distance, is an exponential random variable with parameter λ. The probability density function of X is $f(x) = \lambda e^{-\lambda x}$ for x ≥ 0, with mean µ = 1/λ and variance σ² = 1/λ².

Example: The lifetime of a mechanical assembly in a vibration test is exponentially distributed with a mean of 400 hours. a) What is the probability that an assembly on test fails in less than 100 hours? b) What is the probability that an assembly operates for more than 500 hours before failure?

Solution:
a) Let X be the time before a mechanical assembly in a vibration test fails, measured in hours. The mean lifetime is µ = 400 hours, so λ = 1/400. Then $P[X < 100] = F(100) = 1 - e^{-\frac{1}{400}(100)} = 0.2212$.
b) $P[X > 500] = e^{-\frac{1}{400}(500)} = 0.2865$

JOINT PROBABILITY DISTRIBUTION