Chapter 3 Lecture Slides
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Chapter 3: Probability

Section 3.1: Basic Ideas
Definition: An experiment is a process that results in an outcome that cannot be predicted in advance with certainty.
Examples: rolling a die, tossing a coin, weighing the contents of a box of cereal.

Sample Space
Definition: The set of all possible outcomes of an experiment is called the sample space for the experiment.
Examples:
• For rolling a fair die, the sample space is {1, 2, 3, 4, 5, 6}.
• For a coin toss, the sample space is {heads, tails}.
• Imagine that a hole punch with a diameter of 10 mm punches holes in sheet metal. Because of variation in the angle of the punch and slight movements in the sheet metal, the diameters of the holes vary between 10.0 and 10.2 mm. For this experiment of punching holes, a reasonable sample space is the interval (10.0, 10.2).

More Terminology
Definition: A subset of a sample space is called an event.
• A given event is said to have occurred if the outcome of the experiment is one of the outcomes in the event. For example, if a die comes up 2, the events {2, 4, 6} and {1, 2, 3} have both occurred, along with every other event that contains the outcome "2".

Example 1
An electrical engineer has on hand a box containing four bolts and another box containing four nuts. The diameters of the bolts are 4, 6, 8, and 10 mm, and the diameters of the nuts are 6, 10, 12, and 14 mm. One bolt and one nut are chosen. Let A be the event that the bolt diameter is less than 8, let B be the event that the nut diameter is greater than 10, and let C be the event that the bolt and the nut have the same diameter.

Example 1 (cont.)
1. Find the sample space for this experiment.
2. Specify the subsets corresponding to the events A, B, and C.

Combining Events
The union of two events A and B, denoted A ∪ B, is the set of outcomes that belong to A, to B, or to both. In words, A ∪ B means "A or B." So the event "A or B" occurs whenever either A or B (or both) occurs.

Example 2
Let A = {1, 2, 3} and B = {2, 3, 4}. What is A ∪ B?

Intersections
The intersection of two events A and B, denoted A ∩ B, is the set of outcomes that belong both to A and to B. In words, A ∩ B means "A and B." Thus the event "A and B" occurs whenever both A and B occur.

Example 3
Let A = {1, 2, 3} and B = {2, 3, 4}. What is A ∩ B?

Complements
The complement of an event A, denoted A^c, is the set of outcomes that do not belong to A. In words, A^c means "not A." Thus the event "not A" occurs whenever A does not occur.

Example 4
Consider rolling a fair six-sided die. Let A be the event "rolling a six" = {6}. What is A^c = "not rolling a six"?

Mutually Exclusive Events
Definition: The events A and B are said to be mutually exclusive if they have no outcomes in common. More generally, a collection of events A1, A2, ..., An is said to be mutually exclusive if no two of them have any outcomes in common. Mutually exclusive events are sometimes referred to as disjoint events.

Probabilities
Definition: Each event in the sample space has a probability of occurring. Intuitively, the probability is a quantitative measure of how likely the event is to occur. Given any experiment and any event A:
• The expression P(A) denotes the probability that the event A occurs.
• P(A) is the proportion of times that the event A would occur in the long run, if the experiment were to be repeated over and over again.
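The long-run-frequency interpretation of P(A) can be illustrated with a short simulation. The following Python sketch (not part of the original slides; the function and variable names are only illustrative) estimates P(rolling a six) for a fair die by repeating the experiment many times and recording the proportion of trials in which the event occurs.

```python
import random

def estimate_probability(event, experiment, n_trials=100_000):
    """Estimate P(event) as the proportion of repeated trials in which the event occurs."""
    hits = sum(1 for _ in range(n_trials) if event(experiment()))
    return hits / n_trials

# Experiment: roll a fair six-sided die.  Event A: "rolling a six" (as in Example 4).
roll_die = lambda: random.randint(1, 6)
print(estimate_probability(lambda outcome: outcome == 6, roll_die))
# Typically prints a value close to 1/6 ≈ 0.167.
```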
Axioms of Probability
1. Let S be a sample space. Then P(S) = 1.
2. For any event A, 0 ≤ P(A) ≤ 1.
3. If A and B are mutually exclusive events, then P(A ∪ B) = P(A) + P(B). More generally, if A1, A2, ... are mutually exclusive events, then P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ...

A Few Useful Things
• For any event A, P(A^c) = 1 − P(A).
• Let ∅ denote the empty set. Then P(∅) = 0.
• If S is a sample space containing N equally likely outcomes, and if A is an event containing k outcomes, then P(A) = k/N.
• Addition Rule (for when A and B are not mutually exclusive): P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Example 5
A target on a test firing range consists of a bull's-eye with two concentric rings around it. A projectile is fired at the target. The probability that it hits the bull's-eye is 0.10, the probability that it hits the inner ring is 0.25, and the probability that it hits the outer ring is 0.45.
1. What is the probability that the projectile hits the target?
2. What is the probability that it misses the target?

Example 6
An extrusion die is used to produce aluminum rods. Specifications are given for the length and diameter of the rods. For each rod, the length is classified as too short, too long, or OK, and the diameter is classified as too thin, too thick, or OK. In a population of 1000 rods, the number of rods in each class is as follows:

                           Length
Diameter       Too Short     OK     Too Long
Too Thin           10        38         2
OK                  3       900        25
Too Thick           5         4        13

Example 6 (cont.)
1. What is the probability that a randomly chosen rod is too short?
2. If a rod is sampled at random, what is the probability that it is neither too short nor too thick?

HW 3.1: 3, 4, 5, 6, 8

Section 3.2: Conditional Probability and Independence
Definition: A probability that is based on part of the sample space is called a conditional probability.
Let A and B be events with P(B) ≠ 0. The conditional probability of A given B is
P(A | B) = P(A ∩ B) / P(B).

Back to Example 6
What is the probability that a rod will have a diameter that is OK, given that the length is too long?

Independence
Definition: Two events A and B are independent if the probability of each event remains the same whether or not the other occurs.
• If P(A) ≠ 0 and P(B) ≠ 0, then A and B are independent if P(B | A) = P(B) or, equivalently, P(A | B) = P(A).
• If either P(A) = 0 or P(B) = 0, then A and B are independent.
• These concepts can be extended to more than two events.

Example 6 (cont.)
• If an aluminum rod is sampled at random from the population of 1000 rods, find P(too long) and P(too long | too thin). Are these probabilities different? Why or why not?

The Multiplication Rule
• If A and B are two events and P(B) ≠ 0, then P(A ∩ B) = P(B)P(A | B).
• If A and B are two events and P(A) ≠ 0, then P(A ∩ B) = P(A)P(B | A).
• If P(A) ≠ 0 and P(B) ≠ 0, then both of the above hold.
• If A and B are two independent events, then P(A ∩ B) = P(A)P(B).

Extended Multiplication Rule
• If A1, A2, ..., An are independent events, then for each collection Aj1, ..., Ajm of these events,
P(Aj1 ∩ Aj2 ∩ ... ∩ Ajm) = P(Aj1)P(Aj2) ⋯ P(Ajm).
• In particular,
P(A1 ∩ A2 ∩ ... ∩ An) = P(A1)P(A2) ⋯ P(An).

Example 7
A system contains two components, A and B, connected in series. The system will function only if both components function. The probability that A functions is 0.98 and the probability that B functions is 0.95. Assume that A and B function independently. Find the probability that the system functions.

HW 3.2: 3, 5, 6, 7, 11
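The conditional-probability and independence questions in Example 6 can be checked directly from the counts. The Python sketch below (not from the original slides; the dictionary layout is just one convenient way to store the table) computes P(diameter OK | length too long) and compares P(too long) with P(too long | too thin).

```python
# Counts from Example 6: rows are diameter classes, columns are length classes.
counts = {
    "too thin":  {"too short": 10, "ok": 38,  "too long": 2},
    "ok":        {"too short": 3,  "ok": 900, "too long": 25},
    "too thick": {"too short": 5,  "ok": 4,   "too long": 13},
}
total = sum(sum(row.values()) for row in counts.values())            # 1000 rods

n_too_long = sum(row["too long"] for row in counts.values())         # 40 rods are too long
p_ok_given_too_long = counts["ok"]["too long"] / n_too_long          # 25/40 = 0.625

# Independence check: compare P(too long) with P(too long | too thin).
p_too_long = n_too_long / total                                      # 40/1000 = 0.040
p_too_long_given_thin = counts["too thin"]["too long"] / sum(counts["too thin"].values())  # 2/50 = 0.040

print(p_ok_given_too_long, p_too_long, p_too_long_given_thin)
```

Because the last two probabilities come out equal, knowing that a rod is too thin does not change the probability that it is too long for this population.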
Section 3.3: Random Variables
Definition: A random variable assigns a numerical value to each outcome in a sample space.
Definition: A random variable is discrete if its possible values form a discrete set.

Example 8
The number of flaws in a 1-inch length of copper wire manufactured by a certain process varies from wire to wire. Overall, 48% of the wires produced have no flaws, 39% have one flaw, 12% have two flaws, and 1% have three flaws. Let X be the number of flaws in a randomly selected piece of wire. Write down the possible values of X and the associated probabilities, providing a complete description of the population from which X was drawn.

Probability Mass Function
• The description of the possible values of X and the probabilities of each has a name: the probability mass function.
Definition: The probability mass function (pmf) of a discrete random variable X is the function p(x) = P(X = x).
• The probability mass function is sometimes called the probability distribution.

Cumulative Distribution Function
• The probability mass function specifies the probability that a random variable is equal to a given value.
• A function called the cumulative distribution function (cdf) specifies the probability that a random variable is less than or equal to a given value.
• The cumulative distribution function of the random variable X is the function F(x) = P(X ≤ x).

More on a Discrete Random Variable
Let X be a discrete random variable. Then
• The probability mass function of X is the function p(x) = P(X = x).
• The cumulative distribution function of X is the function F(x) = P(X ≤ x) = Σ_{t ≤ x} p(t) = Σ_{t ≤ x} P(X = t).
• Σ_x p(x) = Σ_x P(X = x) = 1, where the sum is over all the possible values of X.

Example 8 (cont.)
Recall the example of the number of flaws in a randomly chosen piece of wire. The following is the pmf: P(X = 0) = 0.48, P(X = 1) = 0.39, P(X = 2) = 0.12, and P(X = 3) = 0.01. Compute the cdf of the random variable X that represents the number of flaws in a randomly chosen wire.

Mean and Variance for Discrete Random Variables
• The mean (or expected value) of X is given by μ_X = Σ_x x P(X = x), where the sum is over all possible values of X.
• The variance of X is given by σ_X² = Σ_x (x − μ_X)² P(X = x) = Σ_x x² P(X = x) − μ_X².
• The standard deviation is the square root of the variance.

Example 9
A certain industrial process is brought down for recalibration whenever the quality of the items produced falls below specifications. Let X represent the number of times the process is recalibrated during a week, and assume that X has the following probability mass function:

x        0      1      2      3      4
p(x)   0.35   0.25   0.20   0.15   0.05

Find the mean and variance of X.

The Probability Histogram
• When the possible values of a discrete random variable are evenly spaced, the probability mass function can be represented by a histogram, with rectangles centered at the possible values of the random variable.
• The area of the rectangle centered at a value x is equal to P(X = x).
• Such a histogram is called a probability histogram, because the areas represent probabilities.

Probability Histogram for the Number of Flaws in a Wire (Example 8)
The pmf is: P(X = 0) = 0.48, P(X = 1) = 0.39, P(X = 2) = 0.12, and P(X = 3) = 0.01.

Example 9 (cont.)
Construct a probability histogram for the example with the number of weekly recalibrations (Example 9).
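As a concrete illustration of the cdf, mean, and variance formulas for a discrete random variable (not part of the original slides), the following Python sketch computes them for the flaw-count pmf of Example 8; the same helper applies unchanged to the recalibration pmf of Example 9.

```python
def summarize_pmf(pmf):
    """Given a pmf as {value: probability}, return the cdf, mean, and variance."""
    values = sorted(pmf)
    cdf, running = {}, 0.0
    for x in values:
        running += pmf[x]
        cdf[x] = running                                   # F(x) = P(X <= x)
    mean = sum(x * pmf[x] for x in values)                 # mu_X = sum of x * P(X = x)
    variance = sum(x**2 * pmf[x] for x in values) - mean**2
    return cdf, mean, variance

flaws = {0: 0.48, 1: 0.39, 2: 0.12, 3: 0.01}               # pmf from Example 8
print(summarize_pmf(flaws))
# cdf = {0: 0.48, 1: 0.87, 2: 0.99, 3: 1.0}, mean = 0.66, variance ≈ 0.5244
```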
Continuous Random Variables
• A random variable is continuous if its probabilities are given by areas under a curve.
• The curve is called a probability density function (pdf) for the random variable. Sometimes the pdf is called the probability distribution.
• The function f(x) is the probability density function of X.
• Let X be a continuous random variable with probability density function f(x). Then ∫_{−∞}^{∞} f(x) dx = 1.

Computing Probabilities
Let X be a continuous random variable with probability density function f(x). Let a and b be any two numbers, with a < b. Then
P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b) = ∫_a^b f(x) dx.
In addition,
P(X ≤ a) = P(X < a) = ∫_{−∞}^{a} f(x) dx and P(X ≥ a) = P(X > a) = ∫_a^{∞} f(x) dx.

More on Continuous Random Variables
• Let X be a continuous random variable with probability density function f(x). The cumulative distribution function of X is the function F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt.
• The mean of X is given by μ_X = ∫_{−∞}^{∞} x f(x) dx.
• The variance of X is given by σ_X² = ∫_{−∞}^{∞} (x − μ_X)² f(x) dx = ∫_{−∞}^{∞} x² f(x) dx − μ_X².

Example 10

HW 3.3: 3, 5, 6, 7, 11

Section 3.4: Linear Functions of Random Variables
If X is a random variable, and a and b are constants, then
μ_{aX+b} = a μ_X + b,
σ²_{aX+b} = a² σ_X²,
σ_{aX+b} = |a| σ_X.

More Linear Functions
If X and Y are random variables, and a and b are constants, then
μ_{aX+bY} = μ_{aX} + μ_{bY} = a μ_X + b μ_Y.
More generally, if X1, ..., Xn are random variables and c1, ..., cn are constants, then the mean of the linear combination c1X1 + ... + cnXn is given by
μ_{c1X1 + c2X2 + ... + cnXn} = c1 μ_{X1} + c2 μ_{X2} + ... + cn μ_{Xn}.

Two Independent Random Variables
If X and Y are independent random variables, and S and T are sets of numbers, then
P(X ∈ S and Y ∈ T) = P(X ∈ S) P(Y ∈ T).
More generally, if X1, ..., Xn are independent random variables, and S1, ..., Sn are sets, then
P(X1 ∈ S1, X2 ∈ S2, ..., Xn ∈ Sn) = P(X1 ∈ S1) P(X2 ∈ S2) ⋯ P(Xn ∈ Sn).

Variance Properties
If X1, ..., Xn are independent random variables, then the variance of the sum X1 + X2 + ... + Xn is given by
σ²_{X1 + X2 + ... + Xn} = σ²_{X1} + σ²_{X2} + ... + σ²_{Xn}.
If X1, ..., Xn are independent random variables and c1, ..., cn are constants, then the variance of the linear combination c1X1 + ... + cnXn is given by
σ²_{c1X1 + c2X2 + ... + cnXn} = c1² σ²_{X1} + c2² σ²_{X2} + ... + cn² σ²_{Xn}.

More Variance Properties
If X and Y are independent random variables with variances σ_X² and σ_Y², then the variance of the sum X + Y is
σ²_{X+Y} = σ_X² + σ_Y².
The variance of the difference X − Y is
σ²_{X−Y} = σ_X² + σ_Y².

Example 11
An object with initial temperature T0 is placed in an environment with ambient temperature Ta. According to Newton's law of cooling, the temperature T of the object is given by T = cT0 + (1 − c)Ta, where c is a constant that depends on the physical properties of the object and on the elapsed time. Assume that T0 has mean 25 °C and standard deviation 2 °C, and that Ta has mean 5 °C and standard deviation 1 °C. Find the mean of T when c = 0.25. Assuming that T0 and Ta are independent, find the standard deviation of T at that time.

Independence and Simple Random Samples
Definition: If X1, ..., Xn is a simple random sample, then X1, ..., Xn may be treated as independent random variables, all drawn from the same population.

Properties of X̄
If X1, ..., Xn is a simple random sample from a population with mean μ and variance σ², then the sample mean X̄ is a random variable with
μ_X̄ = μ and σ²_X̄ = σ²/n.
The standard deviation of X̄ is σ_X̄ = σ/√n.

Example 12
The lifetime of a light bulb in a certain application has mean 700 hours and standard deviation 20 hours. The light bulbs are packaged 12 to a box. Assuming that the light bulbs in a box are a simple random sample of light bulbs, find the mean and standard deviation of the average lifetime of the light bulbs in a box.
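To see the linear-combination rules applied to Example 11 (this sketch is not from the slides; the helper name is arbitrary), the mean of T = cT0 + (1 − c)Ta is cμ_{T0} + (1 − c)μ_{Ta}, and, because T0 and Ta are independent, its variance is c²σ²_{T0} + (1 − c)²σ²_{Ta}.

```python
import math

def lin_comb_mean_sd(coeffs, means, sds):
    """Mean and standard deviation of sum(c_i * X_i) for independent X_i."""
    mean = sum(c * m for c, m in zip(coeffs, means))
    var = sum((c * s) ** 2 for c, s in zip(coeffs, sds))
    return mean, math.sqrt(var)

c = 0.25
mean_T, sd_T = lin_comb_mean_sd([c, 1 - c], [25.0, 5.0], [2.0, 1.0])  # T0 and Ta from Example 11
print(mean_T, sd_T)   # 10.0 °C and sqrt(0.25 + 0.5625) ≈ 0.90 °C
```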
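A quick numeric check of the sample-mean rules against Example 12 (again only a sketch, not the slides' own solution): with μ = 700 hours, σ = 20 hours, and n = 12 bulbs per box, μ_X̄ = μ and σ_X̄ = σ/√n.

```python
import math

mu, sigma, n = 700.0, 20.0, 12        # lifetime mean/sd and bulbs per box (Example 12)
mean_xbar = mu                        # mean of the average lifetime in a box
sd_xbar = sigma / math.sqrt(n)        # standard deviation of the average lifetime
print(mean_xbar, sd_xbar)             # 700.0 hours and 20/sqrt(12) ≈ 5.77 hours
```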
HW 3.4: 3, 4, 10, 14; Supp: 4, 6, 8, 12, 13

Standard Deviations of Nonlinear Functions of Random Variables
If X is a random variable whose standard deviation σ_X is small, and if U is a function of X, then
σ_U ≈ |dU/dX| σ_X.
In practice, we evaluate the derivative dU/dX at the observed value of X.

Standard Deviations of Nonlinear Functions of Random Variables (cont.)
If X1, X2, ..., Xn are random variables whose standard deviations σ_{X1}, σ_{X2}, ..., σ_{Xn} are small, and if U is a function of X1, X2, ..., Xn, then
σ_U ≈ √( (∂U/∂X1)² σ²_{X1} + (∂U/∂X2)² σ²_{X2} + ... + (∂U/∂Xn)² σ²_{Xn} ).
In practice, we evaluate the partial derivatives at the observed point (X1, X2, ..., Xn).

Summary
• Probability and rules
• Conditional probability
• Independence
• Random variables: discrete and continuous
• Probability mass functions
• Probability density functions
• Cumulative distribution functions
• Means and variances for random variables
• Linear functions of random variables
• Mean and variance of a sample mean

Problem Workshop 3.4.3
A process that fills plastic bottles with a beverage has a mean fill volume of 2.013 L and a standard deviation of 0.005 L. A case contains 24 bottles. Assuming that the bottles in a case are a simple random sample of bottles filled by this method, find the mean and standard deviation of the average volume per bottle in a case.

Problem Workshop 3.4.10
A gas station earns $2.60 in revenue for each gallon of regular gas it sells, $2.75 for each gallon of midgrade gas, and $2.90 for each gallon of premium gas. Let X1, X2, and X3 denote the numbers of gallons of regular, midgrade, and premium gasoline sold in a day. Assume that X1, X2, and X3 have means μ1 = 1500, μ2 = 500, and μ3 = 300, and standard deviations σ1 = 180, σ2 = 90, and σ3 = 40, respectively.
a) Find the mean daily revenue.
b) Assuming X1, X2, and X3 to be independent, find the standard deviation of the daily revenue.

Problem Workshop 3.4.11
The number of miles traveled per gallon of gasoline for a certain car has a mean of 25 miles and a standard deviation of 2 miles. The tank holds 20 gallons.
a) Find the mean number of miles traveled per tank.
b) Assume the distances traveled on each gallon of gas are independent. Find the standard deviation of the number of miles traveled per tank.
c) The car owner travels X miles on 20 gallons of gas and estimates her gas mileage as X/20. Find the mean of the estimated gas mileage.
d) Assuming the distances traveled on each gallon of gas are independent, find the standard deviation of the estimated gas mileage.

Example
Assume the mass of a rock is measured to be m = 674 ± 1 g, and the volume of the rock is measured to be V = 261 ± 0.1 mL. The density D of the rock is given by D = m/V. Estimate the density and find the standard deviation of the estimate. Is it better to upgrade the instrument that measures mass or the one that measures volume in order to reduce the standard deviation of the density?

Problem Workshop 3.4.14
One way to measure the water content of a soil is to weigh the soil both before and after drying it in an oven. The water content is W = (M1 − M2)/M1, where M1 is the mass before drying and M2 is the mass after drying. Assume that M1 = 1.32 ± 0.01 kg and M2 = 1.04 ± 0.01 kg.
a) Estimate W, and find the standard deviation of the estimate.
b) Which would provide a greater reduction in the standard deviation of W: reducing the standard deviation of M1 to 0.005 kg or reducing the standard deviation of M2 to 0.005 kg?
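For the rock-density example above, the multivariate propagation-of-error formula gives σ_D ≈ √((∂D/∂m)² σ_m² + (∂D/∂V)² σ_V²) with ∂D/∂m = 1/V and ∂D/∂V = −m/V², evaluated at the measured values. The Python sketch below (not part of the slides) carries out the arithmetic and also prints the two variance contributions so the dominant measurement is visible.

```python
import math

def density_sd(m, sd_m, V, sd_V):
    """Propagation-of-error estimate of the sd of D = m/V at the measured point."""
    dD_dm = 1.0 / V            # partial derivative of D with respect to m
    dD_dV = -m / V**2          # partial derivative of D with respect to V
    term_m = (dD_dm * sd_m) ** 2
    term_V = (dD_dV * sd_V) ** 2
    return math.sqrt(term_m + term_V), term_m, term_V

m, sd_m, V, sd_V = 674.0, 1.0, 261.0, 0.1
sd_D, term_m, term_V = density_sd(m, sd_m, V, sd_V)
print(m / V)                   # estimated density ≈ 2.582 g/mL
print(sd_D, term_m, term_V)    # sd ≈ 0.0040; the mass term contributes most of the variance
```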