CONTINUOUS DISTRIBUTIONS

The random variable can assume literally any value in some range.
• If x can be any value between, say, 0 and 12, the probability of any particular value is zero.
• But a value between 2 and 6 is more likely than a value between 1 and 1.25.
  - In the childhood game of "spin-the-bottle," the probability of being kissed depends on the size of the person.
  - Intuitively, Beauregard is more likely to be kissed than Slim Albert.

[Figure: spin-the-bottle dial marked 6 and 12, with players Albert and Beauregard]

Define a probability density function for each different kind of distribution.
• Probabilities will be represented by areas under the graph of the probability density function.

The Uniform Distribution: the spin-the-bottle distribution
• the simplest possible continuous distribution
• graph: the vertical axis does NOT represent probability but rather probability density

normal probabilities, page 1

• the density measure along the vertical axis is chosen so that the area under the graph = 1: here, (1/12) × 12 = 1
• the region marked P(1 ≤ x ≤ 1.5) has area (1/12) × 0.5 = 1/24, while that marked P(6 ≤ x ≤ 9) has area (1/12) × 3 = 1/4, and so on
• the probability of x values occurring in a particular range is given by the area under the curve in that range; that area is also the proportion of the total area in that range

THE NORMAL DISTRIBUTION

Reminder: the purpose of these various distributions is to represent mathematically some real-world processes. The normal distribution can represent (or approximate) a startling variety of such processes.

General characteristics
• continuous
• symmetric about its mean
• values of x close to the mean are more likely than those further out

Formula:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where π = 3.14159… and e = 2.718281828…

Graph: a "bell curve"
• The height of the curve, as such, means little: probabilities are represented as areas under the curve.
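As a rough numerical check on the idea that probabilities are areas under a density curve, the sketch below codes the normal density formula and approximates areas with the trapezoid rule. The function names (`normal_pdf`, `uniform_pdf`, `area_under`), the step count, the ±8σ integration range, and the values µ = 200, σ = 30 are illustrative assumptions rather than anything prescribed by these notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x) for mean mu and standard deviation sigma."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def uniform_pdf(x):
    """Spin-the-bottle density: constant height 1/12 on the range 0 to 12."""
    return 1.0 / 12.0

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b (trapezoid rule)."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Uniform case: P(6 <= x <= 9) is the rectangle (1/12) * 3
p_uniform = area_under(uniform_pdf, 6, 9)

# Normal case (assumed values): the total area under the bell curve
mu, sigma = 200.0, 30.0
total_area = area_under(lambda x: normal_pdf(x, mu, sigma),
                        mu - 8 * sigma, mu + 8 * sigma)

print(p_uniform, total_area)
```

The height `normal_pdf(x, mu, sigma)` at any single x is not a probability; only the areas returned by `area_under` are.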
• On a distribution with µ = 200 and σ = 30, the probability of values between 225 and 250 is given by the area under the curve between those two values (the shaded area in the graph).
• The only things in the normal formula subject to change are µ and σ.
  - π and e are natural constants
  - changing µ and σ will generate a whole family of distributions
  - changing µ shifts the location of the distribution
  - increasing σ increases the dispersion in the distribution

Examples of normal (or "Gaussian") processes
• measures of humanity
  - weight
  - height
  - IQ scores
• expectations of inflation held by individuals
• length, weight, diameter, etc. of manufactured parts
  - weight of contents of cereal boxes, soap boxes, oil cans
  - diameters of ball bearings, tires, etc.
• deviations of prices from expected values
• changes in prices of a corporation's stock shares
• the deflection of an electron struck by gamma radiation
• sample means of repeated samples drawn from the same population

Comment: Laplace could arrive at such a complex formula because he knew what he was looking for.

FINDING NORMAL PROBABILITIES

There are infinitely many normal distributions, one for each µ, σ pair.
• Integration techniques? In practice we rely instead on a tremendously useful fact about the normal distribution: all that really matters in calculating probabilities on a normal distribution is the number of standard deviations from the mean.
• For any two normal distributions, no matter how different their µ and σ, the same proportion of each distribution lies in the range µ to µ + r × σ, where r is any real number.

Example: weights of adult human males are normally distributed with µ = 165 lb. and σ = 15 lb.; a particular size of ball bearing has normally distributed diameters with µ = 10 mm and σ = 0.01 mm. Then the same proportion of males have weights between 165 and 180 lb. (34.13% as it happens) as ball bearings have diameters between 10 and 10.01 mm; likewise, the same proportion of males have weights between 135 and 150 lb. as ball bearings have diameters between 9.98 and 9.99 mm.
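The claim in this example can be checked numerically. The following is a minimal sketch, assuming the standard identity P(Z ≤ z) = ½ [1 + erf(z/√2)] for the normal cumulative probability; the helper names `normal_cdf` and `prob_between` are hypothetical and are not part of these notes.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, computed with the error function."""
    z = (x - mu) / sigma  # number of standard deviations from the mean
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(lo, hi, mu, sigma):
    """P(lo <= X <= hi) as a difference of two cumulative probabilities."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

# Mean to one standard deviation above the mean
males_upper = prob_between(165, 180, mu=165, sigma=15)
bearings_upper = prob_between(10.00, 10.01, mu=10, sigma=0.01)

# Two to one standard deviations below the mean
males_lower = prob_between(135, 150, mu=165, sigma=15)
bearings_lower = prob_between(9.98, 9.99, mu=10, sigma=0.01)

print(round(males_upper, 4), round(bearings_upper, 4))
print(round(males_lower, 4), round(bearings_lower, 4))
```

Each pair of printed values should agree, since each pair covers the same range when measured in standard deviations.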
This fact allows us to lay out a table for only one distribution.
• The distribution actually used to derive probabilities has µ = 0 and σ = 1, the "standard normal distribution."
• This table may be arranged in several ways:

Appendix E Table E.2a: Standardized Normal Distribution Table
• gives areas between the mean and the number of standard deviations given in the z column (extreme left hand)
• the row across the top allows the extension to hundredths
  - P(0 ≤ z ≤ 1.5) = 0.4332
  - P(0 ≤ z ≤ 1.54) = 0.4382

Table E.2b: Cumulative Standardized Normal Distribution
• gives cumulative less-than-or-equal probabilities
• in the sketch, the crosshatched area is P(z ≤ 1.5)
• Examples:
  - P(z ≤ −2) = 0.0228
  - P(z ≤ 1.5) = 0.9332
    Notice that this is exactly 0.5 more than the probability found above; that is because P(z ≤ 0) = P(z ≥ 0) = 0.5.
  - P(z ≥ x) = 1 − P(z ≤ x)
  - P(z ≥ 1.5) = 1 − P(z ≤ 1.5) = 1 − 0.9332 = 0.0668
  - P(z0 ≤ z ≤ z1) = P(z ≤ z1) − P(z ≤ z0)
    P(−1.58 ≤ z ≤ 2.1) = P(z ≤ 2.1) − P(z ≤ −1.58) = 0.9821 − 0.0571 = 0.9250

The reverse of this process is sometimes useful; that is, find z0 such that P(z ≤ z0) has some given value.
Examples:
• Find z0 such that P(z ≤ z0) = 0.4.
  Find 0.4 in the body of the table (well, 0.4013, the nearest entry) and read backwards to find z0 = −0.25.
• Find z* such that 80% of all z's are less than z*.
  Find 0.8 in the body of the table (0.7995) and read backwards to find z* = 0.84.
  Note that 0.84 is the 80th percentile of the z distribution.
• Find zH such that there is only a 4% probability that z > zH.
  P(z ≤ zH) = 1 − 0.04 = 0.96, so zH = 1.75.
• Find zL and zH such that the middle 90% of the distribution is between these values.
  By symmetry zL = −zH; since 90% is between them, there must be 5% in each tail, so that the required values imply P(z ≤ zL) = 0.05 and P(z ≤ zH) = 0.95.
  Accordingly, zL = −1.64 and zH = +1.64.

MORE GENERAL NORMAL DISTRIBUTIONS

z values are numbers of standard deviations on a particular normal distribution; the probabilities will be the same for the same number of standard deviations on any normal distribution. To find P(x ≤ x0) we need only find out how many standard deviations x0 is from the mean of its distribution and look up the result in the z table. We've seen this before: z = (x − µ)/σ expresses the difference between x and µ in terms of σ.

FUNDAMENTAL EQUATIONS FOR USE WITH THE NORMAL DISTRIBUTION

To convert values of any normal distribution into values on the standard normal:

$$z = \frac{x - \mu}{\sigma}$$

This expresses deviations from the mean in numbers of standard deviations.
Examples:
• Adult males have mean weight µ = 165 lb., with σ = 15 lb. Charley weighs 172.5 lb. His z score is
  z = (x − µ)/σ = (172.5 − 165)/15 = 0.5
  Put another way, Charley's weight is greater than the mean by 0.5 standard deviation.
• Sam weighs 135 lb. His z score is
  z = (135 − 165)/15 = −2
  Sam's weight is 2 standard deviations below the mean.
• z values may be positive or negative, depending on whether a given x value is greater or less than the mean.

To find an x value which is a given number of standard deviations from the mean:

$$x = \mu + z \times \sigma$$

This is used when we want to find an x range that has a given probability.
Examples:
• Adult females have a mean weight of 132 lb. with a standard deviation of 14 lb. Find the weights that are one standard deviation above and below the mean weight.
  xL = µ + z × σ = 132 + (−1) × 14 = 118
  xH = µ + z × σ = 132 + 1 × 14 = 146
  Comment: by principles we've seen before, 68.26% of all adult females have weights between 118 and 146 lb.

FINDING PROBABILITIES ON NORMAL DISTRIBUTIONS

Procedure
• convert x values on a given distribution to z scores
• use the z table to find the appropriate probabilities

Examples:
• The random variable x is normally distributed with µ = 320 and σ = 12.
  - Find P(x ≤ 340).
    z = (x − µ)/σ = (340 − 320)/12 = 20/12 = 1.67
    P(z ≤ 1.67) = 0.9525
  - Find P(x ≥ 330).
    z = (330 − 320)/12 = 0.83
    P(z ≥ 0.83) = 1 − P(z ≤ 0.83) = 1 − 0.7967 = 0.2033
• With µ = 132 and σ = 14, find the proportion of females whose weight exceeds 154 lb.
  - z = (x − µ)/σ = (154 − 132)/14 = 22/14 = 1.57
  - P(z ≥ 1.57) = 1 − P(z < 1.57) = 1 − 0.9418 = 0.0582
  - A woman is selected at random. What is the probability that she weighs less than 123 lb.?
    z = (123 − 132)/14 = −9/14 = −0.64
    P(z < −0.64) = 0.2611
• x is normally distributed with µ = 0.6 and σ = 0.03. What is the probability that a randomly selected x value will be between 0.53 and 0.62?
  zL = (0.53 − 0.6)/0.03 = −2.33
  zH = (0.62 − 0.6)/0.03 = 0.67
  P(−2.33 ≤ z ≤ 0.67) = 0.7486 − 0.0099 = 0.7387
• With mean 165 and standard deviation 15, what proportion of adult males have weights between 180 and 190 lb.?
  zH = (190 − 165)/15 = 1.67, zL = (180 − 165)/15 = 1
  P(1 ≤ z ≤ 1.67) = 0.9525 − 0.8413 = 0.1112

FINDING x VALUES FOR GIVEN PROBABILITIES

Procedure:
• find the z value corresponding to the given probability; z will be positive or negative as the probability is > or < 0.5
• use the second fundamental equation to find the x value, or substitute appropriate values into the first equation and solve for x

Examples:
• x is normally distributed with µ = 23 and σ = 2.
  - Find x0 such that 60% of the distribution is less than x0.
    In the body of the z table find 0.6 (0.5987). Read backwards to z = 0.25.
    x = µ + z × σ = 23 + 0.25 × 2 = 23 + 0.50 = 23.50
  - Find x0 such that 32% are less than x0.
    Find the 0.32 probability and read backwards to z = −0.47.
    x = µ + z × σ = 23 + (−0.47) × 2 = 23 − 0.94 = 22.06
• W is normally distributed with µ = 4000 and σ = 162.
  Find a symmetric interval that contains 90% of the W values.
  - Find a WH such that 95% of the W values are less than WH and a WL such that 5% of the W's are less than WL.
  - In the z table, find probability 0.05 (0.0505); then zL = −1.64.
  - In the z table, find probability 0.95 (0.9495); then zH = +1.64.
  - Since the z distribution is symmetric, one or the other of these efforts is redundant.
    WH = 4000 + 1.64 × 162 = 4265.68
    WL = 4000 − 1.64 × 162 = 3734.32

ADDITIONAL EXAMPLES:
• Fruit Loops boxes are filled to a mean weight of 10 oz. with a standard deviation of 0.2 oz.
  - What proportion of these boxes have weights less than 9.8 oz.? (0.1587)
  - Fruit Loopers want to guarantee that their boxes contain at least some minimum weight; it would be prohibitively expensive to assure that every single box does contain that weight, but they wish to have no more than 2% of boxes that have less than the guaranteed weight. With machinery operating as above, what weight can Fruit Loopers safely guarantee?
    Put another way, find W0 such that only 2% of all boxes contain less than W0.
    Answer: 9.588 (using z = −2.06: 10 − 2.06 × 0.2 = 9.588).
• Courageous Couriers Company has recorded data on delivery times of packages within the city. For all times recorded, µ = 25 min. and σ = 4 min.
  - Find the probability that a delivery time will be between 21 and 29 minutes. (68.26%)
  - What proportion of delivery times exceed 31 min.? (0.0668)
  - What time can CCC guarantee if they want to have to pay off on non-delivery no more than 1% of the time? (34.32 min.)
• Granules of industrial diamond have a mean size of 1/10 carat with σ = 0.01 carat. Find C0 such that 95% of all granules are at least as big as C0.
  C0 = µ + z_0.05 × σ = 0.1 + (−1.64) × 0.01 = 0.0836
• Amalgamated Rat Trap has mean daily orders of 1920 cases of rat traps with σ = 300. ART would like to guarantee same-day shipment of orders.
  - How much inventory must they hold to have no more than a 6% probability of being unable to ship the day an order is received?
    (1920 + 1.56 × 300 = 2388)
  - How much inventory must ART hold to have no more than a 40% probability of being unable to honor their guarantee?
    (1920 + 0.26 × 300 = 1998)
  - Is the larger inventory worthwhile? If the cost of one case of rat traps is $100, holding an additional 390 cases means an additional $39,000 worth of inventory. If the relevant interest rate is 10%, ART must incur $3,900 a year of additional inventory cost to achieve the lower probability.
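
The two procedures used throughout these notes (convert x to z and look up a probability; look up z for a given probability and convert back to x) can be sketched in code. This is only an illustration, assuming Python 3.8 or later, whose standard library provides `statistics.NormalDist` in place of the printed z table; the helper names `z_score` and `x_from_z` are hypothetical. Because the table rounds z to two decimals, the computed results differ slightly from the table-based answers above.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution: mu = 0, sigma = 1

def z_score(x, mu, sigma):
    """First fundamental equation: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def x_from_z(z, mu, sigma):
    """Second fundamental equation: x = mu + z * sigma."""
    return mu + z * sigma

# Forward direction: P(x <= 340) when mu = 320 and sigma = 12
p_le_340 = Z.cdf(z_score(340, 320, 12))

# Upper tail: proportion of females (mu = 132, sigma = 14) above 154 lb.
p_gt_154 = 1.0 - Z.cdf(z_score(154, 132, 14))

# Reverse direction: inventory level that orders (mu = 1920, sigma = 300)
# exceed with only a 6% probability, i.e. the 94th percentile
z_94 = Z.inv_cdf(0.94)
inventory_6pct = x_from_z(z_94, 1920, 300)

print(round(p_le_340, 4), round(p_gt_154, 4), round(inventory_6pct, 1))
```

`NormalDist(mu, sigma)` could also be constructed directly for each problem; the sketch standardizes first only to mirror the table-based procedure in these notes.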