Business Statistics for Managerial Decision

Probability Theory
The mathematics of probability can provide models to describe the flow of traffic through a highway system, a telephone interchange, or a computer processor; the product preferences of consumers; the spread of epidemics or computer viruses; and the rate of return on risky investments. We are interested in probability because of its usefulness in statistics.

General Probability Rules
Rule 1: \(0 \le P(A) \le 1\) for any event A.
Rule 2: \(P(S) = 1\), where S is the sample space.
Rule 3 (complement rule): for any event A, \(P(A^c) = 1 - P(A)\).
Rule 4 (addition rule): if A and B are disjoint events, then P(A or B) = P(A) + P(B).
The general addition rule: for any events A and B, P(A or B) = P(A) + P(B) - P(A and B).

Independence and the Multiplication Rule
Two events A and B are independent if knowing that one occurs does not change the probability that the other occurs. If A and B are independent, P(A and B) = P(A) × P(B).

Conditional Probability
The following table contains counts (in thousands) of persons aged 16 to 24 who are enrolled in school, classified by gender and employment status.

                      Male   Female    Total
Employed              3927     4313     8240
Unemployed             520      446      966
Not in labor force    4611     4357     8968
Total                 9058     9116    18174

Randomly choose a person aged 16 to 24 who is enrolled in school. What is the probability that the person is employed? Now we are told that the person chosen is female. What is the probability that this person is employed? This is a conditional probability. It gives the probability of one event (the person chosen is employed) under the condition that we know another event has occurred (the person is female).

Definition of conditional probability
When P(A) > 0, the conditional probability of B given A is
\[ P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} \]
Two events A and B are independent if \(P(B \mid A) = P(B)\).

Example: prosperity and education
Call a household prosperous if its income exceeds $100,000.
Call the household educated if the householder completed college. Select an American household at random, and let A be the event that the selected household is prosperous and B the event that it is educated. According to the Current Population Survey, P(A) = 0.134, P(B) = 0.254, and the probability that a household is both prosperous and educated is P(A and B) = 0.080.
1. Draw a Venn diagram that shows the relation between the events A and B. What is the probability P(A or B) that the household selected is either prosperous or educated?
2. In the diagram, shade the event that the household is educated but not prosperous. What is the probability of this event?
3. Find the conditional probability that a household is educated, given that it is prosperous.
4. Are the events A and B independent? How do you know?

The Binomial Distribution
A store sells 10 computers with 1-year warranties. How many will not need repair within 1 year?
A company's human resources manager asks 100 employees if job stress is affecting their personal lives. How many will say "yes"?
In situations like these, we want a probability model for a count of successful outcomes.

The Binomial Setting
1. There is a fixed number n of observations.
2. The n observations are all independent. That is, knowing the result of one observation tells you nothing about the other observations.
3. Each observation falls into one of just two categories, which for convenience we call "success" and "failure".
4. The probability of a success, call it p, is the same for each observation.
Example: tossing a coin n = 15 times.

Binomial Distribution
The distribution of the count X of successes in a binomial setting is the Binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are the whole numbers from 0 to n.

Binomial Probabilities
Suppose we toss a coin 20 times. Let X be the number of heads.
What is the probability that X = 8?

Finding Binomial Probabilities: Formula
If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, 3, …, n. If k is any one of these values,
\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} = \frac{n!}{k!\,(n - k)!}\, p^k (1 - p)^{n - k} \]
We tossed a coin 20 times, and X is the number of heads. What is the probability that X = 8? In this example n = ----- and p = -----. Using the binomial formula,
\[ P(X = 8) = \frac{20!}{8!\,(20 - 8)!}\,(0.5)^8 (1 - 0.5)^{20 - 8} = 0.1201 \]

Finding Binomial Probabilities: Tables
The formula given on the previous slide is practical for hand calculations when n is small. In practice, we use either a statistical package or Table C in your Moore, McCabe, Duckworth, Sclove textbook.

Example: Inspecting switches
Quality engineers inspect an SRS of 10 switches from a large shipment of which 10% fail to conform to specifications. What is the probability that no more than 1 of the 10 switches in the sample fails inspection? The count X of nonconforming switches in the sample has approximately the binomial distribution with n = ----- and p = -----. What is the probability that exactly 4 in the sample of 10 fail to conform to specifications?

Binomial Mean and Standard Deviation
If a count X has the binomial distribution based on n observations with probability p of success, what is the average count of successes in very many repetitions of the binomial setting?
If a count X has a Binomial distribution with number of observations n and probability of success p, the mean and the standard deviation of X are
\[ \mu = np, \qquad \sigma = \sqrt{np(1 - p)} \]

Example: Inspecting switches
The count X of bad switches is Binomial with n = 10 and p = 0.1.
The mean and standard deviation of this Binomial distribution are
\[ \mu = np = (10)(0.1) = 1, \qquad \sigma = \sqrt{np(1 - p)} = \sqrt{10(0.1)(0.9)} = \sqrt{0.9} = 0.9487 \]

The Normal Approximation to Binomial Distribution
The Binomial probability formula and tables are practical only when the number of trials n is small. When n is large, we can use Normal probability calculations to approximate Binomial probabilities that are hard to calculate directly.

Normal Approximation for Binomial Distribution
Suppose that a count X has the Binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal,
\[ N\left(np,\ \sqrt{np(1 - p)}\right) \]
As a rule of thumb, we will use the Normal approximation when n and p satisfy \(np \ge 10\) and \(n(1 - p) \ge 10\).

Example: Is clothes shopping frustrating?
Sample surveys show that fewer people enjoy shopping than in the past. A recent survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that "I like buying new clothes, but shopping is often frustrating and time consuming." The population that the poll wants to draw conclusions about is all U.S. residents aged 18 and over. Suppose that 60% of all adult U.S. residents would say "agree" if asked the same question. What is the probability that 1520 or more of the sample agree?

Figure: histogram of 1000 binomial counts (n = 2500, p = 0.6) and the Normal density curve that approximates this Binomial distribution.
\[ \mu = np = (2500)(0.6) = 1500, \qquad \sigma = \sqrt{np(1 - p)} = \sqrt{2500(0.6)(0.4)} = 24.49, \qquad X \sim N(1500,\ 24.49) \]
The probability that 1520 or more of the sample agree is then
\[ P(X \ge 1520) = P\left(Z \ge \frac{1520 - 1500}{24.49}\right) = P(Z \ge 0.82) = 1 - 0.7939 = 0.2061 \]

The Poisson Distributions
It is common to meet counts that are open ended. A bank counts the number of automatic teller machine (ATM) customers arriving at a particular ATM between 2:00 p.m. and 4:00 p.m. A railyard counts the number of work injuries that happen in a month. What are the possible outcomes for these examples?
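Returning to the clothes-shopping example (n = 2500, p = 0.6), the Normal approximation can be checked against the exact Binomial tail. The following is a minimal sketch using only the Python standard library; the helper names `binom_log_pmf` and `binom_upper_tail` are illustrative and not from the slides. The exact tail is summed in log space because the individual terms underflow in ordinary floating point for n this large. The Normal value printed here differs slightly from the table-based 0.2061 because z is not rounded to 0.82.

```python
import math
from statistics import NormalDist


def binom_log_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), using lgamma to avoid overflow."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def binom_upper_tail(k_min, n, p):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.exp(binom_log_pmf(k, n, p)) for k in range(k_min, n + 1))


n, p, k_min = 2500, 0.6, 1520

# Rule of thumb from the slides: np >= 10 and n(1 - p) >= 10.
rule_ok = n * p >= 10 and n * (1 - p) >= 10

# Mean and standard deviation of the Binomial count.
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Normal approximation: standardize 1520 and take the upper tail.
z = (k_min - mu) / sigma
normal_tail = 1 - NormalDist().cdf(z)

# Exact Binomial tail for comparison.
exact_tail = binom_upper_tail(k_min, n, p)

print(f"rule of thumb satisfied: {rule_ok}")
print(f"mu = {mu:.0f}, sigma = {sigma:.2f}, z = {z:.4f}")
print(f"Normal approximation: {normal_tail:.4f}")
print(f"Exact Binomial tail:  {exact_tail:.4f}")
```

The two tail probabilities should agree to about two decimal places, which is the sense in which the approximation is considered adequate for large n.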
The Poisson distribution is another distribution for count random variables.

The Poisson setting
1. The number of events (call them successes) that occur in any unit of measure is independent of the number of successes that occur in any nonoverlapping unit of measure.
2. The probability that a success will occur in a unit of measure is the same for all units of equal size and is proportional to the size of the unit.
3. The probability that 2 or more successes will occur in a unit approaches 0 as the size of the unit becomes smaller.

Poisson Distribution
The distribution of the count X of successes in the Poisson setting is the Poisson distribution with mean \(\mu\). The parameter \(\mu\) is the mean number of successes per unit of measure. The possible values of X are the whole numbers 0, 1, 2, 3, …. If k is any whole number 0 or greater, then
\[ P(X = k) = \frac{e^{-\mu} \mu^k}{k!} \]
The standard deviation of the distribution is \(\sqrt{\mu}\).

Example: Flaws in carpets
A carpet manufacturer knows that the number of flaws per square yard in a type of carpet material varies with an average of 1.6 flaws per square yard. The count X of flaws per square yard can be modeled by the Poisson distribution with \(\mu = 1.6\). The unit of measure is a square yard of carpet material. What is the probability of no more than 2 defects in a randomly chosen square yard of this material?
With \(\mu = 1.6\) and k = 0, 1, 2 in the Poisson formula,
\[ P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{e^{-1.6}(1.6)^0}{0!} + \frac{e^{-1.6}(1.6)^1}{1!} + \frac{e^{-1.6}(1.6)^2}{2!} = 0.2019 + 0.3230 + 0.2584 = 0.7834 \]
(The total uses unrounded terms.)

The Role of Probability in Statistical Inference
A statistic from a random sample will take different values if we take more samples from the same population. That is, sample statistics are random variables. The values of a statistic in many samples (its sampling distribution) have a regular pattern. We will use the language of probability to examine the sampling distribution of a sample mean \(\bar{X}\).

Example: Does this wine smell bad?
Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine.
DMS causes "off-odors" in wine, so winemakers want to know the odor threshold, the lowest concentration of DMS that the human nose can detect. Different people have different thresholds, so we start by asking about the mean threshold \(\mu\) in the population of all adults. The number \(\mu\) is a parameter that describes this population.
To estimate \(\mu\), we present tasters with both natural wine and the same wine spiked with DMS at different concentrations to find the lowest concentration at which they can identify the spiked wine. Here are the odor thresholds (measured in micrograms of DMS per liter of wine) for 10 randomly selected subjects:
28 40 28 33 20 31 29 27 17 21
The mean threshold for these subjects is \(\bar{X} = 27.4\). This sample mean is a statistic that we use to estimate the parameter \(\mu\). It is probably not exactly equal to \(\mu\). A different 10 subjects would give us a different \(\bar{X}\).

Statistical Estimation and the Law of Large Numbers
A parameter, such as the mean threshold \(\mu\) of all adults, is in practice a fixed but unknown number. A statistic, such as the mean threshold \(\bar{X}\) of a random sample of 10 adults, is a random variable. We use \(\bar{X}\) to estimate \(\mu\). An SRS should fairly represent the population, so the mean \(\bar{X}\) of the sample should be somewhere near the mean \(\mu\) of the population (that is, \(\bar{X}\) is an unbiased estimate of \(\mu\)).
If \(\bar{X}\) is rarely exactly right and varies from sample to sample, why is it nonetheless a reasonable estimate of the population mean \(\mu\)? The answer: if we keep on taking larger and larger samples, the statistic \(\bar{X}\) is guaranteed to get closer and closer to the parameter \(\mu\). That is, if we can afford to keep on measuring more subjects, eventually we will estimate the mean odor threshold of all adults very accurately. This fact is known as the law of large numbers.

The Law of Large Numbers
Draw independent observations at random from any population with finite mean \(\mu\).
As the number of observations drawn increases, the mean \(\bar{X}\) of the observed values gets closer and closer to the mean \(\mu\) of the population.
In fact, the distribution of odor thresholds among all adults has mean \(\mu = 25\). As we take more observations, the sample mean \(\bar{X}\) eventually approaches the mean \(\mu\) of the population.

Sampling Distributions
The law of large numbers assures us that if we measure enough subjects, the statistic \(\bar{X}\) will eventually get very close to the unknown parameter \(\mu\). In our example we had a sample of 10 subjects. What can we say about \(\bar{X}\) from 10 subjects as an estimate of \(\mu\)? That is, what would happen if we took many samples of 10 subjects from this population?
To answer this question:
1. Take a large number of samples of size 10 from the same population.
2. Calculate the sample mean \(\bar{X}\) for each sample.
3. Make a histogram of the values of \(\bar{X}\). This histogram shows how \(\bar{X}\) varies in many samples.
The histogram of values of the statistic approximates the sampling distribution that we would see if we kept on sampling forever. One reason for studying probability is that the laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples.

The Mean and Standard Deviation of \(\bar{X}\)
Suppose that \(\bar{X}\) is the mean of an SRS of size n drawn from a large population with mean \(\mu\) and standard deviation \(\sigma\). Then the mean of the sampling distribution of \(\bar{X}\) is \(\mu\) and its standard deviation is \(\sigma / \sqrt{n}\).

Sampling Distribution of a Sample Mean
If a population has the \(N(\mu, \sigma)\) distribution, then the sample mean \(\bar{X}\) of n independent observations has the \(N(\mu, \sigma / \sqrt{n})\) distribution.

Example: Estimating Odor Threshold
Adults differ in the smallest amount of DMS they can detect in wine. Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean \(\mu = 25\) µg/l and standard deviation \(\sigma = 7\) µg/l.
Because the population distribution is Normal, the sampling distribution of \(\bar{X}\) is also Normal. Both distributions have the same mean, but means \(\bar{X}\) from samples of 10 adults vary less than do measurements on individual adults. The standard deviation of \(\bar{X}\) is
\[ \frac{\sigma}{\sqrt{n}} = \frac{7}{\sqrt{10}} = 2.21 \text{ µg/l} \]
Figure: the distribution of single observations compared with the distribution of the mean \(\bar{X}\) of 10 observations. Averages are less variable than individual observations.

Central Limit Theorem
What happens when the population distribution is not Normal? As the sample size increases, the distribution of \(\bar{X}\) changes shape: it looks less like the population distribution and more like a Normal distribution. When the sample is large enough, the distribution of \(\bar{X}\) is very close to Normal. This important fact of probability is called the central limit theorem.

The Central Limit Theorem in Action
Figure: the distribution of means \(\bar{X}\) from a strongly non-Normal population becomes more Normal as the sample size increases. (a) The distribution of 1 observation. (b) The distribution of \(\bar{X}\) for 2 observations. (c) The distribution of \(\bar{X}\) for 10 observations. (d) The distribution of \(\bar{X}\) for 25 observations.

Central Limit Theorem
Draw an SRS of size n from any population with mean \(\mu\) and finite standard deviation \(\sigma\). When n is large, the sampling distribution of the sample mean \(\bar{X}\) is approximately Normal:
\[ \bar{X} \text{ is approximately } N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right) \]

Example: Flaws in carpets
The number of flaws per square yard in a type of carpet material varies with mean 1.6 flaws per square yard and standard deviation 1.2 flaws per square yard. The population distribution cannot be Normal, because a count takes only whole-number values. An inspector samples 200 square yards of material, records the number of flaws found in each square yard, and calculates \(\bar{X}\), the mean number of flaws per square yard inspected. Use the central limit theorem to find the approximate probability that the mean number of flaws exceeds 2 per square yard.
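One way to work the carpet question is sketched below, assuming n = 200 is large enough for the central limit theorem to apply. The variable names are illustrative, and the upper tail is computed with `math.erfc` instead of a Normal table because the probability is far too small to read from a printed table.

```python
import math

mu = 1.6         # population mean, flaws per square yard
sigma = 1.2      # population standard deviation
n = 200          # number of square yards inspected
threshold = 2.0  # we want P(sample mean > 2)

# Standard deviation of the sample mean: sigma / sqrt(n).
se = sigma / math.sqrt(n)

# Standardize the threshold under the approximate N(mu, se) distribution.
z = (threshold - mu) / se

# Upper-tail Normal probability P(Z > z), via the complementary error function.
p_exceeds = 0.5 * math.erfc(z / math.sqrt(2))

print(f"standard deviation of the sample mean: {se:.4f}")
print(f"z = {z:.3f}")
print(f"approximate P(mean > 2) = {p_exceeds:.2e}")
```

Because the threshold sits more than four standard deviations above the mean, the approximate probability is on the order of one in a million: a mean of more than 2 flaws per square yard over 200 square yards would be very strong evidence that the process has changed.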