Sampling Distributions of Proportions

Parameter
• A number that describes the population
• Symbols we will use for parameters include
  μ – mean
  σ – standard deviation
  π – proportion (p)
  α – y-intercept of LSRL
  β – slope of LSRL

Statistic
• A number that can be computed from sample data without making use of any unknown parameter
• Symbols we will use for statistics include
  x̄ – mean
  s – standard deviation
  p̂ – proportion
  a – y-intercept of LSRL
  b – slope of LSRL

A distribution is all the values that a variable can be.

• Toss a penny 20 times and record the number of heads.
• Calculate the proportion of heads & mark it on the dot plot on the board.

What shape do you think the dot plot will have?

The dotplot is a partial graph of the sampling distribution of all sample proportions of sample size 20. If I found all possible sample proportions, this plot would be approximately normal!

Sampling Distribution
• Is the distribution of possible values of a statistic from all possible samples of the same size from the same population
• In the case of the pennies, it's the distribution of all possible sample proportions (p-hat)

$\hat{p} = \frac{x}{n}$

Where x is the number in the sample & n is the sample size.

We will use: p for the population proportion and p-hat for the sample proportion.

Suppose we have a population of six people: Alice, Ben, Charles, Denise, Edward, & Frank

We are interested in the proportion of females. This is called the parameter of interest.

What is the proportion of females? 1/3

Draw samples of two from this population. How many different samples are possible? 6C2 = 15

Find the 15 different samples that are possible & find the sample proportion of females in each sample.
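The enumeration asked for above can be sketched in Python. This is only an illustration, not part of the original slides: it assumes Alice and Denise are the two females (consistent with the stated proportion of 1/3), and the helper names `sample_proportions` and `mean_and_sd` are made up for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

# Population from the example. Assumption: Alice and Denise are the
# two females, which matches the stated proportion of 1/3.
population = ["Alice", "Ben", "Charles", "Denise", "Edward", "Frank"]
females = {"Alice", "Denise"}


def sample_proportions(n):
    """Return p-hat (as an exact Fraction) for every possible sample of size n."""
    return [
        Fraction(sum(person in females for person in sample), n)
        for sample in combinations(population, n)
    ]


def mean_and_sd(values):
    """Mean and population standard deviation of a list of values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, sqrt(variance)


p_hats = sample_proportions(2)
mean, sd = mean_and_sd(p_hats)
print("number of samples:", len(p_hats))
print("mean of p-hats:", mean)
print("standard deviation of p-hats:", round(sd, 5))
```

Calling `sample_proportions(3)` repeats the exercise for samples of three, so the two standard deviations can be compared.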
Sample               p-hat
Alice & Ben          .5
Alice & Charles      .5
Alice & Denise       1
Alice & Edward       .5
Alice & Frank        .5
Ben & Charles        0
Ben & Denise         .5
Ben & Edward         0
Ben & Frank          0
Charles & Denise     .5
Charles & Edward     0
Charles & Frank      0
Denise & Edward      .5
Denise & Frank       .5
Edward & Frank       0

Find the mean & standard deviation of all p-hats.

$\mu_{\hat{p}} = \frac{1}{3}$ & $\sigma_{\hat{p}} = 0.29814$

How does the mean of the sampling distribution ($\mu_{\hat{p}}$) compare to the population parameter (p)? $\mu_{\hat{p}} = p$

Suppose we have a population of six people: Alice, Ben, Charles, Denise, Edward, & Frank

Draw samples of three from this population. How many different samples are possible? 6C3 = 20

Find the mean & standard deviation of all p-hats.

$\mu_{\hat{p}} = \frac{1}{3}$ & $\sigma_{\hat{p}} = 0.2108$

What do you notice about the means & standard deviations?

Formulas:

$\mu_{\hat{p}} = p$

$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

These are found on the formula chart!

Does the standard deviation of the sampling distribution equal the equation?

NO - $\sigma_{\hat{p}} = \sqrt{\frac{\frac{1}{3}\left(\frac{2}{3}\right)}{2}} = \frac{1}{3} \neq 0.29814$

WHY? We are sampling more than 10% of our population! If we use the correction factor, we will see that we are correct.

Correction factor – multiply by $\sqrt{\frac{N-n}{N-1}}$

$\sigma_{\hat{p}} = \sqrt{\frac{\frac{1}{3}\left(\frac{2}{3}\right)}{2}} \cdot \sqrt{\frac{6-2}{6-1}} = 0.29814$

So, in order to calculate the standard deviation of the sampling distribution with the formula alone, we MUST be sure that our sample size is less than 10% of the population!

Assumptions (Rules of Thumb)
• Sample size must be less than 10% of the population (independence) (Population > 10n)
• Sample size must be large enough to ensure a normal approximation can be used. np > 10 & n(1 – p) > 10

Why does the second assumption ensure an approximate normal distribution?

Remember back to binomial distributions. Suppose n = 10 & p = 0.1 (probability of a success). A histogram of this distribution is strongly skewed right! What would happen if the fixed number was 100?

Suppose a binomial distribution has n = 100 and p = 0.1. What is the mean and standard deviation of this distribution?

μ = 10 & σ = 3

Graph a histogram of this binomial distribution. What shape do you expect this to be? Since p = .1, we would expect this distribution to be skewed right.

Notice that these bars are extremely small and extend out to 100, so this distribution is skewed right. However, when n is large enough, the tail will spread into an approximate normal curve.

Why do we need to also check n(1 – p)? Consider what the histogram looks like when n = 10 and p = .9. We must also check that the upper tail will spread out into an approximate normal curve.

Chip Activity:
• Select three samples of size 5, 10, and 15 and record the number of blue chips.
• Place your proportions on the appropriate dotplots.

What do you notice about these distributions?

Some proportion distributions where p = 0.2

Let p-hat be the proportion of successes in a random sample of size n from a population whose proportion of S's (successes) is p.

[Figure: sampling distributions of p-hat for n = 10, n = 20, n = 50, and n = 100, each centered at 0.2]

Based on past experience, a bank believes that 7% of the people who receive loans will not make payments on time. The bank recently approved 200 loans.

What are the mean and standard deviation of the proportion of clients in this group who may not make payments on time?

$\mu_{\hat{p}} = .07$

$\sigma_{\hat{p}} = \sqrt{\frac{.07(.93)}{200}} = .01804$

Are assumptions met? Yes: np = 200(.07) = 14 and n(1 - p) = 200(.93) = 186

What is the probability that over 10% of these clients will not make payments on time?

Ncdf(.10, 10^99, .07, .01804) = .0482

Suppose one student tossed a coin 200 times and found only 42% heads. Do you believe that this is likely to happen?

Find the probability that a coin would land heads less than 42% of the time.

np = 200(.5) = 100 & n(1-p) = 200(.5) = 100. Since both > 10, I can use a normal curve!

Find μ & σ using the formulas.

$\text{ncdf}\left(-10^{99},\ .42,\ .5,\ \sqrt{\frac{.5(.5)}{200}}\right) = .0118$
No – since there is approximately a 1% chance of this happening, I do not believe the student did this.

Assume that 30% of the students at SHS wear contacts. In a sample of 100 students, what is the probability that more than 35% of them wear contacts?

Check assumptions! np = 100(.3) = 30 & n(1-p) = 100(.7) = 70

$\mu_{\hat{p}} = .3$ & $\sigma_{\hat{p}} = .045826$

Ncdf(.35, 10^99, .3, .045826) = .1376

Example

If the true proportion of defectives produced by a certain manufacturing process is 0.08 and a sample of 400 is chosen, what is the probability that the proportion of defectives in the sample is greater than 0.10?

Since np = 400(0.08) = 32 > 10 and n(1-p) = 400(0.92) = 368 > 10, it's reasonable to use the normal approximation.

Example (continued)

$\mu_{\hat{p}} = p = 0.08$

$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.08(1-0.08)}{400}} = 0.013565$

$z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.10 - 0.08}{0.013565} = 1.47$

$P(\hat{p} > 0.1) \approx P(z > 1.47) = 1 - 0.9292 = 0.0708$

Example

Suppose 3% of the people contacted by phone are receptive to a certain sales pitch and buy your product. If your sales staff contacts 2000 people, what is the probability that more than 100 of the people contacted will purchase your product?

Clearly p = 0.03 and $\hat{p}$ = 100/2000 = 0.05, so

$P(\hat{p} > 0.05) \approx P\left(z > \frac{0.05 - 0.03}{\sqrt{\frac{(0.03)(0.97)}{2000}}}\right) = P\left(z > \frac{0.05 - 0.03}{0.0038145}\right) = P(z > 5.24) \approx 0$

Example - continued

If your sales staff contacts 2000 people, what is the probability that less than 50 of the people contacted will purchase your product?

Now p = 0.03 and $\hat{p}$ = 50/2000 = 0.025, so

$P(\hat{p} < 0.025) \approx P\left(z < \frac{0.025 - 0.03}{\sqrt{\frac{(0.03)(0.97)}{2000}}}\right) = P\left(z < \frac{0.025 - 0.03}{0.0038145}\right) = P(z < -1.31) = 0.0951$
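The worked examples above all follow the same steps: check np > 10 and n(1 – p) > 10, compute the mean and standard deviation of p-hat, then read a tail area from a normal curve. A minimal Python sketch of those steps follows. It assumes the standard library's `statistics.NormalDist` as a stand-in for the calculator's Ncdf, and the function name `phat_tail_probability` is hypothetical. The 10% condition is not checked because the population sizes are not given.

```python
from math import sqrt
from statistics import NormalDist


def phat_tail_probability(p, n, cutoff, upper=True):
    """Normal-approximation probability that p-hat falls above (upper=True)
    or below (upper=False) the cutoff.

    Hypothetical helper for illustration. Raises ValueError when the
    rule of thumb np > 10 and n(1 - p) > 10 is not satisfied.
    """
    if n * p <= 10 or n * (1 - p) <= 10:
        raise ValueError("np and n(1 - p) must both exceed 10")
    sd = sqrt(p * (1 - p) / n)  # standard deviation of p-hat
    below = NormalDist(mu=p, sigma=sd).cdf(cutoff)
    return 1 - below if upper else below


# Contacts example: p = 0.30, n = 100, P(p-hat > 0.35)
print(round(phat_tail_probability(0.30, 100, 0.35), 4))

# Defectives example: p = 0.08, n = 400, P(p-hat > 0.10)
print(round(phat_tail_probability(0.08, 400, 0.10), 4))

# Sales example: p = 0.03, n = 2000, more than 100 and less than 50 buyers
print(phat_tail_probability(0.03, 2000, 100 / 2000))
print(round(phat_tail_probability(0.03, 2000, 50 / 2000, upper=False), 4))
```

Because the sketch does not round z to two decimal places before looking up the area, its results can differ from the table-based answers above in the third or fourth decimal place.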