Chapter 9.1: Sampling Distributions
Mr. Lynch
AP Statistics

The Heights of Women
The heights of women in the world follow: N(64.5, 2.5) … Explain …
Let’s draw a sketch that helps illustrate this.
MATH … PRB … 6:randNorm(64.5,2.5)
Stand up if your value is between [62, 67]
Stand up if your value is between [59.5, 69.5]
Stand up if your value is between [57, 72]

The Heights of Women
MATH … PRB … 6:randNorm(64.5,2.5,100) STO L1
1-Var Stats: Mean? Median? S?
STAT PLOT 1: Histogram … L1, 1
WINDOW: X:[57, 72, 2.5] … Y:[-10, 60, 10]
STAT PLOT 2: Boxplot … L1, 1
TRACE Histogram … Enter frequencies in chart
Repeat three times … fill out frequency chart as shown

The Heights of Women

Interval    Set #1  Set #2  Set #3  Total      %
57-59.5        3       2       3       8     2.7
59.5-62        8      15      11      34    11.3
62-64.5       39      33      33     105    35.0
64.5-67       37      38      32     107    35.7
67-69.5       11      10      18      39    13.0
69.5-72        1       2       2       5     1.7

% within 62-67 (1 SD of the mean): 70.7
% within 59.5-69.5 (2 SD of the mean): 95.0
% within 57-72 (3 SD of the mean): 99.4

Pooled Data: Period 03, January 2008

Interval    Row 1  Row 2  Row 3  Lynch  Row 4  Row 5  Row 6  Total      %
57-59.5       11     27     37     23     40     38     20    196    2.7%
59.5-62       40    158    208    120    180    155    126    987   13.7%
62-64.5       91    423    529    306    392    401    298   2440   33.9%
64.5-67      102    418    503    318    409    398    323   2471   34.3%
67-69.5       43    148    184    106    147    158    111    897   12.5%
69.5-72       13     25     39     26     32     50     19    204    2.8%

The Heights of Women
How did the “Empirical Rule” work out for you?
What do the Shape, Center, and Spread look like?
Let’s look at the n = 7500 histogram! How are we doing now?
Conclusion: This distribution is just a miniature version of the population distribution, with the same mean and standard deviation.

The Heights of Women
Now, take 4 samples again … one at a time.
Use 1-Var Stats to get the mean x̄. Write that value on one of your post-it notes.
Repeat this 3 more times.
Place the notes upon the board CAREFULLY in the correct slots to build a histogram!
Let’s record the values in L2.

The Heights of Women
How did the “Empirical Rule” work out here?
Compare a Boxplot for L2 in PLOT 3 to the one we did in PLOT 2 for the population.
What do the Shape, Center, and Spread look like for THIS NEW distribution?
Let’s look at the histogram of the new SAMPLING DISTRIBUTION of sample means for n = 100!
Conclusion: What is the relationship between the mean of the population and the mean of the x̄ values? What about the standard deviation of the population and that of the x̄ values?

Terminology
Population Parameter
- Numerical value that describes a population
- A “mysterious” and essentially unknowable, idealized value
- A theoretically fixed value
- Ex: Population Mean, Population Standard Deviation, Population Proportion, Population Size: μ, σ, p, N

Terminology
Sample Statistic
- Numerical value that describes a sample (a subset of a larger population)
- An easily attainable and knowable value
- Will vary from sample to sample
- Used to estimate an unknown population parameter
- Ex: Sample Mean, Sample Standard Deviation, Sample Proportion, Sample Size: x̄, s, p̂, n

Example and Exercises
EXAMPLE 9.1: MAKING MONEY
EXAMPLE 9.2: DO YOU BELIEVE IN GHOSTS?
EXERCISE 9.2: UNEMPLOYMENT
EXERCISE 9.4: WELL-FED RATS

Sampling Variability
What would happen if we took many samples?
EXAMPLE 9.3: BAGGAGE LUGGAGE

Sampling Variability
Sampling Distribution: the sampling distribution of a statistic is the distribution of the values taken by the statistic in ALL POSSIBLE samples of the same size from the same population.
EXAMPLE 9.4: RANDOM DIGITS

Describing Sampling Distributions
EXAMPLE 9.5: ARE YOU A SURVIVOR FAN?
1000 SRSs; n = 100; p = 0.37
1000 SRSs; n = 1000; p = 0.37
Using a scale to show shape!
Using the same x-axis scale as to the left!

UNBIASED vs. BIASED
A statistic is said to be UNBIASED if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
When the sampling distribution is roughly symmetric, a value of the statistic is about as likely to fall above the population parameter as it is to fall below it.
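The heights activity and the definition of an unbiased statistic can also be checked by simulation. What follows is a minimal sketch, assuming Python's standard library stands in for the calculator's randNorm; the function names, the seed, and the choice of 1000 repeated samples are illustrative assumptions and not part of the original activity.

```python
import random
import statistics

MU, SIGMA = 64.5, 2.5  # population N(64.5, 2.5) from the slides


def draw_sample(n, rng):
    """Stand-in for randNorm(64.5, 2.5, n): n simulated heights."""
    return [rng.gauss(MU, SIGMA) for _ in range(n)]


def share_within(values, k):
    """Fraction of values within k standard deviations of MU."""
    lo, hi = MU - k * SIGMA, MU + k * SIGMA
    return sum(lo <= v <= hi for v in values) / len(values)


def sample_means(n, reps, rng):
    """Means of `reps` independent samples of size n (the x-bar values)."""
    return [statistics.mean(draw_sample(n, rng)) for _ in range(reps)]


rng = random.Random(2008)  # arbitrary seed (an assumption) for repeatability

# Individual heights: compare with the Empirical Rule.
heights = draw_sample(7500, rng)  # about the size of the pooled class data
for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(heights, k):.3f}")

# Sampling distribution of the sample mean for n = 100.
xbars = sample_means(n=100, reps=1000, rng=rng)
print("mean of the x-bars:", round(statistics.mean(xbars), 2))
print("SD of the x-bars:  ", round(statistics.stdev(xbars), 2))
```

If the sample mean is an unbiased estimator, the mean of the x-bars should land very close to 64.5, while their standard deviation should come out far smaller than the population's 2.5.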
VARIABILITY of a STATISTIC
The larger the sample size, the less variability there will be in the statistic.
EXAMPLE 9.6: THE STATISTICS HAVE SPOKEN
- About 95% of the samples generated fall within Mean ± 2 SD
- With n = 100 … 0.37 ± 2(0.05) = 0.37 ± 0.10, or [0.27, 0.47]
- With n = 1000 … 0.37 ± 2(0.01) = 0.37 ± 0.02, or [0.35, 0.39]
The population size N is irrelevant, as long as the population is much larger than the sample! The accuracy of an SRS of n = 2500 is the same for the entire US (280M) as it is for San Francisco (775K).

BIAS & VARIABILITY (Revisited)
Precision versus Accuracy

BIAS & VARIABILITY (Revisited 2)

Homework Example
EXERCISE 9.9: BEARING DOWN
p = 0.1; 100 SRSs of size n = 200
Counts of non-conforming ball bearings out of 200 are shown:
(a) Make a table that shows the frequency of each count! Draw a histogram of the p̂ values.
(b) Describe the shape of the distribution.
(c) Find the mean of the distribution of p̂; mark it on the histogram. Any evidence of bias in the sample?
(d) What is the mean of “the sampling distribution” of all possible samples of size 200?
(e) What if we repeated this exercise, but used SRSs of size 1000 instead of 200? What would the mean of this be? Would the spread be larger, smaller, or about the same as the histogram from part (a)?
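The idea that larger samples give less variable estimates, as in Examples 9.5 and 9.6, can be explored with a short simulation. This is a minimal sketch, assuming each respondent is an independent yes/no draw with p = 0.37 (a stand-in for an SRS from a very large population); the helper names, the seed, and the 1000 repetitions per sample size are illustrative assumptions.

```python
import random
import statistics

P = 0.37  # true population proportion used in Example 9.5


def sample_proportion(n, rng):
    """p-hat for one simulated sample of n independent yes/no responses."""
    return sum(rng.random() < P for _ in range(n)) / n


def simulate_p_hats(n, reps, rng):
    """p-hat values from `reps` simulated samples, each of size n."""
    return [sample_proportion(n, rng) for _ in range(reps)]


rng = random.Random(9)  # arbitrary seed (an assumption) for repeatability

# 1000 simulated samples for each sample size shown on the slides.
results = {n: simulate_p_hats(n, 1000, rng) for n in (100, 1000)}

for n, p_hats in results.items():
    center = statistics.mean(p_hats)
    spread = statistics.stdev(p_hats)
    low, high = center - 2 * spread, center + 2 * spread
    print(f"n = {n:4d}: mean = {center:.3f}, SD = {spread:.3f}, "
          f"mean +/- 2 SD = [{low:.3f}, {high:.3f}]")
```

Both simulated distributions should be centered near 0.37, which is what an unbiased statistic looks like, while the n = 1000 interval should be noticeably narrower than the n = 100 interval. Changing the sample sizes in the dictionary to 200 and 1000, and P to 0.1, gives a rough way to think about parts (d) and (e) of Exercise 9.9.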