Sampling Distributions
Chapter 7

The German Tank Problem
In WWII, the Allies captured several German tanks. Each one had a serial number on it.
Allied commanders wanted to know how many tanks the Germans actually had! They sent the serial numbers to a group of mathematicians in Washington, D.C.
The mathematicians produced remarkably close estimates for the actual number of German tanks.
http://en.wikipedia.org/wiki/German_tank_problem

The Textbook Problem
How many Algebra 2 books are in this school?
• Random Sample:
– 10
– 38
– 59
– 61
– 74
– 90
– 94

Sampling Distribution
• Samples are used to find out about the whole population.
• A statistic is a number that describes the __________.
• A parameter is a number that describes the ______________.
• In statistics, the value of a parameter is usually unknown because ______________.

Notation
• Sample mean: x̄
• Population mean: μ
• Sample proportion: p̂
• Population proportion: p

Example 1
• A pediatrician wants to know the 75th percentile for the distribution of heights for 10-year-old boys, so she takes a sample of 50 patients and calculates Q3 = 56 inches.
– Population
– Sample
– Parameter
– Statistic

Example 2
• A poll asked 1,102 12- to 17-year-olds in the United States if they have a cell phone. Of the respondents, 71% said yes.
– Population
– Sample
– Parameter
– Statistic

Colored Chips
• Take out a handful of 20 chips. If you get more, just randomly toss some back.
• Count the number of RED CHIPS and record it on your index card.
• What proportion of your chips were red?
• Example: 8 red chips / 20 total = 0.4
• Record this number on your index card, too.
• Shuffle and pass to the next person.

Number of RED = ___
Proportion of RED = ___ / 20 = ___

Sampling Variability
• Each time we sample, we will get slightly different information. Why?
• Because of this, it is best to take a large number of samples from the same population.
– Calculate the same statistic for every sample.
– Make a graph of these values.
– Examine the distribution (SOCS)
– Activity: Sampling

Sampling Distribution
• The distribution of values taken by the statistic in all possible samples of the same size from the same population.
• If you were trying to find the mean age of people in Tennessee, the sampling distribution would be the distribution of the means from every sample of the same size that could be taken.
• It’s too difficult to take ALL POSSIBLE SAMPLES.
• Sometimes we use simulations to imitate this process.
• If you don’t use ALL POSSIBLE SAMPLES, it’s not a sampling distribution! It would be an approximation.
• In the activity you are working on (colored chips), we cannot make an actual sampling distribution.
• We are limited to taking many samples. We will take about 20, but ideally we would take many more.
• I will use your samples to create a dotplot.

[Dotplot: Proportion of RED chips. Horizontal axis: p̂, from 0 to 0.8. Vertical axis: count, from 0 to 10.]

• Once we see the dotplot, we will analyze:
– Shape
– Center
– Spread
– Outliers

SOCS
• Shape: Symmetric? Skewed left/right?
• Center: Balance point, mean?
• Spread: Variability, standard deviation?
• Outliers: Any outliers or unusual features?

Assignment
• On a separate sheet of paper, answer the 3 Check Your Understanding questions on pages 420-421.

Unbiased Estimator
• How well does the sample proportion estimate the real proportion of red chips?
• Remember, the real proportion is 0.5.
• If the mean of the sampling distribution equals the true value of the parameter being estimated, the statistic is called an unbiased estimator.

Sample Proportions
7.2

The sampling distribution of p̂
• Shape:
• Center:
• Spread:

The sampling distribution of p̂
• Shape: In some cases, it looks like a normal curve. This depends on n and p.
• Center: The mean of p̂ equals p. This is because p̂ is an unbiased estimator of p.
• Spread: The standard deviation of p̂ gets smaller as n gets larger. Its value depends on n and p.

Sample Proportion
p̂ = (count of successes in the sample) / (size of the sample)

Mean and Standard Deviation

CYU page 437

Using the Normal Approximation
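
The center and spread claims for the sample proportion can be checked with a short simulation. The sketch below is illustrative only and is not part of the class activity: it assumes each chip drawn is red independently with probability p (roughly true when the bag is much larger than the handful), and the function names are made up for this example. It compares the simulated mean and standard deviation of the sample proportions with p and with the standard formula sqrt(p(1 - p)/n).

```python
import math
import random


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Return the sample proportion from each of num_samples samples of size n.

    Assumption: every draw is a success (a red chip) independently with
    probability p. This approximates pulling a handful from a large bag.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def std_dev(values):
    """Standard deviation of the simulated values (dividing by the count)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def theoretical_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)


# Colored-chips setting from the slides: true proportion 0.5, handfuls of 20.
p, n = 0.5, 20
p_hats = simulate_sample_proportions(p, n, num_samples=10_000)

print("mean of simulated sample proportions:", round(mean(p_hats), 4))
print("true proportion p:", p)
print("SD of simulated sample proportions:", round(std_dev(p_hats), 4))
print("sqrt(p(1 - p)/n):", round(theoretical_sd(p, n), 4))
```

Because this uses 10,000 samples rather than all possible samples, the result is an approximation to the sampling distribution, just as with the class dotplot. Rerunning with a larger n (for example, n = 80) should show the spread shrinking while the center stays near p.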