Mar. 29

Statistic for the day: 80.4% of Penn State students drink; 55.2% engage in “high-risk drinking”
Source: Pulse Survey, n = 1446, margin of error = 2.6%

Assignment: Read Chapter 20. Exercises p. 265: 1, 2, 3a,b

Sample means: measurement variables

Suppose we want to estimate the mean weight at PSU.

[Figure: Histogram of Weight, with Normal Curve. Horizontal axis: Weight (100 to 300); vertical axis: Frequency (0 to 40).]

Data from stat100.2 survey. Sample size 237. Mean value is 152.5 pounds.
Standard deviation is about (240 - 100)/4 = 35

What is the uncertainty in the mean?
Suppose we take another sample of 237. What will the mean be? Will it be 152.5 again? Probably not.
Consider what happens if we take 1000 samples, each of size 237, and compute 1000 means.

[Figure: Histogram of 1000 means with normal curve, based on samples of size 237. Horizontal axis: Weight (145 to 160); vertical axis: Frequency (0 to 100).]

Standard deviation is about (157 - 148)/4 = 9/4 = 2.25

Note: When we have measurement data and we consider the sample mean, there are two different standard deviations:
1. The original standard deviation of the data. We estimated that from the original histogram of the data.
2. The standard deviation of the sample mean. We estimated that from a histogram of 1000 sample means.

In general we will have to be given the standard deviation of the data, or we will have to estimate it from a histogram. But once we have the standard deviation of the data (called the sample standard deviation), we can skip the histogram of sample means and use a formula.

Formula for estimating the standard deviation of the sample mean (don’t need histogram)

Suppose we have the standard deviation of the original sample. Then the standard deviation of the sample mean is:

(standard deviation of the data) / sqrt(sample size)

Jargon: The standard deviation of the mean is also called the standard error or the standard error of the mean, abbreviated SEM or SE Mean.

So in our example of weights:
The standard deviation of the sample is about 35. Write SD = 35.
Sample size is 237.
Hence by our formula: SEM = SD / sqrt(sample size)
Standard error of the mean is 35 divided by the square root of 237:
SEM = 35/15.4 = 2.3
So the margin of error of the sample mean is 2 x 2.3 = 4.6
Report 152.5 ± 4.6, or 147.9 to 157.1

Using the margin of error as 2 SEMs, we really have a 95% confidence interval for the pop mean.

Normal curve of the sample mean. The standard error is 2.3 and the bell is centered at 152.5.

[Figure: Anatomy of a 95% confidence interval. Normal curve centered at the sample mean, 152.5; the middle 95% spans 2 SEM on either side, from 147.9 to 157.1. True pop mean in here someplace.]

The steps for a 95% confidence interval:
1. sample mean: 152.5 (given)
2. sample standard deviation: SD = 35 (given)
3. sample size: 237 (given)
4. standard error of the mean: SEM = 35/sqrt(237) = 2.3 (you calculate)
5. number of SEMs for 95% confidence coefficient: 2 (you look up in a normal z table)
Now you put it all together:
6. 95% confidence interval for pop mean:
   152.5 ± 2 x (2.3)
   152.5 ± 4.6
   147.9 to 157.1

Example: Estimate the true population mean amount spent by stat 100 students for textbooks in fall 2001. Include a 98% confidence interval.

From the class sample survey:
1. mean: 275 dollars
2. sample standard deviation: SD = 120 dollars
3. sample size: 100
4. standard error of the mean: SEM = SD/sqrt(100) = 120/10 = 12
5. number of SEMs for 98% confidence interval: 2.33
6. 98% confidence interval:
   275 ± 2.33 x (12)
   275 ± 27.96
   247.04 to 302.96

Interpretation: We estimate that the mean amount spent by the population of stat 100 students is about $275. The 98% confidence interval is $247 to $303, a reasonable set of values for the pop mean.
So we believe that the true pop mean amount spent on books this semester is between $247 and $303, with our best guess being $275.

Normal curve of the sample mean. The standard error is $12 and the bell is centered at $275.
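The six steps used in the two worked examples (weights and textbook spending) can be sketched in Python. This is only a sketch, not part of the lecture: the function names `z_multiplier` and `confidence_interval` are made up for illustration, and `statistics.NormalDist` from the standard library is assumed as a stand-in for the normal z table.

```python
import math
from statistics import NormalDist


def z_multiplier(level):
    """Number of SEMs for a given confidence level (stand-in for the z table)."""
    # The middle `level` of the bell leaves (1 - level)/2 in each tail.
    return NormalDist().inv_cdf(0.5 + level / 2)


def confidence_interval(mean, sd, n, multiplier):
    """Return (sem, margin, low, high) for mean +/- multiplier * SEM."""
    sem = sd / math.sqrt(n)      # step 4: standard error of the mean
    margin = multiplier * sem    # margin of error
    return sem, margin, mean - margin, mean + margin


# Weights: mean 152.5, SD 35, n = 237, multiplier 2 (the lecture's 95% value)
sem_w, margin_w, low_w, high_w = confidence_interval(152.5, 35, 237, 2)

# Textbooks: mean 275, SD 120, n = 100, multiplier 2.33 (the lecture's 98% value)
sem_b, margin_b, low_b, high_b = confidence_interval(275, 120, 100, 2.33)

print(f"z for 95%: {z_multiplier(0.95):.2f}, z for 98%: {z_multiplier(0.98):.2f}")
print(f"weights: SEM = {sem_w:.2f}, interval = {low_w:.1f} to {high_w:.1f}")
print(f"books:   SEM = {sem_b:.2f}, interval = {low_b:.2f} to {high_b:.2f}")
```

The exact multiplier for 95% is about 1.96, which the lecture rounds to 2. Because the code does not round the SEM to 2.3 before doubling it, the weight interval comes out near 148.0 to 157.0 rather than 147.9 to 157.1; the difference is only rounding.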
[Figure: Anatomy of a 98% confidence interval. Normal curve centered at the sample mean, $275; the middle 98% spans 2.33 SEM on either side, from $247 to $303. True pop mean in here someplace.]

Fibonacci

Guess the next number in the sequence:
1, 1, 2, 3, 5, 8, …
The sequence continues:
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …
Called a Fibonacci sequence.

Ratios of consecutive pairs after a while equal approximately .618, e.g.
8/13 = .615
13/21 = .619
21/34 = .618
34/55 = .618

[Figure: rectangle with sides labeled width and length.]
If the width is about .618 times the length, then the rectangle is called the golden rectangle.

Examples shown:
Daisy head: 21 clockwise spirals, 34 counterclockwise
Parthenon in Athens
Villa in Paris by Le Corbusier
St. Jerome, Leonardo da Vinci
La Parade, Georges Seurat
Place de la Concorde, Piet Mondrian

The golden rectangle has become an aesthetic standard for western civilization. It appears in many places:
architecture
art
pyramids
business cards
credit cards

Research question: Do non-western cultures also incorporate the golden rectangle as an aesthetic standard?

Width to length ratios for rectangles appearing on beaded baskets of the Shoshoni:
0.662  0.609  0.670  0.600  0.690
0.844  0.606  0.633  0.606  0.654
0.611  0.595  0.570  0.615  0.553
0.693  0.628  0.576  0.610  0.749
0.668  0.633  0.652  0.601  0.625

[Figure: Plot of width to length ratio of rectangles in Shoshoni beaded baskets. Axis from 0.55 to 0.85, with the golden rectangle value, .618, marked.]

Question: Is the golden rectangle (.618) a reasonable value for the mean of the population of Shoshoni rectangles?

1. sample mean: .638
2. sample standard deviation: SD = .061
3. sample size: 25
4. standard error of the mean: SEM = .012 (I calculated it for you.)

Could you create a 95% confidence interval for the population mean? (We’d like to know whether .618 is in this interval.)
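One way to check your answer is a short Python sketch that recomputes the summary numbers from the 25 Shoshoni ratios and applies the same "sample mean ± 2 SEM" recipe. It is only an illustration, not part of the lecture: the variable names are made up, and the ratios are typed in the order they appear above, which does not affect the mean or the standard deviation.

```python
import math
import statistics

# The 25 width-to-length ratios from the Shoshoni beaded baskets.
ratios = [
    0.662, 0.609, 0.670, 0.600, 0.690,
    0.844, 0.606, 0.633, 0.606, 0.654,
    0.611, 0.595, 0.570, 0.615, 0.553,
    0.693, 0.628, 0.576, 0.610, 0.749,
    0.668, 0.633, 0.652, 0.601, 0.625,
]

GOLDEN = 0.618  # golden rectangle ratio quoted in the lecture

n = len(ratios)                  # sample size
mean = statistics.mean(ratios)   # sample mean
sd = statistics.stdev(ratios)    # sample standard deviation (n - 1 divisor)
sem = sd / math.sqrt(n)          # standard error of the mean

# 95% confidence interval using the lecture's multiplier of 2 SEMs.
low = mean - 2 * sem
high = mean + 2 * sem
contains_golden = low <= GOLDEN <= high

print(f"n = {n}, mean = {mean:.3f}, SD = {sd:.3f}, SEM = {sem:.3f}")
print(f"95% CI: {low:.3f} to {high:.3f}")
print(f"Is {GOLDEN} inside the interval? {contains_golden}")
```

The computed mean, SD, and SEM should agree with the values given in steps 1 to 4 after rounding to three decimals.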