What is a sampling distribution and why is it called that?
It is the distribution of a sample statistic, such as $\bar{x}$ or $\hat{p}$. It is called a ‘sampling’ distribution because every observation within the distribution was calculated from a sample.

Why do we use statistics, $\bar{x}$ and $\hat{p}$, instead of the original data, X?
Statistics usually have much less variability than the original data. If we take a random sample of size n, the standard deviation of our statistic is $1/\sqrt{n}$ times the original standard deviation.

What is this distribution dependent on?
1. the original (or parent) population: its mean and sd
2. the sampling method (e.g., simple random sample)
3. the size of the sample, n

What good do they do us (how do we use them)?
Knowing the sampling distribution of a statistic allows us to make probability statements about our particular statistic value. For example, “the likelihood of seeing a proportion of red M&M’s® this small in a bag of 50 is only 2%”. We can also find percentiles.

Can we always get (find) a sampling distribution?
We could take many, many samples to build a sampling distribution empirically, OR we can use the Central Limit Theorem, which says that the distribution of the sample mean, $\bar{x}$, will be at least approximately normal if we take a large enough sample.

If $X \sim N(52, 4^2)$, what is the sampling distribution of $\bar{X}_{25}$, the sample mean from samples of size n = 25?
$\bar{X}_{25} \sim N(52, (4/\sqrt{25})^2) = N(52, 0.8^2)$

How likely are we to see an X over 55?
$P(X > 55) = P(Z > (55 - 52)/4) = P(Z > 0.75) = 1 - P(Z < 0.75) = 1 - 0.7734 = 0.2266$

How likely are we to see an $\bar{X}_{25}$ over 55?
$P(\bar{X}_{25} > 55) = P(Z > (55 - 52)/0.8) = P(Z > 3.75) = 1 - P(Z < 3.75) \approx 1 - 0.9999 = 0.0001$, which is essentially 0.

If X is NOT normal, what is the distribution of $\bar{X}_n$?
IF the sample size is large enough, $\bar{X}_n$ will be approximately normal (Central Limit Theorem).

What about categorical data?
If $n\pi \ge 10$ AND $n(1 - \pi) \ge 10$, then $\hat{p} \sim N(\pi, (\sqrt{\pi(1 - \pi)/n})^2)$, where $\pi$ is the population proportion.
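
The worked example with a N(52, 4^2) population and samples of size 25 can be checked numerically. The sketch below is illustrative only and is not part of the original notes. The helper names (normal_cdf, upper_tail, simulate_sample_means), the seed, and the number of repetitions are assumptions made for this example. It computes the two tail probabilities from the normal CDF and then builds an empirical sampling distribution by "taking many, many samples" to see whether the sd of the sample mean lands near 4/sqrt(25) = 0.8.

```python
import math
import random
import statistics


def normal_cdf(z):
    """P(Z < z) for a standard normal Z, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_tail(value, mean, sd):
    """P(statistic > value) when the statistic is N(mean, sd^2)."""
    return 1.0 - normal_cdf((value - mean) / sd)


# Values taken from the worked example: X ~ N(52, 4^2), n = 25.
MU, SIGMA, N = 52.0, 4.0, 25
SE = SIGMA / math.sqrt(N)  # sd of the sample mean

p_single = upper_tail(55, MU, SIGMA)  # one observation X over 55
p_mean = upper_tail(55, MU, SE)       # sample mean of 25 observations over 55


def simulate_sample_means(reps, n, mu, sigma, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2); return their means.

    reps and seed are arbitrary choices made for this illustration.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]


means = simulate_sample_means(5000, N, MU, SIGMA)
sd_of_means = statistics.stdev(means)

print(f"P(X > 55)       = {p_single:.4f}")
print(f"P(Xbar_25 > 55) = {p_mean:.6f}")
print(f"theoretical sd of the sample mean = {SE:.3f}")
print(f"simulated sd of the sample mean   = {sd_of_means:.3f}")
```

The simulated sd will vary slightly with the seed and the number of repetitions; the point is only that it sits close to the theoretical value and far below the population sd of 4.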