Chapter 5 5.3 Finding data values from z-score or probability Finding z-score from probability In the Standard Normal Table, look for probability number inside the table, (or the closest probability value or take average z between the two closest probabilities), then create the z-score from the column and row titles. Read the words carefully. The table is cumulative probability starting at the left of the normal distribution graph. For p = 0.3632, cumulative area of 0.3632 is to the left of our desired z-score, z = -0.35. For 10.75% of distribution area to the right of z-score, use p = 1 – 0.1075 = 0.8925. The desired z-score has 0.8925 cumulative area to the left. z = 1.24 Finding z-score from percentile Translate percentile number to probability. P5 = 5% which means p = 0.05 In Standard Normal Table, look for probability number inside the table, (or the closest probability value or take average z between the two closest probabilities), then create the z-score from the column and row titles. Read the words carefully. The table is cumulative probability starting at the left of the normal distribution graph. For P5 = 5% which means p = 0.05, cumulative area of 0.05 is to the left of our desired z-score. On the table we see 0.0495 (z = -1.65) and 0.0505 (z = -0.64). The probability we want is halfway between 0.0495 and 0.0505, so our z is halfway between -1.64 and -1.65. z = -1.645 For P90 = 90% which means p = 0.90, cumulative area of 0.90 is to the left of our desired z-score. On the table we see the closest probability is 0.8997, so use z = 1.28 Finding data value from z-score We have z, mean and standard deviation. Plug in and solve for x value. Or rearrange formula 𝒙=𝝁+𝒛𝝈 Finding data value for a given probability First find z-score from probability; then calculate x value from z-score formula. 5.4 Sampling Distributions and the Central Limit Theorem Sampling Distribution A sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution of all those sample means is called the sampling distribution of sample means. Every sample statistic (mean, proportion, variance, standard deviation) has a sampling distribution. Mean of the sample means The mean of the sample means 𝝁𝑥̅ is equal to the population mean 𝝁 . 𝝁𝑥̅ = 𝝁 Interpretation: if you took all the possible samples of size n from the population, every single combination of data to get samples of size n, and get a mean of each sample, then the mean of all those sample means is called 𝝁𝑥̅ and it will equal 𝝁 of the population. The mean of all the sample means will match the population mean. Standard deviation of the sample means = standard error of the mean The standard deviation of the sample means 𝝈𝑥̅ is equal to the population standard deviation divided by the square root of the sample size. 𝝈𝑥̅ = 𝝈 √𝒏 Interpretation: the sample means don’t vary as much as the population data did. Variance of the sample means The variance of the sample means σ2x̅ is equal to the population variance divided by the sample size. 𝝈𝟐 𝝈𝟐𝑥̅ = 𝒏 Interpretation: the sample means don’t vary as much as the population data did. Central Limit Theorem The sampling distribution of sample means will be a normal distribution if the population itself is normally distributed, for any size sample n. The sampling distribution of sample means can be approximated by a normal distribution if n ≥ 30, whether or not the population itself is normally distributed. The larger the sample size, the better the approximation. Finding z-score for sampling distribution of sample means 𝒛= 𝒗𝒂𝒍𝒖𝒆 − 𝒎𝒆𝒂𝒏 ̅−𝝁 𝒙 = 𝝈 = 𝒔𝒕𝒂𝒏𝒅𝒂𝒓𝒅 𝒆𝒓𝒓𝒐𝒓 𝑥̅ ̅−𝝁 𝒙 𝝈/√𝒏 Finding probability for sampling distribution of sample means Calculate z-score, then look up probability in the Standard Normal Table. Compare probabilities for x and x̅. These are very different! In both these problems you need to know population mean µ and population standard deviation σ (or substitute s for σ if n . x ̅ 𝒙 Item in distribution Population data values All sample means (means of all possible samples taken from population) Distribution Distribution of population data Sampling distribution of sample means Mean 𝜇 Mean of all possible sample means 𝝁𝑥̅ = 𝝁 Variance 𝝈𝟐 Standard Deviation 𝝈 z-score Example word problem 𝒛= 𝒗𝒂𝒍𝒖𝒆 − 𝒑𝒐𝒑𝒖𝒍𝒂𝒕𝒊𝒐𝒏 𝒎𝒆𝒂𝒏 𝒙 − 𝝁 = 𝒑𝒐𝒑𝒖𝒍𝒂𝒕𝒊𝒐𝒏 𝒔𝒕𝒂𝒏𝒅𝒂𝒓𝒅 𝒅𝒆𝒗𝒊𝒂𝒕𝒊𝒐𝒏 𝝈 What is the probability that a randomly selected credit card holder has a credit card balance less than $2500? 𝝈𝟐𝑥̅ 𝝈𝟐 = 𝒏 Standard error = 𝝈𝑥̅ 𝒛= = 𝝈 √𝒏 ̅ − 𝝁 ̅ − 𝝁 𝒗𝒂𝒍𝒖𝒆 − 𝒎𝒆𝒂𝒏 𝒙 𝒙 = = 𝒔𝒕𝒂𝒏𝒅𝒂𝒓𝒅 𝒆𝒓𝒓𝒐𝒓 𝝈𝑥̅ 𝝈/√𝒏 You randomly select 25 credit card holders. What is the probability that their mean credit card balance is less than $2500? 5.5 Normal Approximations to Binomial Distributions Binomial Distribution Discrete distribution, not continuous N independent trials Two possible outcomes: success or failure p = probability of success p is constant for each trial q = probability of failure q = 1 – p x is number of successes in n trials, whole numbers If np ≥ 5 and nq ≥ 5, then the binomial random variable x is approximately normally distributed. Mean 𝝁 = 𝒏𝒑 Standard deviation 𝝈 = √𝒏𝒑𝒒 Correction for continuity x is midpoint of interval of width 1. Subtract 0.5 from lowest value and add 0.5 to highest value. Ex: find probability of getting between 270 and 310 successes inclusive. 269.5 < x < 310.5 Ex: find probability of getting at least 158 successes. x > 157.5 Find z-scores for each x. 𝒛= 𝒙− 𝝁 𝝈 Find probability from standard normal table. Note: In a discrete distribution, p(x ≥ c) is different from p(x > c) because p(x = c) is not zero. In a continuous distribution, there is no difference between p(x ≥ c) and p(x > c), because p(x = c) is zero.