Confidence Intervals- CHEAT SHEET WHAT IS A CONFIDENCE INTERVAL? A confidence interval for the unknown population mean μ is an interval (or range) of plausible values for μ. It is constructed such that with a chosen degree (or level) of confidence C, the value of the unknown population mean will be captured inside the interval. FOR A POPULATION WE KNOW THAT X SHOULD FALL BETWEEN: WHAT DOES C STAND FOR? C provides information on how much “confidence” we can have in the method used to construct the CI C usual choices are: 90%, 95%, and 99% C can be interpreted as the rate of success for the method used to construct CI in the long run C is the % we will use to get our z-score for the following formula: WHAT ARE THE CORRESPONDING Z VALUES FOR THE THREE MOST COMMON CHOICES FOR C? 90% z= 95% z= 99% z= HOW DO WE CALCULATE A CONFIDENE INTERVAL? For a sufficiently large sample size n (CLT can apply so x ̄ follows a normal distribution) or a population that is already normally distributed, the general formula for a level C confidence interval for the population mean μ when σ is known is given by: WHAT DOES THIS HAVE TO DO WITH SAMPLING DISTRIBUTIONS? Sample means vary in value and form a sampling distribution in which not all samples result in x -values equal to the population mean μ. We should not expect to obtain a sample mean x (based on a specific sample) that is exactly equal to the population mean μ. However, we can expect the point estimate to be fairly close in value to the population mean for a sufficiently large sample size (sampling distribution becomes approximately normal for large sample size). (CENTRAL LIMIT THEOREM) Recall 68-95-99.7 rule: 95% of all observations from a normal distribution will fall within ± 2 standard deviation. If the sample size n is large enough, the sampling distribution of the sample means is approximately normal. Our point estimate x ̄will hardly be equal to the population mean μ, but most likely (≈ 95% of all times) fall within 2 standard deviations about the population mean μ. HOW DO WE INTERPRIT CONFIDENCE INTERVALS? 1- We can be C% confident that the falls in the constructed level C confidence interval, i.e. between the lower and upper CI bound for a specific calculated example. 2- If we would take repeated samples, approximately C% of all samples taken will include the in the long run. 3- The interpretation of a CI is always in terms of the unknown population mean μ and never in terms of the sample mean x ̄. The sample mean x ̄, the center of every CI, will always be included in the CI by default. WHAT DO WE NEED TO BE CAUTIOUS OF? Before we take a sample from a population we can say there is a C% chance, (e.g. 95% chance), that our confidence interval will include the population parameter μ if we plan on constructing C% confidence intervals, (e.g. 95% CIs). Once we have taken the sample, this decision is made. Our interval either does contain μ or it does not. We just don’t know it. There is not a C% chance anymore, all we can say is that we are C% confident, (e.g. 95% confident)