Ch 10 – Intro to Inference
10.1 Estimating with Confidence
10.2 Tests of Significance
10.3 Making Sense of Statistical Significance
10.4 Inference as a Decision

Definitions
Statistical Inference
– Provides methods for drawing conclusions about a population from sample data
– "What happens if I do this many times?"
Formal Inference
– Uses probability to express the strength of our conclusions (probability takes chance variation into account)
Margin of Error
– How accurate we believe our guess is, based on the variability of the estimate
– What you always see in the fine print

Confidence Intervals
Use facts about sampling distributions (what would happen in the long run) to express our confidence in the results of any one sample
"We got these numbers by a method that gives correct results 95% of the time."
Take the form estimate ± margin of error
Given at confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples

Fathom Demo #29: Capturing with Confidence Intervals
– Black is a hit, red is a miss
– What happens to CIs when…?
p542 #10.1 to 10.3
What CIs Say WS

Confidence Intervals Toolbox ("I Create Fabulous Confidence Intervals")
1. Identify
– Population
– Parameter
– Procedure
2. Conditions
3. Formula
4. Calculations
5. Interpret in Context
(Many CIs Collect More Measures)

Confidence Intervals for μ
Conditions
– SRS
– Sampling distribution of x̄ is approximately normal (given by the CLT for large n; check with a normal probability plot)
Formula: x̄ ± z*(σ/√n)
Interpretation: "We are __% confident that the true mean ___ is between __ and __.
By __% confident, we mean that we arrived at this conclusion by a method that gives correct results __% of the time."

z* (Critical Values)
– Same as z-scores in Table A (standard normal curve)
– Most common: 1.645, 1.96, and 2.576 (for 90%, 95%, and 99% confidence)
– For 90% confidence, look up area 0.95 to the left when you use the table (for 95% use 0.975; for 99% use 0.995)
Sketches
p548 #10.5-10.6

Ways to Decrease the Margin of Error
– Make z* smaller (decrease the confidence level C)
– Increase the sample size n
– Decrease the standard deviation σ
High confidence – our method almost always gives correct answers
Small margin of error – we have pinned down the parameter quite precisely

Choosing Sample Size
A wise user of statistics never plans data collection without planning the inference at the same time.
From Chapter 9: the size of the sample determines the margin of error, not the size of the population (the soup example).
p551 #10.10-10.11

Cautions
– The data can't come from anything more complicated than an SRS
– Beware of outliers and skewness
– Using a known σ is unrealistic, but we're using it now to understand the process – this entire chapter is about the process!
– Read the last paragraph on p554 about what statistical confidence does not say.

YMS – 10.2 Hypothesis Testing
Tests of Significance – A Few Ways
– Used to assess the evidence about a claim, while CIs estimate a parameter
– An outcome that would rarely happen if a claim were true is good evidence that the claim is not true
– "Does our sample result reflect a true change, or did it occur just by chance? How unlikely is our outcome if the null hypothesis were really true?"
– Uses knowledge of how the sample mean would vary in repeated samples

Hypotheses
Null Hypothesis (Ho)
– Statement saying there is no effect or change in the population
– If Ho is true, the sample result is just chance at work
Alternative Hypothesis (Ha)
– The alternative we believe to be true
– It is cheating to look at the data first and then frame Ha
One-sided vs.
Two-sided tests
– <, > or ≠
– Decide which before you collect the sample
– Choose two-sided to be safe

P-Value
– Probability of a result at least as far out as the result we actually got
– Measures the evidence against Ho
– Lower p-value = stronger evidence
– It's a probability, from Ch 2!

Calculating Two-Sided P-Values
– Calculate the same way as one-sided, then double
– The alternative hypothesis stated some difference, not in any particular direction
– Must consider both directions – greater than and less than (even though your sample only produces one or the other)
– Diagram on p573

Statistically Significant
– Chance alone would rarely produce so extreme a result
– Significance level alpha (α)
– Reject the null hypothesis when p < α
Stats in Dating ha ha ha…
p564 #10.27 to 10.35 odds

Significance Tests Toolbox
1. Identify the population and parameter AND state the null and alternative hypotheses in words and symbols.
2. Choose and verify the procedure (the conditions are still SRS and normality).
3. Carry out the inference procedure.
– Calculate the test statistic (one-sample z statistic).
– Find the p-value.
4. Interpret your results in the context of the problem.
– Reject or do not reject the null hypothesis.
– Include the p-value and a statement assessing the strength of the evidence.
p576 #10.38-10.39

Rejecting is not Accepting
– Failing to prove that something is false doesn't mean you believe it to be true
– "Not rejecting Ho" and "accepting Ho" are not the same conclusion
Examples
– Shakespeare video
– OJ: Ho: the person did not commit the crime. Ha: the person did commit the crime. If there is enough evidence, we find the person guilty; if there is not, we proclaim them not guilty. We aren't saying the person is innocent, just that we didn't have enough evidence to find them guilty.

Fixed Significance Level Z Tests for μ
Use the z-score associated with the chosen significance level to make the decision
– You don't need to find the p-value to make your decision.
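As a sketch of the machinery above (the numbers are made up for illustration), the one-sample z interval and z test can be computed with nothing but the Python standard library; the standard normal CDF comes from `math.erf`:

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def z_interval(xbar, sigma, n, z_star=1.96):
    """Level-C confidence interval for mu: xbar +/- z* * sigma/sqrt(n)."""
    m = z_star * sigma / math.sqrt(n)    # margin of error
    return (xbar - m, xbar + m)

def z_test(xbar, mu0, sigma, n, two_sided=True):
    """One-sample z statistic and its p-value; double the tail if two-sided."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    tail = 1 - normal_cdf(abs(z))        # area beyond |z| in one tail
    return z, (2 * tail if two_sided else tail)

# Made-up numbers: Ho: mu = 50 vs Ha: mu != 50, sigma = 10, n = 36, xbar = 53.
lo, hi = z_interval(53, 10, 36)
z, p = z_test(53, 50, 10, 36)
print(round(lo, 2), round(hi, 2))   # -> 49.73 56.27
print(round(z, 2), round(p, 4))     # -> 1.8 0.0719
```

Note that μo = 50 falls inside the 95% interval and the two-sided p-value exceeds α = 0.05, so both routes lead to the same decision: do not reject Ho.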
– More standard deviations from the mean means a smaller p-value/tail area
Example for a One-Sided Test with Ha: μ > μo
– If z > 1.645, you can reject at α = 0.05
– If z > 1.96, you can reject at α = 0.025
– If z > 2.576, you can reject at α = 0.005

CIs and Two-Sided Tests
– Reject Ho at level α if the value of μo falls outside a level 1 – α confidence interval for μ
– You're 99% confident the true mean is captured in a particular interval, but the interval doesn't contain μo
Why use a CI over a test?
– CIs give an estimate of the parameter, while tests just reject or fail to reject values
p580 #10.42-10.43
Test Review p583 #10.46 to 10.54 evens

YMS – 10.3 Making Sense of Statistical Significance
Choosing a Level of Significance
– What are the ramifications of rejecting Ho?
Practical vs. Statistical Significance
– Who cares if your scab falls off half a day sooner? Pay attention to the actual data as well as the p-value.
Inference is Not Valid for All Sets of Data
– Inference cannot correct a poorly designed experiment or survey.
Beware of Multiple Analyses
– Run enough analyses and every once in a while a "significant" result will show up by chance alone.
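To see how fast those by-chance findings accumulate, here is a quick sketch (the choice of 20 tests is arbitrary): if Ho is true every time, each independent level-α test gives a false positive with probability α, so at least one of k tests comes out "significant" with probability 1 − (1 − α)^k.

```python
import random

def false_positive_chance(k, alpha=0.05):
    """P(at least one 'significant' result among k independent level-alpha
    tests when Ho is true every time) = 1 - (1 - alpha)^k."""
    return 1 - (1 - alpha) ** k

print(round(false_positive_chance(1), 3))    # -> 0.05
print(round(false_positive_chance(20), 3))   # -> 0.642

# Quick simulation: under Ho a p-value is Uniform(0, 1), so "p < alpha"
# happens with probability alpha on each test.
random.seed(2)
reps, k, alpha = 4000, 20, 0.05
hits = sum(1 for _ in range(reps)
           if any(random.random() < alpha for _ in range(k)))
print(hits / reps)   # close to 0.642
```

So with 20 analyses at the 5% level, you are more likely than not to find something "significant" even when nothing is going on.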
p589 #10.58-10.59, 10.62 and 10.64

YMS – 10.4 Inference as Decision
Acceptance Sampling
– When circumstances call for a decision or action as the end result of inference
– When we must accept Ho instead of just not rejecting it
Type I Error
– Rejecting Ho (accepting Ha) when in fact Ho is true
– P(Type I error) = α
Type II Error
– Accepting Ho when in fact Ha is true
– Calculated based on a particular alternative value for μ

                         Truth about the Population
Decision based on Sample   Ho True            Ha True
Reject Ho                  Type I Error       Correct Decision
Accept Ho                  Correct Decision   Type II Error

p595, p597 Examples 10.21-10.22
p598 #10.67
Type I & II Errors for HW

Power (1 – β)
Choose your interpretation:
– Probability the test will reject Ho when an alternative is true
– Probability of rejecting Ho when it is in fact false
– Probability of making a correct decision (to reject Ho) when Ho is false
– Probability the test will pick up on an effect that is present
– Probability the test will detect a deviation from the null hypothesis should such a deviation exist

Ways to Increase Power
– Increase α (less evidence is required to reject)
– Consider an alternative that is farther away from μo
– Increase the sample size n (less overlap because spread decreases)
– Decrease σ (less overlap because spread decreases)
Example 10.23
p603 #10.72-10.75
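A minimal sketch of those four levers (made-up numbers; one-sided test Ha: μ > μo): power is the chance that x̄ clears the rejection cutoff μo + z*·σ/√n when the true mean is the alternative μa, and each change listed above raises it.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def power_one_sided(mu0, mu_a, sigma, n, z_star=1.645):
    """Power of the one-sided z test (Ha: mu > mu0) against alternative mu_a.

    Reject Ho when xbar > mu0 + z* * sigma/sqrt(n); power is the probability
    of that event when the true mean is mu_a.
    """
    se = sigma / math.sqrt(n)
    cutoff = mu0 + z_star * se
    return 1 - normal_cdf((cutoff - mu_a) / se)

base = power_one_sided(0, 1, 5, 25)
print(round(base, 3))                                      # -> 0.259
assert power_one_sided(0, 1, 5, 25, z_star=1.282) > base   # larger alpha
assert power_one_sided(0, 2, 5, 25) > base                 # mu_a farther from mu0
assert power_one_sided(0, 1, 5, 100) > base                # larger n
assert power_one_sided(0, 1, 3, 25) > base                 # smaller sigma
```

Each assertion checks one of the "Ways to Increase Power" bullets against the baseline test.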