Topics 19 - 20
Unit 4 – Inference from Data: Principles

TOPIC 19 CONFIDENCE INTERVALS: MEANS

Topic 19 - Confidence Interval: Mean, σ is unknown

The purpose of a confidence interval is to use the sample statistic to construct an interval of values that you can be reasonably confident contains the actual, though unknown, parameter.

The estimated standard deviation of the sample statistic x-bar is called the standard error: s/√n.

Confidence interval for a population mean:

estimate ± margin of error
x-bar ± t* · s/√n, where n >= 30

t* is determined by the confidence level (with n − 1 degrees of freedom).

When running, for example, a 95% confidence interval, 95% is called the confidence level and we allow a possible 5% for error. We call this alpha (α) = 5%, where α is the significance level.

If the sample data are given, use Stat, Edit to enter the data in the calculator before running the confidence interval:
• L1: the list where you entered the data
• C-Level: the confidence level at which you are running the interval

If summary information about the sample is given:
• x-bar: mean of the sample data
• Sx: standard deviation of the sample
• n: sample size
• C-Level: the confidence level at which you are running the interval

Activity 19-3: M&M Consumption

Travel time to work.
• A study of commuting times reports the travel times to work of a random sample of 20 employed adults in New York State. The mean is x-bar = 31.25 minutes and the standard deviation is s = 21.88 minutes. What is the standard error of the mean?
• s/√n = 21.88/√20 = 4.8925 minutes.

Ancient air.
The composition of the earth’s atmosphere may have changed over time. To try to discover the nature of the atmosphere long ago, we can examine the gas in bubbles inside ancient amber. Amber is tree resin that has hardened and been trapped in rocks. The gas in bubbles within amber should be a sample of the atmosphere at the time the amber was formed.
Measurements on specimens of amber from the late Cretaceous era (75 to 95 million years ago) give these percents of nitrogen:

63.4 65.0 64.4 63.3 54.8 64.5 60.8 49.1 51.0

Assume (this is not yet agreed on by experts) that these observations are an SRS from the late Cretaceous atmosphere. Use a 90% confidence interval to estimate the mean percent of nitrogen in ancient air.

Ancient air.
Enter the data in L1. Using the TI-83, under Stat, TESTS, choose option 8:TInterval.
Mean of the sample = 59.6
Standard deviation = 6.26
Degrees of freedom = df = 8
90% confidence interval: the mean percent of nitrogen is between 55.7 and 63.5.
95% confidence interval: the mean percent of nitrogen is between 54.8 and 64.4.

Exercise 19-15 Page 414
Exercise 19-23 Page 417
Exercise 19-24 Page 417

TOPIC 20 TEST OF SIGNIFICANCE: MEANS

Topic 20 – Test of Significance: Mean

A test of significance is used when a value is claimed for the population parameter but we do not necessarily agree with it or have a question about it. To do the test we take a sample and use the sample statistic to assess the validity of the claimed value.

Step 1: Identify and define the parameter.

Step 2: State the hypotheses regarding the question. We cannot run a test of significance without establishing the hypotheses:
H0: µ = µ0
Ha: µ < µ0 or µ > µ0 or µ ≠ µ0

Step 3: Decide which test to run. In the case of a mean, we use the t-test:
t = (x-bar − µ0) / (s/√n)

Step 4: Run the test on the calculator.

Step 5: From the calculator T-Test output, write down the p-value.

Step 6: Compare your p-value with α (alpha), the significance level.
If the p-value is smaller than α, we “reject” the null hypothesis; the result is statistically significant based on the data.
If the p-value is greater than α, we “fail to reject” the null hypothesis; the result is not statistically significant based on the data.
Last step: Write a conclusion based on Step 6 at significance level α.

• p-value > 0.1: little or no evidence against H0
• 0.05 < p-value <= 0.10: some evidence against H0
• 0.01 < p-value <= 0.05: moderate evidence against H0
• 0.001 < p-value <= 0.01: strong evidence against H0
• p-value <= 0.001: very strong evidence against H0

A few possible cases to look at:

A teacher suspects that the mean for older students is higher than 115.
Higher than means (> 115).
The opposite of higher than is less than or equal to 115 (≤ 115).
Comparing the two, the null hypothesis is the comparison that includes equality (=).
H0: µ = 115
Ha: µ > 115 (one-sided alternative)

A teacher suspects that the mean for older students is the same as or more than 115.
Same or more than means (≥ 115).
The opposite of same or more than is less than 115 (< 115).
H0: µ = 115
Ha: µ < 115 (one-sided alternative)

A teacher suspects that the mean for older students is also 115.
Same means (= 115).
The opposite of same is not equal to 115 (≠ 115).
H0: µ = 115
Ha: µ ≠ 115 (two-sided alternative)

Fuel economy.
According to the Environmental Protection Agency (EPA), the Honda Civic hybrid car gets 51 miles per gallon (mpg) on the highway. The EPA ratings often overstate true fuel economy. Larry keeps careful records of the gas mileage of his new Civic hybrid for 3000 miles of highway driving. His result is x-bar = 47.2 mpg. Larry wonders whether the data show that his true long-term average highway mileage is less than 51 mpg. What are his null and alternative hypotheses?

Answer
H0: µ = 51 mpg; Ha: µ < 51 mpg.

Problem
If a researcher is interested in testing whether the mean is different from some claimed value, 55, then the null and alternative hypotheses are H0: μ = 55, Ha: μ ≠ 55.

Stating hypotheses.
In planning a study of the birth weights of babies whose mothers did not see a doctor before delivery, a researcher states the hypotheses as
H0: x-bar = 1000 grams
Ha: x-bar < 1000 grams
What’s wrong with this? Hypotheses should be stated in terms of µ, not x-bar.

Topic 20 – Test of Significance: Mean, σ is unknown

If the sample data are given, use Stat, Edit to enter the data in the calculator before running the T-Test:
• µ0: the mean value in question
• List: L1, where you entered the raw data
• µ: the alternative hypothesis

If summary information about the sample is given:
• µ0: the mean value in question
• x-bar: sample mean
• Sx: sample standard deviation
• n: sample size
• µ: the alternative hypothesis

Improving your SAT score.
We suspect that on average students will score higher on their second attempt at the SAT mathematics exam than on their first attempt. Suppose we know that the changes in score (second try minus first try) follow a Normal distribution. Here are the results for 46 randomly chosen high school students:

−30 24 47 70 −62 55 −41 −32 128 −11
−43 122 −10 56 32 −30 −28 −19 1 17
57 −14 −58 77 27 −33 51 17 −67 29
94 −11 2 12 −53 −49 49 8 −24 96
120 2 −33 −2 −39 99

Do these data give good evidence that the mean change in the population is greater than zero?

Activity 20-2: Sleeping Times

The null hypothesis is that the mean sleep time of the population is 7 hours. In symbols, the null hypothesis is H0: µ = 7.0 hours.
The alternative hypothesis is that the mean sleep time of the population is not 7 hours. In symbols, the alternative hypothesis is Ha: µ ≠ 7.0 hours.
Sample Number   Sample Size   Sample Mean   Sample SD   Test Statistic   p-value
1               10            6.6           0.825
2               10            6.6           1.597
3               30            6.6           0.825
4               30            6.6           1.597

Exercise 20-8: UFO Sighters’ Personality - Page 432
Exercise 20-10: Credit Card Usage - Page 433
Exercise 20-21: Pet Ownership - Page 436
Exercise 20-14: Age Guesses - Page 434

EXTRA PROBLEMS

Problem
Assume that you are conducting a test of significance using a significance level of α = 0.10. If your test yields a P-value of 0.08, what is the appropriate conclusion?
P-value = 0.08 < 0.10: reject the null hypothesis. The result is statistically significant.

Problem
The nicotine content in cigarettes of a certain brand is normally distributed with mean (in milligrams) μ and standard deviation σ = 0.1. The brand advertises that the mean nicotine content of their cigarettes is 1.5, but measurements on a random sample of 400 cigarettes of this brand gave a mean of x-bar = 1.52. Is this evidence that the mean nicotine content is actually higher than advertised? State the hypotheses. What do you conclude at significance level α = 0.01?
Test the hypotheses H0: μ = 1.5, Ha: μ > 1.5.

Problem
A researcher wants to know if the average time in jail for robbery has increased from what it was several years ago, when the average sentence was 7 years. He obtains data on 400 more recent robberies and finds an average time served of 7.5 years. If we assume the standard deviation of the sample is 3 years, what is the p-value of the test? What do you conclude at significance level α = 0.05?
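
Checking a t-test from summary statistics.
When only the sample mean, sample standard deviation, and sample size are given, as in the robbery-sentence problem and the Sleeping Times table of Activity 20-2, the test statistic t = (x-bar − µ0) / (s/√n) and its p-value can also be checked away from the calculator. The following is a minimal sketch in Python under stated assumptions: the function names (t_pdf, t_upper_tail, t_test_from_stats) are made up for this illustration, and the tail area is approximated with Simpson's rule, which is a choice made here and not a description of how the calculator computes it.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma keeps the constant finite even for large df (e.g. df = 399).
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t); steps must be even for Simpson's rule."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3      # P(0 <= T <= |t|)
    tail = 0.5 - area         # P(T >= |t|), using symmetry about 0
    return tail if t >= 0 else 1.0 - tail


def t_test_from_stats(xbar, s, n, mu0, alternative):
    """One-sample t test from summary statistics.

    alternative: 'less' (Ha: mu < mu0), 'greater' (Ha: mu > mu0),
    or 'two-sided' (Ha: mu != mu0). Returns (t, p_value).
    """
    t = (xbar - mu0) / (s / math.sqrt(n))
    df = n - 1
    if alternative == "greater":
        p = t_upper_tail(t, df)
    elif alternative == "less":
        p = t_upper_tail(-t, df)
    elif alternative == "two-sided":
        p = 2 * t_upper_tail(abs(t), df)
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return t, p


if __name__ == "__main__":
    # Robbery sentences: H0: mu = 7, Ha: mu > 7.
    t, p = t_test_from_stats(xbar=7.5, s=3, n=400, mu0=7, alternative="greater")
    print(f"robbery sentences: t = {t:.3f}, p-value = {p:.5f}")

    # Sleeping Times table: H0: mu = 7.0, Ha: mu != 7.0.
    for n, s in [(10, 0.825), (10, 1.597), (30, 0.825), (30, 1.597)]:
        t, p = t_test_from_stats(6.6, s, n, 7.0, "two-sided")
        print(f"n = {n}, s = {s}: t = {t:.3f}, p-value = {p:.4f}")
```

For the robbery problem the statistic is t = 0.5/0.15 ≈ 3.33 with 399 degrees of freedom, so the p-value falls well below α = 0.05. Running the loop over the four rows of the Sleeping Times table shows how the same sample mean gives different p-values as the sample size and sample standard deviation change.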