Chapter 11.1 Inference for the Mean of a Population

Example 1: One concern employers have about the use of technology is the amount of time that employees spend each day making personal use of company technology, such as phone, e-mail, internet, and games. The Associated Press reports that, on average, workers spend 72 minutes a day on such personal technology uses. A CEO of a large company wants to know if the employees of her company are comparable to this survey. In a random sample of 10 employees, with the guarantee of anonymity, each reported their daily personal computer use. The times are recorded below.

Employee   1   2   3   4   5   6   7   8   9  10
Time      66  70  75  88  69  71  71  63  89  86

Does the data provide evidence that the mean for this company is greater than 72 minutes?

What is different about this problem? The population standard deviation σ is unknown.

When the standard deviation of a statistic is estimated from the data, the result is called the standard error of the statistic, and is given by s/√n. When we use this estimator, the statistic that results does not have a normal distribution; instead it has a new distribution, called the t-distribution.

Time for some Nspiration!

One-sample z-statistic (σ known):

test statistic = (statistic - parameter) / (standard deviation of statistic)

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

One-sample t-statistic (σ unknown):

test statistic = (statistic - parameter) / (standard error of statistic)

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

The variability of the t-statistic is controlled by the sample size. The number of degrees of freedom is equal to n - 1.

ASSUMING NORMALITY? Use a box-and-whisker plot to check.
1. SRS is extremely important.
2. Check for skewness.
3. Check for outliers.
4. If necessary, make a cautionary statement.
5. In real life, statisticians and researchers try very hard to avoid small samples.

Example 2: The Degree of Reading Power (DRP) is a test of the reading ability of children.
Here are DRP scores for a random sample of 44 third-grade students in a suburban district:

40 26 39 14 42 18 25 43 46 27 19
47 19 26 35 34 15 44 40 38 31 46
52 25 35 35 33 29 34 41 49 28 52
47 35 48 22 33 41 51 27 14 54 45

At α = .1, is there sufficient evidence to suggest that this district’s third-graders’ reading ability is different from the national mean of 34?

SRS? Normal? How do you know? Do you know σ?
• I have an SRS of third-graders.
• Since the sample size is large, the sampling distribution is approximately normally distributed, OR since the histogram is unimodal with no outliers, the sampling distribution is approximately normally distributed.
• σ is unknown.

Name the Test!! One-sample t-test for mean

What are your hypothesis statements? Is there a key word?
H0: μ = 34
Ha: μ ≠ 34
where μ is the true mean reading ability of the district’s third-graders

Plug values into formula.

\[ t = \frac{35.091 - 34}{11.189 / \sqrt{44}} = .6467 \]

Use tcdf to calculate p-value.
p-value = 2 × tcdf(.6467, 1E99, 43) = 2(.2606) = .5212

Compare your p-value to α & make decision.
α = .1. Since p-value > α, I fail to reject the null hypothesis.

Write conclusion in context in terms of Ha.
Conclusion: There is not sufficient evidence to suggest that the true mean reading ability of the district’s third-graders is different from the national mean of 34.

Back to Example 1. The times are recorded below.

Employee   1   2   3   4   5   6   7   8   9  10
Time      66  70  75  88  69  71  71  63  89  86

Does this data provide evidence that the mean for this company is greater than 72 minutes?

SRS? Normal? How do you know? Do you know σ?
• I have an SRS of employees.
• Since the histogram has no outliers and is roughly symmetric, the sampling distribution is approximately normally distributed.
• σ is unknown, therefore we are using a one-sample t-test.

What are your hypothesis statements? Is there a key word?
H0: μ = 72
Ha: μ > 72
where μ is the true mean number of minutes per day spent on personal technology (PT) by this company’s employees

Use tcdf to calculate p-value.
Plug values into formula.

\[ t = \frac{74.8 - 72}{9.45 / \sqrt{10}} = .937 \]

p-value = tcdf(.937, 1E99, 9) = .1866
(The alternative is one-sided, Ha: μ > 72, so the tail area is not doubled.)

Compare your p-value to α & make decision.
Since p-value > 15%, I fail to reject the null hypothesis that this company’s employees spend 72 minutes on average on personal technology uses.

Write conclusion in context in terms of Ha.
Conclusion: There is not sufficient evidence to suggest that the true mean amount of time spent on personal technology use by employees of this company is more than the national mean of 72 min.

Now for the fun calculator stuff!

Example 3: The Wall Street Journal (January 27, 1994) reported that based on sales in a chain of Midwestern grocery stores, President’s Choice Chocolate Chip Cookies were selling at a mean rate of $1323 per week. Suppose a random sample of 30 weeks in 1995 in the same stores showed that the cookies were selling at the average rate of $1208 with standard deviation of $275. Does this indicate that the sales of the cookies are different from the earlier figure?

Assume:
• Have an SRS of weeks
• Distribution of sales is approximately normal due to large sample size
• σ unknown

Name the Test!! One-sample t-test for mean

H0: μ = 1323
Ha: μ ≠ 1323
where μ is the true mean cookie sales per week

\[ t = \frac{1208 - 1323}{275 / \sqrt{30}} = -2.29 \]

p-value = .0295

Since p-value < α of 0.05, I reject the null hypothesis. There is sufficient evidence to suggest that the sales of cookies are different from the earlier figure.

Example 3 (continued): Using the same sample of 30 weeks (mean $1208, standard deviation $275), compute a 95% confidence interval for the mean weekly sales rate.

CI = ($1105.30, $1310.70)

Based on this interval, is the mean weekly sales rate statistically different from the reported $1323?
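The Example 3 test and interval can also be checked outside the calculator. The following is a minimal sketch in Python using only the standard library; the helper names `t_pdf`, `t_upper_tail`, and `t_critical` are hypothetical (they are not part of the slides), and the tail area is found by numerical integration as a stand-in for the calculator's tcdf. A statistics library such as SciPy (`scipy.stats.t.sf`, `scipy.stats.t.ppf`) would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t].

    Plays the role of tcdf(t, 1E99, df) on the calculator.
    steps must be even for Simpson's rule.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_area, df):
    """Find t* such that P(T >= t*) = tail_area, by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_area:
            lo = mid  # tail still too large: t* is further out
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics from Example 3
x_bar, mu0, s, n = 1208, 1323, 275, 30
df = n - 1
se = s / math.sqrt(n)                          # standard error s / sqrt(n)

t_stat = (x_bar - mu0) / se                    # one-sample t statistic
p_value = 2 * t_upper_tail(abs(t_stat), df)    # two-sided: Ha is mu != 1323

t_star = t_critical(0.025, df)                 # 95% CI leaves 2.5% in each tail
ci = (x_bar - t_star * se, x_bar + t_star * se)

print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Under these assumptions the printed values should agree, up to rounding, with the t statistic, p-value, and interval shown for Example 3. Changing `0.025` to another tail area gives intervals at other confidence levels.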
What do you notice about the decision from the confidence interval & the hypothesis test?

What decision would you make on Example 3 if α = .01?
You would fail to reject H0 since the p-value > α.

What confidence level would be correct to use?
You should use a 99% confidence level for a two-sided test at α = .01.

Does that confidence interval provide the same decision?
99% CI = ($1069.60, $1346.40). Since $1323 is in this interval, we would fail to reject H0.

If Ha: μ < 1323, what decision would the hypothesis test give at α = .02?
Remember your p-value = .01475. At α = .02, we would reject H0.

Now, what confidence level is appropriate for this alternative hypothesis?
A 98% CI = ($1084.40, $1331.60). Since $1323 is in the interval, we would fail to reject H0.

Why are we getting different answers?
In a one-sided test, all of α (2%) goes into that tail (the lower tail). In a CI, the tails have equal area, so there should also be 2% in the upper tail. That leaves 96% in the middle (.02, .96, .02), and that should be your confidence level.

A 96% CI = ($1100, $1316). Since $1323 is not in the interval, we would reject H0.

Tail probabilities between the significance level (α) and the confidence level MUST match!

Ex4: The times of first sprinkler activation (seconds) for a series of fire-prevention sprinklers were as follows:

27 41 22 27 23 35 30 33 24 27 28 22 24

Construct a 95% confidence interval for the mean activation time for the sprinklers.

Matched Pairs Test
A special type of t-inference

Matched pairs: two forms

Form 1:
• Pair individuals by certain characteristics
• Randomly select treatment for individual A
• Individual B is assigned to other treatment
• Assignment of B is dependent on assignment of A

Form 2:
• Individual persons or items receive both treatments
• Order of treatments is randomly assigned, or before & after measurements are taken
• The two measures are dependent on the individual

Is this an example of matched pairs?
1) A college wants to see if there’s a difference in the time it took last year’s class to find a job after graduation and the time it took the class from five years ago to find work after graduation. Researchers take a random sample from both classes and measure the number of days between graduation and first day of employment.

No, there is no pairing of individuals; you have two independent samples.

Is this an example of matched pairs?
2) In a taste test, a researcher asks people in a random sample to taste a certain brand of spring water and rate it. Another random sample of people is asked to taste a different brand of water and rate it. The researcher wants to compare these samples.

No, there is no pairing of individuals; you have two independent samples. If you had the same people taste both brands in random order, then it would be an example of matched pairs.

Is this an example of matched pairs?
3) A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to a random sample, company researchers take a weight measurement on each person. After a month of using the drug, each person’s weight is measured again.

Yes, you have two measurements that are dependent on each individual.

A whale-watching company noticed that many customers wanted to know whether it was better to book an excursion in the morning or the afternoon. To test this question, the company collected the following data on 15 randomly selected days over the past month. (Note: days were not consecutive.)

Day         1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Morning     8   9   7   9  10  13  10   8   2   5   7   7   6   8   7
Afternoon   8  10   9   8   9  11   8  10   4   7   8   9   6   6   9

You may subtract either way; just be careful when writing Ha.

Since you have two values for each day, you must first find the differences for each day.
Because the two values for each day are dependent on the day, this data is matched pairs.

I subtracted: Morning - Afternoon. You could subtract the other way!

Day          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Morning      8   9   7   9  10  13  10   8   2   5   7   7   6   8   7
Afternoon    8  10   9   8   9  11   8  10   4   7   8   9   6   6   9
Differences  0  -1  -2   1   1   2   2  -2  -2  -2  -1  -2   0   2  -2

You need to state assumptions using the differences!

Assumptions:
• Have an SRS of days for whale-watching
• σ unknown
• Since the boxplot of the differences doesn’t show any outliers, we can assume the distribution is approximately normal. Notice the skewness of the boxplot; however, with no outliers, we can still assume normality!

Is there sufficient evidence that more whales are sighted in the afternoon?

H0: μD = 0
Ha: μD < 0
where μD is the true mean difference in whale sightings, morning minus afternoon

Be careful writing your Ha! Think about how you subtracted: M - A. If afternoon is more, should the differences be + or -? Don’t look at numbers!!!! If you subtract afternoon - morning, then Ha: μD > 0.

Notice we used μD for differences, and it equals 0 since the null should be that there is NO difference.

Finishing the hypothesis test (with x̄ and s computed from the differences):

\[ t = \frac{\bar{x} - \mu_D}{s / \sqrt{n}} = \frac{-.4 - 0}{1.639 / \sqrt{15}} = -.945 \]

df = 14, p = .1803, α = .05

In your calculator, perform a t-test using the differences (stored in list L3). Notice that if you subtracted A - M, then your test statistic would be t = +.945, but the p-value would be the same.

Since p-value > α, I fail to reject H0. There is insufficient evidence to suggest that more whales are sighted in the afternoon than in the morning.

Ex: The effect of exercise on the amount of lactic acid in the blood was examined in the journal Research Quarterly for Exercise and Sport. Eight males were selected at random from those attending a week-long training camp. Blood lactate levels were measured before and after playing 3 games of racquetball, as shown in the table. What is the parameter of interest in this problem?
Construct a 95% confidence interval for the mean change in blood lactate level.

Player   1   2   3   4   5   6   7   8
Before  13  20  17  13  13  16  15  16
After   18  37  40  35  30  20  33  19

Based on the data, would you conclude, at the 5% significance level, that the mean change in blood lactate level was over 10 points?
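One way to set up the blood lactate exercise as a matched-pairs problem is sketched below in Python. This is a sketch under stated assumptions, not the slides' own solution: the differences are taken as after minus before (so the parameter is the true mean increase), the hypotheses are assumed to be H0: μD = 10 against Ha: μD > 10, and the two critical values are read from a standard t table for df = 7 rather than computed. The test uses the critical-value approach in place of a p-value.

```python
import math
import statistics

before = [13, 20, 17, 13, 13, 16, 15, 16]
after = [18, 37, 40, 35, 30, 20, 33, 19]

# Matched pairs: work with one list of differences, not two samples.
# Assumption: subtract after - before, so positive values are increases.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = statistics.mean(diffs)     # sample mean of the differences
s_d = statistics.stdev(diffs)      # sample standard deviation (n - 1 divisor)
se = s_d / math.sqrt(n)            # standard error of the mean difference

# Critical values for df = n - 1 = 7, taken from a t table (assumed inputs).
T_STAR_95 = 2.365   # leaves 2.5% in each tail, for a 95% confidence interval
T_CRIT_05 = 1.895   # leaves 5% in the upper tail, for a one-sided test

# 95% confidence interval for the mean change
ci = (d_bar - T_STAR_95 * se, d_bar + T_STAR_95 * se)

# One-sided test of H0: mu_D = 10 against Ha: mu_D > 10
t_stat = (d_bar - 10) / se
reject = t_stat > T_CRIT_05

print(f"mean difference = {d_bar:.3f}, s = {s_d:.3f}, n = {n}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject}")
```

As with the whale-watching example, subtracting the other way (before minus after) would flip the signs of the differences and the direction of Ha, but not the decision.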