STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 1 week of September 3, turn in on or before September 11

Reading – The Roman-numbered prelude, and the Arabic-numbered pages through page 21 of the text.

Written exercises to be turned in.

Note that in this, and subsequent assignments, some questions are labeled ‘do but don’t turn in’. You should do and be sure you understand these exercises, as the issues they test can appear in later tests. Do not however turn them in to be graded; you can check your answer from the sample solution posted on the Web.

Page  Exercise
13    1.6       Histogram
21    1.10      Stem plot
23    1.12      Time plot
28    1.29      Bar chart of a categorical variable. Pareto chart
30    1.32      Recognize histogram shapes (Do this exercise, but do not turn it in)
33    1.37      Comparative (back-to-back) stem plot

Skills learned: Introduction to uncontrolled variability and ‘standing back’ from a data set. Drawing a bar chart, a histogram, a stem plot (aka stem and leaf display), and a time plot.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 2 week of September 10, turn in by September 18

Reading – Pages 37 to 82

Written exercises to be turned in

Page  Exercise
50    2.9       Do, but do not turn in. Check your answer from the back of the book.
55    2.12      Display the three groups in a comparative box and whisker plot, and comment on the apparent impact of the logging.
59    2.31      Do, but do not turn in. Check your answer from the back of the book.
63    2.46      Calculate the five-number summary, and the values of the inner and outer fences. Identify any players that are boxplot outliers.
74    3.6       Normal curves, the 68/95/99.7 rule.
74    3.7       Normal drill. Do, but do not turn in.
75    3.8       Comparison using normal scores.
80    3.10      Standard normal drill – direct
80    3.11      Standard normal drill. Do, but do not turn in.
83    3.14      Standard normal drill – inverse
85    3.24      Using z scores to solve proportion questions.

Note. The chapter 3 assignments involve normal proportion calculations.
Doing these calculations is a mechanical skill that you get with practice; while it is a low-level skill, it is a vital one that, if not mastered now, will come back to haunt you later in the semester. In this homework I ask you to turn in just a minimal number of calculations. However, you should do as many practice exercises (like those in Chapter 3) as it takes to develop total fluency in working with tables and figuring proportions.

Skills learned: Numeric descriptive statistics; resistance; the five-number summary; the box and whisker plot. A first look at the normal curve, used as a handy potential approximation to a set of data. Skills of using a normal table.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 3 week of September 17, turn in on or before September 25

Reading – Pages 91 to 125.

Written exercises to be turned in. Do the calculations to three decimals. Using fewer than three decimals in these exercises can make enough of a difference in the answers to be noticeable; using many more than three decimals is probably overkill.

Page  Exercise
96    4.6       Draw and interpret a scatter plot.
101   4.8       Scatterplot, calculate correlation from definition and using calculator. (The fourth exercise of this homework also uses this data set; you would be wise to do the two sets of calculations at the same time to avoid having to rekey the data.)
110   4.28      Scatterplot with repeated x; interpretation.
101             Continuing with the data of exercise 4.8, find the regression line to predict forest lost from coffee price. Do this in two ways: (i) using the formula for calculating the intercept and slope from the means, standard deviations and correlation coefficient that you calculated for exercise 4.8; and (ii) using the two-variable capability of your calculator. Re-draw the scatterplot you made in the second exercise, and draw in the regression line on the plot.
Use this to predict the percent forest lost if the coffee price were to rise to 80 cents per pound.

Skills learned: Bivariate data. Display with a scatterplot. Measuring strength of association. What the correlation coefficient does and does not measure. The least squares regression line. Fitting the regression line using a ‘statistical’ calculator.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 4 week of September 24, to be turned in by October 2

Reminder. Your first in-class test takes the place of the scheduled lecture October 10.

Reading – Pages 126 – 136, 149 – 160, 167 – 171

Written exercises to be turned in (the text exercises that follow question 1 are on pages 160 and 162)

1. The following data set shows the city and the highway fuel consumption of 22 cars of the 2002 model year. We’d like to know if we can predict the highway fuel consumption y using the city fuel consumption x as a predictor.

Car                      City  Highway
AcuraNSX                 17    24
AudiTTRoadster           22    31
AudiTTQuattro            20    28
BMWMCoupe                17    25
BMWZ3Coupe               19    27
BMWZ3Roadster            20    27
BMWZ8                    13    21
ChevroletCorvette        18    25
ChryslerProwler          18    23
Ferrari360Modena         11    16
FordThunderbird          17    23
HondaInsight             57    56
HondaS2000               20    26
LamborghiniMurcielago    9     13
MazdaMiata               22    28
Mercedes-BenzSL500       16    23
Mercedes-BenzSL600       13    19
Mercedes-BenzSLK230      23    30
Mercedes-BenzSLK320      20    26
Porsche911GT2            15    22
PorscheBoxster           19    27
ToyotaMR2                25    30

The numbers give the following summary statistics:

Variable  N   Mean    SD      Minimum  Maximum
City      22  19.591  9.2204  9        57
Highway   22  25.909  8.0469  13       56

The correlation between the two is 0.9808.

• Using these summary statistics, calculate the regression line to predict the highway fuel consumption from the city fuel consumption.
• Over what range of values can you interpolate using your line?
• How do you interpret the intercept and the slope of this line? Do they make sense?
• What would you predict to be the highway fuel consumption of a car with city fuel consumption of 30 mpg?
• Using your regression line, calculate the residuals and plot the residuals (vertical axis) against the city gas mileage x. Comment on the plot generally, and on the Honda Insight specifically.

Exercise
6.8         Simpson’s paradox
6.19-6.22   Working with a two-way table. Check the odd-numbered answers from the back of the book.

Skills learned: Fitting the regression line from the five bivariate summary numbers. Residual plots, outliers, influence. Working with two-way tables of categorical data.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 5 week of October 1, turn in on or before October 12

Note that this is an automatic extension to the normal turning date, occasioned by your first in-term test on October 10.

Reading – Pages 189 to 204

Written exercises to be turned in

Page  Exercise
194   8.6       Example – a mail-shot survey
199   8.10      Use Table B starting at line 139. Drawing a random sample using a table.
201   8.12      Explain clearly how you use the random number table. A stratified sample.
204   8.14      Non-response.
209   8.34      Cautions on reported behavior.
211   8.46      Effect of wording

Skills learned: Use of random number tables, difficulties and issues in sampling.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 6 week of October 8, turn in by the start of your recitation October 16

Reading – Pages 213 - 227

Written exercises.

Page  Exercise
215   9.2       Terms in experimentation
222   9.10      Meaning of the word ‘significant’
226   9.14      Describing a completely randomized experiment
228   9.16      Categorizing a study
228   9.18      Categorizing a study
230   9.28      Carry out the randomization for a CRD

Skills learned: Concepts and methods of designed experiments.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 7 week of October 15, turn in no later than start of recitation October 23.
Reading – Pages 246 - 262

Written exercises to be turned in

Page  Exercise
250   10.4      Matching probability with event
252   10.6      Working with tetrahedral dice
254   10.8      A sample space
256   10.11     Don’t turn in – check at the back of the book. Benford’s Law
256   10.12     A discrete probability distribution
260   10.15     Don’t turn in – check at the back of the book. Normal probability calc.
265   10.32     Some probability calculations

Some normal table review. (Make sure you can do these exercises, but do not turn them in.)

Suppose the weight of aspirin tablets is normal with µ=325, σ=3 mg. What is the probability that an individual tablet weighs:
1. less than 320 mg? (answer 0.0478)
2. less than 331 mg? (answer 0.9773)
3. between 320 and 331 mg? (answer 0.9295)
4. more than 318 mg? (answer 0.9902)

What is the weight that 99.9% of aspirin tablets exceed? (answer 315.7 mg)

What is the 90% normal range of weights of aspirin tablets? (answer 320 to 330 mg)

If you have any trouble with these calculations, review the material on using normal tables that you will find on the class Web site.

Skills learned: Rules of probability; discrete and continuous random variables

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 8 week of October 22, turn in on or before start of recitation October 30

Reading – Pages 271 - 287

Page  Exercise
272   11.2      Parameters and statistics
272   11.3      Parameters and statistics (do not turn in)
294   11.18     Parameters and statistics
275   11.4      Behavior of a mean as n increases.
275   11.5      Why insurance works (do not turn in)
280   11.8      Distribution of a sample mean. Effect of sample size
286   11.12     Individual values and sample mean
291   11.14     A Shewhart Xbar control chart
298   11.38     A normal probability calculation
298   11.42     Using CLT to solve problems involving a total

Skills learned: Sampling distribution of a mean. Central limit theorem. Introduction to confidence intervals

STAT 1001 Introduction to the Ideas of Statistics.
Hawkins

Reminder – the second mid-term test takes the place of the regular class on Wednesday November 14

Homework 9 week of October 29. Turn-in date is November 6.

Reading – Pages 343 – 355

Written exercises to be turned in

Page  Exercise
348   14.2      Interpreting a confidence interval.
352   14.6      A fuller analysis, from descriptives up to a CI.
354   14.8      Intervals with different confidence levels.
357   14.13     Do not turn in – check answer at the back of the book
357   14.14     Confusion of CI with normal ranges
357   14.17     Do not turn in – relevance of sample size
357   14.18     Relevance of population size
359   14.28     A sample size calculation

Skills learned: Introduction to confidence intervals.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 10 week of November 5, turn in on or before November 16

(Note that this is an automatic extension for this homework because of the second in-term test, which is November 14.)

Reading – Pages 362 - 381

Written exercises to be turned in

Page  Exercise
365   15.2      Setting up null and alternative hypotheses and test statistic
366   15.4      Continuation
368   15.8      Evidence against the null
371   15.14     Calculating a one-sided P value
376   15.18     Get a test statistic and P value
376   15.20     A fuller example, including some calculations
383   15.40     Effect of sample size on P values.

Skills learned: Some practical aspects of inferential procedures and cautions about taking the numbers at face value.

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 11 week of November 12, turn in on or before November 20

Reading – Pages 387 - 397

Page  Exercise
391   16.2      A confidence interval with σ estimated from a large sample.
395   16.7      Do not turn in
395   16.8      A confidence interval and a test
405   16.20     Interpreting a description of a trial
406   16.24     Relevance of sample size
407   16.32     Assessment of evidence

Skills learned: An overview of some further aspects of inference

STAT 1001 Introduction to the Ideas of Statistics.
Hawkins

Homework 12 week of November 19, turn in on or before November 27

Reading – Pages 433 - 446.

It is some time since you used your calculator for data analysis. Check your skills by finding the mean and standard deviation of the numbers
2, 4, 6, 8, 10 (answer: mean = 6, standard deviation = 3.1623)
and the mixed-sign numbers
-3, -1, 1, 3, 5, 7 (answer: mean = 2, standard deviation = 3.7417)
If this gives you any trouble, review the material covered in homework 2 and your calculator instruction manual.

Written exercises. From the text:

Page  Exercise
434   18.1      (Do not turn in) The standard error of the mean
436   18.3      (Do not turn in) Reading t tables
437   18.4      Reading t tables
438   18.6      A confidence interval for a normal mean. First step – you need to find the mean and standard deviation of these numbers
439   18.7      (Do not turn in) Another confidence interval for a normal mean
448   18.13     (Do not turn in) A matched pairs t analysis and the impact of outliers
454   18.34     A matched pairs test
455   18.36     A confidence interval for a mean – precursor to a two-sample analysis

Skills learned: The t distribution, origin and use; tables. Confidence intervals and tests on small normal samples. Analysis of matched pairs data

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 13 week of November 26, turn in on or before December 4

Reading – Pages 461 - 473

Page  Exercise
482   19.22     Pick your technology
482   19.23     (do not turn in) Continuation of technology pick
482   19.24     Pick your technology
482   19.27     (do not turn in) Assumptions for inference
483   19.28     Setting up two-sample hypotheses
485   19.37     (do not turn in) Two-sample analysis using original data.
486   19.40     Two-sample analysis given summary statistics. In addition to what the book asks for, set up a 95% confidence interval for the difference in means.
489   19.48     Two-sample analysis using original data.
489   19.49     (do not turn in) Follow a test with a confidence interval.
Skills learned: Two-sample inference on means

STAT 1001 Introduction to the Ideas of Statistics. Hawkins

Homework 14 week of December 3, turn in on or before December 11

This homework will not be turned back. The material on it is examinable.

Reading – Pages 491 – 498, 502 – 507. (Ignore the section “Accurate confidence intervals for a proportion”; we will not go down that road.)

Written exercises to be turned in. From the text:

Page  Exercise
496   20.6      Conditions for a CI on a proportion
499   20.7      (do not turn in) Conditions for a CI for a proportion
499   20.8      Analysis of proportion data
503   20.14     A sample size calculation
507   20.18     Sampling distribution of a proportion
507   20.19     (do not turn in) Continuation
507   20.20     Continuation
508   20.21     (do not turn in) Continuation
508   20.22     Continuation

Not from the book. We want to test whether people really can generate plausible random digits. Evidence is that most people give too many 3s and 7s, and not enough 0s. We ask a volunteer to give us 200 random digits; of these 11 are zero. Use this piece of information to test the hypothesis that the digits could be random.

Not from the book. We wonder whether a particular community has an unusual distribution of blood type. In the whole US population, 45% have Type O blood. We sample 400 individuals from the community and 150 have Type O blood. Is this evidence that the community is different from the population as a whole? Follow up the test with a 95% confidence interval for the proportion of this community who have Type O blood.

Skills learned: Inference on proportions in a single sample.
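
The two ‘Not from the book’ questions in Homework 14 both come down to a one-sample z test on a proportion, and the second follows it with a large-sample confidence interval. The sketch below shows those mechanics in Python as a way to check hand and table work. It is only an illustration: the counts (60 successes in 100 trials, tested against a null proportion of 0.5) are made up and are not the homework data, the function names are hypothetical, and the test is taken to be two-sided.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_z_test(successes, n, p0):
    """Two-sided z test of H0: p = p0. Returns (z statistic, P value)."""
    p_hat = successes / n
    # Under H0 the standard deviation of p_hat uses the null value p0.
    sd_null = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / sd_null
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


def proportion_ci(successes, n, z_star=1.96):
    """Large-sample confidence interval for p (z_star = 1.96 gives 95%)."""
    p_hat = successes / n
    # The interval uses the standard error based on the sample proportion.
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se


# Illustrative (made-up) data: 60 successes in 100 trials, H0: p = 0.5.
z, p_value = proportion_z_test(60, 100, 0.5)
low, high = proportion_ci(60, 100)
print(f"z = {z:.3f}, two-sided P = {p_value:.4f}")
print(f"95% CI for p: {low:.4f} to {high:.4f}")
```

Note that the test and the interval use different standard deviations: the test plugs in the null value p0, while the interval plugs in the sample proportion. A one-sided P value, if the question calls for one, is half the two-sided value when the sample proportion falls on the side named by the alternative.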