Connecting Data Analysis, Design, and Statistical Inference
Daren Starnes
The Lawrenceville School
dstarnes@lawrenceville.org
CMC South Oct 2015

AP Statistics Teachers Meeting
• In this room at 12:30.
• Grab a bite to eat and come join us for a panel discussion and some fabulous giveaways from the publishers.

Why are we here?
• Examine the statistical problem solving process: ask a question, collect data, analyze data, interpret results.
• Explore three applets that make the connection between data collection, data analysis, and inference explicit.
• Discuss how to use these applets to deepen students’ statistical understanding.
• Use simulation as a tool for performing inference.

I. Analyzing categorical data
Who Watches Survivor?
http://www.cbs.com/shows/survivor/

Television executives and companies who advertise on TV are interested in how many viewers watch particular shows. According to Nielsen ratings, Survivor was one of the most-watched television shows in the United States during every week that it aired. An avid Survivor fan (and textbook author) claims that 35% of all U.S. adults have watched Survivor. A skeptical editor believes this figure is too high. He asks a random sample of 200 U.S. adults if they have watched Survivor; 60 say “yes.”

• Design: How were the data produced? Random sample of 200 U.S. adults
• Data Analysis: What is appropriate for one categorical variable?
  – Graph
  – Numerical summaries
  www.tinyurl.com/SPAapplets
• Inference: Testing a claim about a population proportion
  – Via simulation
  – Using traditional inference methods

What conclusion should we draw?

II. Comparing distributions of quantitative data
Does sleep deprivation linger?

Researchers have established that sleep deprivation has a harmful effect on visual learning. But do these effects linger for several days, or can a person “make up” for sleep deprivation by getting a full night’s sleep on subsequent nights? A recent study (Stickgold, James, and Hobson, 2000) investigated this question by randomly assigning 21 subjects (volunteers between the ages of 18 and 25) to one of two groups: one group was deprived of sleep on the night following training and pre-testing with a visual discrimination task, and the other group was permitted unrestricted sleep on that first night.

Both groups were allowed as much sleep as they wanted on the following two nights. All subjects were then re-tested on the third day. Subjects’ performance on the test was recorded as the minimum time (in milliseconds) between stimuli appearing on a computer screen for which they could accurately report what they had seen on the screen.

The computer task
[Figure: three panels, (a), (b), and (c), showing the task displays]
After display of the mask (c), subjects must report first on whether the letter ‘T’ (for example, in a) or ‘L’ (b) was displayed at fixation, and then whether the three diagonal bars were arrayed horizontally (a) or vertically (b).

The sorted data presented here are the improvements in those reporting times between the pre-test and post-test (a negative value indicates a decrease in performance):
Sleep deprivation (n = 11): -14.7, -10.7, -10.7, 2.2, 2.4, 4.5, 7.2, 9.6, 10.0, 21.3, 21.8
Unrestricted sleep (n = 10): -7, 11.6, 12.1, 12.6, 14.5, 18.6, 25.2, 30.5, 34.5, 45.6

• Design: How were the data produced? Completely randomized experiment with 21 volunteer subjects
• Data Analysis: What is appropriate for one quantitative variable and two groups?
  – Graph
  – Numerical summaries
  www.tinyurl.com/SPAapplets
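As one possible illustration of the "numerical summaries" step for the sleep deprivation data, here is a minimal Python sketch. It is not the applet referenced on the slide; the function name `summarize` and the choice of summary statistics are assumptions made for this example. The data values are copied from the two lists in the study description.

```python
from statistics import mean, median, stdev

# Improvement in reporting time (milliseconds), copied from the study data.
deprived = [-14.7, -10.7, -10.7, 2.2, 2.4, 4.5, 7.2, 9.6, 10.0, 21.3, 21.8]
unrestricted = [-7.0, 11.6, 12.1, 12.6, 14.5, 18.6, 25.2, 30.5, 34.5, 45.6]


def summarize(values):
    """Return basic numerical summaries for one group (hypothetical helper)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation
        "min": min(values),
        "median": median(values),
        "max": max(values),
    }


for label, group in (("Sleep deprivation", deprived),
                     ("Unrestricted sleep", unrestricted)):
    s = summarize(group)
    print(f"{label}: n={s['n']}, mean={s['mean']:.2f}, sd={s['sd']:.2f}, "
          f"min={s['min']}, median={s['median']:.2f}, max={s['max']}")

# Observed difference in mean improvement (unrestricted minus sleep deprived).
observed_diff = mean(unrestricted) - mean(deprived)
print(f"Observed difference in means: {observed_diff:.2f} ms")
```

The observed difference in means computed here is the statistic that any later inference, by simulation or by traditional methods, would be judged against.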
• Inference: Testing a claim about a difference between two means
  – Via physical simulation
  – Via computer simulation
  – Using traditional inference methods

Question: Based on the data, is it plausible that there’s really no harmful effect of sleep deprivation, and that the chance involved in the random assignment alone produced the observed difference between these two groups?

Let’s re-do the random assignment many, many times…
• If there is no treatment effect, then each subject’s value would be the same as in the original study, no matter which group that subject was assigned to.
• Write each of the 21 data values on a separate card.
• Place all of the cards (subjects) in a bag.
• Mix your cards well and deal two groups: one with 10 cards (unrestricted sleep) and one with 11 cards (sleep deprived).
• Calculate the difference in mean improvement for the two groups (unrestricted minus sleep deprived).
• Write the value on a sticky note and bring it to me.

What conclusion should we draw?

III. Challenges students face in tackling inference questions
1. Which inference method to choose
• Estimating or testing a claim?
• Means, proportions, relationships between categorical/quantitative variables?
2. Conditions for using each inference method, and why they are important
• Random sampling or random assignment
• Normal/Large Sample or Large Counts
• Independence (of measurements/samples)
3. Different inferential thinking in sampling and experiments: Scope of inference
– Random selection in sampling settings allows inference about a population
– Random assignment in experiments allows inference about cause-and-effect
4. Communicating effectively
• Using notation and statistical terminology correctly
• Stating technically correct conclusions in context
5. Distinguishing among samples, populations, statistics, and parameters
6. Using technology as a tool: The “Do” step

IV. A resource!
Larry Green’s Web site at Lake Tahoe Community College
www.ltcconline.net/greenL/java/Statistics/catStatProb/categorizingStatProblemsJavaScript.html

How Did We Do?
Send your text message to this phone number: 37607
Rating scale: 0 = Strongly Disagree, 1 = Disagree, 2 = Agree, 3 = Strongly Agree
Message format: poll code for this session (8720), 1 space, three ratings (no spaces), 1 space, then your comments.
• Speaker was engaging and an effective presenter (0-3)
• Speaker was well-prepared and knowledgeable (0-3)
• Session matched title and description in program book (0-3)
• Other comments, suggestions, or feedback (words)
Example: 8720 323 Inspiring, good content
Non-Example: 8720 3 2 3 Inspiring, good content
Non-Example: 8720 3-2-3Inspiring, good content

• Questions and answers
• Parting thoughts

E-mail me with comments or questions: dstarnes@lawrenceville.org
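The card-shuffling activity from Part II can also be carried out on a computer. The following is a minimal Python sketch under stated assumptions: it is not the applet used in the session, the names `rerandomize` and `simulated_p_value` are hypothetical, and the number of repetitions and the random seed are arbitrary choices. It pools the 21 improvement values, deals 10 "cards" to the unrestricted group and 11 to the sleep-deprived group, and records the difference in means each time.

```python
import random
from statistics import mean

# Improvement in reporting time (milliseconds), copied from the study data.
deprived = [-14.7, -10.7, -10.7, 2.2, 2.4, 4.5, 7.2, 9.6, 10.0, 21.3, 21.8]
unrestricted = [-7.0, 11.6, 12.1, 12.6, 14.5, 18.6, 25.2, 30.5, 34.5, 45.6]

# Observed statistic: unrestricted minus sleep deprived.
observed = mean(unrestricted) - mean(deprived)


def rerandomize(cards, n_first, rng):
    """Shuffle the cards, deal n_first to group 1, return the mean difference."""
    shuffled = cards[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)
    return mean(shuffled[:n_first]) - mean(shuffled[n_first:])


def simulated_p_value(reps=10_000, seed=1):
    """Estimate how often chance alone gives a difference >= the observed one."""
    rng = random.Random(seed)
    cards = unrestricted + deprived   # all 21 subjects' values in one bag
    tolerance = 1e-9                  # guards against floating-point ties
    hits = sum(
        rerandomize(cards, len(unrestricted), rng) >= observed - tolerance
        for _ in range(reps)
    )
    return hits / reps


p_value = simulated_p_value()
print(f"Observed difference: {observed:.2f} ms")
print(f"Estimated one-sided p-value: {p_value:.4f}")
```

Each repetition plays the role of one sticky note: if re-randomized differences as large as the observed one are rare, chance alone becomes an implausible explanation for the result.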
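For the Survivor example in Part I, the two inference routes named on the slide (simulation and traditional methods) can be sketched in Python as follows. This is an illustration under stated assumptions, not the session's applet: the function name `simulate_p_value`, the repetition count, and the seed are hypothetical choices. The hypotheses are taken to be H0: p = 0.35 versus Ha: p < 0.35, matching the editor's belief that 35% is too high. The Large Counts condition holds because 200(0.35) = 70 and 200(0.65) = 130 are both at least 10.

```python
import random
from math import sqrt
from statistics import NormalDist

p0 = 0.35        # claimed proportion of U.S. adults who have watched Survivor
n = 200          # sample size
successes = 60   # number in the sample who said "yes"
p_hat = successes / n


def simulate_p_value(reps=5_000, seed=1):
    """Fraction of simulated samples (assuming p = p0) with <= 60 successes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        count = sum(rng.random() < p0 for _ in range(n))
        if count <= successes:
            hits += 1
    return hits / reps


# Traditional one-sample z test for a proportion (one-sided, lower tail).
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value_z = NormalDist().cdf(z)

p_value_sim = simulate_p_value()
print(f"Sample proportion: {p_hat:.2f}")
print(f"z statistic: {z:.3f}")
print(f"p-value from z test: {p_value_z:.4f}")
print(f"p-value from simulation: {p_value_sim:.4f}")
```

The two p-values need not agree exactly, since the simulation works with the discrete count of successes while the z test uses a Normal approximation without a continuity correction.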