STAT 105
Real-Life Statistics: Your Chance for Happiness (or Misery)?
© 2008 Department of Statistics, Harvard University

History of Statistics 105
Wee Lee Loh
Linjuan Qian
Reetu Kumra
Yves Chretien

Pedagogical Motivation
• To fill in the gap between intro-level courses and higher-level courses
  • Intro “service” courses are jam-packed with tools
  • Higher-level courses require advanced maths
• To provide more depth and intuition
• Useful for Masters and PhD students as well
• Gen-Ed introduction to statistics
• Unforeseen side benefit: The Happy Team

Outcomes (so far)
• Positive mid-term feedback
• The process of developing the course
• Every student would recommend it to future students
• Graduate School Dean is recommending an institutionalized graduate seminar on designing new courses based on our model
• Attention to the subject and department
  • Media (Gazette, Crimson)
  • Students
  • Administration

FINANCE
• What do you want to learn from this data?
• How do you summarize the data?
• How do you visualize the signal behind the noise?

FINANCE
• Would the “twistogram” idea work for the S&P 500 index over this extended time period?

ROMANCE
• The dating world is full of questions we would all love answers to:
  • When you meet someone, should you play hard-to-get or make your attraction obvious?
  • Where should you go on a first date?
  • What is the best thing to do on the first date to impress your date?
  • What are the important factors that make two people “click” …

ROMANCE
• Suppose you have been hired by a U.S. online dating company, and they want you to find out people’s opinions in the U.S. about these questions.
• How would you go about collecting the information?

ROMANCE
Survey
Q: You just met someone, and are initially interested. Are you more likely to maintain/increase interest in the person if he/she plays hard-to-get, or if he/she is obvious about being into you?
(a) HARD TO GET (I prefer a person who initially plays hard-to-get)
(b) CLEARLY INTO ME (I prefer someone who makes it clear he/she is very into me)
[Clicker poll bar chart: response shares 79% and 21%]

ROMANCE
• Suppose during your survey you fell in love with a Chinese person, and subsequently moved to China and now work for a Chinese online dating company.
• You want to impress your new boss (and your new love), so you decide to repeat your U.S. survey, which had 1000 subjects, in China.

ROMANCE
America has a population of about 304 million but China has a population of about 1.3 billion. How many people would you need to survey in China to get just as reliable results as in the U.S.?
1. 1000
2. 2000
3. 3000
4. 4000
5. > 4000
[Clicker poll bar chart: response shares 43%, 28%, 11%, 8%, 9%]

MEDICAL
• How do you test whether a new drug is effective?
• Ideally, we perform a controlled clinical trial, randomly assigning one group of people to take the drug and another group to take a placebo.
• It needs to be double-blinded.
• When such an experiment is not possible due to practical or ethical issues, what can go wrong?

MEDICAL
Kidney stone treatment
C. R. Charig, D. R. Webb, S. R. Payne, O. E. Wickham (March 1986). Br Med J (Clin Res Ed) 292 (6524): 879–882.

               Treatment A      Treatment B
Combined       78% (273/350)    83% (289/350)

Treatment B is better, right? WRONG!

               Treatment A      Treatment B
Small Stone    93% (81/87)      87% (234/270)
Large Stone    73% (192/263)    69% (55/80)

Simpson’s Paradox

[Plots: successful vs. unsuccessful counts for each treatment. Slope = # successful / # unsuccessful = odds]

Small Stones
               Treatment A      Treatment B
Successful     81 (93%)         234 (87%)
Unsuccessful   6                36

Large Stones
               Treatment A      Treatment B
Successful     192 (73%)        55 (69%)
Unsuccessful   71               25

Combined
               Treatment A          Treatment B
Successful     81+192=273 (78%)     289 (83%)
Unsuccessful   6+71=77              61

MEDICAL
• When and why does Simpson’s paradox occur?
• How do we deal with it?

LEGAL
• How is statistics an important part of our legal system?
• How might we use a statistic or probability as evidence in a trial?
• How are statistics often misinterpreted by lawyers and juries?

LEGAL
You have just been selected for jury duty.
In 1996 in England, Denis Adams was a suspect in a rape trial. Listen closely to the details of the case and the arguments presented before deciding your verdict.
(We have simplified the actual case/arguments for the purpose of this illustration.)
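Looking back at the kidney stone tables in the MEDICAL module, the reversal can be checked directly from the counts. The snippet below is only a sketch for illustration: the counts come from the slides, while the helper names (`success_rate`, `odds`, `combined`) are made up here and are not part of the course material.

```python
# Counts taken from the kidney stone slides: (successful, total patients).
counts = {
    "Small stones": {"A": (81, 87), "B": (234, 270)},
    "Large stones": {"A": (192, 263), "B": (55, 80)},
}


def success_rate(successful, total):
    """Fraction of patients treated successfully."""
    return successful / total


def odds(successful, total):
    """Odds as defined on the slides: # successful / # unsuccessful."""
    return successful / (total - successful)


def combined(treatment):
    """Pool both stone sizes for one treatment (hypothetical helper)."""
    successful = sum(counts[size][treatment][0] for size in counts)
    total = sum(counts[size][treatment][1] for size in counts)
    return successful, total


# Within each stone size, compare the two treatments.
for size, arms in counts.items():
    for treatment, (s, n) in arms.items():
        print(f"{size:12s} {treatment}: rate={success_rate(s, n):.2f} odds={odds(s, n):.2f}")

# After pooling the stone sizes, the ordering of the treatments flips.
for treatment in ("A", "B"):
    s, n = combined(treatment)
    print(f"{'Combined':12s} {treatment}: rate={success_rate(s, n):.2f} odds={odds(s, n):.2f}")
```

The counts also hint at why the reversal happens: Treatment A was given mostly to large-stone patients (263 of 350), the group with lower success rates under either treatment, while Treatment B was given mostly to small-stone patients (270 of 350).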
LEGAL
Prosecution Argument
• Adams’ DNA profile matches that of evidence found at the scene of the crime
• If Adams is innocent, there is only a 1 in 20 million chance that his DNA would match that found at the crime scene
• Therefore, the probability Adams is innocent is only .00000005, hence the probability he is guilty is 1 minus that, .99999995. Thus Adams is guilty beyond the shadow of a doubt.

LEGAL
Defense Argument
• If the odds of a DNA match for any person are 1/20,000,000, then since there are 60 million people in England, there are on average 3 other people with this DNA type (in 1996).
• Since it is equally likely to be any of these others, the probability of Adams’ guilt is 1/3 = .33, which is not enough certainty to convict.

LEGAL
Defense Argument
• In an identity line-up, the victim failed to pick out Adams
• The victim described an attacker in his 20s
  • Adams is 37
  • The victim guessed Adams to be about 40
• Adams had an alibi for the night of the crime (he spent the night with his girlfriend)

LEGAL
Would you convict Adams?
1. Yes
2. No
[Clicker poll bar chart: response shares 53% and 47%]

LEGAL
1) What is the probability that you drive into a tree given that you are drunk?
2) What is the probability that you are drunk given that you drive into a tree?
Why is it important to distinguish them?

WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what do you think is the probability there is champagne inside?
(a) 0 - .2
(b) .21 - .4
(c) .41 - .6
(d) .61 - .8
(e) .81 - 1
[Clicker poll bar chart: response shares 53%, 30%, 14%, 0%, 2%]
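Returning briefly to the LEGAL module: the gap between the prosecution's and the defense's numbers is the gap between P(match given innocent) and P(innocent given match). The sketch below works through Bayes' rule under strong simplifying assumptions that are ours, not the course's: exactly one person in a population of 60 million is the source, every person is equally likely a priori, the true source always matches, and all non-DNA evidence is ignored. The function name is hypothetical.

```python
from fractions import Fraction


def posterior_source_probability(match_prob, population):
    """P(suspect is the source | DNA match) under a uniform prior.

    Assumptions (illustrative only): exactly one source in the population,
    the true source matches with probability 1, and any other person
    matches with probability `match_prob`.
    """
    prior = Fraction(1, population)
    # Bayes' rule: P(source | match) = P(match | source) * P(source) / P(match)
    numerator = prior * 1
    denominator = numerator + (1 - prior) * match_prob
    return numerator / denominator


match_prob = Fraction(1, 20_000_000)   # figure quoted by the prosecution
population = 60_000_000                # figure quoted by the defense

# The prosecution's number is P(match | innocent) ...
print("P(match | innocent) =", float(match_prob))

# ... but the jury needs P(source | match), which depends on the prior.
posterior = posterior_source_probability(match_prob, population)
print("P(source | match)   =", float(posterior))
```

Under these assumptions the posterior comes out near 1/4: Adams plus the roughly 3 other expected matches. The defense slide's 1/3 is a simplified version of the same idea. Either way the figure is nowhere near the prosecution's .99999995, which is the same mix-up as confusing "drunk given tree" with "tree given drunk".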
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what do you think is the probability there is champagne inside?
(a) 0 - .2
(b) .21 - .4
(c) .41 - .6
(d) .61 - .8
(e) .81 - 1
[Clicker poll bar chart: response shares 36%, 29%, 21%, 14%, 0%]

WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker poll bar chart: response shares 35%, 24%, 12%, 12%, 9%, 9%]

WINE AND CHOCOLATE
Let’s collect some data!

WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker poll bar chart: response shares 100% and 0%]

WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what is your best guess for the probability of champagne inside?
(a) 0  (b) .1  (c) .2  (d) .3  (e) .4  (f) .5  (g) .6  (h) .7  (i) .8  (j) .9

WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker poll bar chart: response shares 24%, 21%, 21%, 14%, 10%, 10%]
Let’s collect more data!

WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker poll bar chart: response shares 89% and 11%]

WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what is your best guess for the probability of champagne inside?
(a) 0  (b) .1  (c) .2  (d) .3  (e) .4  (f) .5  (g) .6  (h) .7  (i) .8  (j) .9

WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker poll bar chart: response shares 26%, 22%, 22%, 17%, 9%, 4%]
And even more data…

WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker poll bar chart: response shares 83% and 17%]

WINE AND CHOCOLATE
What happens as you accumulate more data?
1) Your estimates become more accurate
2) You can narrow in on your interval prediction (your uncertainty decreases)
3) In this case, you get to enjoy chocolate!

“Life is like a box of chocolates… you never know what you’re going to get.”
BUT YOU CAN ESTIMATE IT!
(especially after you take STAT 105!)
http://movies.aol.com//movie/forrest-gump/1036/video/tom-hanks-greatest-moments/1138699

Things We Do Differently …
• Student/Faculty course design collaboration
• Modules, allowing “out of sequence” teaching in terms of technical material
• The use of “Clickers” (Personal Response Devices)
• Module-based team projects and project presentations
• Module-based guest lecturers
• Assessment
  • Peer evaluation
  • Assignments, projects, no traditional exams

Module-Based Approach (MBA) Challenges
• Time management
  • Structured material vs “improvised” discussions
  • So much material, so little time
• Student team dynamics
• Prerequisites
  • Can we offer stat105 without prerequisites?
• Funding for course material, e.g. wine and chocolate
• Outside speaker expenses
• Scaling to a (much) larger class size in the future

Future Happiness …
• Developing more modules
  • Sports
  • Nutrition
  • ……
• Prepare a multimedia-based teaching package
  • Textbook
  • Website
• Similar courses aimed at different levels
  • More advanced
  • Less advanced
• Build more Happy Teams!
Thanks much!
And we welcome your feedback!
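A worked footnote to two of the modules above. The ROMANCE question (how many people to survey in China) and the WINE AND CHOCOLATE observation (intervals narrow as data accumulate) both come down to how the margin of error of an estimated proportion depends on the sample size. The sketch below makes several assumptions of ours: simple random sampling, the usual normal approximation with a 95% multiplier of 1.96, and the worst-case proportion 0.5. The function name `margin_of_error` is made up for illustration.

```python
import math


def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for an estimated proportion.

    Assumes simple random sampling and the normal approximation.
    If `population` is given, the finite population correction is applied.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# Same sample size, very different population sizes (figures from the slides).
us = margin_of_error(1000, population=304_000_000)
china = margin_of_error(1000, population=1_300_000_000)
print(f"U.S.,  n=1000: +/- {us:.4f}")
print(f"China, n=1000: +/- {china:.4f}")

# More data narrows the interval: quadrupling n halves the margin of error.
for n in (10, 40, 160, 640):
    print(f"n={n:4d}: +/- {margin_of_error(n):.3f}")
```

Under these assumptions the population size barely matters once it is far larger than the sample, so a survey of about 1000 people is roughly as reliable in China as in the U.S. The interval width shrinks in proportion to 1 over the square root of n.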