STAT 105
Real-Life Statistics:
Your Chance for Happiness
(or Misery)
?
History of Statistics 105
Wee Lee Loh
Linjuan Qian
Reetu Kumra
Yves Chretien
© 2008 Department of Statistics, Harvard University
Pedagogical Motivation
• To fill in the gap between intro-level courses and higher-level courses
  • Intro “service” courses jam-packed with tools
  • Higher-level courses require advanced maths
• To provide more depth and intuition
  • Useful for Masters and PhD students as well
• Gen-Ed introduction to statistics
• Unforeseen side benefit: The Happy Team
Outcomes (so far)
• Positive mid-term feedback
  • Every student would recommend it to future students
• The process of developing the course
  • Graduate School Dean is recommending institutionalized graduate seminars on designing new courses based on our model
• Attention to the subject and department
  • Media
    • Gazette
    • Crimson
  • Students
  • Administration
FINANCE
• What do you want to learn from this data?
• How do you summarize the data?
• How do you visualize the signal behind the noise?
FINANCE
• Would the “twistogram” idea work for the S&P 500 index over this extended time period?
ROMANCE
• The dating world is full of questions we would all love answers to:
• When you meet someone, should you play hard-to-get or make your attraction obvious?
• Where should you go on a first date?
• What is the best thing to do on the first date to impress your date?
• What are the important factors that make two people “click” …
ROMANCE
• Suppose you have been hired by a U.S. online dating company, and they want you to find out people’s opinions here in the U.S. about these questions.
• How would you go about collecting the information?
ROMANCE
Survey
Q: You just met someone, and are initially interested. Are you more likely to maintain/increase interest in the person if he/she plays hard-to-get, or if he/she is obvious about being into you?
(a) HARD TO GET (I prefer a person who initially plays hard-to-get)
(b) CLEARLY INTO ME (I prefer someone who makes it clear he/she is very into me)
[Clicker results chart; shares shown: 79%, 21%]
ROMANCE
• Suppose during your survey you fell in love with a Chinese person, and subsequently moved to China and now work for a Chinese online dating company.
• You want to impress your new boss (and your new love), so you decide to repeat your U.S. survey, which had 1000 subjects, in China.
ROMANCE
America has a population of about 304 million but China has a population of about 1.3 billion. How many people would you need to survey in China to get just as reliable results as in the U.S.?
1. 1000
2. 2000
3. 3000
4. 4000
5. > 4000
[Clicker results chart; shares shown: 43%, 28%, 11%, 8%, 9%]
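One way to explore the sample-size question numerically is sketched below. It is only a rough check under assumptions the slide does not state: simple random sampling, a 95% normal-approximation margin of error, and the worst case p = 0.5. The function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion under
    simple random sampling without replacement (assumed design)."""
    se = math.sqrt(p * (1 - p) / n)                         # standard error
    fpc = math.sqrt((population - n) / (population - 1))    # finite population correction
    return z * se * fpc

# Population figures are the ones quoted on the slide.
us = margin_of_error(1000, 304_000_000)
china = margin_of_error(1000, 1_300_000_000)

print(f"U.S.  (n=1000): +/- {us:.4%}")
print(f"China (n=1000): +/- {china:.4%}")
```

Under these assumptions both margins come out at about 3.1 percentage points, because the finite population correction is essentially 1 for populations this large: reliability is driven by the sample size, not the population size.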
MEDICAL
• How do you test whether a new drug is effective?
• Ideally, we perform a controlled clinical trial, by randomly assigning one group of people to take the drug, and another group to take a placebo.
• It needs to be double-blind.
• When such an experiment is not possible due to practical or ethical issues, what can go wrong?
MEDICAL
Kidney stone treatment
C. R. Charig, D. R. Webb, S. R. Payne, J. E. Wickham (March 1986)
Br Med J (Clin Res Ed) 292 (6524): 879–882.

              Treatment A      Treatment B
Overall       78% (273/350)    83% (289/350)

Treatment B is better, right?
WRONG!

              Treatment A      Treatment B
Small Stone   93% (81/87)      87% (234/270)
Large Stone   73% (192/263)    69% (55/80)

Simpson’s Paradox
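The reversal in the kidney stone tables can be checked directly from the counts. The sketch below uses only the numbers on the slide; the variable and function names are illustrative.

```python
# (successes, patients) per stone size and treatment, from the tables above
data = {
    "Small Stone": {"A": (81, 87), "B": (234, 270)},
    "Large Stone": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    """Success rate as a fraction between 0 and 1."""
    return successes / total

# Within each stone size, compare the two treatments.
for stone, arms in data.items():
    a, b = rate(*arms["A"]), rate(*arms["B"])
    print(f"{stone}: A {a:.0%} vs B {b:.0%}")

# Pool the two stone sizes by adding successes and patients per treatment.
pooled = {}
for treatment in ("A", "B"):
    successes = sum(data[stone][treatment][0] for stone in data)
    patients = sum(data[stone][treatment][1] for stone in data)
    pooled[treatment] = (successes, patients)

print(f"Overall: A {rate(*pooled['A']):.0%} vs B {rate(*pooled['B']):.0%}")
```

Treatment A has the higher success rate within each stone size, yet Treatment B has the higher rate once the groups are pooled, because A was used mostly on large stones (263 of 350 patients) and B mostly on small ones (270 of 350).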
Slope = # successful / # unsuccessful = odds
Small Stones
               Treatment A    Treatment B
Successful     81 (93%)       234 (87%)
Unsuccessful   6              36
Slope = # successful / # unsuccessful = odds
Large Stones
               Treatment A    Treatment B
Successful     192 (73%)      55 (69%)
Unsuccessful   71             25
Combined
               Treatment A    Treatment B
Successful     81+192=273     289
Unsuccessful   6+71=77        61
Combined
               Treatment A    Treatment B
Successful     273 (78%)      289 (83%)
Unsuccessful   77             61
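The "slope = odds" idea can be sketched by treating each group as a vector (unsuccessful, successful), so that combining stone sizes is vector addition. This reading of the slides' plots is an assumption; the counts are from the tables above and the helper names are illustrative.

```python
from fractions import Fraction

# Each group as a vector (unsuccessful, successful)
small = {"A": (6, 81), "B": (36, 234)}
large = {"A": (71, 192), "B": (25, 55)}

def odds(vec):
    """Slope of the vector: successful / unsuccessful."""
    unsuccessful, successful = vec
    return Fraction(successful, unsuccessful)

def add(u, v):
    """Component-wise vector addition."""
    return (u[0] + v[0], u[1] + v[1])

# Combining the stone sizes adds the two vectors for each treatment.
combined = {t: add(small[t], large[t]) for t in ("A", "B")}

for label, table in (("Small", small), ("Large", large), ("Combined", combined)):
    print(label, {t: round(float(odds(v)), 2) for t, v in table.items()})
```

A has the steeper slope for small stones and for large stones separately, but its combined vector is dominated by the long, shallow large-stone vector, while B's is dominated by its steep small-stone vector. That is how the combined slope for B can end up steeper.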
MEDICAL
• When and why does Simpson’s paradox occur?
• How do we deal with it?
LEGAL
• How is statistics an important part of our legal system?
• How might we use a statistic or probability as evidence in a trial?
• How are statistics often misinterpreted by lawyers and juries?
LEGAL
You have just been selected for jury duty. In 1996 in England, Denis Adams was a suspect in a rape trial. Listen closely to the details of the case and the arguments presented before deciding your verdict.
(We have simplified the actual case/arguments for the purpose of this illustration.)
LEGAL
Prosecution Argument
• Adams’ DNA profile matches that of evidence found at the scene of the crime
• If Adams is innocent, there is only a 1 in 20 million chance that his DNA would match that found at the crime scene
• Therefore, the probability Adams is innocent is only .00000005, hence the probability he is guilty is 1 minus that, .99999995. Thus Adams is guilty beyond the shadow of a doubt.
LEGAL
Defense Argument
• If the probability of a DNA match for any person is 1/20,000,000, then since there are 60 million people in England, there are on average 3 other people with this DNA type (in 1996).
• Since it is equally likely to be any of these others, the probability of Adams’ guilt is 1/3 = .33, which is not enough certainty to convict.
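The gap between the two arguments comes from confusing P(match | innocent) with P(innocent | match). A minimal Bayes' rule sketch follows, under strong simplifying assumptions that are not part of the actual case: every one of the 60 million people is a priori equally likely to be the source, the true source always matches, and there is no other evidence. The function name `posterior_guilt` is illustrative.

```python
def posterior_guilt(match_prob, population):
    """P(guilty | DNA match), assuming a uniform prior over `population`
    people and a certain match for the true source (both assumptions)."""
    prior = 1 / population
    # Bayes' rule:
    # P(G | M) = P(M | G) P(G) / [P(M | G) P(G) + P(M | not G) P(not G)]
    numerator = 1.0 * prior
    denominator = numerator + match_prob * (1 - prior)
    return numerator / denominator

# Figures quoted on the slides: 1-in-20-million match chance, 60 million people.
p = posterior_guilt(match_prob=1 / 20_000_000, population=60_000_000)
print(f"P(guilty | match) = {p:.3f}")
```

With these inputs the posterior is about 0.25 (Adams plus roughly three expected other matches), which is close to the defense's simplified 1/3 and nowhere near the prosecution's .99999995. Any other evidence would change the prior and therefore the result.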
LEGAL
Defense Argument
• In an identity lineup, the victim failed to pick out Adams
• Victim described an attacker in his 20s
• Adams is 37
• Victim guessed Adams to be about 40
• Adams had an alibi for the night of the crime (he spent the night with his girlfriend)
LEGAL
Would you convict Adams?
1. Yes
2. No
[Clicker results chart; shares shown: 53%, 47%]
LEGAL
1) What is the probability that you drive into a tree given that you are drunk?
2) What is the probability that you are drunk given that you drive into a tree?
Why is it important to distinguish them?
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what do you think is the probability there is champagne inside?
(a) 0 - .2
(b) .21 - .4
(c) .41 - .6
(d) .61 - .8
(e) .81 - 1
[Clicker results chart; shares shown: 53%, 30%, 14%, 0%, 2%]
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what do you think is the probability there is champagne inside?
(a) 0 - .2
(b) .21 - .4
(c) .41 - .6
(d) .61 - .8
(e) .81 - 1
[Clicker results chart; shares shown: 36%, 29%, 21%, 14%, 0%]
WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker results chart; shares shown: 35%, 24%, 12%, 12%, 9%, 9%]
WINE AND CHOCOLATE
Let’s collect some data!
WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker results chart; shares shown: 100%, 0%]
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what is your best guess for the probability of champagne inside?
(a) 0
(b) .1
(c) .2
(d) .3
(e) .4
(f) .5
(g) .6
(h) .7
(i) .8
(j) .9
WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker results chart; shares shown: 24%, 21%, 21%, 14%, 10%, 10%]
Let’s collect more data!
WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker results chart; shares shown: 89%, 11%]
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what is your best guess for the probability of champagne inside?
(a) 0
(b) .1
(c) .2
(d) .3
(e) .4
(f) .5
(g) .6
(h) .7
(i) .8
(j) .9
WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker results chart; shares shown: 26%, 22%, 22%, 17%, 9%, 4%]
And even more data…
WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker results chart; shares shown: 83%, 17%]
WINE AND CHOCOLATE
What happens as you accumulate more data?
1) Your estimates become more accurate
2) You can narrow in on your interval prediction (your uncertainty decreases)
3) In this case, you get to enjoy chocolate!
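How an interval narrows with more data can be illustrated with a small calculation. The sketch below assumes a 95% normal-approximation interval for a proportion and uses hypothetical counts (one chocolate in five containing champagne at every sample size); these are not the class's clicker data.

```python
import math

def interval_width(successes, n, z=1.96):
    """Estimated proportion and width of an approximate 95% interval
    (normal approximation, an assumption of this sketch)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, 2 * half_width

widths = []
for n in (10, 40, 160, 640):
    # Hypothetical data: exactly 20% of the n chocolates contain champagne.
    p_hat, width = interval_width(successes=n // 5, n=n)
    widths.append(width)
    print(f"n = {n:3d}: estimate {p_hat:.2f}, interval width {width:.3f}")
```

With the estimate held fixed, the width shrinks in proportion to 1/sqrt(n), so quadrupling the number of chocolates sampled halves the interval.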
“Life is like a box of chocolates… you never know what you’re going to get.”
BUT YOU CAN ESTIMATE IT!
(especially after you take STAT 105!)
http://movies.aol.com//movie/forrest-gump/1036/video/tom-hanks-greatest-moments/1138699
Things We Do Differently …
• Student/Faculty course design collaboration
• Modules, allowing “out of sequence” teaching in terms of technical material
• The use of “Clickers” (Personal Response Devices)
• Module-based team projects and project presentations
• Module-based guest lecturers
• Assessment
  • Peer evaluation
  • Assignments, projects, no traditional exams
Module-Based Approach (MBA)
Challenges
• Time management
  • Structured material vs “improvised” discussions
  • So much material, so little time
• Student team dynamics
• Prerequisites
  • Can we offer STAT 105 without prerequisites?
• Funding for course material
  • e.g. wine and chocolate
  • Outside speaker expenses
• Scaling to a (much) larger class size in the future
Future Happiness …
• Developing more modules
  • Sports
  • Nutrition
  • ……
• Prepare a multimedia-based teaching package
  • Text book
  • Website
• Similar courses aimed at different levels
  • More advanced
  • Less advanced
• Build more Happy Teams!
Thanks much!
And we welcome your
feedback!