Chapter 1 Statistical Thinking - Department of Statistics and Probability

Chapter 1 Statistical Thinking
• What is statistics?
• Why do we study statistics?
Statistical Thinking
• the science of collecting, organizing, and analyzing data
• the mathematics of the collection, organization and
interpretation of numerical data
• the branch of mathematics concerned with the study of the
methods of collecting and analyzing data
• a branch of applied mathematics concerned with the collection
and interpretation of quantitative data and the use of
probability theory to estimate population parameters
Statistical Thinking
Statistics is a discipline which is concerned
with:
– designing experiments and other data
collection,
– summarizing information to aid
understanding,
– drawing conclusions from data, and
– estimating the present or predicting the
future.
Statistical Thinking
• "I like to think of statistics as the science of
learning from data...." Jon Kettenring, ASA
President, 1997
• Steps of statistical analysis involve:
– collecting information (Data Collection)
– evaluating the information (Data Analysis)
– drawing conclusions (Statistical Inference)
Statistical Thinking
• What type of information?
– A test group's favorite amount of sweetness in a blend of fruit juices
– The number of men and women hired by a city government
– The velocity of a burning gas on the sun's surface
– Clinical trials to investigate the effectiveness of new treatments
– Field experiments to evaluate irrigation methods
– Measurements of water quality
Statistical Thinking
Problems
• Is a new treatment for heart disease more
effective than a standard one?
• Is using a high octane gas beneficial to car
performance?
• Does reading an article in statistics
improve students’ statistics grade?
Statistical Thinking
• Is a new treatment for heart disease more
effective than a standard one?
– Pick, say, 100 heart patients
– Divide them into two groups, 50 in each group
– Group 1: new treatment
– Group 2: standard treatment
Statistical Thinking
Results
• 40 out of 50 of Group 1 patients improved
• 30 out of 50 of Group 2 patients improved
• Conclusion: New treatment is more
effective!
Statistical Thinking
• How do you divide the patients?
• Have you controlled for other factors? (fitness
level, lifestyle, age, etc.)
• How do you decide who gets which
treatment? Ethical issues?
Statistical Thinking
Comparing Test Scores
• Select 10 students and give them a journal
article in statistics.
• Test their knowledge about the article and
record their scores
• Repeat the test after they take STT 231.
Statistical Thinking
Result
• 8 out of the 10 students improved their
scores.
• Question: Can we conclude that reading
the article has improved students’
knowledge about statistics?
Statistical Thinking
Look at worst case scenarios:
“Under the assumption that the new
treatment is no better than the standard one,
what is the chance that 80% of the patients
benefit from this treatment?”
“Under the assumption that STT 231 brings
no benefit, how likely is it that we see 80%
of the students improve their scores?”
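One way to put a number on the first question is a binomial model. The sketch below assumes, purely for illustration, that if the new treatment is no better than the standard one, each of the 50 Group 1 patients improves independently with probability 0.6, the rate observed in Group 2 (30 out of 50). That rate, the independence assumption, and the helper name `binomial_tail` are illustrative choices, not part of the original slides.

```python
from math import comb

def binomial_tail(n, k, p):
    """Chance of k or more successes in n independent trials with success chance p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Assumption: the standard treatment's improvement rate is taken to be 30/50 = 0.6.
p_standard = 30 / 50

# Chance that 40 or more of the 50 Group 1 patients improve if the new
# treatment is no better than the standard one.
chance = binomial_tail(50, 40, p_standard)
print(f"Chance of 40 or more improvements out of 50: {chance:.4f}")
```

A small value here would mean that an 80% improvement rate is hard to explain by chance alone under this model.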
Statistical Thinking
Need a model to answer these questions!!
If STT 231 is not beneficial, then students’
scores may go up or down with 50%
chance.
This is equivalent to flipping a coin:
• 50% chance you get Head
• 50% chance you get Tail
Statistical Thinking
• Comparing pre and post test scores for 10
students is equivalent to
– flipping a coin 10 times and calculating the chance of
observing 8H
• Relevant Questions:
– Will the chance of observing H 80% of the time
depend on the number of students involved in the
experiment?
– Will this chance go up, down or remain the same if
you repeat the experiment with 200 students?
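The two questions above can be explored with a short calculation. The sketch below assumes the fair-coin model (each score goes up or down with 50% chance, independently) and computes the chance of seeing exactly 80% heads for 10 and for 200 flips. The function name `chance_of_heads` is a hypothetical helper.

```python
from math import comb

def chance_of_heads(n, heads):
    """Chance of exactly `heads` heads in n flips of a fair coin."""
    return comb(n, heads) / 2**n

for n in (10, 200):
    heads = n * 8 // 10  # 80% of the flips
    print(f"{n} flips, {heads} heads: chance = {chance_of_heads(n, heads):.3g}")
```

Under this model the chance of observing 80% heads is far smaller with 200 flips than with 10, so the number of students matters.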
Statistical Thinking
• Suppose the chance of observing 8 heads in
10 flips is 4.4%. What does this mean?
– If STT 231 is not beneficial, then there is a
4.4% chance that we will observe 8 out of 10
students’ scores improve.
– There is little chance that 8 students’ scores will
improve just by CHANCE
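The 4.4% figure can be checked by simulation. The sketch below uses only Python's standard `random` module, repeats the 10-flip experiment many times, and counts how often exactly 8 heads appear. The number of repetitions and the seed are arbitrary choices, and the chance of 8 or more heads is reported alongside for comparison.

```python
import random

def simulate(num_experiments=100_000, flips=10, seed=1):
    """Estimate the chance of exactly 8 and of at least 8 heads in `flips` fair flips."""
    rng = random.Random(seed)
    exactly_8 = 0
    at_least_8 = 0
    for _ in range(num_experiments):
        # Each flip is heads with probability 0.5.
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if heads == 8:
            exactly_8 += 1
        if heads >= 8:
            at_least_8 += 1
    return exactly_8 / num_experiments, at_least_8 / num_experiments

p_exact, p_tail = simulate()
print(f"Exactly 8 heads: {p_exact:.3f}")
print(f"8 or more heads: {p_tail:.3f}")
```

The exact values are 45/1024 (about 4.4%) for exactly 8 heads and 56/1024 (about 5.5%) for 8 or more, so the simulated estimates should land close to these.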
Statistical Thinking
• Suppose the chance of observing 8 heads in
10 flips is 4.4%.
• We observed 8 students’ scores out of 10
improve.
• What does this mean?
Statistical Thinking
• Course is highly effective
• Course is ineffective and we observed an
unlikely event.
• We do not know which one!
Statistical Thinking
• Suppose there is only a “small” chance that an
observed change happens by CHANCE alone.
• Then this is strong evidence that the change we observe
did not happen by CHANCE.
• Hence there is strong evidence that some
factor is responsible for this change.
Statistical Thinking
• The course is highly effective!!
• Reasoning: What we observed is very
unlikely if the course were ineffective.
Hence the course is effective.
• An improvement in 80% of the students’ scores is
unlikely if the course were ineffective.
Statistical Thinking
Some Remarks
For questions that involve uncertainty:
– Carefully formulate the question you want to answer
(Modeling)
– Collect Data
– Summarize, analyze and present data
– Draw Conclusions. Conclusions always include
uncertainty
– Support your conclusions by quantifying how
confident you are about your conclusions.
Chapter 2 A Design Example
The Polio Vaccine Case
• Caused by a virus
• Especially deadly in children
• Big problem during the first half of the 20th century
• Develop a vaccine to fight the disease
• Jonas Salk (~1950)
A Design Example
• Problem with vaccines:
– Are they safe?
– Are they effective?
• Undertake a large scale trial to answer
these questions
A Design Example
• Case 1: A Simple Study
– Distribute the vaccine widely (under the
assumption it is safe)
– Decrease in the number of polio cases after
the vaccine provides evidence that the
vaccine is effective
• Problem?????
A Design Example
Problems
• Lack of control group
– Is the decrease in the number of polio cases due to the
vaccine or to other factors?
• How reliable is the assumption “vaccine is
safe”?
A Design Example
• Case 2: Adding a Control Group
– Have two groups
• Control group: gets salt solution
• Treatment group: gets the actual vaccine
A Design Example
• Example (Observed Control Study)
– Control group: all 1st and 3rd grade children
– Treatment group: all 2nd graders
• Assumption:
– Age difference between control and treatment
group was felt to be unimportant
A Design Example
• Potential Problems:
– Parents of 2nd graders may not agree to
vaccinating their kids
– Parents of sicker kids are most likely to accept
the vaccine
– More educated parents tend to accept the
vaccine
– Parents of sick 1st and 3rd graders may object
that their kids are not getting treatment
A Design Example
• Difficulty in diagnosing polio
– Extreme cases of polio are easy to diagnose
– Less severe cases of polio have symptoms
similar to other common illnesses
A Design Example
• Potential Problems
– Physicians are aware of who has received the
vaccine and who has not
– A less severe case of polio in a 2nd grader (who
has received the vaccine) may be wrongly
diagnosed as another illness
– A less severe case in a 1st or 3rd grader will
most likely be diagnosed as polio
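The effect of unblinded diagnosis can be illustrated with a small simulation. In the sketch below the vaccine is assumed to do nothing at all, yet the vaccinated group still tends to show fewer diagnosed cases, because mild cases in that group are sometimes labeled as another illness. Every number in it (case rate, share of mild cases, misdiagnosis chance, group size) is a made-up assumption for illustration, not data from the polio trial.

```python
import random

def diagnosed_cases(n_children, vaccinated, rng,
                    case_rate=0.01, mild_share=0.5, miss_chance=0.4):
    """Count diagnosed polio cases when the vaccine has no real effect."""
    diagnosed = 0
    for _ in range(n_children):
        if rng.random() >= case_rate:
            continue  # this child does not get polio
        is_mild = rng.random() < mild_share
        # Bias: a physician who knows the child was vaccinated may label
        # a mild case as some other illness.
        if vaccinated and is_mild and rng.random() < miss_chance:
            continue
        diagnosed += 1
    return diagnosed

rng = random.Random(0)
n = 100_000  # hypothetical group size
control = diagnosed_cases(n, vaccinated=False, rng=rng)
treated = diagnosed_cases(n, vaccinated=True, rng=rng)
print(f"Diagnosed cases, control group:    {control}")
print(f"Diagnosed cases, vaccinated group: {treated}")
```

Both groups have the same true case rate in this sketch, so any gap in diagnosed cases comes from the diagnosis step rather than from the vaccine.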
A Design Example
• Case 3: Randomization, Placebo Control,
Double Blindness
– Random assignment of control and treatment
groups
• Select a child
• Flip a coin: heads (H) goes to the Treatment Group,
tails (T) goes to the Control Group
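The coin-flip assignment can be sketched in a few lines of code. The example below uses Python's standard `random` module in place of a physical coin. The child identifiers, the seed, and the function name `assign_groups` are hypothetical and only serve the illustration.

```python
import random

def assign_groups(children, seed=None):
    """Assign each child to treatment (heads) or control (tails) by a fair coin flip."""
    rng = random.Random(seed)
    treatment, control = [], []
    for child in children:
        if rng.random() < 0.5:   # heads
            treatment.append(child)
        else:                    # tails
            control.append(child)
    return treatment, control

children = [f"child_{i}" for i in range(1, 21)]  # hypothetical IDs
treatment, control = assign_groups(children, seed=42)
print("Treatment:", treatment)
print("Control:  ", control)
```

Because each child is assigned independently, the two groups need not be the same size. What matters is that neither parents nor researchers choose who ends up in which group.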
A Design Example
• Placebo Control
– Kids in the control group receive salt solution
• Double Blind
– Neither the child, nor the parents, nor the doctors/nurses
who make the diagnosis of polio know whether a
kid receives the vaccine or the placebo
A Design Example
Summary
• In designing experiments
– Introduce some sort of control group
– Use randomization to avoid bias in selection and
assignment of subjects for the study
– Double blind experiments give protection against
biases, both intentional and unintentional
A Design Example
• Perform the experiment on a large number
of subjects (in the polio case, on the order of millions of kids)
• Repeat the experiment several times
before making definitive conclusions
A Design Example
Basic Principles of Experimental Designs
• Randomization
• Blocking (Treatment/Control Groups)
• Replication