Overview of How To Lie With Statistics by Darrell Huff

With additional insights
Chapter 1 - Sampling Biases
• Response Bias: Tendency for people to over- or
under-state the truth
• Non-response: People who complete surveys are
systematically different from those who fail to
respond (e.g. because of accessibility or pride).
• Representative Sample: One where all sources of
bias have been removed. (Counterexample: the 1936
Literary Digest poll)
• Questionnaire wording/Interviewer effects
• Recall Bias: Tendency for one group to remember
prior exposure more readily than another group in
retrospective studies
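The non-response bullet above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 0-10 "satisfaction scores" and the response rule (chance of responding proportional to the score) are hypothetical and are not taken from Huff's book.

```python
import random
import statistics

def simulate_nonresponse(n=10_000, seed=0):
    """Return (population mean, mean among respondents) for a toy survey."""
    rng = random.Random(seed)
    # Hypothetical population: scores spread uniformly over 0-10.
    population = [rng.uniform(0, 10) for _ in range(n)]
    # Assumed response mechanism: higher scores are more likely to respond.
    respondents = [x for x in population if rng.random() < x / 10]
    return statistics.mean(population), statistics.mean(respondents)

true_mean, observed_mean = simulate_nonresponse()
print(f"population mean: {true_mean:.2f}")
print(f"respondent mean: {observed_mean:.2f}")
```

Under these assumptions the respondent mean sits well above the population mean, even though every individual answer is truthful.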
Chapter 2 - Well-Chosen Average
• Arithmetic Mean: Evenly distributes the total
among individuals. Can be unrepresentative
when measurements are highly skewed right.
(e.g. per capita income)
• Median: Value dividing distribution into two
equal parts. 50th percentile. (e.g. median
household income)
• Mode: Most frequently observed outcome (rarely
reported with numeric data)
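A short sketch of how the three averages can disagree on right-skewed data. The income figures are made up for illustration only.

```python
import statistics

# Hypothetical incomes: six modest earners and one very high earner.
incomes = [20_000, 25_000, 25_000, 30_000, 35_000, 40_000, 1_000_000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle value of the sorted list
mode_income = statistics.mode(incomes)      # most frequent value

print(f"mean:   {mean_income:,.0f}")
print(f"median: {median_income:,.0f}")
print(f"mode:   {mode_income:,.0f}")
```

Here the mean is more than five times the median, so which "average" gets reported changes the story considerably.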
Chapter 3 - Little Figures Not There
• Small samples: Estimators with large standard
errors, can provide seemingly very strong effects
• Low incidence rates: Need very large samples for
meaningful estimates of low frequency events
• Significance levels/margins of error: Measures of
the strength and precision of inference
• Ranges: Report ranges or standard deviations
along with means (e.g. “normal” ranges)
• Inferences about individuals versus inferences about populations
• Clearly label chart axes
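The points on small samples and low incidence rates can be made concrete with the standard error of a sample proportion, sqrt(p(1 - p)/n). The sketch below assumes simple random sampling and a normal approximation; the helper names and the 10% relative-precision target are illustrative choices, not from the source.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

def sample_size_for_relative_margin(p, relative_margin, z=2.0):
    """Smallest n with z * SE <= relative_margin * p (normal approximation)."""
    return math.ceil(z**2 * (1 - p) / (p * relative_margin**2))

# Small sample: a 50% proportion estimated from 25 people is very noisy.
print(f"SE with n=25: {standard_error(0.5, 25):.3f}")

# Estimating a rate to within 10% of its own value:
n_common = sample_size_for_relative_margin(0.5, 0.10)    # common event
n_rare = sample_size_for_relative_margin(0.001, 0.10)    # rare event
print(f"n needed for p=0.5:   {n_common:,}")
print(f"n needed for p=0.001: {n_rare:,}")
```

Under these assumptions the rare event needs roughly a thousand times as many observations as the common one for the same relative precision.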
Chapter 4 - Much Ado About Nothing
• Probable Error: Bound on the estimation error that
holds with probability 0.5. If estimator is
approximately normal, PE is approximately 0.675
standard errors. (Old school)
• Margin of Error: Bound on the estimation error that
holds with probability 0.95. If estimator is
approximately normal, the margin of error is
approximately 2 standard errors
• Clinical (practical) significance: In very large
samples an effect may be significant statistically,
but not in a practical sense. Report confidence
intervals as well as P-values.
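The multipliers quoted above can be checked from the standard normal distribution, and the large-sample point can be sketched with a two-group comparison. The numbers used (a 0.5-point difference, standard deviation 15, one million people per group) are hypothetical, and the z-test with a known common standard deviation is a simplifying assumption.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

# Multipliers of the standard error for a normal estimator.
pe_multiplier = std_normal.inv_cdf(0.75)    # central 50% -> probable error
moe_multiplier = std_normal.inv_cdf(0.975)  # central 95% -> margin of error

def z_test_summary(diff, sd, n_per_group):
    """Standard error, two-sided P-value, and 95% CI for a mean difference."""
    se = sd * math.sqrt(2 / n_per_group)
    z = diff / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    ci = (diff - moe_multiplier * se, diff + moe_multiplier * se)
    return se, p_value, ci

se, p_value, ci = z_test_summary(diff=0.5, sd=15, n_per_group=1_000_000)
print(f"PE multiplier:  {pe_multiplier:.3f}")
print(f"MoE multiplier: {moe_multiplier:.3f}")
print(f"P-value: {p_value:.3g}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

In this sketch the P-value is vanishingly small, yet the confidence interval shows the difference is about half a point on a scale whose standard deviation is 15, which is why the interval is worth reporting alongside the P-value.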
Chapter 5 - Eye-Catching Graphs
• Choice of ranges on graphs can have huge
impact on interpretation (e.g. percent change)
• Choice of proportion of y-axis to x-axis can
distort as well (very easy to do with modern
software)
• Can also distort bar charts by starting the axis
at a positive value rather than 0 and/or trimming
away the part of each bar below an artificial
baseline
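A minimal sketch of how a non-zero baseline exaggerates a difference between two bars. The values 100 and 110 and the baseline of 95 are made-up numbers, and `drawn_height_ratio` is a hypothetical helper.

```python
def drawn_height_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)

honest = drawn_height_ratio(100, 110)                 # axis starts at 0
truncated = drawn_height_ratio(100, 110, baseline=95) # axis starts at 95

print(f"true ratio of values:          {110 / 100:.2f}")
print(f"bar-height ratio, baseline 0:  {honest:.2f}")
print(f"bar-height ratio, baseline 95: {truncated:.2f}")
```

With the axis starting at 95, a 10% difference is drawn as a bar three times as tall.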
Chapter 6 - 1-D Pictures
• Bar Charts and Pictorial Graphs should have
areas proportional to values (only make
comparisons in one dimension)
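The area rule above can be sketched numerically: if a picture's height and width are both scaled by the ratio of the values, its area grows by the square of that ratio. The helper names below are hypothetical.

```python
import math

def area_ratio_when_scaling_both_dimensions(value_ratio):
    """Area ratio if height AND width are both scaled by the value ratio."""
    return value_ratio ** 2

def linear_scale_for_proportional_area(value_ratio):
    """Scale factor for each dimension so that area matches the value ratio."""
    return math.sqrt(value_ratio)

ratio = 2.0  # second value is twice the first
print(f"area ratio if both dimensions double: "
      f"{area_ratio_when_scaling_both_dimensions(ratio):.1f}")
print(f"per-dimension scale for an honest picture: "
      f"{linear_scale_for_proportional_area(ratio):.3f}")
```

So a value that doubles should be drawn about 1.41 times as tall and wide, or as a bar of the same width and twice the height.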
Chapter 7 - Semiattached Figure
• Target Population: Group about which we want to
make inferences
• Study Population: Group or items on which the
experiment or survey is actually conducted
• When comparative studies are conducted among
products, treatments, or groups, what is the
comparison product, treatment, or group?
• Control for all other potential risk factors when
studying effects of factors
Chapter 8 - Causal Relationships
• Correlation does not imply causation
• Elements of causal relationships
– Association between Y and X
– Clear time ordering (X precedes Y)
– Removal of alternative explanations
(controlling for other factors)
– Dose-Response (when possible)
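The role of "removal of alternative explanations" can be sketched with a simulated confounder. In this toy setup (an assumption for illustration, not an example from the book) a third variable Z drives both X and Y, and X has no effect on Y at all. The partial correlation formula used is the standard one for a single control variable.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of X and Y after controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

rng = random.Random(1)
z = [rng.gauss(0, 1) for _ in range(5000)]   # hypothetical confounder
x = [zi + rng.gauss(0, 1) for zi in z]       # X depends on Z only
y = [zi + rng.gauss(0, 1) for zi in z]       # Y depends on Z only, not on X

r_xy = pearson(x, y)
r_partial = partial_correlation(r_xy, pearson(x, z), pearson(y, z))
print(f"raw correlation of X and Y:       {r_xy:.2f}")
print(f"correlation after controlling Z:  {r_partial:.2f}")
```

X and Y are clearly associated in the raw data, but the association largely disappears once Z is controlled for, which is the pattern expected when a confounder rather than a causal link produces the correlation.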