
Statistical and Practical Significance
Advanced Statistics
Petr Soukup
Outline
• Reminder of statistical significance
• Limits of statistical significance
• Misuses of statistical significance
• Alternatives to statistical significance
• Practical significance
• Effect sizes

REMINDER OF STATISTICAL SIGNIFICANCE (NHST)
Hypotheses and tests

Tested hypothesis in experiments (Fisher, 1925)

Null and alternative hypotheses (NHST) (Neyman & Pearson, 1937)

Common tests: t-tests, analysis of variance, analysis of covariance, correlation analysis etc.
Definition of statistical significance
                       Decision: H0            Decision: H1
True status: H0        OK (P = 1 - α)          Type I error (P = α)
True status: H1        Type II error (P = β)   OK (P = 1 - β), test power
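The four cells of the decision table can be illustrated by simulation. The sketch below is not from the slides: it assumes a two-sided z-test of H0: mean = 0 on a normal population with known standard deviation 1, and the function name `rejection_rate`, the sample size, and the true mean 0.5 under H1 are all hypothetical choices.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n=30, alpha=0.05, reps=5000, seed=1):
    """Share of simulated samples in which a two-sided z-test rejects H0: mean = 0.

    Assumes a normal population with known standard deviation 1.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # (sample mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / reps

type_i_rate = rejection_rate(true_mean=0.0)  # H0 true: share of Type I errors
power = rejection_rate(true_mean=0.5)        # H1 true: estimate of 1 - beta
print(f"Type I error rate ~ {type_i_rate:.3f}, power ~ {power:.3f}")
```

When H0 is true, the rejection share should land near α; when H1 is true, the same share estimates the test power, and its complement estimates β.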
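Definition: the conditional probability of obtaining our sample data if the sample is drawn from a population in which the null hypothesis is valid. The result is called statistically significant when this probability falls below the chosen level α.
Statistical significance is P(D | H0), not P(H0 | D).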
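A small numeric sketch of why P(D | H0) and P(H0 | D) differ. Moving from one to the other requires Bayes' theorem, that is, a prior P(H0) and a value for P(D | H1); the numbers below are hypothetical and serve only as an illustration.

```python
def posterior_h0(p_d_given_h0, p_d_given_h1, prior_h0):
    """P(H0 | D) by Bayes' theorem for two competing hypotheses H0 and H1."""
    joint_h0 = p_d_given_h0 * prior_h0
    joint_h1 = p_d_given_h1 * (1 - prior_h0)
    return joint_h0 / (joint_h0 + joint_h1)

# Hypothetical inputs: the same P(D | H0) = 0.05 combined with two different priors.
for prior in (0.5, 0.9):
    post = posterior_h0(p_d_given_h0=0.05, p_d_given_h1=0.5, prior_h0=prior)
    print(f"P(H0) = {prior:.1f}  ->  P(H0 | D) = {post:.3f}")
```

With the same P(D | H0), the posterior probability of H0 changes with the prior, so a significance level cannot be read as the probability that the null hypothesis is true.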
LIMITS OF NHST
Assumptions for classical NHST
Large probability samples from infinite or very large finite populations
Three assumptions:
• Large (infinite) population (at least 100 times larger than the sample)
• Probability sampling (all units have the same probability of selection)
• Large sample (> 30-50 units)
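One way to see why the population should be much larger than the sample is the finite population correction (FPC) from sampling theory. The slides do not name it, so treat this as an added illustration: under simple random sampling without replacement, the standard error of a mean is multiplied by sqrt((N - n) / (N - 1)). The function name `fpc` and the sizes are hypothetical.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction for simple random sampling without replacement."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

sample_size = 100
for ratio in (100, 10, 2):  # population is `ratio` times larger than the sample
    population_size = ratio * sample_size
    print(f"N = {population_size:>6}: FPC = {fpc(population_size, sample_size):.4f}")
```

When the population is 100 times the sample, the correction is within about half a percent of 1 and can be ignored; when the sample is a large share of the population, the classical formulas overstate the standard error.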

LIMITS OF NHST
1. data from censuses
2. data from non-probability samples
3. data from small samples
4. data based on samples that are a large proportion of the basic population
5. big data samples from merged (internationally or over time) files
Beyond the limits of NHST in CSR*
[Bar chart, vertical axis from 0% to 15%; bars: 1 - asterisks/tests in aggregated data; 2a - inferential statistics in quota sampling; 2b - weights in quota samples]
N = 32 articles, Czech Sociological Review 2000-2006 (selected 29 issues), own research
*CSR: Czech Sociological Review
MISUSES OF NHST
Objections against NHST (misuses of NHST)
a) insufficient statement about the population,
b) null hypotheses are unrealistic (nil null),
c) mechanical use of the classical 5% level of statistical significance (asterisks, stepwise methods, best models etc.),
d) statistically significant does not mean important,
e) publishing only statistically significant results (file drawer problem).
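Objection d) can be made concrete with a short sketch. It assumes a two-sample z-test with equal group sizes and known unit variance (a normal approximation, not a method taken from the slides); the effect of 0.02 standard deviations is a hypothetical value chosen to be negligible in practice.

```python
import math

def two_sided_p(d, n_per_group):
    """Two-sided p-value of a z-test for a standardized mean difference d.

    Assumes two equally sized groups with known unit variance, so that
    z = d * sqrt(n_per_group / 2).
    """
    z = abs(d) * math.sqrt(n_per_group / 2)
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

tiny_effect = 0.02  # hypothetical: 2% of a standard deviation
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {two_sided_p(tiny_effect, n):.3g}")
```

The effect stays the same in every row; only the sample size changes. With a large enough sample, even a negligible difference becomes statistically significant.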
Misuses of NHST in CSR*
[Bar chart, vertical axis from 0% to 40%; bars: C1 - asterisks; C2 - P=0.000; D - important; E - file drawer problem]
N = 32 articles, Czech Sociological Review 2000-2006 (selected 29 issues), own research
*CSR: Czech Sociological Review
Conference examples (P<0.01)
Conference examples (***)
Conference examples (*** and stepwise)
ALTERNATIVES TO NHST
Some alternatives to statistical significance
a) Confidence intervals (problems for r, formulas, regression etc.)
b) Test power (quite good in sociology)
c) Estimate of minimum sample size and "what if" strategy
d) Comparison of models via information criteria (AIC, BIC)
e) Bayesian approach
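Points b) and c) can be sketched together. The example assumes a two-sided comparison of two group means with equal group sizes and uses the normal approximation n = 2 * ((z_(1-α/2) + z_(1-β)) / d)^2 per group; exact t-based calculations give slightly larger values. The function names and the effect sizes tried in the "what if" loop are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate minimum sample size per group to detect standardized effect d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

def approx_power(d, n, alpha=0.05):
    """Approximate power with n units per group (ignores the negligible opposite tail)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(d * math.sqrt(n / 2) - z_alpha)

# "What if" strategy: how does the required sample change with the assumed effect?
for d in (0.2, 0.5, 0.8):
    n = n_per_group(d)
    print(f"d = {d}: n per group = {n}, power = {approx_power(d, n):.3f}")
```

Running the loop for several plausible effect sizes shows how sensitive the required sample size is to the effect one expects to find.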
PRACTICAL SIGNIFICANCE
Practical significance terminology
a) Practical significance
b) Substantive significance
c) Logical significance
d) Scientific significance
sometimes also:
e) result importance or
f) result meaningfulness
How to measure practical significance?
History: absolute and relative approach
Example: income differences
Absolute and relative difference
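A minimal sketch of the two approaches to an income difference. The income figures and the function name `income_gap` are hypothetical, and taking the second group as the reference for the relative difference is an assumption.

```python
def income_gap(income_a, income_b):
    """Return (absolute difference, relative difference) with group B as the reference."""
    absolute = income_a - income_b
    relative = absolute / income_b
    return absolute, relative

# Hypothetical average incomes: the same absolute gap at two different income levels.
for a, b in ((30_000, 24_000), (106_000, 100_000)):
    absolute, relative = income_gap(a, b)
    print(f"{a} vs {b}: absolute = {absolute}, relative = {relative:.0%}")
```

The same absolute gap corresponds to very different relative gaps, so the choice between the two approaches changes how large the difference looks.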
How to measure practical significance?
Effect sizes: measures of practical significance
Some well known:
Cohen's d
Hays' ω²
But R², r, C and Fisher's η² are also effect sizes
Problem: effect sizes are sometimes published but not interpreted
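As an illustration of one of these measures, Cohen's d can be computed as the difference between two group means divided by the pooled standard deviation. The data below are made up, and the pooled-variance form is one common variant rather than a formula given in the slides.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Made-up scores for two small groups.
group_a = [2, 4, 6, 8]
group_b = [1, 3, 5, 7]
print(f"Cohen's d = {cohens_d(group_a, group_b):.3f}")
```

Publishing the value is only the first step. Interpretation means saying whether an effect of this size matters in the given field; Cohen's conventional benchmarks (about 0.2 small, 0.5 medium, 0.8 large) are a rough guide, not a substitute for substantive judgment.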
OTHER SIGNIFICANCES
Special significances
• Economic significance
• Clinical significance
• Etc.
CONCLUSION?
Statistical significance is:
LIMITED
MISUSED
BUT NOT BAD
Substantive significance is:
NOT OFTEN USED
BUT NECESSARY