Statistical and Practical Significance
Advanced Statistics
Petr Soukup

Outline
Reminder of statistical significance
Limits of statistical significance
Misuses of statistical significance
Alternatives to statistical significance
Practical significance
Effect sizes

REMINDER OF STATISTICAL SIGNIFICANCE (NHST)

Hypotheses and tests
Tested hypothesis in experiments (Fisher, 1925)
Null and alternative hypothesis (NHST) (Neyman & Pearson, 1937)
Common tests: t-tests, analysis of variance, analysis of covariance, correlation analysis, etc.

Definition of statistical significance

True status    Decision: H0             Decision: H1
H0             OK (P = 1 - α)           Type I error (P = α)
H1             Type II error (P = β)    OK (P = 1 - β), test power

Definition: the conditional probability that our sample could be drawn from a population in which the null hypothesis holds (α).
Statistical significance is P(D | H0), not P(H0 | D).

LIMITS OF NHST

Assumptions for classical NHST
Large probability samples from infinite or very large finite populations
Three assumptions:
Large (infinite) population (at least 100 times larger than the sample)
Probability sampling (all units have the same probability of selection)
Large sample (> 30-50 units)

Limits of NHST
Situations in which these assumptions are not met:
1. data from censuses
2. data from non-probability samples
3. data from small samples
4. data from samples that make up a large proportion of the population
5. big data samples from merged files (merged across countries or over time)

Beyond the limits of NHST in CSR*
[Bar chart, axis 0% to 15%. Categories: 1 - asterisks/tests in aggregated data; 2a - inferential statistics in quota sampling; 2b - weights in quota samples]
N = 32 articles, Czech Sociological Review 2000-2006 (selected 29 issues), own research
* CSR: Czech Sociological Review

MISUSES OF NHST

Objections against NHST (misuses of NHST)
a) insufficient statement about the population,
b) null hypotheses are unrealistic (nil null),
c) mechanical use of the classical 5% level of statistical significance (asterisks, stepwise methods, best models, etc.),
d) statistically significant does not mean important,
e) publishing only statistically significant results (file drawer problem).

Misuses of NHST in CSR*
[Bar chart, axis 0% to 40%. Categories: C1 - asterisks; C2 - P = 0.000; D - important; E - file drawer problem]
N = 32 articles, Czech Sociological Review 2000-2006 (selected 29 issues), own research

Conference examples (P < 0.01)
Conference examples (***)
Conference examples (*** and stepwise)

ALTERNATIVES TO NHST

Some alternatives to statistical significance
a) confidence intervals (problems for r, formulas, regression, etc.),
b) test power (quite good in sociology),
c) estimation of the minimum sample size and the "what if" strategy,
d) comparison of models via information criteria (AIC, BIC),
e) Bayesian approach.

PRACTICAL SIGNIFICANCE

Practical significance terminology
a) practical significance
b) substantive significance
c) logical significance
d) scientific significance
sometimes also:
e) result importance
f) result meaningfulness

How to measure practical significance?
History: absolute and relative approach
Example: income differences
Absolute and relative difference

How to measure practical significance?
Effect sizes: measures of practical significance
Some well-known ones:
Cohen's d
Hays's ω²
But R², r, C and Fisher's η² are also effect sizes
Problem: effect sizes are sometimes published but not interpreted

OTHER SIGNIFICANCES

Special significances
Economic significance
Clinical significance
Etc.

CONCLUSION?

Statistical significance is:
LIMITED
MISUSED
BUT NOT BAD

Substantive significance is:
NOT OFTEN USED
BUT NECESSARY
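
Illustration: statistical versus practical significance

The point that statistically significant does not mean important, and the mention of Cohen's d among the effect sizes, can be made concrete with a small calculation. The sketch below is only an illustration and not part of the original slides: the income figures are invented, the function names `cohens_d` and `two_sided_p` are hypothetical, and the p-value uses a large-sample normal (z) approximation rather than an exact t-test. It keeps the same mean difference and standard deviation and changes only the sample size.

```python
import math


def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def two_sided_p(mean1, mean2, sd1, sd2, n1, n2):
    """Two-sided p-value for the mean difference (normal approximation)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # standard error of the difference
    z = (mean1 - mean2) / se
    return math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))


# Invented summary statistics: mean monthly income in two groups.
MEAN_A, MEAN_B, SD = 25_500, 25_000, 10_000

for n in (50, 50_000):  # sample size per group
    d = cohens_d(MEAN_A, MEAN_B, SD, SD, n, n)
    p = two_sided_p(MEAN_A, MEAN_B, SD, SD, n, n)
    print(f"n per group = {n:>6}: d = {d:.3f}, p = {p:.3g}")
```

Under these assumptions the effect size is the same small value (d = 0.05) in both runs, while the p-value moves from clearly non-significant with 50 units per group to far below 0.001 with 50,000 units per group. The p-value mostly reflects sample size here; the effect size is what carries the substantive interpretation.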