Petr Soukup
Faculty of Social Sciences, Charles University; Sociological Institute (Czech Academy of Sciences)

Statistical significance – misused or a bad concept?

Abstract

Nowadays, computers offer us the possibility to use statistical methods almost without thinking. We teach students how to use statistics in a statistical package, not how to use statistics in science. One of the biggest problems is connected with the concept of statistical significance, established by Sir Ronald Fisher, Jerzy Neyman and Egon Pearson. It seems to me that most social scientists use statistical significance incorrectly, and most students do not understand the key idea of this concept. Scientists tend to publish statistically significant results more often than nonsignificant ones. Is a statistically significant result always important? Are there any alternatives to statistical significance for evaluating the importance of results? Do Czech sociologists use statistical significance incorrectly? I discuss the limits of statistical significance and the statistical significance controversy in psychological and educational journals over the last 20 years. I offer some alternatives to classical statistical significance (confidence intervals, power analysis, minimum sample size, the alternative models approach). I carried out an enquiry among social science students and tried to find out what their knowledge of statistical significance is. At the end of the article I give some recommendations for teaching methodological courses that can improve the practice of statistical significance in the future.

Short history of statistical testing

The history of statistical testing started in 1710 with Arbuthnot's paper [Arbuthnot, 1710]. The author of this paper wanted to show differences between girls' and boys' birth rates in Britain. The idea of statistical testing then lay dormant for two centuries, and at the beginning of the 20th century Fisher started to develop it.
Fisher's basic idea was connected with experimental designs, and the key question was: Is there any difference between the control and the experimental group? Or is there any difference between two experimental groups, etc.? Fisher invented the analysis of variance and the idea of statistical hypothesis testing. Fisher proposed to test the so-called tested hypothesis, which states: there is no difference between the control and experimental groups in the measured characteristic (e.g. the length of daffodils, etc.). Fisher derived the equations for the analysis of variance and the Fisher distribution (F-distribution):

F = MS_b / MS_w

where MS_b is the between-group mean square and MS_w the within-group mean square. According to the value of the test criterion F, it is possible to compute the probability of obtaining such a value (or a more extreme one) if the tested hypothesis is true in the population. This probability (nowadays often called Sig, P, significance, etc.) lies between 0 and 1. Low values support the idea that the control and experimental groups differ, i.e. that the effect applied in the experimental group has some impact. Fisher proposed a probability (hereafter P) below 0.05 (5 %) as "proof" of the effect's impact, and proposed to continue with replications of experiments whose P is lower than 0.05; these replications can confirm our idea about the effect's impact. Fisher's second proposal was connected with P in the interval [0.05, 0.2]. In these cases Fisher proposed to continue experimenting, because our experimental design may be problematic and we may obtain larger differences in further experiments. Fisher's third proposal can be formulated: if P is above 0.2 (20 %), retain the tested hypothesis, conclude that no effect can be proven, and do not replicate the experiment. These ideas were further developed by the Polish statistician Jerzy Neyman and the British statistician Egon Pearson, who introduced the idea of null and alternative hypotheses. At the beginning, the researcher must state his/her null hypothesis and an alternative one.
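The F statistic F = MS_b / MS_w and Fisher's informal three-zone rule for P described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function names (f_statistic, fisher_decision) and the sample measurements are my own hypothetical choices.

```python
# Illustrative sketch of F = MS_b / MS_w and Fisher's informal
# thresholds for P; function names and data are hypothetical.
from statistics import mean

def f_statistic(*groups):
    """One-way ANOVA F statistic: between-group mean square (MS_b)
    divided by within-group mean square (MS_w)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations
    # Between-group sum of squares, k - 1 degrees of freedom
    ss_b = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares, n - k degrees of freedom
    ss_w = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_b / (k - 1)) / (ss_w / (n - k))

def fisher_decision(p):
    """Fisher's three informal zones for the probability P."""
    if p < 0.05:
        return "effect supported; replicate to confirm"
    if p <= 0.2:
        return "inconclusive; improve the design and repeat"
    return "no effect proven; no replication necessary"

# Hypothetical measurements for a control and an experimental group
control = [5.1, 4.8, 5.3, 5.0, 4.9]
experimental = [5.9, 6.1, 5.7, 6.0, 6.3]
print(f_statistic(control, experimental))
```

To turn the computed F into the probability P, one would compare it against the F-distribution with (k − 1, n − k) degrees of freedom, e.g. via statistical tables or a statistical package.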