Torlre ifrAth Statist 6'...lans
R. A. Fisher: Contributions to Analysis of Variance and Experimental Design
Overview: This chapter provides a brief description of Sir Ronald A, Fisher's scientific contributions and some details of his personal life. His research findings were published in books, monographs, and yurrial. articles. They dealt with work in the substantive disciplines of
.4enetics, biometry, agriculture, evolution and eugenics. included are his pioneering statistical contributions in the analysis of variance and covariance and m.ultivariate statistics, as well as the .
following sampling distributions: product moment correlation coefficient; regression coefficient; p artial co rrelatio n ; mult iple correlation. Some other findings were the Fisher
Exact Test, maximum likelihood estimation, intraclass correlation coefficient., analysis of cross-classification data, Fisher Information, fiduciai probability
.
, multiple comparisons, randomization, replication, genetic findings on natural select ion, and the cause of evolutionary change. Fisher provided a unified treatment of many of the important distributions involved in testing hypotheses His controversies involving other important scientists are displayed in this chapter .
The purpose of this chapter is to summarize the scientific contributions of Sir Ronald
Aylmer Fisher, focusing especially on his work in statistics and experimental design.
Connections between Fisher's innovative findings in statistics and his personality are made, revealing his many controversies with other ltuninaries such as Karl Pearson, Jerzy
Heyman, and Harold Jeffreys
Family Background and Early Vocational Choice
R A. FiSher was horn on February 17, 1890 in London and he died on July 29, 1962 in
Adelaide, South Australia. He completed his secondary education at Harrow School in
1909, was awarded a scholarship to the University of Cambridge, and took his B.A. degree in mathematics with honors in 1912 An additional grant was awarded to Fisher in
1913 and he studied the theory of errors, statistical mechanics, and quantum theory.
In 1917 he married Ruth .Eileen. Guinness They had eight children --six girls and two boys. Joan, who was the second oldest of the girls, married the famotis statistician George
Box, and she wrote a well-received biography of her father titled R. A. Fisher: The 1.tfi!
Scientist (1978). In 1920 R. A Fisher took the M A degree. of
BOSON BOOKS 19
tom4
Ntaissfechrn:F
Fisher was deferred from military service during World War 1 because of his deficient eyesight. From 1915 to 1919 he was a public school teacher. During the two decades from 1920 to 1940, he demonstrated remarkable research productivity, publishing three important books, a collection of tables, and over 120 other titles (Kendall, 1.963, pp.1.2).
Vocationally, he worked as a statistician at Rotharnsted Experimental Station from 1919 to 193.3, "Within a few days [of the job offer from Rothanisted] Fisher also received an offer from Professor Pearson at the Ciahon Laboratory. Fisher's interests had always. been in the very subjects that were of interest at the Galton Laboratory, and for 5 years he had been in communication with Pearson, yet during those years he had been rather consistently snubbed. Now Pearson made him an offer on terms which would constrain him to teach and to publish only what Pearson himself approved. It seems that the lover was at last to be admitted to his lady's court —on condition that he first submit to castration. Fisher rejected the security and prestige of a post at the Galion Laborato ry and took up the temporary job as sole statistician in a small agricultural research station in the country.
-
(Box, 1978, p. 61)
Karl Pearson retired in 1933, and Fisher succeeded him as Galton Professor of
Eugenics at University College, London. In 19.3 3 Fisher also became editor of Thc
Annals of In 1943 he moved on to the. University of Cambridge, where he was named the Arthur Balfour Professor of Genetics and remained there until 1957. Fisher founded the international journal Heredity in 1947 and co-edited it until his death in
1962. in 1952 the Queen of England knighted him From 1959 to 1962, R.. A Fisher was
Research Fellow, Division of Mathematical Statistics at the University of Adelaide in
South Australia_
Young Ron Fisher differed from other children in two important ways and both of them could have influenced the way he learned (1) His eyesight was deficient: (2) He was precocious.
The following quotation from Joan Fisher Box's biography of her father (1978, pp. 12,13) displays the precocious behavior of the youngster
At about [the age of three] when he had been set up in his high chair for breakfast, he asked: What is a half of a half?' His nurse answered that it was a quarter. After a pause he asked, 'And what's a half of a quartet?' She told h im that it was an eighth There was a longer pause before he asked again, 'What's a half of an eighth, Nurse' When she had given her reply there was a long silence. Finally, Ronnie looked up, a plump pink and white face framed with red-gold hair, and said slowly., then I suppose that a half of a sixteenth must be a thirty-tool "'
We now turn to the problem of Fisher's eyesight, which was detected later on. During his adult years, when he was a well-trained sophisticated scientist in solving statistical prOblems, he made frequent use of viewing n observations in n•limensional space Thai is, he found geometrical solutions to statistical problems mare fruitful than algebraic approaches Some writers link this geometric ability with the fact that Fisher had poor eyesight, almost since birth. If so.. it is as though he had transformed a deficit into an advantage. Geometric approaches might he especially advantageous in studying multivariate statistical models.
Experimental Design
BOSON BOOKS 20
r.
, e8k She!
R A. Fisher's research in experimental design, which included the notion of randomization, is considered to he among the most important of his contributions to science. As well as independent groups factorial designs, he invented experimental designs where restrictions on randomization are imposed, but random processes are retained. Two examples are the randomized blocks design and the. Latin square These designs ordinarily have greater power or sensitivity than designs with independent groups In the language that is currently used, randomization supports the internal validity of an experiment, whereas random selection protects against threats to external validity, o r g eneralizabilit y H is work on design is summarized in th e ho o k, fle.sign of .
E Ape rtinc.nts (1960).
Randomization , replication, and blocking are the fundamental principles of experimental design introduced by R. A. Fisher. Replication is the main source of the estimate of error, while randomization insures that the estimate will be unbiased.
Blocking increases precision. Many statistical scientists viewed Fisher's technique as a revolution. It especially modified research in an agricultural setting.
There is controversy regarding the various types of multiple. comparisons. Yet many statisticians treat Fisher's Least Significant Difference (LSD) with respect (Puberty &
Morris, 1984 It is a simple procedure. The omnibus F-test is computed first. using a specified. a level If it turns out to he .statistically significant, painvise tests are. employed using the same a level. If not, analysis is terminated
Small. Sample Theory and the Various Sampling Distributions
R. A. Fisher created analysis of variance and analysis of covariance. In fact, George W.
Snedecor, who at the time was at Iowa State College, named the F -ratio in honor of
Fisher. These two techniques are very widely used in analyzing results in numerous research studies in psychology,. sociology biology, agriculture, education„ medicine, business, genetics, and many other disciplines_
R A. Fisher helped clarify the vocabulary and notation of some areas of statistical. inference. Fisher stressed the idea that one must carefully distinguish between population parameters and sample statistics. Ile introduced the following features of statistical estimators—sufficiency, consistency, efficiency, and maximum likelihood estimator in
1925 in his publication titled., "Theory of Statistical Estimation,
Proceedings of the Cambridge Philbsophical Society
-
which appeared in the
Also "Student" and Fisher made important inroads into exact small sample distributions, although R. A. FiSher ignored the assumption of normality. (see the summary of reviews of his book, Stansticaf Ifebriody
1 .
61. Research
(1925), which appear in chapter 9 of this book )
Fisher introduced the idea of statistical information. The general notion is that there is a certain amount of information in a sample. And in reducing the data one must minimize the amount of information lost. (Box, 19781
R. A. Fisher created the randomization tests. There are t wo of them—one for two independent samples and the other for paired observations These are considered to be nonparametric inferential statistical tests, and, assuming normality., their asymptotic relative efficiencies (ARE), relative to their respective t -tests, are equal to 1 00. Except for very small samples these. two tests cannot he computed by hand, and some of the
BOSON BOOKS 21
AV? fa h Sec i l i StV7 !i computer packages do not contain routines for these statistics. More generally, he did much original work in nonparametrics, including exact tests in general, tests of runs, various order statistics, the sun test, and normal scores test (Savage, 1976),
Fisher corrected the segment of Karl Pearson's findings regarding degrees of freedom, although Pearson never acknowledged the fact that Fisher's modification was correct.
Fisher introduced other properties of contingency table analysis including the Fisher
Exact Test, which yields an exact probability for a 2X2 table. This test is useful because tables of this form are ubiquitous since researchers often deal with dichotomous variables. The user can enter published tables and read out the appropriate probabilities.
Leo Goodman 1)984) initiated one of his lectures by honoring R. A. Fisher's contributions to analysis of cross-classification data.
Although Karl Pearson originally developed the intraclass correlation, Fisher provided a mathematical link between it and the F-ratio of analysis of variance. Lee Cronbach's coefficient of generalizability (1972), which is the ratio of universe score variance to expected observed score variance is an intraclass correlation coefficient. And all of the hallowing are intraclass correlations for certain designs: Kuder -Richard Formulas 20 an
21 Cronbach's coefficient alpha.; Hoyt's formula, a nd Rulon's split-half formula. Even the general fonnula stating reliability as the ratio of true score variance to observed score variance can be regarded as an intraclass correlation coefficient.
R. A. Fisher gave a rigorous proof of "Student's" result fo r the t•statistic, showing how it could be used to test various statistical hypotheses, and hence gave a unified treatment of many of the important distributions involved in testing (null) hypotheses. He also generalized Student's result to the case of unequal variances and unequal sample sizes.
Further studies by Behrens and others addressed the same topic, which today is referred to as the Fisher-Behren.s problem.
A study by Zimmerman (1996.) revealed that the same problem arises in nonparametric statistics. When scores in two groups with unequal variances are combined into a single group and ranked, as done in several nonparametrie significance tests, such as the Wileoxon-Mann-Whitney test, it turns out that the two corresponding groups of ranks also have unequal variances.
In producing the sampling distribution of the
.
product-moment correlation, Fisher
.
developed the r to Z transformation, which is in reality the inverse hyperbolic tangent function. One category of application of this transformation is to test hypotheses such as
.80 where it is necessary to deal with a negatively skewed sampling distribution.
Another category of application is to lest p : p l . For both of these cases research workers assume the samples are independent. But Fisher also prov ided the mathematics for the nonindependent case
Sir Ronald A. Fisher proved certain properties of discriminant function analysis, a multivariate statistical technique, and, for the two-group case, he showed a mathematical link between it and multiple regression_ He made a number of other contributions to multivariate statistics. For further details see T.W Anderson's (1990 "R. A Fisher and
Multivariate An.alysis," which was published in Statistical Sciences.
Fisher introduced the idea of fiducial probability Here one wants an interval rather than a point estimate. Today the theory of confidence intervals of Neyman and Pearson is advocated by most statisticians and Fisher's fiducial probability method is somewhat
rite .• Swum:tam neglected. The Neyman-Pearson approach is usually more powerful, which means it produces
narrow confidence intervals. It should be mentioned that some research workers advocate a strategy for testing hypotheses that contains elements of both philosophical camps.
Fisher proved certain properties of the maximum likelihood estimator and used them widely in his research. In employing the maximum likelihood estimator, one selects as an estimator of a parameter that value which will maximize the likelihood of the sample that is actually observed to occur. Early on he had no use for Bayesian estimators., but later, partly due to the adversary V. ha became his friend, the luminary Harold Jeffreys, he reversed his position,
The reader interested in Fisher's philosophy of science should consult Nancy Brenner.
Golomb's chapter entitled
-
Fisher's Philosophical Approach to Inductive Inference," in
Keren and Lewis's (Ed.) A Handbook ,.fitr Data Analysts in the Behavioral Sciences:
MeihadOlogical Issues_ (1993)
Population Genetics, Evolutionary Theory, and Eugenics
Some writers contend that R. A. Fisher's investigations into Mendelian Genetics are just as important as his contributions to theoretical and applied statistics.
Fisher developed a mathematical theory on the basis of extant genetic res earch to establish the principle of natural selection on a more rigorous basis than Charles Darwin had claimed as the cause of evolutionary change. This work and related matters are presented in his book, The Genetical Theory of Natural Selemon (1930). See also
Bartlett's (I %8) write up of R A. Fisher, which was published in the International
Entyclopedia of the Social Sciences.
Eugenics is the study of possible improvements in a race or breed, often focusing on human beings. Fisher served as honorary secretary of the Eugenics Society and he wrote a great number of reviews for the Eugenics Review. This work was done at the request of
Leonard Darwin, one of Charles Darwin's sons.
Controversies With Adversaries
Initially Harold Jeffries and R A. Fisher were a dversaries as they presumably represented probabilistic camps at opposite poles. Jeffreys was the most important
Bayesian since the Reverend Thomas Bayes. Fisher
.
mounted a frontal attack by uttering a potent negative statement in regard to Jeffreys' book, Theory of Preihaberily (1939). He said the book contained 395 invalid formulas and that this was due to the fact that page l contained Bayes' postulate. Later they both moved somewhat in the direction of the philosophical position of the other, and then eventually became friends. In fact, one day they both attended a lecture by Arthur Eddington on scientific inference and were so disgusted with the lecture that they shook hands and promised to refrain from insulting each other.
Two medical doctors, Raymond Doll and Bradford Hill, published a paper entitled
"Smoking and Carcinoma of the Lungs: Preliminary Report" in 1950 in the British
Medical .Journal. Their findings, which were based on non -experimental (i.e., observational) studies, seemed to imply that smoking causes lung cancer. They attempted to make their field observations resemble an experiment.
controlling for certain.
BOSON BOOKS
I Br ettsh
St
6.14 t wtons variables such as gender, age, and general physical health. 'Fisher attacked than in print and in speeches, indicating that only (true) experimental designs are capable of establish ins empirical causality.
R A Fisher's arguments on these issues might have been more acceptable to the medical profession if it hadn't been for the fact that at the time he w as serving as a consultant to a tobacco firm. Related to the issue being debated, Jerome Cornfield (1951) introduced the notion of defensible case-control studies.
Fisher treated his followers with great respect, but he was negative to the point of outrageous hostility to those who crossed him scientifically or philosophically His venom for the latter group was reinforced by his eloquent barbed phrases He was a charming fellow in relation to his disciples but was viciously hostile toward his antagonists, sometimes attacking them in an unfair fashion, (Bartlett, 19681.
We next turn to Fisher's relation with his two mortal enemies, Karl Pearson and Jerzy
Neyman In his hook., Ski/is/A:at Alethodv in Scientific Inference (1959), attacks on these two and others appear on the following pages: 3 (Pearson) 34 -35 (Pearson &-
Edgeworth), 89 t Neyinan and Egon Pearson); 98, (Neyman); 1.00 (Neyman); 102
(Neyman & Egon Pearson). The names of the persons being attacked appear in the parentheses. It was best to stay out of R A. Fisher's path.
Now why was Fisher so vicious and vindictive to Karl Pearson and Jerzey Neyman? interaction was involved. It was partly due to Fisher's temperament but also, we shall see,. both of these other scientists manifested behaviors that trigge red Fisher's ire. Pearson acted aloof to Fisher. He rejected Fisher's submitted papers intended for Thonwtrika and used his influence to lead another journal to reject a Fisher paper. When Fisher pointed out that Pearson had misinterpreted his own statistic, chi-square, regarding degrees of freedom, Pearson simply ignored hUn. Fisher attacked Pearson's correlation ratio.
Jerzy Neyman's main offence was that he and Egon Pearson had pioneered a method of hypothesis testing that at the time became the dominant theory and overturned Fisher's approach that used fiducial limits. It should also be mentioned that in March of 1935
Neyman read a paper titled "Statistical Problems in Agricultural Experimentation
-
at the meeting of the Royal Statistical Society which provided a persona] affrontery to Fisher.
(Box, I978), He contended that randomized block and Latin square designs were biased and attempted to correct their deficiencies. Fisher replied that before one attempts to criticize previous work it is necessary to completely understand it. For the hypotheses that Fisher intended to test his statistical models were defensible. The controversy in this case was triggered by the fact that the two men took different approaches to the subject
(Box, 1978, pp 84.88)
R. A. Fisher did more than just toss occasional insults to Karl Pearson He attempted to undermine Pearson's life work, and imply that he was inadequate as a statistician and as a scientist. And now, retroactively, it is seen that both of them were world-class statisticians. It should be mentioned that R. A. Fisher and Karl Pearson were incapable of selfcriticism.
Sir Ronald A. Fisher's Publications and Honorary Awards
BOSON BOOKS 24
fisvAv firle
Sham( eCianS
A number of present day statisticians would view Fisher as the most important statistician of the 20 century. He published seven books and almost 300 papers throughout his impressive career In addition to Fisher's classic on experimental design, and his book on the genetic theory of natural selection he produc ed Statistical Meihotis for Research Workers (192.5) for which a number of editions appeared subsequently, some of them in German, Japanese, French, Russian, and Spanish.
His published journal articles have been packaged in five volumes and are titled
Collected Papers .
of R. I. Fisher (Bennett, 1971) There are 294 papers in all and their publication dates range from 191 to 1%2, spanning a half century See also Williams,
Zuinho, & Zimmerman (2001)
The first edition of his statistical tables in research, by Fisher and Yates, was published in 1938. Shewart edited .R A. Fisher's Contributions to Mathematical Statistics (1950),
P.C. Mahalanobis wrote a brief biography of 'Fisher for this volume. He published it earlier in Sankhva (1938). What is unique about the Shewart volume is that Fisher selected 43 of the articles that he thought were most important Additionally, he wrote an annotation and described the context in which each paper was written.
A number of honorary memberships, foreign associateships, and meda ls were bestowed on R. A. Fisher. He was also knighted by Queen Elizabeth and received honorary doctorates from nine colleges (Box, 1983, p. 103).
Concluding Comments
This chapter has displayed the impressive intellectual versatility of Sir Ronald A.
Fisher, focusing not only on his contributions to statistics and experimental design, but also on biology„ population genetics, evolutionar y ' theory, eugenics, agriculture, and
Darwinism
It also should be mentioned that Fisher visited the United States from tim e to time to give invited lectures and to teach courses in statistics, experimental design, and related topics. In maintaining such contacts he hoped to make his work available to Americans
In judging R A. Fisher's work, it is necessary to consider the gre at wealth of his scientific contributions, but they must be weighed against his bias in controversial matters
References
Alexander, T.W. (1996). R. A. Fisher and multivariate analysis. Stativical Sciences, I I,
20-34.
Bartlett, M.S. (1968). R. A. Fisher. In DI_ Sills (Ed.) , International Encyclopedia of the Social Sciences ( pp. 485.491 New York
.
Macmillan.
Bennett, J.H (1971)- (Ed,). Collected papers of R, Fisher. New York: Wiley
Bennett, J. H (1983). Natural selection, heredity, and eugenics: Includ correspond to k A. Rsher with Leonard Darwin and others. Oxford t.
NC. ec.eI i n g
-
("lardon Press,
Box, J F (1978) R. A. Fisher: The life qii7 ,VCit.qtriSt. New York
.
. Wiley.
BOSON BOOKS 25
1 1 4 d lirettgr
B o x . , J F (1 983 ). R ona l d A yl me r Fi sh e r. In S . Kat z & N L. J o hn so n ( E ds ).
1-
.
neyelopedia qiihe Statistical Sciences (pp 103-11 I ), New York• Wiley.
Doll, R., & Hill, A.B. (1950). Smoking and carcinoma of the lungs: Preliminary report.
British Medical Journal, 2, 239..248.
Efron, B (1998). R. A. Fisher in the 21' , century. Statistical ,'fiewnce, 13, 95.122.
Fisher, R A. (1925). Statistical MethOdSliir ITSearch workers.. New York Rather.
Fisher, R A, (1930) The genetical theory of natural selection. Oxford: Oxford
University Press.
Fisher, R A. (1949). The theory of inbreeding, London, Oliver & Boyd.
Fisher,. R.A (1959), StanSticai methods and scientific inf&rence. (2' d ed Revised). New
York: Hafner,
Fisher,. R. A 196(1). The design of experiments (7 th ed.). New York: IA afner.
Fisher, R. A., & Yates, F. (1918). Statistkal tables fm biological. agricultural, and medical research, Edinburi4h: Oliver & Boyd.
Gridgeman, .N.1" (1972). Ronald Aylmer Fisher In C
Scientific Biograpi n
(pp, 7-11). New York Scrihners.
Jeffreys, H. (1939). Theory of probrability. Oxford: Clarendon Press.
(Ed.) 1.).(ciotiari
,
.M.G (1963). Ronald Alymer Fisher, 1890-1962, Blometrika, 50, I
,
,15.
Mahalariohis, P.C. (1938). Professor Ronald Aylmer Fisher. Sankliya,„ 4, 265-272.
Reid, C. (1982) Neyman.. frown New York Springer-Verlag.
Savage, L..j. H. (1976). On rereadini R. A. Fisher (with discussion)
4, 441 , 500.
-
Statistics,
Shewarl, W A (1950). (Ed ) Fisher's cowl,r4unon.r t nialhernatical statistics, New
York: Wiley.
R.H., Zumho, & Z i m m e r m a n , D W ( 2 0 0 t ) T h e s c i e n t i f i c contributions of R. A. Fisher. Starry :Vight Review, 2, 1-22,
Yates, F., & Mather, K. (1963) Ronald Aylmer Fisher. Biographical Memoirs cif the
Royal Soctesyt?f.Lmilory, 9, 91--I 20.
Zimmerman, D W. (19%) A note on homogeneity of variance of scores and ranks.
Journal of Experimental Education, 6-1, 351.362