Statistics at Rothamsted Experimental Station
Gavin J. S. Ross, Statistics Department 1961-97

Rothamsted Experimental Station was founded in 1843 by John Bennet Lawes, owner of Rothamsted Manor in Harpenden, 40 km north-west of London. Lawes learnt of the value of certain chemical elements to plant growth, and patented a method of extracting phosphates from bones. He set up a factory in London and laid out the first trial of wheat on Broadbalk, a field next to Rothamsted Manor house. The Broadbalk experiment has grown winter wheat every year since 1843, and there are 170 years of records of crop yields on each plot.

[Figure: Rothamsted Manor]
[Figure: John Bennet Lawes, 1814-1900, founder of Rothamsted]
[Figure: Broadbalk Wheat Experiment. Each long strip has a different combination of applied fertilisers. Wheat has been grown every year since 1843. Some horizontal strips have been treated differently, such as being cleared (fallow) to control weed growth. The plots with no nitrogen are palest in colour.]

By 1918 there were 70 years of records on Broadbalk and other long-term experiments. The Director, Sir John Russell, appointed Ronald A. Fisher to interpret the data. Fisher asked for a modern calculator, and was provided with a large machine called the Millionaire. A smaller machine, a Brunsviga, was also used.

Fisher had already published papers on the exact distribution of Student’s t and of the correlation coefficient. His ambition was also to show how Darwin’s theory of evolution could be explained by Mendel’s theory of genetics. Rothamsted allowed him to continue his genetic research at the same time as his work with Rothamsted data.

[Figure: Fisher in 1929]
[Figure: Breeding mice at his home]

Fisher published his analysis of the Broadbalk wheat data and the weather records over the preceding 70 years. To do this he developed the methods of fitting orthogonal polynomials and of multiple regression, to determine the relationships between the yield data and the rainfall patterns for each year.
From this he developed the method of significance testing using the analysis of variance, providing tables of what is now called the F distribution (named by Snedecor after Fisher). At the same time he developed the method of Maximum Likelihood Estimation for non-normal error distributions, described in his key 1922 paper on the fundamentals of statistical inference.

Fisher’s experience with the Rothamsted experiments led him to propose the main principles of Experimental Design: randomisation, replication, block structure, factorial combinations, and estimation of error variance. With his successor Frank Yates, further designs were proposed: Latin Squares, Graeco-Latin Squares, Split Plots, Confounded Factorial Designs, Balanced Incomplete Blocks and Lattice Designs, together with the Analysis of Covariance.

Fisher’s ideas spread rapidly. His student, L. Tippett, introduced design to industry. George Snedecor brought Fisher’s methods to the USA, Harold Hotelling developed his ideas on multivariate methods, and John Wishart and Frank Anscombe set up university courses in statistics.

An early visitor to Rothamsted was Jerzy Neyman in 1926. In a letter from Student (W. Gosset), Fisher was told: “There is a Polish statistician anxious to meet you. He is the only person I know who is enthusiastic about your ‘Likelyhoods’.” Neyman and Fisher had lively arguments about the nature of statistical inference. Fisher argued with everyone, especially Karl Pearson, who still believed that large samples were necessary and that the distribution had to be determined before any conclusions could be drawn. Fisher showed how to use all the information in a sample.

Fisher left Rothamsted in 1933 after appointing Frank Yates as his successor. Yates appointed W. G. Cochran, Oscar Kempthorne and David Finney, who all made major contributions to design, sampling and bioassay. Yates showed the government that sample surveys were more efficient and cheaper than demanding data from everyone.
In the 1939-45 War Yates advised on agriculture and food production, as well as helping the Air Force to improve its effectiveness. Rothamsted Manor was used by the secret signals and codebreaking unit from Bletchley Park, as one of the locations of Alan Turing’s early machines.

[Figure: Yates using Fisher’s Millionaire calculator]

Yates expanded the department after 1945, analysing data for several British agricultural institutes and Government experimental farms. He organised an annual Survey of Fertiliser Practice, using punched cards and Hollerith Tabulators and Sorters.

In 1954 the opportunity arose to acquire an early electronic computer, the Elliott 401. The first statistical programs were for single analyses, with a different program for each type of designed experiment, regression model, curve fitting, probit analysis, and simple surveys. Later programs allowed transformation of variables in experiments, missing value estimation, multivariate analysis and cluster analysis. The programmers included Michael Healy, John Gower, Howard Simpson and Gavin Ross, under the direction of Frank Yates.

A larger and faster machine, the Ferranti Orion, was installed in 1964, and more general programs were written, combining the facilities of the smaller individual programs. These included a General Experiments Program, a General Survey Program, a Cluster Analysis program and a Maximum Likelihood Program for non-linear model fitting.

Yates encouraged staff to travel overseas, and to welcome visiting statisticians, including some from Poland. Tadeusz Calinski first came in 1963, and returned 50 years later to see where he had had his office.

In 1968 Yates was succeeded by John Nelder, who had been head of statistics at the National Vegetable Research Station at Wellesbourne. Nelder and Mead had visited Rothamsted regularly to develop the famous Simplex Algorithm for numerical optimisation.
Nelder planned a new general statistical package, GENSTAT, based on the analysis of variance algorithm developed by Graham Wilkinson in Adelaide. Wilkinson joined the department, and the programming team added many different analyses. The next computer, the ICL 4-70, allowed the programs to be written in Fortran, making them available on other machines. Nelder arranged with the Numerical Algorithms Group, NAG Ltd, to market Rothamsted software products. These included GENSTAT, RGSP and MLP. With the ideas of Robert Wedderburn, Nelder also developed GLIM, an interactive program for Generalised Linear Models.

[Figure: John Nelder]
[Figure: John Gower]

Rothamsted statisticians were active in many statistical societies and publications, as officers and journal editors. Fisher was the first President of the Biometric Society, and Yates, Nelder and Rob Kempton held the same office in later years. Healy organised the 1962 Biometrics Conference in Cambridge. Yates served as President of the British Computer Society and as President of the Royal Statistical Society. Nelder was active in promoting the COMPSTAT conferences under the International Statistical Institute computing committee. Gower was active in founding the British Classification Society and the International Federation of Classification Societies. He was also secretary of the RSS and promoted the Algorithms Section of Applied Statistics. Gavin Ross was also President of the British Classification Society.

John Gower succeeded Nelder in 1984. He continued to develop new ideas in Multivariate Analysis, including the use of Biplots to interpret data groupings and variables simultaneously. His earlier work on Principal Coordinate Analysis continues to be much used. During this period Rosemary Bailey and Donald Preece were very active in developing ideas on Experimental Design and Orthogonality.
Gradually the department became less responsible for analysing everyone’s data, as the age of mainframe computers gave way to the age of PCs, networks and the internet. Scientists used whatever statistical package they were most familiar with.

John Gower retired in 1990 and was succeeded by Vic Barnett. Barnett was interested in Environmental Statistics and organised several SPRUCE conferences, including one at Rothamsted in 1996. Barnett was active as Treasurer of the Royal Statistical Society and organised its move to new offices in London.

After 1997 the role of the department diminished, and the Genstat Team under Roger Payne became a separate commercial organisation, VSN, in a neighbouring town. Robin Thompson became Head of Statistics, with an interest in animal experiments and in the development of REML models with several error sources. Andrew Mead is now the Head of Statistics, and there is much current work on biological modelling using powerful computer algorithms.

Other statisticians who have been at Rothamsted as staff or visitors:

Wojtek Krzanowski worked with Gower on multivariate methods, and teaches at Exeter University.
Rob Kempton developed models for species diversity and statistical genetics, and became head of the Scottish Agricultural Statistics Service.
Roger Payne developed methods for identification of organisms and produces a package, GENKEY, for this purpose.
Janet Riley worked for the Overseas section and developed models for intercropping and aquaculture in developing countries.
Mike Kenward developed methodology for longitudinal models.
Polish visitors have included Stan Mejza, Pavel Krajevsky and Andrej Zielinski.

[Figure: Memorial window to Fisher and Venn at Caius College, Cambridge]