Statistics at Rothamsted Experimental Station

Gavin J. S. Ross
Statistics Department 1961-97
Rothamsted Experimental Station was founded in 1843 by
John Bennet Lawes, owner of Rothamsted Manor in
Harpenden, 40 km north-west of London.
Lawes learnt of the value of certain chemical elements to
plant growth, and patented a method of extracting
phosphates from bones. He set up a factory in London and
laid out the first trial of wheat on Broadbalk, a field next to
Rothamsted Manor house.
The Broadbalk experiment has grown winter wheat every
year since 1843, and there are 170 years of records of crop
yields on each plot.
Rothamsted Manor
John Bennet Lawes (1814-1900)
Founder of Rothamsted
Broadbalk Wheat Experiment
Each long strip has a different combination of applied fertilisers. Wheat has
been grown every year since 1843. Some horizontal strips have been treated
differently, such as being cleared (fallow) to control weed growth.
The plots with no nitrogen are palest in colour.
By 1918 there were 70 years of records on Broadbalk and other long term
experiments.
The Director, Sir John Russell, appointed Ronald A. Fisher to interpret the
data.
Fisher asked for a modern calculator, and was provided with a large
machine called the Millionaire. The smaller machine, a Brunsviga, was
also used.
Fisher had already published papers on the exact distribution of
Student’s t and the correlation coefficient.
His ambition was also to show how Darwin’s theory of evolution could
be explained by Mendel’s theory of genetics. Rothamsted allowed
him to continue his genetic research at the same time as his work
with Rothamsted data.
Fisher in 1929
Breeding mice at his home
Fisher published his analysis of the Broadbalk wheat data and the
weather records over the past 70 years.
To do this he developed methods for fitting orthogonal
polynomials and for multiple regression, which he used to determine
the relationships between the yield data and the rainfall patterns for
each year.
From this he developed the method of significance testing using
the analysis of variance, providing tables of what is now called the
F distribution (named by Snedecor after Fisher).
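As a rough illustration of the variance-ratio idea behind the analysis of variance, the following minimal sketch computes the F statistic for a one-way layout. The function name `one_way_f` and the yield figures are hypothetical and are not taken from the Rothamsted records.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations about their own group mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # F is the ratio of the two mean squares.
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical yields for three fertiliser treatments (illustrative only).
yields = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_ratio, df1, df2 = one_way_f(yields)
```

The resulting ratio would then be compared with the tabulated F distribution on `df1` and `df2` degrees of freedom.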
At the same time he developed the method of Maximum
Likelihood Estimation for non-normal error distributions,
described in his key 1922 paper on the fundamentals of statistical
inference.
Fisher’s experience with the Rothamsted
experiments led him to propose the main principles
of Experimental Design:
Randomisation
Replication
Block structure
Factorial combinations
Estimation of error variance
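The first three principles can be sketched together as a randomised complete block layout: every treatment appears once in each block (replication and block structure), and the order within each block is chosen at random (randomisation). This is only a minimal sketch; the function name `randomised_blocks`, the treatment labels and the seed are hypothetical.

```python
import random

def randomised_blocks(treatments, n_blocks, seed=None):
    """Return one independently shuffled copy of the treatments per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(treatments)   # every treatment occurs once in each block
        rng.shuffle(block)         # a fresh random order within each block
        layout.append(block)
    return layout

# Hypothetical nitrogen levels laid out in three blocks.
layout = randomised_blocks(["N0", "N1", "N2", "N3"], n_blocks=3, seed=1843)
for block in layout:
    print(block)
```

Fixing the seed only makes the illustration reproducible; in a real trial the randomisation would be drawn afresh.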
With his successor Frank Yates, further designs were
proposed: Latin Squares, Graeco-Latin Squares, Split
Plots, Confounded Factorial Designs, Balanced
Incomplete Blocks and Lattice Designs, together with the
Analysis of Covariance.
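To show what a Latin Square is, the sketch below builds one by cyclic shifting, so that each treatment occurs exactly once in every row and every column. This is one simple construction among many, not necessarily the one used at Rothamsted, and the name `cyclic_latin_square` is hypothetical. In practice the rows and columns would also be randomly permuted before use.

```python
def cyclic_latin_square(treatments):
    """Build an n x n Latin square by shifting the treatment list one step per row."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```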
Fisher’s ideas spread rapidly. His student, L. Tippett, introduced
design to industry. George Snedecor brought Fisher’s methods
to the USA, Harold Hotelling developed his ideas on
multivariate methods, John Wishart and Frank Anscombe set
up university courses in statistics.
An early visitor to Rothamsted was Jerzy Neyman in 1926. In a
letter from Student (W. Gosset) Fisher was told:
“There is a Polish statistician anxious to meet you. He
is the only person I know who is enthusiastic about your
‘Likelyhoods’.”
Neyman and Fisher had lively arguments about the
nature of statistical inference.
Fisher argued with everyone, especially Karl Pearson who still
believed that large samples were necessary, and that the
distribution had to be determined before any conclusions could
be drawn. Fisher showed how to use all the information in a
sample.
Fisher left Rothamsted in 1933 after appointing Frank Yates as his
successor. Yates appointed W.G. Cochran, Oscar Kempthorne and
David Finney, who all made major contributions to design,
sampling and bioassay.
Yates showed the government how sample surveys were more
efficient and cheaper than demanding data from everyone. In the
1939-45 War he advised on agriculture and food production, as
well as helping the Air Force to improve its effectiveness.
Rothamsted Manor was used by the secret signals and codebreaking unit from Bletchley Park, as one of the locations of Alan
Turing’s early machines.
Yates using Fisher’s
Millionaire calculator
Yates increased the department after 1945, analysing data for
several British agricultural institutes and Government experimental
farms.
He organised an annual Survey of Fertiliser Practice, using punched
cards and Hollerith Tabulators and Sorters. In 1954 the opportunity
arose to acquire an early electronic computer, the Elliott 401.
The first statistical programs were for single analyses, a different
program for each type of designed experiment, regression model,
curve fitting, probit analysis, and simple surveys.
Later programs allowed transformation of variables in
experiments, missing value estimation, multivariate analysis and
cluster analysis.
The programmers included Michael Healy, John Gower, Howard
Simpson and Gavin Ross, under the direction of Frank Yates.
A larger and faster machine, the Ferranti Orion, was installed in
1964, and more general programs were written, combining the
facilities of the smaller individual programs. These included a
General Experiments Program, a General Survey Program, a
Cluster Analysis program and a Maximum Likelihood Program for
non-linear model fitting.
Yates encouraged staff to travel overseas, and to welcome
visiting statisticians, including some from Poland.
Tadeusz Calinski first came in
1963, and returned 50 years
later to see where he had his
office.
In 1968 Yates was succeeded by John Nelder, who had been head of
statistics at the National Vegetable Research Station at
Wellesbourne. Nelder and Mead had visited Rothamsted regularly
to develop the famous Simplex Algorithm for numerical
optimisation.
Nelder planned a new general statistical package, GENSTAT, based
on the analysis of variance algorithm developed by Graham
Wilkinson in Adelaide. Wilkinson joined the department, and the
programming team added many different analyses.
The next computer, the ICL 4-70, allowed the programs to be written
in Fortran, making them available on other machines. Nelder
arranged with the Numerical Algorithms Group, NAG Ltd, to market
Rothamsted software products. These included GENSTAT, RGSP and
MLP.
With the ideas of Robert Wedderburn, Nelder also developed GLIM,
an interactive program for Generalised Linear Models.
John Nelder
John Gower
Rothamsted statisticians were active in many statistical societies
and publications, as officers and journal editors.
Fisher was the first President of the Biometric Society; Yates,
Nelder and Rob Kempton were Presidents in later years. Healy organised the 1962
Biometrics Conference in Cambridge.
Yates was first President of the British Computer Society, and
served as President of the Royal Statistical Society.
Nelder was active in promoting the COMPSTAT conferences under
the International Statistical Institute computing committee.
Gower was active in founding the British Classification Society and
the International Federation of Classification Societies. He was also
secretary of the RSS and promoted the Algorithms Section of
Applied Statistics. Gavin Ross was also President of the British
Classification Society.
John Gower succeeded Nelder in 1984. He continued to
develop new ideas in Multivariate Analysis, including the
use of Biplots, to interpret data groupings and variables
simultaneously. His earlier work on Principal Coordinate
Analysis continues to be much used.
During this period Rosemary Bailey and Donald Preece
were very active in developing ideas on Experimental
Design and Orthogonality.
Gradually the department became less responsible for
analysing everyone’s data, as the age of mainframe
computers passed on to the age of PCs and networks, and
the internet. Scientists used whatever statistical package
they were most familiar with.
John Gower retired in 1990 and was succeeded by Vic Barnett.
Barnett was interested in Environmental Statistics and organised
several SPRUCE conferences, including one at Rothamsted in 1996.
Barnett was active as Treasurer of the Royal Statistical Society and
organised its move to new offices in London.
After 1997 the role of the department diminished, and the Genstat
Team under Roger Payne became a separate commercial
organisation, VSN, in a neighbouring town.
Robin Thompson became Head of Statistics, with an interest in
animal experiments, and the development of REML models with
several error sources.
Andrew Mead is now the Head of Statistics, and there is much
current work on biological modelling using powerful computer
algorithms.
Other statisticians who have been at Rothamsted as staff or visitors:
Wojtek Krzanowski worked with Gower on multivariate methods, and
teaches at Exeter University.
Rob Kempton developed models for species diversity and statistical
genetics, and became head of the Scottish Agricultural Statistics
Service.
Roger Payne developed methods for identification of organisms and
produces a package, GENKEY, for this purpose.
Janet Riley worked for the Overseas section and developed models
for intercropping and aquaculture in developing countries.
Mike Kenward developed methodology for longitudinal models.
Polish visitors have included Stan Mejza, Pavel Krajevsky and Andrej
Zielinski.
Memorial window to Fisher
and Venn at Caius College,
Cambridge
Fisher in 1929