On the common origins of psychology & statistics

Part One: The struggle against subjectivity
Katja de Vries | Supervisor: Dr. Sacha Bem | Leiden University 2006
Illustrations on the front cover:
•
How the human soul sees what the hands feel.
Reproduced from: R. Descartes, L'Homme, Paris, 1664. Retrieved May 23,
2006, from http://gallica.bnf.fr/ark:/12148/bpt6k574850/f153.item
•
Miniature of blindfolded Lady Fortune.
Reproduced from: Augustine, La Cité de Dieu. Manuscript made around 1400-1410, which belongs to the collection of the Dutch Royal Library. Retrieved
May 23, 2006, from: http://www.kb.nl/kb/manuscripts/
Looking at my own long life, I find that the main allurements which led me on and on […]
were preferences. The solutions were accidents.
Karl Popper, A world of propensities, p. 26
However much we may have the impression of an almost uninterrupted movement of European ratio from the Renaissance to our own day, […] – all this quasi-continuity at the level of ideas and themes is doubtless only a surface effect.
Michel Foucault, Les mots et les choses, p. 13-14.
SUMMARY
(a) Structure of this master’s thesis
This thesis consists of two parts. Although these two parts are highly interrelated, they can be read separately. The first part is my master’s thesis in psychology, whereas the second part is my master’s thesis in philosophy.
The subject of the first part is the role of objective (in particular the frequentist variant)
and subjective (in particular the Bayesian variant) statistical inference in, respectively, the
statistical methodology applied in psychology and in psychological theories of cognition.
The subject of the second part is the historical and philosophical intertwinement of the
notions underlying statistical methodology and cognitive psychological theory, namely
probability and rationality. The way in which these notions emerged in the seventeenth
century is part of a conceptual change that entailed the textualization of the world: ‘Nature’
became the ‘Book of Nature’. Considered in the light of this seventeenth-century conceptual change, a new understanding of the ‘subjective’ and ‘objective’ interpretations of probability – as discussed in part one – is gained.
(b) Some thoughts which play a major role throughout the whole thesis.
An attempt is made to think about the relation between psychology and its statistical methodology. With regard to this relationship the question is raised why it is so difficult to think about it without getting trapped in a discourse of either idolatry of the statistical method or a romantic longing for a hermeneutical, non-statistical psychology. The aim of this thesis is to find out how one could speak about statistics and psychology without being pushed into a pro or contra position.
It is hypothesized that the relationship between psychology and statistics can be
understood from a relationship in which it is grounded philosophically and historically: the
relationship between rationality and probability.
The words ‘probability’ and ‘rationality’ seemingly cannot live without each other, nor with each other. Accordingly, the mutual relationship between these two words has, since their very emergence, been rather opaque and subject to shifts in meaning. In order to show the major conceptual changes within the relation between
rationality and probability, two historical periods are examined: (a) the second half of the
seventeenth century (the emergence of probability and the beginning of the period called
classical probability) and (b) the second half of the nineteenth century / the first decades of
the twentieth century (the beginning of modern probability):
(a) Classical probability: In the first half of the seventeenth century Descartes tried to
conquer the prevailing scepticism of his time and gain certain knowledge, whose seat he
placed in the rationally thinking ‘subject’ – however, one could contend that Descartes failed
to conquer sceptical uncertainty and that he even deepened the gap between human
knowledge and the objective world.
The emergence of probability in the second half of the seventeenth century has to be
understood as an answer to this failure of Cartesian rationalism to gain certain knowledge.
Probability – ‘uncertain rationality’ or ‘rationalisme manqué’ – became the ‘calculus of
reason’: it endeavoured to be rational and gain almost certain knowledge by the incorporation
of uncertainty. Probability was the rational guarantee that subjective human knowledge
corresponded to the objective world. This idea was supported by associationist psychology.
However, is ‘uncertain rationality’ rationality at all? After the French Revolution of 1789 the rational aspect of probability became more and more discredited.
(b) Modern probability: From the 1840s on there is a major change in the meaning of
rationality, probability and their relationship: probability is no longer the calculus of reason.
Moreover, probability becomes divided into subjective and objective probability. Classical probability appears in retrospect to be an ambiguous amalgam of subjective as well as objective probability. Objective probability (in particular its frequentist variant), which sees probability as the measure of the relative frequency of occurrence of events, becomes dominant, and it remains so in statistical methodology. In the second half of the nineteenth century subjectivity is
detached from rationality and probability: subjectivity will be above all things the realm of
romantic irrationality. However, subjective probability (in particular its Bayesian variant),
which sees probability as an attribute of our subjective beliefs, will make some quite
unexpected comebacks: the first comeback is between 1901 and 1910 in Cambridge, and the
second comeback begins in the 1960s. This outbreak of the subjective interpretation in the
1960s has had a profound influence on psychological theories of cognition. Bayesianism has
become in psychological theory an important model against which the rationality of human
cognition is measured. There is a historical congruence between eighteenth century
associationist psychology and twentieth century Bayesian cognitive models.
Philosophically, subjective statistical inference – which is used as a model of cognition in psychological theory – and the psychological methodology that consists mainly of objective statistical inference are not easily reconciled, because they entail two completely different
epistemologies. However, this philosophical incommensurability is hardly noticed, because
even objective statistical inferential methodology is blurred by a veil of subjective semantics.
This veil of subjective semantics obscures the relation between psychology and statistics.
CONTENTS
SUMMARY
CONTENTS
CHAPTER 1: WHAT IS THIS THESIS ALL ABOUT?
[1.A] Three questions concerning the relationship between psychology and statistics
[1.B] Method
[1.C] Hypothesis: both psychology and statistics bear the marks of an outdated subjectivism
[1.D] Recapitulation & outlook on the next chapter
PART ONE: STATISTICS AND PSYCHOLOGY. THE STRUGGLE AGAINST SUBJECTIVITY.
CHAPTER 2: FROM AN EAGLE’S POINT OF VIEW - A HISTORICAL & PHILOSOPHICAL OVERVIEW OF THE WORDS ‘PROBABILITY’ AND ‘STATISTICS’
[2.A] Misconceptions about statistics & the necessity of a terminological and historical overview
[2.B] The omnipresence of probability and statistics
[2.C] Making sense of the pile of words: probability, statistics and statistical inference
[2.D] The remarkable lack of probabilistic thinking until the second half of the seventeenth century
[2.E] From seventeenth century ‘probability’ to eighteenth century ‘Statistik’ and nineteenth and twentieth century ‘statistical inference’
[2.F] Recapitulation & outlook on the next chapter
CHAPTER 3: PSYCHOLOGY AND STATISTICS: WITH OR WITHOUT A KNOWING SUBJECT?
[3.A] Probability entangled in subjective semantics and the problem of induction
[3.B] Probability freed from subjective beliefs: rationality without a knowing subject
[3.C] Statistics à la Popper: the natural selection of a falsifying rule for statistical hypotheses
[3.D] Statistical methodology as a re-enactment of evolution: tamed variation and accelerated selection
[3.E] The lure of ‘subjective’ semantics in statistical inference
[3.F] The lure of ‘subjective’ semantics in cognitive psychology
[3.G] Recapitulation & outlook on Part Two
INTERMEZZO: CONCLUSIONS OF PART ONE
PART TWO: THE INTERTWINEMENT OF THE NOTIONS UNDERLYING STATISTICS AND PSYCHOLOGY: PROBABILITY AND RATIONALITY
CHAPTER 4: THE LATENT PERIOD
[4.A] It all begins with Descartes – the rational, representational consciousness
[4.B] The book of Nature, epistemological uncertainty and the equivocation of the Cartesian ‘subject’
[4.C] Associationist psychology or the unproblematic ambivalence in the concept of probability
[4.D] Rationality – between the Principle of Sufficient Reason (Leibniz) and the Principle of Non-sufficient Reason (J. Bernoulli)
CHAPTER 5: THE CLASH WITH ‘NATURE’: THE LOCUS OF RATIONALITY RECONSIDERED
[5.A] The transition from classical to modern probability – the same probability, but in a different way
[5.B] The nineteenth century confrontation with ‘nature’ – an objective interpretation of chance
[5.C] Aversion of statistics and love of absolute chance – Nietzsche’s absolute subjectivism
[5.D] Just a name? The realism of C.S. Peirce and the nominalism of Pearson
CHAPTER 6: WHERE DO I STAND? THE SIGNIFICANCE OF STATISTICS
[6.A] Fechner and Peirce: the Kollektiv as an end in itself
[6.B] The semantics of statistics
[6.C] From “metaphysics” to “prophysics”
REFERENCES
CHAPTER 1: WHAT IS THIS THESIS ALL ABOUT?
What is this thesis all about? In this chapter (A) some rather remarkable features of the
relationship between psychology and statistics are introduced, (B) the method that is used in
this thesis is elucidated, and (C) a hypothesis about the relation between statistics and
psychology is formulated.
[1.A] Three questions concerning the relationship between psychology and
statistics
The relationship between psychology and its statistical methodology is a remarkable one. That may not seem immediately obvious – statistical methodology is the scientifically accepted method in almost every domain, and the psychological research field is no exception: so what’s the problem?
Question I: Is statistics a methodology that has been adopted from elsewhere or is it
intrinsically linked with psychology?
Psychology students usually see statistics as a hardship that must be endured in order to reach the alluring Cockaigne of becoming a scientific psychologist: “For seven years, you know well, he must wade in pig's dung all the way up to the chin, in order that he shall attain the land” (Lucas, 1995). Of course: nobody will deny that statistical methodology is a necessary tool in psychological research. Convincing statistical evidence is not only a sure way to win an academic dispute, but also the fuel that has made psychology produce useful knowledge. However, although a rational scientist will prefer a thesis that is well founded by statistical arguments to a thesis that lacks a statistical foundation, there often remain some uncanny feelings about the eagerness of psychologists to use statistics. At one time or another every psychometric researcher will probably have the gut feeling that his statistical quantification forces the psychological research object – that ‘wonderful complex mind’ – onto a Procrustean bed (Michell, 1999; Thompson, 2001).
Yet statistical methods and psychology seem to have belonged inextricably together from their very beginnings: statistical methods entered psychology in 1860 through the psychophysical research of Gustav Fechner, and by the 1880s the success and acceptance of this statistical approach were settled – while in other social sciences, like economics and sociology, statistical methods entered only thirty or forty years later (Stigler, 1999, p. 189). When asked why psychology has a relatively great amount of statistics in its curriculum in comparison with other social sciences, two answers are frequently heard.
The first answer is that the function of statistical methodology – i.e., in particular the
much used significance testing – in psychology is to ‘cover up’ the lack of theoretical
understanding:
Significance tests are for situations where we do not understand, in any theoretical sense, what is
happening. […] Experimental psychology has become the heartland of significance testing. This fits our
paradigm. We understand, in a deep theoretical way, almost nothing about human psychology. So we do
lots and lots of purely empirical experiments. We design experiments, obtain results, and quote
significance levels. (Hacking, 2001, p. 216)
The second frequently heard answer is that psychology is a relatively young science that wants to prove itself in the scientific arena (e.g. Bem, 2005). In this way the important role of statistics in psychology is explained in an almost Freudian manner as ‘physics envy’ (Rogers, 2002). Viewed in this way, psychology appears to be in fact even more Catholic than the pope: the null hypothesis significance testing that proliferates in psychological research is hardly ever used in physics (Gigerenzer, 1987a, p. 25 and p. 29; Gigerenzer & Murray, 1987, p. 179-180; Gigerenzer et al., 1990, p. 211). Or to put it differently: one can to a certain degree imagine physics without statistics. But can one imagine a psychology ‘uncontaminated’ by statistics? One wavers between the thought of a psychology that is contaminated by a statistical methodology that is originally foreign to it and the thought of a psychology that is intrinsically intertwined with its statistical methods (Rogers, 2002; Thompson, 2001). Well, psychology did not adopt statistical methodology from the exact sciences, but it did adopt something else: whereas the probabilistic revolution in physics led to the rejection of classical physics, the introduction of statistical methods in psychology led, on the contrary, to the adoption of the ideals of classical physics, viz., determinism and objectivity (Gigerenzer, 1987a, p. 22, p. 25 and p. 29; cf. Porter, 2003a; Porter, 2003b).
That probability seemed to imply uncertainty clearly discouraged its use by physicists. From the
standpoint of social science, on the other hand, statistical method was synonymous with quantification,
and while some were skeptical of the appropriateness of mathematics as a tool of sociology, many more
viewed it as the key to exactitude and scientific certainty. Most statistical enthusiasts simply ignored the
dependence of statistical reasoning on probability, and those who acknowledged it generally stressed
the ties between probability and that most ancient and dignified among the exact sciences, astronomy.
(Porter, 1986, p. 10)
So psychology and physics differ not only in their theoretical worldview in relation to probability, but also in their appraisal of statistical inference. The statistical inferences made on the basis of probabilistic axioms by psychologists – as “an indication of the precision of their results” (Heiser, 2003, p. 268) – are eyed warily in the exact sciences and are mostly thought of as a watered-down version of the calculus of probabilities.
Question II: Why is statistical methodology in psychology presented as a monolithic, timeless, unquestionable truth?
The way statistics is presented in psychology (and in other social sciences) raises one’s eyebrows: as an “abstract truth, the monolithic logic of inductive inference” (Gigerenzer et al., 1990, p. 106). I think that the methodological variability in twentieth-century psychology, which is emphasized by Dehue (1990), pales into insignificance in comparison with the overwhelming tendency towards a uniformized methodology:
By 1955, more than 80% of the articles in four leading journals from four different areas of psychology
used significance tests to justify conclusions from the data. […] Today, the figure is between 90 and
100%. (Gigerenzer et al., 1990, p. 206)
Of the continuing debates among statisticians barely any trace can be found in the textbooks
that teach statistics to the students in the social sciences. In their courses on statistics,
psychology students are often taught a seemingly unified theory which actually is a hybrid
variant of several theories, whose concepts are irreconcilable or at least do not blend very well
(Gigerenzer et al., 1990, p. 106).
Therefore it is not very surprising that the conceptual understanding of commonly used statistical procedures is quite weak, as is shown by the answers to a questionnaire that Gigerenzer, Krauss & Vitouch (2004) recently presented to students and teachers in psychology about the meaning of a significant result. Moreover, I do not think that a psychological researcher using SPSS or some other statistical package is very concerned with theoretical issues. Although most psychologists will have a certain idea of what e.g. Popper’s falsificationism (Bem & Jong, 1998) means, they would probably find it rather difficult to say what Popper’s position was in the controversies on statistics and probability, and to appraise the relevance of his propensity theory (Gillies, 2003) for the statistical analysis they are running.
Yet exciting things could be said, for instance, about the relation between Fisher’s Design of Experiments (Fisher, 1951) and Popper’s Logic of Scientific Discovery (Popper, 1972): respectively the practical and the theoretical book that were both published in 1935 and furnished the methodological ground still in use in psychology – both insisted “that a null hypothesis can only be shown implausible, and never be shown plausible” (Gigerenzer et al., 1990, p. 96). However, if psychologists attend any courses in the history and philosophy of
psychology at all, these courses tend to focus on the historical succession of theories of the
mind and practically never draw attention to theoretical and historical questions concerning the statistical method, such as: “What does the ‘Normality’ assumed in so many statistical procedures actually mean, am I a victim of the ‘Myth of Normality’, and what has Fisher to do with it?” (Gigerenzer et al., 1990, p. 114) or “Why does one of the main mechanisms in statistics – regression – have such a gloomy, Darwinistic name?” (Heiser, 1990). In this thesis I will answer these questions and show why it is important to show future psychologists that statistics is a historical phenomenon, laden with theoretical implications.
In the seventies and eighties of the twentieth century there was a real explosion of popular scientific literature (Gribbin, 1985; Monod, 1970; Prigogine & Stengers, 1985) linked with the indeterminism implied in some statistical interpretations in physics and biology from the end of the nineteenth and the beginning of the twentieth century, interpretations which today have become quite mainstream. The question of what probability is or how it has to be interpreted is a deeply philosophical question, related to the question ‘Do we live in an indeterministic world or not?’. However, the probabilistic axioms of the statistics in the methodological textbooks are presented simply as given facts, without much theoretical concern.
Question III: Who is in control: psychological theory or statistical method?
It is not customary in psychology to neglect controversial issues. When somebody asks me whether I can name a leading thinker or theory in some domain of psychology, I almost always have to come up with a nuanced answer like: “Well, X is a proponent of theory A and he is supported by studies K, L and M, but there are also several studies that say the opposite, and a meta-analysis of the main studies in the field has shown the theory of X to be more probable”. So you could say that psychology is a method-driven science: not because there is a lack of psychological theories – on the contrary, there is an abundance of them! – but because the arbiter that judges the value of each psychological theory is in the end the statistical method, which is itself presented as a unity without inner controversies or competing alternative theories. It was the adoption of the methodology of statistical inference that marked the emergence of a unified research practice and paradigm in psychology (Danziger, 1985; 1987, p. 46). One would expect a very deep level of understanding of methodology in a science wherein method takes such a central position. However, statistical methodology is, on a theoretical level, notoriously poorly understood by those who apply it – although the statistical algorithm is correctly applied, an often-heard complaint is that the user of statistics does not exactly grasp what he is doing.
[1.B] Method
Every scientific psychological publication has a methodological section. Because this is a thesis in the philosophy and theory of psychology that deals with the relationship between psychology and its methods, it would be circular reasoning to use the methods commonly used by scientific psychologists. It is therefore obvious that the approach in this thesis will not be statistical.
This methodological section is divided into the following subsections:
I. A philosophical attempt to think about statistical thinking.
II. Statistical discernment, psychological discernment & philosophical discernment.
III. Implicit assumptions and the historical-etymological approach
IV. A final demarcation: a philosophical exploration of the (shared) assumptions on
rationality in cognitive psychology and statistics – no more, no less.
V. Philosophical position
I. A philosophical attempt to think about statistical thinking.
After reading the previous section the reader might have the misapprehension that he will be reading a philosophical critique of psychological research or of philosophy of science, i.e. a critical opinion on what the philosophical understanding of the method of psychology should be. Let me be clear on this: this thesis is not an attempt to point out a theoretical shortcoming in psychological research. After all, from a pragmatic point of view one could say: “So psychologists use a methodology with hardly any historical or theoretical concern about it, and this methodology is artificially presented as a unity – but so what? Psychology students are trained to be psychological researchers, not historians or philosophers. As long as they use their methodology well – why bother?”. And I would agree with this practical point of view: the psychological scientific enterprise can run perfectly on its statistical fuel without any theoretical or historical thought. Of course, one could use historical or theoretical remarks to substantiate a proposal for a technical improvement of a certain statistical method, e.g. to urge that not only the significant results of a significance test be reported, but also its power (Gigerenzer et al., 2004).
However, this thesis is not a methodological critique either, and no ‘alternatives’ to the statistical methodology in use will be presented. In fact, I have tried to avoid technical issues as much as possible, and statistical formulas will hardly be found in this thesis.
Well, is it perhaps a historical or sociological work? No – although I probably feel more affiliated with those historians and sociologists of statistics who approach its history externally, with an emphasis on statistics as a social process (e.g., Theodore Porter), than with those who approach it internally, with an emphasis on an autonomous history of scientific ideas (e.g., Stephen Stigler), the aim of this thesis is not to give a complete historical or sociological account of the position of statistics in psychology (on the distinction between different historical approaches, see: Desrosières, 1993; Rosser Matthews, 2000; Swijtink, 2000).
The long historical detours serve only as a way to create space for a philosophical
attempt to think where no thought seems necessary: it is exactly the almost unquestionable
and unreflected superiority of the statistical methodology in psychology that made me wonder
and moved me to write this thesis. Why is the superiority of statistical methodology so
unquestionable and unreflected? When one tries to say something about statistical
methodology it is easy to get trapped in a discourse of either positivistic idolatry of statistics
or a romantic longing for a hermeneutical, non-statistical psychology. Therefore we have to ask ourselves what makes it so hard to think about statistics without being pushed into a pro or contra position.
I think the reason that makes it so hard to think in a sober, empirical way about
statistics is the fact that statistical methodology is a way of thinking itself: when one tries to
think about statistics there always is a certain ‘duplicity’, because one tries to think about
thinking. And what is more: one tries to think about a way of thought that is so powerful, that
it is hardly possible to withdraw one’s own (philosophical) way of thinking from it. A
philosopher who endeavours to think about statistical thought wavers between Scylla and
Charybdis. If a philosopher says: “Well, I personally find this statistical way of thinking a
rather limited way of thinking, overlooking a lot of qualitative aspects of life”, he himself
overlooks the enormous power of statistical thought that has become omnipresent in our
modern lives. The thought of the philosopher that he can simply ‘reject’ or ‘criticize’
statistical thought will be ‘harmless’ and ‘marginal’, because it did not grasp the impact and
amplitude of this way of thinking. However, if the philosopher does acknowledge the power
and superiority of the statistical way of thinking, he will not be able to resist to the idea that
the best way to think about statistics is therefore through statistical thinking: and he will feel
himself consequently obliged to do statistical research about the statistical way of thought. So
how should one think philosophically about statistical thought?
II. Statistical discernment, psychological discernment & philosophical discernment.
Notwithstanding that one can quite often make some comment on technical flaws in the statistical techniques used, it is anachronistic to deny the overwhelming power of discernment that statistical inferential techniques add to the naked eye. It is easy to forget how, e.g., the analysis of variance has made it quite easy to say whether or not it is highly improbable that a certain treatment has an effect. When the founding father of the analysis of variance, Ronald Aylmer Fisher, arrived in 1919 at the Rothamsted Agricultural Experimental Station, it seemed impossible to infer from the data gathered during 90 years of experiments whether the fertilizer used had some effect on the crop yields or not. The variation measured seemed to be inconsistent and equivocal, and it was unclear which effects had to be ascribed to the fertilizer and which to other factors such as rainfall, the presence of weeds and soil type (Fisher Box, 1978; MacKenzie, 1981; Salsburg, 2001). In the decade that followed Fisher’s arrival at Rothamsted he solved this problem, and the analysis of variance was born (Fisher, 1921, 1924). Thanks to the analysis of variance it is today quite easy to discern – even when to the naked eye the data seem an inextricable tangle of variation – whether or not it is improbable that the variation in a given sample is due to chance; in this way it is possible to partition the variation among various factors and to infer whether a certain treatment produces an effect. As such, modern statistical methodology is a new way to discern, reason and know.
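To make this concrete, here is a minimal sketch of a one-way analysis of variance in Python (using the scipy library; the crop-yield numbers below are invented purely for illustration and are not Rothamsted data):

from scipy.stats import f_oneway

# Invented crop yields for plots with and without fertilizer.
fertilized   = [29.1, 31.4, 30.2, 32.0, 28.8]
unfertilized = [26.5, 27.9, 25.8, 28.3, 27.1]

# f_oneway partitions the total variation into between-group and
# within-group components and returns the F statistic and the p-value.
f_stat, p_value = f_oneway(fertilized, unfertilized)

# A small p-value means that it is improbable that the variation
# between the two groups is due to chance alone.
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")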
Yet it stands in a long tradition of Western thought concerning the question of how to discern order amidst chaotic variation. Plato stands at the beginning of this long metaphysical tradition that distinguishes the visible, changeable world from the eternal intelligible world of ideas. What is it that makes us recognize a horse as a horse – despite the differences that exist among particular horses? “It is the idea of a horse, the ‘horseness’, that enables us to recognize a horse as a horse”, Plato answers.
But the ability of a modern scientist to discern horses goes a lot further than that of Plato: e.g. the biostatistician is perfectly able to perform an analysis of variance on the expression level of a certain gene (indicated by the intensity of the fluorescence signals on a DNA microarray) in Equus caballus (the domesticated horse) and Equus ferus przewalskii (the wild horse) and conclude that there is a significant difference between them; and a psychologist can perform an analysis of variance to discern whether laypeople are able to discern between an ordinary Equus caballus and an Equus ferus przewalskii, or whether their ability to discern these horses is affected by alcohol consumption.
Both the biostatistician and the psychologist use the same statistical methodology to discern
between structural and accidental variation; yet in psychological research the word ‘discernment’ relates not only to the research method, but also to the research object.
Moreover, it is evident that the statistical methodology in psychology has had repercussions
on the way the psychological research object – human, non-scientific cognition – is
understood: with the help of scientific statistical methods the psychologist may try to gain insight into the workings of the ‘intuitive statistician’ (Gigerenzer & Murray, 1987). Philosophically, the relation between ‘scientific’ and ‘intuitive’ statistical discernment in psychology raises the question of how these two relate, what criterion demarcates them, and how one should approach the circularity entailed by ‘discerning about discernment’ or ‘thinking about thinking’.
These philosophical questions will of course be of no great concern to the daily practice of a scientific cognitive psychologist. The only thing that matters to an experimental psychologist, as to every other scientist applying statistical methods, is the fact that their experimental data (i.e. on genetic differences, on crop yields, or on the ability to discern horses under different circumstances) are a lot more meaningful when subjected to statistical analysis.
However, from a philosophical standpoint it is of great interest that cognitive psychology as well as statistical methodology have implicit assumptions about ‘discerning’, ‘knowing’, ‘thinking’ or ‘cognition’: assumptions that are implied in our understanding of the words ‘rationality’ and ‘probability’. Since both statistical thought and cognitive psychology rely on certain assumptions concerning ‘probability’ and ‘rationality’, the comparison of these shared assumptions may provide a way to think about thinking. Whereas it is hard to find a starting point for thinking philosophically about statistical thinking, it might be possible to think (philosophically) about how the underlying assumptions of the scientifically accepted way of thinking (statistical thought) relate to the underlying assumptions of how psychology thinks that we think (the psychological view of human cognition): i.e., to ‘compare’ the role of the notions ‘probability’ and ‘rationality’ in statistical thought and psychological thought. Are these assumptions the same for cognitive psychology and for statistical methodology? Or do they differ?
III. Implicit assumptions and the historical-etymological approach
In every course on statistics students will be told that they may use certain statistical methods
only when certain assumptions – such as e.g. that the data were drawn from ‘a normally
distributed population’ or from two populations with the same spread of scores – are met.
These assumptions do not seem to concern the question of what the statistical method assumes about how human thought has access to, and knowledge of, its surrounding world – or do they?
Implicit assumptions are assumptions that seem so self-evident that I think the only way to bring them to light is by a historical or semantic-etymological approach: after all, when it can be shown that certain assumptions have a historical origin, this may shed a different light on their presumed self-evident nature and it may become possible to see their philosophical meaning.
I therefore endorse the philosophical approach of Hacking, when he says:
There is an anti-positivist model which, for all its obscurity, may […] have its appeal. We should
perhaps imagine that concepts are less subject to our decisions than a positivist would think, and that
they play out their lives in, as it were, a space of their own. […] In the past 300 years there have been
plenty of theories about probability, but anyone who stands back from the history sees the same cycle of
theories reasserting itself again and again. (Hacking, 1975, p. 15)
IV. A final demarcation: a philosophical exploration of the (shared) assumptions on
rationality in cognitive psychology and statistics – no more, no less.
Of course, one cannot expect to find in this thesis a complete account of the history of statistics, nor an exhaustive treatment of all philosophical aspects, nor an extensive linguistic study of the etymological development of words related to probability and statistical methodology.
Moreover, I will restrict myself to ‘orthodox’ cognitive psychology for the sake of clarity. I will not make any digressions into areas such as social psychology or clinical psychology.
I will also refrain from digressions on anti-statistical, hermeneutical movements in psychology, because, except for the Skinnerian movement, they have had scarcely any effect on scientific practice. Nor will I go into measurement-theoretical issues that question the possibility of measuring psychological concepts (see e.g. Michell, 1999): I think the question whether abstract concepts (such as “attention” or “commitment”) can be expressed in numbers is a question that every scientist always has to ask himself – but in the end this is a practical question that can be restated as: “Is my measurement level yielding useful and sound results?”
All the historical, etymological and philosophical deliberations to be found in this thesis have only one aim: to awaken us from our thoughtlessness about the seemingly self-evident assumptions that may be hidden in statistical methodology and cognitive psychological science concerning the relation of human thought to ‘the-world-as-it-is’, and to provoke a philosophical wondering about them instead.
The focus of attention will constantly shift between cognitive psychology and statistics. Each time the leading question will be on which points the assumptions on rationality and probability made in them are the same and on which points they diverge.
V. Philosophical position
Thoughts do not appear out of the blue. Nor do the thoughts in this thesis. There are several thinkers whom I would like to name in particular, because it will give the reader of this thesis an idea of what he might expect. I owe my vision of falsification, the impossibility of induction and the objectivity of probability to Karl Popper; from Ian Hacking I came to know the Janus-faced character of probability; I am indebted to Gerd Gigerenzer for the idea of how the statistical method has developed into a metaphor in psychological theories of the mind; Lorraine Daston made me see the link between classical probability and associationism; Martin Heidegger has been decisive in shaping my thoughts on the role of the ‘subject’ in Western thought.
[1.C] Hypothesis: both psychology and statistics bear the marks of an outdated
subjectivism
The statistical research practice in psychology is a success (cf. Cowles, 1989). If a scientist
wants to do psychological research, statistical methodology is practically a sine qua non. No
other methodology can compete with the success of statistics.
Yet, the level of conceptual understanding of statistics among psychologists seems to
be rather low. How can this be?
In part one of this thesis I will defend the hypothesis that the many conceptual misunderstandings about statistics are rooted in the tendency among psychologists to interpret probability (whose assumptions underlie the statistical inferential methodology) subjectively. In order to substantiate this hypothesis, the role of subjective (in particular Bayesian) and objective (in particular frequentist) statistical inference in statistical methodology and psychological cognitive theories will be studied in detail. Moreover, it will be shown how many contemporary cognitive theories are built upon the idea that subjective probability can be seen as a standard against which one can measure human rationality. I will argue that from a philosophical point of view the subjective and objective interpretations cannot be reconciled, because they endorse two different epistemological positions: respectively, classical epistemology and evolutionary epistemology.
Yet – although the frequentist statistical methodology and its underlying evolutionary epistemology show that rationality and probability have nothing to do with my personal, Cartesian subjective beliefs – it is my thought and my words that are affected by them. Statistics and probability affect how I discern and how I know. How can I say how probability and statistics concern me – without relapsing into Cartesian, subjective talk?
This will be the central question of part two. The hypothesis that will be proposed in order to answer this question is that there was a seventeenth-century conceptual turn which entailed the textualization of the world (cf. Hacking, 1975): ‘nature’ became the ‘book of nature’. As Galileo famously wrote in The Assayer in 1623:
Philosophy is written in this grand book - the universe - which stands continuously open to our gaze.
But the book cannot be understood unless one first learns to comprehend the language and interpret the
characters in which it is written. It is written in the language of mathematics, […]. (Galileo, 1957, p.
237)
Subsequently I will argue that within this seventeenth-century textualization of the world the notions of rationality and probability emerged in such a way that they entailed the latent beginnings of both psychology and statistics. This latent period will last until the nineteenth century, in which the notions of rationality and probability will change in a radical way. This change will lead to the nineteenth-century emergence of psychology and its statistical inferential methodology as we know them now.
To substantiate this hypothesis I will focus on:
(1) how the predecessors of psychology (viz., the seventeenth century Cartesian
rational subject and eighteenth century associationist psychology) were entangled
with the predecessors of modern statistical inferential methodology (viz., seventeenth
and eighteenth century probability theory).
(2) how the nineteenth-century change of the notions ‘probability’ and ‘rationality’ on the one hand made it possible for psychology and frequentist statistical methodology – both useful and fruitful disciplines – to emerge, but on the other hand entailed rather unpleasant epistemological consequences, which led to hermeneutical, i.e. anti-statistical and anti-psychological, reactions in nineteenth-century thought.
I will try to interpret these two periods, i.e. the classical probability of the seventeenth and eighteenth centuries and the modern probability of the nineteenth century up to our present day, in the light of their origin: the textualization of the world. This may put the question of how probability and statistics concern me in a completely different light.
[1.D] Recapitulation & outlook on the next chapter
Recapitulation of chapter 1:
The relation between psychology and its statistical methodology, i.e., statistical inference, has some remarkable features: (a) statistical methodology is intrinsically linked to psychology but at the same time seems to be a ‘contamination’ from the exact sciences; (b) statistical methodology is presented as a timeless, monolithic truth – which it is not! – and as such it unifies psychological research that itself lacks a coordinating global theory; (c) one would expect a very deep level of understanding of methodology in a science wherein method takes such a central position: however, statistical methodology is on a theoretical level notoriously poorly understood by those who apply it. It is assumed that these aspects characterizing the relation between statistics and psychology can be thought through on a philosophical level if their relation to the underlying notions ‘rationality’ and ‘probability’ can be clarified. The meaning of these notions and their relationship in two historical periods (seventeenth/eighteenth-century classical probability and nineteenth/twentieth-century modern probability) will be explored.
Outlook on chapter 2:
A historical and terminological overview of the words ‘probability’ and ‘statistics’ is presented. There is a remarkable lack of probabilistic thinking until the second half of the seventeenth century. It will take another two centuries before the probabilistic calculus leads to the development of inferential statistics: until the end of the nineteenth century practically no inferential statistics existed, only descriptive statistics. Although there is great consensus about the probabilistic calculus, on a theoretical level it is unclear whether probability has to be interpreted as a subjective or an objective phenomenon. This debate also had its repercussions on inferential statistical methodology, which is based on probabilistic assumptions.
PART ONE: Statistics and Psychology. The struggle against subjectivity.
CHAPTER 2: FROM AN EAGLE’S POINT OF VIEW - A HISTORICAL &
PHILOSOPHICAL OVERVIEW OF THE WORDS ‘PROBABILITY’ AND
‘STATISTICS’
In our exploration of the relation between psychology and statistics we will put ‘psychology’ aside for a moment (I will return to psychology in the next chapter, i.e., chapter 3) and concentrate first on the word ‘statistics’. Stop reading for a moment and ask yourself: “What is statistics?”.
To grasp the drift of the word ‘statistics’ is not an easy matter. In order to understand its philosophical impact it will be necessary to clarify its relation to the ‘underlying’ notions of ‘probability’ and ‘rationality’. In this chapter I will aim to show the relation between the word ‘statistics’ and the word ‘probability’.
The clarification of the relation of the words ‘statistics’ and ‘probability’ to the word ‘rationality’ will be largely postponed to chapters 4 and 5. I will only touch upon the relation between ‘probability’ and ‘rationality’ very superficially in sections (D) and (E) of this chapter.
The elucidation of the relation between the words ‘statistics’ and ‘probability’ will take some
long historical and etymological detours.
In the first two sections of this chapter I will explain why it is necessary to approach these words in such a roundabout way: in section (A) I will present some misconceptions about statistics and in section (B) I will show the omnipresence of probability and statistics. In section (C) a terminological overview of the words ‘probability’, ‘statistics’ and ‘statistical inference’ is presented.
The last two sections have a historical character. Section (D) goes back to the very beginnings
of probability and its emergence in the second half of the seventeenth century. The last
section, viz., section (E) shows how seventeenth century ‘probability’ developed into
eighteenth century ‘Statistik’ and nineteenth and twentieth century ‘statistical inference’. A
central theme in this whole chapter will be the difference between the subjective and objective
interpretation of probability.
[2.A] Misconceptions about statistics & the necessity of a terminological and
historical overview.
It is stunning how many misconceptions exist about statistics. One of the most popular misconceptions among laypeople is that statistical methodology is just quantification and counting. Another well-known fallacy is the idea that everything in the world is normally distributed – a distortion of the central limit theorem. And even when people have followed some courses in statistical methodology and are perfectly capable of performing certain statistical operations (calculating a standard deviation, performing a t-test, etc.), they will often have only a hazy idea of the meaning of these statistical operations: especially since the introduction of user-friendly statistical computer packages, it has become much easier to perform a regression analysis than to grasp its theoretical meaning.
Yes, statistical misconceptions abound – but in general they are misconceptions with obvious historical roots: after all, there was a time when statistics was mostly ‘just counting’, because there were no generally accepted rules for statistical inference yet; there was a time when scientists believed that the normal distribution was an almost divine law with which everything in this world had to comply; and there was a time when the majority of statisticians had a completely wrong understanding of a statistical phenomenon like ‘regression’. I believe that the abundance of statistical fallacies that have prevailed and still prevail is not a consequence of the stupidity of psychology students, but has rather to be seen as a sign that statistical thinking is in a sense ‘unnatural’ and counter-intuitive.
Therefore I think it is necessary – prior to all philosophical thinking about statistical methodology – to gain a proper understanding of some basic terms in statistics and of their mutual relationships, e.g.: “How do statistics and probability relate to each other?”, “What is inferred from what in statistical inference?”, “What is the status of the Normal Distribution?”, etc.
This chapter aims to clarify some of the often misunderstood issues in statistics and presents a guideline with regard to the terms and names to which I will often refer henceforth in this thesis. Because one can point out historical reasons for the majority of misconceptions concerning statistics, this orientating overview will have a historical emphasis.
[2.B] The omnipresence of probability and statistics.
I mentioned already in chapter 1 that statistical methodology thrives in almost every science, and in psychology maybe even more than in most other sciences. But it is good to realize that statistics is not confined to scientific methodology. Probability and statistics are everywhere. The prominent philosopher of probability Ian Hacking (2004, p. 4) states that there are “more explicit statements of probability presented on American prime time television than explicit acts of violence (I’m counting the ads)”. You just have to turn the television on to see this: weather forecasters tell how much chance there is that it will rain tomorrow, advertisers claim that their detergent is 83% more effective than other cleaning products, and stern-looking scientists estimate the probabilities of bird flu pandemics, global greenhouse effects and cancers. The ‘imperialism of probabilities’ (Hacking, 2004, p. 5) is not confined to the television screen. We all have insurance against all sorts of risk, and insurance companies are the employers of many a statistician. The making of reasonable decisions has become equivalent to probability-based decision-making: “No public decision, no risk analysis, no environmental impact, no military strategy can be conducted without decision theory couched in terms of probabilities” (Hacking, 2004, p. 4). Before stepping into our car after drinking three glasses of wine, we consider our odds of getting a fine or causing an accident. We weigh ourselves and hope that our weight does not exceed the norms of the Quetelet index. We take multiple-choice tests and our answers are statistically analysed. We take IQ tests when applying for a job. We observe anxiously whether our children’s abilities and characteristics deviate from the average. In short: our lives are drenched in probabilities and statistics.
[2.C] Making sense of the pile of words: probability, statistics and statistical inference.
Probability and statistics are words that are often mixed up. The situation becomes even more complex because the word ‘statistics’ actually refers to two different phenomena: descriptive statistics, which in principle are unrelated to probability, and inferential statistics – i.e., the scientific methodology that makes it possible to draw general conclusions from a limited number of observations and which is ‘built’ for a large part on probabilistic axioms.
In the following sections I will clarify:
(I) Probability: what ‘meaning’ or ‘interpretation’ can be given to probability, viz., subjective and objective interpretations;
(II) Statistics: descriptive statistics and inferential statistics: why the emergence of statistical inferential methods was a great scientific breakthrough and how the lack of reliable inferential statistical methods sometimes had disastrous consequences;
(III) Inferential statistics built on probabilistic axioms: how the different interpretations of probability lead to different forms of statistical inference, viz., Bayesian inference and frequentist inference.
I. Probability
When we throw a regular-looking die, what is the probability of getting a 5? The calculation to answer this question is very straightforward. We count the number of outcomes that constitute the event and divide this by the total number of possible outcomes, in order to obtain a number between 0 and 1: the less likely an event is to occur, the closer its probability will be to 0. So the probability of getting a 5 is quite obviously 1/6. But what does this number 1/6 mean; i.e. what reason makes it seem so obvious that the probability is 1/6?
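Stated as a worked equation, the counting rule just described reads:

\[
P(\text{a 5 turns up}) \;=\; \frac{\text{number of outcomes showing a 5}}{\text{total number of possible outcomes}} \;=\; \frac{1}{6} \approx 0.167
\]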
This seems to be an easy question. Yet it has exercised many of the great minds of the twentieth century, and the discussions on this still ‘unsolved’ issue rage on today. The literature on the interpretation of probability is overwhelmingly multitudinous and scattered. Still, roughly speaking, one can distinguish two camps, whose adherents are traditionally called the subjectivists and the objectivists (Gillies, 2003). Both of these camps are subdivided into several subcamps and conglomerates of different camps, which may even encompass certain combinations of subjective as well as objective interpretations: sometimes it seems as if everybody presents his own particular interpretation of probability.
However, the distinction between subjective and objective probabilities reflects a major distinction within the possible interpretations that can be given to probability, namely: is probability a ‘subjective’, epistemological phenomenon that has its roots in yourself (e.g., in your logic or your perception), or is it an ‘objective’ phenomenon that really exists – ontologically – in the ‘outer world’? If you say that a die has a probability of 1/6 of showing a certain face, is the word probability then referring to your knowledge or to some really existing ‘attribute’ of the die? The subdivisions within the subjective and objective interpretations are endless. To avoid going astray in the endless variants of interpretations of probability, I will elucidate here only one particular variant of the subjective interpretation – the so-called personalistic interpretation – and one particular variant of the objective interpretation – the so-called frequentist interpretation (Gillies, 2003); on the basis of these two variants the difference between the objectivist and subjectivist interpretations in general can be clarified on a very basic level, which will provide a sufficient understanding for the moment:
(a) The subjective-personalistic interpretation says that probability is only an
expression of ignorance of real causes – probability in itself does not exist. Due to your
ignorance you have to guess. This guess is a subjective belief: when you for instance look at a
die, you assign a certain probability (i.e., your subjective belief) to the occurrence of the five
turning up. What is the best guess you can make? There can be plenty of little causes that
influence the outcome where you have no knowledge of (e.g., the fairness of the die, the
sweatiness of your hand, the subtle north-easterly wind, the smoothness of the surface of the
table and the movements of the earth), but exactly because you have no knowledge of them
you do not have any reason to assume that any of the six possibility outcomes has a differing
probability. Thus you assume that the six faces of the die have an equal probability of turning
up. However, when you would notice after throwing several times with the die that the face
with the number 5 on it keeps turning up much more often than you expected at first, you
could adjust your subjectively assigned prior probability of 1/6 to a higher probability –
“Probably this die is biased”, you conclude. So, probability is in this interpretation a matter of
purely personal degree of belief – Popper (1972, p. 148) calls this subjective interpretation
therefore psychologistic – in a situation of epistemological uncertainty. You have a lack of
knowledge and therefore you have to rely on probabilistic beliefs; when you would have no
lack of knowledge there would be no probabilities either. Implicitly this view endorses
therefore a deterministic worldview, because if you would have no epistemic uncertainty and
knew exactly all the conditions influencing your die throw, you would be able to make an exact prediction instead of a probabilistic one. So, an adherent of the subjective interpretation of probability would say that when you throw a die, there is no chance involved, but just a lack of knowledge.
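To make the adjustment of such a subjectively assigned prior concrete, the following minimal sketch shows how a personal degree of belief in “this die is biased towards 5” could be updated with every throw by means of Bayes’ rule. The two hypotheses and the 50/50 prior between them are purely hypothetical, illustrative choices, not part of any of the cited authors’ accounts:

```python
# A minimal sketch with two hypothetical hypotheses about the die
# ('fair': P(5) = 1/6, 'biased': P(5) = 1/3) and an arbitrary 50/50 prior.

def update_belief(prior_biased, rolls, p5_biased=1/3, p5_fair=1/6):
    """Return the posterior degree of belief that the die is biased,
    after observing `rolls` (True = the face 5 turned up)."""
    belief = prior_biased
    for five_turned_up in rolls:
        # likelihood of this single throw under each hypothesis
        like_biased = p5_biased if five_turned_up else 1 - p5_biased
        like_fair = p5_fair if five_turned_up else 1 - p5_fair
        # Bayes' rule: posterior is proportional to prior times likelihood
        numerator = belief * like_biased
        belief = numerator / (numerator + (1 - belief) * like_fair)
    return belief

# the five keeps turning up much more often than a fair die would suggest
rolls = [True, False, True, True, False, True, True, False, True, True]
print(update_belief(0.5, rolls))  # roughly 0.98: "probably this die is biased"
```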
This creates a big problem for the subjective interpretation of probability: for if it is true that probability is just a personal belief, then it is to a certain extent (of course there may be some practical reasons for choosing a certain belief) an arbitrary belief, or at least a mere construction. And logically, too, it seems strange to create knowledge from a lack of knowledge. So, is there really any logic in assuming that probability is purely subjective? Is probability just a palliative for your lacking knowledge, or is there an objective situation (the shape of the die, the surface of the table on which it falls, the gravitational forces) that ‘produces’ a certain frequency of the number 5 turning up when a repetition of throws is made? Does it not make sense as well that the probability of 1/6 is something that exists independently of our beliefs? If we were throwing a die in a black box where it cannot be seen, there would probably still turn out to be a certain stable frequency with which the face with the number 5 on it turns up, would there not?
(b) The objective-frequentist interpretation sees probability as a phenomenon that can be scientifically measured and established: probability is “a statement about the relative frequency with which an event of a certain kind occurs within a sequence of occurrences” (Popper, 1972, p. 149). So within the objective-frequentist interpretation probability is not just a belief resulting from the lack of knowledge of a “specific individual” concerning “particular events” (Gillies, 2003, p. 89); probability is instead a stable frequency of occurrence that manifests itself in a long sequence of repetitions. This fact – that probability is a stable frequency which manifests itself in a long sequence of repetitions in a measurable way – makes it scientific. Scientifically there is no necessity to pinpoint exactly the ‘meaning’, ‘cause’ or ‘reason’ of this frequency of occurrence; the observation that this probability ‘manifests’ itself and has useful applications is, scientifically speaking, sufficient. Probability is seen as an ontological ‘reality’ – independent of our knowledge – because it can be measured objectively. Von Mises (1883-1953), one of the first great adherents of the frequency interpretation of probability, compares probability with ‘length’: although it is also difficult to say philosophically what ‘length’ is, nobody will reproach a surveyor measuring land – a very useful and practical job – with merely measuring a figment of his imagination (Mises, 1936, p. 36).
“Denn in der Anwendbarkeit einer Theorie auf die Wirklichkeit sehe ich den wesentlichsten, wenn nicht einzigen Prüfstein ihres Wertes”. (Mises, 1936, p. 34) [For in the applicability of a theory to reality I see the most essential, if not the only, touchstone of its value.]
And the usefulness of probability is something that can hardly be doubted. The life-insurance company that gathers a great amount of data about the mortality of 41-year-old German men can calculate the probability of mortality within this collective and do well out of it: it seems ridiculous to say that it is just measuring its ‘belief’ about the number of 41-year-old German men dying. As the French mathematician Poincaré (1854-1912) asked himself: “How could insurance companies make regular profits if there is no objective reality corresponding to their probability calculations?” (Gillies, 2003, p. 86). Insurance companies live on the so-called ‘law of large numbers’: the plain fact that if you make many independent observations, then the average of the sample is close to the average of the population and therefore stable and predictable (Moore & McCabe, 1999). So frequencies are real and it is science which can observe them: “In frequency theory, probabilities are associated with collections of events or other elements and are considered to be objective and independent of the individual who estimates them, just as the masses of bodies in mechanics are independent of the person who measures them” (Gillies, 2003, p. 89).
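The stabilization that the ‘law of large numbers’ describes can be shown in a minimal simulation sketch; the sample sizes, the fixed random seed, and the use of throws of a simulated die instead of mortality data are purely illustrative choices:

```python
# A purely illustrative simulation: the relative frequency of the face 5
# stabilizes around 1/6 as the number of throws grows.
import random

random.seed(0)  # fixed seed, so that the sketch is reproducible

for n in (10, 100, 10_000, 1_000_000):
    fives = sum(1 for _ in range(n) if random.randint(1, 6) == 5)
    print(f"{n:>9} throws: relative frequency of the five = {fives / n:.4f}")
# the printed frequencies drift towards 1/6 (about 0.1667) as n grows
```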
One of the objections to the frequency interpretation is that a finite empirical collection (e.g., a sample of 41-year-old German men) is represented in the mathematical theory by an infinite mathematical collective (Gillies, 2003, p. 90), because “probability values are defined as the limit of an infinite sequence” (Cowles, 1989, p. 58). One could answer this objection by pointing out that this is something that occurs everywhere in physics, and that the obtained frequency ratio can be used as a hypothesis for the true value of the infinite collective, a hypothesis which can then be tested (Cowles, 1989, p. 58). Some people also object to the frequency theory that probability as a long-run frequency seems quite a ‘mystical’ property. Objectivist probabilists have come up with several explanations of why there is nothing ‘mystical’ about the statement that probability is a ‘frequency’ or an ‘attribute’ that arises in certain ‘chance set-ups’. However, these objectivist explanations are quite technical and therefore I will not go further into these matters. For reasonably comprehensible and convincing accounts I refer to Hacking (1976) and Popper (see e.g. Popper, 1972; Popper, 1983).
Probability seems to be shifting between the objective and the subjective interpretation – it is ‘Janus-faced’ (Hacking, 1975). However, most scientists who use probability could not care
less what interpretation is given to it, or they might even see the philosophical debate as an irritating obstacle for practitioners of probability (Robert, 2001). It is especially since the Russian mathematician Kolmogorov published, in 1933, in his famous book Grundbegriffe der Wahrscheinlichkeitsrechnung, an arithmetical axiomatic model for probability theory, that most practising scientists lost all interest in the philosophical meaning of probability. The axiomatic model of Kolmogorov is a formal system – the axioms have no fundament in the ‘real world’ and do not answer the question how to interpret probability on a philosophical level (Hacking, 1976) – but on a formal or mathematical level this nearly universally accepted system has settled the majority of disputes:
Following [Kolmogorov’s Grundbegriffe der Wahrscheinlichkeitsrechnung], a mathematician would
answer the question of what is probability by saying: Anything that satisfies the axioms. Expressed in
technical jargon, probability is a normalized denumerably additive measure defined over a σ-algebra of
subsets of an abstract space. Something is lost in the answer, however. For if the space is finite, the
answer shrinks down to saying: Probabilities are numbers between 0 and 1 such that if two events
cannot occur simultaneously, the probability of either one of them occurring is the sum of the
probability of the first and the probability of the second. The mathematician’s formalistic approach to
the question does not address the meaning of probability […]. (von Plato, 1995, pp. 1-2)
Figure 1. A. N. Kolmogorov
So, although the formalistic system of Kolmogorov settles the majority of disputes on a mathematical level, it does not address the fundamental philosophical questions about the meaning of probability, and even less about the meaning of chance.
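The finite case that von Plato describes can be illustrated with a minimal sketch; representing events as Python sets over the sample space of a die is an illustrative choice of mine, not part of Kolmogorov’s formalism:

```python
# A minimal sketch of the finite case: probabilities are numbers in [0, 1],
# and for two events that cannot occur simultaneously the probability of
# 'either one' is the sum of their probabilities.
from fractions import Fraction

space = {1, 2, 3, 4, 5, 6}  # sample space of one throw of a fair die

def P(event):
    """Uniform probability measure on the finite sample space."""
    return Fraction(len(event), len(space))

A = {5}        # 'the five turns up'
B = {2, 4, 6}  # 'an even number turns up' -- cannot occur together with A
assert A & B == set()           # the two events are mutually exclusive
assert P(A | B) == P(A) + P(B)  # additivity for mutually exclusive events
assert P(space) == 1            # the whole space has probability 1
print(P(A), P(B), P(A | B))     # prints: 1/6 1/2 2/3
```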
II. Statistics: descriptive statistics and inferential statistics
Until this moment we have spoken about probability and how to interpret it, not about statistics. In day-to-day language the word ‘statistics’ is used in such a way that two different but interrelated meanings of the word are mixed up.
The first sense in which the word ‘statistics’ is used is for tabulated numerical data relating to aggregates of individuals; e.g., when the results of an examination are published, the results are often neatly categorized in a table, so that one can see how good or how bad an obtained mark is in comparison with the results of the other students. It would be quite normal if a student in such a situation said: “Although I did not get a first grade, but only an upper second, the statistics tell me that I still belong to the 15% highest scores”. This kind of statistics is only descriptive, for it does not draw any conclusions about, for instance, a whole population of students.
The second sense in which the word ‘statistics’ is used is when one actually means ‘inferential statistics’. Inferential statistics is a form of ‘inferential’ or ‘inductive’ reasoning, grounded on probabilistic axioms. In inductive reasoning general conclusions are drawn – or ‘inferred’ – from a limited number of observations; e.g., after observing ten black rabbits you conclude that all rabbits are black. It is not at all obvious that induction is a valid way of reasoning. Is it not quite illogical to draw from a particular observation (“the ten rabbits I observed were black”) a general conclusion (“all rabbits are black”)? This general conclusion was not implied in the particular observation and seems to appear out of nothing: like the proverbial rabbit out of the hat. This ‘problem of induction’ was first explicitly described by Hume (1711-1776):
Any degree, therefore, of regularity in our perception, can never be a foundation for us to infer a greater
degree of regularity in some objects which were not perceived, since this supposes a contradiction, viz.,
a habit acquired by what was never present to the mind. (Hume, 2002/1739, book 1, part 4, section 2, p.
131)
The problem of induction is still bothering philosophers; in particular analytic philosophers “still drive themselves up the wall (to put it mildly) when they think about it seriously” (Hacking, 2001, p. 190).
Figure 2. David Hume
Statistical inference is a form of inductive reasoning that “presents itself as a mathematical solution to the problem of induction” (Cowles, 1989, p. 27). In statistical inference the inductive problem is replaced – or at least evaded (Hacking, 2001, p. 252 and 261 ff.) – by the question whether certain probabilistic axioms that would validate the inference, e.g. normality and independence of the observations, are rightly assumed. These probabilistic axioms assume that in the long run there is constancy in time: the number 5 on a fair die will have a probability of 1/6 now, as well as in 2012 and in 2096, when you throw it a large number of times, e.g. 10,000 times; and given the same circumstances or ‘chance set-up’ the average probability of dying when you are a 41-year-old German will be the same in 2012 as it is now, provided that only a great number of 41-year-old Germans is taken into account.
Still, one cannot say that an answer has been found to Hume’s problem of induction: only the problem itself has become more or less irrelevant, because of a change in what we think ‘reason’ is (this is one of the major themes of this thesis, so I will discuss it more extensively later):
Anyone who tries to argue that the future will be like the past, on the ground that past futures have been
like the pasts, is arguing in a circle. […] It is not, therefore, reason which is the guide of life, but
custom. That alone determines the mind in all instances to suppose the future conformable to the past.
However easy this step may seem, reason would never, to all eternity, be able to make it. […]
We can do more with probability than Hume imagined. Probability theory was just beginning in
Hume’s day. (Hacking, 2001, p. 251)
So, statistical inference evades the problem of induction – but what is statistical inference? Statistical inference is a form of inductive reasoning that “may be defined as the use of the methods based on the rules of chance to draw conclusions from quantitative data”
(Cowles, 1989, p. 29). These rather abstract and dry formulations of what statistical inference means conceal the revolutionary amplification of research successes that the emergence of statistical inference has brought about. Statistical inference as a scientific methodology emerged only timidly in the second half of the nineteenth century, and it started efflorescing as late as the beginning of the twentieth century (see e.g. Gigerenzer et al., 1990).
Before statistical inference emerged there was a major problem: although lots of data were gathered, the tools to draw inferences from these data were almost completely lacking. A bitter example of this lack of inferential methodology is the outburst of blood-letting in France between 1815 and 1835, which led to many unnecessarily lost lives. Never in history was blood-letting so widely employed as then. And although the blood-letting mania was embedded in a discussion wherein both the major advocate of blood-letting, doctor Broussais, and his adversaries threw impressive-looking statistics at each other, the abundant statistics did not lead to any substantive conclusions. “There were, then, statistics galore, but few conclusive statistical inferences. They were tools of rhetoric, not science. For all the enthusiasm for numbers, they did not have the immediate effect that one would have expected” (Hacking, 2004, p. 85).
Figure 3. F.-J.-V. Broussais
III. Inferential statistics built on probabilistic axioms
Scientific statistical inference is able to infer from a limited number of observations something that is in principle unobservable – based on probabilistic axioms. Because these probabilistic axioms are needed to infer conclusions from otherwise merely descriptive statistics, the distinction between the subjective and the objective interpretation of probability can also be felt in inferential statistics: there is inference based on subjective probabilistic
axioms – called Bayesian statistical inference – and there is inference based on objective probabilistic axioms – called frequentist statistical inference. Although the formal probability rules of the axiomatic system of Kolmogorov hold for all types of probability, the system can be interpreted in a subjective way (then it is Bayes’ rule that forms the base of subjective or Bayesian statistical inference) or in an objective way (then it is Bernoulli’s theorem – also known as the law of large numbers or the first limit theorem – that forms the base of objective or frequentist statistical inference) (Maistrov, 1974, p. 264). In the following three subsections I will clarify some aspects of [a] subjective statistical inference (often called ‘Bayesian’), [b] objective (in particular: frequentist) statistical inference, and [c] why Bayesian statistical inference in my opinion cannot be used in scientific statistical methodology.
(a) Bayesian (i.e., ‘subjective’) statistical inference:
When a physician observes that a particular patient shows the symptoms fever, rash and red bumps, he may infer from these symptoms a subjective degree of belief, i.e., a hypothesis with a certain probability, that these symptoms are caused by the measles virus. When the physician uses statistical data (e.g. the prevalence of measles among patients of a certain age and the frequency with which these symptoms indeed turned out to be caused by measles) to assign a probability to his measles hypothesis, he is making a Bayesian statistical inference. Crucial in Bayesian statistical inference is that a hypothesis has a probability and that the probability of this hypothesis can be adjusted on the basis of new empirical evidence: every time the physician is confronted with the symptoms fever, rash and red bumps that turn out to be caused by measles, the probability of his hypothesis goes up.
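A minimal sketch of the physician’s inference, with Bayes’ rule written out; the prevalence and the symptom frequencies below are invented numbers, used only to show the mechanics of the updating:

```python
# A minimal sketch with invented, hypothetical numbers.
prior_measles = 0.02             # hypothetical prevalence in this age group
p_symptoms_if_measles = 0.90     # fever, rash and red bumps, given measles
p_symptoms_if_no_measles = 0.05  # the same symptoms from other causes

# Bayes' rule: P(measles | symptoms)
numerator = prior_measles * p_symptoms_if_measles
evidence = numerator + (1 - prior_measles) * p_symptoms_if_no_measles
posterior = numerator / evidence
print(f"P(measles | symptoms) = {posterior:.2f}")  # about 0.27
```

Every further patient with these symptoms who indeed turns out to have measles would raise this posterior, which in turn serves as the prior for the next inference.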
Bayesian statistical inference is a rarely used method among psychologists and other scientists, because there are not many situations wherein the a priori degree of belief (i.e., before it is adjusted by further experience) can be established in an unequivocal way. The ‘solutions’ proposed to avoid this ambivalence in the a priori degree of belief have been manifold. The statistician I.J. Good once counted, a bit tongue in cheek, up to 46,656 possible different Bayesian views (Gigerenzer, 2000, p. 16). I will present here just one of them, to give an impression of the problems that are at stake in Bayesianism. The Italian statistician Bruno de Finetti (1906-1985) noticed that most people are not very good at estimating the probability of their beliefs (Aczel, 2004).
Figure 4. Bruno de Finetti
Suppose you have a friend whose girlfriend went on holiday to Ibiza with a girlfriend. When you ask your friend whether he is sure that his girlfriend will not cheat on him, he answers that he is one hundred percent sure that she will be faithful. But is your friend ‘really’ in touch with his inner feelings? Is the probability of his a priori belief in the faithfulness of his girlfriend ‘really’ one hundred percent? De Finetti proposed an ‘objective’ way to measure subjective probability. Offer your friend a hypothetical choice: “If it turns out that your girlfriend has indeed been faithful to you, you will win one million dollars; but you can also choose to pick a ball out of a bag with 90 red and 10 blue balls, and if you pick a red ball then you will win the million”. If your friend chooses to draw from the bag instead of relying on his girlfriend’s faithfulness, you can conclude that he is not as confident as he claimed to be: apparently he is at most 90% confident. You can adjust the ratio of red to blue balls in the bag until your friend prefers relying on his girlfriend to a pick from the bag, to find out the real probability of his confidence. Although many a Bayesianist will rely on less ‘psychological’ or ‘personalistic’ methods to establish the probability of a belief, this ‘de Finetti game’ clarifies why Bayesian inference is viewed by quite some people – including myself – as suspiciously ‘soft’ and ‘vague’.
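The adjustment of the ratio of red to blue balls can be viewed as a bisection procedure. The following minimal sketch simulates the game; the ‘hidden’ confidence of the friend and the function prefers_bag are of course hypothetical stand-ins for his actual answers:

```python
# A minimal sketch of the 'de Finetti game' as a bisection over the
# fraction of red balls in the bag.

def elicit_probability(prefers_bag, steps=20):
    """Narrow down the red-ball fraction at which the friend is
    indifferent; that fraction measures his subjective probability."""
    low, high = 0.0, 1.0
    for _ in range(steps):
        red_fraction = (low + high) / 2
        if prefers_bag(red_fraction):
            high = red_fraction  # he prefers the bag: his confidence is lower
        else:
            low = red_fraction   # he prefers the bet on his girlfriend
    return (low + high) / 2

hidden_confidence = 0.83  # what the friend 'really' believes (simulated)
prefers_bag = lambda red_fraction: red_fraction > hidden_confidence
print(round(elicit_probability(prefers_bag), 3))  # about 0.83
```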
Bayesian statistical inference is mainly used in areas where such a ‘framework for thinking’ is a helpful heuristic, e.g. in the courtroom, where it helps to combine evidence in a structured manner, and in artificial intelligence applications. Although since the 1960s there has been a growing number of statisticians and philosophers of science who promote the use of Bayesian statistical inference as an appropriate methodology for scientific research (e.g. Gigerenzer et al., 2004; Romeijn, 2005; Winkler, 1974), most statistical textbooks do not even mention Bayesian techniques (Gigerenzer et al., 2004) and researchers using Bayesian techniques are rare phenomena:
Hays (1963) had a chapter on Bayesian statistics in the second edition of his widely read textbook
[Statistics for psychologists] but dropped it in subsequent editions. As he explained to one of us (GG),
he dropped the chapter upon pressure from his publisher to produce a statistical cookbook that did not
hint at the existence of alternative tools for statistical inference. Furthermore, he believed that many
researchers are not interested in statistical thinking in the first place but solely in getting their papers
published. (Gigerenzer et al., 2004, p. 395)
(b) Frequentist (i.e., ‘objective’) statistical inference:
When a scientist wants to draw inferences about a population on the basis of the observations obtained from a random sample, it is most likely that he will use frequentist statistical inference. Frequentist statistical inference relies on probability as it manifests itself in the long run. Statistical hypotheses are compared with data, obtained in carefully designed experiments. Contrary to the hypotheses in Bayesian inference, these hypotheses can only be true or false – in frequentist inference it is impossible to assign a probability to a hypothesis (Hacking, 2001, p. 211); the probability relates only to the data, given a certain hypothesis. Suppose for instance that you want to test the hypothesis that a die is fair and that the probability of the face with the number 5 on it turning up is 1/6. This hypothesis cannot be ‘a little bit’ true. There are only two possibilities: reject the hypothesis or stick to it – one cannot be sure a hypothesis is ‘true’ when it has stood up to a test, because it may be falsified in another test. When a hypothesis stands up to many tests, it is a strong hypothesis: Popper calls such hypotheses corroborated hypotheses (Popper, 1972). When does a hypothesis have to be rejected? A hypothesis has to be rejected when the probability of the observed data given the hypothesis is very low. The scientist can choose how low ‘very low’ is: often-used levels are 5% or 1%. So when in 1800 throws the number 5 turned face-up only 6 times, these data seem rather improbable given the hypothesis that the die is fair – yet they remain possible, although the chance of such a low occurrence of the number 5 when throwing a fair die is quite small: therefore we may conclude that either we have observed something quite unusual, or the hypothesis that the die is fair is false. In this case the observed frequency has such a low probability that a scientist will probably reject the hypothesis. To calculate the probability of the number 5 turning up only 6 times out of 1800, he uses frequency-type probability.
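A minimal sketch of the frequency-type calculation behind this example: the probability of at most 6 fives in 1800 throws of a fair die, i.e. the lower tail of a Binomial(1800, 1/6) distribution.

```python
# A minimal sketch: lower-tail probability of a Binomial(1800, 1/6).
from math import comb

n, p = 1800, 1 / 6
p_tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(7))
print(f"P(at most 6 fives in 1800 throws | fair die) = {p_tail:.3e}")
# prints a value on the order of 10**-130: astronomically far below any
# 5% or 1% level, so a frequentist would reject the fairness hypothesis
```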
(c) Why Bayesian (i.e., ‘subjective’) statistical inference in my opinion cannot be used in scientific statistical methodology:
Objective, frequentist inferential statistics assigns probabilities to data, whereas subjective, Bayesian inferential statistics assigns probabilities to hypotheses.
The reason why probabilities may be assigned to data is quite clear. Data – when gathered in an accurate scientific manner – may be assumed to be random. This assumption of randomness allows a scientist to apply probabilistic axioms to them and to assess their probability given a certain hypothesis.
However, hypotheses are not random events. Therefore the assignment of probabilities to hypotheses cannot be grounded in the assumption of randomness. This raises the question on what grounds subjective, Bayesian statisticians assign a probability to a hypothesis. Apparently it is a fallacy to think – as some subjective probabilists do – that probability grows with experience. It is untenable to hold that the more white swans one encounters, the more probable the hypothesis ‘All swans are white’ becomes; after all, one needs to encounter just one black swan to falsify this hypothesis completely (Popper, 1972). Also the idea that scientists should “discover their degrees of beliefs by introspection, perhaps by considering the odds they might give if presented with […] a series of bets” (Mayo, 1996, p. 75) sounds suspiciously subjectivistic and unsound. Moreover, why would one assume at all that beliefs are expressible as probabilities? Although contemporary Bayesianists have performed endless learned tours de force to make their subjectivism look unimpeachably objective (Mayo, 1996, p. 85), Bayesianist statistics remains in nuce a subjective, inductive method – struggling with the question why a certain probability should be assigned to a hypothesis – and is therefore, at least in my opinion, utterly unfit as a method for an experimental science that wants to put hypotheses to a test.
So, to summarize this section: up to this point we have seen that inferential statistical methodology is built on probability, which may be interpreted in either a subjective (‘Bayesian’) or an objective way. In my personal opinion Bayesian statistical inference is unfit to be used as a scientific method.
However, whereas the probability calculus itself stands on a stable arithmetical axiomatic model (mainly the axioms formulated by Kolmogorov), a strong axiomatic model for statistical inference – both objective and subjective! – is lacking. I once had a talk with a student of mathematics. When he heard that I studied psychology he asked me with a condescending smile: “So you work with statistics? In my first year in college I had to attend a course that
consisted for one half of classes on the calculus of probabilities and for the other half of classes on statistics: and, as to every mathematician, it became immediately clear to me even then that the former was a venerable calculus, firm as a rock, while the latter – although built upon the respectable calculus of probabilities – was a ramshackle discipline, juggling with assumptions in a highly suspicious way.”
Of course, this student put things in a very oversimplified way, but his remark conveys a nucleus of truth to which the majority of exact scientists would subscribe. After all, not a lot has changed in the calculus of probability since Laplace (1749-1827) and Gauss (1777-1855), whereas statistical inference as a scientific methodology was at that time still in an embryonic stage.
[2.D] The remarkable lack of probabilistic thinking until the second half of the
seventeenth century
After the terminological overview of the words ‘probability’, ‘descriptive statistics’ and ‘inferential statistics’ in the previous section, the remaining two sections of chapter two (§2.D and §2.E) will be devoted to some historical aspects of these notions.
In this section (§2.D) the remarkable lack of probabilistic thinking until the second half of the seventeenth century will be studied.
The first subsection of this section explains that – although it is difficult to imagine! – the emergence of probability really involved a deep conceptual change and that probability as it emerged in the seventeenth century is incommensurably different from its ‘precursors’.
The second subsection elucidates the circumstances that triggered the conceptual change that made the emergence of probability possible.
I. Why does the lack of probability until the second half of the seventeenth century amaze us so much?
Mathematical probability was almost completely lacking until the second half of the seventeenth century. Even the word ‘probability’ itself, in the probabilistic sense as we know it today (‘likely’, ‘apparently true’), was non-existent before the seventeenth century (Room, 1986). This fact, that probability is of quite recent origin and emerged abruptly in the seventeenth century, baffles scientists (Daston, 1988; Hacking, 1975).
However, why is it so startling that probability is such a recent ‘discovery’? Nobody seems, for instance, to be very amazed by the fact that psychology as a science did not exist before the nineteenth century. Or that quantum mechanics was not discovered earlier. Because probability is so omnipresent in our present time, it is hardly possible to think of something which is not subject to probability. Already in 1942 the prominent British statistician Maurice Kendall said that statisticians “have already overrun every branch of science with a rapidity of conquest rivalled only by Attila, Mohammed, and the Colorado beetle” (Kendall, 1942, p. 69). It is therefore very alluring to think that probability has always existed:
The probability expert I.J. Good claims that probability predates the human race. He argues that animals
have a sense of probability – a predator might instinctively assess the probabilities that the prey will
choose among various escape routes, and chase down the route that is most probable. (Aczel, 2004, p.
3)
It sounds probable, does it not? And what to think of the cognitive research that conceives of the human mind as an ‘intuitive statistician’ (Brunswik, 1943; Gigerenzer, 2000; Gigerenzer & Murray, 1987; see also chapter 3, p. 66, and §3.F “The lure of ‘subjective’ semantics in cognitive psychology”, p. 109 ff.)? It also sounds very probable, i.e. reasonable.
Of course, when you search for it – in particular with the benefit of hindsight – you can even find numerous foretokens and precursors of probability in the ancient world and in prehistory. But there is a great danger of anachronism here. The fact that the ancient Babylonians, Egyptians, Greeks, and pre-Christian Romans used heel- or knucklebones, ‘astragali’, as dice (David, 1962) makes their lack of a concept of probability even more remarkable from a modern point of view, but does not justify the conclusion that they already had a rudimentary probabilistic concept.
Figure 5. ‘Astragali’: heel- or knucklebones that were used as dice in ancient times.
There was the concept of ‘chance’ or ‘Fortuna’ in the ancient world and the Middle Ages – gambling and divination were widespread (Cioffari, 1973; Kendall, 1973) – but this concept of chance was not linked to rationality, at least certainly not in the way it was in the word ‘probability’ as it emerged in the second half of the seventeenth century.
The emergence of probability in the second half of the seventeenth century is therefore the emergence of the word ‘probability’ as a certain way of thinking chance and rationality together. I assume that this fact – that the changed meaning of the word ‘probability’ involves our rationality and therefore our thinking – makes it so difficult to remember that probability has a quite clear historical origin.
As I mentioned earlier, the frequentist Von Mises compared ‘probability’ with the word ‘length’. I think that his comparison touches upon a very empirical fact. Once you have seen the world ‘through’ the word ‘length’, it has become impossible to see the world without it – e.g., you perceive John as taller than Frank. Although in retrospect the ‘concept’ of
‘length’ seems to have always existed, it seems at the same time undeniable that there must have been somebody who uttered the word ‘length’ for the first time. The same holds for the word ‘probability’. Moreover, the word ‘probability’ probably has an even stronger hold on us than the word ‘length’, because it affects how we think – when thinking we think of our thoughts as having certain probabilities, and thinking of a time when thoughts were not linked with probabilities seems impossible.
There are adversaries of the idea that a conceptual change happened in the seventeenth century – such as Daniel Garber and Sandy Zabell – but it is important to bear in mind that their aversion may be partly dictated by the impossibility they encounter when they try to ‘imagine’ a world without probabilities:
[…], it is difficult to imagine a period of modern history in which concepts of probability, evidence and chance did not exist, when the epistemic and the aleatory were not intermixed. There should be an underlying suspicion that, like the legendary pre-logical peoples of the pre-Quinean anthropologist, these recent pre-probabilistic times may be a figment of the investigator’s imagination. (Garber & Zabell, 1979, p. 37)
It cannot be denied that writers like the above-mentioned Garber and Zabell, or David in her book Games, gods and gambling (1962) – which has in the meantime become a classic in the history of statistics – and many others (e.g. Sambursky, 1956; Sheynin, 1974) have done conscientious research which convincingly shows that many forebodings of the ‘concept’ of probability can be found; nevertheless, all the instances of this ‘prehistory of the theory of probability’ are rather anecdotal:
The origins [of probability] have been sought in astronomy, fine arts, gambling, medicine, alchemy, and
the insurance trade. The quest for antecedents has been a frustrating one, uncovering proto-probabilistic
thinking everywhere and nowhere. Certain passages of Aristotle, for example, could be construed as an
embryonic version of statistical correlation or a scale of subjective probabilities; with an even greater
effort of the imagination, Bayes’ theorem may be discovered in medieval Talmudic exegesis. However,
these philosophical discourses on the nature of chance and rules of thumb for dealing with situations
fraught with uncertainty […] not only fall short of a mathematical treatment of probability considered in
and of themselves, but they also manifestly failed to generate such a theory. (Daston, 1988, p. 8)
II. How the meaning of the word ‘probability’ changed in the seventeenth century
There is a general consensus that the official moment of birth of mathematical probability has to be assigned to the period from July to October 1654, when Pascal and Fermat sent each other five letters concerning the so-called ‘problem of points’: a probabilistic problem that was posed to Pascal by the gambler Chevalier de Méré and that asks how the stakes in a dice game should be divided when it is prematurely cut off, i.e., what a fair division of the stakes is, based on the probability each of the two players has of winning the total game given the results of the previous rounds. The letters have been translated into English (David, 1962) and practically every book on probability mentions them as the origin of mathematical probability (e.g. Daston, 1988; Gillies, 2003; Hacking, 1975; Maistrov, 1974; Oosterhuis, 1991; Schuh, 1964; Vlis & Heemstra, 1988).
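The modern solution to the ‘problem of points’ can be written down as a short recursion. The sketch below assumes, purely for illustration, that every remaining round is won by either player with probability 1/2 and that the game is cut off when the first player needs 2 more points and the second needs 3:

```python
# A minimal sketch of a fair division of the stakes, assuming 50/50 rounds.
from functools import lru_cache

@lru_cache(maxsize=None)
def p_first_wins(a, b):
    """Probability that the first player wins the interrupted game when
    he still needs `a` points and his opponent still needs `b`."""
    if a == 0:
        return 1.0  # the first player has already won
    if b == 0:
        return 0.0  # the opponent has already won
    # the next round is won by either player with probability 1/2
    return 0.5 * p_first_wins(a - 1, b) + 0.5 * p_first_wins(a, b - 1)

# an illustrative cut-off: the first player needs 2 more points, the
# second needs 3; the first player's fair share of the stakes is 11/16
print(p_first_wins(2, 3))  # 0.6875
```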
Figure 6. Fermat
Figure 7. Pascal
Simultaneously with or immediately following the Pascal-Fermat correspondence there was a real explosion of probabilistic ideas. Christiaan Huygens visited Paris in 1655 (Stigler, 1999, p. 239), heard of the Pascal-Fermat correspondence and wrote in 1656 the booklet Rekeningh in Spelen van Geluck, which was translated into Latin in 1657 as De ratiociniis in ludo aleae.
Figure 8. Christiaan Huygens
This text of Christiaan Huygens was the point of departure for the famous Ars conjectandi (The art of conjecturing) of Jacob Bernoulli (1654-1705), which was probably written in 1692 (Hacking, 1975) and posthumously published in 1713 by his nephew Nicolaus Bernoulli.
Figure 9. Jacob Bernoulli, Ars conjectandi (1713).
In Ars conjectandi Bernoulli formulated the ‘law of large numbers’ – otherwise known as ‘Bernoulli’s theorem’ or the ‘first limit theorem’. This monumental discovery is the fundament of all modern developments in probability calculus and statistical inference (Bockstaele, Cerulus, & Vanpaemel, 2004). Without the ‘law of large numbers’ frequentist statistical inference – on which all psychological research hinges – would not exist.
Jacques Bernoulli’s Ars conjectandi presents the most decisive conceptual innovations in the early history of probability. […] probability came before the public with a brilliant portent of all the things we know about it now: its mathematical profundity, its unbounded practical applications, its squirming duality and its constant invitation for philosophizing. Probability had fully emerged. (Hacking, 1975, p. 143)
Figure 10. J. Bernoulli
Figure 11. N. Bernoulli
So it is evident that between 1654 and 1713 there was a spurt of probability.
However, why did it happen only then?
Several, sometimes quite far-fetched, suggestions have been made. Some explanations are sociological or psychological (Garber & Zabell, 1979); some search for a “more fundamental factor” (Kendall, 1956, p. 10) and put the emphasis on the conceptual change. Probably one of the most cited explanations is that of David (1955), who assumes that probability calculus could not develop earlier because dice were often used for magical or religious purposes – calculation of probabilities would be a blasphemous interference with the deity expressing his wishes – and because in ancient times people did not gamble with uniform six-faced dice but with uneven knuckle-bones (see figure 5): whereas with modern dice it is clear that each side is equiprobable, with the unequally sized bones this was not the case, which could have made probability calculations less obvious (David, 1955). Kendall (1956) thinks that it was the Reformation that gave rise to the development of a concept of probability; Garber and Zabell (1979) point to the general rise of scientific activity in the seventeenth century; Maistrov seeks the reason for the rise of probability in economic circumstances – but when one takes into consideration that his book was written in the communist Soviet Union, it seems wise to take his references to Marxist economic theory with a pinch of salt (Maistrov, 1974, p. 5); Daston (1988, p. 14) argues in a quite convincing way that probability emerged from the context of the law and is connected with seventeenth-century legal reforms “concerning evidence both in and out of the courtroom”.
Daston (1988) furthermore does a wonderful job of clarifying why certain theories that try to explain the emergence of probability are likely to be false: there was no lack of mathematical knowledge that would have prohibited an earlier emergence (cf. Hacking, 1975); it was not the games of chance that provided the conceptual framework or catalyst (cf. Maistrov, 1974); it cannot be said that the rise of mathematical probability is identical with the rise of a uniform concept of probability – the seventeenth-century theoretical concept of probability is nowhere near a uniform concept; and probability did not arise because there was a need for it in business (cf. Hacking, 1975) – although seventeenth- and eighteenth-century probabilists such as Johan de Witt (1625-1672) and Edmund Halley (1656-1742) used very ‘practical’ examples concerning insurances and annuities (both booming business in the seventeenth and eighteenth centuries), the commercial implications of their probabilistic ideas were distrusted or at least not understood, and their contribution to practice was consequently nil until the end of the eighteenth century:
…the vogue for insurance seems to have been less prudential than reckless, fuelled more by the spirit of
gambling than foresight. (Daston, 1988, p. 165)
Figure 12. E. Halley
Figure 13. Johan de Witt
So it turns out that the question why probability arose in the second half of the seventeenth century is not very easy to answer.
There is, however, one answer to this question – formulated by Ian Hacking – that is much more sophisticated and stimulating (Daston, 1988, p. 11) than all the other explanations. Hacking (1975) argues that before the seventeenth century the word ‘probable’ was a predicate that could be ascribed to an opinion when it was approved by intelligent people; ‘probability’ therefore mainly meant the ‘approvability of an opinion’ (Hacking, 1975, p. 23), and one could speak of a ‘probable opinion’ when that opinion was supported by the authority of authoritative persons or ancient books, e.g., “My opinion is probable because Plato, Aristotle and Paracelsus subscribe to it”.
Until the seventeenth century there was a strict separation between scientia (knowledge), i.e., the demonstrative knowledge of the ‘higher’ sciences such as mechanics and astronomy, and opinio (opinion), i.e., the beliefs and doctrines of the ‘lower’ sciences, such as alchemy and medicine, which were not begotten by demonstration. This distinction is unfamiliar to the modern mind and therefore needs some further clarification.
By dropping balls of different masses from the Leaning Tower of Pisa, Galileo Galilei (1564-1642) demonstrated that their acceleration was independent of their mass, because the heavier balls reached the ground just as fast as the lighter balls – therefore this knowledge was worthy of the name scientia. When talking about scientia it would have been nonsensical to speak of something like ‘evidence’, because scientia involved an absolute demonstration without any room for doubts, interpretations, probabilities or evidence. After all, if you are able to demonstrate a fact, further support by authorities (in order to make it ‘probable’) or by supplementary evidence is superfluous.
It is obvious that this kind of demonstrative knowledge was unattainable for an astrologer or a physician: the position of the stars and the symptoms of a disease are only signs of an underlying reality that can never be demonstrated in an absolute way. The knowledge of an alchemist, astrologer, geologist or physician consequently was not scientia, but mere opinio. Due to the strict distinction between opinio and scientia, it would have been absurd to assign the attribute ‘probability’ to scientia. However, in retrospect one could say that when Renaissance thinkers started on a large scale to look at nature as “the Book of Nature”, the seeds for the dissolution of the strict separation between scientia and opinio were sown:
Nature is the written word, the writ of the Author of Nature. Signs have probability because they come
from this ultimate authority. (Hacking, 1975, p.30)
This ‘metaphor’ of Nature as a book whose signs have to be read in the right way also existed before the Renaissance (Garber & Zabell, 1979), but Renaissance thought was really completely imbued with this ‘metaphor’ (cf. Foucault, 1966). Although one can only guess at the reasons that made the idea of the “Book of Nature” so immensely popular, it seems plausible that the Reformation and the return to the text of the Scripture were helpful (Popkin, 1964). However, whatever the reason may have been for the popularization of the idea that nature is a book, it is a fact that it entailed a loosening of the strict distinction between scientia and opinio and consequently also brought along a slightly changed meaning of the word ‘probability’:
A new kind of testimony was accepted: the testimony of nature which, like any authority, was to be
read. Nature now could confer evidence, not, it seemed, in some new way but in the old way of reading
and authority. A proposition was now probable, as we should say, if there was evidence for it, but in
those days it was probable because it was testified to by the best authority. (Hacking, 1975, p. 44)
The mutation from ‘probable opinion’ as nondemonstrative knowledge supported by authorities to ‘probable opinion’ as the only possible knowledge supported by the authoritative signs of the Book of Nature led, at the turn of the seventeenth century, to a ‘Janus-faced’ concept of probability (Hacking, 1975), namely a concept that is grounded both in the frequencies of nature (aleatory probability) and in the insufficient knowledge of man (epistemic probability).
So, suppose for instance that you are a fourteenth-century monk reading Virgil’s pastoral poems, the so-called Eclogues, which he wrote around 37 BC, i.e. before the birth of Christ. You hold the ‘opinio’ that Virgil had a presentiment of the birth of Christ and that his fourth Eclogue has to be interpreted as a prophecy of the coming of Jesus Christ. You explain to your fellow monks that your ‘opinio’ is very probable, because it is supported by some very authoritative thinkers, such as the abbot of your monastery and a very erudite cardinal. The word ‘probable’ in this sense is not ‘Janus-faced’: it is clearly ‘epistemic’ (although it is a bit anachronistic to use this notion in the context of fourteenth-century thought), because it is only used to express the extent to which some non-demonstrative knowledge is supported by authoritative thinkers. However, as soon as the domain of ‘opinio’ no longer concerns your interpretation of Virgil but your ‘opinio’ on Nature (or: the Book of Nature), and the authority on which you rely to show that your ‘opinio’ is ‘probable’ is no longer an authoritative abbot or cardinal but Nature itself, probability becomes ‘Janus-faced’: it is both ‘epistemic’ or ‘subjective’ (for it still deals with ‘opinions’ and ‘supporting opinions’ which rely on the authority of Nature), and ‘objective’, because it is the ‘authority’ of Nature itself which gives us evidence, through the ‘frequencies’ of its ‘signs’, whether an ‘opinio’ on Nature is ‘probable’.
[2.E] From seventeenth century ‘probability’ to eighteenth century ‘Statistik’ and nineteenth and twentieth century ‘statistical inference’
In this section it is explored how probability evolved after its emergence. The section is divided into the following subsections:
I. Why the classical concept of probability seems equivocal from a modern point of view.
II. ‘Statistik’ – the counting of figures as an expression of the power of a state.
III. The combination of probabilistic thinking and statistics into statistical inference. Frequentist and Bayesian inference.
I. Why the classical concept of probability seems equivocal from a modern point of view
The seventeenth- and eighteenth-century concept of probability seems equivocal from a modern point of view (Gigerenzer et al., 1990). As I have shown earlier in this chapter, in section 2.C, the modern interpretations of probability are divided into objective and subjective interpretations. This clear duality is completely alien to classical probability. A seventeenth- or eighteenth-century probabilist would not have called his concept ‘Janus-faced’: it is the modern mind which has to label it like that in retrospect. A classical probabilist did not notice any ambiguity in his concept of ‘probability’; to a modern probabilist, however, this ambiguity in classical probability is so obvious that it is hard to understand how it could not have struck the attention of the classical probabilists.
When was the transition from classical to modern probability? Although the break was “neither sudden nor clear” (Daston, 1988, p. 371), it is in the 1830s and 1840s, in the work of Poisson (1781-1840) and Cournot (1801-1877) – in particular in the latter’s book Exposition de la théorie des chances et des probabilités (1843) – that the distinction between subjective probability (‘probabilité’) and objective probability (‘chance’) is explicitly problematized for the first time (Daston, 1988; Hacking, 2004).
Of course there are many modern probabilistic thinkers who try to merge the objective and the subjective interpretation in some way or another, but they are always aware of the fact that they are trying to bridge the two interpretations. It seems impossible for the modern probabilist to erase the duality of probability.
Figure 14. Poisson
Figure 15. Cournot
The momentous book Logical Foundations of Probability (1951) of Carnap is one of the best-known twentieth-century attempts to cope with this distinction. Carnap (1951) distinguishes ‘probability1’ (inductive probability) and ‘probability2’ (statistical probability), i.e. ‘probability1’ is subjective and ‘probability2’ is objective. But,
Carnap, and Cournot before him, notoriously failed to bring tranquillity out of controversy by their
judicious mixture of conceptual analysis and linguistic distinction. Philosophers seem singularly unable
to put asunder the aleatory and the epistemological side of probability. (Hacking, 1975, p. 15)
The fact that classical probability has a Janus-faced character, which only from a modern point of view seems ambiguous, has led to a lot of misinterpretations of classical probabilist thought. The misunderstandings surrounding the thought of Jacob Bernoulli and Thomas Bayes are good examples of the difficulties a modern probabilist encounters when he tries to grasp theories from the era of classical probability.
As I mentioned earlier, the ‘law of large numbers’ of Jacob Bernoulli (1654-1705) laid the fundament for the frequentist interpretation of probability; but at the same time Jacob Bernoulli has also been called a subjectivist, because he was the first to use the word ‘subjective’ in relation to probability theory (Hacking, 1975) and because he talks of probability as a “degree of certainty” (Hald, 1998); yet, “we do not know how Bernoulli would have used his theory in practice because he never analyzed a set of real data” (Hald, 1998, p. 157).
And although the term ‘Bayesian statistics’ is the stock phrase for statistics based on the subjective interpretation of probability, it is highly doubtful whether Reverend Thomas Bayes (1702(?)-1761) would have recognized his own thoughts in what is now ranged under the denominator ‘Bayesian statistics’. It was the ‘subjective’ interpretation of Bayes by Laplace (1749-1827) – notwithstanding that Laplace himself was still a classical thinker and
therefore did not make a clear distinction between subjective and objective probabilities either – which formed the probabilistic ideas that are known today as Bayesian.
It is this incommensurability between classical and modern probabilistic thought that makes classical probabilistic ideas look very ambiguous to the modern mind and that could probably help explain why ‘Stigler’s Law of Eponymy’, i.e. the law that “no scientific discovery is named after its original discoverer” (Stigler, 1999, p. 277), applies so extraordinarily well to probabilistic ideas.
II. ‘Statistik’ – the counting of figures as an expression of the power of a state
After the emergence of probability in the second half of the seventeenth century, it took another two centuries before the probabilistic calculus obtained its star role in statistical inference. Nonetheless, this historical observation does not imply that before the end of the nineteenth century there was no practice of gathering ‘statistical’ data – on the contrary: although “often incomplete and unreliable” (Daston, 1988, p. 127), demographic data existed from the early sixteenth century, and from the second half of the seventeenth century onwards gathering them became really booming business! – nor does it imply that probabilistic calculus and the gathering of ‘statistical’ data were unrelated areas in the seventeenth, the eighteenth and the beginning of the nineteenth century.
Why would one take great pains to gather data if one does not want to draw conclusions from them? The answer to this question is already hinted at by the etymology of the word ‘statistics’: it was an expression of the power of a state – e.g., the more fertile women, the more revenue or the more men capable of serving in the army, the more strength would be ascribed to a state (Desrosières, 1993; Hacking, 2004; Nikolow, 2001; Westergaard, 1969). The word ‘Statistik’ was first used by the German ‘statist’ Gottfried Achenwall in his book Staatsverfassung der heutigen vornehmsten europäischen Reiche und Völker im Grundrisse (1749). However, Achenwall’s ‘Statistik’ was not an isolated idea. Every self-respecting nation at that time was seized by the ‘statistical’ frenzy, although the name, the degree of quantification and institutionalization, the exact techniques of counting and the emphasis varied from country to country, depending partly on the form of government (Desrosières, 1993): e.g., the English ‘political arithmetic’ – mostly a quantitative hobby of well-to-do dilettantes – is renowned for its bills of mortality, whereas the German ‘Statistik’, ‘Staatswissenschaft’ or ‘Kameralwissenschaft’, as practised in the multitude of German-speaking states, became institutionalized reasonably fast and focused more on comparisons between cultural-geographical particularities and on the idea that a nation-state is
characterized by its descriptive statistics (Desrosières, 1999; Porter, 2003c). Still, all the ‘statisticians’ or ‘political arithmeticians’, such as John Graunt (1620-1674), William Petty (1623-1687), Hermann Conring (1606-1681), Daniel Bernoulli (1700-1782), Johann Peter Süßmilch (1707-1767), Gottfried Achenwall (1719-1772), Anton Friedrich Büsching (1724-1793), August Friedrich Wilhelm Crome (1753-1833) and Sir John Sinclair (1754-1835), share the idea that certain figures express the wealth, strength or power of a state.
So, were the data gathered in the seventeenth and eighteenth centuries solely an expression of the power of a state? Mainly, yes, although at the beginning of the nineteenth century one can see that the ‘rhetorical’ function was applied not only in relation to the strength of the state, but also, for example, in the already mentioned (pp. 31-32) blood-letting debate in France between doctor Broussais and his adversaries (Hacking, 2004).
However, did the practice of gathering data not cross the calculus of probabilities before the 1830s? On the face of it, it looks as if the calculus of probabilities was connected with statistical data from the moment of its emergence in the seventeenth century. Directly after the publication of Graunt’s mortality table in 1662, probabilistic thinkers all over Europe – including Christiaan and Lodewijk Huygens, Johan de Witt, Halley, Leibniz, and Jacob and Nicolaus Bernoulli – were eager to apply the young probability calculus to the data of Graunt’s tables.
Nevertheless, this ‘application’ of probability calculus during the period of ‘classical probability’ is radically different from the modern practice of statistical inference as we have known it since the beginning of the twentieth century. Although the seventeenth and eighteenth century probabilistic thinkers themselves would have described their approach to Graunt’s data as empirical, their treatment of the data seems unacceptable to a modern observer: the data were moulded and trimmed until they formed supportive evidence for a
probabilistic regularity (Daston, 1988; Gigerenzer et al., 1990). The probabilistic thinkers
were often ‘natural theologians’ (Gigerenzer et al., 1990), who believed that their calculus revealed a divine design in nature: and if the data did not fit the assumed probabilistic regularities, the data had to be adjusted until they did. A significant example of this belief in the omnipresence of the regularities of the “divine handiwork” (Gigerenzer et al., 1990, p. 13) is the influential book Die Göttliche Ordnung in den Verhältnissen des menschlichen Geschlechts, aus der Geburt, dem Tode und der Fortpflanzung desselben erwiesen (1741) by Johann Peter Süßmilch (Pearson, 1978). The general tendency was to neglect deviations in the data, and therefore the complaints made by the Dutch probabilist Nicholas Struyck in 1740 that “mortality doesn’t listen to our suppositions” and that many of the tables allegedly based
on observation were in fact “pure hypotheses” (Daston, 1988, p. 130) were quite exceptional in the eighteenth century.
Figure 16. Arbuthnot
Although it is possible to look at the ‘discovery’ of John Arbuthnot (1667-1735) that
year after year there is a slightly higher probability for male births than for female births
(communicated in 1710 to the Royal Society under the title ‘An Argument for Divine
Providence taken from the Constant Regularity of the Births of Both Sexes’) as a very
primitive form of an inferential statistical test (Gigerenzer et al., 1990; Hacking, 1976), his
interest in the ratio of male-to-female births was far from purely methodological and clearly
guided by his belief in a divine order that seemed to be expressed in the male/female ratio:
Arbuthnot noted that male births consistently exceeded female births in the ratio 18 to 17. He argued
that if this regularity were due to what he called “mere chance” – that is, assuming that the probability
of either male or female equals ½ – the probability of the observed ratio over a long period was
astronomically small. Arbuthnot concluded that this was palpable evidence of design, namely, the
divine provision for an equal number of men and women of marriageable age to ensure the propagation
of the race via monogamy. Due to the greater “wastage” of young men, who led more hazardous lives, it
was prudent to begin with a small surplus. (Daston, 1987, p. 302)
Though the approach of Arbuthnot might seem to be extraordinarily modern (Pearson, 1978)
and is likely to remind the contemporary statistician of modern hypothesis testing (H0: the
male/female ratio is due to mere chance; H1: the male/female ratio is an expression of God’s
design), it would take another two centuries before statistical testing became a clearly defined,
autonomous method:
Probabilistic tests, however, never became routine operations in any discipline until the beginning of the
twentieth century, and hence there was no sustained effort to develop and improve a methodology of
significance testing. (Gigerenzer et al., 1990, p. 79)
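To make the modern reading concrete, the following is a minimal sketch (in Python, which is of course anachronistic here) of the calculation behind Arbuthnot’s argument. The 82 consecutive male-surplus years are the figure usually cited for his London christening records (1629-1710); that number is supplied here for illustration, as it is not given above.

from fractions import Fraction

# Arbuthnot's reasoning, restated: suppose that in each of 82 successive
# years of London christening records more boys than girls were christened.
# Under H0 ('mere chance': P(male surplus in a year) = 1/2), the probability
# of a male surplus in all 82 years is (1/2)^82.
years = 82
p_under_null = Fraction(1, 2) ** years
print(f"P(male surplus in all {years} years | mere chance) = {float(p_under_null):.1e}")
# about 2.1e-25 - the 'astronomically small' probability that Arbuthnot
# took as palpable evidence of design, not as a modern p-value.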
III. The combination of probabilistic thinking and statistics into statistical inference.
Frequentist and Bayesian inference.
When did the practice of gathering aggregates of descriptive statistical data cross the path of
calculus of probability in such a way that statistical inference became possible? When was the
moment that one could fruitfully make inferences concerning some unknown parameter (e.g., the mean) of a population from a limited number of observations? When did it become possible to infer the unknown composition of an urn filled with red and blue balls from a random sample drawn from it? When did chance become so ‘domesticated’ that ineradicable variability – such as, e.g., the variable reactions of patients to a drug or of crops to a fertilizer – stopped being a barrier to making reliable inferences? When were the techniques
to test hypotheses developed?
To these questions – all variations on the same theme, viz., ‘when did statistical
inference emerge?’ – two kinds of answers are possible: (a) a simple ‘historical-anecdotic’
answer and (b) a more complex ‘historical-philosophical’ answer.
(a.) The simple historical-anecdotic answer to the question: ‘When did statistical
inference emerge?’.
The ‘simple’ historical-anecdotic answer, i.e. the answer you are most likely going to get, is
that the scientific methodology of statistical inference emerged only in the beginning of the
twentieth century. The statistical inference that emerged at the beginning of the twentieth century was frequentist statistical inference – the scientific methodology that has become the methodological monopolist in psychology and in almost every other modern science. Thus, when searching for the historical origins of the statistical methodology used in modern science, the natural manoeuvre (see e.g. the popular-scientific book by Salsburg, 2001) is
consequently to turn to the great names of the beginning of the twentieth century – Ronald
Fisher (1890-1962), Egon Pearson (1895-1980), Jerzy Neyman (1894-1981) and William
Gosset (1876-1937).
Yet, it has to be mentioned that during the ‘flower power’ sixties of the twentieth century an attempt was made to dethrone frequentist statistical inference and replace it with Bayesian inferential statistics. Bayesian inferential statistics2 was introduced by people like
Savage (1917-1971) and de Finetti (1906-1985). This Bayesian vogue left some traces in
artificial intelligence, game theory and the weighting of evidence in the courtroom, but within
2 See also above § 2.C “Making sense in the pile of words: probability, statistics and statistical inference” (p. 34 f.f.).
scientific methodology it left virtually no traces. Bayesian statistics smells too much of ‘soft subjectivity’ and ‘dubious a priori probabilities’, I guess. It does not seem likely that the Bayesian ‘heresy’ will ever dethrone frequentist statistics, although among social science methodologists and probabilist philosophers it sometimes seems ‘fashionable’ again to sympathize with Bayesian ideas, or at least with some combination of Bayesian and frequentist statistics (Romeijn, 2005, p. 12). In this sense Bayesian statistics is now booming.
So, to summarize, the historical-anecdotic way of looking at the emergence of
statistical inference as a scientific method presents the history of statistical inference as
follows: First – approximately between 1920 and 1940 – a frequentist inferential
methodology was developed, whose status as an indispensable instrument to psychology
would be established in the period 1940-1955 (Gigerenzer & Murray, 1987), and later – in the
1960s and 1970s – the Bayesian approach gained some ground, albeit hardly as a scientific
methodology. The antagonisms between the principal characters who developed frequentist
methodology in the period 1920-1940 are explained in a rather psychological manner.
After this schematic depiction it is time to add some couleur locale to it – for otherwise this account would only be historical, instead of historical-anecdotic. The relations between the principal statisticians at the dawn of the statistical inferential era – Francis Galton (1822-1911), Karl Pearson (1857-1936), Ronald Fisher (1890-1962), Egon Pearson (1895-1980), Jerzy Neyman (1894-1981), William Gosset (1876-1937), etc. – are a source of juicy anecdotes and it would be a shame not to mention them.
The early history of frequentist statistical inference is located in Britain – in particular in and around London and Cambridge – and is embedded in Darwinism and eugenics (MacKenzie, 1981): it is there that Francis Galton (1822-1911), the half-cousin of Charles Darwin (1809-1882), studied human variability and heredity and invented ‘regression’ and the use of the regression-line3; it is there that his statistical heir Karl Pearson (1857-1936) defines mathematically the statistical concept of the ‘correlation coefficient’4 in 1896 (Porter, 1986), becomes the first holder of the ‘Galton Professorship of Eugenics’ at University College London, leads the biometric laboratory and the eugenics laboratory that had been established according to Galton’s will after his death in 1911 (Porter, 1986), and writes a four-volume biography about the life and labours of his teacher (Pearson, 1914, 1924, 1930).
3 See also §3.D “Statistical methodology as a re-enactment of evolution: tamed variation and accelerated selection” (p. 91 f.f.).
4 See also §3.D “Statistical methodology as a re-enactment of evolution: tamed variation and accelerated selection” (p. 97 f.f.).
Figure 17. Karl Pearson (left) and Galton (right)
Galton and Karl Pearson are thus the founding fathers of the basic language of statistics and they laid the foundations for the institutionalization of biometrics and statistics (e.g. Desrosières, 1993); it is on those foundations that the generation after Karl Pearson would produce, in the first decades of the twentieth century, a new statistical inferential methodology that would lead in the 1940s to the massive adoption of these methods in psychology, i.e., the so-called ‘statistical inferential revolution’ (see e.g. Gigerenzer & Murray, 1987; Gigerenzer et al., 1990).
Between Karl Pearson and the founding fathers of frequentist statistical inference lies
a deep generation gap. Had one walked into Galton’s biometrical laboratory in London at the turn of the century, one would have seen legions of young women, so-called ‘calculators’, busy with the laborious arithmetical operations that Karl Pearson had ordered them to perform so that the distribution parameters (e.g., the mean, the standard deviation and symmetry) could be extracted from enormous amounts of biological data (Salsburg, 2001).
Everything that could be measured was measured, e.g., lengths of human arms and legs, beak
lengths of exotic tropical birds, leaves of mulberry trees and the cranial capacity of human
skulls from ancient cemeteries (Salsburg, 2001).
The inductive or inferential problem – ‘Why would one assume that one can draw conclusions about a whole population from a limited number of observations?’ – was not very salient yet in this approach, because the collectors of data tried to gather as much data as they could, and the observations were not so ‘limited’ at all: “as copious observational and experimental data as possible” (Pearson, Weldon, & Davenport, 1901, p. 5). Another factor that made the problem of induction less pressing was the fact that there was
no direct ‘practical’ purpose for the calculated distributions besides the rather abstract hope
that one would be able to signal changes in the distributional parameters (e.g., the mean size
of an elephant) that could indicate evolutionary shifts which one could not see with the naked
eye:
The primary object of Biometry is to afford material that shall be exact enough for the discovery of
incipient changes in evolution which are too small to be otherwise apparent. The distribution of any
given attribute, within any given species, at any given time, has to be determined, together with its
relations to external influences. (Galton, 1901, p. 9)
The concerns of the next generation – the founding fathers of frequentist statistical inference – were much more concrete, involving inferences from small samples, experimental designs, randomization, significance and hypothesis testing. The statistical inferences had to provide answers to practical questions like ‘Is the concentration of yeast cells in a sample of Guinness beer representative of the amount of yeast cells in the whole jar?’ (W.S. Gosset) and ‘Are the variations in crop yield at the Rothamsted Agricultural Experimental Station the result of the use of a certain fertilizer or of other uncontrollable factors?’ (R.A. Fisher).
Figure 18. Example of Fisher's experimental design: 5 x 5 Latin square of different trees laid out at Beddgelert Forest in 1929. (Fisher Box, 1978)
Gosset invented the t-statistic, which made it possible to draw inferences from small samples. Fisher integrated the use of significance testing with experimental design (Gigerenzer et al., 1990, p. 79): his book Statistical Methods for Research Workers, published in 1925 (Fisher, 1970), and his book The Design of Experiments, published in 1935 (Fisher, 1951), are landmarks in the history of statistics (Kendall, 1963). You just have to look in a random
scientific journal to see the effect of Fisher’s ideas: you will see that words concerning the design of experiments like ‘randomization’, ‘blocking’ and ‘replication’ abound (Gigerenzer et al., 1990, p. 75), and the same holds for the significance levels “p < 0.05” and “p < 0.01”.
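As a minimal sketch of what Gosset’s t-statistic makes possible, the following Python fragment runs a one-sample t-test on a hypothetical handful of yeast-cell counts; the numbers and the target value are invented for illustration.

import numpy as np
from scipy import stats

# Hypothetical small sample of yeast-cell concentrations (millions of
# cells per ml) from one batch of brew; the brewery's target is 10.0.
sample = np.array([9.2, 10.1, 9.6, 8.8, 9.4, 9.9])

# Gosset's one-sample t-test: can such a tiny n support an inference
# about the whole batch? H0: the batch mean equals the target value.
t_stat, p_value = stats.ttest_1samp(sample, popmean=10.0)
print(f"t = {t_stat:.2f}, p = {p_value:.3f} (n = {len(sample)})")

It is precisely this kind of inference from an n of six – unthinkable in the Pearsonian regime of “as copious data as possible” – that the t-distribution legitimated.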
For a proper understanding of what happened during the statistical inference revolution it is necessary to enlarge a bit upon ‘significance testing’, because not every reader will be familiar with the concept. In significance testing, data are compared with the distribution one would expect if the null hypothesis – i.e., the hypothesis that there is no effect besides the variation one would expect from ‘mere chance’ – were true. “The smaller the level of significance, the more discordant the data are with the null hypothesis” (Gigerenzer et al., 1990, p. 78). So what does it mean when a researcher states that he rejected his null hypothesis (e.g., treatment X has no effect on disease A) because he obtained “significant results (p=0.01)”? Only this:
The probability of the data, according to the null hypothesis, is 0.01. […] Either the null hypothesis is
true, in which case something unusual happened by chance (probability 1%), or the null hypothesis is
false. (Hacking, 2001, p. 215)
A world of difference lies between, on the one hand, Fisher’s inferential techniques
and those of the other statisticians of his generation – although of course indebted to their
statistical predecessors – and, on the other hand, the biometrical statistics of Galton and Karl
Pearson. The differences in approach led to frictions between the old and the new generation
of statisticians.
Small sample statistics were effectively invented by a professional employee of the Guinness brewery, W.S. Gosset, for whom the repetition of trials hundreds of times would have been far more trouble than it was worth. Gosset spent the year 1906-1907 at University College with Pearson, but with the practical needs of the brewery always in mind. […] He [Pearson] remarked in a letter of 1912 to Gosset that only “naughty brewers” used [an n] so small in their work. (Porter, 1986, p. 317)
However, the misunderstandings between Karl Pearson and the younger generation of statisticians sink into nothingness in comparison with the frictions within this new generation itself. It was torn into two camps: Fisher on one side and Egon Pearson (Karl Pearson’s son) and his friend Jerzy Neyman on the other – both camps claiming that W.S. Gosset was on their side (Gigerenzer et al., 1990, p. 105). The feud raged from the 1930s until Fisher’s death in 1962 (Kendall, 1963).
Figure 19. W.S. Gosset
It has been suggested that the antagonisms started in the second decade of the twentieth century, when Karl Pearson declined to publish an article by the then still young and unknown Fisher in his pre-eminent statistical journal Biometrika. Fisher, endowed with great mathematical ability, had solved in this article a problem – concerning the statistical distribution of Galton’s correlation coefficient – which Karl Pearson had tried to solve for some time without success.
Figure 20. R.A. Fisher
Pearson rejected Fisher’s article because he had difficulty understanding the complex mathematics Fisher had used – and so had Gosset, whom Pearson asked for advice in this matter – and possibly also because he was offended by the fact that Fisher had pointed out errors in his work (Gigerenzer et al., 1990; Salsburg, 2001). Fisher, who has sometimes been described as ‘cantankerous’ (Dawkins, 2000), never forgave Karl Pearson, and even “when Karl had been dead for twenty years, Fisher wrote as if wounds were still smarting” (cf. Kendall, 1963, p. 3). In 1956 Fisher writes about Karl Pearson:
…the terrible weakness of his mathematical and scientific work flowed from his incapacity in self-criticism, and unwillingness to admit the possibility that he had anything to learn from others, […]. (Gigerenzer et al., 1990, p. 98)
Although Karl Pearson’s son Egon Pearson, “like his father a distinguished historian of statistics as well as an eminent statistician” (Porter, 1986, p. 305), strongly disagreed with his father on some of the most fundamental statistical issues and was impressed by Fisher’s new ideas, Fisher’s opinion of Karl Pearson extended to his son as well. The scientific debates between Fisher and Egon Pearson were always characterized by “a bitter personal tone” (Gigerenzer et al., 1990, p. 98).
Figure 21. E. Pearson
Figure 22. J. Neyman
Egon Pearson would later write to Jerzy Neyman about his feelings towards R.A. Fisher and Fisher’s attacks on his father:
I was torn apart by conflicting emotions: (a) finding it difficult to understand R.A.F., (b) hating him for
his attacks on my paternal ‘god’, (c) realising that in some things at least he was right. (Yates, 1984, p.
116)
However, notwithstanding the collisions between Fisher’s statistical ideas and those of
Pearson and Neyman, they all promulgated frequentist statistical inference: after all, it is only
in the 1960s and 1970s that Savage and de Finetti started opposing the frequentist approach
with Bayesian statistical inference.
Although the historical-anecdotic answer is not incorrect, it does not clarify the substance of the conflict between Fisher and Pearson/Neyman. Moreover, it obscures the fact that even in the first decades of the twentieth century – the heyday of the frequentist interpretation, when it was seen as a mortal insult to accuse someone of ‘Bayesianism’ or ‘subjectivism’ – it still
seems to be the subjective interpretation of probability that is pulling the strings of the fierce frequentist statistical debates from behind the scenes; the subjective interpretation is like the imaginary blood that Lady Macbeth tries in vain to wash off her hands – even when it is not there, it guides the direction of the proceedings.
(b.) The more complex ‘historical-philosophical’ answer to the question: ‘When did
statistical inference emerge?’.
Neyman and Egon Pearson – and later several writers would support them therein (e.g.
Hogben, 1957) – accused Fisher of a “quasi-Bayesian view” (Gigerenzer et al., 1990, p. 103).
What brought them to make this accusation? Fisher would never have described himself as a Bayesian, and there are also good scientific reasons to draw a clear distinction between Fisher’s thought and Bayesianism (e.g. Barnard, 1987). Fisher actually said quite harsh words about Bayesianism and its presumed intellectual father, Reverend Thomas Bayes (1702(?)-1761).
Fisher once congratulated the Reverend Thomas Bayes for his insight to withhold his treatise from
publication (it was published posthumously in 1763). (Gigerenzer et al., 2004, p. 405; see also
Hacking, 1976, p. 201)
As I explained earlier5, Bayesian statistical inference relies on a subjective interpretation of probability, which interprets probability as a degree of belief that we must reasonably attach to a hypothesis and that can be revised in the light of new data: e.g. the ‘Bayesian probability’ or ‘degree of belief’ that the sun will rise tomorrow becomes higher with every morning we see the sun rise. Fisher, Neyman and E.S. Pearson all agreed that the Bayesian position was untenable: for it juggles with ‘probabilities’ that have to be attached to ‘beliefs’, which tend to get lost in subjective arbitrariness, and it always gets entangled in the induction problem, which undermines the rational status of Bayesian inference.
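The sunrise example can be made concrete with Laplace’s rule of succession – one classical Bayesian formalization, chosen here for illustration: starting from a uniform prior on the unknown probability of a sunrise, the degree of belief after n observed sunrises is (n + 1)/(n + 2). A minimal sketch in Python:

# Laplace's rule of succession: with a uniform prior on the unknown
# probability that the sun rises, the posterior degree of belief that it
# will rise tomorrow, after n observed sunrises, is (n + 1) / (n + 2).
def degree_of_belief(n_sunrises: int) -> float:
    return (n_sunrises + 1) / (n_sunrises + 2)

for n in (1, 10, 1000, 365 * 80):  # up to roughly a lifetime of mornings
    print(f"after {n:5d} sunrises: P(sunrise tomorrow) = {degree_of_belief(n):.6f}")
# The 'degree of belief' creeps towards 1 with every observed sunrise -
# exactly the ever-rising Bayesian probability described above.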
Yet, Fisher could not get rid of the idea that probability in some way or another had to
be related to the degree of belief one has. Thus he coined the notion ‘fiducial probability’.
There has always been a great deal of confusion about the meaning of this notion. Fisher
claimed that he was an absolute frequentist and that the notion ‘fiducial probability’ had
nothing to do with subjective probability. As Gigerenzer puts it: “Fisher wanted to both reject
the Bayesian cake and eat it, too” (Gigerenzer, 2000, p. 271).
5 See above, in § 2.C “Making sense in the pile of words: probability, statistics and statistical inference” (p. 34 f.f.).
Figure 23. Reverend Thomas Bayes
When the probabilist Savage – who adhered to a subjective interpretation of
probability – once asked Fisher to tell him the exact meaning of ‘fiducial probability’ Fisher
gave him a very candid answer:
I don’t understand yet what fiducial probability does. We shall have to live with it a long time before we
know what it’s doing for us. (quoted by Gigerenzer & Murray, 1987, p. 8)
This is an inner tension that runs through a large part of Fisher’s work: on the one
hand he completely rejects Bayesian probabilities – “as befits a good frequentist” (Gigerenzer
& Murray, 1987, p. 10) – but on the other hand he believes that inductive inference is possible
and that the acceptance or rejection of a null hypothesis does affect the degree of belief one
has to attach to a hypothesis (Gigerenzer & Murray, 1987).
So, viewed from the perspective of Neyman and E.S. Pearson, who were radical frequentists, it is quite comprehensible that they accused Fisher of Bayesianism.
The space that Fisher seemingly left for ‘subjectivity’ in rather opaque notions such as
‘fiducial probability’ and ‘likelihood’ (Hacking, 2001, p. 245; Kendall, 1963) and the fact that
Fisher did not exclude the possibility of inductive thinking was absolutely indigestible to
Neyman and E.S. Pearson.
Neyman said that there is no such thing as inductive inference; we only engage in inductive behavior (Hacking, 2001, pp. 242-243; Hubbard, 2004). Neyman and E.S. Pearson, pragmatic and anti-metaphysical, felt that Fisher’s ideas on ‘fiducial inference’ were metaphysical balderdash. According to Neyman, Hume was right when he concluded that he did not have any reasons for believing any one conclusion (Hacking, 2001, p. 262): “To act as if a hypothesis were true does not mean to know or even to believe” (Gigerenzer & Murray, 1987, p. 14). Neyman and E.S. Pearson argued that the only reason for using statistical
inferential methods is purely pragmatic, namely that these statistical methods should provide a “rule of behavior” such that “in the long run of experience, we shall not be too often wrong” (Neyman & Pearson, 1933, p. 291).
So, if we say: “The hypothesis H0 (for instance: ‘treatment X has no effect’) is rejected at the one percent level (p < 0.01)”, this means that the data were very improbable given the hypothesis H0: the probability of getting these data if hypothesis H0 is actually true is no more than one percent. So, it could be that hypothesis H0 is true and that we have falsely rejected it. Yet, if hypothesis H0 were actually true and we repeated our experiment many times, such results should occur no more than one percent of the time. The rejection of the hypothesis H0 at the one percent level only expresses the frequency of occurrence of certain data – given a certain hypothesis – in the long run: it does not imply that it has become more reasonable to believe that hypothesis H0 is false!
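This long-run reading can be simulated directly. The following sketch (with an invented set-up: normally distributed data, a true H0 and a one-sample t-test) repeats an experiment many times and counts how often H0 is falsely rejected at the one percent level:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1933)  # seeded for reproducibility

# Simulate the Neyman-Pearson 'long run': H0 is true (the treatment has
# no effect, the mean difference is 0), and the experiment is repeated
# many times, H0 being rejected whenever p < 0.01.
n_experiments, n_subjects, rejections = 20_000, 30, 0
for _ in range(n_experiments):
    data = rng.normal(loc=0.0, scale=1.0, size=n_subjects)  # H0 is true
    _, p = stats.ttest_1samp(data, popmean=0.0)
    rejections += p < 0.01

print(f"false rejection rate: {rejections / n_experiments:.4f}")
# close to 0.01: in the long run a true H0 is wrongly rejected about one
# percent of the time - and nothing is said about the probability that
# H0 itself is true or false.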
The collision between Fisher’s views and those of Neyman and E.S. Pearson shows how difficult it is to formulate a purely frequentist inferential theory. It turns out to be very hard to think of inference – ‘the drawing of conclusions about an unknown population, based on a limited number of observations of which one does have knowledge’ – as a procedure that has nothing to do with one’s knowledge, beliefs and representations, but that is solely a form of decision-making behavior that turns out to be right most of the time.
Neyman and Pearson believed that they made Fisher’s theory of ‘significance testing’ “more complete and consistent” (Gigerenzer et al., 1990, p. 102) by introducing, next to the null hypothesis, an alternative hypothesis. Neyman and Pearson named their theory ‘hypothesis testing’. Because they did not believe in the existence of inductive inference resulting in ‘mere beliefs’ or ‘ideas’, but solely in ‘inductive behavior’, every statistical test had to entail a ‘decision’ or a ‘choice’ between hypotheses (see e.g. Hubbard, 2004). An example may clarify this position.
Assume that a medical researcher has done an experiment to test whether a certain drug has any positive effects. The subjects that took the drug in this experiment varied in their reactions to it. Fisher would probably just look at how well the data fit statistically with the null hypothesis (H0: the drug has no effect; the variability is not greater than one would expect to see due to mere chance) and discuss with some colleagues how to interpret the degree of fit.
However, Neyman and Pearson would object that the researcher is not only confronted
with the question how well the results fit with the distribution of the null hypothesis, but that
he has to make an ‘economic’ decision between two hypotheses (H0, but also HA: the drug has
positive effects) because he has to decide whether to use the drug or not.
Therefore the researcher has to know what the power of his test is, i.e., the probability that, when the alternative hypothesis is true (the drug has positive effects), the data analysis will indeed lead to a significant effect and the rejection of the null hypothesis. The ‘power’ of a test is embedded in the “cost-benefit calculations” (Gigerenzer et al., 2004, p. 399) the researcher has to make, i.e., he has to balance, for instance, the relative severity of the error of giving patients a drug that does not work against the error of clinging to too strict a criterion and wrongly withholding a drug that actually does work.
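A sketch of such a power calculation, with all numbers hypothetical; a one-sided z-test with known standard deviation is assumed for simplicity:

import numpy as np
from scipy import stats

# Hypothetical Neyman-Pearson set-up for the drug trial: a one-sided
# z-test on the mean improvement of n patients, with known sd = 1.0.
n, alpha = 25, 0.05
effect = 0.5              # assumed true mean improvement under HA
se = 1.0 / np.sqrt(n)     # standard error of the sample mean

# Reject H0 ('no effect') whenever the sample mean exceeds this criterion:
criterion = stats.norm.ppf(1 - alpha, loc=0.0, scale=se)

# Power: the probability of exceeding the criterion when HA is true.
power = 1 - stats.norm.cdf(criterion, loc=effect, scale=se)
print(f"criterion = {criterion:.3f}, power = {power:.3f}")  # roughly 0.80 here
# Raising n, relaxing alpha or assuming a larger effect all raise the
# power: the 'economic' balancing of the two kinds of error.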
Figure 24. The null hypothesis and the alternative hypothesis
Fisher thought that the statistical approach of Neyman and Pearson was a mechanical,
thoughtless reduction of statistical inference (Gigerenzer, 2000, p. 277; Gigerenzer et al.,
1990, p. 78):
“We have the duty of formulating, of summarizing, and of communicating our conclusions, in
intelligible form, in recognition of the right of other free minds to utilize them in making their own
decisions.” (Fisher, 1955, p. 77)
Fisher seized the fact that Jerzy Neyman was of Polish origin to present his debate with him
as a confrontation between the free western world and the technocratic, dictatorial world
behind the iron curtain:
I shall hope to bring out some of the logical differences more distinctly, but there is also, I fancy, in the
background an ideological difference. Russians are made familiar with the ideal that research in pure
science can and should be geared to technological performance, in the comprehensive organized effort
of a five-year plan for the nation. […] In the US also the great importance of organized technology has I
think made it easy to confuse the process appropriate for drawing correct conclusions, with those aimed
rather at, let us say, speeding production, or saving money. There is therefore something to be gained by
at least being able to think of our scientific problems in a language distinct from that of technological
efficiency. (Fisher, 1955, p. 70)
Fisher makes very clear what is at stake here: the ‘free mind’ and ‘a language distinct from that of technological efficiency’. What makes Fisher – notwithstanding his frequentist approach – stick so stubbornly to the last shreds of ‘subjectivity’ or ‘Bayesianism’ with his ‘fiducial inference’? Why was it that “Fisher wanted to both reject the Bayesian cake and eat it, too” (Gigerenzer, 2000, p. 271)? Is it just that he is “determined by his emotional make-up, not by reason or mathematics” (Kendall, 1963)? Gigerenzer’s explanation of this matter has the same ‘psychological’ orientation as Kendall’s explanation. Gigerenzer (2000)
clarifies the relation between the Neyman-Pearson theory, the Fisherian theory and the
Bayesian theory by using a Freudian analogy: the Neyman-Pearson theory functions as the
Superego (it “forbids epistemic statements about particular outcomes or intervals”); the
Fisherian theory functions as the Ego (it “makes abundant epistemic statements about
particular results”, but “is left with feelings of guilt and shame for having violated the rules”)
and the Bayesian theory functions as the Id (it makes “statements about probabilities of
hypotheses”, although it is censored “by both the frequentist Superego and the pragmatic
Ego”) (Gigerenzer, 2000, p. 280). But is such a ‘psychological’ explanation sufficient? Or is there nonetheless a philosophical or logical necessity for the ‘fiducial argument’, as Hacking (1976) claims in his endeavour to remould Fisher’s ideas into a “Fisher-Hacking theory of fiducial inference” (Bartlett, 1966, p. 632)?
[2.F] Recapitulation & outlook on the next chapter
Recapitulation of chapter 2:
A historical, philosophical and terminological overview of the words ‘probability’ and ‘statistics’ has been presented. From its beginnings in the second half of the seventeenth century until the 1840s, probability is called classical probability. After the 1840s modern probability emerges. In retrospect classical probability appears to be an ambiguous amalgam of subjective and objective probability. Since Kolmogorov formulated his axiomatic system in the beginning of the twentieth century, there has been great consensus about the probabilistic calculus; however, on a theoretical level it is unclear whether probability has to be interpreted as a subjective or an objective phenomenon. This debate also had its repercussions on inferential statistical methodology, which is based on probabilistic assumptions. Since the beginnings of the statistical inferential revolution in the first decades of the twentieth century, scientific statistical methodology has been almost completely based on the ‘objective’, i.e. ‘frequentist’, interpretation of probability. Although in the 1960s and 1970s Bayesian inference (whose underlying probabilistic assumptions are given a subjective interpretation) gained some ground – for instance in Artificial Intelligence research and in the courtroom in the evaluation of the weight of evidence – it remained of marginal significance to scientific methodology. Nevertheless, even within frequentist inference it appears to be very hard to get rid completely of the subjective (or: ‘Bayesian’) interpretation, and shreds of subjectivity abound.
Outlook on chapter 3:
Whereas chapter 2 presented the contrast between frequentist (objective) and Bayesian (subjective) statistics from a statistical and a historical, or even anecdotic, point of view, and the ‘contamination’ of frequentist statistics by subjective semantics as a psychological curiosity, chapter 3 will:
(a) show how the frequentist and Bayesian positions form a philosophically incommensurable and hard opposition by linking Bayesian statistics to ‘classical epistemology’ and frequentist statistics to ‘evolutionary epistemology’,
(b) show how the contrast between Bayesianism and frequentism also underlies a contrast between, respectively, the cognitive theories and the statistical methodology in psychology.
I will now concretise this a little bit. From the perspective of evolutionary epistemology – which was formulated in the first place by the philosopher Karl Popper – rationality has no need for a ‘knowing subject’, ‘beliefs’ or ‘representations’. Popper fought his whole life against ‘subjectivity’ and tried to introduce ‘non-subjective’ words such as ‘corroboration’ and ‘falsification’ to describe the growth of knowledge. In this evolutionary approach statistical methodology has to be understood as a re-enactment of the evolutionary process of natural selection amongst psychological theories and ideas – not much different from the evolution of plants and animals – whereby both the researchers and the researched subjects are just ‘cogs’ in the wheel of science. The insupportable ‘bleakness’ of this idea may explain why the frequentist ‘meaning’ of statistics is often so poorly grasped by its users: both the semantics of cognitive psychological theories and that of frequentist statistical inferential methodology are still drenched in ‘subjectivity’. However, whereas in a lot of contemporary cognitive theories Bayesianism is the ‘official’ standard against which rationality is measured, in statistical methodology the use of subjective semantics is ‘just a slip of the tongue’ which keeps creeping into frequentist semantics.
CHAPTER 3: PSYCHOLOGY AND STATISTICS: WITH OR WITHOUT A
KNOWING SUBJECT?
The development of statistical inferential thought in the beginning of the twentieth century concerned frequentist (i.e. ‘objective’) statistical methodology. Bayesian (i.e. ‘subjective’) statistical inference is mostly not even mentioned in the methodological textbooks in psychology.
I. The lure of subjective semantics: outlook on sections (A), (E) and (F) of this chapter.
So, Bayesian statistics is hardly ever mentioned in the methodological textbooks in psychology. However, as became apparent from the description of Fisher’s ideas in the previous chapter, and as is shown for instance by the research of Gigerenzer (2004) that will be described in this chapter, the (incorrect) application of ‘subjective’ semantics to frequentist statistical inference appears to be widespread and ineradicable. The lure of ‘subjective’ semantics in statistical inference will be the subject of section (E) of this chapter.
The lure of ‘subjective’ semantics is also present in psychological theories of mind. The normative idea that human cognition should obey the rules of Bayesian inference – i.e., that the nonconformity of human cognition to the rules of Bayesian inference is a sign of its limitations and weaknesses – is a theory that has gained a lot of support in cognitive psychology since the 1970s. Kahneman and Tversky – the former of whom won the Nobel Prize in Economics in 2002 – have promulgated this normative idea that human cognition is fundamentally deficient: in order to be rational it should have been a flawless Bayesian ‘intuitive statistician’ – but it is not (Gigerenzer, 2000; Gigerenzer & Murray, 1987; Gigerenzer et al., 1990).
Following the research of Kahneman and Tversky – wherein they argued that “in his
evaluation of evidence, man is apparently not a conservative Bayesian: he is not a Bayesian at
all” (Kahneman & Tversky, 1972, p. 450) – cognitive illusions, heuristics and biases have
become “the fodder for classroom demonstrations and textbooks” (Gigerenzer, 2000, p. 237).
The success of this “heuristics-and-biases” movement can be partly explained by the fact that it is in a way great fun to gloat over how dumb an intuitive statistician human cognition is (Gigerenzer, 2000, p. 237).
However, underlying the “heuristics-and-biases” program (whose origin can be found in: Kahneman, Slovic, & Tversky, 1982) is nonetheless the idea of a subjective mind that makes inferences according to the rules of probability – sometimes successfully, sometimes
not. The idea that human rationality should be measured by the standards of Bayesian ‘intuitive statistics’ fits into a general tendency in cognitive psychology to formulate theories of human cognition in very ‘subjective’ semantics, encompassing words such as ‘beliefs’ and ‘representations’. This tendency is enhanced by the popularity of neural network modelling in psychology (Wheeler, 2005), which is closely related to artificial intelligence, where Bayesian inference has found many applications. The lure of ‘subjective’ semantics in cognitive psychology will be the subject of section (F) of this chapter.
II. Frequentist statistics as a ‘re-enactment’ of evolution that has no place for ‘subjectivism’: outlook on sections (B), (C) and (D) of this chapter.
However, before expanding on how statistical methodology and cognitive psychology are ensnared in ‘subjective’ semantics, I will answer in sections (B), (C) and (D) of this chapter the question of what is ‘wrong’ with ‘subjective’ semantics. Is statistics not an expression of how probable our belief in a hypothesis is? No, certainly not! And is psychology not a science about our representations and beliefs? No, certainly not! To substantiate this I will turn to the philosophy of Karl Popper (1902-1994), who has convincingly shown the shortcomings of a subjective approach to probability. He approaches knowledge from an evolutionary perspective, which excludes Bayesian statistics. In the same vein, I will show how the use of ‘subjective’ semantics does not correspond to the practice of psychological research.
III. Overview of the sections in this chapter
So, I summarize the subjects that will be treated in the six sections of this chapter. The
sections marked with an asterisk (*) deal with the allurement of subjective semantics. The
sections without asterisk deal with frequentist statistical inference as an evolutionary process.
[A] * Probability entangled in subjective semantics and the problem of induction.
[B] Probability freed from subjective beliefs: rationality without a knowing subject.
[C] Statistics à la Popper: natural selection of a falsifying rule for statistical hypotheses.
[D] The statistical research methodology as a re-enactment of evolution: stability in the long run.
[E] * The lure of ‘subjective’ semantics in statistical inference.
[F] * The lure of ‘subjective’ semantics in cognitive psychology.
[3.A] Probability entangled in subjective semantics and the problem of
induction.
In retrospect it appears that for the first two centuries after its emergence probability was ‘Janus-faced’: from a modern point of view it was a curious amalgam of ‘subjective’ (i.e., ‘epistemological’) and ‘objective’ (i.e., ‘frequentist’) probability. Classical probabilists thought that probability was a phenomenon related not only to frequencies in reality, but also to human beliefs. Moreover, the classical probabilist believed that there was an exact correspondence between the tendency of a relative frequency to stabilize in the long run (the longer the series of throws with a die, the closer the relative frequency with which a certain number turns face up will approach the true probability), and the increasing probability that can be attached to our beliefs: i.e., as the number of throws increases, we become more ‘experienced’ and subsequently our beliefs will be less and less subject to uncertainty. However, as Hume pointed out already in the eighteenth century6, there is no rational ground why we should assume that an increasing amount of experience would justify an increase in the probability which we may attach to our beliefs. This is the so-called ‘problem of induction’: there is no rational ground for assuming that the future will conform to the past. The fact that the sun has risen every day for millions of years does not entail that it is rational to think that the hypothesis that the sun will rise tomorrow has a very high probability: we have just grown accustomed to the fact that the sun rises every day. There are probably not many people who worry every evening about the possibility that the sun will not rise the following morning: however, this widespread unconcern is based solely on habit and has no rational ground whatsoever.
In the 1840s classical probability came to an end and probabilists began to distinguish between subjective and objective probability. Objective, i.e., ‘frequentist’, probability is in principle unrelated to our beliefs and knowledge and therefore should not be affected by the problem of induction: the tendency of the relative frequency to stabilize in the long run is independent of observation. However, our language is so imbued with ‘subjectivity’ (“I observe how the relative frequency stabilizes in the long run and this brings me to engage in inductive reasoning and make statements about the probability of future throws”) that it took quite some time before a theory of frequentist probability could be formulated that was not subject to the
6 See above, in § 2.C “Making sense in the pile of words: probability, statistics and statistical inference” (p. 31-32).
problem of induction. After all, we saw in the previous chapter how even radical frequentists such as Jerzy Neyman and Egon Pearson spoke of ‘inductive behavior’: apparently even they could not get rid of the word ‘induction’.
In “1927 or thereabouts” (Popper, 1979, p. 1) the philosopher Karl Popper (1902-1994) found a way to formulate objective probability without getting trapped in ‘subjective’ semantics and the induction problem that is entailed by this semantics:
Of course, I may be mistaken; but I think that I have solved a major philosophical problem: the problem
of induction. (Popper, 1979, p. 1)
Figure 25. Karl Popper
The ‘de-subjectivization’ of probability by Popper was made in two steps:
(i) showing why the subjective interpretation of probability and the induction problem
are irrelevancies, following from a mistaken epistemology.
(ii) formulating an objective interpretation of probability, uncontaminated by
subjectivist semantics.
These steps will be the subject of the next section.
[3.B] Probability freed from subjective beliefs: rationality without a knowing
subject.
This section shows why the subjective interpretation of probability and the induction problem
are irrelevancies following from a mistaken epistemology, viz., from pre-Darwinian classical
epistemology. The first part of this section will be devoted to contrasting classical
epistemology with the ‘Darwinian’ or ‘evolutionary’ theory of knowledge. The second part of
this section will show how there is no place for subjective probability in an evolutionary
approach. The third part formulates Popper’s objective propensity interpretation of probability.
I. The process of tentative solutions (conjectures) and falsifications (error elimination).
Darwinism changed the way we view the world. Probably the most fundamental change was
that Darwinism made the idea possible that there could be rationality without previous design,
viz., that for instance such a complex organ as the human eye could emerge on the base of
mere variation and selection. Adaptations can arise by natural selection, without need of
intelligence: all the ‘designs’ in the biosphere emerged from ‘mindless’ algorithmic process
(Dennett, 1996).
It was Darwinism that made it possible for Popper to escape from ‘subjective’
semantics: Popper solved the problem of induction because he rejected classical epistemology
and adopted a ‘biological’, ‘Darwinist’ or ‘evolutionary’ outlook on knowledge instead. This
‘evolutionary approach’ (Popper, 1979) – Popper considered the expression ‘evolutionary
epistemology’ too pretentious and preferred to speak of an ‘evolutionary theory of knowledge’
(Popper, 1990) – allowed him to formulate an objective interpretation of probability that has
no need for the problematic subjective concepts from classical epistemology, such as ‘beliefs’
or ‘representations’.
In ordinary language knowledge is most of the time tied to a knower: we are used to utterances such as “I know”, “I believe” or “I am thinking”. Classical epistemology, i.e., practically every epistemology before Popper, assumed that there is no other knowledge than this subjective knowledge – knowledge tied to a knower. In classical epistemology knowledge was consequently seen as “a certain kind of belief – justifiable belief, such as belief based on perception” (Popper, 1979, p. 122).
Popper argued that this thought – ‘all knowledge is subjective knowledge’ – is a
fallacy. The growth of scientific knowledge is a growth of objective knowledge – knowledge
without a knowing subject. Objective knowledge ‘grows’ or ‘evolves’ in a way that is very
similar to “biological growth; that is, the evolution of plants and animals” (Popper, 1979, p.
112). In the same way as the evolution of plants and animals is a rational process that takes
place without some underlying design, knowledge also evolves in a rational way independent
of particular ‘knowing subjects’: therefore one can call it objective. The growth of knowledge
is not a result of an increase in ‘subjective’ knowledge located in the mind of some knower,
but is instead an ‘objective’ process of error elimination, viz., the elimination of unsuccessful
ideas, resulting from the exposure of knowledge to the surrounding world.
An example may elucidate this process of error elimination. There have been times, for instance, in which the idea that the world is a flat floating disk was a very successful replicator – Dawkins (1989) would call such a cultural replicator a ‘meme’. Of course it originally must have been uttered by somebody – however, immediately after its ‘conception’ it began to lead its own life as a replicator: it nestled itself in the human mind and was transmitted from generation to generation. Yet, as the idea of a flat earth became more and more exposed to evidence contradicting it, it was eventually conquered by the idea that the surface of the earth is spherical. The belief in a flat earth was eliminated in a Darwinist struggle for life between competing theories.
So, according to Popper the growth of knowledge does not follow from inferences or
inductions, but solely from error elimination. This is quite counter-intuitive – growth of
knowledge is in fact reduction of errors: we just err less and less. However, the theory that the
surface of the earth is spherical is – like all our theories – still a conjecture, i.e., a tentative
theory. We can never be sure that this theory is true. The only ‘progress’ in our knowledge is
due to error elimination: viz., that we effectively eliminated the theory that the earth is flat.
Popper argues consequently that the problem of induction now becomes irrelevant,
because it wrongly assumes our knowledge grows due to inductive reasoning, whereas it only
‘grows’ due to error elimination. The evolution of knowledge is described by the following
simple schema: “P1 → TT → EE → P2”, that is, “problem P1 → tentative theoretical solution TT → evaluative error elimination EE → new problem P2” (Popper, 1979, p. 119 f.f. and p. 144).
The evolution of objective knowledge does in principle not differ from the evolution
of other “non-living structures which animals produce, such as spiders’ webs, or nests built by
wasps or ants, the burrows of badgers, dams constructed by beavers, or paths made by
animals in forests” (Popper, 1979, p. 112).
Figure 26. Karl Popper lecturing. On the blackboard behind him he has written: P1 → TT → EE → P2
For instance, beavers – Popper used these animals very frequently as an example (cf. Popper, 1990, p. 50-51) – are in principle not too particular in their choice of the material that they use to build their dams. Nevertheless, their choice of material is subject to feedback from the environment. If dams built of tin-cans do not succeed in slowing the river water down, this apparently unsuccessful solution will be eliminated through environmental negative feedback: either the beavers will adjust their behaviour and their attempts to build dams out of tin-cans will die out, or – if some erring beavers persist in their preference for tin-cans – they will be ‘eliminated’ themselves through natural selection and the dams built of tin-cans will perish accordingly. However, every choice of material for building a dam will always remain a tentative solution to the problem of how to slow down the water in the river.
Figure 27. A beaver. This animal is the mascot of the London School of Economics, where Popper was a
Professor for 23 years. The dam-repairing beaver was one of Popper’s favourite examples of how his ‘critical
approach’ was also present in the animal kingdom (e.g. Popper, 1990, p. 50-51).
The past experience of a beaver that twigs, gnawed branches and rocks form solid building
blocks for a dam, gives him no guarantee whatsoever that his future dams of twigs, gnawed
branches and rocks will be successful too.
So, according to Popper human knowledge is not fundamentally different from other “objective products of life, such as spiders’ webs, birds’ nests, or beaver dams”, for they are all “products that can be repaired or improved” (Popper, 1990, p. 50). Moreover, both beaver dams and human objective knowledge are always conjectures, i.e. tentative solutions to a problem, that can be ‘eliminated’ in the struggle for life. Both beaver-dams and scientific knowledge evolve according to the ‘mindless’ algorithmic process of P1 → TT → EE → P2, etc. Scientists let “their false theories die in their stead” (Popper, 1979, p. 122), whereas
etc. Scientists let “their false theories die in their stead” (Popper, 1979, p. 122), whereas
beavers – at least those who have the capacity to learn from their faults – let their ‘false’
beaver-dams die in their stead. Thus, the scientist and the beaver are united by the fact that
their conjectural solutions can die in their stead – in this sense they are both “Popperian
creatures” (Dennett, 1996, p. 375 f.f.).
Yet, there are of course differences between beavers and scientists. For instance, if a beaver meets another beaver in the woods by chance, they will not be able to have a little chit-chat about their experiences with building dams out of tin-cans: after all, the beaver is not able to transmit the information that tin-cans are poor material for building dams.
No wonder their comprehension is limited. Ours would be, too, if we had to generate it all on our own.
(Dennett, 1996, p. 380)
The transmission of information in non-genetic ways – through ‘memes’ subject to ‘cultural evolution’ – has enabled human knowledge to evolve at an incomparably faster pace than that of genetic evolution (Dennett, 1996): “…we today – every one of us – can easily understand many ideas that were simply unthinkable by the geniuses in our grandparents’ generation!” (Dennett, 1996, p. 377). At a very quick rate the grains of knowledge are sifted from the chaff. Nevertheless, it is evident that the pace of the growth of knowledge in, for instance, Antiquity was much lower than it is now: especially since the first decades of the twentieth century the growth of scientific knowledge has accelerated immensely, viz., the elimination of probably erroneous information has sped up tremendously.
I believe that frequentist statistical inference has played a major role in the ‘speeding
up’ of science, because in a sense it re-enacts the evolutionary processes in such a way that
variation is ‘tamed’ in it and falsification elicited.
An example may clarify this. A beaver does not build his tin-can dam for the sake of
hypothesis testing: if his dam is swept away by the water, it is conceivable that he will rise ‘a
sadder and a wiser’ beaver on the ‘morrow morn’ – for the beaver may have the ability to
learn from his faults – but the beaver did not elicit this ‘falsification’ of his tin-can
‘hypothesis’, nor did he accelerate the ‘mindless algorithm of natural selection’ through an
analysis of the events on the basis of probability-based modelling of his hypotheses and
statistical inference that would permit him to ‘tame’ the ‘chance factor’ in this manner.
Now, assume that, for instance, Aristotle passes by the river and wonders why the
beaver-dam was swept away. He discusses this question with a friend, who tells him that
every year around this time there are such heavy flash floods, that the beaver-dams are swept
away – no matter what they are made of. Aristotle has a considerable advantage in
comparison to the beaver: for he has access to a world of information, which the beaver has
not.
However, if a modern scientist were to pass the river, he could set up an experiment with a neatly randomized design of beaver-dams made of different materials to see if there is any statistically significant difference between the stability of these dams. The personal beliefs of the modern scientist about the stability of a particular beaver-dam would not matter: what matters is that he created an experimental ‘set-up’ which results in data that may be so improbable – given the null hypothesis (‘there is no more difference between the stability of beaver-dams made of twigs and beaver-dams made of tin-cans than one would expect due to mere chance variation’) – that this null hypothesis has to be rejected: a probably erroneous theory has been eliminated. I will elaborate on the relationship between statistical methodology and evolution in section (D) of this chapter.
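To make this concrete: a minimal sketch of such a test, with invented stability scores and hypothetical variable names (twig_dams, tincan_dams) – an illustration of the logic of significance testing, not a reconstruction of any actual experiment:

```python
import numpy as np
from scipy import stats

# Invented stability scores (e.g. hours until collapse) for two
# randomized groups of beaver-dams; all numbers are illustrative.
rng = np.random.default_rng(0)
twig_dams = rng.normal(loc=30.0, scale=5.0, size=20)
tincan_dams = rng.normal(loc=22.0, scale=5.0, size=20)

# Two-sample t-test of the null hypothesis that both kinds of dams
# are equally stable (any observed difference is mere chance variation).
t_stat, p_value = stats.ttest_ind(twig_dams, tincan_dams)

# If the data are sufficiently improbable given the null hypothesis,
# that hypothesis is rejected: a probably erroneous theory is eliminated.
if p_value < 0.05:
    print(f"p = {p_value:.4f}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f}: the null hypothesis survives this test")
```

The personal beliefs of the experimenter appear nowhere in this procedure: only the improbability of the data given the null hypothesis decides.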
II. There is no place for subjective probability in an evolutionary approach.
Popper argued – as I mentioned already briefly above7 – that it is absolutely impossible to attach a ‘probability’ to a hypothesis: a probability can only refer to the data. It is a fallacy to think that the more white swans one encounters, the more probable the hypothesis ‘All swans are white’ becomes; after all, one just needs to encounter one black swan to falsify this hypothesis completely (Popper, 1972). Popper therefore strongly opposed every form of Bayesian statistics, which does attach a probability to hypotheses. Even Fisherian statistical inference, i.e. frequentist inference with a subjectivist twist, is absolutely irreconcilable with Popper’s thought (Gigerenzer & Murray, 1987; Mayo, 1996). An example may clarify this.
7 See above, in § 2.C “Making sense in the pile of words: probability, statistics and statistical inference” (p. 36-37).
Imagine that it is the 1st of September 1994 and that Karl Popper has a nurse who is a strong Bayesian believer. Although Popper tries to explain to her that Bayesianism is nonsensical and that a hypothesis cannot have a probability, the nurse wants to calculate the probability of her hypothesis that Popper will be alive on the next day. She calculates the Bayesian probability and the next day Popper indeed is alive. So she decides to calculate the probability that he will also be alive on the day after that. With every day that Popper lives, the Bayesian probability that he will live the next day rises, because every day the belief of the nurse is strengthened by the evidence of a living Karl Popper.
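One simple way to formalize the nurse’s reasoning – an illustrative assumption of mine, not anything found in Popper or in the Bayesian literature on survival – is Beta-Bernoulli updating, in which each survived day counts as evidence for the hypothesis:

```python
# Beta-Bernoulli updating: 'Popper survives a day' is treated as a
# Bernoulli trial with unknown probability theta, starting from a
# uniform Beta(1, 1) prior. Each survived day counts as a 'success'.
# (This framing is purely illustrative.)
alpha, beta = 1.0, 1.0

for day in range(1, 17):          # 1-16 September 1994
    alpha += 1.0                  # another day survived
    # Posterior mean of theta = predictive probability for tomorrow
    p_alive_tomorrow = alpha / (alpha + beta)
    print(f"day {day:2d}: P(alive tomorrow) = {p_alive_tomorrow:.3f}")

# The predictive probability rises day after day - right up to the
# day on which the datum of Popper's death 'falsifies' the hypothesis.
```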
There is a strange paradox here, and I myself find it highly moving to read what Popper says in 1988, when he is already 86 years old:
How probable is it that you will live another 20 years? This has its own little mathematical problems.
Thus, the probability that you will live another 20 years from today – that is that you will be still alive
in 2008 – increases for most of you every day and every week as long as you survive, until it reaches
the probability 1 on the 24th of August 2008. Nevertheless, the probability that you will survive for
another 20 years from any of the days following today goes down and down with every sneeze and with
every cough; unless you die by some accident, it is not unlikely that this probability will become close
to 0 years before your actual death. (Popper, 1990, p. 8)
On the 17th of September 1994 Karl Popper dies: on this date the datum of Popper’s death has ‘falsified’ the hypothesis of the nurse. Data have probabilities, hypotheses have not8. A frequentist statistician may gather the dates of birth and death of Austrian philosophers and study whether these data – expressing the lengths of their lives – are probable, given the hypothesis that they do not differ from the life expectancies of other West-European males.
Keynes – who adhered to a subjective interpretation of probability (Gillies, 2003; Hacking, 2001) – jeered at the frequentist position by paraphrasing it as: “In the long run we are all dead” (von Plato, 1987, p. 381). Nevertheless, this ironic paraphrase of the frequentist interpretation shows something of the abyss that lies between the subjective probability of the belief of the nurse concerning the particular human being Karl Popper and the objective-frequentist probability of the ‘long run’ and the ‘we…all’.
8 See also above, in § 2.C “Making sense in the pile of words: probability, statistics and statistical inference” (p. 37).
As long as hypotheses are not falsified, one has to stick to them. However, when one has to stick to a hypothesis because it has stood up to a test, this does not imply that the hypothesis is ‘true’ or ‘very probable’: after all, all our knowledge is only tentative, i.e., ‘conjectural’, and may be falsified in a later test. Nonetheless, one can contend that a hypothesis that stands up to many tests is a strong hypothesis: Popper calls such hypotheses corroborated hypotheses (Popper, 1972). Yet, the degree of corroboration has nothing to do with the calculus of probabilities.
Popper’s fight against the subjective interpretation of probability was not an easy one. Popper saw around him how philosophers of probability constantly fell back on the ‘old’ subjectivist beliefs concerning probability. An exemplary history is that of the philosopher Carnap. To conclude this section I will cite Popper himself:
Carnap was then [in 1934], and for some years afterwards, entirely on my side, especially concerning
induction […]. Carnap and I had come, in those days, to something like an agreement on a common
research programme on probability, based on my Logik der Forschung. […] and we agreed not to
assume […] that the degree of confirmation or corroboration of a hypothesis satisfies the calculus of
probabilities […].
This was the state of the discussion reached in 1934 and 1935. But 15 years later Carnap sent me his
new big book, Logical Foundations of Probability, and, opening it, I found that his explicit starting
point in this book was the precise opposite – the bare, unargued assumption that the degree of
confirmation is a probability in the sense of probability calculus. I felt as a father must feel whose son
has joined the Moonies; though, of course, they did not yet exist in those days. (Popper, 1990, p. 4-5)
III. The objective propensity interpretation of probability.
So, if probability is not an expression of the degree of certainty we can attach to our beliefs and hypotheses – what is it? Until Popper, all objectivist interpretations of probability had been frequentist interpretations, viz., they stated that in the long run relative frequencies tend to stabilize. The question why a relative frequency (for example of ‘heads’ turning up in 50% of all tosses with a coin) tends to stabilize in the long run was mostly evaded: the only matter of importance was that this frequentist probability could be measured objectively. Popper, however, was more explicit about objectivist probability and argued that probability is a ‘tendency’, ‘disposition’ or ‘propensity’ of certain conditions to generate the observed relative frequency. He called this the ‘propensity’ interpretation of probability.
According to Popper probability is a purely objective propensity (Popper, 1972, 1983, 1990), i.e., a ‘tendency’ of a system to behave in a certain way: a coin may have the objective propensity to turn up heads in approximately ½ of all tosses, independently of our subjective beliefs.
But this means that we have to visualise the conditions as endowed with a tendency or disposition, or
propensity, to produce sequences whose frequencies are equal to the probabilities; which is precisely
what the propensity interpretation asserts. (Popper, 1959, p. 35)
[3.C] Statistics à la Popper: the natural selection of a falsifying rule for
statistical hypotheses.
What kind of statistical inference would Popper’s evolutionary theory of knowledge and his
objective propensity interpretation of probability have endorsed?
Though it is clear that Popper’s ideas concerning statistical inference are much closer to those of Jerzy Neyman and Egon Pearson than to those of Fisher, I do not think that Popper would ever have used, for instance, an expression such as ‘inductive behavior’: both the notion of ‘induction’ and that of ‘behavior’ seem to me rather inappropriate from the Popperian perspective – for Popper despised both inductivism and behaviourism (see e.g. Popper, 1974). After all, it is the job of philosophers to be extremely particular about words. So, in which words should statistical inference be thought of, according to Popper?
In the first place it would have to be clear that in experimental research9 the name ‘statistical inference’ does not imply ‘statistical induction’. After all, we do not aim to induce theories but only to eliminate erroneous theories. I think that from Popper’s point of view it would perhaps be more appropriate to speak of, for instance, ‘statistical falsification methodology’ or ‘statistical error elimination methodology’. Nevertheless, science never adopted an expression that combined the words ‘falsification’ and ‘statistics’.
The reason for this is contained in the second remark I have to make, viz., that probabilistic statements are in principle not falsifiable (Popper, 1972)! For instance, the hypothesis that all swans are white can easily be falsified by one black swan; however, there is no analogous method to falsify the hypothesis that a coin is unbiased, i.e., has a probability of ½ of turning up tails. Only if we were able to produce an infinite sequence of tosses with this coin – which is of course impossible! – and the relative frequency of tails turned out to be, for instance, ⅓, could we falsify a probabilistic hypothesis: only “an infinite sequence of events […] could contradict a probability estimate” (Popper, 1972, p. 190). One might think that this would form a major problem for the empirical sciences – such as psychology – because practically all hypotheses they formulate are statistical, i.e., probabilistic, hypotheses: for instance, when a scientist wants to know whether a certain treatment has any statistically significant effect, his situation can be compared to that of a scientist tossing a coin and hoping to find out that the coin is biased, viz. that there is a difference between the treatment group and the control group. However, strictly falsifying the null hypothesis that the treatment has no effect would require an infinite sequence of trials.
9 However, it must of course be clear that in inferential non-experimental research – e.g. in a survey study – the estimations of population parameters from sample statistics are in fact inductions.
For although probability statements play such a vitally important role in empirical science, they turn out
to be impervious to strict falsification. (Popper, 1972, p. 146)
Nevertheless, the empirical sciences are very successful in deciding when to accept and when to reject a hypothesis. Assume for instance that a scientist, whose hypothesis is that a coin is unbiased, has made 10,000 tosses of which only 5 turned up tails. Given the hypothesis, this result, i.e. these data, is highly improbable (although not impossible!) and therefore the scientist may decide to consider his hypothesis as “practically falsified” (Popper, 1972, p. 191):
It is fairly clear that this ‘practical falsification’ can be obtained only through a methodological decision
to regard highly improbable events as ruled out – as prohibited. But with what right can they be so
regarded? Where are we to draw the line? Where does this ‘high improbability’ begin? (Popper, 1972,
p. 191)
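How improbable Popper’s example result is can be computed directly from the binomial distribution; a minimal sketch (the raw probability is so small that it is best handled on a log scale):

```python
from scipy.stats import binom

# Null hypothesis: the coin is unbiased, i.e. P(tails) = 0.5.
n, p = 10_000, 0.5

# Log-probability of observing 5 or fewer tails in 10,000 tosses given
# the null hypothesis (the raw probability underflows double precision).
log_p = binom.logcdf(5, n, p)

print(f"log P(at most 5 tails | fair coin) = {log_p:.1f}")
# Roughly e^-6890: astronomically small, so the hypothesis that the
# coin is unbiased may be regarded as 'practically falsified'.
```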
To summarize: statistical inference in experimental research turned out (a) not to ‘infer’ in the sense in which ‘inference’ is mostly understood, viz., as ‘induction’, but only to falsify; (b) however, frequency evidence cannot falsify statistical hypotheses – in any case not in the strict sense of the word ‘falsification’; therefore (c) statistical methodology has to rely on ‘practical falsification’, i.e., make a pragmatic decision about how low the probability of an observed result given the hypothesis should be in order to lead to the rejection of the hypothesis. This pragmatic solution ‘solves’ the conflict between two conclusions following from Popper’s evolutionary approach: on the one hand that induction is unscientific and that the growth of scientific knowledge can only follow from ‘error elimination’ or ‘falsification’, but on the other hand that statistical hypotheses are in principle unfalsifiable.
This leads to the question of how one should derive such a ‘pragmatic’ criterion for rejecting statistical hypotheses – in Popperian terminology one would say “a falsifying rule for probability statements” (Gillies, 1971). The only direction for the formulation of such a pragmatic criterion is that, given the hypothesis, the probability of an observed result should be ‘very low’. Although from a mathematical point of view the definition ‘very low probability’ is too simplistic (see for an extensive elaboration on this problem e.g. Gillies, 1971), I will for simplicity’s sake refrain from getting into mathematical delicacies and concentrate on the general theoretical aspects of how a falsification-criterion should be established.
I will try to interpret the choice of such a criterion in the light of the evolutionary
theory of knowledge that was expounded in the previous section (B).
Viewed from the standpoint of an evolutionary theory of knowledge it becomes immediately clear that the ‘choice’ of a falsification-criterion is itself also the result of an evolutionary process of variation and selection, or, as Popper says, of accidents and preferences. It would perhaps be better to speak of the ‘emergence’ (instead of ‘choice’) of a criterion out of a struggle for existence among a range of possible criteria:
It is obvious that in the evolution of life there were almost infinite possibilities. But they were largely
exclusive possibilities; so most steps were exclusive choices, destroying many possibilities. As a
consequence, only comparatively few propensities could realize themselves. Still, the variety of those
that have realized themselves is staggering. I believe that this was a process in which both accidents and
preferences, preferences of the organisms for certain possibilities, were mixed: the organisms were in
search of a better world. Here the preferred possibilities were, indeed, allurements. (Popper, 1990, p. 26)
In the ‘struggle for existence’ amongst various statistical methods and criteria for the rejection of hypotheses there has turned out to be one winning team, namely significance testing combined with rejection of the null hypothesis at p < 0.05 or p < 0.01. The expressions p < 0.05 and p < 0.01 indicate that the data have a rather low probability given the null hypothesis:
Either the null hypothesis is true, in which case something unusual happened by chance (probability
[5% or] 1%), or the null hypothesis is false. (Hacking, 2001, p. 217)
However, why would one not use, for instance, p < 0.03 or p < 0.005? In fact one could contend that the most widely applied rejection criteria in statistical methodology – the rejection of the null hypothesis in significance testing at p < 0.05 or p < 0.01 – actually were “a sort of mathematical accident (italics mine)” (Hacking, 2001, p. 217).
Long before pocket calculators made some calculations trivial, these figures became easy standards for
comparison, simply because you could compute them without weeks of back-breaking labor. Today
many investigators use a statistical software package without really understanding what it does. You can
just enter data, and press a button to select a program. (Hacking, 2001, p. 217)
In retrospect one can see that this ‘mathematical accident’ was a very successful accident. In a sense it is the success of this methodology that justifies why one should assume that a probability of 5% or 1% is such a ‘very low probability’ that the hypothesis can be considered ‘practically falsified’ and therefore rejected. The success of significance testing at the 5% or 1% level legitimizes its practice – and from a Darwinistic point of view this is perfectly reasonable. Michael Cowles concludes his book on the role of statistics in psychology in the same vein:
If it is to be admitted that the logical foundations of psychology’s most widespread method of data assessment are shaky, what are we to make of the “findings” of experimental psychology? Is the whole edifice of data and theory to be compared with the buildings in the towns of the “wild west” – a gaudy false front, and little of substance behind? This is an unreasonable conclusion and it is not a conclusion that is borne out by even a cursory examination of the successful predictions of behaviour and the confident applications of psychology, in areas stretching from market research to clinical practice, that have a utility that is indisputable. The plain fact of the matter is that psychology is using a set of tools that leaves much to be desired. […] But, they seem to have been doing a job. Psychology is a success.
From a Darwinist point of view the statistical-psychological research practice is rational because it has turned out to be a success – though this can be seen only in retrospect. Of course, it could happen that the statistical-psychological research practice will be ‘falsified’ one day, but at this moment it is indisputably a success and consequently rational too.
[3.D] Statistical methodology as a re-enactment of evolution: tamed variation
and accelerated selection
Darwinism made rationality equivalent to ‘that-which-does-not-perish-in-the-long-run’ in the evolutionary process of variation and selection: for if something survives, it is apparently well enough adapted to its environment and consequently it can be called rational. Thus rationality is no longer tied to the rationality of a ‘rational subject’, i.e. the thoughts of person X or person Y (although the thoughts of person X or Y may turn out to be rational – if they survive in the Darwinist struggle for existence among thoughts).
Theories are constantly conquered by other theories: in that respect the fact that frequentist statistical methodology is the scientific methodology that has prevailed in the struggle for existence among different statistical approaches – at least for the time being – does not differ fundamentally from the fact that the belief in a flat earth has been conquered by the theory that the surface of the earth is spherical.
Yet, frequentist statistical methodology has a ‘feature’ which makes it stand out from other theories that have been successful in the struggle for existence – after all, frequentist statistical methodology is not only a ‘result’ of evolutionary processes, but it also ‘re-enacts’ these evolutionary processes to a certain extent.
Just like ‘natural’ processes of evolution, frequentist statistical methodology is an algorithm (Dennett, 1996) consisting of competition and selection, which depends on (chance) variation and generates rational results in the long run. However, compared to ‘natural’ processes of evolution, statistical inferential methods enhance and accelerate the processes of selection because they enable scientists to distinguish much better between ‘chance’ variation and ‘structural’ variation. In this section I will discuss in more detail the role of (chance) variation and selection in both evolution and frequentist statistics:
(§1) (Chance) variation in evolution and statistics: Quetelet, Darwin, Galton and K.
Pearson.
(§1a) Quetelet and Variation: the constant cause of real variation.
(§1b) Darwin and Variation: in search for the causes of the ‘details’ of
variation.
(§1c) Galton and Pearson: distinguishing structural variation from chance
variation.
(§2) Selection in evolution and statistics.
(§1) (Chance) variation in evolution and statistics: Quetelet, Darwin, Galton and K. Pearson.
Evolution depends on variation: if you have to select something, you have to be presented in the first place with a choice among different variants. Fortunately, there is no lack of variation. It is an empirical fact that variation is everywhere around us, for no replicative process is perfect: not only is biological replication subject to, for instance, genetic mutations, but processes that are considered to be perfect ‘artificial’ copying processes are subject to deviations as well: although modern photocopying methods are of course more reliable than the handwritten copies made in monasteries by copyists in the days before printing, no copying system is perfect. “Mistakes will happen” (Dawkins, 1989).
However, not all variation is ‘just’ meaningless and insignificant deviation. Some variation can be qualified as meaningful, while some variation has to be called ‘dumb’ chance. How can one distinguish between these kinds of variation? The ‘magic word’ which answered this question was the bell-shaped curve – the so-called ‘normal distribution’.
Because this normal curve apparently governs many phenomena in this world – for instance biological features such as people’s heights or IQ scores, or a randomly (non-systematically) varying phenomenon such as unbiased measurement error – it became possible to estimate the probability of the variability of certain data, given a hypothesis.
Figure 27. ‘Normal’ or ‘bell-shaped’ curve.
The normal curve makes it possible, for instance, to say things like: “If the assumption is true that the height of men in the Netherlands is normally distributed with an average of seventy inches and a standard deviation (the measure of the average distance of the data values from their mean) of four inches, then the probability that a randomly chosen man from the same population has a height between sixty-six and seventy-four inches is about sixty-eight percent” (cf. Aczel, 2004, p. 109 f.f.). Variation is in this way ‘domesticated’: one can now see whether certain variability is probable or not. Moreover, because chance variation – ‘error’ – is assumed to be distributed normally, improbable deviations from normality are a very useful indicator of systematic variation which has not been taken into account: deviations from normality have to be explained.
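The height example can be checked directly; a minimal sketch, assuming the stated mean of seventy inches and standard deviation of four inches:

```python
from scipy.stats import norm

# Assumed distribution of men's heights: normal with mean 70 inches
# and standard deviation 4 inches (the figures from the example above).
mean, sd = 70.0, 4.0

# Probability that a randomly chosen man is between 66 and 74 inches
# tall, i.e. within one standard deviation of the mean.
p = norm.cdf(74, loc=mean, scale=sd) - norm.cdf(66, loc=mean, scale=sd)

print(f"P(66 < height < 74) = {p:.3f}")  # about 0.683
```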
The nineteenth century history and philosophical foundations of the normal curve can
be found in the thought of Quetelet (1a), Darwin (1b) and Galton and Karl Pearson (1c).
(§1a) Quetelet and Variation: the constant cause of real variation.
To my knowledge the Belgian statistician Adolphe Quetelet was the first to formulate the idea that deviations are not only ‘errors’ arising from repeated measurement (i.e., from our limited knowledge), but that there is also real variation in every replicative process (see e.g. Desrosières, 2002; Porter, 1986), and that this ‘variability’ is subject to the laws of probability.
Figure 28. Quetelet
In the twentieth letter of his Lettres (1846) Quetelet invites his readers to imagine that the king of Prussia decrees that a thousand copies are to be made of the famous ancient statue known as the ‘Borghese Gladiator’. If these thousand copies were subsequently scrupulously measured, the copies would beyond doubt show inaccuracies and deviations from the original.
These variations follow both from real deviations from the original and from inevitable measurement inaccuracy – after all, successive measurements of one particular copy will probably generate slightly differing results (Desrosières, 2002).
Hence Quetelet argues that analogous variation will also be found in the measurement of human beings, referring implicitly to the probabilistic calculations he had applied in 1844 to the height and chest measurements of 5,738 Scottish soldiers (Hacking, 2004; Porter, 1986; Stigler, 1986).
Figure 29. Borghese Gladiator (from the collection of the Louvre)
Yet, from what does this ‘variation’ deviate? In Quetelet’s example of the ‘Borghese Gladiator’ it was clear: the varying copies deviate from the original statue. However, in most replicative processes such an ‘original’ does not exist. When we, for instance, meet an extremely talented pianist, we probably would not say that he is a ‘deviation’ from or ‘variety’ of the original man. Still, we would say that he is an ‘extraordinary’ or ‘non-average’ person – so actually we deem him to be deviating, but deviating from what? Our contemporary speech is so imbued with ‘means’, ‘normality’ and ‘averages’ that it is tempting to think that these notions are timeless – but they actually only arose in their modern sense in the nineteenth century. Although in the 1840s the words ‘mean’, ‘normality’ and ‘average’ seemed to be in the air and one can easily quarrel about the question who was historically the first person to formulate these notions (see e.g. Stigler, 1999), I think that the ‘conceptual’ change is most clear in the thought of Quetelet. After all, it was he who formulated in the 1840s the groundbreaking analogy between the original statue of the ‘Borghese Gladiator’ and ‘l’homme moyen’ – the true average man as the golden mean of man (Desrosières, 1993, 1999, 2002; Porter, 1986; Stigler, 1986).
Thus Quetelet replaced Platonist thought – that we live in a world of mere appearances which reflect, in a rather fluctuating and imperfect way, the ‘other-worldly’, fundamental and eternal ‘ideas’ or ‘essences’ that lie behind them – with the idea that the ‘essence’ of man does not lie in some metaphysical realm, but in the mean of a population, whose traits are the most probable to occur. Though there is of course no reason to assume that this perfectly ‘average’ man really exists, Quetelet claimed that this homme moyen was the “constant cause” (Desrosières, 2002, p. 4; Stigler, 1986, p. 173 f.f.) behind all variation.
Figure 30. Although Quetelet (1796-1874) lived before the eugenics movement (which arose only towards the end of the nineteenth century), his ideas formed to a certain extent an ‘inspiration’ for it, as is expressed by this statue of the ‘average American male’, which was an exhibit at the Second International Exhibition of Eugenics in 1921 in Cold Spring Harbor (Exhibits Book, p. 69).
When Quetelet studied the height and chest measurements of 5,738 Scottish soldiers in 1844 (Stigler, 1986), he was struck by the fact that these measurements scattered neatly around the mean in a distribution that looked like a “chapeau de gendarme” (Desrosières, 2002, p. 4).
Figure 31. A data plot by Quetelet (1846, p. 396). Quetelet studied the chest circumferences of 5,738 Scottish soldiers. Quetelet’s distribution comes close to what we today call a normal distribution curve.
Quetelet had been acquainted with this bell-shaped curve decades before, but in a totally different context, namely astronomy. After all, before Quetelet’s scientific interests turned to statistics and sociology in the 1830s, he had been an active astronomer – through his efforts the Observatory in Brussels was founded in 1828 – and he would always remain an astronomer: until his death in 1874 he would be the director of the Observatory that he founded.
Figure 32. Royal Observatory in Brussels
Around 1800 the so-called ‘law of errors’ was discovered in astronomy – for at that time it had become clear that when several astronomers try to chart the position of a star, their observations will vary. However, the variations in their observations tended to conform to a bell-shaped distribution – and the mean of this distribution was seen as the ‘real’ position of the star, i.e. the ‘constant cause’ behind all observations (see for a popular account Menand, 2002).
As an astronomer Quetelet knew this bell-shaped ‘astronomical law of errors’ very well – so, when he discovered that human measurements conformed to the same distribution, he reasoned analogously that this implied the existence of a ‘constant cause’, namely the ‘homme moyen’. Yet, whereas an observed star probably has a ‘true’ position, the ‘homme moyen’ is not a living individual, but solely an ‘abstraction’. However, this ‘abstraction’ became in a sense more real – after all, it is the ‘constant cause’ – than particular living individuals who, due to ‘accidental causes’, deviate from the ‘homme moyen’. The ‘more-than-reality’ or ‘essentiality’ of statistical parameters such as the ‘mean’ is an issue which even today triggers quite some contemplation (e.g. Desrosières, 1993; Desrosières, 2001; Salsburg, 2001). The idea that an ‘abstract’ mean is the underlying cause behind all worldly phenomena sounds, of course, like a rather Platonist idea.
There is nevertheless a major difference between the Platonic timeless ‘idea’ or ‘essence’ and the average man of Quetelet: for Quetelet argued that the ‘average man’ is not a timeless ‘constant cause’, but – as he formulated it himself already in 1835 – always “conformable to and necessitated by time and place” (Quetelet, 1835, vol. 2, p. 274; Stigler, 1986, p. 171). So, this is of course quite remarkable: the constant cause is just temporary – it is stability for the time being (see for a nice popular account Menand, 2002, p. 189 f.f.). This applied even to social phenomena such as suicides and murders:
…when the “milieu” does not vary appreciably it will give rise to the same mean number of annual
suicides, murders, and so forth. (Schweber, 1982, p. 346)
Therefore, Quetelet deemed it the task of his ‘social physics’ to change the ‘milieu’, i.e. the physical, social, and institutional causes that are responsible for these ‘fearful regularities’ (Schweber, 1982, p. 347).
(§1b) Darwin and Variation: in search for the causes of the ‘details’ of variation.
Quetelet had shown that reproduction inevitably leads to variation and that this variation is not just some opaque mishmash of deviations, but that it seemed to have the tendency to spread evenly, in a bell-shaped curve, around its mean.
It is therefore not surprising that Darwin – whose theory consisted of variation and selection – turned to Quetelet in order “to obtain quantitative statements regarding variations and populations” (Schweber, 1977, p. 286). In 1838 there are several entries in Darwin’s notebooks that indicate his interest in Quetelet – whose work was well known in England at that time and who corresponded with several British scientists with whom Darwin was closely acquainted. In September 1838 Darwin read the extensive review of Quetelet’s Sur l’homme. Darwin got his own copy of Quetelet’s Sur l’homme, but because he had a rather poor knowledge of French it is not clear how carefully he looked at it (Schweber, 1977, p. 289).
So, replication apparently leads to variation – but why? According to Schweber (1977; 1982; 1983) this question bothered Charles Darwin so much that it formed one of the reasons why his Origin of Species was only published in 1859, whereas Darwin had already accepted in 1838 “the ‘randomness’ of variations as a phenomenological fact” (Schweber, 1983, p. 43) and had developed by July 1839 “…a unitary evolutionary view of everything around him” (Schweber, 1977, p. 233). Darwin assumed that variation follows from a myriad of small, uncontrollable influences such as, for instance, fluctuations in temperature, exposure to light, nurture, etc.
Figure 33. Darwin
Thus Darwin considered ‘chance’ to be nothing in itself but just a provisional notion to express ignorance of the real causes of variation, thereby following the ideas of classical probability theory – which was the dominant theory of probability until the 1840s. In the Origin of Species (1859) he writes:
I HAVE hitherto sometimes spoken as if the variations so common and multiform in organic beings
under domestication, and in a lesser degree in those in a state of nature had been due to chance. This, of
course, is a wholly incorrect expression, but it serves to acknowledge plainly our ignorance of the cause
of each particular variation. (Darwin, 1998/1859, p. 131)
and:
Our ignorance of the laws of variation is profound. Not in one case out of a hundred can we pretend to
assign any reason why this or that part differs, more or less, from the same part in the parents. But […],
the same laws appear to have acted in producing the lesser differences between varieties of the same
species, and the greater differences between species of the same genus. (Darwin, 1998/1859, p. 167)
Darwin thought that this ignorance of causes was something that had to be conquered (Schweber, 1977): however, his search in the following decades for the causes of what he himself called “the tendency of small change” (Schweber, 1983, p. 43) was rather unsuccessful, and after the publication of the Origin of Species one can notice a slight shift in his opinions on variation – for he begins to assign to ‘chance’ a more ‘autonomous’ status.
For instance, in a letter of 14 February 1861 to Leonard Horner he speaks of variations as “accidental or spontaneous” (Schweber, 1983, p. 79), and in 1860 he confesses – though apparently reluctantly – that the ‘details’ of variation are due to ‘what we may call chance’:
I am inclined to look at everything as resulting from designed laws, with details, whether good or bad,
left to the working out of what we may call chance. Not that this notion at all satisfies me. (Schweber,
1983, p. 80)
Thus the idea that a growing amount of knowledge about the causes of biological variation would eventually lead to the elimination of chance started to fade, for it became apparent that the ‘unpredictable’ or (to use an anachronistic term) ‘stochastic’ individual variation cannot be explained completely (Schweber, 1982). Thus Schweber (1983, p. 79) argues that after the publication of the Origin of Species Darwin began to see variations as ‘chance phenomena in the “ontic” sense’.
Yet, the question remained how to deal with ‘chance variation’, now that the search for the causes of variation had turned out to be unable to explain or predict the “details” of variation.
(§1c) Galton and Pearson: distinguishing structural variation from chance variation.
The answer to the question ‘how to deal with chance variation?’ would be provided by
frequentist statistics with its emphasis on the long run and populations.
Darwin’s own half-cousin Francis Galton (1822-1911) stands at the beginning of the statistical ‘solution’ of this problem. One of Galton’s many research interests was the possibility of applying probabilistic models to heredity. It is absolutely clear that Galton was herein enormously influenced by Darwin’s Origin of Species (1859) as well as by the work of Quetelet (Stigler, 1986).
However, Galton gave some completely new twists to the directions set out by Darwin and Quetelet. Whereas Darwin had claimed to be “unstatistical by disposition” (Porter, 1986, p. 134), Galton seemed to be statistical by disposition. And whereas Quetelet’s attention had been focused on the average, Galton was more interested in the deviations from the average – such as genius.
It is difficult to understand why statisticians commonly limit their inquiries to Averages, and do not
revel in more comprehensive views. Their souls seem as dull to the charm of variety as that of the
native of one of our flat English counties, whose retrospect of Switzerland was that, if mountains could
be thrown into its lakes, two nuisances would be got rid of at once. (Galton, 1889, p. 62)
In 1869 Galton wrote the book Hereditary Genius. This book was groundbreaking in the sense that it was “the first time quantitative data and statistical analysis had been brought to bear on the problem of mental ability” (Wozniak, 1999). However, for Galton Hereditary Genius (1869) formed only the starting point of a problem that would puzzle him for almost two decades (Stigler, 1986), namely: how can it be that on the one hand mental eminence runs in families – suggesting that it is heritable – whereas on the other hand the children of two geniuses on average do not seem to inherit the exceptional abilities of their parents?
…that the offspring did not tend to resemble their parents in size, but always to be more mediocre than
they – to be smaller than the parents, if the parents were large; to be larger than the parents, if the
parents were small. (Galton, 1886, p. 246)
Galton solved this question little by little. His efforts would culminate in the book Natural Inheritance (1889).
The phenomenon Galton struggled with is now generally known – due to the ideas developed by Galton himself – as ‘regression towards the mean’. It is not restricted to heredity, but appears in practically every “stochastic time-varying phenomenon, where two correlated measurements are taken of the same person or object at two different times” (Stigler, 1997, p. 104; 1999, p. 174).
A simple example may clarify this (cf. Stigler, 1997, 1999). Suppose that you have to take two examinations at two successive times. The first time you get an exceptionally high grade. The sad news that ‘regression to the mean’ teaches us is that, on average, one may expect your second grade to be lower.
The explanation of this fact is that the extremely high grade on the first test is likely to be due to a combination of “successes in two components, to a high degree of skill (a permanent component) and to a high degree of luck (a transient component)” (Stigler, 1997, p. 104; 1999, p. 174). Thus, the second time you take the examination your skill is likely to persist, whereas your extreme luck – on average! – will not show up. Yet, an unusually low grade would also tend to regress toward the mean.
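Stigler’s skill-plus-luck decomposition is easy to simulate; a minimal sketch, assuming normally distributed skill and luck components (all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000  # simulated examinees

# Each grade = permanent skill + fresh transient luck on each exam.
skill = rng.normal(loc=70, scale=10, size=n)
exam1 = skill + rng.normal(scale=10, size=n)
exam2 = skill + rng.normal(scale=10, size=n)

# Look only at examinees with an exceptionally high first grade.
top = exam1 > 90
print(f"mean first grade of top group:  {exam1[top].mean():.1f}")
print(f"mean second grade of top group: {exam2[top].mean():.1f}")
# The second mean falls back toward the overall mean of 70:
# the skill persists, but the extreme luck does not recur.
```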
Figure 34. Graphical illustration of regression made by Galton (1886, p. 249); the circles give the average
heights for groups of children whose midparental heights (the average height of both parents) can be read from
the line AB. The difference between the line CD and AB represents regression towards mediocrity. Reproduced
from Stigler (1986), p. 295.
Actually, as long as the scores on the two tests are not perfectly correlated (after all, if there were a correlation of 1.0 between the two exams, i.e., if the value of the grade for the first exam would always lead to an exactly proportional variation in the grade for the second exam, then there would be no regression to the mean), there will always be ‘on average’ a regression ‘towards the average’. Moreover, the weaker the correlation between the two events, two generations or two exams is, the bigger the effect of ‘regression to the mean’ will be: the greater the amount of non-structural variation in comparison with the structural variation, the more regression to the mean you will see.
If properly understood, regression is a concept that is “transparent to the point of being obvious” (Stigler, 1997, p. 103; 1999, p. 173). Yet, it is a source of endless misunderstandings. The most common misinterpretation is that ‘regression to the mean’ would imply that all things in the world are becoming mediocre. However, ‘regression to the mean’ gives no cause for such cultural pessimism. The fact that on average there will be a regression towards the average value does not mean that heights, mental abilities or whatever other traits are degenerating into grey uniformity. The most spectacular instance of this fatalistic misinterpretation was made in 1933 by a Northwestern University professor named Horace Secrist, who wrote The Triumph of Mediocrity in Business – a book that was, embarrassingly enough, applauded by most reviewers (Stigler, 1996, 1997, 1999):
In over 200 charts and tables, Secrist ‘demonstrated’ what he took to be an important economic
phenomenon, one that likely lay at the root of the great depression: a tendency for firms to grow more
mediocre over time. (Stigler, 1997, p. 112)
The idea that is completely overlooked by misinterpretations such as Secrist’s is that the brilliance of Galton’s regression lies in the fact that he distinguished between the structural ‘shared’ variation of certain variables – proportionally going up and down together – and the ‘random’ chance variation – piling up evenly around the average in a bell-shaped way. The real importance of ‘regression’ can be clarified with the help of another important statistical notion developed by Galton: ‘correlation’.
In December 1888 Galton wrote a paper for the Royal Society entitled ‘Co-relations
and their Measurement Chiefly from Anthropometric Data’ (Galton, 1888), wherein he
explained that some traits have on the average, i.e. in the long run, the tendency to vary in the
same direction: “tall people tend to have big feet, long arms and long fingers” (Hacking,
2004, p. 187).
Figure 35. Francis Galton on one of his own anthropometry cards (1893), with profile and full-face photos and
spaces for key body measurements, taken by Alphonse Bertillon
Of course, there may be a tall person with very small feet, but in the long run a statistical relation may be observed between foot size and body length. Galton called this relation between the variability of different observations – a relation that can be expressed mathematically – correlation. This means that Galton envisioned that variation can be partitioned into ‘structural’ or ‘systematic’ variation (e.g., on average tall people tend to have big feet) and ‘chance’ variation (e.g., a tall person may have – ‘by chance’ – small feet).
Basically, the idea of correlation also underlay the other important notion invented by Galton: regression. After all, if you can discern the ‘chance’ factor in the height of children – leading to regression to the mean – from the structural or correlated variation (tall parents tend to have tall children), one may formulate a linear ‘regression’ formula to predict the variation of one ‘variable’ (e.g., the height of a child) from the variation of another ‘variable’ (e.g., the height of the parents). However, how should one discern chance variation from structural variation? The answer lies in the conceptual ‘discoveries’ Galton had already made in the 1870s concerning the bell-shaped ‘error curve’.
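In modern terms the two notions are directly connected: the slope of Galton’s regression line is the correlation coefficient rescaled by the two standard deviations. A minimal sketch with invented parent-child heights:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Invented data: a child's height shares a structural component with
# the midparental height, plus independent chance variation.
parent = rng.normal(loc=68, scale=2.5, size=n)
child = 68 + 0.65 * (parent - 68) + rng.normal(scale=2.0, size=n)

# Galton's correlation: the strength of the shared, structural variation.
r = np.corrcoef(parent, child)[0, 1]

# The regression slope: the correlation rescaled by the standard deviations.
slope = r * child.std() / parent.std()
intercept = child.mean() - slope * parent.mean()

print(f"correlation r = {r:.2f}")
print(f"predicted child height = {intercept:.1f} + {slope:.2f} * parent height")
# A slope below 1 is exactly Galton's 'regression towards mediocrity'.
```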
In 1873 Galton designed a very clear model – called the quincunx – which illustrated how a bell-shaped curve comes about (see e.g. Stigler, 1986, p. 276 f.f.). This quincunx is a device wherein shot is poured through a regular pattern of pins. Each shot has a probability of fifty percent of falling either to the left or to the right of each pin. The more rows of pins you add, the better the final outline approximates a normal curve.
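The quincunx can be mimicked in a few lines; a minimal sketch in which each ball takes a left/right step at every row of pins (the ball and row counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
n_balls, n_rows = 100_000, 12

# At each of the 12 rows every ball goes left (0) or right (1) with
# probability 1/2; its final bin is simply its number of right-steps.
steps = rng.integers(0, 2, size=(n_balls, n_rows))
bins = steps.sum(axis=1)

# A crude text histogram: the counts pile up in a bell shape, following
# the binomial distribution - which for many rows approximates the normal.
counts = np.bincount(bins, minlength=n_rows + 1)
for k, c in enumerate(counts):
    print(f"bin {k:2d}: {'#' * (c // 500)}")
```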
Figure 36. The original Quincunx (left) designed by Galton (1873), a modern replica, and a schematic depiction.
In 1875 Galton made another major step in his conceptual understanding of the bell-shaped distribution (Stigler, 1986), for he gained the insight that such a curve may consist of a lot of smaller bell-shaped curves – a fact that may explain why the normal curve applies to so many phenomena.
Figure 37. Drawings by Karl Pearson (Pearson, 1930, p. 466), based on some hasty sketches by Galton (made in a 12 January 1877 letter to his cousin George Darwin), showing that an accumulation of normal distributions will itself be normal too.
After all, a lot of phenomena in this world can be seen as the result of an accumulation of different random, i.e. normally distributed, causes. Galton’s 1875 insight makes it clear that such an accumulation of normally distributed influences will lead to a unified normal distribution.
The insights concerning the bell-shaped curve (or the ‘error curve’) struck Galton as
epiphanic:
I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order
expressed by the “Law of Frequency of Error.” The law would have been personified by the Greeks and
deified, if they had known of it. It reigns with serenity and in complete self-effacement, amidst the
wildest confusion. The huger the mob, and the greater the apparent anarchy, the more perfect is its
sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand
and marshalled in the order of their magnitude, an unsuspected and most beautiful form of regularity
proves to have been latent all along. The tops of the marshalled rows form a flowing curve of invariable proportions; and each element, as it is sorted into place, finds, as it were, a preordained niche,
accurately adapted to fit it. (Galton, 1889, p. 66)
From a modern point of view it seems perhaps a little exaggerated to assume that the normal curve would have been ‘deified’ by the ancient Greeks ‘if they had known of it’. Moreover, as we now know, not all phenomena are distributed according to the normal curve. Yet, the sole fact that (according to the so-called central limit theorem) the sampling distribution, i.e. the distribution of the means of different samples, converges to a normal distribution (even if the population itself is not normally distributed!), together with the fact that one may assume that unbiased measurement errors will be normally distributed, provided scientists with powerful tools to calculate probabilities and so to distinguish structural variation from chance variation.
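The central limit theorem can be watched at work in a few lines; a minimal sketch that draws sample means from a decidedly non-normal (exponential) population:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(3)

# A decidedly non-normal population: exponential, strongly skewed.
population = rng.exponential(scale=1.0, size=1_000_000)

# Draw 10,000 samples of size 50 and record each sample mean.
sample_means = rng.choice(population, size=(10_000, 50)).mean(axis=1)

# The population is strongly skewed, but the distribution of the
# sample means is already nearly symmetric and bell-shaped.
print(f"skewness of population:   {skew(population):.2f}")   # about 2
print(f"skewness of sample means: {skew(sample_means):.2f}") # much closer to 0
```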
[Galton] was regarding the Normal distribution of many traits as an autonomous statistical law.
Statistical law had come into the world fully-fledged. Galton saw that chance had been tamed.
(Hacking, 2004, p. 186)
The statistical methods which Galton had applied in Natural Inheritance (1889) were quickly taken up in the 1890s by disciplines such as anthropometry, sociology, economics, psychology and the study of education (Gigerenzer et al., 1990, p. 58).
From the 1890s Galton’s student Karl Pearson10 would play a major role in the development of a more ‘autonomous’ (cf. Hacking, 2004, p. 181 f.f.), abstract and mathematical (cf. Heiser & Verduin, 2005) statistical methodology which could “confront a wide range of scientific and practical problems” (Gigerenzer et al., 1990, p. 59).
10 See also §2.E “From seventeenth century ‘probability’ to eighteenth century ‘Statistik’ and nineteenth and twentieth century ‘statistical inference’” (p. 54 f.f.).
Moreover, he argued that, instead of searching for causes, it is much more fruitful to ‘accept’ the phenomenon of chance in variation and to ‘tame’ it (Hacking, 2004) by the application of statistical methods based on ideas such as the normal curve, correlation and regression. In the tradition of the positivist philosopher Ernst Mach, Karl Pearson rejected the notion of causality altogether as a subjective chimera and a meaningless, metaphysical notion (Desrosières, 1999; Hacking, 2004), for every ‘cause’ is ‘caused’ by other ‘causes’, leading to an infinite regress that makes the search for causes a hopelessly subjective and arbitrary enterprise:
Cause is scientifically used to denote an antecedent stage in a routine of perceptions. In this
sense force as a cause is meaningless. First cause is only limit, permanent or temporary, to
knowledge. (Pearson, 1911, p. 150)
Hence, Pearson argued that the notion of ‘causation’ had to be completely replaced by
‘correlation’ (see, e.g. Hacking, 2004; Porter, 1986).
Figure 38. Karl Pearson
So, what is the meaning of the ideas of Quetelet, Darwin, Galton and K. Pearson with respect to the word variation? To summarize, one can see that in less than a century variation had, on the one hand, become more ‘real’ than it had ever been; on the other hand, it had also become more domesticated than ever before – for the bell curve, regression and correlation provided the tools to determine how probable it was that certain variation could be ascribed to ‘dumb’ chance. In 1953 R.A. Fisher would even say:
The effects of chance are the most accurately calculable, and therefore the least doubtful, of all the
factors of an evolutionary situation. (Fisher, 1953, p. 515)
The taming of variation – at least the taming of the chance factor in variation – had an important practical implication, namely that it accelerated the selection amongst hypotheses.
(§2) Selection in evolution and statistics.
Among the ‘variations’ that emerge in replication there will almost always be a struggle for existence.
In biology variants of replicators – for instance peas or finches – have to struggle in order to gain access to limited resources such as food and space, so that they will survive long enough to reproduce themselves.
In science ‘memes’, like hypotheses and theories, struggle for existence too: in a life-and-death fight against oblivion, the waste-paper basket or falsification11.
11 See also above, §3.C “Statistics à la Popper: the natural selection of a falsifying rule for statistical hypotheses” (p. 78-81).
Why does one finch survive while another perishes? Why is one hypothesis rejected and the other not? The interesting fact is that selection occurs according to a ‘mindless’ algorithm (Dennett, 1996): evolution generates rational designs – such as a complex organ like the human eye – on the basis of mere variation and selection. The most interesting and powerful algorithms – in computers as well as in natural evolution – finesse ignorance and produce rational results with this blind and mindless tactic, viz. by “randomly generating a candidate and then testing it out mechanically” (Dennett, 1996, p. 53). Yet, how can a mindless algorithm produce rational designs? Dennett shows how a very basic algorithm – such as a tennis tournament – can generate rational results: an amateur may have luck and win a set, but in the long run the most talented and professional tennis players are increasingly likely to come out on top.
The tennis tournament is a simple algorithm that “takes as input a set of competitors and guarantees to terminate by identifying a single winner” (Dennett, 1996). Of course, winning a tournament is a combination of skill and luck. Still, in the long run “more of the better players would tend, statistically, to get to the late rounds” (Dennett, 1996, p. 55).
Even if a tournament is very luck-ridden – for instance if the tennis players were required “to play Russian roulette with a loaded revolver before continuing after the first set” (Dennett, 1996, p. 55) – the better players would still have better chances of making it to the final rounds than the untalented amateurs:
The power of a tournament to “discriminate” skill differences in the long run may be diminished by
haphazard catastrophe, but is not in general reduced to zero. This fact […] is as true of evolutionary
algorithms in nature as of elimination tournaments in sports, […]. (Dennett, 1996, p. 55)
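A minimal sketch of such an elimination tournament, assuming that each player has a fixed ‘skill’ and that the more skilled player wins each match only with a certain probability (the skill model and all parameters are invented):

```python
import random

random.seed(0)

def play_match(skill_a: float, skill_b: float) -> bool:
    """Return True if player A wins; skill tilts the odds, luck decides."""
    return random.random() < skill_a / (skill_a + skill_b)

def tournament(skills: list[float]) -> float:
    """Single-elimination rounds until one winner's skill value remains."""
    players = skills[:]
    while len(players) > 1:
        players = [a if play_match(a, b) else b
                   for a, b in zip(players[::2], players[1::2])]
    return players[0]

# 16 players with skills 1..16: in the long run the most skilled player
# wins far more often than the 1/16 that pure chance would allow.
skills = list(range(1, 17))
wins = sum(tournament(skills) == 16 for _ in range(10_000))
print(f"best player wins {wins / 10_000:.1%} of tournaments")
```

Even with plenty of luck in every match, the long run ‘discriminates’ skill – which is exactly Dennett’s point.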
However, it may take quite some ‘deep time’, i.e., sometimes even billions of years (see e.g. Gee, 1999), before natural selection separates chance from structural variation. Statistical methodology – which can make pretty good guesses, based on probabilistic calculations, about what is due to chance variation and what to structural variation – may accelerate this process.
Figure 39. An example of a simple algorithm: the tennis tournament (a bracket in which players such as Boris Becker, Dan Dennett, George Smith and Pete Sampras are paired off until a single winner remains). In the long run rational results will be generated (figure reproduced from Dennett, 1996, p. 53).
It probably sounds quite remarkable that humans are capable of ‘speeding up’ natural selection. Yet, think for instance about dogs. All dogs belong to the same species (Canis lupus familiaris) – all dog breeds share the same genome – which emerged approximately 15,000 years ago out of wolves: they are a rather ‘young’ species compared to, for instance, rats and mice, which have existed for approximately 750,000 years12. However, the really interesting fact about domestic dogs is that the species was created by selective human breeding and that this ‘artificial selection’ has apparently been so powerful that the time it took to develop from wolf to dog is much shorter than one would expect to see in ‘normal’ natural selection – even though the two processes operate on the same underlying gene pool.
12 These and other small facts about genetics can be found at a nice website where everybody can put questions to geneticists from renowned universities: http://www.thetech.org/genetics/asklist.php
In the same way the ‘taming’ of chance has led to an accelerated selection of
hypotheses.
Figure 40. The selection of hypotheses by a ‘mindless’ algorithm: hypotheses 1-4 are paired off until a single winning hypothesis remains (figure inspired by Dennett, 1996, p. 53).
It now becomes clear that scientists are not – as they may like to think of themselves – the ‘inventors’ of the ‘scientific game’ and its rules, nor does it matter which hypothesis they believe will win the scientific game: their only function is to implement the algorithm (e.g., to ‘shoot’ the losers and bury them), to reduce the influence of unwarranted biases (e.g., to level the ‘bumpy courts’ which could raise the luck ratio in a completely unwarranted way), and to ‘tame’ chance by making good guesses – on the basis of probabilistic calculations which draw a line at a certain level (e.g., p < 0.05 or p < 0.01) between what has to be considered chance and what systematic variation – in order to accelerate the eliminating algorithm amongst the scientific hypotheses. Thus, viewed from this perspective, scientists are the ‘umpires’ and ‘ball boys’ in the scientific elimination tournament among hypotheses.
So, a good umpire and a good scientist are both characterized by the fact that they are
unbiased – i.e. that their personal beliefs do not influence their algorithmic application of the
rules. Both have to be ‘blind’ in their judgements. Their blindness guarantees a ‘fair’ (i.e. a
‘rational’) result: namely, that it is really the best that will win. In science this idea is, for
instance, expressed by the fact that experiments should have a blind or even double-blind
design. Because ‘blind’ chance can be domesticated, the scientist has to be ‘blind’ or
‘unbiased’. In a sense both the umpire and the scientist stand in a long tradition of two other
instances that are renowned for the fact that they apply their (algorithmic) rules blindly, namely
Justice ('Justitia') and Chance ('Fortuna').
Figure 41. Both Justice (left) and Fortune (right) have been traditionally depicted blindfolded: both apply their
rules ‘blindly’. (The oil painting of lady Justice is dated 1804, made by J.H. Fredriks, and belongs to the
collection of the municipality of Breda in the Netherlands; the miniature of lady Fortuna is a miniature from a
manuscript of Augustine’s La Cité de Dieu, made around 1400-1410, which belongs to the collection of the
Dutch Royal Library.)
Yet, although both the scientific rules as well as the tennis rules are applied ‘blindly’,
the algorithmic rules of a game of tennis are of course not the same as those of the ‘game’ of
science. For instance, science uses significance levels such as 'p < 0.05' or 'p < 0.01', whereas
tennis does not. After all, a tennis tournament is not a scientific way to determine the best
tennis player. It deals with individual tennis players, whereas the scientist would probably be
more interested in the structural variations between different groups of tennis players (e.g., 'is
there any structural difference between female and male tennis players?').
To summarize, the algorithmic selection of hypotheses in science is accelerated by the
statistical approach. However, statistical methodology deals only with 'collectives' or
'groups'. This concept is apparently rather hard for the human mind to grasp: some examples
of our attitudes towards psychological research may clarify this.
Statistical-psychological research is able to say rational things about human rationality or
irrationality because it does not deal with the cognition of one single individual but with
whole collectives.
On the one hand we have grown used to this idea: for instance, the attempts of Oswald
Külpe (1862-1915) to ‘research’ psychological laws by introspection of one single person
(mostly the researcher himself) now seem ‘cute’ curiosities from the history of psychology
that cannot be taken completely seriously. To give an impression of introspective reports I will
cite here a short fragment from an introspective report that was made by Külpe's assistant
Karl Bühler in 1907, wherein Külpe himself is the subject and has to report the direction of his
thoughts after he has been asked the rather sophisticated question 'Can we capture the nature
of thinking by our thoughts?':
The question at first struck me as odd; I thought it might be a trick question. Then it suddenly occurred
to me how Hegel had criticized Kant, and then I answered decisively: Yes. The thought about Hegel's
critique was quite rich, I knew at the moment exactly what it amounted to, I didn't say anything about it,
and also didn't imagine anything, only the word "Hegel" resounded to me subsequently (acoustic-motoric)… (Gigerenzer & Murray, 1987, p. 139)
As experimental psychology progressed, its research became more and more estranged from such
accounts of individual observations and turned into research wherein the "anonymous
members had no individual existence in the experimental report" (Danziger, 1990, p. 100).
We have grown so used to this kind of research that Külpe's introspection strikes us as
peculiar.
Yet, on the other hand lots of people still experience some 'unease' when they are
confronted with the fact that they are 'just' one subject in a psychological study and that
their particularities are of no interest to the researcher: every "extraexperimental" identity is
annihilated in the common denominator of being a "subject" in a study (Danziger, 1990, p.
99).
The subject of the next section will be this 'unease' we experience when we are
confronted with the 'bleak' Darwinist depiction of science as a process which encompasses
the taming of chance (i.e., the discrimination between chance and structural variation) by the
'blind' application of algorithmic 'selection' rules to replicative, collective phenomena.
[3.E] The lure of ‘subjective’ semantics in statistical inference.
In the previous three sections, I showed the evolutionary epistemology that underlies
frequentist statistical methodology as it is applied in psychology and many other
sciences. In this evolutionary epistemology there is no place for an autonomous, individual subject.
Nevertheless, most people – including psychologists and psychology students – think they are
autonomous, individual subjects and consequently they assume that their individual beliefs,
ideas and experiences do matter scientifically. I think that this discrepancy may explain why
statistical methodology is so poorly understood: both scientists and students in psychology are
'lured' into a misplaced subjective interpretation of their frequentist methodology. This is very
nicely exemplified by a study of Haller and Krauss (Gigerenzer et al., 2004; Haller & Krauss, 2002),
who in the year 2000 confronted 44 students of psychology, 39 professors and lecturers of psychology,
and 30 statistics teachers from six German universities with a short questionnaire about the
meaning of a significant p-value in a t-test. The questionnaire, which was developed by Oakes
(1986) in a similar study, is presented in figure 42.
The Questionnaire
Suppose you have a treatment that you suspect may alter performance on a certain task.
You compare the means of your control and experimental groups (say 20 subjects in each
sample). Further, suppose you use a simple independent means t-test and your result is (t
= 2.7, d.f. = 18, p = 0.01). Please mark each of the statements below as “true” or “false”.
“False” means that the statement does not follow logically from the above premises. Also
note that several or none of the statements may be correct.
1) You have absolutely disproved the null hypothesis (that is, there is no difference
between the population means).
[ ] true / false [ ]
2) You have found the probability of the null hypothesis being true.
[ ] true / false [ ]
3) You have absolutely proved your experimental hypothesis (that there is a difference
between the population means).
[ ] true / false [ ]
4) You can deduce the probability of the experimental hypothesis being true.
[ ] true / false [ ]
5) You know, if you decide to reject the null hypothesis, the probability that you are
making the wrong decision.
[ ] true / false [ ]
6) You have a reliable experimental finding in the sense that if, hypothetically, the
experiment were repeated a great number of times, you would obtain a significant result
on 99% of occasions.
[ ] true / false [ ]
Figure 42. Questionnaire from the Haller & Krauss study (Gigerenzer et al., 2004, pp. 392-93; Haller & Krauss,
2002, p. 5)
All the statements are in fact wrong: they all represent common misconceptions about the
meaning of a significant result in a significance test. However, the majority of the subjects
who answered this questionnaire endorsed one or more of these illusions.
[Figure 43 is a bar chart; the bars read approximately: professors & lecturers teaching statistics (N = 30): 80%; professors & lecturers not teaching statistics (N = 39): 90%; psychology students (N = 44): 100%; professors & lecturers from the Oakes study (1986) (N = 68): 97%.]
Figure 43. The amount of delusions about the meaning of "p = .01". Percentages of participants in each group
who made at least one mistake in the questionnaire, in comparison to Oakes' original study (1986). Based on
Haller & Krauss (Gigerenzer et al., 2004, p. 394; 2002, p. 7).
So, there are quite a few sufferers from misconceptions within the departments of psychology!
On closer inspection one may conclude that a large part of these misconceptions is based on faulty
subjective interpretations of frequentist methodology. Clearly, many of the subjects have not
understood that "p = .01" only means that – under the null hypothesis – the probability of the
test statistic being at least as large as the one calculated from the data is 0.01. In
frequentist statistical methodology it is absolutely impossible to assign a probability to
the hypothesis: the hypothesis can only be completely false or completely true. Thus, the
meaning of "p = .01" is just that either the null hypothesis is true, in which case something
unusual happened by chance (probability 1%), or the null hypothesis is false (cf. Hacking,
2001, p. 215). Subjective, in particular Bayesian, statistical inferential methodology allows one to
assign probabilities to hypotheses, but frequentist statistics does not!
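The frequentist meaning of a p-value can be made tangible with a small simulation (a sketch of my own, not part of the Haller and Krauss study). Assuming two groups of 10 drawn from the very same population – so that the null hypothesis is true by construction and d.f. = 18, as in the questionnaire – the proportion of simulated experiments with a test statistic at least as large as t = 2.7 approximates the p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, t_obs, reps = 10, 2.7, 100_000

# Simulate many experiments in which H0 is true by construction:
# both groups are drawn from one and the same normal distribution.
exceed = 0
for _ in range(reps):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(0.0, 1.0, n)
    t, _ = stats.ttest_ind(a, b)
    exceed += abs(t) >= t_obs

print(exceed / reps)                        # Monte Carlo estimate of the p-value
print(2 * stats.t.sf(t_obs, df=2 * n - 2))  # exact two-sided tail probability
```

Note that this says nothing about the probability that the null hypothesis is true – the simulation simply assumes it.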
With these remarks in mind we can now take a closer look at the false statements of the
questionnaire (cf. Gigerenzer et al., 2004; Haller & Krauss, 2002):
Statements 1 and 3
As Popper showed clearly, hypotheses can only be (‘practically’) falsified, but never
proved or disproved! Hypotheses that survive a lot of severe tests may be called
corroborated – however, corroboration has nothing to do with probability calculus or
statistics. Therefore statements 1 and 3 are both false.
Statements 2, 4, and 5
Statements 2, 4, and 5 all claim that the p-value assigns a probability to the hypothesis,
which is a clear subjectivist delusion. As was already clarified by Hume in his so-called
'induction problem', the fact that we have seen the sun rise thousands of times
gives no rational guarantee whatsoever that the sun will rise tomorrow too. One does
not have any rational ground to assign a probability to the hypothesis that the sun will
rise tomorrow, and consequently it is of course also impossible to say that this
probability is 'true' or 'wrong'. Thus, it is clear that statements 2 and 4 are false.
Statement 5 makes essentially the same claim as statement 2 does – 'the probability
that you make a wrong decision' being a reformulation of 'the probability that the null
hypothesis is true' – and consequently is false too.
In fact one may say that statements 1 through 5 all suffer from basically the same
delusion, namely the assumption that one can assign some probability to the
hypothesis – however, the whole idea of assigning any probability whatsoever to a
hypothesis is nonsensical in frequentist statistics!
Statement 6
Recall that "p = .01" just means that – if one assumes that the null hypothesis is
true – the probability that the test statistic turns out at least as extreme as it did is only 1%. So, one
concludes that either the null hypothesis is true, in which case something unusual
happened by chance (probability 1%), or that the null hypothesis is false.
Although statement 6 is the only statement of the questionnaire that rightly says that
"p = .01" concerns the probability of the data, it nevertheless overlooks the fact that "p
= .01" only says something about the probability of the data given the assumption that
the null hypothesis is true! Statement 6 pretends that "p = .01" says something
about the probability of the data per se, instead of 'given the hypothesis'. Whether a
replication would again yield a significant result depends on the unknown true state of
affairs – in frequentist terms, on the power of the test – about which "p = .01" by itself
says nothing.
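A small simulation (again a sketch of my own, not from the cited studies) makes this replication fallacy visible. Assume, purely hypothetically, that the true standardized effect is about d = 1.2 – roughly the size that would produce t = 2.7 with two groups of 10. Even then, the chance of obtaining p < .05 in a replication is nowhere near 99%:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps, alpha = 10, 100_000, 0.05
d = 1.2  # hypothetical true effect, roughly matching t = 2.7 with n = 10 per group

# Simulate replications under this (assumed) true effect and count significance.
significant = 0
for _ in range(reps):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(d, 1.0, n)
    _, p = stats.ttest_ind(a, b)
    significant += p < alpha

print(significant / reps)  # roughly 0.7 – far below the 99% claimed in statement 6
```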
To sum up, all six incorrect statements suffer from 'subjective' illusions. All these
illusions endorse the idea that we 'really' can know and that our knowledge 'really' may grow
– in spite of Popper's thought, which showed that the growth of knowledge is in fact only a
diminishment of error. In table 1 the exact percentages of false answers are shown. On
average 2.5 illusions were endorsed by students, 2.0 illusions by their professors and lecturers
not teaching statistics, and 1.9 illusions by professors and lecturers teaching statistics.
Table 1
Percentages of false answers (i.e., statements marked as true).

                                      Germany 2000 (Haller & Krauss, 2002)                 United Kingdom 1986
                                                                                           (Oakes, 1986)
Statements (abbreviated)              Psychology    Prof. & lect. not   Prof. & lect.      Professors and
                                      students      teaching            teaching           lecturers
                                      (N = 44)      statistics          statistics         (N = 68)
                                                    (N = 39)            (N = 30)
1) H0 is absolutely disproved         34%           15%                 10%                1%
2) Probability of H0 is found         32%           26%                 17%                36%
3) H1 is absolutely proved            20%           13%                 10%                6%
4) Probability of H1 is found         59%           33%                 33%                66%
5) Probability of wrong decision      68%           67%                 73%                86%
6) Probability of replication         41%           49%                 37%                60%

Note. Percentages of false answers (i.e., statements marked as true) in the three groups studied by Haller &
Krauss (2002), in comparison to the percentages of false answers among the academic psychologists in Oakes'
original study (1986). Based on Haller and Krauss (Gigerenzer et al., 2004; Haller & Krauss, 2002).
Statements 1 and 3 were most frequently identified as being incorrect, probably because the
subjects understood that the word 'absolutely' is misplaced – statistical methodology will
never give absolute certainty. However, in view of the rather high percentages of subjects
who considered statements 2, 4 and 5 to be correct, one may conclude that the subjective
misconception that "p = .01" says something about the amount of belief (i.e. 'probability')
one has to assign to a hypothesis is apparently very widespread.
Is it surprising that so many students, lecturers and professors of psychology endorse
'subjective' illusions? Well, on the one hand it seems surprising that a methodology that
is so central to psychology is conceptually so poorly understood. However, on the other hand, it
is not surprising at all: as we saw earlier13, even R.A. Fisher – whose ideas formed the basis
for the statistical methodology such as it is applied in psychology – could not get rid of the
last shreds of subjectivism in his frequentist methodology and spoke of vague subjective
notions such as 'fiducial probability' and 'likelihood' (Hacking, 2001, p. 245; Kendall, 1963). The
'unease' we experience when we are confronted with the 'bleak' Darwinist depiction of
science's blind and mindless application of algorithmic selection to replicative phenomena
was apparently also shared by Fisher when he spoke about the 'free mind' and 'a language
distinct from that of technological efficiency' (Fisher, 1955, p. 70).
Once you pay some attention to it, you may see how widespread the allurement of
subjective semantics is. Mayo (1996, p. 364, footnote 1), for instance, even contends – and I
think she is right – that the subjective semantics that were indigestible to Egon Pearson and
Jerzy Neyman not only caused the fundamental controversies between them and Fisher, but
also underlie – at least partly – the antagonisms between Karl Pearson and his son Egon.
However, although Fisher was a frequentist with a subjectivist twist, he strongly
opposed outright Bayesianism. Yet, today it seems to be in vogue to defend Bayesianism as an
apt statistical methodology for psychology – not just as an alternative, next to frequentist
statistical methodology, but maybe even as a methodology that may partly replace frequentist
statistics. In a very recent dissertation Romeijn (2005) argues that conventional frequentist
methodology may be used next to Bayesian statistics as an “epistemic shortcut” (p. 255). It is
evident that Romeijn – in a way just as Fisher, whom he depicts as a stubborn frequentist –
dreams of scientists with ‘free minds’:
The proposed scheme can be used here to formalise a view that finds its roots already in Kant, […]. It is
the view that knowledge can only emerge on the intersection of observation, presented by a mind-independent
world, and a conceptual framework, devised by, partly world-independent, minds [italics added]. (Romeijn, 2005, p. 12)
This is evidently in complete opposition to Popper's (i.e. 'evolutionary' or 'Darwinist')
epistemology! After all, we ourselves and our knowledge are the result of a long evolutionary
process of adaptation to our environment through variation and selection – thus, the idea that
our minds could be world-independent and make objective observations is a "colossal
mistake" (Popper, 1990, p. 37). Of course, the methodology proposed by Romeijn – who
adheres strongly to the ideas of Bayes and Carnap – is much more 'cheery' than that of Popper. It
13 See also §2.E "From seventeenth century 'probability' to eighteenth century 'Statistik' and nineteenth and twentieth century 'statistical inference'" (p. 60 ff.).
is much nicer to think of yourself as an individual, making observations and inductions,
choosing your 'input probabilities' freely, etc.:
Finally, it is again notable that apart from the observations, the Bayesian scheme consists of a range of
input probabilities which are entirely free for choice. There is no further restriction on what input
probabilities may be rational or acceptable. […] It is that both in the Carnapian and in the Bayesian
scheme, the observations do not determine what predictions are warranted. In choosing the input
probabilities we effectively determine the patterns in the observations on which the predictions focus,
but there is no restriction stemming from the observations alone. (Romeijn, 2005, p. 39)
Why is someone like Romeijn – working in a psychology department and lecturing on
philosophy of statistics – so attracted to Bayesianism? Apart from the fact that Bayesianism is
based on a much more 'cheery' – though unfortunately philosophically untenable –
epistemology than the 'bleak' Darwinist epistemology underlying frequentist statistics, there
are some reasons why Bayesianism especially interests psychologists: on the level of
psychological theories of cognition, Bayesian theories are really 'hot'. One of the favourite
'models' or 'metaphors' of human cognition has for several decades been the model of the
intuitive Bayesian statistician. Why the Bayesian model is so popular in psychological
cognitive theories will be the subject of the next section (F). However, for now I will end this
section with the suggestion that – besides the earlier mentioned reasons – the relative
popularity of Bayesianism among some methodologists in the statistical departments of
psychology may be due to a 'contamination' from the Bayesian enthusiasm on the level of
psychological theory.
[3.F] The lure of ‘subjective’ semantics in cognitive psychology.
Before I started to study psychology I used to believe in the old Aristotelian idea (Aristotle,
1995, 1.1253a) that man is an animal endowed with reason: a 'zôion logon echon' or – in
Latin – an 'animal rationale' (see also Heidegger, 1988, p. 21-28). Yet, one has only to open
an introductory textbook on human reasoning and thinking (see e.g. Garnham & Oakhill, 2001)
to conclude that during the last decades it has become generally accepted in psychology that
man is a 'cognitive miser' at most endowed with a 'bounded rationality' which
sometimes even seems to be completely 'irrational' – cognitive illusions, heuristics and biases
have become "the fodder for classroom demonstrations and textbooks" (Gigerenzer, 2000, p.
237). So, if man is not the seat of rationality, what is?
Well, as we saw in the previous section on the Darwinist epistemology underlying
frequentist statistical inference, rationality is everywhere around us! After all, rationality is
the rational design which the algorithm of natural selection selects in the long run: the human
eye, the finch, the hypothesis that was not eliminated, etc. Thus, we may conclude that
frequentist statistical inference is a rational method – for its ‘blind’ or ‘mindless’ application
of selecting algorithms to replicative, collective phenomena generates rational results. At the
same time this leads us to the conclusion that Bayesian statistical inference is not rational – a
conclusion basically already drawn by Hume when he formulated his 'induction problem'.
Consequently it is clear why Bayesian statistics – despite the wishful hopes of methodologists
like Romeijn (2005) – has hardly any place in scientific methodology.
Paradoxically, however, in cognitive psychology the rationality of human cognition is
evaluated against Bayesian statistics: human cognition would be considered rational if its
reasoning followed the rules of Bayesian statistics – which it does not! Since the famous
psychologists Kahneman and Tversky (1982) launched their "heuristics-and-biases"
movement in the 1970s, cognitive science has generally adopted the idea – repeated in textbooks
again and again – that human cognition is fundamentally deficient in its rationality,
because it fails to be a flawless Bayesian 'intuitive statistician' (Gigerenzer, 2000; Gigerenzer
& Murray, 1987; Gigerenzer et al., 1990).
This is quite a remarkable situation: after all, the subjective (i.e. Bayesian)
interpretation of probability, which is mostly deemed too 'suspicious' to apply in scientific
methodology, is used on a theoretical level in cognitive psychology as the standard for
measuring the rationality of the human mind – only to conclude that the human mind is rather
irrational. The ambiguous attitude of cognitive psychologists to Bayesian statistics is clearly
expressed by Gigerenzer:
Why do psychologists of the calibre of Kahneman and Tversky nevertheless adhere to the idea that
Bayes' theorem is rationality for all contents and contexts? In most studies that claim to have
demonstrated human errors, biases, and shortcomings, no argument is given to explain why the
statistical rule applied is rational, nor is rationality independently defined. (Gigerenzer & Murray, 1987,
p. 179)
I. Bayesian inference and Artificial Intelligence
One may indeed wonder why so many cognitive psychologists have embraced Bayesian
statistical inference as the one and only standard of rationality. The answer lies in the fact that
especially since the 1980s Bayesian inference has scored impressive successes in Artificial
Intelligence: computer simulations of artificial 'neural' or 'connectionist' networks – which
can weigh 'evidence' according to Bayesian inference – have shown themselves capable of
discerning patterns and dealing with learning tasks (see e.g. McClelland, 1994). "New
technologies have been a steady source of metaphors of the mind" (Gigerenzer, 2000, p. 21)
and the "surprising things" (Dennett, 1991) connectionist models seem to be capable of have,
of course, made them a very attractive model of cognition.
So, why is Bayesian inference working so well in artificial connectionist networks? To
answer this question we must first clarify what a connectionist network is.
Figure 44. A simple 'neural' or 'connectionist' network. (Reproduced from Garson, 2002)
Connectionist networks are models consisting of large numbers of units (seen by
connectionist psychologists as analogous to neurons in the human brain), whose level of
activation is regulated by the activation level of the other units to which they are connected (Wheeler,
2005). The effect of one unit on another can be excitatory or inhibitory. The connections
between units can be of varying 'strength' or 'weight'.
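A minimal sketch of such a unit might look as follows (my own illustration; the activation values and weights are arbitrary, and the sigmoid squashing function is just one common choice):

```python
import numpy as np

def unit_activation(inputs, weights, bias=0.0):
    """One connectionist unit: sum the incoming activations, each scaled by
    the 'weight' of its connection, then squash the result into (0, 1).
    Positive weights excite the unit, negative weights inhibit it."""
    net_input = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-net_input))

upstream = np.array([0.9, 0.1, 0.5])   # activation levels of connected units
weights = np.array([0.8, -0.4, 0.3])   # excitatory (+) and inhibitory (-) links
print(unit_activation(upstream, weights))
```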
The 'weight', i.e. the 'strength', of a connection can be adapted by experience: the
more dogs you have seen, the more likely you are to recognize a stimulus pattern as a dog.
Computer simulations of such connectionist models “have demonstrated an ability to learn
such skills as face recognition, reading, and the detection of simple grammatical structure”
(Garson, 2002). How your ‘experience’ may influence your recognition may be exemplified
by an anecdote told by Dennett:
The philosopher Samuel Alexander, was hard of hearing in his old age, and used an ear trumpet. One
day a colleague came up to him in the common room at Manchester University, and attempted to
introduce a visiting American philosopher to him. "THIS IS PROFESSOR JONES, FROM
AMERICA!" he bellowed into the ear trumpet. "Yes, Yes, Jones, from America" echoed Alexander,
smiling. "HE'S A PROFESSOR OF BUSINESS ETHICS!" continued the colleague. "What?" replied
Alexander. "BUSINESS ETHICS!" "What? Professor of what?" "PROFESSOR OF BUSINESS
ETHICS!" Alexander shook his head and gave up: "Sorry. I can't get it. Sounds just like 'business
ethics'! (Dennett, 1998, p. 250)
There is no doubt that experience may modify our expectations: for instance, I expect
the sun to rise tomorrow, because as long as I have lived the sun has risen every day. This
modification of expectations according to experience is most of the time very practical,
adaptive behaviour. The expectation that the sun will rise tomorrow makes me set my alarm
clock. Moreover, I find it also very pleasant that my dog starts wagging his tail every day
around 7 p.m., because he has 'learned' from experience that that is when I return home from
work (cf. Popper, 1990, p. 30). It is the same capacity that will make robots so attractive in the
future. The Sony entertainment robot dog ‘Aibo’ is nice, because he may learn to wag his tail
around 7 p.m.: just like a real dog.
Figure 45. Sony’s legged entertainment robot 'Aibo' and a real dog.
Thus, it is absolutely evident that Bayesian inference can be very useful in artificial
intelligence. Yet, the fact that some technique is practical and useful does not automatically
make it a standard for 'human rationality': the invention of photography, for instance,
has also led to very practical and useful applications, but nobody has
considered making it the standard against which the 'rationality' of human observation
should be measured. So one may ask why Bayesian inference should be considered the
standard of rationality. I think the only reason is the allurement of its subjective semantics –
Bayesian inference entails the comforting idea that an individual knowing subject may have
rational beliefs, i.e. beliefs governed by the laws of probability. It is the reintroduction of the
classical, eighteenth century, epistemology of probability theory as the "formal description of
the intuitions of a prototypical reasonable man" (cf. Gigerenzer, 2000, p. 266; Gigerenzer et
al., 1990, p. 226). It is its Cartesian talk about representations, beliefs and subject-object
dichotomies (see also the analysis of connectionist thought in the same vein: Wheeler, 2005)
which coaxes us to call Bayesian inference rational. I will substantiate this hypothesis by
taking a closer look, in the following subsection, at the research of Tversky and Kahneman
(1982).
II. Is the human mind a Bayesian? The research of Tversky and Kahneman.
That experience modifies our expectations is evident (cf. Popper, 1990). Yet, as soon as we
start mixing probability calculus into this obvious statement, things become more difficult.
Bayes' theorem gives the probabilistic rules for adjusting or revising our
beliefs rationally in the light of new evidence. If E and H are two events, then p(E|H) is the probability of
observing E given the fact that event H has occurred, and p(H|E) is the probability of
observing H given the fact that event E has occurred. Now assume that E is an event which
you have observed and that Hi (one of the n possible and mutually exclusive causes
H1,…,Hn) is a hypothetical cause of the observed event. This sounds quite abstract, so I will
clarify it with an example (cf. Amossé, Andrieux, & Muller, 2001).
Suppose that you are worried that you might have the disease ‘Bayesomia’. You go to the
hospital to get tested. Unfortunately you get a positive result. This positive testing result we
will call event E. Though your test result was positive, you still have some hope that it was a
false positive result, because you know that the testing methods for ‘Bayesomia’ are accurate
only 99 percent of the time (regardless of whether the results come back positive or negative).
Moreover, your physician has told you that the disease ‘Bayesomia’ is present in one of every
1,000 people. So the event E may be caused by the disease – this is hypothesis Hi. However,
On the common origins of psychology and statistics
112
CHAPTER 3: PSYCHOLOGY AND STATISTICS: WITH OR WITHOUT A KNOWING SUBJECT?
event E may also be caused by something else, i.e. not by the disease: this hypothesis we call
'~Hi'. About hypothesis Hi the physician gave us the 'a priori' knowledge that its occurrence
in the population is 0.001. Thus, if you had been asked before the test what the probability was
that you had been struck by 'Bayesomia', your 'rational' answer should have been 0.001. However,
the positive result of the test is new evidence which puts things in a whole different light.
What should you answer if you were asked to estimate the probability that you suffer from
'Bayesomia' after you were informed about your positive test result? Bayes' theorem gives
you the answer:
p(Hi|E) = [p(E|Hi) × p(Hi)] / [p(E|Hi) × p(Hi) + p(E|~Hi) × p(~Hi)]
Bayes' theorem shows you how to evaluate your a posteriori probabilistic knowledge about
the occurrence of the disease – p(Hi|E) – in the light of the probability of observing the event
E if it is caused by 'Bayesomia' – p(E|Hi). Subsequently you have to integrate this with the
probability that you are not a sufferer from 'Bayesomia', i.e. that you are one of the 999
healthy people, and the probability of 0.01 that the test gave a false positive.
p(Hi|E) = (0.99 × 0.001) / [(0.99 × 0.001) + (0.01 × 0.999)]
So, from Bayes' theorem you have to conclude that the a posteriori probability (after you got
your positive result) that you have indeed contracted 'Bayesomia' is 0.09.
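For readers who prefer to check the arithmetic, here is a minimal sketch of the calculation in Python (my own illustration; the function name is arbitrary):

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for one hypothesis H against its complement ~H."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))

# 'Bayesomia': base rate 1 in 1,000; the test is accurate 99% of the time,
# so a sick person tests positive with p = 0.99 and a healthy one with p = 0.01.
print(posterior(prior=0.001, p_e_given_h=0.99, p_e_given_not_h=0.01))  # ~0.09
```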
However, from the famous studies conducted by Kahneman and Tversky in 1973
(the 'Engineer-Lawyer Problem') and 1980 (the 'Cab Problem') it seemed to follow that the human
mind is very poor at performing this Bayesian calculation intuitively (Amossé et al., 2001): it
would be completely in line with the findings of Kahneman and Tversky if you – after hearing
your positive test result – 'irrationally' believed that it was almost absolutely certain that
you had contracted 'Bayesomia', instead of drawing the 'rational' conclusion that the probability
that you suffer from this disease has now become 9%. After all, the results of the Kahneman and
Tversky studies indicated that the subjects tended to neglect the a priori knowledge, i.e. that
there is base rate neglect. So, for instance, in the so-called 'Cab Problem' study Tversky
and Kahneman (1980) presented their subjects with the following problem:
A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue,
operate in the city. You are given the following data:
(i) 85% of the cabs in the city are Green and 15% are Blue.
(ii) A witness identified the cab as a Blue cab. The court tested his ability to identify cabs under the
appropriate visibility conditions. When presented with a sample of cabs (half of which were Blue and
half of which were Green) the witness made correct identifications in 80% of the cases and erred in 20%
of the cases.
Question: What is the probability that the cab involved in the accident was Blue rather than Green?
(Tversky & Kahneman, 1980, p. 162)
You may notice that this problem is exactly analogous to my 'Bayesomia' example: only now
Hi is a Blue cab (probability 0.15), whereas the event E is the identification of the cab by
the witness as a Blue cab. The probability that the witness has made a correct identification is
0.8.
p(Hi|E) = (0.8 × 0.15) / [(0.8 × 0.15) + (0.2 × 0.85)]
Thus, the a posteriori probability (after the identification by the witness) that the cab
involved in the accident was Blue rather than Green is – according to Bayes' theorem – 41%.
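The posterior sketch given above for the 'Bayesomia' example yields this result directly; as a self-contained one-liner:

```python
print((0.8 * 0.15) / (0.8 * 0.15 + 0.2 * 0.85))  # ~0.41
```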
Yet, Tversky and Kahneman (1980) report that most of the subjects
gave probabilities around 80%. According to Kahneman and Tversky this indicates that the
human mind acts 'irrationally' – or at least non-Bayesian – because it neglects the base rates it
should have taken into account according to Bayes' theorem. The research of Kahneman and
Tversky led to an explosion of research showing how 'bounded' human rationality is.
The proponents of this "heuristics-and-biases" program (Gigerenzer & Murray, 1987) have
tried to show again and again how the laws of cognition seem to be at odds with the laws of
probability.
However, the interpretation of the results of the studies of Kahneman and Tversky is
not uncontested. Gigerenzer and Murray have shown in their work Cognition as Intuitive
Statistics (1987) that the probabilistic, formal way in which Kahneman and Tversky presented
their problems may be the actual cause of the 'biases' in the estimates made by the subjects.
Gigerenzer (2000) showed that if the same problem is presented in a different way
(formulated in 'natural frequencies' instead of abstract probabilities, more contextualized,
etc.) the 'biases' and 'neglect' largely disappear. Moreover, the assumption, which apparently
underlies the heuristics-and-biases program, that the application of Bayes' theorem is the only
correct solution to their problems is highly arguable. For example, application of the statistical
theory of Neyman and Egon Pearson to the 'Cab Problem' leads to a completely different
answer. Actually, following a Neyman-Pearsonian analysis, the probability that the
cab really was Blue – given that the witness said it was Blue – is 0.82 (Gigerenzer & Murray,
1987), which happens to be very close to the answer the majority of the subjects in the study
of Kahneman and Tversky gave.
III. Can you bear to see psychology as it is?
Of course, Bayesian statistics is a very practical tool in Artificial Intelligence. Yet,
psychology seems to have found, in the successes of Bayesian reasoning in AI, a pretext to
backslide into philosophically outdated subjective semantics. Darwinist or 'evolutionary'
epistemology has shown that objective rational knowledge is not seated in the individual
subject – rationality is not to be found in my beliefs or in my experiences, but only in
replicative, collective phenomena which adapt to their environment through the algorithm of
natural selection.
Contemporary experimental psychology with its frequentist statistical methodology –
consisting of a mix of Fisherian and Neyman-Pearsonian ideas – corresponds to such a
Darwinist epistemology. In 1957 the famous psychologist Lee Cronbach could still talk of
"two disciplines of scientific psychology", whereby he defined Fisherian experimental psychology
as a "Tight Little Island" in comparison to the "Holy Roman Empire" (Cronbach, 1957, p. 671) of
correlational psychology, which stood in the tradition of Galton and Karl Pearson, of the
study of intelligence and personality, and whose "purpose was to find a measurement
instrument […] for an "objective" registration of individual differences" (Gigerenzer, 1987b,
p. 60). Since 1957 correlational psychology has lost more and more ground in favour of the
onrushing Fisherian experimental psychology. Even disciplines such as social or clinical
psychology that used to rely mostly on correlational methods have shifted their
methodological taste very strongly towards an experimental one. In psychology the victory of the
experimental Fisherian method over every other method is undeniable.
The enormous ‘force’ or ‘success’ of experimental psychology lies in its statistical
methodology that accelerates this ‘adaptation’ or ‘attunement’ between us and our
environment, because it can make good guesses about what is structural variation and what
chance variation. In this way psychology is able to attune human intelligence and its
environment to each other in such a way that error is minimized. The ‘fruits’ of the
psychological science are its answers to questions like: 'How should we attune the employee
and his workspace in such a way that his production is maximized?', 'How to attune the pilot
and the control panel in the plane in such a way that the chance that he will push a wrong
button is minimized?', 'How to attune the patient suffering from depression and his treatment
in an optimal way?' or 'How to optimize the attunement of the Artificial Intelligence of a
robotic entertainment dog – like the Aibo – to the emotional needs of the senior citizen?'. Yet,
the employee, the pilot, the patient suffering from depression and the senior citizen are not
particular individuals, but averages. Thus the average pilot, the average employee, the average
patient suffering from depression and the average senior citizen are the environment
which 'selects' control panels, workspaces and anti-depressive treatments. Psychological
research makes control panels, workspaces and anti-depressive treatments more
rationalised, i.e. better adapted or attuned to average human intelligence. Psychology is the
enterprise wherein hypotheses and models are subject to a process of selection, leading to an
increasing attunement – i.e. 'rationalisation' – of data and psychological hypotheses, models
and theories.
In his 1957 address to the American Psychological Association Lee Cronbach described the
difference between the correlational approach and the applied experimental one as follows:
The program of applied experimental psychology is to modify treatments so as to obtain the highest
average performance when all persons are treated alike – a search, that is, for "the one best way." The
program of applied correlational psychology is to raise average performance by treating persons
differently – different job assignments, different therapies, different disciplinary methods. […] If the
engineering psychologist succeeds: information rates will be so reduced that the most laggard of us can
keep up, visual displays will be so enlarged that the most myopic can see them, automatic feedback will
prevent the most accident-prone from spoiling the work or his fingers. (Cronbach, 1957, p. 677)
Although psychological testing is booming in all sorts of educational and professional
assessments, scientific psychology itself has become more and more experimental – searching
for "the one best way" in continually accelerating cycles of "problem P1 → tentative
theoretical solution → evaluative error elimination → problem P2" (Popper, 1979, p. 119 ff.
and p. 149).
I once talked with a student in industrial design about the fact that industrial design
students have to study a lot of statistics to be able to make calculations like: 'What size should
a chair have in order to fit 95% of the population?'. Could you say that a chair that fits 95% of the
population is more rational than a chair that fits only 35% of the population? Yes, of course!
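Such a calculation is a routine percentile exercise. A sketch, assuming – purely for illustration – that the relevant body measure (say, lower-leg length) is normally distributed with a mean of 44 cm and a standard deviation of 3 cm:

```python
from scipy import stats

# Hypothetical anthropometric assumptions (illustrative numbers only):
mean_cm, sd_cm = 44.0, 3.0

# The seat-height range that would accommodate the central 95% of the population:
low, high = stats.norm.interval(0.95, loc=mean_cm, scale=sd_cm)
print(f"{low:.1f} cm to {high:.1f} cm")  # ~38.1 cm to ~49.9 cm
```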
However, not only are control panels and robotic entertainment dogs attuned to us, but
we ourselves are also the result of a long process of variation and selection that attuned us to
our environment. All our knowledge is an adaptation to our environment (cf. Popper, 1990). I
therefore agree with Popper, who argued that psychology suffers from the “utterly naïve and
completely mistaken” idea that knowledge is “what we do learn through the entry of
experience into our sense openings” (Popper, 1979, p. 61): for instance behaviourist
psychology, but also more cognitive psychological theories imbued with ideas derived from
associationist psychology (see chapter 4), made and still make this mistake (Popper, 1979).
Though Popper has often been very critical of psychology as a science (e.g. see
Popper, 1972, p. 47; Popper, 1979, p. 96 and p. 156), he himself could not withstand the lure
of identifying psychology with 'subjectivity' either. After all, Popper argued that all the
subjective experiences, feelings, beliefs and convictions which should have no place in
science must be banished to psychology. Apparently Popper thought of psychology either as a
very weak and disorderly science, or as a non-scientific place of exile for all the subjectivity
he wanted to get rid of.
…a subjective experience, or a feeling of conviction, can never justify a scientific statement, and […]
within science it can play no part except that of an object of an empirical (a psychological) inquiry.
(Popper, 1972, p. 46)
So why is it so difficult, or even impossible, to think about psychological, human cognition in
non-subjective terms? I believe the only reason for adopting the 'subjective' outlook on
knowledge is that it is less bleak and unsettling than Darwinist epistemology:
I don’t know about you, but I am not initially attracted by the idea of my brain as a sort of dungheap in
which larvae of other people’s ideas renew themselves, before sending out copies of themselves in an
informational diaspora. It does seem to rob my mind of its importance as both author and critic. […]
We would like to think of ourselves as godlike creators of ideas, manipulating and controlling them as
our whim dictates, and judging them from an independent, Olympian standpoint. (Dennett, 1996, p.
346)
Yet, the adoption of Bayesian inference as a standard for human rationality out of a
need for a comforting, subjective epistemology places psychology in an awkward
predicament: for it leads to endless misconceptions about its frequentist statistical
methodology and to a strange discrepancy with the Bayesian ideas on a theoretical level.
Although Bayesian statistics may be very practical in artificial intelligence research, its
subjective epistemology is philosophically a total faux pas. Philosophically, Bayesian
inference is old-fashioned: a reanimation of an eighteenth century idea. However, can
psychologists bear to see psychology as it is? Time will tell.
[3.G] Recapitulation & outlook on Part Two
Recapitulation of chapter 3:
Chapter 3 showed the philosophical incommensurability between frequentist and Bayesian statistics, by linking
Bayesian statistics to 'classical epistemology' and frequentist statistics to 'evolutionary epistemology'. From the
perspective of evolutionary epistemology – which was formulated in the first place by the philosopher Karl Popper
– rationality has no need for a 'knowing subject', 'beliefs' or 'representations'. Popper fought his whole life
against 'subjectivity' and tried to introduce 'non-subjective' words such as 'corroboration' and 'falsification' to
describe the growth of knowledge. It was clarified that frequentist statistical inference – such as it is widely applied
in psychology – is successful because it 're-enacts' the evolutionary algorithm of natural selection in an
'accelerated' way: the fact that chance is 'tamed' – because it is distinguished from 'structural' variation – leads
to an accelerated pace of hypothesis elimination. Because this evolutionary epistemology is rather 'bleak', the
frequentist 'meaning' of statistics is often poorly grasped by those who apply it: the semantics of both cognitive
psychological theories and frequentist statistical inferential methodology are still drenched in
'subjectivity'. The success of Bayesian statistics in connectionist artificial intelligence has probably contributed
to its becoming – in the so-called heuristics-and-biases program in cognitive psychology, initiated by Tversky and
Kahneman in the 1970s – the 'official' standard of rationality against which human
cognition is measured. Proponents of the heuristics-and-biases program have, however, concluded that human
cognition does not function according to the rules of Bayesian inference and therefore is irrational, or at least
'bounded' in its rationality. It is assumed that the adoption of Bayesian inference as a standard of rationality was
guided by the fact that its classical epistemology is comfortingly subjective.
Outlook on Part Two of this thesis:
The central question of the second part of this thesis is how statistics and probability still concern my thought and
my words, although they have nothing to do with my personal, Cartesian subjective beliefs. How can I say how
probability and statistics concern me – without relapsing into Cartesian, subjective talk?
The hypothesis, which will be proposed in order to answer this question, is that there was a seventeenth century
conceptual turn which entailed the 'textualization' of the world – i.e., 'nature' became the 'book of nature' – and
that this changed the notions of rationality and probability in such a way that it entailed the latent beginnings of both
psychology and statistics. This latent period lasted until the nineteenth century, in which the notions of rationality
and probability again changed in a radical way. This change led to the nineteenth century emergence of
psychology and its statistical inferential methodology as we know them now. To substantiate this hypothesis I
will focus on: (1) how the 'predecessors' of psychology – viz., the seventeenth century Cartesian rational
subject and eighteenth century associationist psychology – were entangled with the 'predecessors' of modern
statistical inferential methodology – viz., seventeenth and eighteenth century probability theory; (2) how the
nineteenth century change in the notions 'probability' and 'rationality', which made it possible for psychology
and frequentist statistical methodology – both useful and fruitful disciplines – to emerge, entailed at the same
time rather unpleasant epistemological consequences that led to hermeneutical, i.e. anti-statistical and
anti-psychological, reactions in nineteenth century thought. Finally, I will show how the 'textual' origin of
probability and rationality may put the question how statistics concern me in a completely different light.
INTERMEZZO: CONCLUSIONS OF PART ONE
Of course, the Cartesian substance dualism and the idea of man as the seat of rationality are
outdated in contemporary psychology. Every psychologist knows how ‘bounded’ human
rationality is, depending on ‘prosaic tricks’ such as heuristics, biases (cf. Kahneman et al.,
1982) and somatic markers (Damasio, 1998). Yet, psychologists apparently still like to think
of themselves as conscious knowing subjects endowed with ‘beliefs’ and ‘representations’
(Wheeler, 2005).
However, the classical subjective epistemology which psychologists endorse is not
congruent with the frequentist statistical methods they apply, for this frequentist methodology
entails an evolutionary epistemology. Hence there are so many conceptual
misconceptions about statistics: psychologists tend to interpret probability
(whose assumptions underlie statistical inferential methodology) subjectively. The
Darwinian idea of the scientific enterprise as a process of mindless, blind error elimination on
the level of collective, replicative phenomena does not agree well with how the average
psychologist would like to think of himself, viz. as an individualistically thinking scientist with
subjective considerations and beliefs. The idea of objective knowledge as the result of 'blind
algorithms' is of course less 'comfortable' than the idea of a growing body of subjective
knowledge. This explains why it is so difficult to grasp the 'meaning' of statistics and why
statistical methodology is presented in psychology as a monolithic, timeless truth: it is a way
to avoid the confrontation with the philosophical ideas underlying the frequentist
methodology.
If we look at the ideas of two nineteenth century thinkers who endeavoured to
formulate the philosophical impact of frequentist statistics – G. Th. Fechner (1801-1887) and
C.S. Peirce (1839-1914) – one notices that they both tried to overcome their tendency to
speak in subjective semantics (see for Fechner: Heidelberger, 1987; Mayo, 1996; for
Peirce e.g. Reynolds, 2002). As I showed in chapter 3, Popper – who was inspired by Peirce
(Popper, 1979) – was the first to succeed reasonably well in this task of overcoming
subjective semantics.
So, we may conclude with Popper that statistical methodology, rationality and
probability have nothing to do with my personal, Cartesian subjective beliefs. However,
people who think about statistics usually have quite strong feelings about it. It is apparently
difficult to think about statistical-psychological research practice without getting trapped in a
discourse of either idolatry of the statistical method or a romantic longing for a hermeneutical,
non-statistical psychology. The reason for these strong feelings is evident: although statistics,
rationality and probability have apparently nothing to do with my personal subjective beliefs,
they affect the question who I am (apparently not a rational knowing subject) and my thought
and my words. How can I say how statistics, probability and rationality concern me – without
relapsing into Cartesian, subjective talk? This will be the subject of the second part of this
thesis.
PART TWO:
The intertwinement of the notions
underlying Statistics and Psychology:
Probability and Rationality
PART TWO CONSISTS OF THE FOLLOWING CHAPTERS:
CHAPTER 4: THE LATENT PERIOD
[4.A] It all begins with Descartes – the rational, representational consciousness.
[4.B] The book of Nature, epistemological uncertainty and the equivocation of
the Cartesian ‘subject’.
[4.C] Associationist psychology or the unproblematic ambivalence in the
concept of probability
[4.D] Rationality – between the Principle of Sufficient Reason (Leibniz) and the
Principle of Non-sufficient Reason (J. Bernoulli)
CHAPTER 5: THE CLASH WITH ‘NATURE’: THE LOCUS OF
RATIONALITY RECONSIDERED.
[5.A] The transition from classical to modern probability – the same probability,
but in a different way.
[5.B] The nineteenth century confrontation with 'nature' – an objective
interpretation of chance.
[5.C] Aversion of statistics and love of absolute chance – Nietzsche's absolute
subjectivism
[5.D] Just a name? The realism of C.S. Peirce and the nominalism of Pearson
CHAPTER 6: WHERE DO I STAND? THE SIGNIFICANCE OF STATISTICS
[6.A] Fechner and Peirce: the Kollektiv as an end in itself.
[6.B] The semantics of statistics.
[6.C] From “metaphysics” to “prophysics”
REFERENCES
Aczel, A. D. (2004). Chance. A guide to gambling, love, the stock market & just about
everything else. New York: Thunder's Mouth Press.
Amossé, T., Andrieux, Y.-V., & Muller, L. (2001). L'esprit humain est-il bayésien? Courrier
des statistiques(100), 25-28.
Aristotle. (1995). Politics : books I and II (T. J. Saunders, Trans.). Oxford: Clarendon.
Barnard, G. A. (1987). R.A. Fisher: A True Bayesian? International Statistical Review, 55(2),
183-189.
Bartlett, M. S. (1966). Review of 'Logic of Statistical Inference', by Ian Hacking. Biometrika,
53(3-4), 631-633.
Bem, S. (2005). Bent u daar nog? Over subjectiviteit en psychologie. (Afscheidscollege
3/12/2004, Universiteit Leiden, Faculteit sociale wetenschappen, Cognitieve
Psychologie) [Are you still there? About subjectivity and psychology. Farewell lecture
3/12/2004, Leiden University, Faculty of Social Sciences, Cognitive Psychology ]. s.l.:
s.n.
Bem, S., & Jong, H. L. d. (1998). Theoretical Issues in Psychology. An introduction. London:
Sage Publications.
Bockstaele, P., Cerulus, F., & Vanpaemel, G. (Eds.). (2004). Ars Conjectandi. Over gokkers,
geleerden en grote getallen. [On gamblers, scholars and large numbers. Catalogue to
the exhibition in the library of the Catholic University of Leuven, 26 May - 27 June
2004]. s.l.: s.n.
Brunswik, E. (1943). Organismic achievement and environmental probability. Psychological
Review, 50, 255-272.
Carnap, R. (1951). Logical foundations of probability. London: Routledge and Kegan Paul.
Cioffari, V. (1973). Fortune, fate, and chance. In P. P. Wiener (Ed.), Dictionary of the history
of ideas. Studies of selected pivotal ideas. (Vol. 2, pp. 226-236). New York: Scribner.
Cowles, M. (1989). Statistics in psychology: an historical perspective. Hillsdale, N.J.:
Lawrence Erlbaum.
Cronbach, L. J. (1957). The two disciplines of scientific psychology. American psychologist,
12, 671-684.
Damasio, A. R. (1998). De vergissing van Descartes. Gevoel, verstand en het menselijk brein.
[Descartes' error - Emotion, Reason and the Human Brain] (L. Teixeira de Mattos,
Trans.). Amsterdam: Wereldbibliotheek.
Danziger, K. (1985). The Methodological Imperative in Psychology. Philosophy of the Social
Sciences, 15(1), 1-13.
Danziger, K. (1987). Statistical Method and the Historical Development of Research Practice
in American Psychology. In L. Krüger & G. Gigerenzer & M. S. Morgan (Eds.), The
probabilistic revolution. Volume 2: Ideas in the sciences (Vol. 2, pp. 35-37).
Cambridge, Massachusetts: MIT Press.
Danziger, K. (1990). Constructing the subject. Historical origins of psychological research.
Cambridge: Cambridge University Press.
Darwin, C. (1998/1859). On the Origin of Species by Means of Natural Selection, or the
preservation of Favoured Races in the Struggle for Life (facsimile of the first ed.).
Cambridge, Mass.: Harvard University Press.
Daston, L. (1987). Rational individuals versus laws of society. In L. Krüger & L. J. Daston &
M. Heidelberger (Eds.), The probabilistic revolution. Volume 1: Ideas in history (Vol.
1, pp. 295-304). Cambridge, Massachusetts: MIT Press.
Daston, L. (1988). Classical probability in the Enlightenment (1st ed.). Princeton: Princeton
University Press.
David, F. N. (1955). Studies in the history of probability and statistics (I). Dicing and gaming.
(A note on the history of probability). Biometrika, 42(1/2), 1-15.
David, F. N. (1962). Games, gods and gambling. The origins and history of probability and
statistical ideas from the earliest times to the Newtonian era. London: Charles Griffin
& Co.
Dawkins, R. (1989). The selfish gene. Oxford: Oxford University Press.
Dawkins, R. (2000, 10/3/2000). Obituary of W.D. Hamilton (1936-2000). The Independent.
Dehue, T. (1990). De regels van het vak. Nederlandse psychologen en hun methodologie,
1900-1985. Amsterdam: Van Gennep.
Dennett, D. C. (1991). Mother Nature versus the Walking Encyclopedia: A Western Drama.
In W. Ramsey & S. P. Stich & D. E. Rumelhart (Eds.), Philosophy and connectionist
theory (pp. 21-30). Hillsdale, N.J.: Erlbaum Associates, 1991.
Dennett, D. C. (1996). Darwin's Dangerous Idea. Evolution and the meanings of life. London:
Penguin.
Dennett, D. C. (1998). Brainchildren: essays on designing minds. Cambridge, MA: MIT
Press.
Desrosières, A. (1993). La politique des grands nombres. Histoire de la raison statistique
(1st ed.). Paris: Éditions La Découverte.
Desrosières, A. (1999). Statistique. In D. Lecourt & T. Bourgeois (Eds.), Dictionnaire
d'histoire et philosophie des sciences (pp. 874-880). Paris: Presses Universitaires de
France.
Desrosières, A. (2001). How real are statistics? Four possible attitudes. Social Research,
68(2), 339-355.
Desrosières, A. (2002). Adolphe Quetelet. Courrier des statistiques, (104), 3-6.
Fisher Box, J. (1978). R.A. Fisher. The life of a scientist. New York: John Wiley & Sons.
Fisher, R. A. (1921). Studies in Crop Variation I. An Examination of the Yield of Dressed
Grain from Broadbalk. Journal of Agricultural Science, 11, 109-135.
Fisher, R. A. (1924). Studies in Crop Variation III. The Influence of Rainfall on the Yield of
Wheat at Rothamsted. Philosophical Transactions of the Royal Society of London, Ser.
B, 213, 89-142.
Fisher, R. A. (1951). The Design of Experiments. Edinburgh: Oliver and Boyd.
Fisher, R. A. (1953). Croonian lecture: Population Genetics. Proceedings of the Royal Society
of London. Series B, Biological sciences, 141, 510-523.
Fisher, R. A. (1955). Statistical methods and scientific induction. Journal of the Royal
Statistical Society. Series B (Methodological), 17, 69-78.
Fisher, R. A. (1970). Statistical methods for research workers (14th ed.). Edinburgh: Oliver and
Boyd.
Foucault, M. (1966). Les mots et les choses. Une archéologie des sciences humaines. Paris:
Gallimard.
Galileo, G. (1957). Discoveries and Opinions of Galileo (S. Drake, Trans.). Garden City,
N.Y.: Doubleday.
Galton, F. (1869). Hereditary genius: an inquiry into its laws and consequences. London: s.n.
Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the
Anthropological Institute, 15, 246-263.
Galton, F. (1888). Co-relations and their measurement, chiefly from anthropometric data.
Proceedings of the Royal Society, 45, 135-145.
Galton, F. (1889). Natural Inheritance. London: Macmillan.
Galton, F. (1901). Biometry. Biometrika, 1(1), 7-10.
Garber, D., & Zabell, S. (1979). On the emergence of probability. Archive for History of
Exact Sciences, 21, 33-53.
Garnham, A., & Oakhill, J. (2001). Thinking and reasoning. Oxford: Blackwell Publishers.
Garson, J. (2002). Connectionism. In E. N. Zalta (Ed.), The Stanford Encyclopedia of
Philosophy. Retrieved May 23, 2006, from:
http://plato.stanford.edu/archives/win2002/entries/connectionism/.
Gee, H. (1999). In search of deep time: beyond the fossil record to a new history of life. New
York: Free Press.
Gigerenzer, G. (1987a). Probabilistic thinking and the fight against subjectivity. In L. Krüger
& G. Gigerenzer & M. S. Morgan (Eds.), The probabilistic revolution. Volume 2:
Ideas in the sciences (Vol. 2, pp. 11-33). Cambridge, Massachusetts: MIT Press.
Gigerenzer, G. (1987b). Survival of the fittest probabilist: Brunswik, Thurstone, and the two
disciplines of psychology. In L. Krüger & G. Gigerenzer & M. S. Morgan (Eds.), The
probabilistic revolution. Volume 2: Ideas in the sciences (Vol. 2, pp. 49-72).
Cambridge, Massachusetts: MIT Press.
Gigerenzer, G. (2000). Adaptive Thinking. Rationality in the real world. Oxford: Oxford
University Press.
Gigerenzer, G., Krauss, S., & Vitouch, O. (2004). The null ritual: what you always wanted to
know about significance testing but were afraid to ask. In D. Kaplan (Ed.), The Sage
handbook of quantitative methodology for the social sciences (pp. 391-408). Thousand
Oaks, California: Sage publications.
Gigerenzer, G., & Murray, D. J. (1987). Cognition as intuitive statistics. Hillsdale, NJ:
Lawrence Erlbaum Associates.
Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., & Krüger, L. (1990). The
empire of chance. How probability changed science and everyday life. Cambridge:
Cambridge University Press.
Gillies, D. (1971). A falsifying rule for probability statements. The British Journal for the
Philosophy of Science, 22, 231-261.
Gillies, D. (2003). Philosophical theories of probability. London: Routledge.
Gribbin, J. (1985). Op zoek naar Schrödingers kat. Quantumfysica en de werkelijkheid. [In
search of Schrödinger's cat]. Amsterdam: Contact.
Hacking, I. (1975). The emergence of probability. A philosophical study of early ideas about
probability, induction and statistical inference (1st ed.). London: Cambridge
University Press.
Hacking, I. (1976). Logic of statistical inference (1st pbk ed.). Cambridge: Cambridge
University Press.
Hacking, I. (2001). An introduction to probability and inductive logic. Cambridge: Cambridge
University Press.
Hacking, I. (2004). The taming of chance (8th ed.). Cambridge: Cambridge University Press.
Hald, A. (1998). A history of mathematical statistics from 1750 to 1930 (1st ed.). New York:
John Wiley & Sons.
Haller, H., & Krauss, S. (2002). Misinterpretations of significance: A problem students share
with their teachers? Methods of Psychological Research Online [Online Serial,
retrievable from http://www.mpr-online.de ], 7(1), 1-20.
Heidegger, M. (1988). Ontologie: Hermeneutik der Faktizität. (Freiburger Vorlesung
Sommersemester 1923) (Vol. 63). Frankfurt am Main: Klostermann.
Heidelberger, M. (1987). Fechner's indeterminism: from freedom to laws of chance. In L.
Krüger & L. Daston & M. Heidelberger (Eds.), The probabilistic revolution. Volume
1: Ideas in history (Vol. 1, pp. 117-156). Cambridge, Massachusetts: MIT Press.
Heiser, W. J. (1990). Datatheorie [Data theory]. Leiden: s.n.
Heiser, W. J. (2003). Trust in Relations. Measurement: Interdisciplinary Research and
Perspectives, 1(4), 264-269.
Heiser, W. J., & Verduin, K. (2005). Spreiding zonder fouten. Hoe de standaarddeviatie tot
stand kwam als maat voor verscheidenheid. [Dispersion without errors. How the
standard deviation emerged as a measure of diversity]. STAtOR, 6(3), 14-20.
Hogben, L. (1957). Statistical theory: the relationship of probability, credibility and error.
An examination of the contemporary crisis in statistical theory from a behaviourist
viewpoint. London: Allen and Unwin.
Hubbard, R. (2004). Alphabet soup: blurring the distinctions between p's and α's in research
in psychology. Theory & Psychology, 14(3), 295-327.
Hume, D. (2002/1739). A treatise of human nature. Oxford: Oxford University Press.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1982). Judgment under Uncertainty:
Heuristics and Biases. Cambridge: Cambridge University Press.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of
representativeness. Cognitive Psychology, 3, 430-454.
Kendall, M. G. (1942). On the future of statistics. Journal of the Royal Statistical Society,
105, 69-80.
Kendall, M. G. (1956). Studies in the history of probability and statistics (II). The beginnings
of a probability calculus. Biometrika, 43(1/2), 1-14.
Kendall, M. G. (1963). Ronald Aylmer Fisher, 1890-1962. Biometrika, 50(1-2), 1-15.
Kendall, M. G. (1973). Chance. In P. P. Wiener (Ed.), Dictionary of the history of ideas.
Studies of selected pivotal ideas. (Vol. 1, pp. 336-340). New York: Scribner.
Lucas, A. M. (1995). Anglo-Irish Poems of the Middle Ages. Dublin: Columba Press.
MacKenzie, D. A. (1981). Statistics in Britain, 1865-1930. Edinburgh: Edinburgh University
Press.
Maistrov, L. E. (1974). Probability theory. A historical sketch. New York: Academic Press.
Mayo, D. G. (1996). Error and the growth of experimental knowledge. Chicago: The
University of Chicago Press.
McClelland, J. L. (1994). Comment. Neural Networks and Cognitive Science: Motivations
and Applications. Statistical science, 9(1), 42-45.
Menand, L. (2002). The Metaphysical Club. A story of ideas in America. New York: Farrar,
Straus and Giroux.
Michell, J. (1999). Measurement in psychology. A critical history of a methodological concept
(1st ed.). Cambridge: Cambridge University Press.
Mises, R. von. (1936). Wahrscheinlichkeit, Statistik und Wahrheit. Einführung in die neue
Wahrscheinlichkeitslehre und ihre Anwendung. (2nd ed.). Wien: Verlag von Julius
Springer.
Monod, J. (1970). Le hasard et la nécessité. Essai sur la philosophie naturelle de la biologie
moderne. Paris: Seuil.
Moore, D. S., & McCabe, G. P. (1999). Introduction to the practice of statistics. New York:
W.H. Freeman & Co.
Neyman, J., & Pearson, E. S. (1933). On the problem of the most efficient tests of statistical
hypotheses. Philosophical Transactions of the Royal Society of London, Ser. A, 231,
289-337.
Nikolow, S. (2001). A. F. W. Crome's Measurements of the "Strength of the State": Statistical
Representations in Central Europe around 1800. History of Political Economy,
33(Annual Supplement), 23-56.
Oakes, M. (1986). Statistical inference: a commentary for the social and behavioral sciences.
Chichester, UK: Wiley.
Oosterhuis, T. (1991). De pijl van Zeno: een verhaal over de geschiedenis van de statistiek.
[Zeno's arrow: a story about the history of statistics]. Baarn: Fontein.
Pearson, K. (1911). The grammar of science. Part I. Physical (3rd, revised and enlarged ed.,
Vol. 1). London: Adam and Charles Black.
Pearson, K. (1914). The life, letters and labours of Francis Galton. Volume I. Birth 1822 to
Marriage 1853. (Vol. 1). Cambridge: Cambridge University Press.
Pearson, K. (1924). The life, letters and labours of Francis Galton. Volume II. Researches of
middle life (Vol. 2). Cambridge: Cambridge University Press.
Pearson, K. (1930). The life, letters and labours of Francis Galton. Volume III. A:
Correlation, personal identification and eugenics. B: Characterisation, especially by
letters (Vol. 3). Cambridge: Cambridge University Press.
Pearson, K. (1978). The history of statistics in the 17th and 18th centuries against the
changing background of intellectual, scientific and religious thought. Lectures by Karl
Pearson given at University College London during the academic sessions 1921-1933.
Edited by E.S. Pearson. London: Charles Griffin & Company.
Pearson, K., Weldon, W. F. R., & Davenport, C. B. (1901). Editorial: The spirit of
Biometrika. Biometrika, 1(1), 3-6.
Popkin, R. H. (1964). The History of Scepticism from Erasmus to Descartes (2nd ed.). Assen:
Van Gorcum.
Popper, K. (1959). The Propensity Interpretation of Probability. British Journal for the
Philosophy of Science, 10, 25-42.
Popper, K. (1972). The logic of scientific discovery (1st ed.). London: Hutchinson.
Popper, K. (1974). Conjectures and refutations. The growth of scientific knowledge. (5th ed.).
London: Routledge and Kegan Paul.
Popper, K. (1979). Objective Knowledge. An evolutionary approach (2nd revised ed.).
Oxford: Clarendon Press.
Popper, K. (1983). Realism and the aim of science. From the postscript to "The logic of
scientific discovery". (1st ed.). London: Hutchinson & Co.
Popper, K. (1990). A world of propensities. Bristol: Thoemmes.
Porter, T. M. (1986). The rise of statistical thinking, 1820-1900. Princeton, NJ: Princeton
University Press.
Porter, T. M. (2003a). Measurement, Objectivity and Trust. Measurement: Interdisciplinary
Research and Perspectives, 1(4), 241-255.
Porter, T. M. (2003b). Objectivity and Trust: A Measured Rejoinder. Measurement:
Interdisciplinary Research and Perspectives, 1(4), 286-298.
Porter, T. M. (2003c). Statistics and statistical methods. In T. M. Porter & D. Ross (Eds.), The
Cambridge history of science. The modern social sciences (Vol. 7, pp. 238-250).
Cambridge: Cambridge University Press.
Prigogine, I., & Stengers, I. (1985). Orde uit chaos [Order out of chaos]. Amsterdam: Bert
Bakker.
Quetelet, A. (1835). Sur l'homme et le développement de ses facultés, ou Essai de physique
sociale. Paris: Bachelier.
Quetelet, A. (1846). Lettres à S.A.R. le duc régnant de Saxe-Cobourg et Gotha sur la théorie
des probabilités, appliquée aux sciences morales et politiques. Bruxelles: M. Hayez.
Reynolds, A. (2002). Peirce's scientific metaphysics. The philosophy of chance, law, and
evolution. Nashville: Vanderbilt University Press.
Robert, C. P. (2001). L'analyse statistique bayésienne. Courrier des statistiques, (100), 3-4.
Rogers, T. B. (2002). Book review: Joel Michell. Measurement in psychology: A critical
history of a methodological concept. Journal of the History of the Behavioral Sciences,
38(1), 61-62.
Romeijn, J. W. (2005). Bayesian inductive logic: inductive predictions from statistical
hypotheses. Doctoral dissertation, Rijksuniversiteit Groningen.
Room, A. (1986). Dictionary of changes in meaning. London: Routledge & Kegan Paul.
Rosser Matthews, J. (2000). Statistics. In A. Hessenbruch (Ed.), Reader's guide to the history
of science (pp. 706-707). London: Fitzroy Dearborn.
Salsburg, D. (2001). The lady tasting tea. How statistics revolutionized science in the
twentieth century. New York: W.H. Freeman and Company.
Sambursky, S. (1956). On the possible and the probable in ancient Greece. Osiris, 12, 35-48.
Schuh, F. (1964). Hoe bepaal ik mijn kans? Kansrekening met toepassing op spel en
statistiek. [How do I determine my chances? Probability theory applied to games and
statistics]. Amsterdam: Agon Elsevier.
Schweber, S. S. (1977). The Origin of the Origin Revisited. Journal of the History of Biology,
10(2), 229-316.
Schweber, S. S. (1982). Demons, Angels, and Probability: some aspects of British Science in
the Nineteenth Century. In A. Shimony & H. Feshbach (Eds.), Physics as natural
philosophy: essays in honor of Laszlo Tisza on his seventy-fifth birthday (pp. 319-363).
Cambridge, Mass.: MIT Press.
Schweber, S. S. (1983). Aspects of probabilistic thought in Great Britain during the 19th
century: Darwin and Maxwell. In M. Heidelberger & L. Krüger & R. Rheinwald
(Eds.), Probability since 1800. Interdisciplinary studies of scientific development.
Workshop at the centre for interdisciplinary research of the University of Bielefeld,
September 16-20, 1982 (pp. 41-96). Bielefeld: B.K. Verlag.
Sheynin, O. B. (1974). On the prehistory of probability. Archive for History of Exact
Sciences, 12, 97-141.
Stigler, S. M. (1986). The history of statistics. The measurement of uncertainty before 1900
(1st ed.). Cambridge, Massachusetts: The Belknap Press of Harvard University Press.
Stigler, S. M. (1996). The history of statistics in 1933. Statistical Science, 11(3), 244-252.
Stigler, S. M. (1997). Regression towards the mean, historically considered. Statistical
Methods in Medical Research, 6, 103-114.
Stigler, S. M. (1999). Statistics on the table. The history of statistical concepts and methods.
Cambridge, Massachusetts: Harvard University Press.
Swijtink, Z. G. (2000). Probability. In A. Hessenbruch (Ed.), Reader's guide to the history of
science (pp. 596-597). London: Fitzroy Dearborn.
Thompson, B. (2001). A Critical Review of a Critical History of Measurement (review of Joel
Michell's Measurement in Psychology: A Critical History of a Methodological
Concept). Theory & Psychology, 11(6), 855-856.
Tversky, A., & Kahneman, D. (1980). Causal schemata in judgments under uncertainty. In
M. Fishbein (Ed.), Progress in social psychology (Vol. 1). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Vlis, J. H. v. d., & Heemstra, E. R. (1988). Geschiedenis van de kansrekening en statistiek
[History of probability theory and statistics] (1st ed.). Utrecht: Pandata.
von Plato, J. (1987). Probabilistic physics the classical way. In L. Krüger & G. Gigerenzer &
M. S. Morgan (Eds.), The probabilistic revolution. Volume 2: Ideas in the sciences
(Vol. 2, pp. 379-407). Cambridge, Massachusetts: MIT Press.
von Plato, J. (1995). Creating modern probability. Its mathematics, physics and philosophy in
historical perspective. Cambridge: Cambridge University Press.
Westergaard, H. (1969). Contributions to the history of statistics (1st ed.). The Hague:
Mouton Publishers.
Wheeler, M. (2005). Reconstructing the cognitive world: the next step. Cambridge, Mass.:
MIT Press.
Winkler, R. L. (1974). Statistical analysis: theory versus practice. In C.-A. S. Staël von
Holstein (Ed.), The concept of probability in psychological experiments (pp. 127-140).
Dordrecht, Holland: D. Reidel Publishing Company.
Wozniak, R. H. (1999). Classics in psychology. 1855-1914: Historical essays. Bristol:
Thoemmes Press (and Tokyo: Maruzen) [co-published].
Yates, F. (1984). Book review of 'Neyman: from life', by C. Reid. Journal of the Royal
Statistical Society. Series A (General), 147(1), 116-118.