Why did mathematicians switch their definition of a
function from Bernoulli’s definition to the one
given in the modern set-theoretic view?
-Matt Insall
This essay discusses several topics that may have played a role in the
choice, made by many mathematicians in the late nineteenth century and
by almost all in the twentieth, to abandon the definition Bernoulli
used for the concept of a function and to adopt the very abstract,
set-theoretic one we now teach our graduate students.
Let me ``play'' with History a bit, by discussing the possibility of
writing some fiction around this historical question. I shall imagine
that I am a mathematician who has discovered that Bernoulli's
definition does not work for me, and that I wish to demonstrate to the
mathematical community why they should not accept his definition
either. To do so, I would need to write a paper in which I present
something a mathematician of the times of Bernoulli would refer to as a
function, but which fails to satisfy Bernoulli's definition of a
function. (In fact, the best of all possible worlds for showing that
the mathematical community should abandon Bernoulli's definition would
be to find a work of Bernoulli in which my example, or a very similar
example, is given, and in which Bernoulli refers to it as a function.
In this case, I will have provided an example of a ``Bernoulli
function'' that fails to be a ``Bernoulli function''.) But, let us
imagine that, during Bernoulli's lifetime, I am also writing my work
down in a style that will not exist until the twentieth century.
I will begin with the Bernoulli concept of a function, but I shall, in
a very twentieth-century manner, write my definition out explicitly,
and number it, so it may be easily recalled at a later time, and so
that careful delineations may be made.
Definition 1: Let X and Y be sets, and let f be a relation from X to
Y, that is, a subset of the Cartesian product of X with Y. Then f is a
``Bernoulli function''
provided that there is an expression formed from variables and
constants that gives a rule for one to decide for a given x in X which
y in Y is paired with x in the relation f, and there is only one such y
for each x in X for which there is such a y in Y.
I would also write a definition of function that pleases me more for my
specific application, and I would call such things ``functions'', and
show that every ``Bernoulli function'' is a ``function'' (according to
my definition). Thus, in such a paper, I might write the following:
Definition 2: Let X and Y be sets, and let f be a relation from X to
Y, that is, a subset of the Cartesian product of X with Y. Then f is a
``function'' provided that
for each x in X, there is at most one element y of Y such that x is
related to y via f.
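For a finite relation, the condition in definition 2 can be checked
mechanically. The following is only a minimal sketch in Python, not
part of the historical argument; the name is_function and the sample
relations are my own assumptions, chosen for illustration. Note that
the check never asks for a formula: a bare table of pairs is enough.

```python
from typing import Hashable, Iterable, Tuple

Pair = Tuple[Hashable, Hashable]


def is_function(relation: Iterable[Pair]) -> bool:
    """Return True if each x is related to at most one y (definition 2)."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False  # x is paired with two different values of y
        seen[x] = y
    return True


# A relation given only as a table of pairs, with no formula behind it.
table = {(0, 7), (1, -3), (2, 7)}
# A relation that pairs 0 with two different values.
not_a_function = {(0, 1), (0, 2)}

print(is_function(table))
print(is_function(not_a_function))
```

The first relation passes and the second fails, purely on the basis of
which pairs are present.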
Now, as I seem to have heard, the expressions Fourier wanted to use to
define functions were Sine and Cosine series. I also heard that he
boasted that any function from the reals to the reals can be so
expressed, and this prompted another mathematician (Was it Dirichlet or
Lagrange or Cauchy? I cannot remember.) to present an example of a
function from the reals to the reals which has an everywhere divergent
cosine series and an everywhere divergent sine series, or of a function
f whose sine and cosine series converge everywhere to some function
other than f. Others on this list may be able to confirm or correct my
account, or fill in some details. The example given satisfies
definition 2, but not definition 1. Or does it? I will show that
there is a sense in which Bernoulli's definition is equivalent to mine
in a trivial, but wholly unsatisfactory, way: To do so requires that I
investigate what I would mean by ``expression'', in Bernoulli's
definition. This has been looked at quite a bit in the twentieth
century.
If we check through the work of Bernoulli's time, I think we will find
that the term ``expression'' was used quite loosely. I approach it
from a logic or universal algebra perspective. In logic, one considers
the symbols of a language in a very formal way. Similarly, in
universal algebra, language is highly formalized, so that there is
actually a definition for the term ``expression'', and in both logic
and universal algebra, the objects that are called ``expressions''
model quite accurately what one means when one uses that term in
everyday mathematics. Putting all this together, I can ``show'' that
if one accepts Bernoulli's definition of function for Sine and Cosine
series, and then one is faced with an example of a specific function we
shall call ``the strange function'', that is not representable as a
Sine series or as a Cosine series, then one may easily demonstrate that
the strange function is a Bernoulli function. (Statements of theorems
will be in some cases somewhat informal, as was done in the days of
Bernoulli.)
Theorem 1: Let X be the set of real numbers, and let S denote the
strange function. Let L denote the language for analysis in which Sine
and Cosine series are studied, in such a way that it is reasonable to
refer to any function that has either a Sine series representation or a
Cosine series representation as a Bernoulli function, and let E denote
the collection of expressions of the language L. Then there is a
language, for analysis, L', with collection of expressions E' with the
following properties:
(i) every expression in E is also in E';
(ii) the strange function is a Bernoulli function in the language L',
meaning that in E', there is an expression that defines the strange
function.
The proof of this theorem is quite obvious to a beginning logic
student, I would expect, so I shall not write it out in a very formal
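way. I'll outline how it goes briefly: Let the symbol S that denotes
the strange function be appended to the language L as a constant, and
then refer to the newly obtained language as the language L'. Nothing
has been removed from L, so (i) holds, and the symbol S is itself an
expression in E' that defines the strange function, so (ii) holds.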
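The construction in this proof can be mimicked in a toy setting. The
sketch below is only an illustration under strong simplifying
assumptions of my own: a ``language'' is modelled as a table from
symbol names to real functions, an ``expression'' is a single symbol
name, and the function called strange is a hypothetical stand-in, not
the strange function of the text.

```python
import math
from typing import Callable, Dict

# Modelling assumption: a language is a table from symbol names to the
# real functions they denote, and an expression is a single symbol name.
Language = Dict[str, Callable[[float], float]]


def extend(language: Language, symbol: str,
           meaning: Callable[[float], float]) -> Language:
    """Return L', a copy of L with one new constant symbol appended."""
    if symbol in language:
        raise ValueError(f"{symbol!r} is already a symbol of the language")
    extended = dict(language)   # nothing is removed, so E is contained in E'
    extended[symbol] = meaning  # the new symbol is itself an expression of L'
    return extended


def strange(x: float) -> float:
    """Hypothetical stand-in: 1 at the integers and 0 elsewhere."""
    return 1.0 if float(x).is_integer() else 0.0


L = {"sin": math.sin, "cos": math.cos}
L_prime = extend(L, "S", strange)

print(set(L) <= set(L_prime))  # property (i) of the theorem
print("S" in L_prime)          # property (ii) of the theorem
```

The extension step does no analysis at all; it only adds a name, which
is exactly why the result is trivial.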
Now, there is a philosophically burning point here that has been
overlooked. In particular, the language L' is obtained by fiat, and in
some sense this is unsatisfying. But, in fact, it seems to me that
this theorem highlights a different kind of troubling nature about the
Bernoulli definition of a function that is philosophically related to
all the classical problems that are similar in some way to the Sorites
problems. In particular, definition 1 is ambiguous. I can argue
that it should even have appeared overly ambiguous at the time of
Bernoulli to those who care about reducing ambiguity. For in fact, the
mere observation that someone as brilliant as Fourier missed out on the
construction of the strange function indicates to me that he fell
victim to the ambiguity of definition 1, which at the time was
essentially embodied in the inherent ambiguity in the definition of the
notion of a function, as it was used at that time, and still is used in
many books outside rigorous mathematics.
Consider the problem that is mentioned in many first-year calculus
books, for which an answer is given, but a solution is never presented:
``If possible, express the anti-derivative of e^(-x^2) in terms of
elementary functions.''
In fact, calculus books point out that this problem is insoluble, in
the sense that there is no ``expression'' for the anti-derivative of
e^(-x^2) in terms of elementary functions. The text we now use for
calculus (Stewart) includes a section in which the students form a new
language, for computing and approximating integrals, that is analogous
to the language L' in theorem 1 above, if the ``language of elementary
functions'' is chosen as being represented by L, and the corresponding
theorem would merely indicate that it is possible to be coherent when
writing such a section into a calculus text, because the question of
whether there is a solution to a problem is relative to the tools at
hand. (Given only the language of analysis of elementary functions,
the ``collection of expressions'' does not include the antiderivative
of e^(-x^2), even though in that language, one may fairly easily
demonstrate that such an anti-derivative exists. Given the powerful
language of analysis of trigonometric series, many applications
problems can be solved, and perhaps Fourier would have been considered
to be correct had he said ``every function applicable to some physical
process can be expressed in a Sine or Cosine series'', for I do not
think that the strange function appears very often as anything more
than a curiosity in the physical sciences.)
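A concrete instance of such an enlarged language is the one in which
the error function erf is admitted as a named symbol. The name erf
does not appear in the discussion above; I bring it in as an
assumption for illustration. With it, the anti-derivative of e^(-x^2)
that vanishes at 0 is (sqrt(pi)/2) erf(x). The following Python sketch
compares that named expression with a plain midpoint-rule
approximation of the integral; the function names and the step count
are my own choices.

```python
import math


def integrand(t: float) -> float:
    """The function e^(-t^2) whose anti-derivative is sought."""
    return math.exp(-t * t)


def antiderivative_numeric(x: float, steps: int = 10_000) -> float:
    """Approximate the integral of e^(-t^2) from 0 to x (midpoint rule)."""
    h = x / steps
    return h * sum(integrand((k + 0.5) * h) for k in range(steps))


def antiderivative_named(x: float) -> float:
    """The same quantity, written with the extra symbol erf."""
    return math.sqrt(math.pi) / 2 * math.erf(x)


for x in (0.5, 1.0, 2.0):
    print(x, antiderivative_numeric(x), antiderivative_named(x))
```

The two columns agree to many decimal places, although only the second
counts as an ``expression'', and only once erf has been granted a name.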
Now, if we take Bernoulli seriously, but grant him the courtesy of
anyone who is human - he had something worthwhile in mind, even if he
did not properly express it - then we might say that what Bernoulli
meant by the term ``expression'' was something like the meaning we have
used above, but with the fundamental building blocks of his ``language
of analysis'' being the same ones that were used by, for instance,
Baire, in classifying functions according to their level of complexity
as limits of functions ``previously described''. In this case, I would
say that he would include the strange function as an expression, for
pointwise limits were taken all the time, and, as I recall, the strange
function is constructed using those very tools of analysis, such as
pointwise limits of sequences of functions defined on the real line.
Goedel dealt this problem a tremendous blow with his
discovery of the axiom of constructibility. In this study, he proved
that if the theory ZF is consistent, then so is ZFC+GCH. But he went
further, and designed a set theoretic universe in which every set is
given by expressions in terms of previously defined sets. The axiom of
constructibility, denoted V=L, states that every set is constructible.
His actual result was that ZF+V=L is consistent, as long as ZF is
consistent. In models of ZF+V=L, every set, relation and function is
an expression, so this brings together definitions 1 and 2
completely. However, also, Cohen showed that if ZF is consistent, then
so is ZF+not(V=L). That is, it is also okay to assume that definitions
1 and 2 describe different functions. Now the ``expressions'' are not
at all what Bernoulli and Fourier envisioned. They are developed in
terms of transfinite ordinals, and a well-ordering of the universe
ensues. This makes the assumption that everything is given by an
expression take on a new unintended meaning, by creating a setting in
which many counterintuitive results can be proved, because of the
Generalized Continuum Hypothesis (GCH) and the definable well-ordering
of the universe of sets. Goedel and Cohen, and now other
mathematicians, such as Woodin, have decided that the axiom of
constructibility goes too far, and, for example, Woodin suggests that
the Continuum Hypothesis (CH) should be violated in such a way that the
cardinality of the continuum is aleph_2.
I take a different stance. I accept ZFC, but I reject the axiom of
constructibility strongly. I contend that in some sense, ``most''
functions (and sets and relations, etc.) are not given by expressions.
In a sense, I deny the axiom of constructibility ``locally
everywhere''. In particular, I suggest that the cardinality of the
continuum should be aleph_{2^aleph_0}. This is as large as it can
theoretically be, and this means that the types of subsets of the real
line are as varied as they can possibly be, in any model of ZFC.