Supernumeration: Vagueness and Numbers

Peter Simons[1]

Abstract. After reviewing the principal and well-known defects of fuzzy logic, this paper proposes to use numerical values in conjunction with a supervaluational approach to vagueness. The two principal ideas are degrees of candidature and expected truth-value. I present an outline of the theory, which combines vagueness of predicates and vagueness of objects, and discuss its pros and cons.

INTRODUCTION

There is a notable bifurcation between what philosophers think and say about vagueness and what people do who have to deal with it practically. There is a widespread consensus in the philosophical literature on vagueness that fuzzy logic, which essentially includes the assignment of numerical values to represent degrees of truth of vague sentences, is a flawed method, and that some other theory is to be preferred if we are to give a correct account of vagueness. When it comes to practical applications, however, for people with actual problems to solve and computers and software to hand, fuzzy logic is the overwhelmingly predominant approach. Such applications include:

- Geographical Information Systems (GIS)
- Medical Diagnostics and Treatment (Expert Systems)
- Astrophysical Data
- Data Mining and Data Fusion
- Control Systems

None of these is insignificant.

[1] A previous version of this paper presented at St. Andrews benefitted from critical discussion by Timothy Williamson, Dorothy Edgington and other conference participants. I am also grateful for critical remarks by two anonymous referees for Oxford University Press. The approach has most affinities with that of Dorothy Edgington, who also believes we need numerical measures in considering vagueness: see Edgington 1997. Like Edgington, I exploit the analogy with probability theory, as in two previous papers (Simons 1997, 1999), but I stress that vagueness and probability are two different things, so the analogy must be exploited with care.
By philosophical lights, this work is all either mistaken or concerned with something other than vagueness. By the lights of applied science, philosophers have their heads stuck well and truly either in the clouds or in the sand, or, paraconsistently, both. I advocate a way out of this impasse which addresses the concerns of both sides.

FUZZY LOGIC

In fuzzy logic at its most basic, a vague statement is assigned a real number v as its truth-value: v = 1 represents classical, complete or total truth, v = 0 represents classical falsity, and 0 < v < 1 represents a non-classical or in-between case. This scheme has certain marked advantages: it gives a way of calculating with truth-values in a simple extensional (value-functional) logic. If we symbolize the truth-value of a statement p by |p|, the truth-values for negations, conjunctions and universal quantifications are given by

|¬p| = 1 − |p|
|p ∧ q| = min(|p|, |q|)
|∀x A[x]| = min_x |A[x]|

Vague statements are those which do not have a classical truth-value (0 or 1), and the numbers take account of the intuition that some statements are closer to truth (or falsity) than others. The scheme also has a very simple and plausible account of Sorites paradoxes. In a Sorites sequence a wholly true premiss and a long sequence of almost-true implications lead via many applications of modus ponens to a wholly false conclusion, the minute drops in truth-value at each step cumulating to an overall drop from 1 to 0.

Against these theoretical and practical advantages stand two serious theoretical flaws. Firstly, contradictions need not be false, tautologies need not be true, and a contradiction may have the same truth-value as a tautology: if |p| = 0.5 then |p ∧ ¬p| = |p ∨ ¬p| = 0.5. This makes statements appear vague which definitely are not, and therefore makes nonsense of hedging.
If we are unsure whether someone is bald, for example, we can hedge, not only by saying something like "Well, he's on the way to bald", but in the extreme case, by retreating to "Well, at least he's bald or he's not." According to fuzzy logic, we may gain no security at all by so hedging, which is absurd.[2] Secondly, unclear cases are required to have a precise fuzzy truth-value, one real number out of a continuum of others. This imparts to vague statements, which seem to have no clear truth-value, a spurious exactness. These seem to me, as to many others, to be finally damning reasons why fuzzy logic cannot capture the phenomenon of vagueness. Less crucially, fuzzy logic needs to resort to special tricks to cope with so-called higher-order vagueness.

WHY PRACTITIONERS USE FUZZY LOGIC

If fuzzy logic is theoretically such a no-hoper, why do practitioners use almost nothing else? Crucially, because it is numerical, it is easy to develop numerical and algorithmic methods for dealing with the data. There exist algorithms and software packages, help and discussion in superabundance. A simple Google™ search for 'software' + 'fuzzy logic' returned 1.65 million hits, whereas 'software' + 'supervaluation' returned 679.[3] Also, not all fuzzy reasoning concerns truth-values. Many data are already quantitative and consist of approximate values for such parameters as mass, length, failure quota, and other statistical measures. Because it was the first approach to be used in programs, fuzzy logic cornered the market, and in engineering there is a strong founder effect.[4] There is no effective alternative that practitioners can use, there exists a plethora of methods to suit different situations, and fuzzy logic gets results.
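The value-functional scheme and the hedging failure described in the previous section can be put in a few lines of code. The following is a minimal sketch; the function names are my own, not taken from any fuzzy-logic package:

```python
# A minimal sketch of basic fuzzy logic as described above.

def f_not(p):
    """|not-p| = 1 - |p|"""
    return 1 - p

def f_and(p, q):
    """|p and q| = min(|p|, |q|)"""
    return min(p, q)

def f_or(p, q):
    """|p or q| = max(|p|, |q|)"""
    return max(p, q)

def f_all(values):
    """|for all x, A[x]| = minimum over the instances"""
    return min(values)

# The flaw: for a borderline statement with value 0.5, the contradiction
# 'p and not-p' and the tautology 'p or not-p' receive the same value,
# so hedging with "he's bald or he's not" gains no security at all.
p = 0.5
print(f_and(p, f_not(p)))   # 0.5
print(f_or(p, f_not(p)))    # 0.5

# Sorites: minute drops in truth-value at each of many almost-true steps
# cumulate into an overall drop from 1 (wholly true) to 0 (wholly false).
value = 1.0
for _ in range(100):
    value = max(0.0, value - 0.01)   # each modus ponens step loses a little
print(value < 1e-9)   # True: the conclusion is (near enough) wholly false
```

The final lines make the two points of the previous section concrete: the borderline tautology and contradiction are priced identically, while the Sorites chain degrades smoothly from truth to falsity.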
That is not in itself an argument for the correctness of fuzzy logic's analysis of vagueness, but it does show that philosophical alternatives have signally failed to produce tools for use outside the philosophy room, leaving practitioners with no alternative but to use what from a philosophical and theoretical point of view is regarded as a flawed theory. Use of numbers, even if they are not God-written, can be more useful than their non-use out of theoretical purity.

[2] Thanks to a referee for this point.
[3] 18 August 2007. The catchiness of the term 'fuzzy logic' is only partly responsible.
[4] Thanks to the same referee.

VAGUE OBJECTS

An object is vague when it is unclear where it starts and finishes, or what its parts are. The mountain Helvellyn has no clear boundaries. Of many a small object, from the atomic scale to larger chunks of rock, it is unclear whether it is part of Helvellyn or not. Object vagueness thus arises because of a special case of predicate vagueness: the source is the vagueness of the predicate 'is part of'. Object vagueness raises the further question whether there are vague objects in reality (ontic vagueness). Leaving that question unanswered, I note only that a viable account of vagueness ought to be able to cope with object vagueness, whatever its source, as well as with predicate vagueness.

SUPERVALUATIONS

Philosophically the most favoured theory of vagueness employs supervaluations.[5] Here the idea is that vague statements are treated by considering a range of admissible precisifications, each of which makes the statements involved classically precise, i.e. true or false. A statement's truth-status is the result of considering all admissible precisifications for it.
If on all precisifications it is true, the statement is given the overall value of being true (sometimes called 'supertrue'); if on all precisifications it is false, the overall value is false ('superfalse'); if it is true on some precisifications and false on others, it receives no overall truth-value. Logically complex statements are evaluated first within each precisification, using classical logic, and the overall outcome assessed in the same way as above. The advantages of this approach are that it retains classical logical tautologies and contradictions; vague statements do not have a sharp truth-value; it is compatible with the world being sharp or exact; and it appears to locate vagueness in our concepts rather than in the world or in our beliefs. The disadvantages are that, while it seems some statements are vaguer than others, and some vague statements closer to truth or falsity than others, the supervaluation approach provides no measure of how vague a statement is, or how true one is. The notion of an admissible valuation, which is standardly employed in supervaluational approaches, seems itself not to be exact, but to make the approach work there needs to be a sharp cut-off. This again runs into the issue of higher-order vagueness. Then the logic, while preserving classical tautologies and contradictions, is not truth-functional, and it is contended that inference patterns for some statements are no longer classical.[6]

[5] See Fine 1975, Keefe 2000.
[6] This is controversial: against it, see Williams 200x.

WHAT IS TO BE DONE?

One thing that cannot be done is to carry on as if the philosophy on the one hand and the science and practice on the other had nothing to do with one another.
Since the scientific needs will not go away, I suggest it is opportune to look again at the use of numbers in connection with vagueness, and see if we can come up with a way of providing materials for algorithmic treatments of inexactness which are less philosophically objectionable than fuzzy logic.

EXPECTED TRUTH-VALUES

The approach I suggest combines aspects of supervaluations and fuzzy logic. From supervaluation theory it takes the idea of a range of different valuations, while from fuzzy logic it takes the idea of assigning numbers to truth- and other values. It then puts the two together to give what I call an expected truth-value for a statement. The term 'expected truth-value' is adapted from probability theory, where the expected value (mathematical expectation or mean) of a random variable is the sum of the probabilities of each possible outcome multiplied by its outcome value or "payoff". 'Expected' in this context carries no epistemic connotations: an "expected" outcome is not always even a possible outcome. For example the expected value for a single roll of a fair die is 3.5, which cannot be expected in any epistemic sense since there is no face with this value.

If x is a sharp object, and a is a vague object, let the goodness of candidature of x to be a, |x for a|, be a number in [0,1] with

Σ_x |x for a| = 1

The summation is over all candidates x whose goodness is non-zero, or it could be over all objects. There is no need for Angst about a sharp cut-off between candidates and non-candidates, since on a numerical approach non-candidates have goodness zero and contribute nothing to the sum, while their nearest candidate fellows have goodness almost zero and contribute almost nothing to the sum. If R is a vague predicate and Z is an exact predicate, similarly define the goodness of candidature of Z to be R as a number |Z for R| such that

Σ_Z |Z for R| = 1

where Z again ranges over all candidate relations.
If Z is (say) two-placed, then for any sharp objects x, y the statement xZy is true (1) or false (0): notate its truth-value as [xZy]. The expected truth-value (ETV) of the atomic statement aRb, written ||aRb||, is defined as

||aRb|| = Σ_{x,Z,y} |x for a| · |Z for R| · |y for b| · [xZy]

This is also a number in [0,1], so can be reckoned with. However the method for simple atomic statements does not generalize in the obvious way to complex statements, because we need to take account of what Fine calls penumbral connections, that is, logical relations among vague predicates.[7] Consider two people: a, aged 41, and b, aged 39. It is absolutely and determinately true that a is older than b. Take now the two vague predicates 'old' and 'young'. The ETVs ||a is young|| and ||b is old|| are both (we may suppose) non-zero. But if we attempt to calculate the ETV of 'a is young and b is old' in the obvious way as

||a is young and b is old|| = Σ_{F,G} |F for young| · |G for old| · [Fa ∧ Gb]

then, since there are candidates for 'young' which make a young and candidates for 'old' which make b old, if these are allowed to vary independently we get that ||a is young and b is old|| > 0, which is absurd. It is wrong to allow these predicates to be precisified independently, since they are connected in meaning. Any precisification of the two predicates must respect the following three constraints:

No one is both old and young
No one who is not young is younger than someone who is young
No one who is not old is older than someone who is old[8]

which means they must be precisified together in a linked and constrained way.

[7] Fine 1975.
[8] Clearly the last two are instances of a general kind: nothing which is not F is F-er than something which is F, where it is admissible to form the comparative.
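The contrast between independent and linked precisification can be put numerically. In the following sketch the age cut-offs and the goodnesses are illustrative assumptions of mine, not values fixed by the theory: 'young' is precisified as 'younger than a cut-off', 'old' as 'at least a cut-off'.

```python
# a is determinately older than b.
a_age, b_age = 41, 39

def young(age, cut): return age < cut
def old(age, cut): return age >= cut

# Independent candidate precisifications, fuzzy-logic style, with
# goodnesses summing to 1 (illustrative numbers).
young_cands = [(42, 0.5), (40, 0.5)]   # |F for young|
old_cands = [(39, 0.5), (42, 0.5)]     # |G for old|

# ||a is young and b is old|| with F and G varying independently:
etv_indep = sum(gf * gg * (young(a_age, f) and old(b_age, g))
                for f, gf in young_cands for g, gg in old_cands)
print(etv_indep)   # 0.25: non-zero, which is absurd

# Linked precisification 'F;G for young;old': the two predicates are
# precisified together, each joint candidate respecting the constraint
# that no one is both old and young (young-cut <= old-cut).
linked_cands = [((40, 40), 0.5), ((42, 42), 0.5)]   # |F;G for young;old|
etv_linked = sum(g * (young(a_age, yc) and old(b_age, oc))
                 for (yc, oc), g in linked_cands)
print(etv_linked)  # prints 0.0, as required
```

The independent calculation lets a crossed pair of cut-offs count the same person as both young and old, which is exactly what the penumbral constraint forbids; restricting the summation to linked joint candidates removes those crossed pairs.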
If we notate this linked precisification as 'F;G for young;old' then the ETV for 'a is young and b is old' is

||a is young and b is old|| = Σ_{F;G} |F;G for young;old| · [Fa ∧ Gb]

and we will obtain that ||a is young and b is old|| = 0, as required. There can be links also between candidates for vague objects, in particular if these are related by part–whole relations. For example, if x and y are candidates for being a certain marsh m, and x is a proper part of y, and we are interested in the ETV of 'm is a large marsh', then under no precisification LM of 'large marsh' can we have LMx but not LMy. Similarly, if a is a vague object and x and y are two candidates for a, even though x ≠ y is absolutely true, and both x and y have non-zero candidature to be a, the ETV ||a = a|| must be 1: in other words we need to evaluate the ETV via the predicate λx[x = x] and not via the predicate λxλy[x = y]. When considering the ETV of any complex statement containing predicates and objects which are penumbrally linked, we must therefore evaluate each precisification for all the linked terms together, which is the method of supervaluations, and not separately, which is the method of fuzzy logic. In this respect the approach is much closer to supervaluationism than to fuzzy logic, the principal difference being in the assignment of "goodnesses" to precisifications.

AN EXAMPLE

A commonly instanced example of a vague predicate is tall. In order to show the approach in action we shall look at a small range of interconnected vague predicates: tall, short, of medium height, very tall, very short. To avoid contextual complications we consider a single population of adult males at a single time. We assume that there is a height function h defined on the population so that at this time the height of an individual a is h(a), and we ignore diurnal height variation.
If the population is large enough it is known that the height distribution approximates closely to the normal or Gaussian distribution given by the probability density function

f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / 2σ²)

where μ is the mean and σ is the standard deviation of the heights in the population. The actual values are furnished by the cumulated individual facts. We can consider a range of heights or, more practically, height-intervals, which are as precise as we need to make them, and look at the individuals falling in these height-intervals. The question is then how we arrive at expected truth-values for the vague predicates tall, short, of medium height, very tall, very short, so that the penumbral connections of the predicates is taller than, is shorter than and is as tall as with one another and with the mentioned vague predicates are all suitably respected. The first thing to note is that, within the margin of error given by the width of the height-intervals, the binary predicates is taller than, is shorter than, is as tall as are precise (classical). Of these, the first two are asymmetric and transitive, the last is an equivalence. Their truth-values in any case can be deduced by looking at the relative values of h(a) and h(b) for a given pair of men a, b. The constraints to be respected are that, for any precisification of the predicates:

if a is very tall then a is tall
if a is very short then a is short
if a is tall then a is not short
if a is very tall and b is tall but not very tall then a is taller than b
if b is tall and c is of medium height then b is taller than c
if c is of medium height and d is short then c is taller than d
if d is short but not very short and e is very short then d is taller than e
a is of medium height iff a is neither tall nor short
if a is of average height (μ), a is neither tall nor short

The last constraint means that the cut-offs for tall and short must be above, respectively below, the mean.
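These constraints can be checked mechanically. The sketch below assumes a precisification is specified by its four lower cut-offs (for short, medium, tall and very tall); on that representation the ordering of the cut-offs secures the entailment and comparative constraints, and one further check secures that someone of average height is neither tall nor short. The density function is the Gaussian just given.

```python
from math import exp, pi, sqrt

def height_density(x, mu, sigma):
    """The Gaussian probability density f(x) for heights."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def admissible(s_cut, m_cut, t_cut, vt_cut, mu):
    """Check a candidate precisification, given as lower cut-offs,
    against the constraints: the categories must be properly ordered,
    and the mean height must fall in the medium band."""
    ordered = s_cut < m_cut < t_cut < vt_cut
    mean_is_medium = m_cut <= mu < t_cut
    return ordered and mean_is_medium

print(admissible(165, 169, 183, 187, 176))  # True
print(admissible(165, 179, 183, 187, 176))  # False: the mean would count as short
```
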
This strongly suggests we should treat the penultimate constraint as a definition of 'is of medium height'. Suppose the mean height in the population is 176 cm and that we consider intervals of 2 cm from the odd-number cm heights, so the group of average height is in the range 175–177 cm. We look at the following five precisifications and tabulate their lower cut-offs:

             P1    P2    P3    P4    P5
short       165   169   167   165   163
medium      169   171   171   173   173
tall        183   181   181   179   179
very tall   187   183   185   187   189
Goodness    0.1   0.3   0.3   0.2   0.1

The goodnesses assigned to precisifications in this mini-example have been done intuitively, but more methodical ways to do so would be to look at the height distribution curve and consider the percentiles assigned to the different categories, or, more empirically, to do a survey of people's opinions on which men are tall, short and so on. We would also expect a much larger number of precisifications to be used. We are using a small number for illustration only, and while we have for simplicity arrayed tall respectively very tall symmetrically to short respectively very short about the mean, in real life this may not be what happens. Taking now a range of different heights for men, the expected truth-values of their falling under the various predicates, given the precisifications above and their respective goodnesses, are as given in the table below, generated by simple spreadsheet calculation:

Height    VS     S     M     T    VT
162        1     1     0     0     0
164      0.9     1     0     0     0
166      0.6     1     0     0     0
168      0.3     1     0     0     0
170        0   0.9   0.1     0     0
172        0   0.3   0.7     0     0
174        0     0     1     0     0
176        0     0     1     0     0
178        0     0     1     0     0
180        0     0   0.7   0.3     0
182        0     0   0.1   0.9     0
184        0     0     0     1   0.3
186        0     0     0     1   0.6
188        0     0     0     1   0.9
190        0     0     0     1     1

Again the numbers should not be taken too seriously, but they do illustrate how the gradations between for example very tall and tall, or short and medium-sized, give useful information about the gradual transitions in a way that a non-numerical supervaluational approach does not.
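The "simple spreadsheet calculation" can be reproduced in a few lines: each precisification delivers classical verdicts, and the expected truth-value is their goodness-weighted sum. This is a sketch using the five tabulated precisifications; the rounding is for display only.

```python
# Lower cut-offs (cm) and goodnesses for the five precisifications above.
precisifications = [
    # (short, medium, tall, very tall, goodness)
    (165, 169, 183, 187, 0.1),
    (169, 171, 181, 183, 0.3),
    (167, 171, 181, 185, 0.3),
    (165, 173, 179, 187, 0.2),
    (163, 173, 179, 189, 0.1),
]

def verdicts(h, cuts):
    """Classical truth-values (VS, S, M, T, VT) under one precisification."""
    s_cut, m_cut, t_cut, vt_cut = cuts
    very_short = h < s_cut
    short = h < m_cut            # 'very short' entails 'short'
    medium = m_cut <= h < t_cut  # neither tall nor short
    tall = h >= t_cut            # 'very tall' entails 'tall'
    very_tall = h >= vt_cut
    return [very_short, short, medium, tall, very_tall]

def etv_row(h):
    """Goodness-weighted expected truth-values for a given height."""
    row = [0.0] * 5
    for *cuts, goodness in precisifications:
        for i, v in enumerate(verdicts(h, cuts)):
            row[i] += goodness * v
    return [round(x, 2) for x in row]

# Prints the ETV table row by row, matching the tabulation above.
for h in range(162, 192, 2):
    print(h, etv_row(h))
```

Note how the penumbral constraints are enforced structurally: within each precisification the verdicts are derived from one ordered set of cut-offs, so no single precisification can, for instance, count someone as both tall and short.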
At the same time the constraints respecting penumbral connections are reflected in the table. ADVANTAGES AND PROBLEMS This numerical-supervaluational approach, which for short I call supernumeration, as here outlined promises some clear advantages over the alternatives: it yields calculable numerical values (like fuzzy logic); it gives tautologies the value 1 and contradictions the value 0 (like supervaluations); unlike supervaluations, it quantifies goodness of case; it is iterable; and it minimizes the effect of higher-order vagueness because the numerical contribution of cases near to the extremes 1 and 0 is close to the values for those extreme cases. It further generalizes using integration to the infinite case, via the notions of candidature and truth-value density functions, analogously to the way finitary probability generalizes to the infinite case; it can be applied equally well to quantities other than truth-values, such as mass, size etc.; and finally, it does not need to deny that the world in itself is sharp (whether we wish to affirm this is another matter). There are equally some obvious prima facie disadvantages with supernumeration: it is complicated, and harbours elements of arbitrariness. There are two obvious issues about arbitrariness in applying the method. One is where to get the numbers from. Too much should not be made of this issue: goodness of candidature is not something writ in the heavens, but is a constrained numerical estimate. Fuzzy logic is often accused of introducing spurious and indeed ridiculous hyperexactness into what is after all a vague and fuzzy matter. This is only a serious problem if the numbers are taken to be God-given real values existing independently of us. If the assignment of numbers is construed instrumentally, as a way we can work with otherwise intractable or unquantifiable properties, then they can be taken with metaphysical lightness. 
The more serious problems of fuzzy logic, concerning its value-functionality, remain even when the numbers are taken lightly in this way. In actually used fuzzy logic, computation typically allows numerical values to be varied and an algorithm run repeatedly to see how far the result deviates from other results with different values, and the same could be done here. Also the problem of infinite or large finite ranges of values is taken in hand by considering finitely many subranges of values, rather as taxation authorities divide incomes of taxpayers into different bands for the purposes of applying different rules, or statisticians divide continuous samples into bands for numerical treatment. Mapmakers provide contour lines cutting land surfaces at (e.g.) 10 m intervals as ways to present complex relief: this is a necessary simplification, as is the more obvious device of colouring relief at different heights. False colour images from satellite and astronomical data are another presentational device that is frankly accepted as a necessary simplification.

The other problem is that of discerning the constraints imposed on precisifications by penumbral connections. Here there appears to be no simple or uniform procedure or algorithm: it is not like logic. Again this mirrors what happens in applied probability. Each statement or type of statement needs to be looked at in its own terms, relying on the judgement, common sense, and accumulated semantic expertise of the investigator or investigators. Given the complexity of our vague language, this is only to be expected.

References

Edgington, D. 1997. Vagueness by Degrees. In R. Keefe & P. Smith, eds., Vagueness: A Reader. Cambridge, Massachusetts: MIT Press, 294–316.
Fine, K. 1975. Vagueness, Truth and Logic. Synthese 30, 265–300.
Keefe, R. 2000. Vagueness. Cambridge: Cambridge University Press.
Simons, P. M. 1997. Vagueness, Many-Valued Logic, and Probability. In W.
Lenzen, ed., Das weite Spektrum der Analytischen Philosophie – Festschrift für Franz von Kutschera. Berlin/New York: de Gruyter, 307–322.
Simons, P. M. 1999. Does the Sun Exist? The Problem of Vague Objects. In T. Rockmore, ed., Proceedings of the XX World Congress of Philosophy. Vol. II: Metaphysics. Bowling Green: Philosophy Documentation Center, 89–97.
Williams, J. R. G. 200x. Supervaluationism and Logical Revisionism. Forthcoming in The Journal of Philosophy.