Alternative-Sensitivity of Likely and Probable:
Linguistic and Psychological Implications
Daniel Lassiter, New York University and Institute of Philosophy
lassiter@nyu.edu
What do Likely and Probable Mean? The typical answer is that likely and
probable are synonyms, and that φ is likely/probable if and only if φ is more likely
than ¬φ (Kratzer 1991). In probabilistic terms this is equivalent to: φ is
likely/probable iff prob(φ) > 0.5 (cf. Yalcin 2010).
Psychological Results on Alternative-Sensitivity. Teigen (1988) found that
subjects consistently deemed each of multiple mutually exclusive possibilities
“probable” if they were comparable in likelihood and no other outcome was
significantly more likely. He concludes that “people are willing to use expressions like
‘probable’ to characterize chances thought to be clearly below 50%”. Windschitl &
Wells (1999) found that subjects provided more optimistic estimates of a low-probability outcome A if it was (one of) the most likely among a set of alternatives
than if another alternative B was more likely, even when A's probability was held
constant.
A Psychological Interpretation. If likely means “more likely than not”, no two
mutually exclusive events can both be likely, and the distribution of alternatives
should not matter at all. Teigen concludes that his results show “overestimation of
chances” and “violations of the distributive law of probability theory”. Windschitl &
Wells go further, claiming that their results reveal the effects of a non-rule-based,
associative, “gut-level” system used for verbal probability judgments. These
interpretations are in line with the large literature arguing that humans are bad at
probabilistic reasoning (cf. Kahneman & Tversky 1974), but it is not clear how to make
sense of them in formal semantic terms.
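The first step of this paragraph, that no two mutually exclusive events can both be likely on the threshold reading, can be spelled out as a short derivation. This is a sketch using the prob notation from above; ψ is an illustrative name for a second proposition incompatible with φ.

```latex
\begin{align*}
  &\text{Assume } \varphi \text{ and } \psi \text{ are mutually exclusive, so }
    \mathrm{prob}(\varphi) + \mathrm{prob}(\psi) = \mathrm{prob}(\varphi \lor \psi) \le 1. \\
  &\text{If } \mathrm{prob}(\varphi) > 0.5 \text{, then }
    \mathrm{prob}(\psi) \le 1 - \mathrm{prob}(\varphi) < 0.5. \\
  &\text{Hence } \varphi \text{ and } \psi \text{ cannot both satisfy }
    \mathrm{prob}(\cdot) > 0.5.
\end{align*}
```

On this reading, judging several mutually exclusive outcomes "probable" at once can only be counted as an error.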
A Semantic Interpretation. I argue that alternative-sensitivity in probability
judgments is a semantic fact, not a logical failing of experimental subjects or evidence
for an associative system for verbal probabilities. Evidence comes from two domains:
first, likely and probable belong to a class of GRADABLE ADJECTIVES for which such
effects are the norm; second, they are FOCUS-SENSITIVE, pointing to specifically
linguistic sensitivity to alternatives.
Relative Adjectives. Likely and probable fall among the RELATIVE ADJECTIVES, an
example of which is tall (Kennedy & McNally 2005). This is diagnosed by the
acceptability of modifiers:
(1) Jeffrey is very/extremely/??completely/#slightly/#half tall.
(2) It is very/extremely/??completely/#slightly/#half likely that it will rain.
Relative adjectives are sensitive to COMPARISON CLASSES. But tall does not just mean
“taller than average for comparison class C”: if the average for C is 5′6′′, someone who
is 5′6.2′′ will be taller than average for C but not “tall” (Fara 2000). The meaning of
tall is roughly as in (3):
(3) [|tall|] = λX<e,t>λx<e> . x is SIGNIFICANTLY taller than average/normal/expected for X
Note that (3) also presupposes x ∈ X: Harold is tall for a jockey is infelicitous if
Harold is not a jockey.
As with tall, φ is not “likely” if it is just barely more likely than ¬φ (Yalcin
2010). This suggests:
(4) [|likely|] = λp<s,t> . p is significantly more likely than ¬p
This denotation does not have a comparison class argument, though. In fact likely
does not seem to accept overt comparison classes:
(5) ?? It is likely that it will rain for a summer's day.
This sentence does not have the intended reading “Rain is more likely than is typical
in the summer”.
Focus. Despite this apparent difference between tall and likely, there is evidence
that likely has a comparison class argument. Imagine a lottery with a million tickets,
in which one individual, Mr. Burns, is determined to win and buys 300,000. The rest
are evenly distributed among the inhabitants of Springfield. Many speakers find a
contrast between (6) and (7) in this scenario:
(6) It is likely that [MR. BURNS will win the lottery]. (True)
(7) It is likely that [Mr. Burns will WIN THE LOTTERY]. (False)
This is not grammatical focus-sensitivity, though. Beaver & Clark (2008) show that
conventionally focus-sensitive operators like only must c-command the focus. This is
not necessary for FREE ASSOCIATION WITH FOCUS (FAF), which, they argue, explains the
focus-sensitivity of always.
(8) Frank is the one who I always see.
(9) Frank is the one who I only see.
Frank can be the focus in (8), but not (9) – that is, (9) cannot mean “I don't see
anyone but Frank”. Likely patterns with always on this test: (10) can be interpreted as
equivalent to (6).
(10) Mr. Burns is the one who is likely to win the lottery.
We cannot go into the details of Beaver & Clark's analysis here, but they show that
FAF occurs primarily with expressions like always that have implicit domain
arguments. In the case of likely, I suggest, this is the set of propositions with which
the complement clause is compared for likelihood.
(11) [|likely|] = λP<st,t>λp<s,t> . p is significantly more likely than average/normal/
expected for P
As in Beaver & Clark's treatment of always, the implicit argument is filled by the set of
focus alternatives, a partition of the common ground corresponding to the complete
and relevant answers to the Question Under Discussion (QUD, cf. Roberts 1996).
Various properties of likely follow, e.g. that φ and ¬φ cannot both be likely in the
same context (see Yalcin 2009 for this worry).
The contrast in (6-7) now follows from the fact that their complements
respond to different QUDs:
(12) Alt((6)): Who will win? → {Mr. Burns will win, Bart will win, Millhouse will
win, ... }
(13) Alt((7)): What will Mr. Burns do? → {Mr. Burns will win, Mr. Burns will not
win}
This also explains why the standard analysis often gets the truth-conditions right, if
we adopt (14):
(14) Alt(φ) defaults to {φ, ¬φ} - i.e., QUD defaults to ?φ - unless context or focus
supplies another value.
Since linguistic examples are usually presented out of context, (14) and the
denotation in (11) will conspire to make comparing a proposition to its negation the
default option for likely and probable.
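The interaction of (11) and (14) in the lottery scenario can be illustrated with a small numerical sketch. Several things here are assumptions made only for illustration and are not part of the analysis: "significantly more likely than average" is modelled as exceeding the mean probability of the alternatives by a hypothetical factor `margin`, the remaining 700,000 tickets are split among a hypothetical 7,000 other residents, and the function names are invented.

```python
def is_likely(p, alternatives, margin=1.2):
    """Sketch of (11): is p 'likely' relative to a set of alternatives?

    `alternatives` maps each alternative proposition to its probability.
    ASSUMPTION: 'significantly more likely than average' is modelled as
    exceeding the mean probability of the alternatives by `margin`.
    """
    if p not in alternatives:
        raise ValueError("p must be one of the alternatives")
    average = sum(alternatives.values()) / len(alternatives)
    return alternatives[p] > margin * average


def default_alternatives(p, prob):
    """Sketch of (14): with no QUD supplied, compare p with its negation."""
    return {p: prob, "not " + p: 1 - prob}


# Lottery scenario: 1,000,000 tickets, of which Mr. Burns holds 300,000.
# ASSUMPTION: the other 700,000 are split evenly among 7,000 residents.
n_others = 7_000
who_will_win = {"Burns wins": 0.3}
for i in range(n_others):
    who_will_win[f"resident {i} wins"] = 0.7 / n_others

# The alternatives in (13) coincide with the default set from (14).
will_burns_win = default_alternatives("Burns wins", 0.3)

print(is_likely("Burns wins", who_will_win))    # True, as in (6)
print(is_likely("Burns wins", will_burns_win))  # False, as in (7)
```

The same proposition, with the same probability of 0.3, comes out likely against the "Who will win?" alternatives and not likely against its own negation. Only the alternative set differs.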
Average or Maximum? Suppose outcome A has probability .4, B has probability
.15, and the remaining probability is distributed evenly over 100 alternatives. (11)
allows that A and B may both come out as ‘likely’, depending on the value of
‘significantly’. This seems odd. One possibility is that likely refers not to an average
but to a maximum value, so that only the most likely alternative(s) count as ‘likely’.
However, this account predicts wrongly that ‘likely’ should pattern with the
MAXIMUM-STANDARD adjectives like ‘full’ – but these accept different degree modifiers,
and are not sensitive to comparison classes.
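The arithmetic behind this scenario can be checked directly. The sketch below is only illustrative: `margin` is a hypothetical stand-in for the contextual value of ‘significantly’, and the two function names are invented to contrast the average-based reading in (11) with the maximum-based alternative.

```python
# Probabilities from the scenario: A = .4, B = .15, and the remaining .45
# spread evenly over 100 further alternatives.
probs = {"A": 0.40, "B": 0.15}
probs.update({f"other_{i}": 0.45 / 100 for i in range(100)})

average = sum(probs.values()) / len(probs)  # roughly 1/102


def likely_by_average(name, margin):
    """Reading (11); `margin` is a hypothetical proxy for 'significantly'."""
    return probs[name] > margin * average


def likely_by_maximum(name):
    """Maximum-based alternative: only the top-ranked outcome(s) qualify."""
    return probs[name] == max(probs.values())


for name in ("A", "B"):
    ratio = probs[name] / average
    print(name, round(ratio, 1),
          likely_by_average(name, margin=2.0),
          likely_by_maximum(name))
```

With 102 alternatives the average probability is about 0.01, so A is roughly 40 times the average and B roughly 15 times. B fails the average-based test only if ‘significantly’ demands more than a fifteenfold excess, whereas the maximum-based test admits A alone.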
Exhaustivity. We maximize the similarity between likely and other relative
adjectives, and minimize stipulations about its meaning, if we take the intuition that
A and B are not both likely to be a secondary inference. When A and B are
alternatives, an assertion of B is likely triggers an exhaustivity inference to the effect
that A is not likely. As a result, B is likely is at best misleading when A is more likely
than B. This account correctly predicts no similar inference for tall (whose
alternatives are not propositions) or for possible (which is not sensitive to
comparison classes).
Explaining the Lack of Overt Comparison Classes. If this account is correct,
why don't likely and probable accept comparison class arguments? I suggest that this
is due to the fact that “x is A for a C” presupposes x ∈ C, which cannot be satisfied in a
likely for a NP construction. That is, (5) is infelicitous because the proposition that it
will rain is not an instance of a summer's day.
Psychological Implications. Subjects' sensitivity to the distribution of
alternatives in probability judgments does not necessarily show that they are
inconsistent or illogical. In this case, it shows instead that our assumptions about the
meanings of probability expressions were mistaken. Alternative-sensitivity is
expected for relative adjectives like likely, probable, and their derivatives. This does
not, to be sure, show that people are good at probabilistic reasoning – there are
many more results that suggest the opposite. However, it does suggest that the
conclusions drawn by psychologists in this domain should be carefully scrutinized for
unjustified semantic assumptions: perhaps people are not quite so bad at
probabilistic reasoning as we have thought.
Reference List
Beaver, David, & Brady Clark. 2008. Sense & Sensitivity. Malden: Blackwell.
Fara, Delia Graff. 2000. Shifting sands. Philosophical Topics 28(1).
Kahneman, Daniel, & Amos Tversky. 1974. Judgment under uncertainty: Heuristics
and biases. Science 185.
Kennedy, Christopher, & Louise McNally. 2005. Scale structure, degree modification,
and the semantics of gradable predicates. Language 81(2).
Kratzer, Angelika. 1991. Modality. In A. von Stechow & D. Wunderlich (eds.), Semantics: An
International Handbook of Contemporary Research. Walter de Gruyter.
Roberts, Craige. 1996. Information structure in discourse. In J.-H. Yoon & A. Kathol
(eds.), Ohio State University Working Papers in Linguistics vol. 49: Papers in
Semantics.
Teigen, Karl. 1988. When are low-probability events judged to be ‘probable’? Effects
of outcome-set characteristics on verbal probability estimates. Acta
Psychologica 67(2).
Windschitl, Paul D., & Gary L. Wells. 1999. The alternative-outcomes effect. Journal
of Personality and Social Psychology 75(6).
Yalcin, Seth. 2009. The language of probability. Handout from talk at the
Department of Linguistics, University of California at Berkeley, November 30
2009.
Yalcin, Seth. 2010. Probability Operators. Philosophy Compass 5(11).