Procedure Matters - TerpConnect

advertisement
Procedure Matters
Paul M. Pietroski
University of Maryland
Dept. of Linguistics, Dept. of Philosophy
http://www.terpconnect.umd.edu/~pietro
Most of the dots are yellow
15 dots:
9 yellow
6 blue
‘Most of the dots are yellow’
#{DOT & YELLOW} > #{DOT}/2
More than half of the dots are yellow
(9 > 15/2)
#{DOT & YELLOW} > #{DOT & YELLOW}
The yellow dots outnumber the nonyellow dots (9 > 6)
#{DOT & YELLOW} > #{DOT} – #{DOT & YELLOW}
The number of yellow dots exceeds
the number of dots minus the number of yellow dots
(9 > 15 – 9)
‘Most of the dots are yellow’
MOST[DOT(x), YELLOW(x)]
#{x:DOT(x) & YELLOW(x)} > #{x:DOT(x)}/2
More than half of the dots are yellow
(9 > 15/2)
#{x:DOT(x) & YELLOW(x)} > #{x:DOT(x) & YELLOW(x)}
The yellow dots outnumber the nonyellow dots (9 > 6)
#{x:DOT(x) & YELLOW(x)} > #{x:DOT(x)} – #{x:DOT(x) & YELLOW(x)}
The number of yellow dots exceeds
the number of dots minus the number of yellow dots
(9 > 15 – 9)
Most of the dots are yellow
15 dots:
9 yellow
6 blue
Hume’s Principle
#{x:T(x)} = #{x:H(x)}
iff
{x:T(x)} OneToOne {x:H(x)}
____________________________________________
#{x:T(x)} > #{x:H(x)}
iff
{x:T(x)} OneToOnePlus {x:H(x)}
α OneToOnePlus β iff for some α*,
α* is a proper subset of α, and α* OneToOne β
(and it’s not the case that β OneToOne α)
‘Most of the dots are yellow’
MOST[DOT(x), YELLOW(x)]
#{x:DOT(x) & YELLOW(x)} > #{x:DOT(x)}/2
#{x:DOT(x) & YELLOW(x)} > #{x:DOT(x) & YELLOW(x)}
#{x:DOT(x) & YELLOW(x)} > #{x:DOT(x)} – #{x:DOT(x) & YELLOW(x)}
OneToOnePlus[{x:DOT(x) & YELLOW(x)}, {x:DOT(x) & YELLOW(x)}]
‘Most of the dots are yellow’
MOST[D, Y]
OneToOnePlus[{D & Y},{D & Y}]
#{D & Y} > #{D & Y}
#{D & Y} > #{D}/2
#{D & Y} > #{D} – #{D & Y}
???Most of the paint is yellow???
Many Conceptions of Human Languages
•
•
•
•
•
complexes of “dispositions to verbal behavior”
strings of an elicited (or nonelicited) corpus
a procedure that generates an independently specified corpus
something a radical interpreter ascribes to a speaker
“Something which assigns meanings to certain strings of types
of sounds or marks. It could therefore be a function, a set of
ordered pairs of strings and meanings.”
Many Conceptions of Human Languages
• a biologically implementable procedure that generates
expressions, which may be characterizable only in terms
of the procedure that generates them
Many Conceptions of Human Languages
•
•
•
•
•
•
complexes of “dispositions to verbal behavior”
strings of an elicited (or nonelicited) corpus
strings of (perhaps written) words in some corpus
a procedure that generates an independently specified corpus
something a radical interpreter ascribes to a speaker
“a set of ordered pairs of strings and meanings”
• a biologically implementable procedure that generates
expressions, which may be characterizable only in terms
of the procedure that generates them
Many Conceptions of Human Languages
•
•
•
•
•
•
complexes of “dispositions to verbal behavior”
strings of an elicited (or nonelicited) corpus
strings of (perhaps written) words in some corpus
a procedure that generates an independently specified corpus
something a radical interpreter ascribes to a speaker
“a set of ordered pairs of strings and meanings” (E-Languages)
• a biologically implementable procedure that generates
expressions, which may be characterizable only in terms
of the procedure that generates them (I-Languages)
‘I’ Before ‘E’
Alonzo Church
(of Church-Turing fame)
function-in-intension vs. function-in-extension
--a procedure that pairs inputs with outputs in a certain way
--a set of ordered pairs (with no <x,y> and <x, z> where y ≠ z)
I-Language/E-Language
function in Intension
implementable procedure
that pairs inputs with outputs
function in Extension
set of input-output pairs
|x – 1|
+√(x2
– 2x + 1)
{…(-2, 3), (-1, 2), (0, 1), (1, 0), (2, 1), …}
λx . |x – 1| = λx . +√(x2 – 2x + 1)
λx . |x – 1| ≠ λx . +√(x2 – 2x + 1)
Extension[λx . |x – 1|] = Extension[λx . +√(x2 – 2x + 1)]
I-Language/E-Language
function in Intension
implementable procedure
that pairs inputs with outputs
function in Extension
set of input-output pairs
With regard to languages, we can...
(1) focus on phrasal composition, and worry later about words
assume meanings for ‘brown’ and ‘cow’,
and ask what ‘brown cow’ means
(there are lots of conjunction operations out there)
I-Language/E-Language
function in Intension
implementable procedure
that pairs inputs with outputs
function in Extension
set of input-output pairs
With regard to languages, we can...
(1) focus on phrasal composition, and worry later about words
(2) focus on words, and worry later about phrasal composition
assume a composition rule for ADJECTIVE^NOUN,
and ask what ‘cow’ (or ‘brown’) means
I-Language/E-Language
function in Intension
implementable procedure
that pairs inputs with outputs
function in Extension
set of input-output pairs
With regard to languages, we can...
(1) focus on phrasal composition, and worry later about words
(2) focus on words, and worry later about phrasal composition
assume a composition rule for QUANTIFIER^NOUN,
and ask what the quantifier ‘most’ means
Tim
Hunter
Darko
Odic
AW
l e
e l
x l
i w
s o
o
d
J
e
f
f
Justin Halberda
L
i
d
z
Most of the dots are yellow
‘Most of the dots are yellow’
MOST[D, Y]
OneToOnePlus[{D & Y},{D & Y}]
#{D & Y} > #{D & Y}
#{D & Y} > #{D}/2
#{D & Y} > #{D} – #{D & Y}
Some Relevant Facts
• many animals are good cardinality-estimaters, by dint of
a much studied system (see Dehaene, Gallistel/Gelman, etc.)
• appeal to subtraction operations is not crazy (Gallistel/King)
• but...infants can do one-to-one comparison (see Wynn)
• and Frege’s versions of the axioms for arithmetic can be
derived (within a consistent fragment of Frege’s logic) from
definitions and Hume’s (one-to-one correspondence) Principle
• Lots of references in…
The Meaning of 'Most’. Mind and Language (2009).
Interface Transparency and the Psychosemantics of ‘most’.
Natural Language Semantics (2011).
‘Most of the dots are yellow’
MOST[D, Y]
OneToOnePlus[{D & Y},{D & Y}]
#{D & Y} > #{D & Y}
#{D & Y} > #{D} – #{D & Y}
Are most of the dots yellow?
What conditions
make the
question
easy/hard
to answer?
That might
provide
clues
about
how we
understand the question
(given decent accounts of what
information is available to us in those conditions).
a model of the “Approximate Number System”
(key feature: ratio-dependence of discriminability)
distinguishing 8 dots from 4 (or 16 from 8)
is easier than
distinguishing 10 dots from 8 (or 20 from 10)
a model of the “Approximate Number System”
(key feature: ratio-dependence of discriminability)
correlatively, as the number of dots rises,
“acuity” for estimating of cardinality
decreases--but still in a ratio-dependent way,
with wider “normal spreads” centered on
right answers
4:5 (blue:yellow)
“scattered pairs”
1:2 (blue:yellow)
“scattered pairs”
4:5 (blue:yellow)
“scattered pairs”
9:10 (blue:yellow)
“scattered pairs”
4:5 (blue:yellow)
“column pairs sorted”
4:5 (blue:yellow)
“column pairs mixed”
5:4 (blue:yellow)
“column pairs mixed”
scattered random
scattered pairs
4:5 (blue:yellow)
column pairs mixed
column pairs sorted
Basic Design
• 12 naive adults, 360 trials for each participant
• 5-17 dots of each color on each trial
• trials varied by ratio (from 1:2 to 9:10) and type
• each “dot scene” displayed for 200ms
• target sentence: Are most of the dots yellow?
• answer ‘yes’ or ‘no’ by pressing buttons on a keyboard
• correct answer randomized
• controls for area (pixels) vs. number, yada yada…
Percent Correct
100
90
80
better performance on
easier ratios: p < .001
70
Scattered Random
Scattered Pairs
60
Column Pairs Mixed
Column Pairs Sorted
50
1
10 : 10
1.5
Ratio (Weber Ratio)
10 : 15
2
10 : 20
fits for Sorted-Columns trials to an independent model
for detecting the longer of two line segments
fits for trials
(apart from Sorted-Columns)
to a standard psychophysical model
for predicting ANS-driven performance
performance on Scattered Pairs and Mixed Columns
was no better than on Scattered Random;
looks like ANS was used to answer the question,
except in the Sorted Columns trials
scattered random
scattered pairs
4:5 (blue:yellow)
column pairs mixed
column pairs sorted
Follow-Up Study
Could it be that speakers
use ‘most’ to access a 1-To-1-Plus concept,
but our task made it too hard to
use a 1-To-1-Plus verification strategy?
4:5 (blue:yellow)
“scattered pairs”
What color are the loners?
better performance on components of a 1-to-1-plus task
10 : 10
10 : 15
10 : 20
We are NOT saying...
• that speakers always/usually verify sentences of the form
‘Most of the Ds are Ys’ by computing
#{D & Y} > #{D} – #{D & Y}
• that if there are some tasks in which speakers do not verify
‘Most of the Ds are Ys’ by using
a one-to-one correspondence strategy,
then ‘Most’ is not understood in terms of
a one-to-one correspondence
But we are (tentatively) assuming that...
if speakers understand sentences of the form
‘Most of the Ds are Ys’ as claims of the form
#{D & Y} > #{D} – #{D & Y}
then other things equal,
speakers will use this “logical form” as a verification strategy
if they can easily do so
Compare:
‘Bert arrived and Ernie left’
fits for Sorted-Columns trials to an independent model
for detecting the longer of two line segments
fits for trials
(apart from Sorted-Columns)
to a standard psychophysical model
for predicting ANS-driven performance
performance on Scattered Pairs and Mixed Columns
was no better than on Scattered Random;
looks like ANS was used to answer the question,
except in the Sorted Columns trials
Side Point Worth Noting…50% plus a tad
‘Most of the dots are yellow’
MOST[D, Y]
OneToOnePlus[{D & Y},{D & Y}]
#{D & Y} > #{D & Y}
#{D & Y} > #{D} – #{D & Y}
‘Most of the dots are blue’
#{x:Dot(x) & Blue(x)} > #{x:Dot(x) & ~Blue(x)}
#{x:Dot(x) & Blue(x)} > #{x:Dot(x)} − #{x:Dot(x) & Blue(x)}
• if there are only two colors to worry about, say blue and red,
then the non-blues can be identified with the reds
‘Most of the dots are blue’
#{x:Dot(x) & Blue(x)} > #{x:Dot(x) & ~Blue(x)}
#{x:Dot(x) & Blue(x)} > #{x:Dot(x)} − #{x:Dot(x) & Blue(x)}
if there are only two colors to worry about, say blue and red,
then the non-blues can be identified with the reds
• the visual system can (and will) “select”
the dots, the blue dots, and the red dots;
so the ANS can estimate these three cardinalities
but adding more colors will make it harder (and with 5 colors,
impossible) for the visual system to make enough “selections”
for the ANS to operate on
‘Most’ as a Case Study
‘Most of the dots are blue’
#{x:Dot(x) & Blue(x)} > #{x:Dot(x) & ~Blue(x)}
#{x:Dot(x) & Blue(x)} > #{x:Dot(x)} − #{x:Dot(x) & Blue(x)}
• adding alternative colors will make it harder (and eventually
impossible) for the visual system to make enough “selections”
for the ANS to operate on
• so given the first proposal (with negation), verification should
get harder as the number of colors increases
• but the second proposal (with subtraction) predicts relative
indifference to the number of alternative colors
better performance on
easier ratios: p < .001
no effect of number of colors
fit to psychophysical model of
ANS-driven performance
‘Most’ as a Case Study
‘Most of the dots are blue’
#{x:Dot(x) & Blue(x)} > #{x:Dot(x) & ~Blue(x)}
#{x:Dot(x) & Blue(x)} > #{x:Dot(x)} − #{x:Dot(x) & Blue(x)}
• adding alternative colors will make it harder (and eventually
impossible) for the visual system to make enough “selections”
for the ANS to operate on
• so given the first proposal (with negation), verification should
get harder as the number of colors increases
• but the second proposal (with subtraction) predicts relative
indifference to the number of alternative colors
‘Most of the dots are yellow’
MOST[D, Y]
OneToOnePlus[{D & Y},{D & Y}]
#{D & Y} > #{D & Y}
#{D & Y} > #{D}/2
#{D & Y} > #{D} – #{D & Y}
???Most of the paint is yellow???
‘Most’ as a Case Study
‘Most of the dots are blue’
#{x:Dot(x) & Blue(x)} > #{x:Dot(x)} − #{x:Dot(x) & Blue(x)}
• mass/count flexibility
Most of the dots (blobs) are brown
Most of the goo (blob) is brown
• are mass nouns (somehow) disguised count nouns?
#{x:GooUnits(x) & BlueUnits(x)} >
#{x:GooUnits(x)} − #{x:GooUnits(x) & BlueUnits(x)}
discriminability is BETTER for ‘goo’ (than for ‘dots’)
w = .18
r2 = .97
w = .27
r2 = .97
Are more of the blobs blue or yellow?
If more the blobs are blue, press ‘F’. If more of the blobs are yellow, press ‘J’.
Is more of the blob blue or yellow?
If more the blob is blue, press ‘F’. If more of the blob is yellow, press ‘J’.
Performance is better (on the same stimuli)
when the question is posed with a mass noun
w = .20
r2 = .99
100
95
w = .29
r2 = .98
90
% Correct
85
80
75
70
Mass Data
Mass Model
Count Data
Count Model
65
60
55
50
1.0
1.2
1.4
1.6
1.8
Ratio (Bigger Quantity/ Smaller Quantity)
2.0
2.2
‘Most’ as a Case Study
‘Most of the dots are blue’
#{x:Dot(x) & Blue(x)} > #{x:Dot(x)} − #{x:Dot(x) & Blue(x)}
• mass/count flexibility
Most of the dots (blobs) are brown
Most of the goo (blob) is brown
• are mass nouns disguised count nouns?
SEEMS
#{x:GooUnits(x) & BlueUnits(x)} >
#{x:GooUnits(x)} − #{x:GooUnits(x) & BlueUnits(x)}
NOT
Procedure Matters
...Psychophysics, on
the other hand,
is related more
directly to the level
of algorithm and
representation.
Different algorithms
tend to fail in
radically different
ways as they are
pushed to the limits
of their performance
or are deprived of
critical information.
As we shall see, primarily
psychophysical evidence proved
to Poggio and myself that our
first stereo-matching algorithm
was not the one used by the
brain, and the best evidence
that our second algorithm
(Marr and Poggio, 1976)
is roughly the one used also
comes from psychophysics.
Of course, the underlying
computational theory remained
the same in both cases, only the
algorithms were different.
Psychophysics can also help to
determine the nature of a
representation...
THANKS
GLONK
GLONKS
GLONK
But often, the world isn’t this helpful
GLONKS
GRUP GLONK
FLORT GRONK
FLIB GRONK
FLIB GRUP
FLIB GRONK
GLONKS
GRUPS
GLONKLED
FLORT GRONK
FLIB GRUPE
FLIB GRONK
FLIB GRONK
DRIV WONK HORTLE BING
GLONKS
FLIB GRONK
GRUPS
GLONKLED
FLORT
FLIB
GRUPE
GRONK
DRIV WONK
FLIB GRONK
HORTLE BING
GLONK
TRIANGLE
TRILATERAL
DETACHED DIAMOND-HALF
etc.
I-Language/E-Language
function in Intension
implementable procedure
that pairs inputs with outputs
function in Extension
set of input-output pairs
In drawing this distinction, we can...
(1) focus on phrasal composition, and worry later about words
(2) focus on words, and worry later about phrasal composition
assume a composition rule for ADJECTIVE^NOUN,
and ask what ‘cow’ (or ‘glonk’) means
Church (1941) on Lambdas
1: a function is a “rule of correspondence”
2: underdetermined when “two functions shall be considered the same”
2-3: functions in extension, functions in intension
In the calculus of L-conversion and the calculus of restricted
λ-K-conversion, as developed below, it is possible, if desired, to
interpret the expressions of the calculus as denoting functions in
extension. However, in the caluclus of λ-δ-conversion, where the
notion of identity of functions is introduced into the system by the
symbol δ, it is necessary, in order to preserve the finitary character of
the transformation rules, so to formulate these rules that an
interpretation by functions in extension becomes impossible.
The expressions which appear in the calculus of λ-δ-conversion are
interpretable as denoting functions in intension of an appropriate kind.
3: “The notion of difference in meaning between two rules of
correspondence is a vague one, but in terms of some system of notation, it
can be made exact in various ways.”
Lewis, “Languages and Language”
• “What is a language? Something which assigns meanings to
certain strings of types of sounds or marks. It could therefore
be a function, a set of ordered pairs of strings and meanings.”
• “What is language? A social phenomenon which is part of the
natural history of human beings; a sphere of human action ...”
Later on, in replies to objections...
• “We may define a class of objects called grammars...
A grammar uniquely determines the language it generates.
But a language does not uniquely determine the grammar
that generates it...”
Lewis, “Languages and Language”
“I know of no promising way to make objective sense of the assertion that a
grammar Γ is used by a population P, whereas another grammar Γ’, which
generates the same language as Γ, is not. I have tried to say how there are facts
about P which objectively select the languages used by P. I am not sure there
are facts about P which objectively select privileged grammars for those
languages...a convention of truthfulness and trust in Γ will also be a convention
of truthfulness and trust in Γ’ whenever Γ and Γ’ generate the same language.”
“I think it makes sense to say that languages might be used by populations even
if there were no internally represented grammars. I can tentatively agree that £
is used by P if and only if everyone in P possesses an internal representation of
a grammar for £, if that is offered as a scientific hypothesis. But I cannot accept
it as any sort of analysis of “£ is used by P”, since the analysandum clearly could
be true although the analysans was false.”
‘I’ Before ‘E’
• Church: function-in-intension vs. function-in-extension
--a procedure that pairs inputs with outputs in a certain way
--a set of ordered pairs (with no <x,y> and <x, z> where y ≠ z)
• Chomsky: I-language vs. E-language
--an implementable procedure that generates expressions:
π-λ
DS-SS-PF
DS-SS-PF-LF
PHON-SEM
(a) ‘generate’ as in ‘These axioms generate the natural numbers’
(b) procedure...a LEXICON plus a COMBINATORICS
(c) open question how such procedures are used in events of
comprehension/production/thinking/judging-acceptability
Download