
Can Statistical Learning Bootstrap the Integers?
Lance J. Rips (a), Jennifer Asmuth (b), Amber Bloomfield (c)
(a) Psychology Department, Northwestern University, 2029 Sheridan Road, Evanston, IL 60208 USA
(b) Psychology Department, Susquehanna University, 514 University Avenue, Selinsgrove, PA 17870 USA
(c) Center for Advanced Study of Language, University of Maryland, 7005 52nd Ave., College Park, MD 20742 USA
Corresponding author:
Lance Rips
Psychology Department
Northwestern University
2029 Sheridan Road
Evanston, IL 60208 USA
847.491.5947
Fax: 847.491.7859
Email: rips@northwestern.edu
Reply to Piantadosi et al. / 2
Abstract
This paper examines Piantadosi, Tenenbaum, and Goodman’s (2012) model for how children
learn the relation between number words (“one” through “ten”) and cardinalities (sizes of sets with one
through ten elements). This model shows how statistical learning can induce this relation, reorganizing its
procedures as it does so in roughly the way children do. We question, however, Piantadosi et al.’s claim
that the model performs “Quinian bootstrapping,” in the sense of Carey (2009). The concept it learns is
not discontinuous with the concepts it starts with. Instead, the model learns by recombining its primitives
into hypotheses and confirming them statistically. As such, it accords better with earlier theories (Fodor,
1975, 1981) that learning does not increase expressive power. We also question the relevance of the
simulation for children’s learning. The model starts with a small, preselected set of primitives, and the
procedure it learns differs from children’s method. Finally, the knowledge of the positive integers that the
model attains is consistent with an infinite number of nonstandard meanings—for example, that the
integers stop after ten or loop from ten back to one.
Keywords: Bootstrapping, Number knowledge, Number learning, Statistical learning
1. Introduction
According to the now standard theory of number development, children gradually learn to
recognize and produce collections of one, then two, and then three objects in response to requests, such as
“Give me one [two, three] cups” or “Point to the picture of one [two, three] elephants.” At this point, they
rapidly extend their success to larger collections—up to those named by larger numerals on their list of
number terms, for example, “ten” (Wynn, 1992). (The largest numeral for which they are successful can
vary, but let’s say “ten” for concreteness.) The standard theory sees this last achievement as the result of
the children figuring out how to count objects: They learn a general rule for how to pair the numerals on
their list with the objects in a collection in order to compute the total.
No one doubts that children in Western cultures learn enumeration as a technique for determining
the cardinality (i.e., the total size) of a collection. However, debates exist about how children make this
discovery (see, e.g., Carey, 2009; Leslie, Gelman, & Gallistel, 2008; Piantadosi, Tenenbaum, &
Goodman, 2012; and Spelke, 2000, 2011) and about its significance for their knowledge of number (e.g.,
Margolis & Laurence, 2008; Rey, 2011; Rips, Asmuth, & Bloomfield, 2006, 2008; Rips, Bloomfield, &
Asmuth, 2008). Our aim in the present article is to examine a recent theory of learning enumeration—one
by Piantadosi et al.—and to compare it to an earlier proposal by Carey. In doing so, we are motivated by
Piantadosi et al.’s claim to have shown that difficulties we identified in Carey’s theory disappear on
closer inspection.
1.1. Carey’s bootstrap proposal
Carey (2004, 2009) has given a detailed account of learning to enumerate as an instance of a
process she calls Quinian bootstrapping. In brief, children start with a short memorized list of numerals in
order from “one” to “ten,” but where these numerals are otherwise uninterpreted. Over an approximately
one-year period, children successively attach the numeral “one” to a mental representation of an arbitrary
one-member set (e.g., {o1}), the numeral “two” to a representation of a two-member set (e.g., {o1, o2}),
and the numeral “three” to a representation of a three-member set (e.g., {o1, o2, o3}). Children next realize
that a parallel exists between the order of the numeral list (“one” then “two” then “three”) and the set
representations ordered by the addition of one element ({o1} then {o1, o2} then {o1, o2, o3}). They infer
that the meaning of the next element on the numeral list is the set size given by adding one element to the
set size named by the preceding numeral (e.g., the meaning of “five” is the cardinality one greater than
that named by “four”). This inference allows them to determine the correct cardinal value for the
remaining items on their count list (up to “ten”). In what follows, we refer to the conclusion of this
inference as the bootstrap conclusion.
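The bootstrap conclusion can be rendered schematically (the sketch is ours, not Carey's): each numeral after "one" names a cardinality one greater than the cardinality named by the preceding numeral on the memorized count list.

```python
# Schematic rendering of the bootstrap conclusion (ours, not Carey's):
# each numeral after "one" names a cardinality one greater than the
# cardinality named by the preceding numeral on the memorized count list.
COUNT_LIST = ["one", "two", "three", "four", "five",
              "six", "seven", "eight", "nine", "ten"]

cardinality = {"one": 1}  # anchored by the earlier subset-knower stages
for prev, nxt in zip(COUNT_LIST, COUNT_LIST[1:]):
    cardinality[nxt] = cardinality[prev] + 1  # the bootstrap inference

print(cardinality["ten"])  # 10
```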
This interesting proposal raises many empirical and theoretical questions (for a sample of these,
see the commentaries to Carey, 2011, and other sources cited later in this article). Two of these issues,
however, are important for the present discussion. The first is whether learning to enumerate creates a
fundamentally new way of representing the positive integers. According to Carey, Quinian bootstrapping
provides a child with new primitive concepts of number, concepts that the child’s old conceptual
vocabulary can’t express, even in principle:
Quinian bootstrapping mechanisms underlie the learning of new primitives, and this
learning does not consist of constructing them from antecedently available concepts (they
are definitional/computational primitives, after all) using the machinery of compositional
semantics alone (Carey, 2009, p. 514, emphasis in the original).
No translation is possible between the old number concepts and the new ones:
To translate is to express a proposition stated in the language of [Conceptual System 2] in
the language one already has ([Conceptual System 1])… In cases of discontinuity in
which Quinian bootstrapping is required, this is impossible. Bootstrapping is not
translation; what is involved is language construction, not translation. That is, drawing on
resources from within CS1 and elsewhere, one constructs an incommensurable CS2 that
is not translatable into CS1 (Carey, 2011, p. 157).
In the last quotation, CS1 is the child’s conceptual system prior to an episode of Quinian bootstrapping
and CS2 is the conceptual system that results from bootstrapping. As the first of these quotations makes
clear, Quinian bootstrapping is a kind of learning, usually an extended process that takes months or years
to complete. Children don’t acquire the new number concepts by mere maturation. Likewise, external
causal forces don’t merely stamp them in. A challenge in understanding Quinian bootstrapping is how to
reconcile the claim about learning with the claim about discontinuity between old and new concepts.
A second question about Quinian bootstrapping is the scope of the concepts it produces.
Enumeration pairs the elements on the child’s count list with collections of objects. But we have argued in
earlier work (Rips et al., 2006) that learning the meanings of “one” through “ten” via Quinian
bootstrapping does not pin down even the cardinal meanings of these terms to their correct values.
This raises one of the questions mentioned earlier: How important is Quinian bootstrapping (and
enumeration in general) to children’s understanding of the integers?
1.2. An overview
Piantadosi et al. (2012) present an explicit model of how children learn to enumerate objects, and
the clarity of their theory provides a chance to assess the issues just mentioned.1 Although this model
differs from Carey’s (2004, 2009) in important details, as Piantadosi et al. note, they nevertheless claim it
exemplifies bootstrapping. In the following sections, we argue, first, that because the model learns to
enumerate by straightforward concept combination, it does not create a new discontinuous conceptual
system. Instead, it illustrates Fodor’s (1975, 1981) hypothesis that learning elaborates old concepts: It
cannot produce a new language that increases the child’s expressive power. Because of this limitation, the
model is incapable of bootstrapping in Carey’s sense and does little to clear up the issues surrounding
bootstrapping. Of course, this doesn’t mean that the model is incorrect. It could provide a correct account
of enumeration even if Quinian bootstrapping has no role in its procedure. However, the model’s method
of enumerating differs from that of children, and these differences raise questions about the relevance of
the model to children’s actual abilities. Finally, Piantadosi et al.’s model maps sets of one to ten elements
to the terms “one” to “ten,” but has no implications for the structure of the integers. So the same
difficulties about scope that beset Quinian bootstrapping carry over to this new proposal.

1 Unless we otherwise attribute them, all page references are to Piantadosi et al. (2012).
2. Is the Piantadosi et al. model a form of bootstrapping?
Carey (2004, 2009) introduced “Quinian bootstrapping” as a term for procedures in which people
learn new concepts that are discontinuous from their old ones. Thus, we take the central claims of Quinian
bootstrapping to be these (see Beck, submitted for publication):
Learning: In Quinian bootstrapping, an agent learns a new conceptual system, CS2, in terms of
an old system, CS1.
Discontinuity: After Quinian bootstrapping, CS2 is conceptually discontinuous from CS1.
These properties are explicit in the quotations in Section 1.1 and in many other places in Carey’s (2009)
presentation. Important for the present discussion is the fact that Carey introduces her chapter on how
children acquire representations of the positive integers by setting herself two challenges: “to establish
discontinuities in cognitive development by providing analyses of successive conceptual systems, CS1
and CS2, demonstrating in what sense CS2 is qualitatively more powerful than CS1” and “to characterize
the learning mechanism(s) that get us from CS1 to CS2” (p. 288).
Because of the Discontinuity claim, Quinian bootstrapping opposes the view that all forms of
learning derive new concepts by recombining (or translating from) old ones. According to this alternative
view (Fodor, 1975, 1981), learning is a form of hypothesis formation and confirmation in which the
hypotheses are spelled out in the old conceptual vocabulary (i.e., the concepts that the person possesses
prior to learning). Confirmation merely stamps them in without producing anything fundamentally new.
To have a general term for such non-bootstrapping forms of learning, we’ll use concept recombination.
Proponents of bootstrapping agree that mundane forms of learning use recombination. Bootstrapping
occurs only in special cases. So the question that divides theorists is whether any actual instances of
learning are examples of bootstrapping.
Piantadosi et al.’s clearest statement about this issue suggests that their approach is much closer
to recombination than to Quinian bootstrapping (p. 214):
One of the basic mysteries of development is how children could get something
fundamentally new… Our answer to this puzzle is that novelty results from
compositionality. Learning may create representations from pieces that the learner has
always possessed, but the pieces may interact in wholly novel ways…This means that the
underlying representational system which supports cognition can remain unchanged
throughout development, though the specific representations learners construct may
change.
This approach, whatever its merits, has nothing to do with creating new primitives (or new conceptual
systems) that are discontinuous with old ones. As such, it jettisons a central part of Carey’s bootstrapping
theory, the Discontinuity claim. “Quinian bootstrapping” was introduced as a technical term, so there is
limited room for redefining it while simultaneously claiming to defend it.
We note that developmentalists have used the term “bootstrapping” in ways that differ from
Carey’s “Quinian bootstrapping.” In research on language acquisition, syntactic bootstrapping is a
hypothetical process in which children use syntactic properties of sentences to determine the referents of
component words, and semantic bootstrapping is a process in which children use the referents of words to
determine their syntactic category (see Bloom & Wynn, 1997, for a discussion of these possibilities in the
context of number). Neither of these forms of bootstrapping qualifies as Quinian bootstrapping, according
to Carey (2009, p. 21), since neither creates a conceptual system discontinuous with earlier ones.
Piantadosi et al.’s proposal for number learning is not an example of either syntactic or semantic
bootstrapping, since syntactic categories play no role in it. In fact, it might be possible to contend that
their proposal fails to qualify as bootstrapping in an even wider sense, but we won’t argue for that
conclusion here.2 Our concern, instead, is to show that the Piantadosi et al. model is not a type of Quinian
bootstrapping—it does not satisfy both the Learning and Discontinuity criteria—and from here on, we
will use “bootstrapping” to mean Quinian bootstrapping.
2.1. Bootstrapping’s central features
To make the distinction between bootstrapping and recombination a little more precise, let c*
represent a new concept that is created in learning, and let c1, c2, …, ck represent old concepts. Proponents
of recombination believe that learning is a function taking the old concepts into the new one:
c* = f(c1, c2, …, ck). It matters very much, however, what the function f is like. Advocates of bootstrapping
agree that the input to learning is a set of old concepts and the output a new concept. As Carey (2011,
p. 157) remarks, “Clearly, if we learn or construct new representational resources, we must draw on those
we already have.” But Carey would maintain that in examples of bootstrapping the function is not mere
recombination, as the first quotation in Section 1.1 makes explicit. To distinguish between the positions,
then, we need some restrictions on f or on its arguments in order to spell out the difference between
recombination and bootstrapping (Rips & Hespos, 2011).
As a first possibility, proponents of bootstrapping could insist that bootstrapping algorithms are
so complex that they go beyond what could reasonably be considered recombination. For example,
Carey’s (2009) theory of how children learn to enumerate includes an analogical inference that maps the
first few numerals (“one,” “two,” “three”) to corresponding representations of cardinalities (see the sketch
in Section 1.1). If analogical inference is too complicated to be recombination, then learning to enumerate
may be a form of bootstrapping.
A second potential way to distinguish bootstrapping and recombination is to hold that the input
concepts (c1, c2, …, ck) in bootstrapping come from a broader domain of knowledge than is possible in
recombination. In the case of number learning, the input concepts to bootstrapping may belong to two or
more distinct cognitive modules. For example, c1, c2, …, ci may come from a module devoted to natural
language quantifiers (e.g., some or all), whereas ci+1, ci+2, …, ck may come from a module for representing
small sets of physical objects. Or the input concepts may include some that don’t appear in the child’s
earlier number representations. For example, the old number representations may include only concepts
c1, c2, …, ci, whereas input to the new representations may also include concepts ci+1, ci+2, …, ck.

2 As D. Barner has suggested (personal communication, June 14, 2012), all prior bootstrapping theories
appear to require that earlier representational stages be psychologically necessary steps in the acquisition
of later ones, whereas earlier representations of number in Piantadosi et al.’s theory (e.g., their Two-knower
function) play no role in producing its later representations (e.g., the CP-knower function).
It is unclear to us whether either of these strategies suffices to show that bootstrapping and
recombination differ in kind. Sheer complexity of a process doesn’t seem inconsistent with
recombination. The individual steps in learning may be lengthy or difficult without creating anything
fundamentally new. If so, advocates of bootstrapping owe us an explanation of what aspects of learning
cause it to go beyond recombination. Similarly, why must recombination respect limits on the domain of
its input? Why shouldn’t recombination be allowed to draw on all old concepts in the learner’s repertoire?
Fodor’s (2010) response to Carey’s theory is to deny any such limits. We are not claiming that proponents
of bootstrapping have explicitly adopted either of these strategies. Nor do we claim that the strategies are
exhaustive.3 In looking at Piantadosi et al.’s proposal, however, let’s keep these options temporarily
open, since they may help us see why these authors believe their model performs a type of bootstrapping.
The underlying issue with bootstrapping is that advocates have to reconcile the Learning and
Discontinuity claims. But doing so is tricky because these claims seem to pull in opposite directions, with
Learning suggesting continuity rather than discontinuity. If Learning and Discontinuity cannot be
reconciled, bootstrapping is incorrect and concept recombination is correct as a theory of human concept
acquisition. A main point of interest, then, in Piantadosi et al.’s model is that it purports to furnish a
working example of bootstrapping and may therefore demonstrate bootstrapping’s viability.
3 The two strategies just described are examples of what Beck (submitted for publication) calls deflationary theories
for reconciling the bootstrapping claims about learning and discontinuity. Neither creates anything totally new to the
child’s conceptual system, but either could bring to light concepts that were only latent within this system. More
radical strategies are also possible for making sense of the Learning and Discontinuity claims (Beck, submitted for
publication, and Shea, 2011). But these are farther removed from Piantadosi et al.’s theory, and we therefore don’t
discuss them here.
2.2. The Piantadosi et al. model
The Piantadosi et al. model receives as input a series of sets of different sizes, ranging from one
to ten elements, with the frequency of each set size determined by the corpus frequency of the associated
number words (“one” to “ten”). It evaluates its current stock of hypotheses about the appropriate number
word for a given set, increasing the probabilities of hypotheses that give the right answer (e.g., labeling a
set of four items with “four”) and decreasing the probabilities of hypotheses that give an incorrect or null
answer. After sufficient training, the model converges on a hypothesis—the Cardinal Principle (CP-)
knower function—that correctly labels sets of one to ten elements:
CP-knower function:
λS. (if (singleton? S)
        “one”
        (next (L (set-difference S (select S)))))
This function tests whether the input set of objects S is a singleton (i.e., one-element set), and if it is,
labels it “one.” If not, it removes an element from S (i.e., set-difference S (select S)) and recursively
applies the same CP-knower function to the reduced set (L accomplishes this recursion). If the reduced set
is a singleton, the recursive call returns “one,” and the original set is then labeled next(“one”), that is, “two.” And so on.
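The logic of the displayed function can be rendered in Python as follows. This is only our paraphrase of the definition above: WORDS stands in for the memorized count list, and next_word and the arbitrary-element choice play the roles of the next and select primitives.

```python
# A minimal Python paraphrase of the CP-knower function displayed above.
# WORDS stands in for the memorized count list; next_word and the
# arbitrary-element choice play the roles of the next and select primitives.
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]

def next_word(w):
    """Return the numeral after w on the count list."""
    return WORDS[WORDS.index(w) + 1]

def cp_knower(S):
    """Label the cardinality of set S, as the CP-knower function does."""
    if len(S) == 1:                            # (singleton? S)
        return "one"
    item = next(iter(S))                       # (select S): pick an arbitrary element
    return next_word(cp_knower(S - {item}))    # recurse on the reduced set, then unwind

print(cp_knower({"cup1", "cup2", "cup3"}))  # three
```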
More interesting is the order in which hypotheses emerge as the most likely candidate. Early in
training, the model labels one-element sets with “one” and all other set sizes as unknown. It then switches
to labeling one- and two-element sets correctly (using its primitive singleton and doubleton predicates),
then one-, two-, and three-element sets (using singleton, doubleton, and tripleton), and finally reaches a
more complicated rule (the CP-knower function) that correctly labels one- to ten-element sets.
This behavior is extensionally similar to the progress children make in acquiring words for set
sizes (see Section 3 for qualifications). The learning sequence is a result of several design choices: First,
the model starts with primitive predicates that: (a) directly recognize set sizes of one, two, and three
elements (e.g., the singleton? predicate); (b) carry out logic and set operations (e.g., set-difference);
(c) traverse the sequence of number words “one,” “two,” …, “ten” (the next predicate); and (d) perform
recursion (L). (See Piantadosi et al.’s Table 1 for the full list of primitives.) Second, the model constructs
hypotheses from these primitives in a way that gives lower prior probabilities to lengthier hypotheses and
to hypotheses that include recursion (depending on a free parameter, γ). Thus, the model starts by
considering simple and inaccurate non-recursive hypotheses (e.g., singleton sets are labeled “one” and all
other sets are undefined) and ends with a more complex, but correct, recursive hypothesis as the result of
feedback about the correct labeling.
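The effect of these design choices on the learning sequence can be illustrated with a toy prior. This is a schematic sketch, not Piantadosi et al.'s actual grammar-based prior: the specific penalty per primitive and the treatment of γ are our illustrative assumptions.

```python
import math

# Toy illustration (not Piantadosi et al.'s actual prior): a hypothesis's
# log prior shrinks with its length, and hypotheses using the recursion
# primitive L pay an extra penalty controlled by the free parameter gamma.
def log_prior(num_primitives, uses_recursion, gamma=0.1, p=0.5):
    lp = num_primitives * math.log(p)  # longer hypotheses are less probable
    if uses_recursion:
        lp += math.log(gamma)          # recursion is handicapped
    return lp

# The short, non-recursive one-knower hypothesis starts out more probable
# a priori than the longer, recursive CP-knower hypothesis; only accumulated
# likelihood from correct labelings can overcome the prior's penalty.
one_knower_prior = log_prior(num_primitives=3, uses_recursion=False)
cp_knower_prior = log_prior(num_primitives=7, uses_recursion=True)
assert one_knower_prior > cp_knower_prior
```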
2.3. Does the Piantadosi et al. model employ bootstrapping?
Piantadosi et al. try to make the case that the discovery of the correct number hypothesis is a form
of bootstrapping, though not quite of the variety Carey described in introducing this term. But on the face
of it, their model looks like a perfect example of recombination. It starts with a small stock of primitives,
and it combines them into hypotheses according to syntactic rules (a probabilistic context-free grammar).
The model learns which of these hypotheses is best by Bayesian adjustment through feedback. Thus, all
the primitive concepts that the model uses to frame its final hypothesis are already present in its initial
repertoire. The only missing element is the correct assembly of these primitives by the grammar. These
restrictions would seem to leave the model with little room for innovation of the sort that bootstrapping
requires. Why should we regard the process as bootstrapping rather than as translating one system into
another?
In setting out the bootstrap idea, we mentioned two possible strategies to discriminate it from
recombination (see Section 2.1). Of these possibilities, the first one—that bootstrapping involves a
learning process more complex than standard recombination—is out for the Piantadosi et al. model. The
model’s grammar composes its hypotheses by assembling them from a previously existing base. The
model employs no inference more complex than the Bayesian conditioning that updates the hypotheses’
probability.
Piantadosi et al. have a better chance, then, of defending their bootstrapping claim by adopting
the second strategy. Perhaps the model’s discovery of the pairing between numerals and cardinalities
incorporates concepts that aren’t available in its initial state. Here the obvious candidate is recursion. The
model’s initial hypotheses make no use of recursion, whereas the final CP-knower hypothesis does. The
recursive predicate (L) confers greater computational power on this last hypothesis than is present in the
earlier ones. So perhaps bootstrapping occurs when the model introduces recursion. (The model ensures
that this introduction happens relatively late in learning by handicapping all hypotheses containing the
recursive predicate, as we mentioned earlier.) This accords with Piantadosi et al.’s statement that “the
model bootstraps in the sense that it recursively defines the meaning for each number word in terms of the
previous number word. This is representational change much like Carey’s theory since the CP-knower
uses primitives not used by subset knowers, and in the CP-transition, the computations that support early
number word meanings are fundamentally revised” (p. 212). The idea seems to be that the model’s
representations for cardinalities before bootstrapping aren’t extendible to the representations it uses after.
Something is missing from the early representations that’s necessary for a more adult-like understanding.
But in thinking about whether the CP-transition is a form of bootstrapping, we should keep in
mind that in many mundane instances of learning—in discrimination learning, for example—people add
primitives that do not figure in earlier hypotheses. In learning to distinguish poisonous from edible
mushrooms, people may have to take into account new properties like the color of the mushrooms’ spores
that were not parts of their original mushroom representations. No one would claim, though, that
including spore color in the new concept is a discontinuous conceptual change. Likewise, merely
including a previously unused predicate in a new hypothesis about number meaning doesn’t by itself
imply that the hypothesis is discontinuous with old ones. Stretching the concept of Quinian bootstrapping
to include such simple property additions would trivialize this concept. So if adding the recursive
predicate L does produce a big conceptual change, that must be because L is special—perhaps because it
significantly increases the hypothesis’s computational power—not because it is new.
However, adding recursion still doesn’t conform to bootstrapping as Carey describes it in the
quotations of Section 1. The model’s grammar prior to adopting the CP-knower hypothesis is identical to
its grammar after adopting it, as is its primitive conceptual vocabulary. So a translation manual could
easily express the new hypothesis in the old vocabulary, precisely as is done in Piantadosi et al.’s
definition of the CP-knower function, which we displayed earlier. This undermines the idea that
bootstrapping does not reduce to translation. For much the same reason, the CP-knower hypothesis in
Piantadosi et al.’s version does not involve the creation of new primitives, and it certainly doesn’t create
them in a way that goes beyond “the machinery of compositional semantics” (see the first of the
quotations from Carey in Section 1.1).
Of course, Piantadosi et al. could position their model as a non-Quinian type of “bootstrapping”
that allows translation and dispenses with the need to create new primitives. But this move would
abandon the important and arresting ideas that Carey had in mind in introducing this concept.
Bootstrapping in this revised sense would not create a conceptual system that is discontinuous with the
earlier one, and hence, it would not implement a method that bears the same intellectual interest. This
revision would not simply raise the “semantic” issue of how we should use the term “bootstrapping,” but
it would discard one of bootstrapping’s essential properties. In short, the Piantadosi et al. model could
have made bootstrapping plausible by showing how the Learning and Discontinuity claims can be joined.
Instead, it either jettisons Discontinuity or trivializes it. The reasonable conclusion from the Piantadosi et
al. model is not that it provides a rigorous form of bootstrapping, but that it shows bootstrapping to be
unnecessary. Children have no need for bootstrapping, since they can learn a correct method of
enumeration through ordinary recombination.
3. How realistic is the model?
The points raised in the preceding section do not show that the model is incorrect as a theory of
how children learn to label cardinalities. Even if the model doesn’t learn by bootstrapping, it could still be
the right explanation of this learning process. However, three aspects of the model’s behavior deserve
comment and suggest that it does not learn enumeration in the way children do.
3.1. The model’s choice of primitives
First, the model draws on a relatively small set of handpicked primitives—the predicates in
Piantadosi et al.’s Table 1, which we described in Section 2.2. The model forms all its hypotheses as
combinations of these predicates. Piantadosi et al. believe these predicates “may be the only ones which
are most relevant” to number learning (p. 202). But this restriction raises the question of whether children
also limit their hypotheses in the same convenient way. We don’t dispute the importance of these
predicates to knowledge of number, but how do children know prior to learning that these predicates are
the most relevant ones?
Piantadosi et al. claim that their theory “can be viewed as a partial implementation of the core
knowledge hypothesis (Spelke, 2003),” but also deny that the primitive predicates are part of an
encapsulated core domain devoted to number: “These primitives—especially the set-based and logical
operations—are likely useful much more broadly in cognition and indeed have been argued to be
necessary in other domains” (p. 202). If this particular set of primitives does not come cognitively pre-packaged, however, then children must search for them among a much larger group of mental predicates,
and many candidates exist in this larger set that carry numerical information. Although the model includes
a few primitives (e.g., set intersection) that are not necessary for its hypotheses, the model excludes, by
fiat, analog magnitudes, mental models, and explicit quantifiers (e.g., some), which according to many
theories are relevant parts of children’s number knowledge prior to their mastery of enumeration.4
Consider analog magnitudes. According to this idea, people have access to a continuous mental measure
that varies positively with the number of physical objects in the perceptual array. People can therefore use
this measure as an approximate guide to cardinality. A model like Piantadosi et al.’s could easily build a
hypothesis that makes use of analog magnitudes to label set sizes, and such a hypothesis would compete
with those of the present version of the model, presumably gaining relatively high posterior probability. It
could therefore slow or alter the course of number learning by delaying the success of the CP-knower
hypothesis.

4 For analog magnitudes, see, for example, Dehaene (1997), Gallistel and Gelman (1992), and Wynn (1992). For
mental models, Mix, Huttenlocher, & Levine (2002). For quantifiers, Barner and Bachrach (2010), Carey (2009),
and Sarnecka, Kamenskaya, Yamana, Ogura, and Yudovina (2007).
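A competing analog-magnitude hypothesis of the kind just described could be sketched as follows. This is entirely hypothetical on our part—it is not a component of Piantadosi et al.'s model, and the Gaussian noise model with a Weber fraction of 0.15 is an illustrative assumption.

```python
import random

# Hypothetical illustration (not part of Piantadosi et al.'s model): an
# analog-magnitude hypothesis maps a noisy continuous estimate of set size
# onto the count list. Its noise grows with set size (scalar variability),
# so it labels small sets reliably and larger sets only approximately—yet
# it could still earn posterior probability and compete with the model's
# exact hypotheses during learning.
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]

def analog_magnitude_label(n, weber_fraction=0.15, rng=random):
    """Label a set of n objects from a noisy magnitude estimate."""
    estimate = rng.gauss(n, weber_fraction * n)   # noise scales with n
    k = min(max(round(estimate), 1), len(WORDS))  # clamp onto the count list
    return WORDS[k - 1]
```

With the noise parameter set to zero the labeling is exact, which shows why such a hypothesis could score well against the training data despite never representing cardinality discretely.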
On the one hand, if Piantadosi et al.’s predicates are truly the only ones children use in forming
their hypotheses, then we need to know what enables the children to restrict their attention to these items
and exclude information like analog magnitudes. On the other hand, if children consider a wider set of
predicates, what’s the evidence that the model can converge on the right CP-knower procedure and do so
in a realistic amount of time? The quantitative results from Piantadosi et al.’s simulations become irrelevant
under this second possibility.
3.2. The model’s method of enumeration
A second question about the model’s fidelity is whether the CP-knower procedure is similar
enough to children’s actual enumeration to back the claim that the model learns what children do.
Children match numerals one-one to objects in an iterative way (Gelman & Gallistel, 1978). In counting a
set of three cups {cup1, cup2, cup3}, they label a first object (e.g., cup1) “one” and remove it from further
consideration. They then label the second object (e.g., cup2) “two,” and so on. As Piantadosi et al. point
out, however, “the model makes no reference to the act of counting (pointing to one object after another
while producing successive number words)” (p. 213). As we mentioned earlier, what the model does
instead is recurse through the set of items, taking set differences until it arrives at a singleton, and then it
unwinds through the list of numerals to arrive at the total. For example, if the model is enumerating the
set of three cups, it first tests to see if the set is a singleton. Since it’s not, the model recursively applies
the procedure to the result of removing one item from the set (e.g., {cup2, cup3}). Since this new set is
still not a singleton, it again recursively applies the procedure to the set (e.g., {cup3}) formed by removing
another item from the set. Having at last found a singleton, it labels the set “one,” then labels the next
largest set “two,” and finally labels the original set “three.”
We can put this point about the difference between children’s behavior and the model’s in a
second way: When older children count “one, two, three…three cups,” the first three number words do
not label the size of sets of cups. Instead, these words mark positions in the count sequence. The children
then rely on the principle—Gelman and Gallistel’s (1978) Cardinal Principle—that the final word in the
enumeration sequence is the cardinality of the set, and they thus infer that there are “three cups.” Thus,
only the second “three” in the earlier phrase denotes a number of cups. By contrast, the model always
uses numerals as labels for set sizes. In their discussion (pp. 214-215), Piantadosi et al. claim that
children’s actual counting behavior is a metacognitive effort to keep their place in the recursive routine.
But what reason could there be for not taking the children’s simpler (and equally accurate) enumeration
algorithm at face value? The model’s inability to arrive at the right procedure suggests that something is
wrong with its architecture and calls into question Piantadosi et al.’s claims (p. 200) that “all assumptions
made are computationally and developmentally plausible.”
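The contrast between the two procedures can be made concrete. The following is a minimal Python sketch; it is our illustration, not code from Piantadosi et al.'s model, and the function names are our own. Both routines return the same numeral for any set they can count, so the difference lies entirely in how they proceed:

```python
def count_like_a_child(objects, numerals):
    """Iterative counting (Gelman & Gallistel, 1978): label each object
    with the next numeral and remove it from further consideration.
    The last numeral used gives the set size (the Cardinal Principle)."""
    remaining = set(objects)
    for word in numerals:
        remaining.pop()          # label one object and set it aside
        if not remaining:
            return word          # final word = cardinality of the set

def count_like_the_model(objects, numerals):
    """The model's recursion: take set differences down to a singleton,
    then unwind through the numeral list, labeling ever-larger sets."""
    if len(objects) == 1:
        return numerals[0]                       # singleton labeled "one"
    rest = set(objects)
    rest.pop()                                   # remove one item
    label_of_rest = count_like_the_model(rest, numerals)
    return numerals[numerals.index(label_of_rest) + 1]

numerals = ["one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine", "ten"]
cups = {"cup1", "cup2", "cup3"}
print(count_like_a_child(cups, numerals))    # three
print(count_like_the_model(cups, numerals))  # three
```

In the first routine, "one" and "two" merely mark positions in the count sequence; only the returned word labels a cardinality. In the second, every numeral labels a set size on the way back up the recursion.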
3.3. The model’s knowledge of the sequence of cardinalities
A third difference between children’s behavior and the model’s behavior is the extent of
children’s knowledge at the time they become CP-knowers. The usual test of CP knowledge is that
children can correctly produce sets of up to ten objects when asked to “Give me n.” For example, when
asked to “Give me eight beads,” they can produce eight from a larger pile of beads. Recent evidence by
Davidson, Eng, and Barner (2012), however, shows that children who are able to perform this task are
often unable to say whether a single bead added to a box of five results in six beads rather than seven.
This is the case even though children at the same stage can correctly say that the numeral that follows
“five” is “six” rather than “seven.” Davidson et al. (2012, p. 166) note that their analysis “reveals that
many CP-knowers do not have knowledge of the successor principle for even the smallest numbers…
These data suggest that knowledge of the successor principle does not arise automatically from becoming
a CP-knower, but that this semantic knowledge may be acquired later in development.”
The Piantadosi et al. model is limited to determining the cardinality of a given set of objects. So
no logical inconsistency arises between possessing this skill and not being able to tell that one object
added to a set of five yields a set of six. Still, Piantadosi et al.’s CP-knower function, in the course of
determining that “six” labels a six-item set, also determines that “five” labels a set with one fewer
element. Given this procedure, children’s difficulty in figuring out that a six-item set is one greater than a
five-item set is mysterious and again suggests that the model’s CP-knower function is more complex than
the routine children actually use at this point in their number development.
Piantadosi et al. (p. 206) describe their theory as a computational-level model, in the sense of
Marr (1982). So perhaps we should discount these deviations between the model’s behavior and
children’s, since they concern particular methods of pairing number words and cardinalities. The crucial
claims of the Piantadosi et al. paper, however, depend on more than computational description. For
example, whether the model learns by bootstrapping depends on whether the procedures the model
employs before becoming a CP-knower are qualitatively different from the procedure it employs later.
This difference requires a comparison of the algorithms before and after learning, and it limits how
abstractly we can view the model when we come to evaluate it (see Jones & Love, 2011, for general
criticisms along these lines of Bayesian learning theories).
4. How much does the model know about the positive integers?
An intriguing aspect of bootstrapping is that it is supposed to produce the child’s first true
representation of the positive integers. According to Carey (2004, p. 65), “coming to understand how the
count list represents numbers reflects a qualitative change in the child’s representational capacities; I
would argue that it does nothing less than create a representation of the positive integers where none was
available before.” Similarly, according to Piantadosi et al. (p. 201):
Bootstrapping explains why children’s understanding of number seems to change so
drastically in the CP-transition and what exactly children acquire that’s “new”: they
discover the simple recursive relationship between their memorized list of words and the
infinite system of numerical concepts.
In Section 2, we examined the issue of whether Piantadosi et al.’s model effects a qualitative change in
representations. But setting that issue aside here, how much does the model know about the positive
integers—the “infinite system of numerical concepts”?
4.1. Does bootstrapping capture the meaning of the first few numerals?
One thing seems clear. The model never learns the full set of positive integers. It simply learns to
associate set sizes with the correct number words “one” to “ten” (or to whatever word is the last term on
the model’s count list). Presented with a set of ten items, the model correctly labels it “ten,” but presented
with a set of eleven items, it cannot make a correct response if “ten” is its largest count term.
In another sense, however, the model does possess a general rule for relating number words and
cardinalities: the CP-knower function, shown in Section 2.2. Piantadosi et al. write, “bootstrapping has
been criticized for being incoherent or logically circular, fundamentally unable to solve the critical
problem of inferring a discrete infinity of novel numerical concepts (Rips, Asmuth, & Bloomfield, 2006,
2008; Rips, Bloomfield, & Asmuth, 2008). We show that this critique is unfounded…” (p. 200). But
although we do believe that bootstrapping is unable to solve the “problem of inferring a discrete infinity
of novel numerical concepts,” we did not criticize bootstrapping as inconsistent or circular.5 Moreover,
the conclusion itself is a correct generalization about number word-cardinality pairs, as we noted in our
earlier papers. What is learned is a correlation between advancing one step in the number word sequence
(e.g., from “four” to “five”) and increasing the cardinality of a set by one. (In the Piantadosi et al. model,
this correlation is implicit in the CP-knower procedure rather than declaratively represented, but the effect
is the same.) This is an important discovery for children, and any theory that explains how they do it is
praiseworthy.

5
A threat of circularity looms, however, if you read too much into the bootstrapped conclusion. You may be
tempted to think that the conclusion fixes the cardinal meaning of the numerals if you understand “next term on the
count list” as involving the full, infinite list for the positive integers. The full list does, of course, fix the numeral’s
meaning since it is isomorphic to the positive integers. But at the time children perform the bootstrap inference, they
have no knowledge of the full list; so assuming this structure as part of the bootstrapping process does lead to
circularity. We note, too, that logical difficulties with bootstrapping’s inference procedures are not the same as the
difficulties we surveyed in Section 1. The former concern problems in getting from the meanings of “one,” “two,”
and “three” to the meanings of the terms in the rest of the child’s count list. The latter difficulties concern the more
abstract problem of combining the Learning and Discontinuity theses.
The trouble with this principle, however, is that, at the time children learn it, it fails to specify the
meaning of the terms for the positive integers (Rips et al., 2006; Rips, Asmuth, & Bloomfield, 2008; Rips,
Bloomfield, & Asmuth, 2008). After adopting the CP-knower function, the Piantadosi et al. model has a
way to connect the word “one” to cardinality one, “two” to cardinality two,…, and “ten” to cardinality
ten. But the same function is equally extendible to either of the mappings in (1) and (2), as well as an
infinite number of others:
(1) “one” denotes only cardinality one.
“two” denotes only cardinality two.
…
“ten” denotes only cardinality ten.
(2) “one” denotes cardinalities one, eleven, twenty-one,…
“two” denotes cardinalities two, twelve, twenty-two,…
…
“ten” denotes cardinalities ten, twenty, thirty,…
That is, the CP-knower function doesn’t constrain the cardinal meanings of the number words on the
child’s list to their ordinary meanings.
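The point can be put in a small sketch (again ours, not part of the model): an interpreter for the standard mapping (1) and one for the looping mapping (2) agree on every cardinality the learner actually observes, so no data from sets of one to ten items can distinguish them:

```python
NUMERALS = ["one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine", "ten"]

def standard_meaning(n):
    """Mapping (1): each numeral denotes exactly one cardinality."""
    return NUMERALS[n - 1] if 1 <= n <= 10 else None

def looping_meaning(n):
    """Mapping (2): "one" denotes 1, 11, 21, ...; "ten" denotes 10, 20, ..."""
    return NUMERALS[(n - 1) % 10]

# On the observed range (sets of one to ten items) the mappings coincide:
assert all(standard_meaning(n) == looping_meaning(n) for n in range(1, 11))

# They diverge only on cardinalities the model never encounters:
print(looping_meaning(11))   # one
print(standard_meaning(11))  # None
```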
Proponents of bootstrapping now appear to agree with us that the CP-knower function and its
equivalents don’t give children the meanings for numerals beyond those on their list of count terms. But
these functions don’t give children the correct meanings for numerals on their count lists either, as (1)
and (2) reveal.
Knowing that a correlation exists between the numerals and the cardinalities is of no help in picking out
the positive integers from among its rivals unless the child knows either the structure of the numerals or
the structure of the cardinalities. However, the numeral sequence, as given by the next predicate in
Piantadosi et al.’s model, does not continue beyond “ten,” and as Piantadosi et al. emphasize (p. 212),
their model does not build in a successor relation for cardinalities. Because the structure of the positive
integers is well understood, we can be quite specific about what the CP-knower function fails to convey.
It does not enforce the ideas that the correct structure is one that has: (a) a unique first element, (b) a
unique immediate successor for each element, (c) a unique immediate predecessor for each element
except the first, and (d) no element apart from those dictated by (a)-(c).
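Conditions (a)-(d) amount to the familiar Dedekind-Peano characterization of the natural-number structure. As a rough symbolic rendering (ours, with S the immediate-successor relation, 1 the first element, and P any property; the induction schema in the last line does the work of (d)):

```latex
\begin{align*}
&\text{(a)}\;\; \forall x\, \neg\big(S(x) = 1\big)
  &&\text{$1$ succeeds nothing: a unique first element}\\
&\text{(b)}\;\; \forall x\, \exists!\, y\; S(x) = y
  &&\text{each element has a unique immediate successor}\\
&\text{(c)}\;\; \forall x\, \forall y\, \big(S(x) = S(y) \rightarrow x = y\big)
  &&\text{each non-first element has a unique immediate predecessor}\\
&\text{(d)}\;\; \big(P(1) \wedge \forall x\,(P(x) \rightarrow P(S(x)))\big) \rightarrow \forall x\, P(x)
  &&\text{no elements beyond those dictated by (a)-(c)}
\end{align*}
```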
4.2. Can the model exclude rival meanings for the integers?
Results from their simulations show that Piantadosi et al.’s model learns the standard pairing for
the first ten integers rather than an alternative pairing in which “one” is mapped to sets with one or six
elements, “two” to sets with two or seven elements, …, and “five” to sets with five or ten elements. This
latter Mod-5 hypothesis (see their Figure 1) fails for two reasons: First, the model receives feedback that
disconfirms the Mod-5 pairings, and second, the Mod-5 hypothesis is more complex than the correct
alternative, given the choice of primitives. When feedback supports the Mod-5 hypothesis, however, the
model eventually learns it. From these facts, Piantadosi et al. (p. 211) conclude:
This work was motivated in part by an argument that Carey’s formulation of
bootstrapping actually presupposes natural numbers, since children would have to know
the structure of the natural numbers in order to avoid other logically plausible
generalizations of the first few number word meanings. In particular, there are logically
possible modular systems which cannot be ruled out given only a few number word
meanings (Rips et al., 2006; Rips, Asmuth, & Bloomfield, 2008; Rips, Bloomfield, &
Asmuth, 2008). Our model directly addresses one type of modular system along these
lines: in our version of a Mod-N knower, sets of size k are mapped to the k mod Nth
number word. We have shown that these circular systems of meaning are simply less
likely hypotheses for learners. The model therefore demonstrates how learners might
avoid some logically possible generalizations from data…
The problem for theories of number learning, however, is not eliminating hypotheses that the data directly
disconfirm, such as Piantadosi et al.’s Mod-5 hypothesis. Instead, the difficulty lies in selecting from the
infinitely many hypotheses that have not been disconfirmed. For the simulations in Piantadosi et al., these
would include Mod-11, Mod-12, Mod-13, …. The model can’t decide among these hypotheses because
its list of numerals stops at “ten” and because it has no information about cardinalities greater than ten.
Hypotheses like Mod-11 might seem syntactically complex relative to the CP-knower function. If
so, the model would prefer CP-knower to Mod-11, even without training on sets of eleven, due to the
model’s assignment of higher prior probabilities to simpler hypotheses. But this is not the case. How
simple or complex a function must be to capture Mod-11 depends entirely on the structure of the numeral
list beyond “ten” (which we are assuming is the child’s highest count term). If the list continued, “one,”
“two,”…, “ten,” “one,” “two,”…, “ten,” “one,” “two,” …, “ten,”…, then the CP-knower function would
respond exactly in accord with Mod-11. Since neither children nor the Piantadosi et al. model knows how
the count list continues, syntactic complexity can’t decide between Mod-11 and the standard meanings of
the numerals; that is, it can’t discriminate between (1) and (2), above. (This is a variation of Goodman’s,
1955, famous point about the role of syntactic complexity in induction.)
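A sketch (ours, for illustration) makes this dependence explicit: the same step-by-step CP-knower routine yields the standard meanings or the looping meanings depending solely on the count list it is handed, so nothing in the function itself settles the question:

```python
def cp_knower(set_size, count_list):
    """Advance one step along count_list per unit of cardinality;
    the word reached at the end labels the set."""
    position = 0
    word = count_list[position]
    for _ in range(set_size - 1):
        position += 1
        word = count_list[position]
    return word

standard = [str(i) for i in range(1, 101)]           # "1", "2", ..., "100"
looping = ["one", "two", "three", "four", "five",
           "six", "seven", "eight", "nine", "ten"] * 10  # loops after "ten"

print(cp_knower(11, standard))  # 11
print(cp_knower(11, looping))   # one
```

On sets of ten items or fewer, the two lists produce indistinguishable behavior, which is all the data the learner ever receives.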
The message in our earlier papers was that the bootstrap conclusion does nothing to settle the
question of whether the cardinal meaning of the first few numerals is given by their usual (adult) meaning
or by Mod-11, Mod-12, and so on. The same is true of Piantadosi et al.’s CP-knower function. As Rey
(2011) has pointed out in connection with Carey’s proposal, this issue is closely related to classic poverty-of-the-stimulus arguments for learning natural language (e.g., Chomsky, 1965). Proponents of
bootstrapping could contend that children’s knowledge of the meaning of the numerals suffers from the
same problem that the bootstrap conclusion does. Adults clearly know that (1), and not (2), represents the
correct meaning, but children may distinguish them only at a later point in their number development.
However, this conclusion, if it is true, places a stark limit on how much children learn about the numerals
from the bootstrap’s conclusion.
Piantadosi et al. begin to acknowledge this difficulty in noting that “the present work does not
directly address what may be an equally interesting inductive problem relevant to a full natural number
concept: how children learn that next always yields a new number word” (pp. 211-212). They believe that
“similar methods to those that we use to solve the inductive problem of mapping words to functions could
also be applied to learn that next always maps to a new word. It would be surprising if next mapped to a
new word for 50 examples, but not for the 51st” (ibid.). But this conjecture is not obviously correct: Most
lists that children learn—the alphabet, the months of the year, the notes of the musical scale—don’t have
the structure of the natural numbers. Next for the English alphabet ends at the 26th item, and next for the
sequence of U.S. Presidents currently ends at the 44th.
The crucial difficulty, as we’ve emphasized, is that learning the mapping between the numerals
and the cardinalities for one to ten can’t eliminate nonstandard sequences, such as Mod-11, unless
children can somehow induce the correct structure. The structure could come from the cardinalities for the
positive integers, or it could come from the structure of the numerals for these integers, since these
structures are isomorphic. But it has to come from somewhere. Bootstrapping allows children to exploit
the numeral sequence to determine the labels for cardinalities. But this strategy can’t pick out the right
cardinal meanings—it merely passes the buck—unless the problem about how “next always yields a new
number word” is resolved.
5. Conclusions
On our view, the Piantadosi et al. model doesn’t bootstrap. It therefore doesn’t help vindicate
bootstrapping as a cognitive process. What the model does is form hypotheses by recombining its
primitives and confirming them statistically. So should we conclude that children can learn to enumerate
through this (non-bootstrapping) sort of hypothesis formation and confirmation? Perhaps, although
accepting this conclusion depends on ignoring the facts that (a) the model learns by combining a
preselected set of primitives but gives no account of how they are singled out, (b) it finishes with a
procedure that differs in important ways from children’s, and (c) it has a firmer grasp of the sequence of
cardinalities
than children have. But even if the model is a correct description of how children learn to enumerate, the
model still faces the problem that it leaves an unlimited set of possibilities for the meanings of the first
few count terms.
Acknowledgements
We thank David Barner, Jacob Beck, Jacob Dink, Brian Edwards, Emily Morson, James Negen, Steven
Piantadosi, and Barbara Sarnecka for comments on an earlier draft of this article. IES grant
R305A080341 helped support work on this paper.
References
Barner, D., & Bachrach, A. (2010). Inference and exact numerical representation in early language
development. Cognitive Psychology, 60, 40-62. doi: 10.1016/j.cogpsych.2009.06.002
Beck, J. (submitted for publication). Can bootstrapping explain concept learning?
Bloom, P., & Wynn, K. (1997). Linguistic cues in the acquisition of number words. Journal of Child
Language, 24, 511-533. doi: 10.1017/s0305000997003188
Carey, S. (2004). Bootstrapping and the origin of concepts. Daedalus, 133, 59-68.
Carey, S. (2009). The origin of concepts. New York, NY: Oxford University Press.
Carey, S. (2011). Concept innateness, concept continuity, and bootstrapping. Behavioral and Brain
Sciences, 34, 152-161. doi: 10.1017/S0140525x10003092
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Davidson, K., Eng, K., & Barner, D. (2012). Does learning to count involve a semantic induction?
Cognition, 123, 162-173. doi: 10.1016/j.cognition.2011.12.013
Dehaene, S. (1997). The number sense: How mathematical knowledge is embedded in our brains. New
York: Oxford University Press.
Fodor, J. A. (1975). The language of thought: A philosophical study of cognitive psychology. New York:
Crowell.
Fodor, J. A. (1981). The present status of the innateness controversy. Representations: Philosophical
essays on the foundations of cognitive science (pp. 257-316). Cambridge, MA: MIT Press.
Fodor, J. A. (2010, October 8). Woof, woof [Review of the book The Origin of Concepts, by S. Carey].
Times Literary Supplement, pp. 7-8.
Gallistel, C. R., & Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44,
43-74. doi: 10.1016/0010-0277(92)90050-r
Gelman, R., & Gallistel, C. R. (1978). The child's understanding of number. Cambridge, MA: Harvard
University Press.
Goodman, N. (1955). Fact, fiction and forecast. Cambridge, MA: Harvard University Press.
Jones, M., & Love, B. C. (2011). Bayesian Fundamentalism or Enlightenment? On the explanatory status
and theoretical contributions of Bayesian models of cognition. Behavioral and Brain Sciences,
34, 169-188. doi: 10.1017/s0140525x10003134
Leslie, A. M., Gelman, R., & Gallistel, C. R. (2008). The generative basis of natural number concepts.
Trends in Cognitive Sciences, 12, 213-218. doi: 10.1016/j.tics.2008.03.004
Margolis, E., & Laurence, S. (2008). How to learn the natural numbers: Inductive inference and the
acquisition of number concepts. Cognition, 106, 924-939. doi: 10.1016/j.cognition.2007.03.003
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of
visual information. San Francisco: W.H. Freeman.
Mix, K. S., Huttenlocher, J., & Levine, S. C. (2002). Quantitative development in infancy and early
childhood. New York, NY: Oxford University Press.
Piantadosi, S. T., Tenenbaum, J. B., & Goodman, N. D. (2012). Bootstrapping in a language of thought:
A formal model of numerical concept learning. Cognition, 123, 199-217. doi:
10.1016/j.cognition.2011.11.005
Rey, G. (2011). Learning, expressive power, and mad dog nativism: The poverty of stimuli (and
analogies), yet again. Paper presented at the Society for Philosophy and Psychology, Montreal.
Rips, L. J., Asmuth, J., & Bloomfield, A. (2006). Giving the boot to the bootstrap: How not to learn the
natural numbers. Cognition, 101, B51-B60. doi: 10.1016/j.cognition.2005.12.001
Rips, L. J., Asmuth, J., & Bloomfield, A. (2008). Do children learn the integers by induction? Cognition,
106, 940-951. doi: 10.1016/j.cognition.2007.07.011
Rips, L. J., Bloomfield, A., & Asmuth, J. (2008). From numerical concepts to concepts of number.
Behavioral and Brain Sciences, 31, 623-642. doi: 10.1017/s0140525x08005566
Rips, L. J., & Hespos, S. J. (2011). Rebooting the bootstrap argument: Two puzzles for bootstrap theories
of concept development. Behavioral and Brain Sciences, 34, 145-146.
doi:10.1017/S0140525X10002190
Sarnecka, B. W., Kamenskaya, V. G., Yamana, Y., Ogura, T., & Yudovina, Y. B. (2007). From
grammatical number to exact numbers: Early meanings of 'one', 'two', and 'three' in English,
Russian, and Japanese. Cognitive Psychology, 55, 136-168. doi: 10.1016/j.cogpsych.2006.09.001
Shea, N. (2011). New concepts can be learned. Biology & Philosophy, 26, 129-139. doi:
10.1007/s10539-009-9187-5
Spelke, E. S. (2000). Core knowledge. American Psychologist, 55, 1233-1243. doi:
10.1037/0003-066x.55.11.1233
Spelke, E. S. (2011). Quinean bootstrapping or Fodorian combination? Core and constructed knowledge
of number. Behavioral and Brain Sciences, 34, 149-150.
Wynn, K. (1992). Children's acquisition of the number words and the counting system. Cognitive
Psychology, 24, 220-251. doi: 10.1016/0010-0285(92)90008-p