An Excel Sheet for Inferring Number-Knower-Levels
from Give-N Data
By James Negen, Barbara W. Sarnecka, and Michael Lee
Number-knower-levels indicate what children know about counting
and the cardinal meaning of number words, which is an important
developmental variable. The Give-N methodology, which is used to
diagnose knower-level, has been highly refined – in contrast, the
field’s analysis of Give-N data remains somewhat crude. Here we
work with a model by Lee and Sarnecka (2010), which is a generative
model of how children perform in Give-N, allowing more-principled
inference of knower-level. We present a close approximation of the
model’s inference that can be computed by Microsoft Excel, as well as
a worked implementation and instructions for its usage. This should
give developmental researchers access to sharper inference about
young children’s number-word knowledge.
Preschoolers learn very early to recite the first few natural numbers in order – often as
young as 2 years old (Fuson, 1988) – but don’t immediately know what those words
actually mean. From there, they start to fill in these words with meaning, one at a time
and in order (e.g., Carey, 2009; Sarnecka & Lee, 2009). The child’s progress in this
process is often referred to as her number-knower-level or just knower-level. A child who
does not yet know any number words is called a pre-number-knower or 0-knower. A
child who knows the word “one” alone is a 1-knower; knowing “one” and “two” makes
her a 2-knower, etc. After being a 3-knower or 4-knower, children become CP-knowers,
which means that they know how to use counting to find the cardinality of virtually any
set.
The child’s progress in this developmental timeline has been a useful variable in several
lines of research. For example, Ansari and colleagues (2003) used it to examine deficits
caused by Williams Syndrome; Le Corre and Carey (2007) used it to investigate when
and how the analogue magnitude system becomes linked to the system of number words;
Sarnecka, Kamenskaya, Yamana, Ogura and Yudovina (2007) used it to examine how
cross-linguistic variation can influence number-word learning; Duncan and colleagues
(2007) found it to be one of the most important predictor variables of success in
kindergarten.
The most prevalent task used to infer knower-level is the Give-N task. In its usual form,
children are given a large bowl of small items and told that they are going to play a game
with a puppet. The experimenter asks the child to give a certain number of items to the
puppet, e.g., “Can you give Mr. Bunny TWO bananas?” The requested numbers almost
always include one through four, plus a few more in the range of five to ten. Three trials
per number word is typical.
Wynn (1990; 1992) invented the Give-N task and used it to measure whether children at
certain ages, as a group, knew certain number words. From there, researchers began to
infer the knower-levels of individual children instead of age ranges, either by ad hoc
heuristics (e.g., Sarnecka & Gelman, 2004) or by looking for convergence in a titrating
method (e.g., Barner, Chow & Yang, 2009). However, these previous methods of
inference generally had no principled reason to set the cutoffs where they did. This led to
the development of a formal model by Lee and Sarnecka (2010).
The model itself can be used to guide inference. The problem is that it requires technical
skills that are not prevalent among developmental researchers. Because of this, we
decided to develop a reasonable approximation that only requires the user to interact with
Microsoft Excel, which is a much more comfortable platform for the target audience.
The Present Study
In the remaining portions of the paper, we first describe the model itself in detail (Model
Description), and then describe how we approximate it in the Excel sheet
(Approximation). Next, we describe the Give-N datasets used to calibrate the model’s
parameters and test the quality of the approximation (Methods). We then provide
posterior distributions over the model parameters from the calibration dataset (Results),
compare the model’s inference to the Excel sheet’s inference, as well as to a popular ad hoc
method’s inference (Discussion), and finally provide instructions for the sheet itself (How
to Use the Excel Sheet).
Model Description
The task of creating a more principled method of inferring knower-level from Give-N
data was taken on by Lee and Sarnecka (2010), who created a full generative model of
the Give-N task. In their model, children have a base-rate of how many items they like to
give. This corresponds roughly to what one would expect if the child could be asked to
just give however many they wanted from the available bowl of items. This base-rate is
then modified when the child is asked for a specific number of items. To be specific,
three things happen when the child is asked for X items:
1. If the child knows the word X, it becomes more likely that she will correctly give X
items by a factor v.
2. If the child knows some set of words Θ, all words in Θ that are not X become less
likely by the same factor v (e.g., a 2-knower is unlikely to give 2 when asked for
“three”, even though she is not especially likely to give 3).
3. The probability of all of the words is divided by the sum probability, to make the
probabilities sum to unity again.
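In symbols, writing π(k) for the base-rate probability of giving k items and π′(k) for the updated probability, the three steps amount to

    π′(k) ∝ v · π(k)    if k ∈ Θ and k = X,
    π′(k) ∝ π(k) / v    if k ∈ Θ and k ≠ X,
    π′(k) ∝ π(k)        otherwise,

with the constant of proportionality chosen so that the π′(k) sum to unity (step 3).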
This provides a principled way to predict patterns of data coming from the Give-N task,
sorted by knower-level. The way it works out, a pre-number-knower just gives the
base-rate number of items. A 1-knower almost always gives 1 when asked for “one”, but
rarely gives 1 when asked for anything else. She does, however, sometimes get “two”
right, because it has a high base-rate probability. A 2-knower does much better at “two”,
and also rarely gives 2 for anything other than “two”, etc. A CP-knower has a very good
chance of getting anything that you throw at her correct, as she is able to use counting the
way that adults do (to find the exact cardinality of any set by a point-and-count procedure
through the count list).
Following Lee and Sarnecka (2010), we used a graphical model as our implementation
(Figure 1). Discrete variables are indicated by square nodes, and continuous variables are
indicated by circular nodes. Stochastic variables are indicated by single-bordered nodes,
and deterministic variables (included for conceptual clarity) are indicated by
double-bordered nodes. Finally, encompassing plates denote independent replications
of the graph structure within the model.
In our implementation of the knower-level model in Fig. 1, the data are the observed qij
and gij variables, which give the number asked for (the “question”) and the answer (the
number “given”), respectively, for the ith child on his or her jth question. The base-rate
probabilities are represented by the vector π, which is updated to π’, from which the
number given is sampled. (Thus, the functions defining π’ act as a likelihood function.)
The update occurs using the number asked for, the knower-level zi of the child, and an
evidence value v that measures the strength of the updating. The base rate and evidence
parameters, which are assumed to be the same for all children, are given vague priors
(i.e., ones that allow for a very large range of possible inferences).
The updating rule that defines π’ decomposes into three basic cases. If a number k is
greater than the knower-level zi, then regardless of the number q being
asked for, the updated probability remains proportional to the base-rate probability πk for
that number. If a number k is within the child’s knower-level range zi, it either increases
in probability by a factor of v if it is the number q being asked for, or decreases in
probability by a factor v if it is not. For a child who is a CP-knower, his or her range
encompasses all of the numbers. The final part of the graphical model relates to the
behavior step, with the number of toys given being a draw from the probability
distribution π’ representing the updated beliefs.
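The three cases above can be sketched in a few lines of Python (our illustration of the updating rule, not the WinBUGS implementation itself):

```python
def updated_probs(base_rate, z, q, v):
    """Compute the updated response distribution pi-prime.

    base_rate : list of probabilities over giving 1..len(base_rate) items
    z         : knower-level (0 for a pre-knower; len(base_rate) for a CP-knower)
    q         : the number of items asked for
    v         : the evidence strength
    """
    pi = []
    for k in range(1, len(base_rate) + 1):
        p = base_rate[k - 1]
        if k <= z:                      # a number word the child knows
            p = p * v if k == q else p / v
        pi.append(p)                    # k > z: base rate left unchanged
    total = sum(pi)
    return [p / total for p in pi]      # renormalize to sum to 1
```

For example, for a 2-knower asked for “three”, the probability of giving 2 is suppressed by v while the probability of giving 3 stays near its base rate.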
This model fits the known data well and provides a more principled way to
measure knower-level from Give-N data, but it is not very accessible to many researchers
who might find it useful: many researchers are not familiar with the WinBUGS language,
the (proprietary) program MATLAB that is often used to interface with it,
the general principles of Bayesian statistical inference, or Bayesian cognitive modeling.
As such, it has not seen great use in the field, so we developed an approximation that can
be computed with Excel.
Approximation in the Excel Sheet
What do you need in order to get a posterior distribution of knower-levels for a child?
The first thing is a prior distribution, which can simply be defined by the researcher and
entered into the Excel sheet. The second thing is an expression of the likelihood of each
response, given each knower-level and also given the question. A standard Markov chain
Monte Carlo (MCMC) method treats this matrix of likelihoods as itself having a distribution.
The exact posterior distribution over knower-levels is the probability of the data given a
setting of all the parameters (including knower-level), times the relevant priors, integrated
over the parameter space’s support, times a normalization constant. Unfortunately, Excel
isn’t up to the task of implementing this kind of inference for this model – the integration
step is best approached with MCMC methods.
The calculations in the Excel sheet approximate the posterior knower-level distribution by
treating the likelihood as a single flat set of probabilities instead of as being subject to
distributions. These probabilities can be calculated once and then stored for future
use. They come from sampling out of the distribution of likelihoods, then sampling
responses out of those likelihood samples. The likelihoods come, in turn, from the
posterior likelihoods of a calibration dataset (see Methods and Results for a description
of that process). The approximate posterior likelihood of a child being a given
knower-level is the probability of the data under the new flattened likelihood matrix, times the
relevant priors, times a normalization constant. If there is enough data in the calibration
dataset to make the posterior distribution very tight, then little should be lost in
the approximation. This approximation only requires the program implementing it to do
multiplication, addition, and division – well within the powers of Excel.
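In pseudocode terms, the whole approximation reduces to a product of stored probabilities. A sketch, assuming a precomputed table `likelihood[z][(q, g)]` of the flattened likelihoods (the table name and layout are ours, for illustration):

```python
def approx_posterior(prior, likelihood, trials):
    """Approximate posterior over knower-levels, as computed by the sheet.

    prior      : dict mapping knower-level -> prior weight (need not sum to 1)
    likelihood : dict mapping knower-level -> {(question, given): probability}
    trials     : list of (question, given) pairs for one child
    """
    weights = {}
    for z, p in prior.items():
        w = p
        for trial in trials:
            w *= likelihood[z][trial]   # product of flat likelihoods
        weights[z] = w
    total = sum(weights.values())       # normalization constant
    return {z: w / total for z, w in weights.items()}
```

Because this involves only multiplication, addition, and division, the same arithmetic can be carried out directly in spreadsheet formulas.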
Methods
The data used to tune the model were taken from Negen & Sarnecka (2010). For this
dataset, children were asked to give one, two, three, four, six, and eight items. Each
request was repeated three times in pseudorandom order (total 18 trials). We excluded
sessions where the child did not complete at least 15 of the 18 trials, leaving us with 423
sessions.
The independent data used to compare the model’s inference and the Excel sheet’s inference
came from Lee and Sarnecka (submitted). This dataset has 56 children who were asked
for one, two, three, four, five, eight and ten items, three times each (21 trials total).
Results
The calibration data were run through the model using WinBUGS. Two chains were run,
each with 2,000 burn-in samples and 25,000 data collection samples, for a total of 50,000
points of MCMC data. Chain convergence was good, with the R statistic being very close
to 1 for all of the variables sampled.
The model inferred 9 0-knowers, 48 1-knowers, 50 2-knowers, 53 3-knowers, 67
4-knowers and 196 CP-knowers. This distribution has likely arisen because the children
were drawn from an area of high socio-economic status. The slight paucity of 0-knowers
may be somewhat worrying, but the way the model works, the 0-knowers just give out of
the base-rate – which can be inferred from the other data anyway.
The inferred posterior prediction is shown in Figure 2, broken down by knower-level.
The full numeric breakdown is Sheet 2 of the Excel sheet itself. The model inferred an
evidence value of 16.94 (SD = .69), which means the base rate is modified by a factor of
about 17 by the child’s knowledge of a given number word (the factor v in Model
Description). The inferred base-rate is the same as the posterior predictions for
0-knowers, and is also shown in Figure 3.
Discussion
What advantages does this method of inference have? To see how the inferences made by
the Excel sheet compare with a popular ad hoc heuristic and with normal Bayesian inference
by MCMC, we looked at an independent dataset of 56 children from Lee and Sarnecka
(submitted). The ad hoc heuristic requires a child’s correct answers to outnumber her
errors by at least 2:1 in order for her to get credit for knowing that number word. The child’s
inferred knower-level is then the highest one that she knows. The data were run through
(a) this heuristic, (b) the Excel sheet, and (c) a normal MCMC method, after being
appended to the calibration dataset.
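For concreteness, the heuristic can be sketched as follows (one reading of it; exact conventions for ties and incomplete data vary between labs):

```python
def adhoc_knower_level(trials, number_words=(1, 2, 3, 4)):
    """Infer knower-level by the 2:1 correct-to-error heuristic.

    trials: list of (question, given) pairs. A child gets credit for a
    word if her correct responses to it outnumber her errors by at
    least 2:1; knower-levels are assumed to be cumulative, so the
    inferred level is the highest word with an unbroken run of credit.
    """
    level = 0
    for n in number_words:
        asked = [(q, g) for q, g in trials if q == n]
        correct = sum(1 for q, g in asked if g == q)
        errors = len(asked) - correct
        if asked and correct > 0 and correct >= 2 * errors:
            level = n
        else:
            break  # credit must be cumulative across the count list
    return level
```

Unlike the model-based methods, this returns a single level with no measure of uncertainty.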
Figures 4 and 5 show all of the posterior distributions for every child from each inference
method. The most striking advantage of the Excel sheet over the ad hoc method is that the
posterior distribution is sometimes very diffuse when little evidence has been
accumulated (see #32, #33, #43, and #53 for examples). Thus, the Excel sheet comes with
an intuitive measure of how much is actually known about the child.
There are 38 cases where the Excel sheet agrees (in terms of maximum posterior
likelihood) with the ad hoc heuristic. Of the remaining 18 cases, only 1 has the Excel
sheet’s inference concentrated at a lower knower-level. This is primarily because the ad
hoc heuristic requires the same thing from every knower-level, whereas the model is
more ‘lenient’ on sets that are larger. It’s intuitively appealing that larger sets should be
more difficult to generate, so 3-knowers and up should not be required to attain the same
level of accuracy as 0-knowers, 1-knowers, or 2-knowers.
The Excel sheet also takes into account how many trials have passed without the child
erroneously giving a certain number. So, for instance, every time a child makes an error
when asked for four, and that error is not giving 2, the posterior odds of being a 2-knower
receive a small upward push relative to the odds of being a 0-knower or 1-knower. This is
because the chance that a 0-knower or 1-knower will erroneously give 2 items is much
higher than the chance that a 2-knower will commit the same error. (Both the ad hoc
heuristic and the Excel sheet will do the complement, pushing down the chances of being
a 2-knower whenever the child mistakenly gives 2 items in response to a different
number word.) This allows indirect counter-evidence to accumulate against a child’s
errors when asked for a number word.
The Excel sheet’s inference is very close to the inference generated by normal Bayesian
inference using MCMC. In terms of maximum-posterior knower-levels, there are no
discrepancies. The mean absolute difference between the model’s posterior over
knower-levels and the Excel sheet’s approximate posterior is 0.2%, with a standard deviation of
0.68%. The largest difference is 4.8% (#50), where both inferences have the same basic
shape but the Excel sheet is slightly more peaked at the mode. The larger a dataset gets,
the further it will be able to pull the Excel sheet’s inference away from the MCMC method’s.
How to Use the Excel Sheet
Figure 6 is a screenshot of what the Excel sheet itself looks like. The user enters data in
the rows near the top labeled “question” and “response”. Both of these must be in the
range of 1 to 15. This has two implications: (1) If the child was asked for more than 15
items, some other method of analysis must be used. However, to our knowledge, it is rare
for researchers to ask for more than 10 items. (2) If the child had more or fewer than 15
items to give, there may be some minor problems in how the estimation works. However,
it will probably be all right if the user translates any maximal response to 15. So, if the
child has 10 items available, any time she gives 10, enter 15. (The model is based in part
on the idea that giving all of the items has a high base-rate probability.)
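This translation step can be sketched as follows (the helper name is ours, for illustration; the recoding itself can just as easily be done by hand during data entry):

```python
def recode_responses(responses, items_available, ceiling=15):
    """Recode a child's responses for entry into the sheet.

    Any response equal to the number of items available (i.e., the
    child gave the whole bowl) is translated to the sheet's ceiling
    of 15, since the model treats giving all of the items as a
    special high base-rate response.
    """
    return [ceiling if r == items_available else r for r in responses]
```

So with 10 items available, `recode_responses([1, 3, 10], 10)` enters the response of 10 as 15.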
The Excel sheet is designed to handle all of the data from a single child at a time. In the
question row, the user enters the numbers requested from left to right, with the child’s
responses beneath. (Trials don’t have to stay in order, as long as each response is entered
below the corresponding question.) The sheet will fill in, on its own, the likelihood of each
question/response pair conditional on each knower-level, in rows 6 to 11. It is important
that there are no questions without responses.
“Prior likelihood” is a place where the user can enter prior weights for the different
knower-levels. For most applications, these should stay at a uniform value. Note that they
don’t have to sum to 1 for things to work out correctly. These values might be adjusted,
for example, if a covariate with a known relationship to knower-level has been collected.
The end result is a set of relative likelihoods for the six knower-levels, along with a graph
to visualize them. This should allow the user to see which knower-level is preferred and
how strongly. Usually, just taking the one with maximum likelihood will be sufficient; if
there is scarce Give-N data available, the user may consider using the likelihood
distribution as a set of weights for further analysis. Also provided are the log-likelihood
and scaled log-likelihood of the data, as they may be more familiar to some researchers.
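For readers who want the arithmetic spelled out, the reported quantities relate to the relative likelihoods roughly as follows (a sketch; the sheet's exact scaling convention for the log-likelihood may differ):

```python
import math

def summarize(relative_likelihoods):
    """Turn relative likelihoods into normalized and log-scale summaries."""
    total = sum(relative_likelihoods)
    normalized = [w / total for w in relative_likelihoods]  # sums to 1
    logs = [math.log(w) for w in relative_likelihoods]      # log-likelihoods
    scaled = [x - max(logs) for x in logs]                  # 0 at the mode
    return normalized, scaled
```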
In the example in Figure 6, the child was asked for one, two, three, four and five. She
gave, respectively, 1, 2, 3, 3, and 6. These are entered in rows 3 and 4. The prior
likelihood for each knower-level is the same (cells L15 to L20). This leads to a
confidence of about 60% that this child is a 2-knower (seen in the graph and under
Normalized Likelihood in cell U14).
The Excel sheet can be downloaded at …
Figure 1. A graphical representation of the model.
Figure 2. The likelihood of responses, organized by items requested and knower-level.
Darker squares indicate higher probability. Blue dots are actual data.
Figure 3. Inferred parameters of the model: the base-rate (what the child might give if no
number word is used) and the evidence strength (the v in the Model Description; a
parameter controlling how much the probability of different responses gets modified by
the child’s knowledge of number words).
Figure 4. Inferred knower-levels of the first 28 children from Lee and Sarnecka
(submitted). The blue bars come from normal Bayesian inference using MCMC. The
green bars come from the Excel sheet. The red bars come from the ad hoc heuristic.
Figure 5. Inferred knower-levels of the remaining 28 children from Lee and Sarnecka
(submitted). The blue bars come from normal Bayesian inference using MCMC. The
green bars come from the Excel sheet. The red bars come from the ad hoc heuristic.
Figure 6. A screenshot of the actual Excel sheet, with some example data filled in. There
is more room for data entry off to the right.
References
Ansari, D., Donlan, C., Thomas, M. S. C., Ewing, S. A., Peen, T., & Karmiloff-Smith, A.
(2003). What makes counting count? Verbal and visuo-spatial contributions to typical
and atypical counting development. Journal of Experimental Child Psychology, 85,
50-62.
Barner, D., Chow, K. & Yang, S. (2009). Finding one’s meaning: A test of the relation
between quantifiers and integers in language development. Cognitive Psychology,
58(2).
Carey, S. (2009). The origin of concepts. New York: Oxford University Press.
Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P.,
Pagani, L. S., Feinstein, L., Engel, M., Brooks-Gunn, J., Sexton, H., Duckworth, K.,
& Japel, C. (2007). School readiness and later achievement. Developmental
Psychology, 43(6), 1428-1446.
Fuson, K. C. (1988). Children's counting and concepts of number. New York: Springer-Verlag.
Le Corre, M., & Carey, S. (2007). One, two, three, four, nothing more: An investigation
of the conceptual sources of the verbal counting principles. Cognition, 105, 395-438.
Lee, M.D., & Sarnecka, B.W. (2010). A model of knower-level behavior in
number-concept development. Cognitive Science, 34, 51-67.
Lee, M.D., & Sarnecka, B.W. (submitted). Number knower-levels in young children:
Insights from a Bayesian model.
Sarnecka, B. W., & Gelman, S. A. (2004). Six does not just mean a lot: Preschoolers see
number words as specific. Cognition, 92, 329-352.
Sarnecka, B.W., Kamenskaya, V. G., Yamana, Y., Ogura, T., & Yudovina, J.B. (2007).
From grammatical number to exact numbers: Early meanings of “one,” “two,” and
“three” in English, Russian, and Japanese. Cognitive Psychology, 55, 136-168.
Sarnecka, B. W., & Lee, M. D. (2009). Levels of number knowledge in early childhood.
Journal of Experimental Child Psychology, 103(3), 325-337.
Wynn, K. (1990). Children’s understanding of counting. Cognition, 36(2), 155-193.
Wynn, K. (1992). Children’s acquisition of number words and the counting system.
Cognitive Psychology, 24(2), 220-251.