applying nbda - Proceedings of the Royal Society B

APPLYING NBDA
We provide an R script (NBDA_example_analyses.txt) and two example files (example_network_data.txt and example_diffusion_data.txt) so that others can apply NBDA to their own data. The following sections describe how to use this script. If you have problems applying NBDA or further questions, please contact the corresponding author.

(a) To those inexperienced with R
R is free software that can be downloaded from http://www.r-project.org/. It is command-line driven. One of the files included in the electronic supplementary material (NBDA_example_analyses.txt) contains the R code to conduct NBDA. You can open this file using any text editor or word processor (e.g. Notepad, Word etc.). From there you can copy the code into R, which will then run it.

(b) Data preparation
NBDA requires as input the structure of a social network and the time at which each individual in the network acquired the trait (see the section ‘load data files’ in NBDA_example_analyses.txt). The network structure must be given as a matrix. Each individual in the group is represented by one row and one column, and the value in each off-diagonal cell equals the strength of the connection between the two corresponding individuals. Zeros are entered along the diagonal. The order of the individuals must be the same for the rows and the columns, and connection strengths must be non-negative numbers.
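As a minimal sketch of these requirements, the following R code builds a small network matrix and checks it. The individuals, the connection strengths and the helper check_network() are hypothetical and are not part of NBDA_example_analyses.txt.

```r
# Hypothetical connection strengths for three individuals (A, B, C).
ids <- c("A", "B", "C")
network <- matrix(
  c(0.0, 0.5, 0.2,
    0.5, 0.0, 0.8,
    0.2, 0.8, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(ids, ids)
)

# Hypothetical helper: TRUE if the matrix meets the requirements stated above.
check_network <- function(m) {
  is.matrix(m) &&
    is.numeric(m) &&
    nrow(m) == ncol(m) &&          # one row and one column per individual
    !anyNA(m) &&
    all(diag(m) == 0) &&           # zeros along the diagonal
    all(m >= 0) &&                 # non-negative connection strengths
    (is.null(rownames(m)) || identical(rownames(m), colnames(m)))  # same order
}

check_network(network)
```

Running such a check before the analysis may help to catch a misordered or incomplete matrix early.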
It is possible to use a non-symmetric matrix as input to NBDA. In this case it is assumed that the probability with which individual A learns from B can differ from the probability with which B learns from A. The network data must be entered such that each column contains all values that affect the probability that the individual associated with that column learns from the individuals in the rows (with a value of zero where the row equals the column). For example, assume that all connection values associated with individual A are written in column one and row one, and that all connection values associated with individual B are written in column two and row two. The value that determines the probability that A will learn from B (provided that B is skilled) is written in column one, row two. The value that determines the probability that B will learn from A (provided that A is skilled) is written in column two, row one.
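The orientation described above can be illustrated with a two-individual example. The values and the helper learning_weight() are hypothetical; only the row and column convention is taken from the text.

```r
ids <- c("A", "B")
network <- matrix(0, nrow = 2, ncol = 2, dimnames = list(ids, ids))

# R indexes matrices as [row, column]: rows hold the individual that is
# learned from, columns hold the learner.
network["B", "A"] <- 0.9  # row two, column one: A learns from B
network["A", "B"] <- 0.1  # row one, column two: B learns from A

# Hypothetical helper that makes the lookup direction explicit.
learning_weight <- function(m, learner, learned_from) m[learned_from, learner]

learning_weight(network, learner = "A", learned_from = "B")
isSymmetric(network)
```

Here the two off-diagonal values differ, so the matrix is non-symmetric.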
The times at which the individuals acquired the new trait should be coded as integer time steps that have been standardized to zero, so that the first individual(s) acquire the new trait at time step zero. These values should be written in a vector in which the order of the individuals equals the order used in the matrix that represents the social network structure. Examples of the required data formats can be found in the files example_network_data.txt and example_diffusion_data.txt, which are included in the electronic supplementary material.
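A minimal sketch of the standardization, assuming the raw data are already integer time steps; the values and names below are hypothetical.

```r
# Hypothetical raw time steps at which each individual acquired the trait,
# listed in the same order as the rows and columns of the network matrix.
network_ids <- c("A", "B", "C", "D")
raw_steps <- c(A = 12L, B = 15L, C = 12L, D = 20L)

# Standardize so that the first learner(s) are at time step zero.
diffusion <- raw_steps - min(raw_steps)

# Basic checks on the coded vector.
stopifnot(is.integer(diffusion))
stopifnot(min(diffusion) == 0L)
stopifnot(identical(names(diffusion), network_ids))

diffusion
```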
Coding the observed diffusion data into discrete time steps requires choosing the length of a time step, e.g. one hour, one day, one month, or whatever is most appropriate for the data at hand. This choice should depend on the uncertainties associated with the observation conditions and the observation method applied. A time step should be long enough to ensure a high certainty that every learning event actually occurred in the time step to which it was coded. In general, the power to distinguish between social and asocial learning can be expected to decrease as the number of individuals that learned in the same time step increases. If possible, the length of a time step should therefore be kept short enough to avoid such cases. Preliminary analyses showed that the results of NBDA did not depend on the specific choice of the length of a time step, as long as the length was not very large. However, this could vary among datasets, and we therefore suggest that users repeat NBDA with slightly different time step lengths to confirm that the results do not depend on the specific choice of this parameter.
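One possible way to code the same observations with different time step lengths is sketched below. The observation times and the helper code_time_steps() are hypothetical, and the binning rule (rounding down) is an assumption rather than a prescription of the script.

```r
# Hypothetical acquisition times in hours since the start of the study.
obs_hours <- c(A = 30, B = 52, C = 61, D = 101)

# Hypothetical helper: bin continuous times into integer time steps of a given
# length and standardize so that the first learner is at time step zero.
code_time_steps <- function(times, step_length) {
  steps <- as.integer(floor(times / step_length))
  names(steps) <- names(times)
  steps - min(steps)
}

half_day <- code_time_steps(obs_hours, 12)  # 0 2 3 6
one_day  <- code_time_steps(obs_hours, 24)  # 0 1 1 3

# Count individuals that share a time step with an earlier-listed individual.
sum(duplicated(half_day))
sum(duplicated(one_day))
```

In this example the one-day coding places B and C in the same time step, whereas the half-day coding keeps them apart. Each coded vector could then be passed to NBDA in turn to compare the results.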
If the new behaviour did not spread through the entire group, individuals who did not acquire the trait must also be assigned an ‘artificial’ time step at which they learned. The assigned time step must be larger than any time step in which an individual was observed to acquire the new behaviour (a suitable value would be, for instance, one plus the largest observed time step). These data are not actually used in the calculations (see next section).
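A minimal sketch of this step, assuming that individuals that never acquired the behaviour are initially recorded as NA (this recording convention and the values are hypothetical):

```r
# Hypothetical coded time steps; NA marks individuals that were never
# observed to acquire the behaviour.
observed <- c(A = 0L, B = 2L, C = NA, D = 5L, E = NA)

# Largest time step in which learning was actually observed.
last_observed <- max(observed, na.rm = TRUE)

# Assign the artificial time step: one plus the largest observed time step.
diffusion <- observed
diffusion[is.na(diffusion)] <- last_observed + 1L

diffusion
```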

(c) Changes required to be made in the code
Some simple changes to the code are required before running the analysis. The first changes are in the section ‘Load data files’, in which R looks for the files to be opened. Here you have to specify the path and name of each file to be opened (note that the parts of a path are separated by a slash (“/”) rather than the backslash (“\”) that is usual in Windows).
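The following sketch shows one way such files could be read. It assumes whitespace-separated files without a header row, which may differ from the format of the provided example files; to keep the sketch runnable anywhere, it first writes small stand-in files with made-up content to a temporary directory.

```r
# Stand-in files with made-up content (NOT the provided example files).
data_dir <- tempdir()
network_path   <- file.path(data_dir, "example_network_data.txt")
diffusion_path <- file.path(data_dir, "example_diffusion_data.txt")
write.table(matrix(c(0, 1, 1, 0), nrow = 2), network_path,
            row.names = FALSE, col.names = FALSE)
write.table(c(0, 3), diffusion_path, row.names = FALSE, col.names = FALSE)

# With your own data, give the path with forward slashes, for example
# "C:/my_folder/my_network.txt" (a hypothetical path).
network   <- unname(as.matrix(read.table(network_path, header = FALSE)))
diffusion <- scan(diffusion_path, quiet = TRUE)

dim(network)
diffusion
```

file.path() joins the parts of a path with a forward slash, so it can be used to avoid typing the separators by hand.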
Further changes might be required to two parameters that are set in the section ‘Set parameters’. The first parameter is tau_max, which determines the largest value of tau that is considered during maximum likelihood fitting of the social learning model (the smallest possible value is always zero, and for the asocial learning model the smallest and largest possible values for the asocial learning rate are always zero and one). Higher values of tau_max might be necessary if the mean connection strength is smaller than in the example files. In particular, it is advisable to increase tau_max if the estimated value of tau is close to tau_max. In any case, the analysis should be repeated with different values of tau_max to ensure that the results are not affected by a value of tau_max that is too small (which might lead to the erroneous conclusion that the asocial learning model fits better).
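The check described above can be sketched with a toy objective. The function toy_nll() below is only a stand-in with its minimum at tau = 3 and is NOT the NBDA likelihood; fit_tau() and at_upper_bound() are hypothetical helpers, and the 1% closeness threshold is an arbitrary assumption.

```r
# Toy negative log-likelihood with its minimum at tau = 3 (stand-in only).
toy_nll <- function(tau) (tau - 3)^2

# Hypothetical helper: minimize over the interval from zero to tau_max.
fit_tau <- function(nll, tau_max) {
  optimize(nll, interval = c(0, tau_max))$minimum
}

# Hypothetical helper: is the estimate within 1% of the upper limit?
at_upper_bound <- function(tau_hat, tau_max, tol = 0.01) {
  tau_hat > (1 - tol) * tau_max
}

# Repeat the fit with increasing tau_max and flag estimates at the limit.
for (tau_max in c(1, 2, 5, 10)) {
  tau_hat <- fit_tau(toy_nll, tau_max)
  cat(sprintf("tau_max = %4.1f  tau_hat = %.3f  at limit: %s\n",
              tau_max, tau_hat, at_upper_bound(tau_hat, tau_max)))
}
```

With this toy objective, the estimate sits at the limit for tau_max = 1 and 2 and stabilizes once tau_max exceeds 3, which is the pattern to look for when repeating the real analysis.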
If the new behaviour did not spread through the entire group, the parameter time_max must be set manually. This parameter specifies the upper limit of the time frame that is used to calculate model likelihoods. By default, it is set to the maximum value in the vector of learning times. If the new behaviour did not spread through the entire group, this maximum is the artificial time step assigned to individuals that did not learn (as described in the previous section) and not the last time step in which an individual was actually observed to acquire the new behaviour. Therefore, time_max has to be set to the largest time step in which an individual was actually observed to acquire the new behaviour.
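A minimal sketch of the difference between the default and the manually set value. The data and the indicator vector learned are hypothetical; only the parameter name time_max is taken from the script.

```r
# Hypothetical coded learning times; the value 6 is the artificial time step
# assigned to the two individuals that never acquired the behaviour.
diffusion <- c(0L, 2L, 6L, 5L, 6L)

# Hypothetical indicator of which individuals were observed to learn.
learned <- c(TRUE, TRUE, FALSE, TRUE, FALSE)

# Default described in the text: maximum of the learning-time vector.
default_time_max <- max(diffusion)       # 6, includes the artificial step

# Manual setting: largest time step in which learning was actually observed.
time_max <- max(diffusion[learned])      # 5
```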

(d) Results of NBDA
A successful application of NBDA returns a table that reports AIC values, Akaike weights and estimated parameter values for both models. Based on the AIC values and/or Akaike weights, it is possible to identify which learning mechanism best explains the observed data (for further information on model selection see Burnham & Anderson 2002).
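For readers unfamiliar with Akaike weights, the standard calculation from a set of AIC values is sketched below. The AIC values are made up for illustration, and akaike_weights() is a hypothetical helper rather than a function of the provided script.

```r
# Standard Akaike weights: relative likelihoods exp(-delta / 2), normalized
# to sum to one, where delta is the AIC difference to the best model.
akaike_weights <- function(aic) {
  delta <- aic - min(aic)
  rel_lik <- exp(-0.5 * delta)
  rel_lik / sum(rel_lik)
}

# Made-up AIC values for the two models.
aic <- c(social = 100, asocial = 104)
w <- akaike_weights(aic)

round(w, 3)
names(w)[which.max(w)]   # model with the most support
```

The model with the lowest AIC receives the largest weight; the weights can be read as the relative support for each model within the set considered.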

(e) Extended version of NBDA
We also provide R code that implements an extended version of NBDA (see the section ‘Extended network-based diffusion analysis’ in NBDA_example_analyses.txt). In this analysis a model of pure asocial learning and a model of combined social and asocial learning are fitted to the observed diffusion data. Fitting the model of social and asocial learning requires an additional optimization procedure. This procedure does not require a value for tau_max, but initial values for tau (tau_i) and the asocial learning rate (alr_i) are needed. The user is advised to check the robustness of the results to changes in these initial values.
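One way to check robustness to the initial values is to repeat the fit from a grid of starting points and compare the estimates. The sketch below uses a toy objective with its minimum at tau = 2 and an asocial learning rate of 0.3; it is NOT the NBDA likelihood, and the choice of optim() with the "L-BFGS-B" method and the bounds shown are assumptions, not a description of the provided script.

```r
# Toy negative log-likelihood (stand-in only), minimum at tau = 2, alr = 0.3.
toy_nll <- function(par) {
  tau <- par[1]
  alr <- par[2]
  (tau - 2)^2 + 10 * (alr - 0.3)^2
}

# Grid of initial values for tau (tau_i) and the asocial learning rate (alr_i).
starts <- expand.grid(tau_i = c(0.1, 1, 5), alr_i = c(0.05, 0.5, 0.9))

# Fit once per starting point; tau >= 0 and 0 <= alr <= 1 are assumed bounds.
fits <- t(apply(starts, 1, function(s) {
  fit <- optim(par = as.numeric(s), fn = toy_nll, method = "L-BFGS-B",
               lower = c(0, 0), upper = c(Inf, 1))
  c(tau = fit$par[1], alr = fit$par[2], nll = fit$value)
}))

# Range of the estimates across starting points; narrow ranges suggest
# that the result does not depend on the initial values.
apply(fits[, c("tau", "alr")], 2, range)
```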
As noted in the main paper, this extended NBDA is able to analyze data that include time periods prior to the first occurrence of a new trait. When such data are used, the time steps should not be standardized to zero. Instead, time step zero should correspond to the first time interval of the observation period, and the time step at which the first individual(s) acquired the new trait can be any integer equal to or larger than zero.
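The difference between the two codings can be sketched as follows, assuming a time step of one day; the days and names are hypothetical.

```r
# Hypothetical data: observation began on day 100, and the trait was first
# acquired on day 104.
obs_start_day <- 100L
acquired_day  <- c(A = 104L, B = 107L, C = 104L)

# Coding for the extended NBDA: time step zero is the first interval of the
# observation period, so the first acquisition can be later than zero.
diffusion_extended <- acquired_day - obs_start_day       # 4 7 4

# Coding for the basic NBDA: standardized so the first learner is at zero.
diffusion_standard <- acquired_day - min(acquired_day)   # 0 3 0
```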