APPLYING NBDA

We provide a computer script in R (NBDA_example_analyses.txt) and two example files (example_network_data.txt and example_diffusion_data.txt) so that others can apply NBDA to their own data. In the following we describe how to use this script. If you have problems applying NBDA or further questions, please contact the corresponding author.

(a) To those inexperienced with R
R is free software that can be downloaded from http://www.r-project.org/. It is command-line driven. One of the files included in the electronic supplementary material (NBDA_example_analyses.txt) contains the R code to conduct NBDA. You can open this file using any text editor (e.g. Word, Notepad etc.). From there you can copy the code into R, which will then run it.

(b) Data preparation
NBDA requires as input the structure of a social network and the times at which the trait was acquired by individuals in the network (see section ‘load data files’ in file NBDA_example_analyses.txt). The network structure must be given as a matrix. Each individual in the group is represented by one column and one row, and the values in the off-diagonal cells of this matrix equal the strength of the connection between two individuals. Zeros are entered along the diagonal. The order of the individuals should be the same for the rows and the columns, and values used for connection strength should be non-negative numbers.

It is possible to use a non-symmetric matrix as input to NBDA. In this case it is assumed that the probability with which individual A learns from B can differ from the probability with which B learns from A. The network data must be entered such that each column contains all values that affect the probability that the individual associated with that column learns from the individuals in the rows (with a value of zero where the row is equal to the column).
For example, assume that all connection values associated with individual A are written in column and row one, and that all connection values associated with individual B are written in column and row two. The value that determines the probability that A will learn from B (provided that B is skilled) is written in column one, row two. The value that determines the probability that B will learn from A (when A is skilled) is written in column two, row one.

The times at which the individuals acquired the new trait should be coded in time steps that have been standardized to zero, with the first individual(s) acquiring the new trait at time step zero and all values scored as integers. These values should be written in a vector in which the order of the corresponding individuals equals the order used in the matrix that represents the social network structure. Examples of the required data formats can be found in files example_network_data.txt and example_diffusion_data.txt, which are included in the electronic supplementary material.

Coding the observed diffusion data into discrete time steps requires choosing the length of a time step, e.g. one hour, one day, one month, or whatever is most appropriate for the data at hand. This choice should depend on the uncertainties that are associated with the observation conditions and the applied observation method. The length of a time step should be large enough to ensure a high certainty that every learning event actually occurred in the coded time step. In general it can be expected that the power to distinguish between social and asocial learning decreases as the number of individuals that learned in the same time step increases. If possible, the length of a time step should therefore be kept short enough to avoid such cases.
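The data layout described above can be sketched in R as follows. This is only an illustration: the individuals A, B and C, all numeric values, the object names (net, raw_hours, step_length, acq_time) and the use of floor() for binning are assumptions made for this example and are not taken from the provided script or example files.

```r
# Hypothetical three-individual example; names and values are illustrative only.
ids <- c("A", "B", "C")

# Rows hold the potential demonstrators, columns the potential learners.
net <- matrix(0, nrow = 3, ncol = 3, dimnames = list(ids, ids))
net["B", "A"] <- 0.8  # row two, column one: affects how likely A learns from B
net["A", "B"] <- 0.2  # row one, column two: affects how likely B learns from A
net["C", "A"] <- 0.5
net["A", "C"] <- 0.5

# Checks that mirror the format requirements stated in the text.
stopifnot(
  nrow(net) == ncol(net),                   # square matrix
  identical(rownames(net), colnames(net)),  # same order in rows and columns
  all(diag(net) == 0),                      # zeros along the diagonal
  all(net >= 0)                             # non-negative connection strengths
)

# Hypothetical raw acquisition times in hours, in the same order as ids.
raw_hours <- c(A = 30, B = 75, C = 41)

# Assumed time-step length of 24 hours (one day).
step_length <- 24

# Bin into integer time steps, standardized so the first learner is at step 0.
acq_time <- as.integer(floor((raw_hours - min(raw_hours)) / step_length))
names(acq_time) <- ids

# Number of individuals that learned in each time step.
ties <- table(acq_time)
print(acq_time)
print(ties)
```

With these made-up values, A and C fall into the same time step. Changing step_length and re-running the binning shows how the number of such ties depends on the chosen time-step length.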
Preliminary analyses showed that the results of NBDA did not depend on the specific choice of the length of a time step, as long as the length was not very large. However, this could vary among datasets, and we therefore suggest that users repeat NBDA with slightly different time-step lengths to confirm that the results do not depend on the specific choice of this parameter.

For cases in which the new behaviour did not spread through the entire group, it is necessary to also assign individuals who did not acquire the trait an ‘artificial’ time step in which they learned. The assigned time step must be larger than any time step in which an individual was observed to acquire the new behaviour (a suitable value would be, for instance, one plus the largest observed time step). These data are not actually used in the calculations (see next section).

(c) Changes required to be made in the code
Some simple changes to the code will be required before running the analysis. The first are changes in the section ‘Load data files’, in which R looks for the file to be opened. Here you have to specify the path and name of the file to be opened (note that path components are separated by a forward slash (“/”) rather than a backslash (“\”) as is usual in Windows).

Further changes might be required in two parameters that are set in section ‘Set parameters’. The first parameter is tau_max, which determines the largest possible value of tau that is considered during maximum likelihood fitting of the social learning model (the smallest possible value is always zero, and for the asocial learning model the smallest and largest possible values for the asocial learning rate are always zero and one). Higher values of tau_max might be necessary if the mean connection strength is smaller than in the example files. In particular it is advised to increase tau_max if the estimated value for tau is close to tau_max.
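The check on tau_max can be illustrated with a small, self-contained R sketch. The likelihood of the social learning model is defined in the provided script and is not reproduced here, so the function toy_neg_loglik below is a purely hypothetical stand-in (a curve with its minimum at tau = 7), and the helper fit_tau, the threshold near and the use of optimize() are likewise assumptions of this example and not necessarily what the script does.

```r
# Hypothetical stand-in for the negative log-likelihood of the social
# learning model; the real function is defined in the provided script.
# This toy curve has its minimum at tau = 7.
toy_neg_loglik <- function(tau) (tau - 7)^2

# Fit tau on [0, tau_max] and flag estimates that sit at the upper bound.
fit_tau <- function(tau_max, near = 0.99) {
  fit <- optimize(toy_neg_loglik, interval = c(0, tau_max))
  data.frame(
    tau_max  = tau_max,
    tau_hat  = fit$minimum,
    at_bound = fit$minimum > near * tau_max  # TRUE suggests tau_max is too small
  )
}

# Repeat the fit for several candidate values of tau_max.
results <- do.call(rbind, lapply(c(5, 10, 20), fit_tau))
print(results)
```

In this toy case the estimate is pinned to the upper bound when tau_max is 5 and settles near 7 once tau_max is large enough, which is the pattern that would suggest increasing tau_max in a real analysis.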
In any case the analysis should be repeated with different values of tau_max to ensure that the results are not affected by choosing values of tau_max that are too small (which might lead to the erroneous conclusion that the asocial learning model fits better).

If a new behaviour did not spread through the entire group it is necessary to manually set the parameter time_max. This parameter specifies the upper limit of the time frame that is used to calculate model likelihoods. By default this parameter is set to the maximum value in the vector of learning times. In cases in which the new behaviour did not spread through the entire group, this maximum value will not correspond to the maximum time step in which an individual acquired the new behaviour, because individuals that never acquired the behaviour were assigned a larger, artificial time step (as described in the previous section). Therefore, the value of time_max has to be set to the truly observed maximum time step in which an individual acquired the new behaviour.

(d) Results of NBDA
A successful application of NBDA returns a table in which AIC, Akaike weights and estimated parameter values are reported for both models. Based on AIC values and/or Akaike weights, it is possible to identify which learning mechanism best explains the observed data (for further information on model selection see Burnham & Anderson 2002).

(e) Extended version of NBDA
We also provide an R script in which an extended version of NBDA is implemented (NBDA_example_analyses.txt section ‘Extended network-based diffusion analysis’). In this analysis a model of pure asocial learning and a model of social and asocial learning are fitted to the observed diffusion data. Fitting the model of social and asocial learning requires an additional optimization procedure. The new procedure does not require specifying a parameter tau_max, but initial values for tau (tau_i) and the asocial learning rate (alr_i) are needed.
The user is advised to check the robustness of the results to changes in these initial values.

As noted in the main paper, this extended NBDA is able to analyze data that include time periods prior to the first occurrence of a new trait. When using such data, the time steps should not be standardized to zero. Instead, the time step with value zero should correspond to the first time interval of the observation period, and the time step at which the first individual(s) acquired the new trait can be any integer equal to or larger than zero.
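The difference between the two ways of coding time can be made concrete with a short R sketch. All values and object names (obs_start_day, raw_day, acq_time_std, acq_time_ext) are hypothetical and chosen only for illustration; the time-step length is assumed to be one day.

```r
# Hypothetical observation period starting on day 10, with acquisitions
# recorded on days 12, 15 and 12 (one value per individual).
obs_start_day <- 10
raw_day <- c(12, 15, 12)

# Standard NBDA: standardize so the first acquisition is at time step zero.
acq_time_std <- as.integer(raw_day - min(raw_day))

# Extended NBDA: time step zero is the first interval of the observation
# period, so the first acquisition may fall in any step >= 0.
acq_time_ext <- as.integer(raw_day - obs_start_day)

stopifnot(all(acq_time_ext >= 0))  # no acquisition before observation began
print(rbind(standard = acq_time_std, extended = acq_time_ext))
```

Here the first acquisition falls in time step two under the extended coding, so the two time steps before it, in which no individual acquired the trait, remain part of the data.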