Electronic Supplementary Material for
The key mimetic features of hoverflies through avian eyes
by Roderick S. Bain, Arash Rashed, Verity J. Cowper, Francis S. Gilbert, Thomas N. Sherratt

Appendix A. Species List for Images in the Set of 206 (see Dittrich et al. (1993))

Image Type                      Number   Code   Abbreviation
wasp*                               97      1   Wasp
Syrphus ribesii                      7      2   S. ri
Temnostoma vespiforme†              10      3   T. ve
Chrysotoxum cautum                  10      4   C. ca
Helophilus pendulus                  1      5   H. pe
Epistrophe grossulariae              5      6   E. gr
Xanthogramma pedissequum             4      7   X. pe
Chrysotoxum bicinctum                3      8   C. bi
Sphecomyia vespiformis†              6      9   S. ve
Volucella zonaria                    6     10   V. zo
Scaeva pyrastri                      5     11   S. py
Ischyrosyrphus glaucius†,‡           4     12   I. gl
non-mimetic fly**                   48     13   NM Fly

*  Wasps were a mixture of Vespula vulgaris and Vespula rufa.
** Non-mimetic flies were a mixture of Dipteran species, including Tabanus spp., Tachina spp., Sarcophaga spp. and Scatophaga spp., among others.
†  Three T. ve, four S. ve and one I. gl were classed as ‘cream’ for stripe/patch colour. No wasps or non-mimetic flies had this colour, so the neural network could not be trained on it, and all predictions for these images were based on a novel colour using the default mechanism in nnet.
‡  One non-mimetic fly and three I. gl images had the colour ‘light grey’. If that non-mimetic fly happened to be in the training data, predictions for the three I. gl specimens were based on a colour seen in training; otherwise, predictions for all four specimens were based on a novel colour.

The 12 images that were cream or light grey were deleted when the models were rerun without requiring predictions for novel colours. In this case, the R² value for the Wasp+ data was 0.89 (df = 10, P < 0.0001), and for the Fly+ data it was 0.85 (df = 10, P < 0.0001). The colours mentioned here are those taken from the images as seen by the pigeons, not the actual specimen colours, and they reflect the influence of lighting when the photographs were taken.

Appendix B.
Development of a Reverse-Engineered Predator Model

The software for the REP analysis was implemented in R (R Development Core Team 2004). The fully-connected, feedforward neural network nnet (Venables & Ripley 2002; Ripley 1996) was used as the classifier. Three tuning parameters for nnet were varied by the genetic algorithm (GA) described below: the number of nodes in the hidden layer (0 to 15, i.e. 0 to 1111 in base 2), whether or not skip-layer weights were used (No = 0 or Yes = 1), and the decay parameter (one of eight values: 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3).

The 80 values for each numerical predictor variable in each of the ten W-NMF-i training sets were translated and scaled separately so that all values for each predictor variable were in the interval [-1, 1]. The same translation and scaling were applied to the 163 values in the corresponding W-NMF-H-i test set.

The probability-to-peck-rate conversion function was

$$\text{peck rate} = \theta_1 + (\theta_2 - \theta_1)\,\frac{p^{\theta_3} + 1 - (1 - p)^{\theta_4}}{2}, \qquad \theta_3, \theta_4 > 0 \tag{B.1}$$

where p is the probability of being the target species and θ₁, θ₂, θ₃ and θ₄ are the parameters of the conversion function. For θ₁ and θ₂, 256 values (0, 0.25, 0.50, ..., 63.75) were available, and for θ₃ and θ₄, a list of 128 values was available. The GA had four indices encoded (1 to 256 for each of θ₁ and θ₂, and 1 to 128 for each of θ₃ and θ₄) to select the parameters from the lists; e.g. if the index encoded for θ₁ was 3, then θ₁ would be the third entry in the list, namely 0.50. The general shapes permitted by the conversion function are shown in Figure B.1.

Figure B.1 General Shapes Of Conversion Function Permitted (θ₁ = 4, θ₂ = 50)

The rationale was to allow concave or convex functions (B.1a and B.1b) and to allow rapid change in predicted peck rate at the extremes, as the probability of being the target species predicted by the neural network approached zero or one (B.1c and B.1d), as was the case for the Wasp+ data in Figure 1a.

For the genetic algorithm, 200 chromosomes were used.
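The predictor scaling, the index lookup for the first two conversion parameters, and the conversion function described above can be sketched as follows. This is a minimal illustration in Python, not the original R code. It assumes that the conversion function (B.1) has the form peck rate = θ₁ + (θ₂ − θ₁)[p^θ₃ + 1 − (1 − p)^θ₄]/2, that the parameters are named θ₁ to θ₄, and that the scaling maps the training minimum to -1 and the training maximum to 1. The function names are hypothetical. The list of 128 values for θ₃ and θ₄ is not given in the text, so those two parameters are passed in directly.

```python
# Candidate values for theta1 and theta2 as listed in the text: 0, 0.25, ..., 63.75
THETA12_VALUES = [0.25 * k for k in range(256)]


def lookup_theta12(index):
    """Return the list entry for a 1-based index (1 to 256) encoded by the GA."""
    if not 1 <= index <= 256:
        raise ValueError("index must be between 1 and 256")
    return THETA12_VALUES[index - 1]


def fit_scaler(train_values):
    """Build a translate-and-scale function from the training values of one predictor.

    Assumption: the training minimum maps to -1 and the training maximum to 1.
    The same function is then applied to the test values, which may therefore
    fall slightly outside [-1, 1].
    """
    lo, hi = min(train_values), max(train_values)
    span = (hi - lo) or 1.0  # guard against a constant predictor
    return lambda x: 2.0 * (x - lo) / span - 1.0


def peck_rate(p, theta1, theta2, theta3, theta4):
    """Convert a predicted probability p into a peck rate (assumed form of B.1)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    shape = (p ** theta3 + 1.0 - (1.0 - p) ** theta4) / 2.0
    return theta1 + (theta2 - theta1) * shape


if __name__ == "__main__":
    scale = fit_scaler([0.0, 5.0, 10.0])
    print(scale(5.0))
    print(lookup_theta12(3))
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, peck_rate(p, 4.0, 50.0, 0.5, 0.5))
```

Under the assumed form, the peck rate equals θ₁ at p = 0 and θ₂ at p = 1, and exponents below one give steep changes near the extremes, while exponents above one flatten them.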
A chromosome consisted of a string of 55 zeros and ones in which three information packets were encoded: (i) an enumeration of the predictor variables to be used; (ii) the structure of the neural network; and (iii) the parameters in the probability-to-peck-rate conversion function. A flow chart of the process for analysis of the Wasp+ experiment follows (the Fly+ analysis is analogous). Five repetitions, r = 1, ..., 5, of the entire process were done to provide more values for Figure 2.

Set the penalty coefficient λ to be one of 1, 10 or 100.
For each repetition r = 1, ..., 5
  For each data set i = 1, ..., 10
    For each generation k = 1, ..., 200
      For each chromosome j = 1, ..., 200
        Using scaled data corresponding to 80 images in W-NMF-i
          1. Extract information from the chromosome:
             i. which predictor variables are to be used
             ii. structure of the neural network (number of hidden nodes from 0 to 15, skip-layer weights Yes/No, decay parameter from preset list)
             iii. parameter values for the conversion function
          2. Train a neural network on all 80 images using the predictor variables from (i) and the network structure from (ii).
        Using the trained neural network and the 163 images in the W-NMF-H-i data set
          1. Predict the probability of being a wasp for each of the 163 images.
          2. Use the parameter values from (iii) in the conversion function to convert the predicted probabilities of being a wasp to predicted peck rates.
          3*. Average the predicted peck rates for the 13 taxonomic groups, i.e., find the average of the 76 predicted peck rates for wasp, 7 predicted peck rates for S. ri, 10 predicted peck rates for T. ve and so on.
          4. Find SSE = $\sum_{m=1}^{13} \left( \{\text{mean predicted peck rate for group } m\} - \{\text{mean observed peck rate for group } m\} \right)^2$
          5. Find fitness = -SSE - λ × {number of predictor variables used as input}
    Record the following information from the chromosome that produces the greatest fitness from all of the chromosomes over all of the generations:
      i. the predictor variables retained
      ii. the mean predicted peck rates from (3*)

At the end of all the repetitions for all the data sets, there are 50 sets of retained predictor variables and 50 corresponding sets of 13 mean predicted peck rates. Average the 50 sets of 13 mean predicted peck rates to get one overall average for each taxonomic group. It is these 13 averages that are compared to the mean observed peck rates in Figure 1a for data based on the Wasp+ experiment, and the standard deviations are over the 50 values for each taxonomic group. The total number of times each predictor variable was retained is presented in Figure 2. Step-by-step details of the REP model building are available upon request.

The limit of 200 generations was judged to be sufficient: on average the maximum fitness was found by generation 129, and in only 7 of the 300 models that were built was the maximum fitness found at generation 200.

For the nine specimens of the novel species Episyrphus balteatus, the 50 optimal overall structures (retained predictor variables, neural network structure and conversion function parameters for each repetition for each data set) obtained when λ was set to 1 were used to predict peck rates. Hence, the two reported mean predicted peck rates of 48 for Wasp+ and 25 for Fly+ are each the average of 450 values (9 × 5 × 10).

When one taxonomic group was excluded, the procedure for obtaining the optimal REP model for each data set was identical to the original procedure except that only 12 taxonomic groups were used to create the models. Using the best REP model found for each repetition for each data set, predictions were made for the specimens in the excluded group, then averaged for that group. This gave 50 (5 × 10) mean predicted peck rates for the excluded taxonomic group. These 50 values were then averaged to give an overall value for the mean predicted peck rate for the excluded group.
Hence, in the end, there were 13 mean predicted peck rates for a set of simulations, one for each taxonomic group excluded in turn. It is the R² value for the comparison of these 13 values to the mean observed peck rates that is reported, as well as the R² value excluding H. pe. These simulations were all done with Wasp+ data using a penalty coefficient of λ = 1 on the number of predictor variables.
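Steps 3* to 5 of the flow chart (group averaging, SSE and the penalised fitness) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the original R implementation. The function names are hypothetical, as is the argument name `penalty` for the coefficient (set to 1, 10 or 100) that multiplies the number of predictor variables.

```python
def group_means(values, groups):
    """Average the values within each taxonomic group (step 3*)."""
    sums, counts = {}, {}
    for value, group in zip(values, groups):
        sums[group] = sums.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    return {group: sums[group] / counts[group] for group in sums}


def fitness(predicted_rates, groups, observed_means, n_predictors, penalty):
    """Return -SSE - penalty * n_predictors for one chromosome (steps 4 and 5).

    predicted_rates: predicted peck rate for each test image
    groups:          taxonomic group label for each test image
    observed_means:  mean observed peck rate for each group
    n_predictors:    number of predictor variables used as input
    penalty:         coefficient on the number of predictors (assumed name)
    """
    predicted_means = group_means(predicted_rates, groups)
    sse = sum(
        (predicted_means[group] - observed_means[group]) ** 2
        for group in observed_means
    )
    return -sse - penalty * n_predictors


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not data from the study.
    rates = [10.0, 20.0, 2.0]
    labels = ["Wasp", "Wasp", "NM Fly"]
    observed = {"Wasp": 14.0, "NM Fly": 3.0}
    print(fitness(rates, labels, observed, n_predictors=3, penalty=1))
```

A larger penalty trades goodness of fit against the number of inputs, so chromosomes that reach a similar SSE with fewer predictor variables receive a higher fitness.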