Supplementary Materials for Akçay, Campbell and Beecher (2013). Individual differences affect honest signaling in a songbird

Table S1. Loading coefficients for the Principal Component Analyses (PCA). The first component of the PCA on aggressive behaviors (the top three variables) explained 67.9% of the variance and was taken as the aggression score. The first component of the PCA on signaling behaviors (the bottom two variables) explained 68.3% of the variance and was taken as the signaling score. n = 219 trials, 69 subjects.

Variable                  Coefficients
                          Aggression PCA    Signaling PCA
Rate of flights                0.81
Time spent within 5 m          0.87
Closest approach              -0.79
Rate of soft songs                               0.83
Rate of wing waves                               0.83

Validation of measures of aggression as a replacement for physical attack

In this section we report additional analyses of data published in Akçay et al. [1] that validate our measures of aggression (PCA scores as reported in Table S1). In Akçay et al. [1], we confronted each subject (n = 48) with a taxidermic mount at the center of its territory, and subjects were given 15 minutes to attack the mount or not. Of the 48 subjects, 31 attacked (see ref. [1] for the details of the experimental design). To validate our current measures of aggression, we extracted the same variables we use for aggression scores here (rate of flights, time spent within 5 m, closest approach) and carried out a PCA. The first component of this PCA explained 69.3% of the variance.

Table S2. Loading coefficients of the behavioral measures in the PCA carried out on the data from Akçay et al. [1].

Variable                  Coefficients
                          PCA1
Rate of flights                0.77
Time spent within 5 m          0.84
Closest approach              -0.89

When we entered the PCA1 scores into a discriminant function analysis, the scores classified all but 4 of the subjects (91.7%) correctly as attackers or non-attackers. The 4 subjects that were misclassified had high scores but did not attack.
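The classification rate above follows directly from the counts given in the text. The short Python sketch below is only an illustrative check of that arithmetic, not the discriminant function analysis itself; the variable names are hypothetical, and the split of correct classifications relies on the statement that all 4 misclassified birds were non-attackers.

```python
# Counts taken from the text: 48 subjects, 31 attackers, 4 misclassified.
# All 4 misclassified birds had high PCA1 scores but did not attack,
# so every attacker is assumed to have been classified correctly.
n_subjects = 48
n_attackers = 31
n_misclassified = 4

n_non_attackers = n_subjects - n_attackers                  # birds that did not attack
attackers_correct = n_attackers                             # no attacker was misclassified
non_attackers_correct = n_non_attackers - n_misclassified   # non-attackers classified correctly

accuracy = (attackers_correct + non_attackers_correct) / n_subjects
print(f"correctly classified: {accuracy:.1%}")
```

Under these counts, 17 birds did not attack, 13 of which were classified correctly, which reproduces the 91.7% figure.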
It is important to note that the 15-minute limit on attacks is arbitrary (but necessary to carry out a large number of trials in a reasonable amount of time), and it is possible that these 4 subjects would have attacked had the trials lasted longer. In any case, the PCA scores were a remarkably good proxy for whether or not a bird would physically attack the mount. This result thus validates the PCA scores used in the current experiment.

An alternative method of calculating repeatability of signaling scores controlling for aggression

In this section, we report an alternative method of calculating the repeatability of signaling scores while controlling for aggression, using the linear mixed model (LMM) approach suggested recently by Nakagawa and Schielzeth [2]. In this method, the adjusted repeatability (repeatability controlling for a confounding factor) is calculated by partitioning the observed variance in the variable of interest (here, the signaling levels) into three components: the variance accounted for by the fixed factor ($\sigma_\Upsilon^2$; individual aggression), the between-subject variance ($\sigma_\alpha^2$) and the within-subject (error) variance ($\sigma_\varepsilon^2$). The adjusted repeatability can then be calculated as the ratio $\sigma_\alpha^2 / (\sigma_\alpha^2 + \sigma_\varepsilon^2)$, as in a simple repeatability. Thus, adjusted repeatability is akin to calculating residuals in signaling scores after accounting for aggression and then calculating the repeatability of these residuals. If, after removing the variance due to the effect of aggression, a significant portion of the variance in signaling scores can still be attributed to between-subject variance, then signaling levels can be said to be repeatable after controlling for aggression. This would be evidence for a repeatable personality trait that, along with aggressiveness, drives observed signaling levels (Figure 1c).
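The ratio described above can be sketched numerically. The following minimal Python example is illustrative only: the analysis in this supplement was run in R with lme4, and `adjusted_repeatability` is a hypothetical helper rather than a function from any package. It uses the variance estimates reported later in this section (0.33 for between-subject variance and 0.42 for residual variance).

```python
def adjusted_repeatability(var_between: float, var_within: float) -> float:
    """Between-subject variance over between- plus within-subject variance.

    The variance attributed to the fixed effect (aggression) is left out of
    the denominator; that omission is what makes the repeatability "adjusted".
    """
    if var_between < 0 or var_within < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + var_within
    if total == 0:
        raise ValueError("at least one variance component must be positive")
    return var_between / total


# Variance components reported in this supplement for the signaling-score LMM.
r_adj = adjusted_repeatability(var_between=0.33, var_within=0.42)
print(f"adjusted repeatability = {r_adj:.2f}")
```

With these inputs the ratio is 0.33 / 0.75, which rounds to the reported value of 0.44.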
On the other hand, if the variance remaining after removing the variance due to aggression levels is mostly within-subject (error) variance, then there would be no evidence for this hypothesis (Figure 1b).

We calculated the adjusted repeatability of signaling scores, controlling for aggression scores, from a random-intercept LMM (using the R package lme4) in which signaling score was the dependent variable, aggression score the fixed effect and subject identity the random factor. We tested whether the effect of the random variable (subject identity) was significant with a restricted likelihood ratio test using the R package RLRsim [3].

The mixed model analysis showed that aggression scores were a significant predictor of signaling scores (effect size ± SE = 0.39 ± 0.06, t = 6.59, n = 219, p < 0.000001). Figure S1 shows the portions of variance in the LMM that were explained by the fixed effect of aggression scores, by individual (subject) and by residual error. We then calculated the repeatability of signaling scores, adjusted for aggression scores. As explained above, we divided the estimated between-subject variance ($\sigma_\alpha^2$ = 0.33) by the sum of the between-subject variance and the residual variance ($\sigma_\varepsilon^2$ = 0.42) to calculate the repeatability: r = 0.44, p < 0.0001 (the p-value was determined from a likelihood-ratio test for the significance of the random effect). This finding suggests that individuals consistently signaled above or below the expected level of signaling for any given level of aggression, corroborating the analysis of signaling residuals reported in the main text. Note that this estimate of the repeatability, although somewhat higher than the estimate from the residuals (0.39), is well within the confidence interval we found for the repeatability of the residuals (0.24-0.54).

Ruling out alternative explanations of consistent individual differences
In this section we deal with two alternative explanations, raised by reviewers, that may explain the presence of individual differences. The first alternative explanation is that individual differences may be due not to some individual trait but to age differences in aggression. If age is responsible for most of the individual differences, controlling for age should decrease the repeatabilities of aggression and signaling behaviors significantly. To address this question, we carried out LMM-based adjusted repeatability analyses as described above on the 34 subjects (108 trials) for which we had age information. We carried out two LMM analyses (using lmer in the R package lme4) on the signaling and aggression scores, with age as a fixed covariate and bird identity as a random effect; the significance of the fixed effect was tested with a likelihood-ratio test (anova function in R). Age had a significant effect on aggression scores (β ± SE = -0.19 ± 0.10, t = 2.00, p = 0.046, n = 108 trials) but not on signaling scores (β ± SE = -0.11 ± 0.08, t = -1.45, p = 0.14, n = 108 trials). In these models, age explained 7.5% and 4.3% of the variance, respectively ($R^2$ values were calculated according to the formulae in [4]). This is a small fraction of the variance explained by the random effect of bird identity in the same models (50.7% and 50.4%, respectively).

The adjusted repeatability after controlling for age was r = 0.55 for aggression scores and r = 0.52 for signaling scores. For the same 34 males, simple repeatabilities calculated from LMMs that did not include age as a factor were r = 0.57 and r = 0.53 for aggression and signaling scores, respectively. Thus, age did not have a profound effect on the repeatability estimates. We also calculated the adjusted repeatability for signaling, controlling simultaneously for the age of the bird and for aggression scores in an LMM. This repeatability (r = 0.38) is also very similar to the ones reported in the manuscript and above.
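As a rough consistency check, the age-adjusted repeatabilities can be approximated from the variance percentages reported above. The Python sketch below rests on an assumption that is not stated in the text: that the percentages for age and bird identity are shares of the total variance in each model, with the remainder being residual variance. The helper name `repeatability_from_shares` is hypothetical.

```python
def repeatability_from_shares(fixed_pct: float, identity_pct: float) -> float:
    """Approximate adjusted repeatability from percentage shares of total variance.

    Assumes residual share = 100 - fixed_pct - identity_pct, so the fixed-effect
    share drops out of the denominator: identity / (identity + residual).
    """
    residual_pct = 100.0 - fixed_pct - identity_pct
    if residual_pct < 0:
        raise ValueError("shares add up to more than 100%")
    return identity_pct / (identity_pct + residual_pct)


# Shares reported above: age 7.5% and 4.3%, bird identity 50.7% and 50.4%.
r_aggression = repeatability_from_shares(fixed_pct=7.5, identity_pct=50.7)
r_signaling = repeatability_from_shares(fixed_pct=4.3, identity_pct=50.4)
print(f"aggression: {r_aggression:.2f}, signaling: {r_signaling:.2f}")
```

Under this assumption the approximations come out at about 0.55 and 0.53, close to the reported adjusted repeatabilities of 0.55 and 0.52. Exact agreement is not expected, because the percentages are rounded and the assumption about how they were computed may not hold exactly.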
These results all indicate that an individual's age, although it has a weak negative effect on aggression scores (but not on signaling scores), is not a primary driver of individual consistency in aggression, signaling, or signaling strategies.

Another alternative explanation for individual differences, raised by one reviewer, is that we would detect what look like individually consistent differences in signaling levels if (1) observation durations differ consistently between birds because birds had different latencies to respond, and (2) birds switch from signaling to aggressive behaviors at a certain threshold. We can rule out both assumptions with the current dataset. First, observation durations (mean ± SD: 547 ± 99.9 seconds) were not repeatable (r = -0.005, F(4,214) = 0.78, p = 0.54).

The second assumption is that there is a distinct switch from primarily signaling behaviors to primarily aggressive behaviors. This is not the case in song sparrows. For instance, Searcy et al. [5] showed a minute-by-minute breakdown of soft song rates in the period leading up to an attack and found that soft song rates do not change significantly from early to late in the trial; i.e., the birds do not stop signaling. This is consistent with the detailed descriptions of natural interactions between song sparrows given by Nice [6].

We can also rule out, in our current dataset, that song sparrows stop signaling later in the trial. To this end, we compared behaviors in the first 5 minutes vs. the remaining (last) minutes of the trial and ran an LMM for each of the following behaviors (again converted to rates per minute to account for unequal durations of observations within these halves): rate of flights, rate of soft songs, rate of wing waves, time spent within 5 m, and closest approach distance. For each of these measures we ran an LMM with trial period (first vs.
second) as a fixed factor and the month of testing and bird identity as random factors. We used likelihood-ratio tests to compare each LMM with a null model that did not include trial period (this model has only month and bird as random factors). We used the lmer function in the lme4 package for the LMMs and the anova function in base R for the likelihood-ratio tests. The results are presented in the table below (Table S3). With the exception of time spent within 5 m, the effect of trial period was not significant. Time spent within 5 m showed a slight but significant increase between the first and second periods of the trial, which is probably because birds tend to be far away from the speaker at the start of the trial. The early vs. late difference in time spent close to the speaker therefore does not correspond to a switch from mostly signaling to mostly aggressive behavior.

Table S3. Coefficients from the LMMs, and $\chi^2$ values and associated p-values from the likelihood-ratio tests.

Variable                  Coefficient   St. Error      t      $\chi^2$     p
Flights                       0.94         1.29       0.73      0.52      0.47
Time spent within 5 m         0.07         0.03       2.56      6.31      0.01
Closest approach             -0.67         0.65      -1.04      1.08      0.30
Soft songs                    0.03         0.11       0.31      0.09      0.76
Wing waves                   -0.26         0.16      -1.56      2.44      0.12

References

1. Akçay Ç., Campbell S.E., Tom M.E., Beecher M.D. 2013 Song type matching is an honest early threat signal in a hierarchical animal communication system. Proceedings of the Royal Society B 280, 20122517. (doi:10.1098/rspb.2012.2517).
2. Nakagawa S., Schielzeth H. 2010 Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biological Reviews 85(4), 935-956.
3. Scheipl F., Greven S., Kuechenhoff H. 2008 Size and power of tests for a zero random effect variance or polynomial regression in additive and linear mixed models. Computational Statistics & Data Analysis 52(7), 3283-3299.
4. Nakagawa S., Schielzeth H.
2013 A general and simple method for obtaining $R^2$ from generalized linear mixed-effects models. Methods in Ecology and Evolution 4(2), 133-142. (doi:10.1111/j.2041-210x.2012.00261.x).
5. Searcy W.A., Anderson R.C., Nowicki S. 2008 Is bird song a reliable signal of aggressive intent? A reply. Behavioral Ecology and Sociobiology 62(7), 1213-1216.
6. Nice M.M. 1943 Studies in the life history of the song sparrow II. The behavior of the song sparrow and other passerines. Trans Linn Soc NY 6, 1-328.