Supplementary statistical analysis
Response variables (total number of social sounds per minute per whale; rates of production of vocalisations, surface behaviour events and breaching; and proportions of vocalisations, surface behaviour events and breaching) were square-root transformed for normalisation. For data that follow a Poisson distribution, where sample means are approximately proportional to the variances of the respective samples, replacing each measurement with its square root yields approximately homogeneous variances (see Zar for further details). The response of each normalised variable to sea state was tested using an analysis of variance (ANOVA) model with sea state as the fixed effect. Results are stated as the F statistic with its degrees of freedom and the P value, in the form (F df1, df2 = , P < ).
A linear regression analysis (see Zar for further details) was used to test the effect of mean wind speed, mean broadband levels and mean third-octave broadband levels on the normalised rates of social sound production per minute per whale (rSS, rSA, rSV and rBR) and on the social sound activity budget per minute per whale (pSA, pSV and pBR). Results are stated with the P value and the r-squared value for the regression. 'P' indicates the significance of the regression, and r-squared estimates the 'goodness of fit' of the line, that is, the percentage of variation in the data explained by the fitted line.
A power analysis was used to assess the probability of detecting an effect of a given size with a given level of confidence (see Zar for further details). At a significance level of 0.05, the ANOVA model (testing sea state against social sound proportions) had a power of 0.6. Linear regression models testing the effect of wind speed had a power of 1.0, and linear models testing the effect of broadband noise had a power of 0.9 (surface active sounds) and 0.8 (vocalisations).
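The transformation, ANOVA and simple regression steps described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the software or code used in the study. The function names and all example numbers are hypothetical. The P values reported in the text would additionally require the F and t distributions from a statistics package.

```python
import math

def sqrt_transform(values):
    """Square-root transform for count-like (Poisson-type) data."""
    return [math.sqrt(v) for v in values]

def one_way_anova(groups):
    """Return (F, df1, df2) for a one-way fixed-effect ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation between group means, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df1 = len(groups) - 1
    df2 = len(all_values) - len(groups)
    return (ss_between / df1) / (ss_within / df2), df1, df2

def linear_fit(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    a = my - b * mx
    return a, b, sxy ** 2 / (sxx * syy)

# Hypothetical social sound rates (per minute per whale), grouped by sea state
rates_by_sea_state = [[4, 9, 1, 4], [9, 16, 4, 9], [16, 25, 9, 16]]
normalised = [sqrt_transform(g) for g in rates_by_sea_state]
f_value, df1, df2 = one_way_anova(normalised)
print(f"F {df1}, {df2} = {f_value:.2f}")

# Hypothetical mean wind speeds paired with normalised rates
wind_speed = [2.0, 4.0, 6.0, 8.0, 10.0]
rate = sqrt_transform([1, 4, 4, 9, 16])
intercept, slope, r_squared = linear_fit(wind_speed, rate)
print(f"slope = {slope:.3f}, r-squared = {r_squared:.3f}")
```

Here `one_way_anova` takes one list of transformed values per sea state, and `linear_fit` returns the r-squared value that the text describes as the percentage of variation explained by the fitted line.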
A validation analysis (using the 2004 dataset to create the linear model, and the 2008 dataset and tag dataset as the test samples) was used to support the results of the regression analysis. In this analysis, a linear model was created using only the 2004 dataset (see Lewicki & Hill for further details). The linear equation from this model was used to predict dependent values (proportion of surface active sounds and proportion of social vocalisations) for the 2008 dataset and the tag data. The predicted values (with 95% confidence limits) from the 2004 model were compared with the observed dependent values for the 2008 dataset to test whether the 2004 linear model holds for subsequent datasets.
A 'best subset' stepwise regression model (using all 1/3 octave band noise levels as explanatory variables) was used to generate the best regression model to account for the variability in the dependent data (see Lewicki & Hill for further details). All explanatory variables (in this case all 1/3 octave noise bands) were initially included in the regression model, and the adjusted r-squared value, mean square residual, F value and P value were calculated. Step 1 removed the least significant explanatory variable (the one with the highest P value) and recalculated these values. Subsequent steps removed (and, where appropriate, re-inserted) explanatory variables until the optimum model explaining the variation in the dependent dataset was found. In other words, this analysis determined which combination of noise frequency bands best accounted for the observed variation in the social sound repertoire of humpback whales. The critical P value for entry or removal of an explanatory variable was 0.05. The adjusted r-squared value (adjusted for the number of explanatory variables in each model) and the mean square residual were used to select the best model. Frequency bands selected in the model were then used as predictor variables in a final multiple regression analysis.
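The two procedures above, fitting on one dataset and predicting another, and stepping through candidate sets of noise bands, can be illustrated with a simplified sketch. It is not the procedure used in the study. It assumes a backward elimination that uses only adjusted r-squared as the criterion, whereas the analysis described above also used a critical P value of 0.05 and allowed variables to re-enter. It also omits the 95% confidence limits. All function names and data values are hypothetical, and the solver will fail with a division by zero if the predictors are perfectly collinear.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(rows, y, cols):
    """Least-squares coefficients (intercept first) for the predictor columns in cols."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, rows, cols):
    """Apply a fitted linear equation to new rows (e.g. a later dataset)."""
    return [coefs[0] + sum(b * row[c] for b, c in zip(coefs[1:], cols)) for row in rows]

def adjusted_r2(rows, y, cols):
    """r-squared adjusted for the number of explanatory variables."""
    n, p = len(y), len(cols)
    fitted = predict(fit(rows, y, cols), rows, cols)
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

def backward_eliminate(rows, y, cols):
    """Drop one predictor at a time while doing so raises adjusted r-squared."""
    cols = list(cols)
    best = adjusted_r2(rows, y, cols)
    while len(cols) > 1:
        score, drop = max(
            (adjusted_r2(rows, y, [c for c in cols if c != d]), d) for d in cols
        )
        if score <= best:
            break
        best = score
        cols.remove(drop)
    return cols, best

# Hypothetical training data: three noise-band levels per row, one response each
train_rows = [[1, 3, 2], [2, 1, 7], [3, 4, 1], [4, 1, 8],
              [5, 5, 2], [6, 9, 8], [7, 2, 1], [8, 6, 8]]
train_y = [3.1, 4.9, 7.05, 8.95, 11.1, 12.9, 15.05, 16.95]
bands, score = backward_eliminate(train_rows, train_y, [0, 1, 2])
print("selected bands:", bands, "adjusted r-squared:", round(score, 4))

# Hypothetical later dataset: compare observed values with model predictions
test_rows, test_y = [[9, 2, 3], [10, 7, 5]], [19.0, 21.1]
predicted = predict(fit(train_rows, train_y, bands), test_rows, bands)
print("residuals:", [round(o - p, 3) for o, p in zip(test_y, predicted)])
```

Small residuals on the later dataset would suggest that the equation fitted to the earlier data still holds. A full validation would also check whether the observed values fall within the confidence limits of the predictions.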
Lewicki, P. & Hill, T. 2006 Statistics: methods and applications: a comprehensive reference for science, industry and data mining. StatSoft Inc.
Zar, J. H. Biostatistical analysis. Prentice Hall International, Inc.