Supporting Information: Regressions on distance matrices

To generate the response matrix N, the nestedness metric for any pair of hosts ‘i’ and ‘j’ was calculated as follows: (i) suppose ‘i’ and ‘j’ are characterized by vulnerabilities (numbers of parasite species) ki and kj, respectively, with ki > kj; (ii) the nestedness measure (Npaired) for this host pair is then the percentage of the parasite species interacting with ‘j’ that are shared with ‘i’ (Almeida-Neto et al. 2008). As we previously sorted the hosts in the interaction matrix M in decreasing order of vulnerability, Npaired is zero only when species ‘i’ and ‘j’ have the same vulnerability or share no parasites (see Almeida-Neto et al. 2008 for other possibilities).

The taxonomic distance matrix T was calculated by assigning semi-metric distances between host species. The tij values were set to 1, 2, 3, 4, and 5 if species i and j belong, respectively, to the same genus, to the same family (but different genera), to the same order (but different families), to the same class (but different orders), and to different classes.

The abundance, body size, biomass, and sampling effort distance matrices were calculated by simple subtraction between hosts, incorporating an ordering component consistent with the concept of nestedness implied by Npaired. Consider again the two host species ‘i’ and ‘j’, with abundances ai and aj and vulnerabilities ki > kj. As the parasite composition of the species with lower vulnerability is expected to be a nested subset of the composition of the species with higher vulnerability (and not the opposite), we also expect ai > aj if abundance is a factor driving nestedness (i.e. more abundant hosts are expected to have richer parasite compositions, as they are easier targets for infection and have a larger capacity to sustain parasite populations). The abundance distance aij was therefore calculated as ai - aj.
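The pairwise quantities described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `n_paired` and `ordered_difference`, the host labels, the parasite sets, and the abundance values are all hypothetical, and returning zero for a host with no parasites is an assumption made only to avoid division by zero.

```python
from typing import Optional, Set


def n_paired(parasites_i: Set[str], parasites_j: Set[str]) -> float:
    """Paired nestedness (in percent) for two hosts, given their parasite sets."""
    k_i, k_j = len(parasites_i), len(parasites_j)
    if k_i == k_j:
        # Equal vulnerabilities: Npaired is zero by definition.
        return 0.0
    # Identify the host with higher vulnerability (richer) and the other (poorer).
    if k_i > k_j:
        richer, poorer = parasites_i, parasites_j
    else:
        richer, poorer = parasites_j, parasites_i
    if not poorer:
        # Assumption (not from the text): a host with no parasites yields zero.
        return 0.0
    # Percentage of the poorer host's parasites that also occur on the richer host.
    return 100.0 * len(richer & poorer) / len(poorer)


def ordered_difference(value_i: float, value_j: float,
                       k_i: int, k_j: int) -> Optional[float]:
    """Difference (richer host minus poorer host) in a host trait such as abundance.

    Returns None when vulnerabilities are equal, because no direction is expected.
    """
    if k_i == k_j:
        return None
    return value_i - value_j if k_i > k_j else value_j - value_i


# Hypothetical hosts: (parasite set, abundance).
hosts = {
    "A": ({"p1", "p2", "p3", "p4"}, 120.0),
    "B": ({"p1", "p2", "p5"}, 45.0),
    "C": ({"p2", "p6", "p7"}, 80.0),
}

par_a, abund_a = hosts["A"]
par_b, abund_b = hosts["B"]
par_c, abund_c = hosts["C"]

n_ab = n_paired(par_a, par_b)
d_ab = ordered_difference(abund_a, abund_b, len(par_a), len(par_b))
n_bc = n_paired(par_b, par_c)
d_bc = ordered_difference(abund_b, abund_c, len(par_b), len(par_c))

print("A-B:", n_ab, d_ab)
print("B-C:", n_bc, d_bc)
```

In this toy example, host B shares two of its three parasites with the richer host A, so Npaired for the pair is about 66.7 and the abundance distance is 120 - 45 = 75. Hosts B and C have the same vulnerability, so Npaired is zero and no abundance distance is defined.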
A matrix A containing these distances for all pairs i and j with ki > kj was then calculated and used as the explanatory matrix representing the influence of abundance on nestedness. The same procedure was used for biomass, body size (fish standard length), and sampling effort (number of individual fishes analyzed for parasites), generating the distance matrices B, L, and E, respectively. For species pairs with the same vulnerability (ki = kj), Npaired is equal to zero and there is no expected direction for the difference between their abundance, biomass, body size, or sampling effort, so no value was calculated for these pairs in the explanatory distance matrices.

All matrices were unfolded to generate the vectors n, a, b, l, e, and t, representing nestedness, abundance, biomass, body size, sampling effort, and taxonomy, respectively. These vectors contain the H(H-1)/2 unique host pairs that make up the elements above the main diagonal of the original matrices (see Figure 10.21 in Legendre & Legendre 1998), where H is the total number of host species with data available for all variables (H = 69). The usual procedure is then to carry out a linear regression of n against the explanatory vectors a, b, l, e, and t (Legendre & Legendre 1998). However, as there is substantial correlation among some of these explanatory vectors, we performed a principal component regression (Graham 2003 and references therein). This consists of replacing the original explanatory vectors with the factors derived from a Principal Component Analysis (PCA). These factors are linear combinations of the original vectors and are orthogonal, which means that the estimation of regression coefficients is not biased by collinearity (Graham 2003). As suggested by Graham (2003), all PCA factors were used for the regression. Prior to PCA, the variables were standardized to control for differences in scale.
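As a rough illustration of a principal component regression of this kind, the following Python sketch standardizes the explanatory variables, regresses the response on all PCA factors, and converts the factor coefficients back to coefficients for the standardized variables by multiplying them by the eigenvector matrix. It is a sketch under stated assumptions, not the authors' implementation: the data are synthetic, and the names `standardize` and `pc_regression` are hypothetical.

```python
import numpy as np


def standardize(X: np.ndarray) -> np.ndarray:
    """Center each column and scale it to unit sample standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pc_regression(X: np.ndarray, y: np.ndarray):
    """Principal component regression retaining all components.

    Returns the intercept and the coefficients expressed on the
    standardized original variables.
    """
    Z = standardize(X)
    # PCA via eigendecomposition of the correlation matrix of the predictors.
    corr = np.corrcoef(Z, rowvar=False)
    _, eigvecs = np.linalg.eigh(corr)
    # Orthogonal factor scores (one column per principal component).
    scores = Z @ eigvecs
    # Ordinary least squares of the response on the factor scores.
    design = np.column_stack([np.ones(len(y)), scores])
    gamma, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Back-calculate coefficients of the standardized original variables.
    beta = eigvecs @ gamma[1:]
    return gamma[0], beta


# Synthetic data (assumption): two strongly correlated predictors plus one other.
rng = np.random.default_rng(0)
n_obs = 200
base = rng.normal(size=n_obs)
X = np.column_stack([
    base + 0.3 * rng.normal(size=n_obs),
    base + 0.3 * rng.normal(size=n_obs),
    rng.normal(size=n_obs),
])
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(size=n_obs)

intercept, beta = pc_regression(X, y)

# Consistency check: a direct least-squares fit on the standardized variables.
Z = standardize(X)
ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(n_obs), Z]), y, rcond=None)

print("Back-calculated coefficients:", beta)
print("Direct fit on standardized variables:", ols[1:])
```

Because every component is retained in this sketch, the back-calculated coefficients should match a direct least-squares fit on the standardized variables, which gives a quick check that the back-transformation is implemented correctly.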
After the regression, the coefficients representing the effects of the original (standardized) variables were back-calculated by multiplying a matrix containing the PCA eigenvectors by a vector containing the regression coefficients of the PCA factors (Legendre & Legendre 1998). The species pairs with the same vulnerability did not enter this analysis, for the reason outlined above. Thus, from a total of 69(69-1)/2 = 2346 original pairs, only 2116 were valid cases.

The significance of the regression coefficients was assessed by a randomization procedure, as in a Mantel test (Legendre & Legendre 1998). It consisted of randomly permuting the order of species in matrix N, recalculating n, excluding from n the cases corresponding to invalid cases in a, b, l, e, and t (i.e. host pairs with the same vulnerability), and performing a new regression. The procedure was repeated 9999 times, producing a distribution of 10000 coefficient estimates (including the observed value) for each explanatory variable. The p-values were calculated as the proportion of coefficients in these distributions with magnitudes greater than or equal to that of the observed coefficient.

The same steps were used to test for influences of the explanatory variables on the response matrix C, with three exceptions: (i) we used logistic instead of linear regression; (ii) the distance matrices of abundance, biomass, body length, and sampling effort were calculated as absolute differences, because the ordering of species vulnerability does not matter for network modules as it does for nestedness; and (iii) as a consequence, we could use all 2346 species pairs for this regression.

References

Almeida-Neto, M., Guimarães, P., Guimarães Jr, P.R., Loyola, R.D. & Ulrich, W. (2008) A consistent metric for nestedness analysis in ecological systems: reconciling concept and measurement. Oikos, 117, 1227-1239.

Graham, M.H. (2003) Confronting multicollinearity in ecological multiple regression. Ecology, 84, 2809-2815.

Legendre, P. & Legendre, L. (1998) Numerical Ecology. Elsevier Science.