SUPPORTING INFORMATION

Appendix S1 Spatial predictors

Spatial filter computation

To account for non-environmental spatial constraints on the marsh harrier distribution, we used spatial variables obtained through eigenvector mapping (hereafter called spatial filters; Griffith & Peres-Neto, 2006; De Marco et al., 2010). This method assumes that the spatial arrangement of all points of the study area (i.e., sample locations or, as in our study case, the regular grid cells of the whole study area) can be translated into a set of predictor variables that capture spatial effects at different spatial scales (Diniz-Filho & Bini, 2005; Václavík et al., 2012). Including spatial filters in the models allowed us to account for the effect of underlying spatial structures that were not captured by the environmental factors considered (De Marco et al., 2010).

We computed the spatial filters in SAM 3.0 (Rangel et al., 2006) by constructing a pair-wise distance matrix among all grid cells of the study area using their Universal Transverse Mercator coordinates (i.e., easting and northing). The distance matrix was truncated at four times the maximum distance that connects all cells under the minimum spanning tree criterion, and from this modified distance matrix 481 positive spatial filters were computed using principal coordinate analysis (Borcard & Legendre, 2002).

Because of the computational limitations of SAM (i.e., calculations could not be carried out for more than about 4000 cells; Moriguchi et al., 2013), spatial filters were created using coordinates at a coarse resolution (20 × 20 km) and then interpolated to the same resolution as the environmental variables (1 × 1 km) using the inverse distance weighted method (IDW) in ArcGIS 9.3 (Blach-Overgaard et al., 2010; Moriguchi et al., 2013).
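As an illustration of the interpolation step, the following minimal sketch estimates a spatial-filter score at a fine-grid location from coarse-grid cell centres by inverse distance weighting. It is not the ArcGIS implementation: the function name `idw_interpolate`, the power of 2, the use of all known points rather than a search neighbourhood, and the example coordinates and scores are assumptions made for this example only.

```python
import math


def idw_interpolate(known_points, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    known_points: iterable of (x, y, value) tuples, e.g. coarse-grid cell
        centres with their spatial-filter scores (hypothetical data).
    target: (x, y) location of the fine-grid cell to estimate.
    power: distance exponent; 2 is assumed here, not taken from the study.
    """
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in known_points:
        dist = math.hypot(x - tx, y - ty)
        if dist == 0.0:
            # The target coincides with a known point: return its value.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical 20-km cell centres (coordinates in km) with filter scores.
coarse = [(0.0, 0.0, 1.0), (20.0, 0.0, 3.0), (0.0, 20.0, 1.0), (20.0, 20.0, 3.0)]

# A location equidistant from all four centres receives their plain average.
centre = idw_interpolate(coarse, (10.0, 10.0))

# A location nearer the low-valued cells is pulled towards their score.
west = idw_interpolate(coarse, (5.0, 10.0))
```

Because each estimate is a weighted average of the surrounding coarse values, the interpolated surface cannot exceed the range of the coarse-grid scores.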
We selected IDW because it is a good interpolator for phenomena whose distribution is strongly correlated with distance, particularly for closely packed, consistently spaced sample point sets (Kennedy, 2009). However, we acknowledge that interpolating the spatial filters from the 20-km resolution to the 1-km resolution may produce smoother trends in the spatial eigenvectors than computing them directly at the 1-km resolution. We presume that such differences might be greater in more rugged landscapes than in more gently sloping regions, but owing to the aforementioned computational obstacles this issue was not examined in this study.

We then used a forward logistic regression procedure to select an adequate number of spatial filters to include as independent variables in the SDMs (Griffith & Peres-Neto, 2006; Peres-Neto & Legendre, 2010). In this way, only filters that capture important parts of the geometry of the data were used in the SDMs. Filter selection was conducted for the breeding and wintering seasons separately. For these analyses, we used the occurrence data and, as pseudo-absences, 10,000 randomly selected points from areas where the species is not known to occur. We used Moran’s I coefficients and correlograms to evaluate spatial patterns in the selected filters as a measure of their spatial structure (Diniz-Filho & Bini, 2005).

Spatial filters selected

In total, 17 filters were selected for the breeding season (Fig. 5a; for comparison with results obtained according to a random subsample of the breeding period dataset, see Fig. S3 in Supporting Information) and 19 for the wintering season (Fig. 5b) to describe spatial variability in marsh harrier occurrence data. The spatial complexity and variable shape of the correlograms showed how the selected filters reflected different spatial structures at different spatial scales (see Fig. S4 in Supporting Information for the map pattern of selected filters).
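To make the correlogram idea concrete, the sketch below computes Moran's I separately for successive distance classes. It is a simplified illustration, not the SAM procedure: binary weights (1 for pairs falling inside the class, 0 otherwise), the function names `morans_i` and `correlogram`, and the six-point transect are assumptions made for this example.

```python
import math


def morans_i(coords, values, d_min, d_max):
    """Moran's I using only pairs with separation d_min < d <= d_max.

    Binary weights are assumed. Returns None when the class contains no
    pairs or the values have no variance.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    variance_sum = sum(d * d for d in dev)
    cross = 0.0
    n_links = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(coords[i][0] - coords[j][0],
                              coords[i][1] - coords[j][1])
            if d_min < dist <= d_max:
                cross += dev[i] * dev[j]
                n_links += 1
    if n_links == 0 or variance_sum == 0.0:
        return None
    return (n / n_links) * (cross / variance_sum)


def correlogram(coords, values, class_width, n_classes):
    """Moran's I for consecutive distance classes of equal width."""
    return [
        morans_i(coords, values, k * class_width, (k + 1) * class_width)
        for k in range(n_classes)
    ]


# Hypothetical transect: six points 100 km apart carrying a linear gradient.
coords = [(100.0 * k, 0.0) for k in range(6)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gram = correlogram(coords, values, class_width=100.0, n_classes=5)
```

For this smooth gradient, Moran's I is positive in the first distance class and negative in the last, the signature of a broad-scale pattern; a filter with finer-scale structure would change sign over shorter distances.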
High levels of spatial autocorrelation in the first, intermediate and last distance classes (represented by large positive and negative values of Moran’s I coefficients) tended to be portrayed by a map pattern containing few major clusters of similar values, as in the first eigenvectors (e.g. F1, F4). As the degree of positive spatial autocorrelation decreased in filters with lower eigenvalues, the map pattern became more fragmented (e.g. F30, F80), representing finer-resolution spatial variation in the data. Overall, filters selected for the breeding season had higher eigenvalues than those for the wintering season, meaning that aggregation patterns of the species in the breeding season occur at broader spatial scales than in winter (i.e., probability of occurrence differs more between sites that are further away from each other during the breeding season than during the wintering season; thus occurrence patterns are patchier in winter).

Supporting references

Blach-Overgaard, A., Svenning, J.C., Dransfield, J., Greve, M. & Balslev, H. (2010) Determinants of palm species distributions across Africa: the relative roles of climate, non-climatic environmental factors, and spatial constraints. Ecography, 33, 380–391.

Borcard, D. & Legendre, P. (2002) All-scale spatial analysis of ecological data by means of principal coordinates of neighbor matrices. Ecological Modelling, 153, 51–68.

De Marco, P.Jr., Diniz-Filho, J.A.F. & Bini, L.M. (2010) Spatial analysis improves species distribution modelling during range expansion. Biology Letters, 4, 577–580.

Diniz-Filho, J.A.F. & Bini, L.M. (2005) Modeling geographical patterns in species richness using eigenvector-based spatial filters. Global Ecology and Biogeography, 14, 177–185.

Griffith, D.A. & Peres-Neto, P.R. (2006) Spatial modeling in ecology: the flexibility of eigenfunction spatial analyses. Ecology, 87, 2603–2613.

Kennedy, K.H. (2009) Introduction to 3D Data: Modeling with ArcGIS 3D Analyst and Google Earth. John Wiley & Sons, Inc., Hoboken, NJ, USA.

Moriguchi, S., Onuma, M. & Goka, K. (2013) Potential risk map for avian influenza A virus invading Japan. Diversity and Distributions, 19, 78–85.

Peres-Neto, P.R. & Legendre, P. (2010) Estimating and controlling for spatial structure in the study of ecological communities. Global Ecology and Biogeography, 19, 174–184.

Rangel, T.F.L.V.B., Diniz-Filho, J.A.F. & Bini, L.M. (2006) Towards an integrated computational tool for spatial analysis in macroecology and biogeography. Global Ecology and Biogeography, 15, 321–327.

Václavík, T., Kupfer, J.A. & Meentemeyer, R.K. (2012) Accounting for multi-scale spatial autocorrelation improves performance of invasive species distribution modelling (iSDM). Journal of Biogeography, 39, 42–55.

FIGURES

Figure S1. The distribution of the six environmental variables: annual precipitation (PAN); minimum temperature of the coldest month (TMIN); maximum temperature of the warmest month (TMAX); slope (SLO); percentage of open vegetation (VEG) and percentage of aquatic habitats (AQ).

Figure S2. Model performance of univariate models based on six 1 × 1 km environmental variables containing information at 1 km² and 100 km² resolution: annual precipitation (PAN); minimum temperature of the coldest month (TMIN); maximum temperature of the warmest month (TMAX); slope (SLO); percentage of open vegetation (VEG) and percentage of aquatic habitats (AQ). The mean AUC of 15 replicate Maxent runs and its standard deviation are shown.

Figure S3. Spatial correlograms of spatial filters selected for the breeding distribution of marsh harriers in peninsular Spain according to a random subsample of the breeding period dataset (i.e., 284 locations). Correlograms were defined by Moran's I coefficients in 10 distance classes, indicating links among points of the study area successively separated by 100 km.
Filters for the breeding season are: F1, F4, F5, F6, F7, F11, F12, F25, F30, F34, F37, F42, F48, F49, F50, F55, F340, where higher filter numbers indicate successively lower eigenvalues.

Figure S4. Example of geographic patterns of 9 of the selected spatial filters used in this study. The first spatial filters, such as F1 and F4, represent spatial correlation at broad spatial scales. For example, the spatial pattern of F1 shows two major clusters of high and low values, respectively, portraying a north-south gradient in Spain, while F4 captures a north-west / south-east gradient in the study area. Subsequent filters portray more oscillatory patterns across Spain, representing aggregation patterns at successively finer scales.

Figure S5. Observed and averaged predicted distributions of the marsh harrier in Spain in the breeding period according to a random subsample of 284 localities of the breeding period dataset. Predicted distributions are based on Maxent models using different sets of predictors: climate (CLIM model), climate and habitat predictors (CLIM+HAB model) or climate plus habitat and spatial filters (CLIM+HAB+SPAT model). Note that models developed for each set of predictors were calibrated using 15 different randomly selected subsamples of the data (averaged predictions are shown).

TABLES

Table S1. Pearson correlation coefficient of the environmental variables and spatial filters. Correlation is based on 10,000 randomly picked cells. Significant correlations (P < 0.05) are shown in bold letters.

Variable   TMIN     TMAX     TAN      SLO      PAN      VEG
TMAX        0.448
TAN         0.897    0.78
SLO        -0.367   -0.518   -0.487
PAN        -0.045   -0.668   -0.368    0.494
VEG        -0.117    0.108   -0.036   -0.25    -0.329
AQ          0.162    0.08     0.153   -0.092   -0.016   -0.032

Table S2. Differences in model performance based on different sets of predictors for a random subsample of 284 localities of the breeding period dataset.
Model performance is assessed using different measures of accuracy and model fit (AUC, AICc, BIC, sensitivity and specificity) and two modelling techniques (GLM and Maxent). Note that for each set of predictors, 15 replicate models were fitted with different subsets of the data. The mean ± sd of the 15 replicate models for each predictor set is shown. Comparisons are based on Mann-Whitney U tests. Letters indicate models that are not significantly different after Bonferroni correction (i.e. at α ≤ 0.01). Best models are shown in bold.

GLM
Predictors     AUC              AICc         BIC           Sensitivity           Specificity
CLIM           0.73 ± 0.03      1704 ± 18    1747 ± 18     0.70 ± 0.08a          0.66 ± 0.04a
HAB            0.82 ± 0.01a     1552 ± 10    1590 ± 12a    0.82 ± 0.05b,c        0.68 ± 0.04a
CLIM+HAB       0.83 ± 0.02a,b   1514 ± 16    1587 ± 16a    0.77 ± 0.06a,c,d      0.74 ± 0.04b
SPAT           0.85 ± 0.02b,c   1457 ± 22    1632 ± 25b    0.71 ± 0.10a,d        0.80 ± 0.07b,c
CLIM+SPAT      0.86 ± 0.02c     1403 ± 21    1613 ± 20b    0.77 ± 0.08a,c,d      0.76 ± 0.05b
HAB+SPAT       0.88 ± 0.01c     1329 ± 20a   1539 ± 23c    0.83 ± 0.04b          0.83 ± 0.03c
CLIM+HAB+SPAT  0.88 ± 0.01c     1320 ± 18a   1549 ± 22c    0.78 ± 0.05a,b,c,d    0.81 ± 0.03c

Maxent
Predictors     AUC              AICc         BIC           Sensitivity           Specificity
CLIM           0.72 ± 0.01      5129 ± 6     5146 ± 6      0.65 ± 0.04a          0.70 ± 0.02
HAB            0.81 ± 0.02a     4999 ± 17    5017 ± 18a    0.84 ± 0.03b          0.66 ± 0.01
CLIM+HAB       0.83 ± 0.02a     4964 ± 15    4999 ± 16b    0.79 ± 0.05b,c        0.74 ± 0.02
SPAT           0.82 ± 0.01a     4927 ± 14    5021 ± 13a    0.69 ± 0.05a          0.82 ± 0.04a
CLIM+SPAT      0.85 ± 0.02      4883 ± 21    4989 ± 22b    0.77 ± 0.05c          0.77 ± 0.03b
HAB+SPAT       0.87 ± 0.01b     4819 ± 14a   4926 ± 14c    0.80 ± 0.08b,c        0.79 ± 0.05a,b
CLIM+HAB+SPAT  0.88 ± 0.01b     4822 ± 19a   4940 ± 20c    0.80 ± 0.08b,c        0.79 ± 0.06a,b