Appendix S1 – Table listing sources of museum specimen records and recent state/province harvest records (with locational uncertainty less than 8500 m) used in SDM creation and/or relative density analysis. Type Museum Harvest Location Museums listed in Mammal Networked Information System (MaNIS) (www.manisnet.org) Museums listed in CONABIO (www.conabio.gob.mx) Philip L. Wright Zoological Museum, Missoula, Montana Virginia Museum of Natural History, Martinsville, Virginia North Carolina Museum of Natural Sciences, Raleigh, North Carolina Carnegie Museum of Natural History, Pittsburgh, Pennsylvania Bell Museum of Natural History, Minneapolis, Minnesota Museum of Cultural and Natural History, Cleveland Museum of Natural History, Cleveland, Ohio University of Arkansas Collections, Fayettevelle, Arkansas University of Alabama Museums, Tuscaloosa, Alabama University of Northern Iowa Museums, Cedar Falls, Iowa Royal British Columbia Museum, Victoria, British Columbia Manitoba Museum, Winnipeg, Manitoba Royal Saskatchewan Museum, Regina, Saskatchewan Royal Alberta Museum, Edmonton, Alberta Idaho Museum of Natural History, Pocatello, Idaho Connecticut State Museum of Natural History, Storrs, Connecticut Alaska Alberta Arkansas British Columbia Connecticut Idaho Maine Manitoba Massachusetts Minnesota Nevada New Mexico Nova Scotia Ontario Oregon Quebec Rhode Island Appendix S2 – Additional details on methods employed in this manuscript Environmental Data Climatic variables were obtained from the WorldClim database (Hijmans et al., 2005), which gives a variety of climatic data averaged over the years 1960-1991. We also calculated long-term (1979-2000) average winter (October-March) snow depth and snow cover using data from the North America Regional Reanalysis dataset (Mesinger et al., 2006). For species distribution models, we also included information regarding the ecoregion of each grid cell (Omernick, 1987), but excluded information on human disturbance and land use, due to potential discrepancies between the time period of coyote presence records and available vegetation and human disturbance layers. We performed an initial screening of the 19 bioclimatic variables given by WorldClim by running a MaxEnt model including all potential predictor variables, and eliminated variables that had a small relative influence. We then examined pair-wise correlations between the remaining variables, and for those pairs that were highly correlated (r > 0.85), we retained only the most biologically meaningful variable. In total, six bioclimatic variables were used in the final modeling (minimum temperature coldest month, maximum temperature warmest month, precipitation coldest quarter, precipitation warmest quarter, temperature seasonality, diurnal range), as well as % snow cover, elevation and ecoregion. For the harvest models, we calculated mean values of environmental variables within each trapline, township, or county as the predictor variables in the analysis. Given sample size considerations, we a priori restricted the climatic layers to 4 variables related to temperature and precipitation (see Table 1). We also calculated mean values of the human influence index (SEDAC, 2005), as a measure of overall human disturbance. This index combines layers related to human population pressure, infrastructure, land use, and access (roads, rivers, etc.) to produce an overall map of human disturbance. We calculated mean tree cover (broadleaf or needleleaf) in each trapline or county based on the Globcover 2009 product (Arino et al. 2012) as a simple measure of land use that likely influences canid abundance. Species Distribution Models Model development The program MaxEnt was used to create species distribution models separately for the historic and each expanding population (all expansion points considered together, and NE, NW, and SE expansion fronts), and were projected across all of the United States and Canada. These models are hereafter referred to as historic, full expansion (note that the "full expansion" designation does not include expansions to the south of Mexico, due to a lack of information in this area), NE expansion, NW expansion, and SE expansion models. MaxEnt models can be highly dependent on the definition of the background from which pseudo-absences are drawn (Anderson & Raza, 2010). Background data should reflect the area potentially available to a species (Barve et al., 2011). For our historic coyote population, we defined the background as a 200 km buffer around the historic range (as an estimate of the area available to dispersing coyotes at the edge of the range). For each expansion front, we defined the background as a 200km buffer around a minimum convex polygon formed from the presence records in each expansion zone. As a subsidiary analysis, we let all of North America function as the background for each population. Results were similar to the 200km buffer background included in the main text of the manuscript, but models much more severely constrained to each region (a common problem with large background areas; Anderson and Raza 2010), which we felt was overly conservative and these models were therefore not considered further. Our locality data used to build distribution models suffered from several sources of sampling bias (Newbold, 2010), which is problematic for MaxEnt modeling (Phillips et al., 2009). We subsampled records to reduce the unevenness in density of presence records. We created two subsamples, one with a single presence record for every 900 km2 area, and one with a single presence record every 10,000 km2 area. This latter subsample reduced unevenness in the presence records, but also discarded a large number of usable locations. The first subsample still contained an uneven density of coyote presence records that was reflective of sampling effort, and therefore bias in the presence records was further addressed by creating a bias grid following procedures outlined in Elith et al. (2010). The bias grid is used to down-weight the importance of presence records from areas with more intense sampling. Results of MaxEnt models developed from the 10000km2 subset and the 900km2 subset with bias file were qualitatively similar, and so we present the results of the 10000km2 subset only. MaxEnt models were fit using only hinge features, which allow non-linear fitted functions similar to a generalized additive model (Elith et al., 2011). We used a 10-fold crossvalidation approach in developing models and calculating response curves to environmental variables. MaxEnt logistic output was used to plot models and compare niches of historic and expanding populations (see below). As a general indicator of how well historic models predicted expansion locations, we calculated area-under-the-curve (AUC) values from historic models using 10 independent sets of expansion front test locations. For the historic range MaxEnt model, we calculated a Multivariate Environmental Similarity Surface (MESS; Elith et al, 2010) to examine the degree to which environments within the historic range were reflective of the expansion range. The MESS calculation shows how similar a given point or location is to a reference set of points, given a certain set of predictor variables (Elith et al. 2010). The MESS calculates, with respect to the predictor climatic variables used in this analysis, how similar a point is to a reference set of points (in this case, the reference set corresponds to the historic training data for coyotes). In the figure below, negative values (shown in red) indicate novel climate (where values of one or more predictors fall outside the range of those layers in the historic reference set). Positive values (shown in blue) indicate values more similar to the median values in the historic reference set. . Niche comparison Niche overlap was calculated between historic and each expansion front population by comparing the logistic output of MaxEnt models using the I statistic (see Warren et al., 2008 for formula). This statistic compares suitability values of two models at every grid cell in the study area to determine the degree of niche overlap, and ranges from 0 (no niche overlap) to 1 (niche identity). Two MaxEnt models that have similar suitability values (i.e., logistic output) for each grid cell will have higher values for the I statistic, than two models that diverge greatly in their predicted suitability. Because sample size and environmental space differed between each comparison (historic-full expansion, historic-NE expansion, historic-NW expansion, historic-SE expansion), and background environments in each expansion zone varied in their similarity to background environments in the historic range, niche overlap values could not be compared directly (Peterson, 2011). Instead, to determine changes in niche overlap we compared the results of tests of niche similarity, which tests the hypothesis that niches of two populations are more similar than would be expected based on chance alone (outlined in Warren et al., 2008). To test for niche similarity, we randomly located presence points within the range of each expanding population. The number of random presence points was equal to the total number of presence points within each expansion front. This process was repeated 100 times for each expansion front. Niche models were created from each random pseudo-replicate and overlap values between the historic model and these pseudo-replicates were calculated to create a null distribution of overlap values (niche overlap calculation was performed using EMNTools; Warren et al., 2010). If the actual overlap values between historic and any given expansion population falls outside of this distribution (at the α =0.05 level), this indicates that the niche is more similar than expected given random background differences in the environmental space (Warren et al., 2008). In contrast, if actual overlap falls within the randomly generated null distribution, this indicates that the niche is not more similar than expected from random. The advantage of the niche similarity test is that it accounts for background differences in available environments as driving the niche overlap through use of the random null model, and therefore detects similarity between niches caused by actual commonalities in habitat selection between two populations. We predicted that NE expanding populations would have less similar niches compared to the historic model (i.e., the actual overlap values would lie closer to the null distribution of values) than the NW or SE expanding populations. Appendix S3 – Coyote presence records. Top = all presence records, applying a 900km2 subset filter, Bottom = all presence records, applying a 10000km2 subset filter. We fit MaxEnt models using the 900km2 subset and a bias file (sensu Elith et al. 2010), and with the 10000km2 subset without a bias file. Results were qualitatively equivalent, so we present in the manuscript only the analysis based on the 10000km2 subset. Appendix S4: Marginal response curves to environmental variables from historic (solid black line), NE expansion front (long dashed line), and NW expansion front (short dashed line) models. Y-axis indicates habitat suitability, and x-axis displays each of 8 continuous environmental variables used in this study. Supporting References Anderson, R.P. & Raza, A. (2010) The effect of the extent of the study region on GIS models of species geographic distributions and estimates of niche evolution: preliminary tests with montane rodents (genus Nephelomys) in Venezuela. Journal of Biogeography, 37, 1378– 1393. Arino, O., Perez, R., Julio, J., Kalogirou, V., Bontemps, S., Defourny, P. & Van Bogaert, E. (2012): Global Land Cover Map for 2009 (GlobCover 2009). doi:10.1594/PANGAEA. 787668. Barve, N., Barve, V., Jiménez-Valverde, A., Lira-Noriega, A., Maher, S.P., Peterson, A. T., Soberón, J., & Villalobos, F. (2011) The crucial role of the accessible area in ecological niche modeling and species distribution modeling. Ecological Modelling, 222, 1810–1819. Elith, J., Kearney, M. & Phillips, S. (2010) The art of modelling range-shifting species. Methods in Ecology and Evolution, 1, 330-342. Elith, J., Phillips, S.J., Hastie, T., Dudík, M., Chee, Y.E. & Yates, C.J. (2011) A statistical explanation of MaxEnt for ecologists. Diversity and Distributions, 17, 43–57. Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G. & Jarvis, A. (2005) Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25, 1965-1978. Mesinger, F., DiMego, G., Kalnay, E., Mitchell, K., Shafran, P.C., et al. (2006) North American Regional Reanalysis. Bulletin of the American Meteorological Society, 87, 343-360. Newbold, T. (2010) Applications and limitations of museum data for conservation and ecology, with particular attention to species distribution models. Progress in Physical Geography, 34, 3-22 Phillips, S.J., Dudík, M., Elith, J., Graham, C.H., Lehmann, A., Leathwick, J. & Ferrier, S. (2009) Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications, 19, 181–97. Omernik, J.M. (1987) Ecoregions of the conterminous United States. Map (scale 1:7,500,000). Annals of the Association of American Geographers 77: 118-125. Peterson, A.T. (2011) Ecological niche conservatism: a time-structured review of evidence. Journal of Biogeography, 38, 817–827. SEDAC. ( 2005) Last of the Wild Data Version 2 (LWP-2): Global Human Influence Index (HII). New York, NY: WCS and CIESIN. http://sedac.ciesin.columbia.edu/wildareas/downloads.jsp#infl. Warren, D.L., Glor, R.E., & Turelli, M. (2008) Environmental niche equivalency versus conservatism: quantitative approaches to niche evolution. Evolution, 62, 2868–83. Warren, D.L., Glor R.E. & Turelli, M. (2010) ENMTools: a toobox for comparative studies of environmental niche models. Ecography, 33, 607-611.