Paper VI

Genetic structure in linear habitats

Patterns and processes underlying population genetic structure in anthropogenic linear habitats: Four hypothetical scenarios and one case study

DAG Ø. HJERMANN, ROLF ANKER IMS AND SØREN BONDRUP-NIELSEN*

Department of Biology, Division of Zoology, PO Box 1050 Blindern, N-0316 Oslo, Norway
* University of Arcadia, British Columbia, Canada

Abstract

Linear landscape elements (LLEs) can function both as movement corridors and as habitats. We focus on the habitat function of LLEs, which in many man-dominated landscapes are important refuges for wildlife. The main question of this paper is: what happens to the genetic diversity and structure when most of the original habitat of a plant or animal species is destroyed and reduced to, or replaced by, an LLE? This depends both on historical processes (the genetic pattern before the LLE was created, how the LLE was invaded) and on current processes (gene flow and genetic drift). Using a numerical model, we simulate how spatial variation in allozyme frequencies develops in four different scenarios (which represent different population histories) and at varying gene flow levels and population sizes. We analyse the resulting spatial genetic variation using FST, spatial autocorrelation of allele frequencies, and correlation between allele frequencies at different loci. We find that population history can often influence patterns of spatial genetic variation for at least 50 generations after the LLE was invaded. Conversely, simple analyses of allele frequencies at two or more loci (e.g. allozyme data) can yield valuable insight about when and how the species invaded the LLE, and how much of the spatial genetic pattern has developed after LLE invasion. As a case study, we analyse the spatial genetic variation in bush-crickets (the wart-biter Decticus verrucivorus) living in an LLE, the grassy margins along a major highway in Norway.
We censused the populations as well as the distribution of habitat along a 54 km stretch of this highway, and sampled animals for genetic analysis. As genetic markers we used one allozyme locus (malic enzyme, ME) as well as the coloration (average wing melanism, AWM). We found that the two markers covaried geographically in spite of being uncorrelated within populations. According to the model, this covariation cannot have developed after the LLE was invaded, i.e., these characters also covaried before the LLE was invaded. We also compared the spatial genetic variation between wart-biters from the one-dimensional roadside habitat and wart-biters sampled from a continuous, two-dimensional habitat in the French Alps. The ME differentiation between close neighbouring populations (< 2 km apart) did not differ between the two habitats, while AWM differentiation, as predicted from genetic theory, was significantly higher in the one-dimensional habitat.

Introduction

Linear habitats are an important feature of many landscape types, especially in landscapes dominated by man (e.g., Forman & Godron 1986). These landscape elements are often referred to as corridors (e.g., Rosenberg 1997), but this term refers to one of the functions of these physical entities rather than to the entities themselves. Thus, we will refer to such structures as "linear landscape elements" (LLEs), most of which can function as habitats for some species and as movement corridors for others, depending on whether they can sustain resident populations or whether individuals move from one place to another through them. An LLE can even have different functions for different individuals of the same species (Bennett et al. 1994).
Zoologists dealing with landscape and metapopulation ecology have to a large extent focused on the movement function of LLEs, i.e., corridors between wildlife refuges (e.g., Harrison 1992; Beier 1993; Vermeulen 1994; Dunning et al. 1995; Ruefenach & White 1995; Bowne et al. 1999; Shkedy & Saltz 2000; Danielson & Hubbard 2000). In contrast, botanists have focused on the habitat function of LLEs, often the detrimental effects of invading exotic species (e.g., Tyser & Worley 1992; Tabacchi et al. 1996; Planty-Tabacchi et al. 1996; Greenberg et al. 1997; Oleander et al. 1998; Erskine et al. 1999). However, in heavily disturbed landscapes, LLEs such as riparian forests (e.g., Knopf & Samson 1994; Bratton et al. 1994; Spackman & Hughes 1995; Burbrink et al. 1998) and roadside habitats (e.g., Major et al. 1999) can be important habitats for a wide range of wildlife (including plants), and can act as genetic reservoirs for species threatened by loss of genetic variation.

In this paper, we focus on the habitat function of LLEs that result from human activities. We define LLEs to include both linear habitats created by reduction of a formerly more widespread, two-dimensional habitat type (e.g., forest remnants along rivers and lakes) and newly created habitat (e.g., roadsides). We seek to answer the following question: what happens to the genetic diversity and structure when most of the original habitat of a plant or animal species is destroyed and reduced to, or replaced by, a linear habitat? The answer depends on both historical and current processes. Historically, it will depend on the genetic pattern before the landscape changed and how the LLE was colonised. Current processes include gene flow and drift among populations along the LLE.
With respect to historical processes, a crucial question is whether we can reconstruct what the genetic pattern was before the landscape changed, and how the LLE was colonised, based on the current pattern of allele frequencies. The answer to this question hinges on the degree to which it is possible to separate historic patterns from patterns generated by contemporary processes (Templeton et al. 1995).

Spatial genetic variation of linear populations or linear arrays of populations has often been treated in theoretical papers (e.g., Kimura & Weiss 1964; Kimura & Maruyama 1971; Rousset 1997). The theoretical works, however, tend to ignore the initial conditions, i.e., they assume that gene frequencies are uniform or randomly distributed in space at the time populations are established. Also, there are other factors that may add to the theoretical effects of habitat dimensionality on genetic variation. For instance, the density of individuals can be either higher or lower in linear compared to non-linear habitats (e.g., Fauske et al. 1997; Major et al. 1999), and linearity may or may not increase dispersal (e.g., Andreassen et al. 1996). In spite of the many models of evolution in LLEs, there are very few empirical papers that focus on genetic structure in LLEs, especially with a focus on how habitat dimensionality can affect genetic variation. Thus, theory has generally remained untested on this issue.

This paper consists of two parts. First, we present four scenarios of how anthropogenic LLEs might be created and how a species may establish populations along an LLE. To examine how genetic variation may develop in each scenario, we use a simulation model to examine how allele frequencies at two loci may change as a function of local population size and the level of migration (assumed to be of the stepping-stone type) between populations.
In the second part, we analyse the spatial genetic variation in a species of bush-cricket (the wart-biter Decticus verrucivorus) that lives in the narrow grassy strip along a highway in Norway. The observed pattern of genetic variation in this LLE is compared with the simulation results and with that of wart-biters living in a continuous non-linear habitat in the French Alps.

Theory

Ecological scenarios

The creation and colonization of LLEs may come about from different processes (Fig. 1). The LLE may either be newly created habitat (Fig. 1a,b; e.g., roadside habitat through a field, or deciduous vegetation in a power line corridor through a conifer forest) or a linear remnant after destruction of a once widespread habitat (Fig. 1c; e.g., remnants of forest between a clearcut and a river). In the first case, LLE populations are founded by dispersers. In the latter case, populations stem, at least in part, from the animals that happened to live there when the rest of the habitat was destroyed. The animals founding LLE populations may immigrate more or less perpendicularly onto the LLE (Fig. 1a), along the LLE (Fig. 1b), or in a combination of the two. In the resulting LLE, the species may be widespread throughout, or may form more or less distinct populations separated by stretches of unsuitable habitat. In any case, the history of the corridor populations is likely to be reflected in the spatial genetic variation (SGV) among the corridor populations.

Figure 1 (panels A, B, C). Four scenarios of establishment of LLEs and LLE populations. See text for explanation.

To illustrate the range of possibilities, we present four scenarios which represent extreme situations or caricatures.

1. The source population is widespread and has negligible SGV. After general habitat destruction, small populations are established along the LLE.
SGV among LLE populations arises as a result of genetic drift and limited dispersal of individuals between neighbouring populations along the LLE; thus, a balance is reached between genetic drift and migration. According to Wright’s classic model of isolation by distance, we expect FST to increase with time and spatial autocorrelation in allele frequencies to develop. Fig. 1c may illustrate this scenario.

2. There is a gradient pattern in the SGV of the source population, i.e., the frequency of a given allele gradually decreases or increases from one end of the region to the other. This gradient may originate from an isolation-by-distance model or from a hybrid zone. LLE populations are established by dispersing more or less perpendicularly onto the LLE. At the time of establishment, the spatial pattern of genetic variation generally corresponds to that of the original populations. Fig. 1a or 1c may illustrate this scenario.

3. The source population is subdivided with a low migration rate between subpopulations, and long-distance migration is as common as migration between neighbouring populations (the island model). LLE populations are established in the same way as in Scenario 2. As a result, allele frequencies among populations diverge, but without any specific geographic pattern. Fig. 1a may illustrate this scenario.

4. There are source populations only at the ends of the LLE at the time of LLE creation, and allele frequencies differ between the two sources. LLE populations are established by invasion along the LLE (Fig. 1b). Assuming that colonization of the LLE is relatively rapid, a steep cline or a "step" in allele frequencies is likely to develop somewhere around the middle of the LLE.

We used simulation models to explore the genetic consequences of these scenarios. The simulations started at the time of LLE establishment.
We explored three cases where SGV is present when populations are founded (reflecting SGV in the two-dimensional source habitat), and one case where it is absent. In addition, we varied population sizes and migration rates.

The model

The model represented 20 linearly arranged patches/populations. We followed two loci (A and B) with two alleles each (A1, A2; B1, B2). We assumed random recombination between the two loci (e.g., they may be on different chromosomes), and we also assumed that we can ignore effects of mutation. The allele frequencies of A1 and B1 were denoted pA and pB, respectively. Generations were non-overlapping and the number of individuals in each population was assumed to be stable. Mating was random and formed the stochastic element in the model, while we assumed that gene flow was deterministic.

We ran models with four different patterns of SGV at establishment of populations in the LLE, which we will call Constant, Gradient, Random, and Step, corresponding to scenarios 1-4 above (Fig. 2). In Constant we assumed no spatial variation in the allele frequencies in the ancestral population(s), so that pA = pB = 0.5 in all 20 populations at establishment. In Gradient we assumed that allele frequencies changed gradually in space at establishment, from pA = pB = 0.3 in one end of the LLE to pA = pB = 0.7 in the other. In Random, the initial variation in allele frequencies was the same as in Gradient, but pA and pB varied independently of each other and without any geographical pattern. In Step we let the populations in one half of the LLE start with pA = pB = 0.4 and the populations in the other half start with pA = pB = 0.6.

Figure 2. Outline of the different patterns of spatial genetic variation (SGV) at LLE establishment in the simulation study. The horizontal axis represents space along the LLE (population number along the LLE, 0 to 20), while the vertical axes represent pA and pB (the allele frequencies of A1 and B1; axis ticks at 0.2, 0.5 and 0.8) for each of the twenty populations along the LLE. The lines show initial SGV patterns for each of the scenarios Constant, Gradient, Random and Step, one panel each (see text for further explanation). In the case of Random, pA and pB are shown with separate lines. In the case of the three other scenarios, pA = pB and each line thus represents both allele frequencies.

For each pattern of initial SGV, we ran models where we varied the population size N of the LLE populations (15 or 60) and the migration rate (m) between neighbouring populations (15 different values from 0 to 0.20). For simplicity, migration was supposed to occur as gamete migration among neighbouring populations (e.g., pollen spread). For each mating bout, a fraction 1 - m of the gamete pool in populations with two neighbours consisted of gametes from the population itself, while its neighbours contributed a fraction m of the gamete pool (m/2 from each neighbour). Then, N new individuals were formed by picking two gametes at random from the gamete pool N times. Those N new individuals completely replaced the former generation (i.e., generations did not overlap). For the population at each end of the LLE, a fraction 1 - m/2 of the gamete pool came from its own individuals. Thus, there was no stochastic element in the migration process. We ran 30 simulations for each combination of initial condition, N and m.

For each generation, we calculated FST for each locus following Weir & Cockerham (1984). For generations 5, 10, 20 and 50 we also calculated the correlation between pA and pB, the spatial autocorrelation for each allele for lag = 1 through 5, as well as the cross-correlation between pA and spatially lagged values of pB and vice versa. We used Spearman’s correlation for all correlations.
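The migration and drift steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for the simulations: the function names (gamete_pool, next_generation, initial_frequencies, simulate) are hypothetical, a single locus is tracked at a time, and drawing the 2N gametes of each population as independent Bernoulli trials is an assumption about how "picking two gametes at random N times" is implemented.

```python
import random


def gamete_pool(freqs, m):
    """Allele frequency in each population's gamete pool after deterministic migration."""
    last = len(freqs) - 1
    pool = []
    for i, p in enumerate(freqs):
        if i == 0:
            # End population: one neighbour only, so 1 - m/2 of the pool is local.
            pool.append((1 - m / 2) * p + (m / 2) * freqs[1])
        elif i == last:
            pool.append((1 - m / 2) * p + (m / 2) * freqs[last - 1])
        else:
            # Interior population: m/2 of the pool comes from each neighbour.
            pool.append((1 - m) * p + (m / 2) * (freqs[i - 1] + freqs[i + 1]))
    return pool


def next_generation(freqs, n, m, rng):
    """Form N diploid individuals per population by drawing 2N gametes (genetic drift)."""
    new = []
    for q in gamete_pool(freqs, m):
        copies = sum(1 for _ in range(2 * n) if rng.random() < q)
        new.append(copies / (2 * n))
    return new


def initial_frequencies(pattern, rng, k=20):
    """Starting allele frequencies for Constant, Gradient, Random or Step."""
    gradient = [0.3 + 0.4 * i / (k - 1) for i in range(k)]
    if pattern == "constant":
        return [0.5] * k
    if pattern == "gradient":
        return gradient
    if pattern == "random":
        shuffled = gradient[:]  # same values as Gradient, no geographic order
        rng.shuffle(shuffled)
        return shuffled
    if pattern == "step":
        return [0.4] * (k // 2) + [0.6] * (k - k // 2)
    raise ValueError("unknown pattern: " + pattern)


def simulate(pattern, n, m, generations, seed=0):
    """Allele frequencies along the LLE after the given number of generations."""
    rng = random.Random(seed)
    freqs = initial_frequencies(pattern, rng)
    for _ in range(generations):
        freqs = next_generation(freqs, n, m, rng)
    return freqs
```

A call such as simulate("gradient", 15, 0.05, 50) returns the 20 allele frequencies after 50 generations for one replicate; repeating it with different seeds would mimic the replicate runs, and a second unlinked locus could be approximated by an independent run, although that shortcut ignores any association between loci carried by shared gametes.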
To summarise the auto- and cross-correlation results, we used the correlation coefficient at lag = 1 as well as the slope of a linear regression of the correlation coefficient vs lag.

Simulation results

Figure 3 sums up the main results of the simulations. The overall FST (Fig. 3, left column) is the best-known of the statistics presented. Early in the life of the LLE populations, FST was naturally strongly influenced by the genetic pattern at initiation, but also by the size of the populations. As time went by, FST increased, especially for low migration rates and no initial SGV. After 50 generations, the initial presence of a gradient had little influence compared to population size and migration rate.

The plots of autocorrelation at lag = 1 (Fig. 3, middle column) reflect that at the start of the simulations, autocorrelation was positive for the simulations initiated with a gradient (especially with large populations), and zero for the other types of initial SGV. As time went by, genetic drift served to reduce autocorrelation in the Gradient simulations with low migration rate, while migration led to autocorrelation in Constant, Random, and Step, the faster the higher the migration rate. After 50 generations, autocorrelation was influenced by neither initial SGV nor population size, while migration rate (especially below 0.05) was very important. The slope of the autocorrelation function (not shown) was largely a mirror image of the autocorrelation coefficient at lag = 1.

Figure 3. Results of the simulation study. The figures show three different measures of genetic structure (from left to right: FST, spatial autocorrelation, and correlation between loci) at different times after populations are established in the LLE (from top to bottom: 5, 10, 20 and 50 generations after establishment). Within each figure, each line represents one combination of scenario and population size.
The scenarios are Constant (open circles), Gradient (filled circles), Random (open triangles) and Step (filled triangles); population size is N = 15 (unbroken lines) or 60 (broken lines). On the horizontal axis is the migration rate (the proportion of immigrants from neighbouring populations). FST measures overall genetic differentiation between populations. The spatial autocorrelation is the correlation (Spearman's rank correlation coefficient) in allele frequencies between neighbouring populations (lag = 1; averaged for the two loci). The correlation between loci is the correlation in allele frequencies between the two loci, measured by Spearman's rank correlation coefficient. See text for more details.

The correlation coefficient between the loci (Fig. 3, right column) was positive only when there was a gradient in the initial allele frequencies. In that case correlation persisted for a long time, especially for large population sizes. For Gradient with N = 60, correlations declined a bit when migration rate went below 0.05. Cross-correlation at lag = 1 is not shown because it largely resembled the ordinary correlation, and the slope of the autocorrelation function resembled a mirror image of the correlation coefficient. (If the geographic gradients of allele frequency go in opposite directions for the two loci, correlation will of course be negative, while the slope will be positive.)

As we see, the different statistics can provide information on different aspects of the history and ecology of the linear habitat populations. Except for the "step" simulations, migration rate influences overall FST and autocorrelation; after 20 generations, the latter statistic is determined by migration rate alone. Population size has a distinct effect on FST when some time has passed.
A genetic gradient in the founding populations can be traced as a correlation between the allele frequencies; this correlation is persistent even after 50 generations, in contrast to FST. A "step" in the allele frequencies will tend to lead to very high FST values which will be little influenced by either population size or migration rate, which is hardly influenced by the initial conditions when more than 10 generations have passed.

Case study: Wart-biters along a highway

Study organism

The wart-biter (Decticus verrucivorus L.) is a relatively large bush-cricket (Orthoptera: Tettigoniidae) found in meadows and grasslands in continental Europe and in the warmer parts of England and Scandinavia. In Norway it is restricted to the lowlands in the south-eastern part of the country, and it has become much less common in recent decades because it requires relatively low, unfertilised meadow vegetation, a type of habitat that has become relatively uncommon in this part of the country. However, it can also live in the grassy verges along major roads, as long as this vegetation is kept down by cutting rather than by using herbicides. Although the species is not wingless, it is effectively almost flightless: it flies only when disturbed, and then only up to five meters; longer flights are extremely uncommon (Ander 1947). In Norway it occurs in relatively low densities. We have previously shown that the pattern of presence/absence in small habitat patches largely fits a metapopulation pattern and that migration capacity is fairly restricted (Hjermann & Ims 1996).

Methods

Landscapes, sampling and surveys

The main study area was a 55 km long stretch of the major highway (E-18) that runs south-westwards from Oslo along the coast.
The north end of the stretch was just south of Holmestrand town (= 0 km) and the south end was at a tunnel entrance just before Larvik town (= 54.8 km). The stretch runs through one of the most intensively exploited agricultural landscapes in Norway, although there is a substantial proportion of forest and residential areas as well. Except for the verges of the main roads, almost no part of the present landscape is suitable for the wart-biter. A few decades ago, however, meadows suitable for the species probably covered a substantial part of the landscape. Large parts of the road we surveyed have been built since 1980. Thus, the species has in part shifted from meadows to roadside habitats.

Figure 4. Outline of sampling areas of this study. Panels: sampling design from highway roadsides (km 0 to 50); intensively sampled highway stretch (km 37 to 42); sampling design in the continuous landscape (the French Alps; km 0 to 6). The roadside habitat was not completely straight in reality. In the case of the roadside, the thin inner lines above and below the km-line represent habitat on the east and west side of the road, respectively, and the thick outer lines show where male wart-biters were observed. The crosses above the lines show where we sampled animals for analysis (without regard to which side of the road we sampled).

The habitat along both sides of the road was surveyed. For each 100 m interval and for each side of the road, we recorded the number of singing males. In addition, we recorded the presence of major dispersal barriers (such as rivers or major crossing roads), and we categorised each side of the road into one of the following categories: grass/meadow along open habitats (such as fields), forest or cliffs/rock. In the case of grass/meadow, we also noted whether the road verge was flat or sloped towards or from the road, and whether it was narrow or broad (> 3 m).
The categorisation of the habitat was done by one of the authors (DØH) who had some experience with wart-biter habitat requirements in general, but no experience of the distribution of wart-biters along this road. Both the habitat survey and the observation of animals were carried out by driving by car at moderate speed along the road. The surveys of singing males were done during optimal singing conditions (sunny days around 9 p.m. to noon); animals were counted three times on the western side of the road and four times on the eastern side.

In July-August 1997 and 1998, wart-biters were sampled at 21 locations along this transect (Fig. 4a). In addition, wart-biters were sampled at two more locations further north on the same road (15 and 45 km north of the transect start, respectively). We sampled on both sides of the road in 13 of these locations (wart-biters were not always present on both sides). Of the 21 locations, we allocated eight locations to a 4.1 km stretch (km 37.4 - 41.5; Fig. 4b) that was almost continuously populated by wart-biters.

For comparison with the latter subsegment of the road, we also obtained a sample of wart-biters in an elongated, 5.2 km long area at an altitude of 2000-2200 m in the French Alps (located 3 km north of the village Versoye-les-Granges north of Seez, close to the French-Italian border). In this sampling area, the entire landscape is a suitable wart-biter habitat with high densities of the species. Wart-biters were sampled from eight points with a range of 5.2 km (Fig. 4c) in July-August 1998.

[For simplicity, we will throughout this paper refer to wart-biters from a sampling site as a population, although we defer the question of to what degree they really are separate populations in the "population genetic" sense of the word.]
Spatial distribution of wart-biters in the surveyed LLE

For simple analyses of the numbers of observed wart-biters and the number of segments with wart-biters on each side, we used the average number per survey (four surveys on the eastern side, three on the western). Since the LLE may not be completely invaded by wart-biters yet, we analysed the abundance of males along the road using correlation analyses (Proc Arima in SAS, SAS Institute 1996). Since we were most interested in the spatial variation of wart-biters that could not be attributed to the distribution of habitat, we calculated the mean numbers of males per 100 m segment for roadside classified as habitat and non-habitat, respectively, and used the residuals from the mean in autocorrelation analyses "corrected" for presence of habitat.

We ran three kinds of autocorrelation analyses on the residual abundance. First, we calculated simple and partial autocorrelation on each side of the road separately. Second, we calculated the cross-correlation between opposite sides of the road, i.e., the correlation between the corrected wart-biter abundance on one side and the corrected abundance some distance north or south on the other side. In the third analysis, we examined abundance patterns on a larger scale. We divided the road into eleven intervals: 0-5 km, 5-10 km, …, 45-50 km, 50-54.8 km, and analysed the differences between these stretches by incorporating ten dummy variables that represented the eleven road intervals (i.e., contrasts between the first interval and each of the others) into a first-order autocorrelation model. In such a model, the estimated parameters for the dummy variables tell us whether there are differences in abundance between the eleven intervals that cannot be attributed to the distribution of habitat or the autocorrelation between proximate 100 m intervals. If wart-biters invade the LLE from a few points along it, and invasion is not complete, we expect such differences.
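The habitat correction and the simple autocorrelation can be illustrated with a small Python sketch. It is not the SAS Proc Arima procedure used in the analysis: the function names (habitat_residuals, autocorrelation) are hypothetical, the autocorrelation is the ordinary sample estimate, and partial autocorrelation, cross-correlation and the dummy-variable model are left out.

```python
def habitat_residuals(counts, is_habitat):
    """Counts per 100 m segment minus the mean of their class (habitat or non-habitat)."""
    groups = {True: [], False: []}
    for count, habitat in zip(counts, is_habitat):
        groups[bool(habitat)].append(count)
    # Mean number of males per segment within each habitat class.
    means = {cls: sum(vals) / len(vals) for cls, vals in groups.items() if vals}
    return [count - means[bool(habitat)] for count, habitat in zip(counts, is_habitat)]


def autocorrelation(series, lag):
    """Sample autocorrelation of a series of segments at a given lag (in segments)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    # Products of deviations for segments that are `lag` steps apart.
    return sum(dev[i] * dev[i - lag] for i in range(lag, n)) / denom
```

With 100 m segments, a lag of 6 corresponds to a distance of 600 m along one side of the road; applying autocorrelation to the output of habitat_residuals gives the correlation "corrected" for presence of habitat in the sense used above.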
Genetics: Laboratory and statistical methods

The live-caught wart-biters were killed by freezing them at -20 °C. Within 48 hours, the animals were moved to a -80 °C freezer. We recorded each animal's sex and the wing melanism (i.e., the amount of black pigmentation on the wings), which was assessed on a scale from 0 (no black) to 5 (almost entirely black). The animals were homogenised at 0 °C after removing the crop, the wings and the hind feet. After initial trials with ca. 25 enzymes, we found that only malic enzyme (ME) returned suitable results. We found two discernible alleles of this enzyme; we will call the slowest-moving allele 1 and the other allele 2. We use pi to denote the allele frequency of allele 1 in population number i.

In the analyses described below, we used each sample’s average wing melanism (AWM) as a genetic character independent of the ME alleles. Thus, we assumed that a substantial portion of the variation in wing melanism could be attributed to genetic differences, and in autocorrelation analyses we used AWM as if it were a gene frequency. This was taken as a working hypothesis, since we know of no studies of the heritability of melanism in this species. Among grasshoppers and crickets as a whole, there are species where coloration is primarily controlled by developmental conditions (acridoid grasshoppers: Rowell 1971, Colvin and Cooter 1995, Sword et al. 2000; bush-crickets: Lymbery 1992) and other species where coloration is genetically controlled (groundhoppers: Forsman 1999a, 1999b; bush-crickets: Oda and Ishi 1998; wetas: King et al. 1996, Morgan-Richards and Gibbs 1996). Cherrill & Brown (1991) noted that melanism appears to vary geographically in this species, but this is not inconsistent with melanism being environmentally determined.
We tested the other part of the assumption, that wing melanism and ME genotype are independent (i.e., not genetically or physiologically linked), using a logistic model. The response variable was the number of copies of allele 1 in each individual (as the outcome of a binomial process with n = 2) and the predictor variables were wing coloration and population (the latter variable was included to remove the effect of between-population correlation). The outcome of a type III likelihood ratio test (using Proc Genmod, SAS) was that wing coloration had no effect on the ME genotype, i.e. that these two variables were not correlated within the Norwegian populations (χ2 = 0.02, P = 0.90).

We used Genepop 3.1 (Raymond & Rousset 1995) to analyse genetic variation based on ME genotype. We calculated genic differentiation and FST (following Weir & Cockerham 1984) for both the entire samples and for pairwise locations within each sample. If genetic variation is structured geographically as expected from the "isolation by distance" model, we expect a linear relationship between FST/(1-FST) and ln(distance) of population pairs (Rousset 1997). In Genepop, this is tested with a Mantel test, a permutation procedure (Mantel 1967). We tested this using physical distance (along the road) for the highway sample including the two locations north of the surveyed stretch. For the surveyed stretch, we also did the same test using three other distance measures, based on the assumption that forest and perhaps cliffs are barriers to dispersal: "forest distance" (distance with forest on both sides of the road); "forest + cliff distance" (distance with forest or cliffs on both sides of the road), and "non-habitat distance" (distance with no habitat on either side of the road). We did the same tests with AWM, replacing FST with the absolute difference in AWM.
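The logic of the isolation-by-distance test can be shown with a generic Mantel permutation test in Python. This is a sketch under stated assumptions and not Genepop's own procedure: the function names (ibd_transform, pearson, upper_triangle, mantel_one_sided) are hypothetical, the test statistic is assumed to be a Pearson correlation between the two pairwise matrices, and the one-sided P-value is the fraction of permutations with a correlation at least as high as the observed one.

```python
import math
import random


def ibd_transform(fst, distance_km):
    """Pairwise quantities regressed under isolation by distance: FST/(1-FST) and ln(distance)."""
    return fst / (1 - fst), math.log(distance_km)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with populations relabelled by `order`."""
    k = len(order)
    return [matrix[order[i]][order[j]] for i in range(k) for j in range(i + 1, k)]


def mantel_one_sided(genetic, geographic, permutations=999, seed=0):
    """Observed matrix correlation and one-sided permutation P-value."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    hits = 0
    order = identity[:]
    for _ in range(permutations):
        # Shuffle population labels of the genetic matrix only.
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

Here genetic would hold FST/(1-FST) for each pair of populations (or the absolute difference in AWM) and geographic would hold ln(distance), or one of the alternative distance measures such as "forest distance"; both are symmetric square matrices with populations in the same order.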
To be able to compare the spatial genetic pattern of the wart-biters with the results of the simulation model (see above), we performed correlation analyses of the genetic structuring of wart-biters following the same main scheme as used for the simulation results. We tested whether allele frequency (p) was correlated with average wing melanism (AWM) for each population. Autocorrelation for both p and AWM was analysed in two ways: by correlating locations separated by a certain number of locations, and by correlating locations separated by a certain distance in kilometers. For the first type of analysis on the highway sample, we pooled the 37.4, 37.7 and 38.5 km locations and the 39.8, 40.2, 40.9, 41.2 and 41.5 km locations in order to reduce the variance in between-location distance. Ordering the locations after their order in the transect (i = 1 to 17), we also tested whether p was autocorrelated, i.e., whether pi was correlated with pi-lag, where lag = 1, 2, …, 7 and i = 1+lag to 17. In the same way, we tested whether AWM was autocorrelated. We also calculated cross-correlations between p and AWM, i.e. the correlation between pi and AWMi-lag and between AWMi and pi-lag. In the second approach, we analysed correlation between locations separated by <5 km, 5-10 km, 10-20 km, 20-30 km, 30-40 km and 40-60 km.

We tested whether allele frequencies differed between the east and west sides of the road using a logistic regression of the allele frequencies with location and side and their interaction as predictor variables. Using the same predictor variables in a two-way Anova, we tested whether AWM differed between the sides of the road.

Results

Spatial distribution

We made altogether 625 observations of singing males in 259 road segments (24 % of the 2 x 548 segments; Fig. 4a, Tab. 1). 15 % of the observations were made in segments we had classified as non-habitat for wart-biters.
We observed wart-biters in a significantly higher fraction of the segments on the eastern than on the western side of the road. The reason for this was that a higher fraction of the habitat segments on the eastern side of the road were occupied by wart-biters (Tab. 1).

Table 1. Some basic data from the survey of wart-biter (Decticus verrucivorus, D.v.) and its habitat along the Norwegian highway, including tests of differences between the east and west side of the road. Tests were either Pearson goodness-of-fit tests of independence (those reporting χ2) or Mann-Whitney U tests (those reporting Z, which refers to the standard normal distribution under the null hypothesis).

                                          West     East     Total    Tests
Number of 100 m segments                  548      548      1096
  classified as habitat                   377      360      737      χ2 (df = 1) = 1.19
  % classified as habitat                 69       66       67
Number of D.v. observed per survey        83.7     93.5     177.2
  in habitat segments                     66.3     84.5     150.8
Segments with D.v. (total)                110      149      259      χ2 (df = 1) = 7.69 **
Average presence of D.v. §                0.112    0.125    0.119    Z = -2.09 *
  habitat segments only                   0.128    0.171    0.149    Z = -3.26 ***
  non-habitat segments only               0.076    0.037    0.056    Z = -1.50
% non-habitat of D.v. segments            21       12       17       χ2 (df = 1) = 3.03 (*)
No. of D.v. per segment per survey        0.153    0.171    0.162    Z = -2.20 *
  in all habitat segments                 0.176    0.235    0.205    Z = -3.39 ***
  in all non-habitat segments             0.101    0.048    0.075    Z = -1.48
  in habitat segments with D.v. †         0.771    0.645    0.695    Z = -2.96 **
  in non-habitat segments with D.v. †     0.722    0.500    0.627    Z = -2.46 *

(*) P < 0.10, * P < 0.05, ** P < 0.01, *** P < 0.001
§ The numbers show the probability of observing D.v. in a segment on a given survey
† With D.v. on one or more surveys
However, the two sides did not differ in the fraction of occupied non-habitat segments, nor in the numbers of males observed in occupied segments (Tab. 1).

Using the residual numbers of wart-biter males (i.e., numbers corrected for habitat distribution), we found that the abundance of observed wart-biters was spatially autocorrelated on both sides of the road. On the eastern side, segments up to 2.1 km apart were significantly correlated, while western segments further apart than 600 m were uncorrelated (Fig. 5a). In the second correlation analysis, we found that the residual number of wart-biters was quite strongly correlated between the eastern and western sides (r = 0.35), and the corrected number of wart-biters in one segment also correlated with segments up to 2.2 km away on the other side (Fig. 5b). In the last analysis, where we looked for differences in wart-biter abundance on a larger scale, we found that the abundance of wart-biters was substantially higher in the 30-45 km stretch compared to the rest of the LLE (Fig. 5c). In particular, the abundance on the eastern side in the 35-45 km stretch was very high both compared to the rest of the road and compared to the same stretch on the western side. For the rest of the road, densities appeared similar between the two sides.

Figure 5. Autocorrelation of wart-biter abundance. (a) For each roadside (correlation coefficient against lag in km). (b) Cross-correlation between roadsides for different spatial lags (km). (c) Relative abundance in 5 km segments, corrected for autocorrelation, for the east and west sides against distance along transect (km).

Patterns of allele frequencies and wing melanism

The ME genotype was successfully recorded for 357 animals along the highway (of which 146 were caught in the intensively sampled stretch).
There was some evidence of heterozygote deficit (FIS = 0.12, P = 0.014) and good evidence for genic differentiation between populations (FST = 0.031, P = 0.005). The frequency of allele 1 (p) varied from 0.2 (at -45 km, n = 5) to 0.857 (at 41.2 km, n = 14) (Fig. 6a). The Mantel test for isolation by distance (using FST/(1-FST) and loge(distance)) did not yield good evidence that gene differentiation between pairs of populations increases with distance (one-sided test, P = 0.088). The same test run using "forest distance", "forest+cliff distance" and "non-habitat distance" (for the 21 populations within the surveyed part of the transect) only resulted in higher P-values (P = 0.26, 0.30 and 0.31, respectively). The allele frequencies did not differ between the east and west sides of the road (location*side interaction: χ2 = 14.27, df = 12, P = 0.28; side: χ2 = 1.31, df = 1, P = 0.25). Within the intensively sampled stretch (Fig. 4b), there might have been a deficit of heterozygotes (FIS = 0.12, P = 0.07), but there was no evidence for genic differentiation between populations (Fig. 6b; FST = 0.0092, P = 0.17) and no evidence that gene differentiation increased with distance (P = 0.43). The only pairs of successive sample points that had significantly different values of p were -45 and -15 km and 41.2 and 41.5 km (Tab. 2; note the small spatial distance of the latter pair).

Figure 6. Allele frequencies (p) of allele 1 of the ME enzyme and average wing melanism (AWM) against distance along transect (km): along the Norwegian transect (a), the Norwegian subtransect in detail (b) and along the French transect (c).

Table 2. Gene frequencies (p) and average wing melanism (AWM) for each sampling location along the highway. Km denotes distance from the start of the road survey (the two "negative" sampling points were located outside the surveyed stretch) and N denotes the sample size. The last four columns relate to differences between pairs of successive sampling locations: distance between locations, FST, the absolute difference in AWM (∆ AWM) and a P-value that summarizes the significance tests of FST and ∆ AWM.

                              Relative to next sampling point along road
Km     N    p     AWM    Km      FST §      ∆ AWM †    Summarized P #
-45    5    0.20  2.60   30      0.169 *    0.12       0.72
-15    29   0.57  2.72   19.8    0.099      0.47       0.26
4.8    4    0.25  2.25   1.1     0.235      0.18       0.74
5.9    7    0.71  2.43   1.5     -0.020     0.43       0.77
7.4    21   0.60  2.00   4       -0.028     0.40       1.2
11.4   10   0.55  1.60   2.4     0.088      2.07 **    0.13
13.8   3    0.83  3.67   3.4     -0.067     1.17       1.26
17.2   16   0.72  2.50   6.1     -0.019     1.14 **    0.58
23.3   12   0.79  3.64   10.1    -0.039     0.03       3.79
33.4   9    0.78  3.67   4       0.076      1.01       0.014
37.4   24   0.54  2.65   0.3     0.012      0.53       0.09
37.7   29   0.67  3.18   0.8     -0.016     0.55       0.32
38.5   20   0.60  2.63   1.3     -0.017     0.49       0.54
39.8   24   0.67  3.13   0.4     -0.034     0.21       1.81
40.2   11   0.73  3.33   0.7     -0.063     0.71       1.29
40.9   8    0.69  2.63   0.3     0.036      0.52       0.26
41.2   14   0.86  3.14   0.3     0.165 *    0.43       0.09
41.5   16   0.56  2.71   1.3     -0.026     0.10       2.53
42.8   13   0.62  2.62   2.1     0.117      0.62       0.03
44.9   11   0.32  3.23   3.2     -0.011     0.66       0.24
48.1   7    0.50  2.57   0.6     0.033      0.10       1.08
48.7   25   0.70  2.67   6       0.035      0.13       0.52
54.7   39   0.54  2.54   -       -          -          -

* P < 0.05, ** P < 0.01, *** P < 0.001
§ The indications of statistical significance are results of Fisher exact tests on allele frequencies
† The indications of statistical significance are results of t-tests
# Summarized P = [P(FST) + P(∆ AWM)]^2

In the sample from the French Alps (Fig. 6c; 186 animals), there was no significant heterozygote deficit (FIS = -0.033, P = 0.54), and we found no evidence of genetic differentiation (FST = -0.0021, P = 0.41). p varied from 0.66 to 0.85 (Fig.
6c), with no apparent structure along the transect (Mantel test for isolation by distance: P = 0.67). Comparing differences in allele frequencies among neighbouring populations that lie < 2 km from each other, the average pairwise normalized FST (FST/[1 - FST]) was 0.001 (SE 0.006; N = 15) in the French sample and 0.023 (SE 0.023; N = 11) in the Norwegian subsample. When correcting for distance (using Ancova), this difference was not significant (F1,23 = 0.93, P = 0.34). Thus, there is no indication that genetic distance (relative to physical distance) differs between the continuous habitat in the French Alps and the intensively sampled stretch of the Norwegian highway. For average wing melanism (AWM), we found that the absolute difference in AWM for populations < 2 km apart was 0.28 (SE 0.05) in the French sample and 0.45 (SE 0.07) in the Norwegian subsample. This difference was statistically significant when we corrected for distance (F1,23 = 5.47, P = 0.028).

Average wing melanism (AWM) and correlations between AWM and allele frequencies (p)

The average wing melanism (AWM) varied from around 2 to 3.5 (Fig. 6a). The Norwegian animals were significantly darker than the French (Anova, F22,329 = 3.05, P < 0.0001). Within the Norwegian transect, there was good evidence that p was positively correlated with AWM (Spearman’s rho = 0.52, P = 0.011, Fig. 6a, b). The changes in p and AWM between consecutive populations were correlated as well (Spearman’s rho = 0.49, P = 0.019), especially for the 15 populations from km 4.8 through km 41.5 (Spearman’s rho = 0.86, P < 0.0001). However, when we tested which pairs of successive sample points differed significantly with respect to p and AWM, respectively, different sets of pairs resulted (Tab. 2). There were quite large differences with regard to both p and AWM between the pairs 33.4-37.4 km and 42.8-44.9 km. The two variables did not correlate in the French transect (Spearman’s rho = -0.31, P = 0.46, Fig. 6c).
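The Mantel tests for isolation by distance reported in this section can be illustrated with a short sketch. It is not the software used for the analyses; it is a minimal pure-Python permutation test under stated assumptions: the function names (`upper_triangle`, `pearson`, `mantel_one_sided`), the number of permutations, the random seed and the five-site example data are all hypothetical. Genetic distance is entered as FST/(1-FST) and geographic distance as loge(distance), as in the text.

```python
import math
import random

def upper_triangle(matrix, order):
    """Entries above the diagonal, with rows and columns taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_one_sided(genetic, geographic, n_perm=999, seed=1):
    """One-sided Mantel test: does genetic distance increase with geographic distance?

    Populations (rows and columns together) of the genetic matrix are permuted;
    the P-value is the share of permutations with a correlation at least as
    large as the observed one, counting the observed arrangement itself.
    """
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(upper_triangle(genetic, perm), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Hypothetical example: five sites along a transect (positions in km).
positions_km = [0.0, 1.0, 3.0, 6.0, 10.0]
n_sites = len(positions_km)
log_distance = [[math.log(abs(a - b)) if a != b else 0.0 for b in positions_km]
                for a in positions_km]
# Hypothetical pairwise FST that rises with log distance, then linearised.
fst = [[0.02 + 0.01 * log_distance[i][j] if i != j else 0.0 for j in range(n_sites)]
       for i in range(n_sites)]
genetic = [[f / (1.0 - f) for f in row] for row in fst]

r_obs, p_value = mantel_one_sided(genetic, log_distance)
print(f"Mantel r = {r_obs:.3f}, one-sided P = {p_value:.3f}")
```

Because the invented FST values increase strictly with distance, the sketch returns a high correlation and a small P-value; with real data such as those in Tab. 2 the outcome depends entirely on the observed pairwise values.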
Autocorrelation of p and AWM, and cross-correlation between p and AWM

Correlating populations separated by a number of populations along the road, p showed a possible tendency to be positively correlated for lag = 1 through 3, and negatively correlated thereafter. However, the only correlations significant with P < 0.05 were those at lag = 6 and 7 (Fig. 7a). AWM generally showed the same tendencies, but in this case no correlations were significant at the 5 % level (Fig. 7a). When populations separated by a number of kilometres were correlated instead, there was no tendency of autocorrelation of p at all, while AWM showed a similar response as in the first approach, but with the only significant correlation being a negative correlation at a lag of 20-30 km.

There was also substantial cross-correlation between p and AWM along the road (Fig. 7b). The correlation between p_i and AWM_(i-lag) seemed to decline to a minimum and thereafter rise again, being significantly negative at lag = 4 populations or 10-20 km (Fig. 7b). AWM_i was positively correlated with p_(i-lag) at lag = 2 or 3 populations; the same pattern could be traced when physical distance was used as lag, but it was much weaker and more uncertain.

Figure 7. (a) Autocorrelation for allele frequency (p) and melanism (AWM) in the Norwegian transect. (b) Cross-correlation between allele frequency (p) and melanism (AWM): p(i) vs. AWM(i-lag) and AWM(i) vs. p(i-lag). In both panels, "lag" (on the horizontal axis) is number of populations, and Spearman’s correlation coefficient (vertical axis) was used. The eight populations in the Norwegian subtransect were pooled to two populations (see Methods). Correlation values above/below broken lines are statistically significant (P<0.05).
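The lagged correlations summarised in Fig. 7 can be sketched as follows. This is only an illustration of the scheme described in the Methods (Spearman correlation between the value at location i and the value at location i-lag), not the original analysis code; the function names and the short example series are hypothetical, and no significance test is included.

```python
from math import sqrt

def average_ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of the 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def lag_correlation(x, y, lag):
    """Correlate x[i] with y[i - lag] for all locations i that have a partner.

    With x and y the same series this is an autocorrelation; with x = p and
    y = AWM (or the reverse) it is a cross-correlation.
    """
    return spearman(x[lag:], y[:len(y) - lag])

# Hypothetical values for eight ordered locations along a transect.
p_example = [0.25, 0.71, 0.60, 0.55, 0.83, 0.72, 0.79, 0.78]
awm_example = [2.25, 2.43, 2.00, 1.60, 3.67, 2.50, 3.64, 3.67]

for lag in (1, 2, 3):
    auto_p = lag_correlation(p_example, p_example, lag)
    cross = lag_correlation(p_example, awm_example, lag)
    print(f"lag {lag}: autocorrelation of p = {auto_p:.2f}, p(i) vs AWM(i-lag) = {cross:.2f}")
```

Using ranks rather than raw values keeps the sketch in line with the Spearman coefficients reported in the text; the number of usable pairs shrinks by one for every extra lag, which is one reason why correlations at large lags are uncertain in a transect of 17 locations.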
Discussion

The use of correlation techniques in the study of spatial genetic variation has been fairly thoroughly explored in simulation studies (reviewed by, e.g., Epperson 1993). However, the simulation presented here differs from most other such studies in a number of respects. First, we emphasise the influence of nonrandom gene frequency distributions at population establishment. It is more usual to initiate populations with some "null" distribution of allele frequencies, i.e., a random or constant distribution. Second, we emphasise the situation from the time of colonization of the linear habitat and for a relatively short period thereafter. Commonly, simulations are run for at least a thousand generations, and the focus is often on the situation several hundred generations after initiation, when some kind of equilibrium has been reached. Third, we have dealt with two loci rather than one, and explored the covariation between their allele frequencies. This is not common, simply because two loci are not expected to correlate much if they are assumed to be unlinked and independently distributed initially (but see Epperson 1993). All of these points reflect our focus on anthropogenic linear habitats. Most of these landscape elements have been created fairly recently (although the habitat itself may be ancient, as in Fig. 1c). The individuals that established in these habitats were not necessarily a random sample of a single source population; they just as likely came from a variety of widely spaced source populations that may have displayed substantial geographic genetic variation.

Kimura & Weiss (1964) analysed mathematically a model quite similar to ours: populations in a one-dimensional array that only exchange individuals with adjacent populations at a rate of m/2 per generation.
They found that at equilibrium, FST was related to Nm (the product of population size and migration rate) by the function FST = 1/[4N(2mµ)^0.5 + 1], where µ is the mutation rate. One prediction from this model is that genetic differentiation relative to distance is expected to be larger in linear than in two-dimensional habitats (e.g., Wright’s island model where all populations exchange individuals). Another prediction is that FST approaches 1 as the mutation rate approaches zero, independently of migration rate. This means that if the system is initiated with FST = 0, FST will increase until it is counterbalanced by the effects of mutation. (It thereby differs from the island model, where FST = 1/[4N(m+µ) + 1].) This is in accordance with our model (where we assume zero mutation); we find FST to be increasing steadily through time for all migration rates.

A main result of the simulations is that if the gene frequencies of two loci co-vary geographically when populations are initiated, the correlation persists for at least 50 generations after initiation, even when the loci are completely unlinked and the level of migration is quite high. An equally striking result is that neither the Step nor the Random model can be distinguished from each other or from the Constant model (except for differences in FST). If the parameters are fairly well known, simulations can be used to predict roughly how the spatial genetic pattern will evolve in a linear habitat that is currently under establishment. As linear habitats become increasingly important habitats for many species, they may also become vital for the conservation of genetic diversity within species. Such conservation depends on keeping an optimal balance between genetic drift and gene flow - too much gene flow may eradicate differences between locally adapted genotypes, while too little gene flow may cause inbreeding depression.
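The two equilibrium expressions quoted above can be compared numerically. The sketch below simply evaluates them as written in the text; the function names are hypothetical, and the parameter values (N = 60, m = 0.05 and the range of mutation rates) are illustrative assumptions rather than estimates.

```python
import math

def fst_stepping_stone_1d(n, m, mu):
    """Equilibrium FST for the one-dimensional stepping-stone model,
    FST = 1/[4N(2m*mu)^0.5 + 1], as quoted from Kimura & Weiss (1964)."""
    return 1.0 / (4.0 * n * math.sqrt(2.0 * m * mu) + 1.0)

def fst_island(n, m, mu):
    """Equilibrium FST for the island model, FST = 1/[4N(m + mu) + 1]."""
    return 1.0 / (4.0 * n * (m + mu) + 1.0)

# Illustrative (assumed) parameter values.
population_size = 60
migration_rate = 0.05

for mutation_rate in (1e-4, 1e-6, 1e-8, 0.0):
    linear = fst_stepping_stone_1d(population_size, migration_rate, mutation_rate)
    island = fst_island(population_size, migration_rate, mutation_rate)
    print(f"mu = {mutation_rate:g}: stepping-stone FST = {linear:.3f}, island FST = {island:.3f}")
```

As the mutation rate is lowered, the stepping-stone value climbs towards 1 whatever the migration rate, whereas the island-model value levels off near 1/(4Nm + 1). This is the contrast drawn in the text between the linear array and the island model.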
However, these simulations may have greater practical importance as a tool that potentially can help us make inferences about historical and contemporary processes. In Table 3 we have indicated some ways to combine known facts about the populations with simple statistics on genetic data to estimate unknown or uncertain parameters.

Table 3. Principles of how genetic data can be used to obtain estimates of some variables, given that we from other sources of information know the value(s) of at least some other variables. The variables considered here are spatial genetic pattern at initiation of the populations (Init), effective size of each (sub-)population (N), migration rate (m) and the number of generations since LLE populations were established (age). By "correlation", we mean both autocorrelation within loci and cross-correlation between loci. For further explanation, see text.

Known variables (X)       Interpretation of statistics
Init   N    m    age
       X    X             From correlation: find whether initiated as a gradient or not; then, find age from autocorrelation. If age < 10 gen.: find initial conditions from FST. If age > 10 gen.: find age from FST as a double-check.
                 X        From correlation: find whether initiated as a gradient or not. If not initiated as a gradient, or age > 20 gen.: find m from autocorrelation; then, find N from FST (if age > 10 gen.). If initiated as a gradient: find N from correlation; then, find m from FST and/or autocorrelation.

Effective population size, effective migration rates, and the length of time since populations were established are obviously very important parameters that would be useful to know for a large number of purposes. Of course, interpretation of the statistics is not as straightforward in the real world as in a simulation study.
In a case study, we are likely to run into some of the following obstacles: The populations along the transect are not equally large and not equally spaced, and therefore the opposing forces of genetic drift and gene flow vary along the habitat. The "ecological" distances between populations may be hard to determine because we do not know how different habitat types and barriers influence migration. Population sizes and migration rates may vary substantially in time. Migration is seldom of the strict stepping-stone type. Colonization is often not a single event, and the immigration routes may be complex and give rise to more complex patterns of spatial genetic variation. Even relatively weak selection (e.g., stabilising selection) may influence the results strongly. And of course, in a single case study, deviations from the expected mean of a statistic may be large.

Nevertheless, genetic data offer two kinds of knowledge that we usually cannot obtain using ecological studies alone. The first is a long-term perspective on population sizes and migration rates. Ecological studies typically last only a few years, while genetic patterns may reflect the balance between drift and gene flow over several decades. (The latter is not always an advantage; when the landscape has changed recently, the current state of affairs may be much more interesting than estimates of parameters that have been averaged over a long time.) The second is a source of information about patterns and processes in the past. The simulations presented here show that some past genetic patterns, such as gradients in allele frequencies, leave "fingerprints" that last for a long time, while other patterns are not so easy to detect with the relatively simple statistics we have used here. There are, however, more sophisticated genetic techniques that can give more detailed and reliable information on population history.
Especially useful is information on genealogical relationships based on nucleotide-level data on the rapidly evolving mtDNA, e.g. Aars et al. (1998), Clark et al. (1999) and Turner et al. (2000). With such data one can potentially deduce, e.g., immigration routes (Clark et al. 1999). Although an increasing number of studies employ such techniques, they remain relatively time-consuming, expensive and demanding of good facilities compared to allozyme data. This is especially relevant for developing nations, where knowledge of nature, especially of landscape ecology and the impact of human development on nature, lags far behind that in the West. It is therefore important to be able to make the best possible use of simple genetic data such as allozyme data.

The case study illustrates both the potential of this method and the problems of interpretation. The relatively strong correlation between ME allozyme frequency (p) and average wing melanism (AWM) is a fairly certain clue that p and AWM varied geographically in the maternal population before the road verge populations were founded, so that part of the genetic variance observed today is "ancient" variance. The quite clear cross-correlation pattern supports this conclusion. Probably, p and AWM had their maxima in the area around 15-35 km, declining in both directions (Fig. 6), already before the road was constructed. The alternative hypothesis, that this pattern of variation has developed after the roadside populations were established, is unfortunately not easily tested, since many of the source populations are extinct. It is easier to "test" the age of the roadside populations, which according to the simulations should be indicated by the observed magnitude of the p-AWM correlation. Comparing the observed correlation with the simulation results indicates a quite short time since populations were established, probably not more than 10 generations.
This fits well with the fact that much of the road, especially the parts around 15-45 km, was constructed around 15 years ago, which corresponds to 7-8 wart-biter generations (it has a two-year life cycle). The high correlation coefficient, the high spatial autocorrelation and the quite low FST value indicate that the average population size is closer to 60 individuals than to 15, and that the migration rate is more likely to be above 0.05 than below. We have no good independent measures of either migration rate or population sizes. Nevertheless, we have some data that we can use to "guesstimate" population sizes. We observed >10 singing males per survey in only four of the 21 populations; the median was 5 singing males per survey. We collected (for analysis) >10 individuals in 14 populations and >20 in six populations. However, we did not collect very intensively in many of the populations. A better indication comes from the highest ratios between the number of collected animals and the number of singing males in populations. This ratio was around 4-6 or more in seven populations (10-15 in two populations!). Thus, to arrive at a total number of individuals, we can multiply the number of singing males by at least 4-6. From this we conclude that the median number of individuals per population was at least 20-30, perhaps as many as 50-70. The prediction that the average population size is in the order of 60 therefore seems quite plausible.

In the case of the migration rates, Hjermann & Ims (1996) estimated the average dispersal distance in corn fields to be ca. 40 m. Compared with the distances between sampling points along the roadside, 5 % immigrants or more sounds like too much. However, it is likely that animals disperse much further along a roadside than in a corn field.
From the data published in Hjermann & Ims (1996), it can be estimated that at least 10 % and more probably 25-50 % of the males emigrate from their original habitat patch (Hjermann, unpublished results). An immigration rate as high as 5 % is therefore at least not entirely out of the question.

There are some other features of the roadside data that do not agree very well with the simulations. First, Fig. 6 indicates that there are one or more coinciding "breakpoints" or discontinuities in p and AWM (between 11.4 and 13.8 km and between 33.4 and 37.4 km). This is not expected from the Gradient model, and indicates strong isolation between neighbouring sampling points or colonization from two different sources. The two hypotheses do not contradict each other, and both are supported by the fact that each of the two aforementioned pairs of sampling sites is separated by a quite long stretch where the road runs through forest (700 m in both cases; only three neighbouring sampling points were separated by more than 300 m of forest). The latter hypothesis is also supported by the distribution data, which indicate that not all habitat along the road is saturated. Thus, a combination of the gradient and step models (scenarios 3 and 4) may have been involved in the process of roadside colonisation: parts of the roadside are invaded by wart-biters spreading along the roadside (in contrast to invasion from the surrounding landscape), and the process of invasion is not yet finished.

The second and perhaps most disturbing anomaly in the roadside data is the extremely close correspondence between p and AWM in a 37 km long portion of the transect (4.8 to 41.5 km, Fig. 6). In the case of the gradient scenario, we would expect p and AWM to correlate by following roughly the same pattern on a large scale (e.g., increasing to a maximum and thereafter decreasing), while they would vary more independently on a finer scale due to genetic drift.
However, p and AWM have strikingly similar ups and downs in this stretch. There are three possible explanations for this. First, the genetic pattern in the source populations may have been more complicated than a simple gradient or U-shaped pattern of gene frequencies, perhaps due to patterns of dispersal in this landscape. Second, patterns of dispersal during colonisation of the road may have caused the present pattern, i.e., the colonization routes may have followed landscape patterns and LLEs rather than straight lines perpendicular to the road. Third, p and AWM may be subject to selection pressures that vary in the same manner geographically, which obviously would invalidate much of this analysis by breaking the assumption of neutral variation. The latter explanation is not improbable (e.g., Houle 1989; Karl and Avise 1992; McDonald et al. 1996).

The lack of a significant isolation-by-distance effect on pairwise FST values was at least partly a result of the large-scale variation in p. Because the large-scale pattern in p is shaped like an inverted U, remote sample points have the same allele frequencies as proximate sample points, which makes it difficult to determine whether there is an isolation-by-distance effect superimposed on this pattern.

The data are somewhat ambiguous regarding genetic differentiation in the one-dimensional roadside versus the two-dimensional, continuous Alp landscape. For the ME allele, we found no difference in genetic differentiation, while colour differentiation between close neighbours was higher in the roadside, as expected from theory (Kimura and Weiss 1964). The highway is a relatively recent structure built through farmland and forests, i.e., areas uninhabited by wart-biters prior to road construction, while the French sample was collected in an ancient habitat where allele frequencies probably are much closer to an equilibrium between gene flow and genetic drift.
Thus, the genetic pattern in the two landscapes has developed over very different timescales, which makes it difficult to draw any inferences about differences or similarities in migration rate. For example, many of the animals from the intensively sampled road stretch may stem from a common population, and there may not have been enough time to develop genetic differences. Also, it is difficult to separate the effect of landscape configuration on dispersal from the effects of the social setting during dispersal, i.e., dispersal through and to empty sites along the road vs. dispersal through a landscape saturated with conspecifics.

A somewhat unusual feature of our case study is the use of a phenotypic trait in place of gene frequencies. It was previously widely believed that polygenic or quantitative traits would exhibit little or no spatial autocorrelation because the separate distributions for different loci (unless they are linked) should be essentially independent (Sokal and Wartenberg 1983; Epperson 1990, 1993). However, Lande (1991) showed that quantitative traits can have spatial correlations similar to those for single loci, a result that was confirmed by Nagylaki (1994) when he reformulated and generalized Lande’s stepping-stone model. These studies and recent empirical studies of differentiation (e.g., Long and Singh 1995; Podolsky and Holtsford 1995) demonstrate that phenotypic traits are not outdated as genetic markers. Because a population can be characterised phenotypically by the mean and variance of the trait, as opposed to frequencies of genes, phenotypic characters may often be a statistically more powerful tool for differentiating between populations than genetic data. However, the value of phenotypic traits is much increased if approximate heritabilities are available; e.g., this allows estimation of measures of population subdivision such as FST (Kremer et al. 1997).
One of the weak points in our case study is the lack of knowledge of the heritability of melanism in the wart-biter. Based on our own observations, we argue as follows that melanism appears to be mainly genetically controlled, independently of ME genotype. The alternative hypotheses are that (1) melanism is mainly environmentally controlled, or (2) it is genetically controlled, but not independently of ME genotype (i.e., the gene coding for ME may influence melanism through physiological mechanisms, or the ME and melanism genes may be linked). In Methods, we found that melanism and ME appear to be completely independent within populations, which refutes (2). We find (1) to be very improbable, since AWM and p correlate strongly between populations; nothing indicates that this variation follows any environmental gradient, and nothing indicates that the types of ME allozymes are environmentally controlled.

Given the abundance of linear habitats and the considerable amount of theoretical work on genetic variation in linear population arrays, it is surprising how difficult it is to find empirical studies of this. One of the few examples is Aars et al. (1998), who studied variation in mitochondrial DNA among bank voles along a riparian habitat. They found that the genetic similarity between individuals decreased with distances up to 2 km and flattened out thereafter. In contrast, Stacy et al. (1997), who used the same nucleotide sequence and species in non-linear habitat in the same region, had to increase distance to 30 km to achieve as low genetic similarity between trapping sites as found by Aars et al. (1998). In addition to the theoretical effects of linear versus non-linear systems on genetic diversity, Aars and colleagues hypothesised that long-distance dispersal may be more inhibited in a linear habitat because resident females may be more successful at defending their territories when the habitat is only 10 m wide.
Furthermore, the large temporal fluctuations in the density of bank voles in this study area may influence genetic patterns. Thus, genetic theory, behaviour and population dynamics may all influence population genetic processes, which hints at how difficult it may be to predict genetic patterns in nature.

We do not claim that the simulation exercise we have presented here is the ultimate answer to how linear habitats shape genetic diversity. On the contrary, we realise that our models represent only a small part of "model space". For example, dispersal might extend further than to the nearest neighbour, and the effect of dispersal on gene frequencies is likely to be stochastic rather than deterministic. If animals are distributed continuously along the habitat, an entirely different class of models should be employed. Incorporating and analysing the effect of variable population sizes and between-population distances would also make models much more useful in practice. Lastly, repeated cross-pollination between theory and empirical work has proved to be an effective way to improve biological knowledge. The paucity of empirical studies on genetic variation in linear versus two-dimensional habitats currently impedes the advancement of our understanding of this important class of habitats.

References

Aars, J., Ims, R.A., Liu, H.P., Mulvey, M., and Smith, M.H. (1998). Bank voles in linear habitats show restricted gene flow as revealed by mitochondrial DNA (mtDNA). Molecular Ecology 7: 1383-1389.
Ander, K. (1947). Flygförmågan hos våra hopprätvinger [Flight ability of our Orthoptera]. Fauna Och Flora 42: 210-221.
Andreassen, H.P., Halle, S., and Ims, R.A. (1996). Optimal width of movement corridors for root voles: Not too narrow and not too wide. Journal of Applied Ecology 33: 63-70.
Beier, P. (1993). Determining minimum habitat areas and habitat corridors for cougars. Conservation Biology 7: 94-108.
Bennett, A.F., Henein, K., and Merriam, G. (1994).
Corridor use and the elements of corridor quality - chipmunks and fencerows in a farmland mosaic. Biological Conservation 68: 155-165.
Bowne, D.R., Peles, J.D., and Barrett, G.W. (1999). Effects of landscape spatial structure on movement patterns of the hispid cotton rat (Sigmodon hispidus). Landscape Ecology 14: 53-65.
Bratton, S.P., Hapeman, J.R., and Mast, A.R. (1994). The lower Susquehanna river gorge and floodplain (USA) as a riparian refugium for vernal, forest-floor herbs. Conservation Biology 8: 1069-1077.
Burbrink, F.T., Phillips, C.A., and Heske, E.J. (1998). A riparian zone in southern Illinois as a potential dispersal corridor for reptiles and amphibians. Biological Conservation 86: 107-115.
Cherrill, A.J. and Brown, V.K. (1991). Variation in coloration of Decticus verrucivorus (L.) (Orthoptera: Tettigoniidae) in Southern England. Entomologist’s Gazette 42: 175-183.
Clark, A.M., Bowen, B.W., and Branch, L.C. (1999). Effects of natural habitat fragmentation on an endemic scrub lizard (Sceloporus woodi): an historical perspective based on a mitochondrial DNA gene genealogy. Molecular Ecology 8: 1093-1104.
Colvin, J. and Cooter, R.J. (1995). Diapause induction and coloration in the Senegalese grasshopper, Oedaleus senegalensis. Physiological Entomology 20: 13-17.
Danielson, B.J. and Hubbard, M.W. (2000). The influence of corridors on the movement behavior of individual Peromyscus polionotus in experimental landscapes. Landscape Ecology 15: 323-331.
Dunning, J.B., Borgella, R., Clements, K., and Meffe, G.K. (1995). Patch isolation, corridor effects, and colonization by a resident sparrow in a managed pine woodland. Conservation Biology 9: 542-550.
Epperson, B.K. (1993). Recent advances in correlation studies of spatial patterns of genetic variation. Evolutionary Biology 27: 95-155.
Epperson, B.K. (1995). Spatial structure of 2-locus genotypes under isolation by distance. Genetics 140: 365-375.
Erskine, W.D., Terrazzolo, N., and Warner, R.F. (1999). River rehabilitation from the hydrogeomorphic impacts of a large hydro-electric power project: Snowy River, Australia. Regulated Rivers-Research & Management 15: 3-24. Fauske, J., Andreassen, H.P., and Ims, R.A. (1997). Spatial organization in a small population of the root vole Microtus oeconomus in a linear habitat. Acta Theriologia 42: 79-90. Forman, R. T. T. and Godron, M. (1986). Landscape Ecology. Wiley, New York. 29 Paper VI Forsman, A. (1999). Reproductive life history variation among colour morphs of the pygmy grasshopper Tetrix subulata. Biological Journal of the Linnean Society 67: 247-261. Forsman, A. and Appelqvist, S. (1999). Experimental manipulation reveals differential effects of colour pattern on survival in male and female pygmy grasshoppers. Journal of Evolutionary Biology 12: 391-401. Greenberg, C.H., Crownover, S.H., and Gordon, D.R. (1997). Roadside soils: A corridor for invasion of xeric scrub by nonindigenous plants. Natural Areas Journal 17: 99-109. Harrison, R.L. (1992). Toward a theory of inter-refuge corridor design. Conservation Biology 6: 293-295. Hjermann, D.Ø. and Ims, R.A. (1996). Landscape ecology of the wart-biter Decticus verrucivorus in a patchy landscape. Journal of Animal Ecology 65: 768-780. Houle, D. (1989). Allozyme-associated heterosis in Drosophila melanogaster. Genetics 123: 789-801. Karl, S.A. and Avise, J.C. (1992). Balancing selection at allozyme loci in oysters implications from nuclear rflps. Science 256: 100-102. Kimura, M. and Maruyama, T. (1971). Pattern of neutral polymorphism in a structured population. Genetical Research 18: 125-131. Kimura, M. and Weiss, G. H. (1964). The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics 49: 561-576. King, T.M., Wallis, G.P., Hamilton, S.A. and Fraser, J.R. (1996). 
Identification of a hybrid zone between distinctive colour variants of the alpine weta Hemideina maori (Orthoptera: Stenopelmatidae) on the rock and pillar range, southern New Zealand. Molecular Ecology 5: 583-587. Knopf, F.L. and Samson, F.B. (1994). Scale perspectives on avian diversity in western riparian ecosystems. Conservation Biology 8: 669-676. 30 Genetic structure in linear habitats Kremer, A., Zanetto, A., and Ducousso, A. (1997). Multilocus and multitrait measures of differentiation for gene markers and phenotypic traits. Genetics 145: 1229-1241. Lande, R. (1992). Neutral theory of quantitative genetic variance in an island model with local extinction and colonization. Evolution 46: 381-389. Long, A.D. And Singh, R.S. (1995). Molecules versus morphology - the detection of selection acting on morphological characters along a cline in Drosophila melanogaster. Heredity 74: 569-581. Lymbery, A.J. (1992). The environmental control of coloration in a bush-cricket, Mygalopsis marki Bailey (Orthoptera, Tettigoniidae). Biological Journal of the Linnean Society 45: 71-89. Major, R.E., Christie, F.J., Gowing, G., and Ivison, T.J. (1999a). Age structure and density of red-capped robin populations vary with habitat size and shape. Journal of Applied Ecology 36: 901-908. Major, R.E., Smith, D., Cassis, G., Gray, M., and Colgan, D.J. (1999b). Are roadside strips important reservoirs of invertebrate diversity? A comparison of the ant and beetle faunas of roadside strips and large remnant woodlands. Australian Journal of Zoology 47: 611-624. Mantel, N. (1967). The detection of disease clustering and a generalized regression approach. Cancer Research 27: 209-220. McDonald, J.H., Verrelli, B.C., and Geyer, L.B. (1996). Lack of geographic variation in anonymous nuclear polymorphisms in the American oyster, Crassostrea virginica. Molecular Biology and Evolution 13: 1114-1118. Morgan-Richards, M. and Gibbs, G.W. (1996). 
Colour, allozyme and karyotype variation show little concordance in the New Zealand giant scree weta Deinacrida connectens (Orthoptera: Stenopelmatidae). Hereditas 125: 265-276. Nagylaki, T. (1994). Geographical variation in a quantitative character. Genetics 136: 361381. 31 Paper VI Oda, K. and Ishii, M. (1998). Factors affecting adult color polymorphism in the meadow grasshopper, Conocephalus maculatus (Orthoptera:Tettigoniidae). Applied Entomology and Zoology 33: 455-460. Podolsky, R.H. And Holtsford, T.P. (1995). Population-structure of morphological traits in Clarkia dudleyana.1. Comparison of f-st between allozymes and morphological traits. Genetics 140: 733-744. Raymond, M. and Rousset, F. (1995). GENEPOP (version 1.2): population genetics software for exact tests and ecumenicism. Journal of Heredity 1996: 248-249. Rousset, F. (1997). Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics 145: 1219-1228. Rowell, C.H.F. (1971). The variable coloration of the acriodid grasshoppers. Advances in Insect Physiology 71: 145-198. Ruefenacht, B. and Knight, R.L. (1995). Influences of corridor continuity and width on survival and movement of deermice Peromyscus maniculatus. Biological Conservation 71: 269-274. SAS Institute (1996). SAS for Windows, release 6.12. SAS Institute, Cary, North Carolina, USA. Shkedy, Y. and Saltz, D. (2000). Characterizing core and corridor use by Nubian ibex in the Negev Desert, Israel. Conservation Biology 14: 200-206. Spackman, S.C. and Hughes, J.W. (1995). Assessment of minimum stream corridor width for biological conservation - species richness and distribution along mid-order streams in Vermont, USA. Biological Conservation 71: 325-332. Sword, G.A., Simpson, S.J., El Hadi, O.T.M. and Wilps, H. (2000). Density-dependent aposematism in the desert locust. Proceedings of the Royal Society of London Series BBiological Sciences 267: 63-68. 
Tables

Table 1. Some basic data from the survey of wart-biter (Decticus verrucivorus, D.v.) and its habitat along the Norwegian highway, including tests of differences between the east and west side of the road. Tests were either Pearson goodness-of-fit tests of independence (those reporting χ²) or Mann-Whitney U tests (those reporting Z, which refers to the standard normal distribution under the null hypothesis).

                                         West     East     Total    Tests
Number of 100 m segments                 548      548      1096
  classified as habitat                  377      360      737      χ² (df = 1) = 1.19
  % classified as habitat                69       66       67
Number of D.v. observed per survey       83.7     93.5     177.2
  in habitat segments                    66.3     84.5     150.8
Segments with D.v. (total)               110      149      259      χ² (df = 1) = 7.69 **
Average presence of D.v. §               0.112    0.125    0.119    Z = -2.09 *
  habitat segments only                  0.128    0.171    0.149    Z = -3.26 ***
  non-habitat segments only              0.076    0.037    0.056    Z = -1.50
% non-habitat of D.v. segments           21       12       17       χ² (df = 1) = 3.03 (*)
No. of D.v. per segment per survey       0.153    0.171    0.162    Z = -2.20 *
  in all habitat segments                0.176    0.235    0.205    Z = -3.39 ***
  in all non-habitat segments            0.101    0.048    0.075    Z = -1.48
  in habitat segments with D.v. †        0.771    0.645    0.695    Z = -2.96 **
  in non-habitat segments with D.v. †    0.722    0.500    0.627    Z = -2.46 *

(*) P < 0.10, * P < 0.05, ** P < 0.01, *** P < 0.001
§ The numbers show the probability of observing D.v. in a segment on a given survey
† With D.v. on one or more surveys

Table 2. Gene frequencies (p) and average wing melanism (AWM) for each sampling location along the highway. Km denotes distance from the start of the road survey (the two "negative" sampling points were located outside the surveyed stretch) and N denotes the sample size. The last four columns relate to differences between pairs of successive sampling locations: distance between locations, FST, the absolute difference in AWM (∆ AWM) and a P-value that summarizes the significance tests of FST and ∆ AWM.
                               Relative to next sampling point along road
Km      N     p      AWM       Km      FST §     ∆ AWM †     Summarized P #
-45     5     0.20   2.60      30      0.169     0.12        0.72
-15     29    0.57   2.72      19.8    0.099     0.47        0.26
4.8     4     0.25   2.25      1.1     0.235     0.18        0.74
5.9     7     0.71   2.43      1.5     -0.020    0.43        0.77
7.4     21    0.60   2.00      4       -0.028    0.40        1.2
11.4    10    0.55   1.60      2.4     0.088     2.07 **     0.13
13.8    3     0.83   3.67      3.4     -0.067    1.17        1.26
17.2    16    0.72   2.50      6.1     -0.019    1.14 **     0.58
23.3    12    0.79   3.64      10.1    -0.039    0.03        3.79
33.4    9     0.78   3.67      4       0.076     1.01        0.014
37.4    24    0.54   2.65      0.3     0.012     0.53        0.09
37.7    29    0.67   3.18      0.8     -0.016    0.55        0.32
38.5    20    0.60   2.63      1.3     -0.017    0.49        0.54
39.8    24    0.67   3.13      0.4     -0.034    0.21        1.81
40.2    11    0.73   3.33      0.7     -0.063    0.71        1.29
40.9    8     0.69   2.63      0.3     0.036     0.52        0.26
41.2    14    0.86   3.14      0.3     0.165     0.43        0.09
41.5    16    0.56   2.71      1.3     -0.026    0.10        2.53
42.8    13    0.62   2.62      2.1     0.117     0.62        0.03
44.9    11    0.32   3.23      3.2     -0.011    0.66        0.24
48.1    7     0.50   2.57      0.6     0.033     0.10        1.08
48.7    25    0.70   2.67      6       0.035     0.13        0.52
54.7    39    0.54   2.54      -       -         -           -

* P < 0.05, ** P < 0.01, *** P < 0.001
§ The indications of statistical significance are results of Fisher exact tests on allele frequencies
† The indications of statistical significance are results of t-tests
# Summarized P = [P(FST) + P(∆ AWM)]²

Table 3. Principles of how genetic data can be used to obtain estimates of some variables, given that we know the value(s) of at least some other variables from other sources of information. The variables considered here are spatial genetic pattern at initiation of the populations (Init), effective size of each (sub-)population (N), migration rate (m) and the number of generations since LLE populations were established (age). By "correlation", we mean both autocorrelation within loci and cross-correlation between loci. For further explanation, see text.

Known variables           Interpretation of statistics
Init   N     m     age
       X     X            From correlation: find whether initiated as a gradient or not; then,
                          find age from autocorrelation. If age < 10 gen.: find initial
                          conditions from FST. If age > 10 gen.: find age from FST as a
                          double-check.
                   X      From correlation: find whether initiated as a gradient or not. If not
                          initiated as a gradient, or age > 20 gen.: find m from
                          autocorrelation; then, find N from FST (if age > 10 gen.). If
                          initiated as a gradient: find N from correlation; then, find m from
                          FST and/or autocorrelation.

Figure legends

Fig. 1. Four scenarios of establishment of LLEs and LLE populations. See text for explanation.

Fig. 2. Outline of the different patterns of spatial genetic variation (SPV) at LLE establishment in the simulation study. The horizontal axis represents space along the LLE, while the vertical axis represents pA and pB (the allele frequencies of A1 and B1) for each of the twenty populations along the LLE. The lines show initial SPV patterns for each of the scenarios Constant, Gradient, Random and Step (see text for further explanation). In the case of Random, pA and pB are shown with separate lines. In the case of the three other scenarios, pA = pB and each line thus represents both allele frequencies.

Fig. 3. Results of the simulation study. The figures show three different measures of genetic structure (from left to right: FST, spatial autocorrelation, and correlation between loci) at different times after populations are established in the LLE (from top to bottom: 5, 10, 20 and 50 generations after establishment). Within each figure, each line represents one combination of scenario and population size. The scenarios are Constant (open circles), Gradient (filled circles), Random (open triangles) and Step (filled triangles); population size is N = 15 (unbroken lines) or 60 (broken lines). The horizontal axis shows the migration rate (the proportion of immigrants from neighbouring populations). FST measures overall genetic differentiation between populations.
The spatial autocorrelation is the correlation (Spearman's rank correlation coefficient) in allele frequencies between neighbouring populations (lag = 1; averaged for the two loci). The correlation between loci is the correlation in allele frequencies between the two loci, measured by Spearman's rank correlation coefficient. See text for more details.

Fig. 4. Outline of sampling areas of this study. The roadside habitat was not completely straight in reality. In the case of the roadside, the thin inner lines above and below the km-line represent habitat on the east and west side of the road, respectively, and the thick outer lines show where male wart-biters were observed. The crosses above the lines show where we sampled animals for analysis (without regard to which side of the road we sampled).

Fig. 5. Autocorrelation of wart-biter abundance. (a) Autocorrelation for each roadside. (b) Cross-correlation between roadsides for different spatial lags. (c) Abundance in 5 km segments corrected for autocorrelation.

Fig. 6. Allele frequencies (p) of allele 1 of the ME enzyme and average wing melanism (AWM) along the Norwegian transect (a), the Norwegian subtransect in detail (b) and along the French transect (c).

Fig. 7. (a) Autocorrelation for allele frequency (p) and melanism (AWM) in the Norwegian transect. (b) Cross-correlation between allele frequency (p) and melanism (AWM). In both panels, "lag" (on the horizontal axis) is number of populations, and Spearman's correlation coefficient was used. The eight populations in the Norwegian subtransect were pooled to two populations (see Methods). Correlation values above/below broken lines are statistically significant (P<0.05).

Figure 8. Allele frequencies (p) of allele 1 of the ME enzyme and average wing melanism (AWM) along the Norwegian transect (a), the Norwegian subtransect in detail (b) and along the French transect (c). [Plot: each panel shows p (scale 0 to 1) and AWM (scale 1 to 5) against distance along transect (km).]

Figure 9. Autocorrelation of wart-biter abundance. (a) For each roadside. (b) Cross-correlation between roadsides for different spatial lags. (c) Abundance in 5 km segments corrected for autocorrelation. [Plot: panels (a) and (b) show correlation coefficient against lag (km), with simple and partial autocorrelation for the West and East roadsides in (a); panel (c) shows relative abundance for East and West against distance along transect (km).]
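As a check on the χ² statistics in Table 1, the comparison between the two sides of the road can be recomputed from the segment counts. The sketch below is a minimal illustration in Python, not the original analysis. It assumes a Pearson χ² test of independence on a 2 × 2 table (side of road by segment class) without continuity correction, and the names `pearson_chi2_2x2` and `side_comparison` are hypothetical helpers introduced here for illustration.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Number of 100 m segments surveyed on each side of the road (Table 1).
SEGMENTS_PER_SIDE = 548


def side_comparison(west_count, east_count, total=SEGMENTS_PER_SIDE):
    """Compare a segment count between the west and east side of the road.

    Assumption: each side contributes `total` segments, split into segments
    that have the property (count) and segments that do not (total - count).
    """
    return pearson_chi2_2x2(
        west_count, total - west_count,
        east_count, total - east_count,
    )


# Segments classified as habitat: 377 (west) versus 360 (east).
habitat_chi2 = side_comparison(377, 360)

# Segments with D.v. on one or more surveys: 110 (west) versus 149 (east).
presence_chi2 = side_comparison(110, 149)

print(f"habitat:  chi2 = {habitat_chi2:.3f}")
print(f"presence: chi2 = {presence_chi2:.3f}")
```

Under these assumptions both statistics come out close to the tabulated values of 1.19 and 7.69, which is consistent with tests on the raw segment counts without continuity correction.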