Abstract

Points to highlight: the number of samples; accounting for spatial dependence. Processes occurring immediately after hybridization in a system not yet stabilized? Homoploid hybridization as a source of genome size variation?

Introduction

Hybridization among plant species is perceived as an important source of genetic diversity that transgresses the boundaries between existing taxa (Arnold 1997, Seehausen 2004, Mallet 2007, Nolte et Tautz 2010) and affects nearly 40 % of plant families (Whitney et al. 2010). Hybridization occurs frequently among closely related taxa (Mallet, 2005). Recent studies suggest that the ploidy level of the hybrid is driven by parental genetic divergence: crosses between distantly related lineages presumably produce stable polyploid hybrids (allopolyploids) more quickly, whereas closely related lineages give rise to homoploid hybrids (Paun, 2010). Unless reproductive isolation is established by chromosomal rearrangements or by spatial or temporal isolation (Rieseberg et al. 2003, Faria et Navarro 2010), homoploid hybrids act as a bridge allowing gene flow by backcrossing to the parental species. Homoploid hybrids possess the advantage of extreme genetic variation in their progeny, generated by recombination, which allows them to attain new adaptive peaks (Mallet 2007). Depending on its size and temporal stability, such a hybrid population is termed a "hybrid swarm" or a "hybrid zone" (Nolte et Tautz 2010). However, the processes at the onset of hybridization (which may lead to hybrid speciation) remain unclear and deserve scientific attention (Nolte et Tautz 2010).

Cirsium Mill. is a genus with a very high frequency of interspecific hybridization, especially in the Central European part of its range (Bureš et al. 2004, 2010, and references therein). Cirsium hybrids usually occur as single individuals (e.g. C. x tataricum) within populations of the parental species and only rarely form larger populations, as is the case in C. x rigens.

C. x rigens Wallroth 1822 resulted from the hybridization of C. acaule and C.
oleraceum. This hybrid is relatively widespread and occurs in secondary contact zones of the parental species (mosaic sympatry sensu Mallet et al. 2005). The parental species are ecologically differentiated: C. acaule inhabits low-stature calcareous grasslands on gaize slopes, whereas C. oleraceum inhabits nutrient-rich wet meadows or fringes. The hybrid is characterized by an intermediate habit covering the whole range between the two parental species, by highly viable pollen, and by a chromosome number identical to that of both parental species (2n = 34). Like other Cirsium hybrids, it has a smaller genome size than the expected intermediate value (Bureš et al. 2004), which is currently explained as genome downsizing after hybridization, in the initial phase of genomic shock. If this downsizing were caused by chromosomal rearrangements, it would result either in the emergence of mating barriers (and consequently hybrid speciation) or in increased hybrid sterility (). Evidence for the completion of this process is lacking, and C. x rigens is currently regarded as a nothospecies (tropicos.org).

Intriguingly, the parental species of C. x rigens differ significantly in genome size, about 1.12-fold (Bureš et al. 2004). Preliminary observations of this hybrid also indicated intrapopulation genome size variation (Bureš, unpubl.). Reports of genome size variation after homoploid hybridization are still scarce in the scientific literature (e.g. Baack et al. 2003, Mahelka et al. 2007). Measurement of genome size provides a rapid, effective and reliable tool for surveying processes at the whole-genome level (Doležel et al. 2007).

The aims of our study are: 1. to carry out a detailed study of genome size variation in a homoploid hybrid population; 2. to identify possible backcrossing to the parental species; 3. to identify the contribution of different generations to the whole population; 4. to identify local "hotspots"; and 5. to identify a possible affinity of specific genome size categories for different microhabitats.
The results were cross-validated on two independent populations.

Materials and methods

For the purpose of this study, two populations of the hybrid were selected and sampled: the first at Lipovka near Rychnov nad Kněžnou (50°10'19.101"N, 16°15'13.505"E), the second near the nature reserve Radostínské rašeliniště (49°38'47.747"N, 15°51'55.702"E). In each population, the position of each sampled plant was recorded in relative coordinates, with the origin located using a field GPS device. Tissue samples were placed in plastic bags with a small amount of water, labelled and stored at low temperature. The samples were subsequently analyzed by flow cytometry (CyFlow ML, Partec GmbH) using a two-step chopping procedure (Otto, 1990; Doležel et al., 2007) and the fluorescent dye DAPI. For details on sample preparation and dye concentrations, see Šmarda et al. (2008). Fresh leaves of Cirsium vulgare (2C = 5.54 pg, Bureš et al. 2004) were used as the internal standard.

Spatial analyses were performed using the R package nlme (Pinheiro et al. 2013). The extent of spatial association was estimated using the range parameter of an empirical (semi)variogram. Simply stated, the (semi)variogram is a function describing the degree of spatial dependence of a stationary and presumably isotropic process with a constant mean. Within the framework of generalized least squares, a marginal model with a spatial correlation structure in the residuals gives a good estimate of its shape. A typical (semi)variogram is an increasing function that approaches a limit of maximum dissimilarity (the sill) at a certain distance (the range). The range parameter thus describes the distance at which spatial association disappears and delimits the ecological limits of the given process. For details, see Cressie (1993), Borcard et al. (2011) and Waller et Gotway (2004).
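To illustrate the sill and the range described above, the following minimal Python sketch evaluates one commonly used (semi)variogram shape, the exponential model. It is not the nlme code used for the analyses: the function names, the default nugget of zero and the parameterization (semivariance = nugget + (sill - nugget) * (1 - exp(-distance / range))) are assumptions made for illustration only. Under this parameterization the curve approaches the sill only asymptotically, so a "practical range" (here the distance at which 95 % of the sill is reached) is often quoted alongside the range parameter.

```python
import math


def exponential_semivariogram(distance, sill, range_param, nugget=0.0):
    """Semivariance at a given distance under an exponential model.

    Assumed parameterization (illustrative, not taken from the analyses):
    nugget + (sill - nugget) * (1 - exp(-distance / range_param)).
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance == 0:
        return 0.0  # an observation is identical to itself
    return nugget + (sill - nugget) * (1.0 - math.exp(-distance / range_param))


def practical_range(range_param, fraction=0.95):
    """Distance at which the nugget-free model reaches `fraction` of the sill."""
    return -range_param * math.log(1.0 - fraction)
```

With a hypothetical range parameter of 3.5 m, this model reaches about 63 % of the sill at 3.5 m and 95 % of the sill at roughly 10.5 m.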
The composition of the populations was analyzed by comparing the observed distribution with randomly generated compound distributions. We assumed that the F1 and B1 generations (subpopulations) are normally distributed, with means and standard deviations equal to the expected values (see Table 1). We tested nine scenarios with different proportions of the generations. For each scenario, we generated the random distribution 1000 times and tested its equality with the observed distribution using the Kolmogorov-Smirnov D statistic. The random distributions were of approximately the same size as the observed distributions.

To identify local spatial association of similar values, we estimated local Moran's I (Anselin, 1995) using the R package spdep (Bivand et al. 2012). Neighbourhood weights were set as an inverse function of Euclidean distance; important hotspots were selected according to the results of a permutation test refined by Bonferroni bounds.

Given the specific ecology of the parental species, the gradient of ecological conditions inhabited by the hybrids can be approximated by relative differences in altitude (uphill: C. acaule, drier soils; downhill: C. oleraceum, wetter soils). High-resolution altimetry data were taken from the Digital Terrain Model of the Czech Republic of the 5th generation (DMR 5G, measurement error 0.18 m in open landscape; CUZK 2012); the surface representation was processed from raw point data to a triangulated irregular network (TIN) using GRASS GIS (Grass Gis Core Team, 2012). Such data were available only for the Radostín site. Correlation analyses were carried out using linear mixed models, optionally with spatial position as a covariate.

Preliminary analysis

In order to determine the effective sampling distance, we extensively sampled a quarter of the population's extent, measured the genome size of all samples and constructed an empirical semivariogram of the spatial (auto)correlation of genome sizes.
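The construction of an empirical semivariogram mentioned above can be sketched as follows. This is a minimal pure-Python illustration of the classical (Matheron) estimator, in which the semivariance of a distance class is half the mean squared difference between all pairs of plants separated by that distance. It is not the procedure implemented in nlme; the function name, the equal-width binning and the argument names are assumptions made for illustration.

```python
import math
from itertools import combinations


def empirical_semivariogram(points, values, bin_width, max_distance):
    """Classical semivariance estimate per distance class.

    points: list of (x, y) relative coordinates in metres
    values: genome size measured at each point (same order as points)
    Returns (bin_centre, semivariance, pair_count) for every non-empty bin.
    """
    n_bins = int(math.ceil(max_distance / bin_width))
    squared_diff_sums = [0.0] * n_bins
    pair_counts = [0] * n_bins
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d >= max_distance:
            continue  # ignore pairs beyond the largest distance class
        b = min(int(d // bin_width), n_bins - 1)
        squared_diff_sums[b] += (values[i] - values[j]) ** 2
        pair_counts[b] += 1
    return [
        ((b + 0.5) * bin_width, squared_diff_sums[b] / (2 * pair_counts[b]), pair_counts[b])
        for b in range(n_bins)
        if pair_counts[b] > 0
    ]
```

Distance classes containing few pairs give unstable estimates, which is why the pair count is returned together with each semivariance.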
We assumed a stationary and isotropic linear model with the null hypothesis that the genome sizes of all samples are identical. The best-fitting spatial correlation structure was selected among Gaussian, exponential, linear and spherical structures using the Akaike information criterion (AIC). The lowest AIC was found for the exponential structure. Including a nugget parameter did not substantially improve the model. In a model without spatial correlation, the semivariogram increased monotonically and reached the sill at a distance of around 10 m (AIC -960.3). A more complex and more appropriate model with spatially correlated residuals (AIC -1088.5) predicted spatial autocorrelation within a range of 3.5 m for the Radostín population (Fig. 1) and 4.12 m for the Lipovka population. This result was used as the effective sampling distance. Data from these preliminary measurements were randomly selected using the same minimum distance.

* I am making this up a little here; in reality it was done differently, but the meaning of the statement does not change. Is that a problem?

Results

Genome size

In total, 150 samples from the Radostín locality and 67 samples from Lipovka were collected and measured. Three samples with extreme genome size (0.662, 0.676 and 0.683 a.u.), probably triploid hybrids formed from one unreduced gamete of C. oleraceum and one normal gamete of C. acaule, were excluded from further analyses. All measurements yielded symmetrical and well-discriminated peaks. Basic descriptive statistics are given in Table 1. We found considerable genome size variation in both populations, covering the whole range of genome sizes between the parental species (Fig. 2).

Table 1. Descriptive statistics of the genome size distribution in C. acaule, C. oleraceum, the two hybrid populations, and expected values for putative F1 and B1 hybrids. Expected maximum and minimum values were obtained by adding or subtracting three standard deviations.

                                   N     Min      Max      Median   Mean     Std. dev
Hybrids - Radostín                 150   0.4247   0.5270   0.4497   -        0.0182
Hybrids - Lipovka                  57    0.4313   0.5226   0.4711   -        0.0212
Pure species - C. acaule**         20    0.4470   0.5490   0.4980   0.4980   0.0170
Pure species - C. oleraceum**      36    0.4120   0.4420   0.4270   0.4270   0.0050
Expected F1                        -     0.4265   0.4985   0.4625   0.4625   0.01*
Expected B1 - F1 + C. acaule       -     0.4412   0.5192   0.4802   0.4802   0.0130*
Expected B1 - F1 + C. oleraceum    -     0.4165   0.4735   0.4450   0.4450   0.0095*

* compound distribution, sd = sqrt(sqrt(var(A)^2+var(B)^2)/2)
** Bureš et al. 2004

Compared with the theoretical intermediate genome size (0.463 a.u.), the Radostín population median was lower than expected, whereas the Lipovka population median was higher. Normality of the distribution was rejected in both populations (Shapiro-Wilk test, p<0.0001). Moreover, closer inspection of the histograms and the kernel density estimates (Fig. 2) reveals a bimodal distribution of genome size, so the median is an inappropriate measure for comparison. The two local maxima of the distribution overlap with the expected values for the B1 generations. As these generations differ in expected variation (i.e. in the standard deviations of the peaks), we analyzed the probable composition of the populations. Although only individuals with hybrid morphology were sampled, a considerable proportion of the samples had a genome size comparable to that of the pure species.

Among populations

The two populations differed significantly in genome size (Kruskal-Wallis test, p<0.0001). The distributions of the two populations also differed significantly (KS test, p<0.0001); the two population samples therefore do not come from the same theoretical distribution and reflect different processes of genome size evolution.

Composition of the populations

We compared the empirical distributions of the populations with theoretical compound distributions arising from different scenarios of local genome size composition.
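The comparison of an observed distribution with simulated compound distributions can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used for the analyses: the function names, the fixed seed and the choice to draw each simulated sample with exactly the size of the observed one are hypothetical. The means and standard deviations of the components are the expected values from Table 1, and each scenario is scored by the sum of two-sample Kolmogorov-Smirnov D statistics over the replicates, so that lower scores indicate a closer match.

```python
import random

# Expected (mean, sd) of each generation, taken from Table 1.
F1 = (0.4625, 0.01)
B1_AC = (0.4802, 0.0130)   # backcross to C. acaule
B1_OL = (0.4450, 0.0095)   # backcross to C. oleraceum


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def simulate_mixture(components, weights, size, rng):
    """Draw `size` values from a mixture of normal components.

    components: list of (mean, sd); weights: relative proportions, e.g. [2, 1].
    """
    chosen = rng.choices(components, weights=weights, k=size)
    return [rng.gauss(mean, sd) for mean, sd in chosen]


def scenario_score(observed, components, weights, replicates=1000, seed=0):
    """Sum of D statistics between the observed sample and simulated mixtures."""
    rng = random.Random(seed)
    return sum(
        ks_statistic(observed, simulate_mixture(components, weights, len(observed), rng))
        for _ in range(replicates)
    )
```

A scenario such as two parts backcrosses to C. oleraceum and one part backcrosses to C. acaule would be scored with `scenario_score(observed, [B1_OL, B1_AC], [2, 1])`, where `observed` is the list of measured genome sizes of one population.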
The sums of the D statistics over the 1000 runs are listed in Table 2. The lowest sum of differences for the Radostín population was found for a distribution consisting of 66 % backcrosses to C. oleraceum and 33 % backcrosses to C. acaule (Fig. 4). In the Lipovka population, the best result was found for equal proportions (50 % each) of F1 hybrids and backcrosses to C. acaule.

Table 2. Differences between scenarios of genome size composition and the true hybrid populations. B1 Ol = backcrosses to C. oleraceum, B1 Ac = backcrosses to C. acaule. The lowest value for each population is marked (min).

                        Radostín                                 Lipovka
Scenario                ratio   sum D    ratio   sum D           ratio   sum D          ratio   sum D
F1                      1       378.48   -       -               1       351.48         -       -
B1 Ol + F1              1:1     157.08   2:1     150.73          1:1     443.47         2:1     506.18
F1 + B1 Ac              1:1     472.49   2:1     436.23          1:1     195.53 (min)   2:1     232.59
B1 Ol + F1 + B1 Ac      1:1:1   244.14   2:1:1   164.87          1:1:1   256.74         2:1:1   336.89
B1 Ol + B1 Ac           1:1     224.53   2:1     140.24 (min)    1:1     278.01         2:1     409.15

Spatial structure of genome size distribution

Local Moran's I was computed for all samples; only observations with p<0.001 and a z-score > 1.96 (indicating clustering of similar values) or < -1.96 (indicating dispersion) were considered. In the Radostín population, we identified 13 plants as local hotspots (Fig. 5), with unevenly distributed genome size: samples with a clustering tendency (10 plants) mostly had higher genome size, whereas the three dispersed samples comprised two with very low and one with high genome size (Fig. 6). In the Lipovka population, the Moran statistic detected no spatial association; only one plant (GS = 0.52 a.u.) was a local outlier.

Ecological correlates of genome size

# In the case of a bimodal distribution, a different testing method has to be used.

Genome size was positively and significantly correlated with elevation (r=0.03, p=0.009); when residual spatial autocorrelation was taken into account, the result became marginally significant (r=0.03, p=0.068).

Discussion

Genome size variation

The present study of the hybrid C. acaule x C.
oleraceum is exceptional in the number of samples analyzed per population, which permits a closer look at the processes shaping the evolution of genome size in the hybrid. Genome size variation in homoploid hybrid swarms has already been reported for hybrids of Elytrigia repens and E. intermedia (Mahelka et al. 2007). Among 63 homoploid samples, F1 hybrids had intermediate genome size, backcrosses were rare (3 plants), and their genome size could easily be distinguished from that of the F1 hybrids. In the Helianthus annuus - H. petiolaris homoploid hybrid system, where a maternal effect has to be considered, hybrid zone plants, early-generation synthetic hybrids and backcrosses did not differ statistically in genome size from their maternal species (Baack et al. 2005). Low numbers of individuals per population probably mask subtle differences among plants of different origin, while the standard deviations are comparable to the variation detected in the present study. In Picris hispidissima, 1.09-fold genome size variation points to its hybrid origin and ongoing introgression (Slovák et al. 2009). In the genus Narcissus, genome size variation was recorded for hybrids (with odd ploidy levels), whose genome size and chromosome number were intermediate (Marques 2012). Zhou (2010) found intermediate genome size in the homoploid hybrid taxon Hippophae goniocarpa (2.95 pg, sd 0.08), Šiško (2003) in synthetic hybrids of Cucurbita, and Zonneveld (2001) in Helleborus. Most of these studies lack a closer look at the variation within populations.

A genome size continuum might also be found in the well-studied homoploid hybrid Senecio squalidus, which resulted from the hybridization of S. aethnensis and S. chrysanthemifolius (James et Abbott, 2005). According to the C-values database (Bennett et Leitch 2012), the parental species differ more than 1.5-fold. A similar situation is to be expected in homoploid hybrids of Artemisia sect. Tridentatae (1.07-fold; Garcia et al. 2008), Achillea roseoalba (1.07-fold; Guo et al. 2004) and Pinus uliginosa (1.14-fold; Wachowiak et al. 2011). Whether genome size variation is a common feature of homoploid hybrids that has simply not yet received enough attention remains unclear. Among the variables most likely to influence the emergence of genome size variation, we should mention: hybrid frequency; hybrid fitness, fertility and longevity; genetic distance between the parents; evolutionary rates in the genus; ecogeographic isolation between the parents; and the time elapsed since first contact (speciation through reinforcement in a past hybrid zone).

Small-scale genome size alterations are commonly due to the proliferation of retrotransposons, small deletions, insertions or unequal recombination (Bennetzen et al. 2005). Hybridization can even open the way for genome reorganization via multiple processes (Hegarty et Hiscock 2004, Kalisz et Kramer 2008), but here these processes seem to play only a minor role. As we confirmed backcrossing of F1 hybrids to both parental species, mating barriers are of low importance, chromosomal rearrangements are not frequent, and downsizing is not probable.

Intrapopulation dynamics of genome size variation

The maximum density of the Radostín population corresponds to backcrosses of F1 to C. oleraceum, and a smaller fraction to backcrosses of F1 to C. acaule. Strikingly, the genome size category corresponding to F1 hybrids is underrepresented; nonetheless, we cannot rule it out completely. A more detailed differentiation of the distribution is still limited by the number of samples and the precision of genome size estimation. In the surroundings of the Radostín population, abundant populations of C. oleraceum are present, but pure C. acaule is almost missing. The present state is thus a relict of a past state before the extinction of C. acaule. Whether pure C.
acaule disappeared due to habitat change or to inbreeding depression in a small and isolated population cannot be decided, but genetic erosion through hybridization definitely contributed to it.

* The ecological correlate would fit nicely here: Ecological differentiation maintains genome size variation. Plants with higher genome size perform better in environments similar to those of C. acaule.

The Lipovka population consists of equal proportions of backcrosses to C. acaule and F1 hybrids. Both parental species are present in the surroundings.

* Why is B1 Ol missing?

Indicators of local association and dispersion (local Moran's I): as can be seen from the number of dispersed samples, immigration/dispersal along the gradient happens rarely. Both small and large genomes are prone to dispersal.

Hybridization

Hybrid fitness/advantage: plants of the pure species did not occur at exactly the same microsites as the hybrids → mosaic sympatry, sympatric speciation, fitness superior to the parents in transition zones (Arnold 2007, Rieseberg 2003, Wang 2001).

Spatial autocorrelation of genome sizes can be regarded as a result of 1) vegetative sprouting, 2) the pollination distance effect and 3) ecological specialization.

Influence of pollination distance, which in Cirsium does not exceed 10 m (Beattie 1976, Price & Waser 1979, de Vere 2007).

Seed dispersal distance; individuals with low viability occurring in populations of the pure species; immigration and/or introgression (Bureš 2010).

Hybrids contribute pollen, C. oleraceum mothers provide ovules. Possibility of a maternal effect?

Hybridization and introgression: movement of alleles vs. incorporation of alleles (Harrison 2012). Even in the case of large populations, the proximity of hybrids does not affect the phenotypic integrity of the parental population. However, indications of introgression emerge in studies of pollen viability, as frequently hybridizing species (C. oleraceum and C. acaule among them) show occasional decreases in pollen viability (Bureš 2010).
Genome size flow? Improbable, as the size of introgressing DNA fragments is under control; smaller fragments are less likely to contain deleterious genes, as in Populus (Martinsen et al. 2001). Is genome size neutral? No, as it is not flowing.

Trash:
chromosomal compatibility is maintained by a flexible mechanism?
competitive advantage of conspecific pollen (Waser 2000)
semipermeable barriers (Harrison 2012)
coexistence in patchy environments
longevity of perennials in changing environments
outnumbering hypothesis
blending inheritance, paint pot, stabilizing selection
gynodioecious

Fig. 1 semivariogram, Radostín
Fig. 2 histograms, Radostín and Lipovka
Figure 3: table of GLS results with confidence intervals
Fig. 4 density comparison
Fig. 5 hotspots in the meadow
Fig. 6 boxplot, clustered vs. dispersed