Establishment of a core collection to optimise the conservation of cherimoya (Annona cherimola Mill.) genetic resources using SSR information

P. Escribano, M.A. Viruel, J.I. Hormaza
Estación Experimental La Mayora – CSIC, 29750 Algarrobo-Costa, Málaga, Spain

Keywords: ex situ conservation, germplasm management, plant genetic resources

Abstract
The management and evaluation of large germplasm collections is expensive and inefficient because of redundancies and/or duplications and because it is impossible to analyze in detail all the accessions conserved, particularly in fruit tree species. Collection management can therefore be significantly improved if the regeneration, characterization and evaluation steps are focused on a subset of individuals, termed a 'core collection', that represents the diversity conserved in the whole germplasm collection. Although molecular markers are becoming the tool of choice for the development of core collections in plants, examples of their use to develop core collections in fruit species are very scarce. In this work, we used SSR marker data to develop a core collection in an underutilized subtropical fruit tree species, cherimoya (Annona cherimola Mill.), from an initial collection of 279 genotypes from different countries. We compared six alternative allocation methods to construct the core collection. The best subset was obtained with 40 accessions. In this subset, all the SSR alleles present in the whole collection were recovered, and no significant differences between the core and the whole collection were recorded in the frequency distribution of alleles for any of the loci studied or in the variability parameters.

INTRODUCTION
The management and evaluation of large germplasm collections is expensive and inefficient because of the redundancies and/or duplications in the collections and because it is impossible to analyze in detail all the accessions conserved (Grenier et al., 2000a).
In such cases it is useful to select a subset of individuals that represents the whole germplasm collection, so that the effort of regeneration, characterization and evaluation can be concentrated on this subset. Frankel and Brown (1984) defined a core collection as a subset of a larger germplasm collection that maximizes the possible genetic diversity of a crop species with minimum redundancy. In recent years, molecular markers have become the tool of choice for the development of core collections. However, although core collections in fruit species could greatly improve germplasm management, examples of the use of molecular markers to develop core collections in these species are scarce. In this work, we used SSR marker data to develop a core collection in an underutilized subtropical fruit tree species, cherimoya (Annona cherimola Mill., Annonaceae), from an initial collection of 279 genotypes from different countries. We compared six alternative allocation methods to construct the core collection: four not based on the similarity dendrogram and two based on dendrogram data.

MATERIAL AND METHODS

Data set
The data set comprised 279 accessions of cherimoya from different geographical origins maintained at the E.E. La Mayora–CSIC in Málaga (Spain). These accessions have been genetically analyzed with 16 polymorphic SSRs (Escribano et al., 2007).

Construction of the core collection
Six alternative approaches to develop core collections were compared. For each approach, five core subsets of different sizes, ranging from 10 to 50 accessions at ten-individual intervals, were developed.
a. Random sampling (R). The accessions were selected from the whole collection by random sampling without replacement.
b. Maximization strategy (M-strategy) (M) (Schoen and Brown, 1995). It was carried out using the MSTRAT software (Gouesnard et al., 2001). This strategy maximizes the number of alleles at each locus, using Nei's diversity index (Nei, 1987) as a second maximization criterion.
c. Simulated annealing algorithm, using the Core Set function in PowerMarker (Liu and Muse, 2005). In this case two optimal core subsets were obtained: one maximizing genetic diversity (SD) and the other maximizing the number of SSR alleles (SA).
d. Logarithmic strategy (L) (Brown, 1989). Six main groups were defined according to the dendrogram of Escribano et al. (2007) (cluster A to cluster E).
e. Stepwise clustering with random sampling (S) according to Hu et al. (2000), also based on the dendrogram obtained by Escribano et al. (2007).

Characterization of the subsets and comparison to the entire collection
The different subsets obtained were compared using molecular diversity measures: number (A) and frequency (Fr) of alleles, and observed (Ho) and expected (He) heterozygosities, calculated with ARLEQUIN version 3.01 (Excoffier et al., 2005). Allele frequencies at each locus were compared between the entire collection and the core collections with the Chi-square test. The remaining parameters related to collection diversity (A, He and Ho) were compared by Friedman's Repeated Measures Analysis of Variance on Ranks. Post-hoc Dunnett's test was used to compare the different subsets with the whole collection (control group). All comparisons, at a significance level of p<0.05, were carried out with SigmaStat 3.0 (SPSS Science Version, Chicago, IL, USA).

RESULTS AND DISCUSSION
For every method except the M-strategy, the number of alleles differed significantly (p<0.05) from that of the whole collection at some subset sizes.
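As a rough illustration of the diversity parameters being compared (A, Ho and He), the following sketch computes them for a single diploid SSR locus. It is not the ARLEQUIN implementation: the function name `locus_stats`, the toy genotypes, and the choice of He = 1 - sum(p_i^2) with no sample-size correction are all assumptions made for this example.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarise one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per accession.
    Returns (A, Ho, He, allele_frequencies).
    """
    # Flatten the genotype pairs into a list of allele copies.
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = {allele: n / total for allele, n in counts.items()}

    n_alleles = len(counts)  # A: number of distinct alleles
    # Ho: proportion of accessions carrying two different alleles
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # He: expected heterozygosity, here without sample-size correction (assumption)
    he = 1.0 - sum(p * p for p in freqs.values())
    return n_alleles, ho, he, freqs


# Hypothetical allele sizes (bp) for four accessions at one SSR locus.
toy_locus = [(150, 150), (150, 154), (154, 158), (150, 158)]
a, ho, he, freqs = locus_stats(toy_locus)
print(a, ho, he)
```

Applying the same function to the whole collection and to a candidate subset, locus by locus, gives the paired values that a test such as the one described in the methods would then compare.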
Observed heterozygosities were similar to those of the whole collection in all the subsets, with three exceptions where significant (p<0.05) differences were obtained: the subset of 10 accessions with the M-strategy, the subset of 20 accessions with the random strategy, and all the subsets developed following the L-strategy. For expected heterozygosity (analogous to Nei's genetic diversity index), significant (p<0.05) differences were recorded only for the M-strategy with 10 individuals. For allele frequencies, no significant differences in at least 95% of the loci were obtained only with the following subsets: 50 accessions with the random strategy, 40 and 50 accessions with the M-strategy, 50 accessions with the SD strategy, and 47 and 76 accessions with the S strategy.

All the strategies seem to represent the overall diversity of the collection (He) fairly well. However, the M-strategy is the only approach that recovers all the alleles of the whole collection with as few as 30 accessions, reducing redundancy while capturing most of the genetic diversity. The whole germplasm collection therefore shows high redundancy, since all the alleles could be retained with only 30 cultivars using the M-strategy; with that subset, however, 6% of the loci showed allele frequencies significantly different from those of the whole collection. Consequently, although some redundancy remains, the subset that best fitted all the validation parameters was that of 40 accessions with the M-strategy (Table 1). In this subset, all the alleles were recovered and no significant differences were recorded for any of the other variability parameters studied (He and Ho). This subset represents 14% of the original collection. All the countries in the whole collection are represented in the selected core collection, except Italy and Chile.
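To make concrete the idea of recovering every allele with a small subset, here is a minimal greedy sketch that repeatedly adds the accession contributing the most alleles not yet captured. This is a simplification written for illustration, not the MSTRAT algorithm used in this work; the names `allele_set` and `greedy_core` and the toy genotypes are hypothetical.

```python
def allele_set(genotype):
    """Return the set of (locus_index, allele) pairs carried by one accession."""
    return {(i, allele) for i, pair in enumerate(genotype) for allele in pair}


def greedy_core(collection, size):
    """Select up to `size` accessions, each time adding the one that brings
    the largest number of alleles not yet present in the core."""
    remaining = {name: allele_set(g) for name, g in collection.items()}
    captured, core = set(), []
    while remaining and len(core) < size:
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda n: len(remaining[n] - captured))
        captured |= remaining.pop(best)
        core.append(best)
    return core, captured


# Hypothetical collection: four accessions genotyped at two SSR loci.
toy = {
    "acc1": [(150, 150), (200, 204)],
    "acc2": [(150, 154), (200, 200)],
    "acc3": [(154, 158), (204, 208)],
    "acc4": [(150, 150), (200, 200)],
}
all_alleles = set().union(*(allele_set(g) for g in toy.values()))
core, captured = greedy_core(toy, 2)
print(core, len(captured), "of", len(all_alleles), "alleles captured")
print("core fraction:", len(core) / len(toy))
```

In this toy case two of the four accessions already carry every allele. The corresponding ratio for the subset reported here is 40/279, or about 14% of the collection. A greedy pass only targets allele number, so allele frequencies and heterozygosities would still need to be checked against the whole collection separately.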
The UPGMA dendrogram obtained after the similarity analysis of the accessions that constitute the core collection is shown in Fig. 1. As with the whole collection (Escribano et al., 2007), no clear geographic pattern was observed.

ACKNOWLEDGEMENTS
Financial support for this work was provided by the Spanish Ministry of Education (Project Grant AGL2004-02290/AGR) and the European Union under the INCO-DEV program (Contract 015100).

Literature Cited
Brown, A.H.D. 1989. Core collection: a practical approach to genetic resources management. Genome 31: 818-824.
Escribano, P., Viruel, M.A. and Hormaza, J.I. 2007. Molecular characterization, genetic diversity patterns and population structure within a worldwide germplasm collection of Annona cherimola Mill. (Annonaceae) using SSRs. Journal of the American Society for Horticultural Science 132: 357-367.
Excoffier, L., Laval, G. and Schneider, S. 2005. Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1: 47-50.
Frankel, O.H. and Brown, A.H.D. 1984. Current plant genetic resources: a critical appraisal. In: Genetics: new frontiers (vol. IV). Oxford and IBH Publishing, New Delhi, India, pp. 1-13.
Gouesnard, B., Bataillon, T.M., Decoux, G., Rozale, C., Schoen, D.J. and David, J.L. 2001. MSTRAT: An algorithm for building germ plasm core collections by maximizing allelic or phenotypic richness. Journal of Heredity 92: 93-94.
Grenier, C., Bramel-Cox, P.J., Noirot, M., Rao, K.E.P. and Hamon, P. 2000a. Assessment of genetic diversity in three subsets constituted from the ICRISAT sorghum collection using random vs non-random sampling procedures. A. Using morpho-agronomical and passport data. Theoretical and Applied Genetics 101: 190-196.
van Hintum, T., Brown, A.H.D., Spillane, C. and Hodgkin, T. 2000. Core collections of plant genetic resources. IPGRI Technical Bulletin No. 3. International Plant Genetic Resources Institute, Rome, Italy.
Hu, J., Zhu, J. and Xu, H.M. 2000. Methods of constructing core collections by stepwise clustering with three sampling strategies based on the genotypic values of crops. Theoretical and Applied Genetics 101: 264-268.
Liu, K.J. and Muse, S.V. 2005. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21: 2128-2129.
Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.
Schoen, D.J. and Brown, A.H.D. 1995. Maximizing genetic diversity in core collections of wild relatives of crop species. In: Hodgkin, T., Brown, A.H.D., van Hintum, Th.J.L. and Morales, A.E.V. (Eds.), Core collections of plant genetic resources. John Wiley and Sons, Chichester, UK, pp. 55-76.

Table 1. Origin of cultivars selected to form the best subset.

Cultivar        Country      Cultivar    Country
Anderson        Australia    SP6         Peru
Andrews         Australia    SP10        Peru
CortesII        Mexico       SP26        Peru
El Bumbo        Australia    SP36        Peru
Equador         Australia    SP41        Peru
Fino de Jete    Spain        SP46        Peru
Haluza          Australia    SP52        Peru
Loma            USA          SP55        Peru
Mariella        Australia    SP65        Peru
Mexico1         Mexico       SP76        Peru
Pierce          USA          SP7752      Peru
Sabor           Australia    SP79        Peru
SB109           Bolivia      SP86        Peru
SB124           Bolivia      SP95        Peru
SC10            Colombia     SP129       Peru
SE11            Ecuador      SP131       Peru
SE14            Ecuador      SP137       Peru
SE29            Ecuador      SP138       Peru
SM29            Portugal     SP205       Peru
SM32            Portugal     Zarzero     Costa Rica

[Figure: UPGMA dendrogram of the selected accessions; horizontal axis: Similarity, from 0.40 to 1.00]
Fig. 1. Dendrogram with the 40 selected accessions of the best cherimoya core subset using UPGMA of similarity data.