On the Origin of Sunflowers: Fossils, Genes, Genomes, and Hybridization

Loren H. Rieseberg1, Benjamin K. Blackman2, Moira Scascitelli1, and Nolan C. Kane1

1University of British Columbia, 3529-6270 University Blvd., Vancouver, B.C. V6T 1Z4, Canada, lriesebe@mail.ubc.ca
2Department of Biology, Duke University, Durham, NC 27701, USA, bkb7@duke.edu

ABSTRACT

The domestication of plants and animals by prehistoric humans was perhaps the most far-reaching cultural development in human history. Not only were domesticated organisms crucial to the rise of modern civilization, but their widespread use has dramatically altered the ecology and evolutionary history of numerous other species. As a consequence, there is considerable interest in determining the geographic origins, timing, and genetic bases of domestication. Here we review how molecular data have shed light on the origin of domesticated sunflowers, thereby resolving a debate about whether agriculture arose wholly independently in eastern North America. Analyses of variation at microsatellite loci indicate that wild populations from the East-Central USA are the most likely ancestral source for all domesticated sunflower landraces. Likewise, for two of three candidate genes, domesticated alleles occur in wild populations from eastern North America but are absent from Mexican wild populations. Thus, all extant cultivated sunflowers appear to have arisen from a single domestication event in eastern North America. Ongoing studies are employing genome scan approaches to (1) identify large numbers of domestication genes, (2) study the geographic distribution of domestication alleles, (3) determine the targets of selection by early farmers, and (4) establish the role of hybridization in sunflower domestication and improvement.
In the future, we hope to extend these genome scan approaches to include archaeological sunflower remains, which should allow us to estimate the timing and strength of selection on domestication alleles.

Keywords: archaeological evidence - domestication genes - microsatellites - selective sweeps - sunflower origins

INTRODUCTION

In 1951, Charley Heiser showed that the cultivated sunflower was most likely domesticated by Native Americans in the East-Central USA (Heiser, 1951). This finding, which was based on analyses of both extant sunflowers and archaeological remains, was of considerable general significance because it implied that agriculture had arisen independently in eastern North America. Over the next 50 years, Heiser’s conclusions were corroborated by additional findings of domesticated achenes from archaeological sites in eastern North America (Crites, 1993) and by the failure to find sunflower remains outside of this region.

These conclusions were revisited recently due to the discovery of a fossilized seed and achene, tentatively identified as belonging to domesticated sunflowers, at the San Andrés archaeological site in Tabasco, Mexico (Lentz et al., 2001; Pope et al., 2001). Using accelerator mass spectrometry, the authors dated the San Andrés remains to before 2000 B.C. The Mexican achenes were larger than those found in archaeological sites in eastern North America from the same time period. These data were interpreted as evidence of an independent origin of the domesticated sunflower in southern Mexico that predated and possibly influenced the later domestication of sunflower in eastern North America (Lentz et al., 2001). This interpretation received further support from the discovery of three putative domesticated sunflower achenes from the Cueva del Gallo site in Morelos, Mexico, one of which dated to 290 B.C. (Lentz et al., 2008).

However, several scholars have taken issue with these findings.
For example, Smith (2006) noted that the San Andrés specimens lack morphological characters diagnostic for Helianthus, and Heiser (2008) documented the striking similarity of the San Andrés achene to bottle gourd seeds, which are common at the site. Smith (2008) further commented that, unlike the San Andrés material, the Cueva del Gallo achenes ‘fall within the size range of the Marble Bluff (Arkansas) sunflower assemblage (n = 260), which predates Gallo by >1,000 years.’ Thus, he argued that the Cueva del Gallo specimens (if shown to be sunflower) probably represent an introduction from eastern North America.

Given these disagreements over the validity and interpretation of the fossil data, several studies have been conducted using putatively neutral molecular markers and/or candidate domestication genes to determine the number and geographic location(s) of domesticated sunflower origins. Here we briefly review the findings from these studies and also describe some new analyses of microsatellite loci that provide further support for a single origin of the domesticated sunflower in eastern North America.

NEUTRAL MARKER STUDIES

Several early studies examined relationships among cultivated and wild sunflowers using allozymes (Rieseberg and Seiler, 1990), chloroplast DNA restriction site variation (Rieseberg and Seiler, 1990), random amplified polymorphic DNA (Arias and Rieseberg, 1995), and nuclear microsatellite loci (Tang and Knapp, 2003). However, these earlier studies did not include significant sampling from Mexico, and thus were not well designed to determine the geographic origin(s) of the domesticated sunflower. A more comprehensive sampling strategy was employed by Harter et al. (2004), who analyzed variation at 18 microsatellite loci in 21 wild populations from throughout the native range of H. annuus, including eight populations from Mexico. In addition to the wild population samples, Harter et al.
assayed seven landraces cultivated by Native American groups in the USA, two domesticated landraces from Mexico, and one modern cultivar. Analyses of genetic relationships showed that all cultivars employed in the study had a single origin in eastern North America. This result was corroborated by a subsequent analysis of chloroplast DNA variation across a geographically broad sample of wild and domesticated accessions, which pointed towards a single domestication ‘somewhere outside of Mexico’ (Wills and Burke, 2006).

While it initially appeared that the Harter et al. (2004) and Wills and Burke (2006) studies had settled the sunflower origins debate in favour of a single origin in eastern North America, concerns were expressed about the limited number of Mexican landraces included in these studies (Lentz et al., 2008). Therefore, the microsatellite study of Harter et al. (2004) was extended to include five additional Mexican landraces (Blackman et al., 2011). An assignment test using the computer program STRUCTURE confirmed the previous conclusions of Harter et al. (2004): all seven of the Mexican landraces clustered with eastern North American wild samples and not with the Mexican wild samples. Ancestry analyses further revealed that >96% of the alleles found in the Mexican cultivars derived from eastern North American (ENA) wild populations.

Here we further investigated the assignment of these domesticated samples and their genetic composition on a more detailed scale, using the four main wild clusters identified by Harter et al. (2004) as potential ancestral sources. These four groups correspond to East-Central USA (ENA), US Great Plains, East-Central Mexico (plus Arizona), and West Mexico. We performed ten independent simulations using a Bayesian clustering method, as implemented in STRUCTURE v2.3.3 (Pritchard et al., 2000).
The East-Central USA cluster was the most likely ancestral source for all the cultivated groups, including all seven Mexican landraces (Fig. 1), with an average ancestry coefficient of 0.99 for the eastern North American (ENA) cultivars and 0.98 for the Mexican (MX) cultivars.

Fig. 1. Estimated ancestry of domesticated H. annuus from North America (ENA Domesticated) and Mexico (MX Domesticated). Each vertical bar represents an individual’s genome, and grey shading represents the estimated proportion of ancestry from wild populations of the East-Central USA (ENA), USA Great Plains, eastern Mexico plus Arizona, and western Mexico.

We also performed a neighbour-joining (NJ) tree analysis with the extended sample set of Blackman et al. (2011). The NJ tree was based on a matrix of Nei’s genetic distance (DA; Takezaki and Nei, 1996) generated between pairs of populations or landraces, with 1000 bootstrap replicates. We then obtained a consensus NJ tree with an extended majority-rule method using the programs neighbor and consense, available in the package PHYLIP v3.69 (Felsenstein, 1989). This distance-based method (Fig. 2) supports the findings of the Bayesian clustering assignment analysis (Fig. 1): all the cultivars (ENA and MX) cluster most closely with wild ENA populations and away from any of the wild Mexican groups. Taken together, these results indicate that all the extant cultigens, including all the indigenous Mexican landraces currently analyzed, are derived from wild sunflowers from the East-Central United States.

Fig. 2. Neighbor-joining tree for wild and domesticated sunflowers based on microsatellite genetic distances. Wild populations are indicated by squares (grey shading within squares matches the potential ancestral clusters used in Figure 1). ENA and Mexican landraces are indicated by empty and filled black triangles, respectively. Two modern cultivars, Mammoth and USDA, are also included in the NJ tree.
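
As a rough illustration of the distance calculation behind the NJ tree described above, the sketch below computes Nei's DA between populations from per-locus allele frequencies, using the standard definition DA = 1 - (1/L) Σ_loci Σ_alleles sqrt(x · y), where x and y are the frequencies of an allele in the two populations and L is the number of loci. The population names and allele frequencies are invented for illustration and are not taken from the sunflower data set. The function names are likewise hypothetical, and the sketch does not reproduce the bootstrapping or the tree building that the text reports doing with PHYLIP.

```python
from itertools import combinations
from math import sqrt


def nei_da(freqs_x, freqs_y):
    """Nei's DA distance between two populations.

    Each argument is a list with one dict per locus, mapping allele -> frequency.
    Both lists must cover the same loci in the same order.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("populations must share the same non-empty set of loci")
    total = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        # Only alleles present in both populations contribute to the sum.
        shared = locus_x.keys() & locus_y.keys()
        total += sum(sqrt(locus_x[a] * locus_y[a]) for a in shared)
    return 1.0 - total / len(freqs_x)


def pairwise_da(populations):
    """Return {(name_a, name_b): DA} for every unordered pair of populations."""
    return {
        (a, b): nei_da(populations[a], populations[b])
        for a, b in combinations(sorted(populations), 2)
    }


# Hypothetical allele frequencies at two microsatellite loci (allele sizes in bp).
populations = {
    "wild_A": [{150: 0.5, 154: 0.5}, {200: 1.0}],
    "wild_B": [{158: 1.0}, {204: 1.0}],
    "landrace": [{150: 0.5, 154: 0.5}, {200: 1.0}],
}

distances = pairwise_da(populations)
for pair, d in sorted(distances.items()):
    print(pair, round(d, 3))
```

In this toy example the landrace has the same allele frequencies as wild_A and shares no alleles with wild_B, so it sits at the minimum distance from the former and the maximum distance from the latter. A matrix of such pairwise values is the kind of input a neighbour-joining program expects.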
CANDIDATE GENE STUDIES

The origin of extant domesticated sunflowers in eastern North America is also supported by sequence variation in genes that have undergone selective sweeps during sunflower domestication. These “domestication genes” are especially useful for elucidating the history of domestication because sequence variation in positively selected genes is more likely to accurately reflect phylogenetic relationships than is variation in neutral genes. To further test for the possibility of a second origin of the domesticated sunflower in Mexico, Blackman et al. (2011) analyzed sequence variation in three candidate domestication genes in an extended sample of North American wild populations, Mexican wild populations, and indigenous Mexican landraces.

The three candidate genes employed were: (1) c4973, a chorismate synthase homolog involved in aromatic amino acid synthesis (Chapman et al., 2008); (2) HaFT1, a homolog of the floral inducer FLOWERING LOCUS T (Blackman et al., 2010); and (3) HaGA2ox, a gibberellin 2-oxidase homolog (Blackman et al., 2011). All three genes exhibit significantly reduced sequence diversity in domesticated sunflowers when compared to neutral loci. HaFT1 has been shown to underlie a major flowering time QTL in domesticated sunflowers, and selection for later flowering may be responsible for the putative selective sweep observed for this gene. The selection pressures responsible for the apparent sweeps at c4973 and HaGA2ox are unknown. However, gibberellin 2-oxidases are known to be involved in the regulation of seed germination, and selection for reduced seed dormancy represents one possible explanation for the selective sweep observed at HaGA2ox (Blackman et al., 2011).

The relative and absolute frequencies of the haplotypes found in wild and cultivated sunflowers from USA/Canada or Mexico (Blackman et al., 2011) for the three genes are summarized in Figure 3.
For two of the three genes (HaFT1 and HaGA2ox), the domesticated landraces from Mexico were fixed (or nearly fixed) for the same ‘domesticated’ allele found in North American cultivars. North American wild populations carried the domesticated alleles at low frequencies, whereas these alleles were absent in Mexican wild samples, as expected under the hypothesis of a single domestication event in eastern North America. The fact that the domesticated haplotypes are at low frequency in wild USA populations is consistent with a selective sweep, in which the domesticated allele increased in frequency during the domestication process.

Fig. 3. Frequencies of domesticated and wild alleles in North American and Mexican landraces (ENA and MX Domest., respectively) and in American and Mexican wild populations (ENA and MX Wild, respectively). The number of haplotype sequences per allele class is reported inside each pie chart. The three genes analyzed (HaFT1, HaGA2ox, and c4973) appear to have undergone selective sweeps during early domestication in sunflowers (Chapman et al., 2008; Blackman et al., 2011).

Two domesticated haplotypes were found for the third gene, c4973, both of which were present in ENA and MX wild samples (Blackman et al., 2011). Thus, this gene alone cannot distinguish between the hypothesis of a single origin of the domesticated sunflower in eastern North America and that of multiple independent origins in both eastern North America and Mexico. However, as noted by Blackman et al. (2011), the higher frequency of the domesticated haplotypes in wild populations from the East-Central USA supports the former hypothesis.

In conclusion, all of the molecular data support the hypothesis originally put forward by Heiser (1951) that cultivated sunflowers derive from a single domestication event in eastern North America. An important caveat is that these molecular analyses have only sampled from extant sunflower cultivars.
Thus, the possibility of independent domestication and subsequent extinction of Mexican sunflower cultivars cannot be ruled out.

FUTURE DIRECTIONS

While it might seem that the issue of domesticated sunflower origins is fully solved, at least with respect to what molecular data can contribute, we feel that much more can and should be done. For example, genome-wide scans are underway to identify a much larger fraction of candidate domestication genes. Analyses of the geographic distribution of a larger number of domesticated alleles should allow us to more precisely identify the ancestral germplasm that gave rise to the domesticated sunflower and better determine the targets of selection by early farmers. Extension of the genome scan studies to include archaeological sunflower remains will make it possible to estimate the timing and strength of selection on domestication alleles, as well as to determine the ancestry of the Mexican fossils. Lastly, as sequence data become available for other sunflower species, it will become increasingly possible to assess the role of hybridization in sunflower domestication and improvement.

ACKNOWLEDGEMENTS

We thank Charley Heiser for inspiring the work described in this paper, David Lentz and Robert Bye for their extensive collections of wild and domesticated sunflowers from Mexico, Abby Harter for access to her microsatellite data set, and Bruce Smith for helpful discussions about sunflower origins.

REFERENCES

Arias, D.M., and L.H. Rieseberg. 1995. Genetic relationships among domesticated and wild sunflowers. Econ. Bot. 49:239-248.

Blackman, B.K., J.L. Strasburg, A.R. Raduski, S.D. Michaels, and L.H. Rieseberg. 2010. The role of recently derived FT paralogs in sunflower domestication. Curr. Biol. 20:629-635.

Blackman, B.K., M. Scascitelli, N.C. Kane, H.H. Luton, D.A. Rasmussen, R.A. Bye, D.L. Lentz, and L.H. Rieseberg. 2011. Sunflower domestication alleles support single domestication center in eastern North America. Proc. Natl.
Acad. Sci. USA 108:14360-14365.

Chapman, M.A., C.H. Pashley, J. Wenzler, J. Hvala, S. Tang, S.J. Knapp, and J.M. Burke. 2008. A genomic scan for selection reveals candidates for genes involved in the evolution of cultivated sunflower (Helianthus annuus). Plant Cell 20:2931-2945.

Crites, G.D. 1993. Domesticated sunflower in fifth millennium B.P. temporal context: New evidence from Middle Tennessee. Am. Antiquity 58:146-148.

Felsenstein, J. 1989. PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5:164-166.

Harter, A.V., K.A. Gardner, D. Falush, D.L. Lentz, R. Bye, and L.H. Rieseberg. 2004. Origin of extant domesticated sunflowers in eastern North America. Nature 430:201-205.

Heiser, C.B. 1951. The sunflower among the North American Indians. Proc. Am. Philos. Soc. 95:432-448.

Heiser, C.B. 2008. The sunflower (Helianthus annuus) in Mexico: Further evidence for a North American domestication. Genet. Resour. Crop Evol. 55:9-13.

Lentz, D.L., M.E.D. Pohl, K.O. Pope, and A.R. Wyatt. 2001. Prehistoric sunflower (Helianthus annuus L.) domestication in Mexico. Econ. Bot. 55:370-377.

Lentz, D.L., M.E.D. Pohl, J.L. Alvarado, S. Tarighat, and R. Bye. 2008. Sunflower (Helianthus annuus L.) as a pre-Columbian domesticate in Mexico. Proc. Natl. Acad. Sci. USA 105:6232-6237.

Pope, K.O., M.E.D. Pohl, J.G. Jones, D.L. Lentz, C. von Nagy, F.J. Vega, and I.R. Quitmyer. 2001. Origin and environmental setting of ancient agriculture in the lowlands of Mesoamerica. Science 292:1370-1373.

Pritchard, J.K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-959.

Rieseberg, L.H., and G.J. Seiler. 1990. Molecular evidence and the origin and development of the domesticated sunflower (Helianthus annuus, Asteraceae). Econ. Bot. (suppl.) 44:79-91.

Smith, B.D. 2006. Eastern North America as an independent center of plant domestication. Proc. Natl. Acad. Sci. USA 103:12223-12228.

Smith, B.D. 2008.
Winnowing the archaeological evidence for domesticated sunflower in pre-Columbian Mesoamerica. Proc. Natl. Acad. Sci. USA 105:E45.

Takezaki, N., and M. Nei. 1996. Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144:389-399.

Tang, S., and S.J. Knapp. 2003. Microsatellites uncover extraordinary diversity in native American land races and wild populations of cultivated sunflower. Theor. Appl. Genet. 106:990-1003.

Wills, D.M., and J.M. Burke. 2006. Chloroplast DNA variation confirms a single origin of domesticated sunflower (Helianthus annuus L.). J. Hered. 97:403-408.