Genetic Basis of Agronomic Traits

Connecting Phenotype to Genotype
Rely on co-inheritance of functional polymorphism and DNA variant

Traditional F2 QTL Mapping
Use recombination events in F2 to narrow trait of interest to a genomic region

Association Mapping
Correlate molecular with phenotypic variation, rely on many generations of historical recombination, phenotype of interest may be associated with a smaller chromosomal segment

Yu and Buckler (2006); Zhu et al. (2008)

Association Mapping Workflow
1. Germplasm
   Choose lines to include in mapping population to capture as much diversity as possible
2. Genotyping
   Genome-wide association mapping
   Candidate/targeted gene approach
2. Phenotyping
   Grow and measure traits in replicated trials
3. Association Testing
   Correlate phenotypic variation with genotypic variation

Association Mapping Considerations
• Extent of linkage disequilibrium
  – Informs genotyping strategy
  – Amount of resolution
• Degree of population structure
  – Can lead to false associations

Sunflower Association Mapping (SAM) Objectives
1. Population genetics of the sunflower germplasm, select lines for inclusion
2. Investigate the structure of LD within the association mapping population
3. Grow and characterize the population for wide variety of traits + genotype
4. Test for associations between molecular polymorphisms and variation in key traits

SAM Population Line Selection
433 Cultivated Sunflower Lines
• Core 12 (~50% of allelic diversity)
• Core 48 (~60% of allelic diversity)
• Core 96 (~70% of allelic diversity)
• Core 192 (~80% of allelic diversity)
• Core 288 (~90% of allelic diversity)
Mandel et al., TAG 2011

SAM Genetic Diversity
Mandel et al., PLoS Genetics 2013

SAM Genetic Relationships
HA X RHA (10k SNPs; Mandel et al., PLoS Genetics 2013)

Genome-Wide Patterns of FST: RHA vs.
HA (10k SNPs; Mandel et al., PLoS Genetics 2013)

Genome-Wide Patterns of LD
Linkage Group 1 (10k SNPs; Mandel et al., PLoS Genetics 2013)

Genome-Wide Patterns of LD
Linkage Group 10 (10k SNPs; Mandel et al., PLoS Genetics 2013)

Genome-Wide Patterns of LD
10k SNPs; Mandel et al., PLoS Genetics 2013

Background Genomic Diversity
• Substantial SNP genetic variation
• Population structure RHA vs. HA
  – Somewhat restricted to linkage groups
• LD also varies extensively across the genome
• Phenotypic measurements

SAM Field Locations
• Plant > 20K seeds
• 288 inbred lines
• 4 plants per line
• 2 replicates
• 3 locations
• 7,200 plants
• 15 people

SAM Phenotyping/Genotyping
Phenotyping:
- Flowering time
- Plant architecture
- Pigmentation
- Leaf traits
- Seed size/shape
- Oil-related traits
- Dormancy/germination
- Wood-related traits
- Total biomass
- Leaf C and N
Genotyping strategies:
- Entire SAM re-sequencing
- 10k SNP Infinium array
- GBS approach, ~40k SNPs

Flowering Time SNP Associations
10k SNP array (Mandel et al., PLoS Genetics 2013)

Visualizing Associations – LG 10
Recessive apical branching; figure labels: No Branching, Branching (10k SNPs; Mandel et al., PLoS Genetics 2013)

Elevated LD and Potential Targets of Selection
Figure labels: Downy Mildew, Branching/Flowering, Sunflower Rust, Black Stem, Downy Mildew (10k SNPs; Mandel et al., PLoS Genetics 2013)

Co-Localization of QTL and SNP Associations
Panels: Days to Flower, Total Branching (10k SNPs; Mandel et al., PLoS Genetics 2013)

Cell-wall Chemistry SNP Associations
GBS, lignin at GA location (~40k SNPs)

SAM Re-Sequencing Efforts
• Entire SAM population of 288 lines
• South Africa ARC, Genome Canada/Quebec, and INRA
• Illumina HiSeq, 1 or 2 samples per lane

SAM Re-Sequencing Data Analysis Workflow
Adam Bewick and Ben Hsieh
• Sunflower genome
  – Version: Nov22k22.scf.split.fasta
• Read-trimming with prinseq-lite
• Alignment with BWA
• Produce VCF files with samtools

SAM Re-Sequencing Coverage
Adam Bewick
Groups shown: LR, NO-I, O-I, OPV, NO, O
191/288 lines have been run through the pipeline
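
The trim, align, and call steps of the re-sequencing workflow can be sketched as a small dry-run script that only assembles the command lines for one sample and does not execute them. This is an illustration under assumptions, not the pipeline actually used: the sample ID, the FASTQ file naming, the quality and length cutoffs, the thread count, and the choice of `bwa mem` are all hypothetical, and the reference is assumed to be indexed already with `bwa index`. The slides name samtools for VCF production; the sketch swaps in `bcftools mpileup` and `bcftools call`, the companion tool that handles variant calling in current samtools releases.

```python
import shlex


def build_pipeline(sample, ref="Nov22k22.scf.split.fasta", threads=4):
    """Return shell command strings for one paired-end sample (dry run only).

    Assumptions (not from the slides): reads are named <sample>_R1.fastq and
    <sample>_R2.fastq, and prinseq-lite writes <prefix>_1.fastq and
    <prefix>_2.fastq for the read pairs that pass trimming.
    """
    q = shlex.quote
    r1, r2 = f"{sample}_R1.fastq", f"{sample}_R2.fastq"
    prefix = f"{sample}.trim"
    t1, t2 = f"{prefix}_1.fastq", f"{prefix}_2.fastq"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf"
    return [
        # 1. Read trimming: clip low-quality 3' ends, drop short reads
        #    (cutoffs 20 and 50 are illustrative values).
        f"prinseq-lite.pl -fastq {q(r1)} -fastq2 {q(r2)} "
        f"-trim_qual_right 20 -min_len 50 -out_good {q(prefix)} -out_bad null",
        # 2. Alignment: map trimmed pairs and write a coordinate-sorted BAM.
        f"bwa mem -t {threads} {q(ref)} {q(t1)} {q(t2)} "
        f"| samtools sort -o {q(bam)} -",
        # 3. Index the BAM so downstream tools can access it by region.
        f"samtools index {q(bam)}",
        # 4. Variant calling: pile up reads and emit variant sites as VCF.
        f"bcftools mpileup -f {q(ref)} {q(bam)} "
        f"| bcftools call -mv -o {q(vcf)}",
    ]


if __name__ == "__main__":
    # "line001" is a placeholder sample ID, not a real SAM line name.
    for command in build_pipeline("line001"):
        print(command)
```

Building the commands as strings first makes it easy to inspect or log them per line before submitting 288 jobs; any real run would need the tool versions and parameters the authors actually used.
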

SAM Re-Sequencing Next Steps
• Next step is to assay genetic variation
  – Structural variation: CNV (Adam’s talk)
  – SNPs
• Use data for genome-wide investigations of genetic variation, association mapping, and evolutionary analyses

Association Genetics Summary
• Mapping panel is very diverse
• LD varies across the genome
• Association testing identifies SNPs and genomic regions as candidates
• Created permanent mapping resource
• Sequenced genome and 288 re-sequenced lines: GREAT resource!

Acknowledgments
Members of the: Burke Lab, Leebens-Mack Lab, Rieseberg Lab
Raj Ayyampalayam, Undergrad Teams, Adam Bewick, Ben Hsieh, John Bowers, Mark Chapman, Laura Marek, Jenny Dechaine, Savithri Nambeesan, Ed McAssey, Steve Knapp, Eleni Bachlava