Mimulus guttatus: a resource for comparative genomics in the asterids

UNC Chapel Hill: Todd Vision et al. (Toby Clarke, Amy Bouck, Eric Ganko*, Andrew Morgan, Stefanie Hartmann, Jason Phillips)
Duke University: John Willis et al.
University of Washington: Toby Bradshaw et al.
University of Montana: Lila Fishman et al.
CUGI: Jeff Tomkins* et al.
JGI: Dan Rokhsar et al.
Stanford*: Jeremy Schmutz, Jane Grimwood
Funded by: NSF FIBR, DOE CSP

Outline
• Background on Mimulus and the asterids
• Status of genomic resources
    ESTs, BACs, physical and genetic maps
    JGI whole-genome shotgun assembly
    Status of annotation and high-density mapping
• Comparative genomics
    Polyploidy and synteny
    Phylogenetic relationship of Solanaceae relative to Mimulus and coffee

The lamiid (euasterid I) clade
[Tree figure; tip labels: Coffea, Solanaceae, Antirrhinum, Mimulus, Striga, Sesamum; node age label: 106-119 Mya]
Bremer et al (2002) Mol Phyl Evol 24, 274
Bremer et al (2004) Syst Bio 53, 496

The genus Mimulus
• ~160 species in the Phrymaceae (ex Scrophulariaceae)
• Multiple species complexes
• Diversity of floral morphologies and mating systems
• Habitat and life history diversity
    from sea level to 10,000 ft
    serpentine and mine tailings
    aquatics, woody shrubs

M. guttatus as a lab organism
• Easy to cultivate & clone
• Small size
• Generation time ~2 mos.
• High seed yield
• Can be easily selfed or crossed
• Small genome (430 Mb)
• High recombination (1 cM ~ 250 kb)
• Rich in SNP/indel polymorphism

Really nice field sites
Iron Mountain, OR

Chronology of an emerging model
Various investigators study:
• the genetics of speciation
• inbreeding depression
• mating system evolution
• ecological adaptation
• cytological evolution
Timeline figure (axis 1940 to 2000); milestones shown:
    Ecological genetic studies begun by Clausen, Keck & Hiesey
    Biosystematic studies begun by Robert Vickery
    First markers & linkage maps
    Molecular phylogeny of genus
    Permanent mapping populations
    ESTs, gene-based markers
    BAC-based physical maps
    Genome sequence
~4K records for “Mimulus + genetics” in Google Scholar

Some ongoing studies in M. guttatus
• Dobzhansky-Muller factors
• Cytoplasmic male sterility
• Meiotic drive
• Copper tolerance & reproductive isolation
• Floral pigmentation
• Inbreeding depression
• Water use efficiency
• Flowering time & life history

Candidate genes for floral diversification in Mimulus
EMS mutants of M. lewisii
[Figure: flowers of M. lewisii wild type and several M. lewisii EMS mutants, shown alongside M. inconspicuus, M. douglasii, M. bifidus, M. parishii and M. cardinalis]
Christina Pince, Bradshaw lab

M. lewisii and M. cardinalis
Evolution of prezygotic isolation
                  M. lewisii    M. cardinalis
Elevation         mid-high      low-mid
Pollinator        bee           hummingbird
Petal color       pink          red
Corolla width     wide          narrow/tubular
Stigma/anther     inserted      exserted
Nectar volume     1-2 µl        40-100 µl
Toby Bradshaw and Doug Schemske

Genome projects in Mimulus
• NSF Frontiers in Integrative Biological Research (2003)
    Integrated ecological and genomic analysis of speciation in Mimulus
    To enable positional cloning of ecologically and evolutionarily interesting genes, particularly for speciation and reproductive isolation
    Physical and genetic maps, markers, transformation protocols
• DOE Joint Genome Institute Community Sequencing Program (2005)
    Whole genome shotgun of M. guttatus IM62
    A number of supporting sequencing projects

Physical maps
• BAC libraries
    M. guttatus: two 11X libraries
    M. lewisii: one 10X library
• HICF contigs
• End-sequences for all BACs
• Overgos (~750)
CUGI & JGI

Expressed sequence tags
• >500K from M. guttatus IM62
    ~ half Sanger, half 454
    Mostly from floral buds and maturing fruits
    Also from roots, seedlings, leaves
• 30K from M. guttatus DUN
• 30K from M. nasutus SF
• 20K from M. lewisii

Other FIBR products
• ~1000 genetic (EPIC) markers
• Multiple RIL and IL populations
• Baseline polymorphism and LD data
• Agrobacterium-based transformation
• Germplasm collection
• Community database
    http://openwetware.org/wiki/Mimulus_Community

JGI whole-genome shotgun
• M. guttatus IM62
    inbred line from Iron Mountain, OR
    ~7X coverage
    Mostly 3 kb & 8 kb paired-end Sanger reads
    >200K ~30K fosmid end-sequences
    >100K BAC end-sequences
• M. guttatus DUN
    inbred line from a coastal population
    ~20X coverage of 454 paired-ends

Assembly statistics
             number    tot lgth    N50      L50
scaffolds     2,216    321.7 Mb     81      1.1 Mb
contigs      17,831    300.7 Mb  1,770     45.5 kb
Scaffolds span ~75% of estimated 430 Mb genome
Gaps comprise ~6.5% of the scaffolds
Number of scaffolds > 50 kb = 512 (which includes 95.7% of assembly)
Jeremy Schmutz

Genetic mapping of scaffolds
• To order and orient the >2K scaffolds on the genetic map
• Procedure
    Genome reduction to ~100K markers using restriction-site-associated DNA (RAD) - Miller et al (2007) Genome Res 17, 240
    Nextgen sequencing of ~100 IM62xDUN RIL lines at ~0.4X coverage each (in pools of 12)
    Infer genotypes in 20-100 kb sliding windows and analyze linkage map
• Model predicts ~50% of scaffolds and >90% of bp in assembly will be mapped
Toby Clarke

Structural annotation
• Currently ~25,000 predictions (earlier assembly)
• Planned JGI pipeline
    Mask for repeats
    Homology to Phytozome proteins, GenBank asterid proteins
    M. guttatus and M. nasutus ESTs, translated M. lewisii ESTs
    GenomeScan, FGENESH integrated with PASA
    Filter out TEs & low-confidence genes
    Annotation planned to be available in February
Theresa Mitros, JGI

Repeat composition
• RepeatModeler/RepeatMasker masks ~60% of the assembly
• Known LTR retrotransposons: 15%
    Most active yet seen is a gypsy with 35 EST hits
• Known centromere-linked repeats: 8%
    Cnt728: highly abundant, heterochromatic, tandem repeat interspersed with LTRs
[Figure: FISH of pachytene chromosomes with probes for DNA (blue), Cnt728 (green) and a gypsy retrotransposon (pink)]
Eric Ganko and Arpiar Saunders

BAC annotations
• 22 fully-sequenced BACs (2.5 Mb)
• 236 genes -> 40-45K genes in genome
Eric Ganko

Mimulus-rice synteny
[Figure: Mimulus-rice synteny]
Andrew Morgan

Microsynteny with Solanaceae (ovate region, Wang et al 2008 Genetics)
May be broken into 3 regions in Mimulus
Within each, overall order and orientation well conserved
2 Mimulus genes with no syntenolog in Solanaceae
9 Solanaceae genes with no syntenolog in Mimulus
Eric Ganko

VISTA conservation plot (20/50 nt windows)
Mimulus-specific genes (blue)
Mimulus genes with Solanaceae syntenologs (orange)
Eric Ganko

Unresolved relationships among lamiid genomes
Solanum (Solanales), Mimulus (Lamiales), Coffee (Gentianales)
All 3 possible relationships for these taxa have been published

Mimulus is an outgroup to the Solanaceae-coffee clade
[Tree: Solanum (Solanales) grouped with Coffee (Gentianales), support value 100; Mimulus (Lamiales) as outgroup]
Maximum likelihood analysis of concatenated alignment for 13 gene families totalling 5,691 amino acids
Amy Bouck and Stefanie Hartmann

Mimulus as a resource for tomato
• Data should be available by mid-2009
• Assist with gene and repeat structural annotation
• Early look at asterid-specific biology
    Shared functional annotation
• Evolutionary analysis
    Potentially improve coverage of syntenic blocks
    Could serve as outgroup to reconstruct Solanaceae ancestor and polarize evolutionary changes between Solanaceae and coffee
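
Supplementary note: checking the assembly arithmetic
The figures on the assembly-statistics and BAC-annotation slides follow from simple arithmetic, which the sketch below reproduces. It is an illustration under stated assumptions, not the JGI pipeline: the function and variable names are made up for this note, the N50-style statistic is the standard textbook definition, and the gene-count estimate is a naive density scaling that may not be how the 40-45K range on the slide was derived. The assembly table appears to report a sequence count under "N50" and a length under "L50", so the helper returns both values instead of choosing one naming convention.

```python
# Back-of-envelope checks of numbers quoted on the slides.
# All names here are hypothetical; only the input figures come from the talk.

GENOME_MB = 430.0        # estimated genome size
SCAFFOLD_MB = 321.7      # total scaffold length
CONTIG_MB = 300.7        # total contig length
BAC_GENES = 236          # genes annotated on the sequenced BACs
BAC_MB = 2.5             # total length of the sequenced BACs


def half_coverage_stats(lengths):
    """Return (count, length) for the longest sequences covering half the total.

    count  = smallest number of longest sequences whose summed length
             reaches at least 50% of the total
    length = length of the shortest sequence in that set
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length


# Fraction of the estimated genome spanned by scaffolds (slide: ~75%).
span_fraction = SCAFFOLD_MB / GENOME_MB

# Fraction of scaffold sequence that is gap (slide: ~6.5%).
gap_fraction = 1 - CONTIG_MB / SCAFFOLD_MB

# Naive scaling of BAC gene density to the whole genome (an assumption).
genes_extrapolated = round(BAC_GENES / BAC_MB * GENOME_MB)

if __name__ == "__main__":
    print(half_coverage_stats([10, 8, 6, 4, 2]))
    print(f"scaffold span: {span_fraction:.1%}")
    print(f"gap fraction: {gap_fraction:.1%}")
    print(f"extrapolated genes: {genes_extrapolated}")
```

The naive density scaling lands near the lower end of the 40-45K range quoted on the BAC annotations slide. BACs are not a random sample of the genome, so this should be read as a rough consistency check only.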