Title: Development of sequence-based community tools for Setaria viridis, a
model genetic system for C4 grasses
1. Description
The goal of this proposal is to build on the existing foundation established at DOE-JGI
(http://www.phytozome.net/foxtailmillet.php) to develop genomics resources for Setaria viridis.
Team members include:
PI: Thomas P. Brutnell, Boyce Thompson Institute
Co-PI: Jeffrey Bennetzen, University of Georgia
Co-PI: Andrew Doust, Oklahoma State University
Co-PI: Katrien Devos, University of Georgia
Co-PI: Elizabeth Kellogg, University of Missouri, St. Louis
Co-PI: Todd Mockler, Oregon State University
Specifically, we have identified three primary goals. Detailed justifications for
each objective are in the Justification section.
1. Sequence 50 diverse accessions of S. viridis and map polymorphisms
relative to the S. italica reference genome.
To date, approximately 160 S. viridis accessions from 15 countries,
including 60 accessions from the US, have been collected by Co-PIs Kellogg and
Devos and corresponding genomes are currently being characterized by Devos
using SSR markers. A subset of 50 inbred lines that captures the most diversity
in the population will be nominated for deep sequencing (20x coverage) to fine
map genetic variation throughout the genome.
Estimated total sequence (500 Mb x 20x coverage x 50 lines = 500 Gb)
2. Sequence a subset of NMU- and fast neutron-mutagenized lines to define
the nature and frequency of mutation in the populations.
To characterize the efficacy of an ongoing mutagenesis program, we
propose deep sequencing (20x coverage) of a subset of the M1 and descendant
M2 progeny to assess the distribution, nature and frequency of mutations across
the genome. This depth allows for a relatively high percentage of
unmappable reads, leaving a minimum effective coverage of 8-10x. Characterization of
M1 plants will reveal somatic mutation rates, whereas germinal transmission rates
will be determined in the M2 progeny. We anticipate that most mutations
identified in the NMU-mutagenized populations will be G-to-A transitions,
as NMU is an alkylating agent. Fast neutron-mutagenized plants are anticipated
to harbor primarily deletion alleles of varying size and distribution throughout the
genome.
Estimated total sequence (12 plants x 500 Mb x 20x coverage x 3 treatments =
360 Gb)
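The expected NMU signature described above can be tallied with a short script. This is a minimal sketch, not the project's analysis pipeline: the function names and the example variant calls are hypothetical, and it assumes that a G-to-A change on one strand is reported as C-to-T when the call is made on the opposite strand, so both are counted.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Transitions stay within a class (purine<->purine or pyrimidine<->pyrimidine).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def is_gc_to_at(ref, alt):
    """True for G>A, and for C>T (the same change seen on the other strand)."""
    return (ref.upper(), alt.upper()) in {("G", "A"), ("C", "T")}


# Hypothetical (ref, alt) calls, for illustration only; not real data.
calls = [("G", "A"), ("C", "T"), ("A", "C"), ("G", "A"), ("T", "C")]

n_gc_to_at = sum(is_gc_to_at(ref, alt) for ref, alt in calls)
fraction_gc_to_at = n_gc_to_at / len(calls)
print(f"G:C to A:T transitions: {n_gc_to_at} of {len(calls)}")
```

A high fraction of G:C to A:T changes in M1 or M2 calls would be consistent with the alkylation signature anticipated in the text; deletions from fast-neutron treatment would need a separate structural-variant analysis.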
3. Sequence a S. italica x S. viridis segregating population to map all
breakpoints within the population and determine parental polymorphisms.
To accelerate the discovery and characterization of genes underlying
major QTL, we propose sequencing 500 members of a founder F2 population
that is being self-pollinated to create new recombinant inbred populations. Each
of the 500 individuals will be sequenced to 0.5x coverage to identify genetic
breakpoints and to generate a high-density SNP map for both parental genomes.
That is, although each line will be sequenced to a relatively low depth, each
parental genome will be sequenced, pooled across the population, to a depth of 125x coverage. At this
sequencing depth, nearly all small indels and SNPs will be determined for the
two parental genotypes and will then be imputed onto the genetic map once
breakpoints are determined. Thus, through this sequencing effort, we will
provide the foundation for fine-mapping studies.
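The 125x figure follows from pooling reads across the population. The sketch below reproduces that arithmetic under two stated assumptions that are not spelled out in the text: the two parents contribute equal shares of the F2 alleles on average, and unmappable reads are ignored. The function name is hypothetical.

```python
def pooled_parental_depth(n_individuals, coverage_per_individual, n_parents=2):
    """Expected pooled depth per parental genome across a segregating population.

    Assumes equal average contribution from each parent and no read loss.
    """
    pooled_coverage = n_individuals * coverage_per_individual
    return pooled_coverage / n_parents


depth = pooled_parental_depth(n_individuals=500, coverage_per_individual=0.5)
print(f"Expected depth per parental genome: {depth:.0f}x")  # prints 125x
```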
Estimated total sequence (500 plants x 500 Mb x 0.5x coverage = 125 Gb)
Total Sequencing Requested= 985 Gb
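The three estimates and the requested total can be checked with a few lines of arithmetic. This is only a sketch of the calculation stated above: the sample counts, genome size, and coverage values come from the text, while the variable and function names are illustrative.

```python
GENOME_MB = 500  # approximate genome size used in the estimates above, in Mb

# (label, number of samples, coverage per sample)
AIMS = [
    ("1. Diverse accessions", 50, 20.0),
    ("2. Mutagenized lines", 12 * 3, 20.0),  # 12 plants x 3 treatments
    ("3. F2 population", 500, 0.5),
]


def aim_total_gb(samples, coverage, genome_mb=GENOME_MB):
    """Total sequence for one aim in Gb, taking 1 Gb = 1000 Mb."""
    return samples * genome_mb * coverage / 1000


totals = {label: aim_total_gb(n, cov) for label, n, cov in AIMS}
grand_total_gb = sum(totals.values())

for label, gb in totals.items():
    print(f"{label}: {gb:.0f} Gb")
print(f"Total requested: {grand_total_gb:.0f} Gb")
```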
2. Justification
S. viridis is a small-stature, rapid-cycling grass species that is closely related to
some of the most promising bioenergy feedstock grasses including the sister
taxon switchgrass (Panicum virgatum) and other closely related panicoids such as
Miscanthus (Miscanthus giganteus), sorghum (Sorghum bicolor) and sugarcane
(Saccharum officinarum). It is a rapidly emerging genetic model system to study
C4 photosynthesis (Brutnell et al. (2010) Plant Cell 22:2537-2544), abiotic and
biotic stress response (Li and Brutnell (2011) J Exp Bot, in press) and
domestication (Doust et al. (2009) Plant Phys 149:137-141) – essential areas of
research for the development of low-input, high-yielding bioenergy feedstocks
that will be grown widely throughout the US. DOE-JGI has recently produced
~8.3X coverage of the ~500 Mb genome of the closely related food and feed
crop, Setaria italica, and efforts are currently underway to sequence the S. viridis
genome. We have identified three specific goals with the following justifications.
1. Sequence 50 diverse accessions of S. viridis
Co-PIs Kellogg and Devos are currently collecting S. viridis accessions
throughout the US and the world to study the global diversity of S. viridis, and the
history, origin, and population structure of S. viridis introductions into North
America. They will also examine the effects of selection for herbicide resistance
on genome evolution. As S. viridis is generally considered one of the most
adaptable plant species in the world, these lines will also serve as a foundation
for association mapping studies, as a source of rare allelic variants and as
founders for the development of additional recombinant inbred populations.
2. Sequence a subset of NMU- and fast neutron-mutagenized lines.
Brutnell has recently conducted a large-scale NMU-mutagenesis (3000
M1 plants propagated) and is in the process of conducting a fast-neutron
mutagenesis (3000 M1 seed mutagenized) of S. viridis seeds. The Bennetzen
lab has performed a fast-neutron mutagenesis of S. italica seeds. The three
populations will serve as foundations for community forward and reverse genetic
screens. Thus, sequence-based characterization of these populations will
provide great insight into their utility as a resource for genetic screens and inform
future mutagenesis programs.
3. Sequence a S. italica x S. viridis segregating population to map
breakpoints
Co-PIs Bennetzen and Devos have created a number of crosses between
Yugu1 (the foxtail millet accession sequenced by JGI) and selected S. viridis
accessions. We propose sequencing F2 individuals from one of these founder
populations to define genetic breakpoints, determine
heterozygosity/homozygosity and map all parental SNPs and small indels. This
will create a large mapping population that will be used to extend previous QTL
analyses by co-PIs Doust and Kellogg on a small F3 segregating population from
a cross between the wild (A10, S. viridis) and cultivated (B100, S. italica)
accessions developed by co-PI Devos (e.g. Doust et al. (2004) Proc Natl Acad
Sci 101:9045-9050, Wang et al. (1998) Theor Appl Genet 96:31-36). A small F7
RIL population from this cross was also developed by Panaud and colleagues,
and co-PI Devos has generated a high-density SNP map (manuscript in prep) for
the RILs that has been used to detect QTL for plant architecture, biomass
accumulation, and flowering time (co-PI Doust unpublished). Comparative
genomic analysis has suggested candidate genes that may determine several of
these traits, yet further analysis requires large mapping populations, such as the
one proposed for sequencing here.
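To illustrate how breakpoints might be located from low-coverage F2 reads, the sketch below calls a genotype in each genomic window from counts of reads carrying each parental allele, then reports where the call changes along a chromosome. It is an illustrative simplification, not the method the proposal commits to: the window counts, the thresholds, and the labels A (for example, the S. italica parent) and B (the S. viridis parent) are all assumptions.

```python
def call_window(a_reads, b_reads, min_reads=4, het_low=0.2, het_high=0.8):
    """Call AA, AB, or BB for one window from parental-allele read counts.

    Returns None when the window has too few informative reads to call.
    The thresholds are arbitrary illustrative values.
    """
    total = a_reads + b_reads
    if total < min_reads:
        return None
    frac_a = a_reads / total
    if frac_a >= het_high:
        return "AA"
    if frac_a <= het_low:
        return "BB"
    return "AB"


def find_breakpoints(window_counts):
    """Return (left_window, right_window, left_call, right_call) wherever the
    genotype call changes between consecutive called windows."""
    breakpoints = []
    prev_idx, prev_call = None, None
    for idx, (a_reads, b_reads) in enumerate(window_counts):
        call = call_window(a_reads, b_reads)
        if call is None:
            continue  # skip windows without enough informative reads
        if prev_call is not None and call != prev_call:
            breakpoints.append((prev_idx, idx, prev_call, call))
        prev_idx, prev_call = idx, call
    return breakpoints


# Hypothetical (A-allele reads, B-allele reads) per window along one chromosome.
counts = [(9, 0), (8, 1), (1, 0), (5, 4), (4, 6), (0, 7), (0, 9)]
breakpoints = find_breakpoints(counts)
print(breakpoints)
```

At 0.5x per individual, a real analysis would need windows wide enough to collect several informative reads and a statistical model (for example, a hidden Markov model) to smooth over miscalled windows; the fixed thresholds here only convey the idea.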
3. Utilization
The sequence data generated will serve multiple purposes: 1)
Provide a high-density SNP map across 50 diverse accessions of S. viridis. This
will guide the development of recombinant inbred populations and serve as a
foundation for association mapping experiments. 2) Enable highly accurate
determination of mutation density in chemical and fast-neutron mutagenesis
populations of Setaria. These data will then be used to guide future mutagenesis
programs and to establish reverse and forward genetic screening protocols to
mine for specific allelic variants. 3) Define recombination breakpoints and the
majority of parental polymorphisms in an F2 population. These data will be used
to create one of the most highly resolved mapping populations yet developed for
any plant species. Thus, this sequencing effort will have applications for plant
breeding, genetics and genomics and greatly facilitate the development of
community tools for S. viridis.
4. Community interest
The sequence data generated will be used widely by a growing community of
Setaria researchers. This includes a major international research community that
is using Setaria as a model to understand C4 photosynthesis, industry and
academic scientists using Setaria as a model for C4 bioenergy grasses that
include sugarcane, switchgrass, sorghum and Miscanthus, and academic and
industry scientists who currently work with Zea mays (maize) and are looking for
a closely related and readily transformable model (Brutnell et al. (2010) Plant Cell
22:2537-2544; Doust et al. (2009) Plant Phys 149:137-141; Li and Brutnell (2011) J Exp Bot, in press).
5. DOE mission
This sequencing effort will have the most immediate effect on DOE’s
mission of alternative energy production. We are developing tools for S. viridis
specifically with the bioenergy feedstock community in mind. S. viridis is a
member of the same tribe as switchgrass (Paniceae) and is sister to Miscanthus,
sugarcane, maize and sorghum. Although Brachypodium distachyon has been
advanced as a model for understanding the grass cell wall, as a C3 grass, it has
limited utility in understanding photosynthetic limitations of the bioenergy
grasses, which nearly exclusively utilize C4 photosynthesis. Thus, S. viridis has
great potential as the model system to examine carbon-nitrogen balance,
biomass accumulation, biotic and abiotic stress tolerance and water use
efficiency in C4 grasses. This is particularly relevant for Miscanthus and
switchgrass, which are recalcitrant to genetic analysis due to their large size,
polyploidy, long generation times, sterility (Miscanthus) and self-incompatibility
(switchgrass).
6. Sample preparation
Brutnell and Mockler have extensive experience in Illumina library
construction (e.g. Filichkin et al. (2010) Genome Res 20:45-58; Li et al. (2010)
Nat Genet 42:1060-1067) and will construct libraries for this project. In particular,
Brutnell proposes generating an indexed library for sequencing the F2 progeny
on the Illumina HiSeq2000 platform and generating the libraries for sequencing
the NMU- and fast neutron-mutagenized populations. Mockler proposes generating
Illumina libraries for the 50 diverse accessions of S. viridis. Both the Mockler and
Brutnell labs will also assist in data analysis. Mockler has developed automated
informatics pipelines for sequence variant detection from Illumina genomic
resequencing datasets. These pipelines have been used in the JGI Brachypodium
distachyon re-sequencing project (PI John Vogel; manuscript in preparation) and
the snow leopard genome project (Mockler, Irizarry et al., manuscript in
preparation). The Mockler and Brutnell labs are happy to collaborate with JGI on
bioinformatics efforts associated with this proposed project.