iPG2P Steering Committee Meeting, January 13-14, 2010

Use Case 1: Interrelationship between carbon metabolism and flowering time.

There is strong evidence that carbon metabolism affects flowering time in Arabidopsis thaliana. Elevations of CO2 can either advance or retard flowering, depending on other factors such as photoperiod. Genes involved in the sequestration of starch and its subsequent mobilization are also known to affect flowering time. Furthermore, there appears to be a pronounced light quantity effect on flowering time in A. thaliana.

An interesting modeling use case would be to attempt to link a C3 photosynthesis model to a network model of flowering time. In addition to relevant literature data, there is also an unpublished RIL set available that combines a common laboratory strain with a parent selected for enhanced fitness under high CO2. (Of course, data from such a RIL set would have to be collected using other funds.)

Action item: Further develop the specifics of this use case.

Use Case 2: Hypothesis-Generation through Data-Mining, Processing, and Visualization

Note: Comparative genomics has been invoked only sparingly here. A plant biologist (PB) would like input on other ways to use sequence comparisons to obtain additional information about genes that confer drought tolerance in the experiment described below.

A plant biologist has imposed drought stress on a series of accessions of a crop plant species. The accessions were chosen on the basis of previous reports from farmers concerning their ability to tolerate drought, and were selected to span a wide range of tolerances. The species itself has not been fully sequenced, although EST data are available, but there is a complete genomic sequence for a close relative. The stress experiment was carried out at several field locations, using rainout shelters to impose drought stress on half the plant population. PB collected physiological (“phenotypic”) data over the three months of the experiment. These included measurements of photosynthesis, relative water content, growth rates, biomass and biomass distribution, and time to flowering. At four time points during the experiment, PB collected leaf and root material for transcriptomic, epi-transcriptomic, lipidomic, and metabolomic profiling.

Detailed meteorological and soil data, including carbon and nitrogen levels, were collected for each site over the course of the experiment.

The objective of the experiment was to identify possible causes/mechanisms underlying differential drought tolerance among the accessions that were included. A special interest was to pinpoint the effects of drought and location on mechanisms related to photosynthesis and to phenological events and, in turn, the influences of the various field environments on the behavior of the stressed and unstressed plants that grew to maturity in each location. Statistical analyses will be run to identify important factors influencing yield under the chosen experimental conditions. Would modeling be possible? Would enough data be collected? (SG: Good questions; in my opinion modeling will probably help, but more data collected would also be good.)
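As one possible starting point for the statistical analyses mentioned above, the drought response of each accession at each location could be expressed as the ratio of the stressed mean to the unstressed mean of a trait. The following is a minimal Python sketch under that assumption. The record layout, the treatment labels, and all values are hypothetical and are not data from the experiment.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (accession, location, treatment, biomass in g per plant).
records = [
    ("acc_A", "site_1", "control", 100.0),
    ("acc_A", "site_1", "control", 120.0),
    ("acc_A", "site_1", "drought", 55.0),
    ("acc_A", "site_1", "drought", 55.0),
    ("acc_B", "site_1", "control", 80.0),
    ("acc_B", "site_1", "drought", 72.0),
]

def drought_response(rows):
    """Return {(accession, location): stressed mean / unstressed mean}."""
    groups = defaultdict(lambda: {"control": [], "drought": []})
    for accession, location, treatment, value in rows:
        groups[(accession, location)][treatment].append(value)
    ratios = {}
    for key, by_treatment in groups.items():
        # Skip combinations that lack either treatment; no ratio can be formed.
        if by_treatment["control"] and by_treatment["drought"]:
            ratios[key] = mean(by_treatment["drought"]) / mean(by_treatment["control"])
    return ratios

response = drought_response(records)
for (accession, location), ratio in sorted(response.items()):
    print(f"{accession} at {location}: {ratio:.2f}")
```

A ratio near 1 would indicate little effect of the imposed drought on that trait; lower values indicate a stronger effect. Whether a simple ratio is the right response measure is itself an open choice.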

All data have been carefully analyzed and stored in a Web-accessible form. Armed with these data, and with the meteorological and soil data for each field site, PB wants to use the Web services and compilation power built by the iPlant community to identify:

1. Homologs of the genes that responded in the experiment, either in the closely related, fully sequenced species (e.g. via specialized databases for cereals, legumes, the Solanaceae, Populus spp., Phytozome, and more) or in other fully sequenced species that are further afield. E.g. Eric Lyons’ tool.

2. The metabolic/signaling pathways in which these homologs operate.

Interactome (clues about signaling pathways), Reactome (identification of reactions catalyzed, metabolic pathways). MapMan to visualize metabolic pathways, after Reactome has identified those pathways that are relevant to the problem.

3. Changes in metabolites that “fit” in those metabolic/signaling pathways. For metabolic pathways in Arabidopsis, this is possible at AraCyc; for others, at PlantCyc. There is no place for signaling pathways yet, although the approach taken by Interactome could give clues about any plant species for which sequence data are available. Also, new design tool suggested by Nick.

4. Responsive genes associated with lipid signaling or biosynthetic pathways.

5. Changes in lipids that correspond to gene expression data. (Same as Item 3.) A lipidomics database was promised some time ago, but I have not found it yet on the Web (10/06/09). It may be necessary to construct such a database.

6. Small RNAs that target the homologs identified in 1. (Do small RNAs “transfer” across species? I’m not sure of this.) There is an Arabidopsis small RNA database.

7. Publications where Items 3, 5, or 6 are reported, and the experimental conditions under which the change occurred. (Reactome provides links to the literature; I don’t know how often it is updated, though. Nick’s eFB tool provides such links also.)

8. Correlations between recorded local soil and climatic conditions and the responses of the various accessions to drought imposition. (This is Jeff White’s and Steve W’s province.) (SMW: These correlations would not be made directly, but rather with relevant state variables in ecophysiological models driven by the soil and climate data.)

Information obtained in 1-8 above will be compiled as PB proceeds through Steps 1-5.
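Item 8, as qualified by SMW, could be sketched as a correlation between a state variable produced by an ecophysiological model and the response of an accession across sites. The Python sketch below computes a plain Pearson coefficient; the choice of state variable (simulated soil water deficit), the variable names, and all values are hypothetical, and the ecophysiological model itself is not shown.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

# Hypothetical per-site values: a model state variable (here, a simulated
# soil water deficit) and the drought response of one accession at each site.
simulated_water_deficit = [0.1, 0.3, 0.5, 0.7]
accession_response = [0.95, 0.80, 0.60, 0.45]

r = pearson(simulated_water_deficit, accession_response)
print(f"r = {r:.3f}")
```

With only a handful of field sites, any such coefficient would rest on very few points, so it is better read as a prompt for hypotheses than as evidence.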

Examples of data mining, compilation, and visualization tasks:

1. Use comparative genomic databases to identify homologs of genes of interest in a fully sequenced genome (“reference genome”).

2. The results of Eric’s tool are linked to sites showing relevant metabolic and/or signaling pathways. Identify roles of genes of interest in these pathways.

3. Paint data from metabolite profiling database onto pathways identified in 2. (My note to Nick: The big question remains as to how much metabolomics data we could access online, to start to build more refined metabolic pathways, or, really, instances of how those pathways respond to specific conditions. But, from what Nick sent on Sept. 17th, we could make a start much sooner with uploading an individual's metabolic and gene expression data. I have some data of this kind from a drought stress study on potato. The data have already been published, but not in the form that we are discussing here, something entirely non-graphical.)

Repeat this process, using lipidomics data.

4. Use specialized databases dedicated to the “reference genome” to identify small RNAs that target homologs of the genes of interest.

5. Devise a tool to exhibit, at will, any aspect of 1-5, in an easily storable form for the user.

6. Relate physiological data to 1-5, wherever possible. Provide a link to historical data about each accession used, and its lineage and provenance, where known. Does provenance matter? Devise a tool for visualizing and storing the result, in each case. For example, a result might be a correlation between source-sink partitioning and the degree of inhibition of photosynthesis by drought in 80% of the accessions, in all but one of the field locations.

The knowledge and intelligence of PB are then harnessed to devise one or more hypotheses that suggest mechanistic bases for the range in tolerances already known for the collection of accessions of the crop plant.

The PB then goes back to the lab or field and tests the hypothesis, generates more data, refines the hypothesis and repeats the testing in additional iterations.
