b. Studies and Results

advertisement
Principal Investigator/Program Director (Last, first, middle):
Gerstein, Mark B
b. Studies and Results
* Genomic surveys of membrane proteins and membrane protein motifs: Liu et al., GenomeBiology (2002)
and Zhang et al., JMB (2002). [Related to Aim 1]
During this year, we completed two surveys of membrane proteins and membrane protein motifs. In the first,
we grouped membrane proteins into families and looked at their relative abundance in a number of different
genomes. We also looked at the abundance of a number of different motifs -- in particular, GXXXG. In the
second paper, we extended our motif work further, looking at the occurrence of protein motifs, not only in
known proteins but also in the intergenic regions of genome. We called the motifs found "pseudomotifs" (in the
spirit of pseudogenes). We compared a number of eukaryotic genomes in terms of their occurrence of these
motifs.
* Helix Packing Calculations: Tsai & Gerstein, Bioinformatics (2002) [Related to Aim 2]
This year we continued on the work on 3D helix packing calculations. We performed a detailed sensitivity
analysis of our calculations and created a database relevant parameters, which is available through the web.
This sensitivity analysis was invaluable in helping us understand which parameters were most useful in our
packing calculations.
* Helix Interaction Motifs: Schneider et al., FEBS (2002) [Related to Aim 2]
We did a composition analysis of membrane proteins, focusing on the implications of composition for helixhelix interactions.
* Development of Integrative Database Systems: Lin et al., NAR (2002) and Mateos et al., Genome Research
(2002) [Related to Aim 3]
This year we set up a new integrated resource, GeneCensus.org, which followed on from last year's system
PartsList.org. GeneCensus takes a more sequence and less structural view of genome comparisons focusing on
expression data, pathway activities, and protein interactions. It has an extensive section devoted to the
occurrence of transmembrane helix interaction motifs in different genomes. In a second paper, we collaborated
with IBM researchers on developing a neural network system for integrating many different types of genomic
information. This was used in relation to predicting various aspects of protein function and protein class.
c. Significance
* Our survey of membrane proteins highlights the importance and overrepresentation of specific motifs .
* Our various parameter sets for packing calculations and detailed sensitivity analysis will be generally useful
for helix packing calculations.
* Our GeneCensus website represents new ways of organizing "poly-genomic" information .
PHS 398/2590 (Rev. 05/01)
Page _______
Continuation Format Page
Principal Investigator/Program Director (Last, first, middle):
Gerstein, Mark B
d. Plans
Next year, we plan to continue with the project as outlined in the original proposal. In particular:
* Ursula Lehnert will be continuing to work on the project and, as usual, will continue on with the packing
calculation in collaboration with Neil Voss, Julian Graham, and Nat Echols (individuals not funded by the
project). We believe we should have a significant amount of membrane helix packing and dynamics
calculations done.
* We plan to continue on database integration. We hope to combine the PartsList and GeneCensus systems
together with a smart-linking technology.
* We plan to continue on with our inventory membrane proteins, looking in more detail at the occurrence of
particular domain combinations.
* Finally, we plan to continue surveying the occurrence of membrane to membrane proteins and membrane
associated proteins in intergenic regions, looking at the fossil history of these proteins. In particular, we are
keen to survey the pseudogenes associated with cytochrome c.
e. Publications
Calculations of protein volumes: sensitivity analysis and parameter database.
J Tsai, M Gerstein (2002) Bioinformatics 18: 985-95.
Thermostability of membrane protein helix-helix interaction elucidated by statistical analysis.
D Schneider, Y Liu, M Gerstein, DM Engelman (2002) FEBS Lett 532: 231-6.
Genomic analysis of membrane protein families: abundance and conserved motifs.
Y Liu, DM Engelman, M Gerstein (2002) Genome Biol 3: research0054.
Digging deep for ancient relics: a survey of protein motifs in the intergenic sequences of four
eukaryotic genomes.
ZL Zhang, PM Harrison, M Gerstein (2002) J Mol Biol 323: 811-22.
GeneCensus: genome comparisons in terms of metabolic pathway activity and protein family
sharing.
J Lin, J Qian, D Greenbaum, P Bertone, R Das, N Echols, A Senes, B Stenger, M Gerstein (2002) Nucleic Acids
Res 30: 4574-82.
Systematic learning of gene functional classes from DNA array expression data by using
multilayer perceptrons.
A Mateos, J Dopazo, R Jansen, Y Tu, M Gerstein, G Stolovitzky (2002) Genome Res 12: 1703-15.
PHS 398/2590 (Rev. 05/01)
Page _______
Continuation Format Page
Principal Investigator/Program Director (Last, first, middle):
Gerstein, Mark B
f. Project-generated Resources
This past year, we have created a number of new project web resources for this grant, in particular
GeneCensus.org. This is added to http://bioinfo.mbb.yale.edu and http://www.partslist.org. These sites make
available:
- all of our transmembrane helix identifications and surveys of helix interaction motifs
- the expression level of membrane proteins with known structure
- the parameters from a our packing calculations
We also have a page devoted exclusively to this grant: http://thorin.csb.yale.edu/papers/grant/mem .
PHS 398/2590 (Rev. 05/01)
Page _______
Continuation Format Page
Principal Investigator/Program Director (Last, first, middle):
PHS 398/2590 (Rev. 05/01)
Page _______
Gerstein, Mark B
Continuation Format Page
Download