
PROJECT SUMMARY
TRPGR:
Improving the Annotations of Plant Proteins and ESTs via
Protein Domain Discovery and Phylogenomics
Shin-Han Shiu, Dept. of Plant Biology, Michigan State University
The overall goal of this project is to improve annotations of plant protein sequences and ESTs by: (1)
identifying additional conserved protein domains in these sequences and (2) transferring functional
information from genes of model species such as Arabidopsis thaliana to plant protein and EST
sequences. Currently, conserved regions representing domains in plant proteins are not well studied. As a
result, known protein domains in various databases can only provide a fragmentary description of plant
protein space. In addition, knowledge of gene functions is concentrated in model species. With the
increasing number of plant genomes and over 9 million plant ESTs available, the challenge of determining
the functions of these plant sequences can be met by first generating hypotheses about gene functions based
on phylogenomic approaches, that is, inference of functions based on evolutionary relationships between
sequences. The proposed studies have four major aims: (1) identify protein domains based on
conservation among plant sequences, (2) classify plant proteins and ESTs into domain families, (3) infer
plant protein and EST functions with phylogenomics, and (4) construct the Domain Database of Plant
Proteins (DoPP) for broader dissemination of research data.
Because we will identify a comprehensive set of plant protein domains allowing description of
regions that are not adequately covered by current domain databases, our proposed research will lead to a
significant improvement in the annotation of plant sequences. In addition, we will transfer functional
information from model eukaryotes to genes from plant species important for agricultural, ecological,
and/or evolutionary applications. The results of the proposed studies will also be of great value for
advancing the fields of comparative and evolutionary genomics. In particular, having a dataset of plant
protein domains will allow us to better investigate the rates of domain sequence evolution and the
frequency of domain shuffling. The functional annotations will also provide a dataset for assessing
functional conservation and divergence among plant duplicate genes.
The creation of the DoPP database will greatly benefit the plant research community by providing
domain and functional annotation information for their sequences of interest. Moreover, the project will
foster an interdisciplinary training environment for students with different backgrounds. High school,
undergraduate, and graduate students and postdoctoral researchers with backgrounds in biological science,
mathematics, statistics, and computer science will be recruited to work on sub-projects of varying
complexity. Their interactions will not only give high school and undergraduate students a realistic view
of how science is done but also allow graduate students and postdoctoral researchers to learn how to mentor.
To broaden understanding of science and technology beyond the campus, the PI has
formed a partnership with the East Lansing Public Library to develop outreach activities aimed at
enhancing the general public’s understanding of science, evolution, and genomics, using the proposed
research project as an example.