PRO-&-cROP-Wu - Buffalo Ontology Site


cROP Plant Ontologies & Protein Ontology (PRO)

PRO-PO-GO Meeting

Amherst, NY

May 16, 2013

Cathy H. Wu, Ph.D.


PRO Communities

• Ontology Developers

• GO ontology: Interfaces of GO/PRO complexes; GO definition (e.g., GO:0005109)

• GO annotation: precise annotation of protein forms in PomBase

• Cell Ontology: Define cell types based on protein types

• Annotation Ontology for annotating scientific documents on the web

• Brucellosis Ontology (IDOBRU), extension of the Infectious Disease Ontology (IDO)

• Semantic Resources

• Semantic Web Applications in Neuromedicine (SWAN); Neuroscience Information Framework (NIF)

• Pathway/Process-Modeling Resources

• Reactome, MouseCyc, EcoCyc, MaizeCyc

• Chemical/Proteomic Resources: PubChem, IUPHAR, P3DB, Top-Down Proteomics, PDB

• Pharma/Clinical Communities: Drug Discovery & Disease Biomarker

• Alzforum

• Salivaomics KB/SALO (Saliva Ontology): Saliva Biomarkers

• Clinical flow cytometry, immunology (ImmPort) community


Biological Questions

• List all the genes differentially expressed in the leaves of rice varieties IRBB5 and IR24 at the 5-leaf visible growth stage, when plants grown in a growth chamber were infected with Xanthomonas oryzae pv. oryzae. IRBB5 is resistant and IR24 is susceptible to rice bacterial blight disease.

• Filter the differentially expressed gene set for those with

– LRR-domains

– Transmembrane domains (e.g., more than one)

– Receptor like kinase function

– Plasma membrane cellular location

– OR those having Tryptophan decarboxylase function

– Tryptophan metabolism

– Have known alleles and homologs with disease resistance phenotype
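The filter above can be sketched as a predicate over annotated gene records. This is a minimal illustration, not a described implementation: the `Gene` record, its field names, and the label strings are hypothetical, and the grouping of the criteria (receptor-like criteria combined with AND, tryptophan criteria as the OR alternative, resistance evidence required in both cases) is one assumed reading of the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    """Hypothetical annotation record for one differentially expressed gene."""
    gene_id: str
    domains: set = field(default_factory=set)      # protein domain labels
    tm_domain_count: int = 0                       # number of transmembrane domains
    functions: set = field(default_factory=set)    # molecular function labels
    processes: set = field(default_factory=set)    # biological process labels
    locations: set = field(default_factory=set)    # cellular component labels
    resistance_evidence: bool = False              # known alleles/homologs with a resistance phenotype


def is_receptor_like(gene):
    """LRR domain, more than one TM domain, RLK function, plasma membrane."""
    return (
        "LRR" in gene.domains
        and gene.tm_domain_count > 1
        and "receptor-like kinase" in gene.functions
        and "plasma membrane" in gene.locations
    )


def is_tryptophan_related(gene):
    """Tryptophan decarboxylase function or tryptophan metabolism process."""
    return (
        "tryptophan decarboxylase" in gene.functions
        or "tryptophan metabolism" in gene.processes
    )


def filter_candidates(genes):
    """Keep genes matching either criteria group that also have resistance evidence."""
    return [
        g for g in genes
        if (is_receptor_like(g) or is_tryptophan_related(g)) and g.resistance_evidence
    ]
```

Each criterion maps to one ontology-backed annotation field, which is why the question requires several ontologies to answer.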


Annotation: Ontology Requirements

Object: XX (object type)

Feature type: feature or ontology

• Molecular function: GO

• Biological process: GO

• Cellular component: GO

• Plant structure: PO

• Plant growth stage: PO

• (Bio)chemical: ChEBI

• Disease: DO

• Protein domains: PRO, InterPro

• Pathway: Pathway??

• Trait: TO

• Attribute and score: PATO

• Context: any of the ontologies, including the environment ontology, for adding context to the annotation.

E.g., PEP carboxylase activity (GO-MF) in maize is required for C4 carbon assimilation (GO-BP). The process occurs in the plastid (GO-CC) of the leaf mesophyll cell (PO).
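The maize example can be written down as a single annotation record whose terms are tagged by source ontology. This is only a sketch: the `Term` type, the record layout, and the object label are hypothetical, and term labels are used instead of identifiers because the slide gives none.

```python
from typing import NamedTuple


class Term(NamedTuple):
    ontology: str  # source ontology tag, e.g. "GO-MF", "GO-BP", "GO-CC", "PO"
    label: str     # human-readable term label


# One annotation combining terms from several ontologies (labels from the slide).
annotation = {
    "object": "maize PEP carboxylase",  # hypothetical object label
    "terms": [
        Term("GO-MF", "PEP carboxylase activity"),
        Term("GO-BP", "C4 carbon assimilation"),
        Term("GO-CC", "plastid"),
        Term("PO", "leaf mesophyll cell"),
    ],
}


def terms_by_ontology(record):
    """Group the term labels of an annotation record by ontology tag."""
    grouped = {}
    for term in record["terms"]:
        grouped.setdefault(term.ontology, []).append(term.label)
    return grouped
```

Grouping by ontology makes it easy to check that an annotation carries the context terms (here the PO cell type) alongside the GO terms.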


Disease Ontology Example

Building genotype-phenotype associations

[Figure: diagram relating Gene:XA21, Allele-A, Allele-B, and an Oryza genotype (relation: belongs_to) to GO: receptor-like kinase and GO: response to pathogen.]


PRO Workflow

• Data Sources

– Manual annotation (curator, collaborator, user): SourceForge tracker; RACE-PRO

– Semi-automated processing of external databases (e.g., UniProtKB, Reactome, MouseCyc, EcoCyc); coverage of 12 reference genomes in progress

• Integration with text mining: RLIMS-P/eFIP (Phosphorylation and Functional Impact)

RACE-PRO Annotation Interface: capture knowledge of protein forms/complexes of interest to support integrated analysis


PRO representation of the spindle checkpoint

PRO search query to retrieve PRO terms that contain the phrases “spindle checkpoint”, “spindle assembly checkpoint”, or “mitotic checkpoint”, and the combined Cytoscape Web view of the search results. Nodes retrieved by the search are blue; related nodes (parents and children) are gray.

Use of the protein ontology for multi-faceted analysis of biological processes: a case study of the spindle checkpoint. Ross et al. (2013) Front Genet. 4:62. [PMID: 23637705]
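The phrase search described on this slide amounts to an OR over three phrases. The sketch below shows that logic over an in-memory mapping of term text; it does not use or represent the actual PRO search interface, and the function names and the shape of the `terms` mapping are assumptions.

```python
# The three phrases named on the slide.
PHRASES = (
    "spindle checkpoint",
    "spindle assembly checkpoint",
    "mitotic checkpoint",
)


def matches_any_phrase(text, phrases=PHRASES):
    """Return True if the text contains at least one phrase (case-insensitive)."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in phrases)


def search_terms(terms, phrases=PHRASES):
    """Return the sorted ids of terms whose text matches any phrase.

    `terms` is a hypothetical mapping from term id to free text
    (for example, name plus definition).
    """
    return sorted(
        term_id for term_id, text in terms.items()
        if matches_any_phrase(text, phrases)
    )
```

In the view described above, the ids returned here would correspond to the blue nodes; parents and children would be added afterwards as gray nodes.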

Phosphorylated forms of BUB1B in PRO

[PMID: 23637705]

Four species-independent BUB1B phosphorylated forms (blue nodes). Display options set to show parents and all children, including organism-level terms.

Sequence alignment of human, frog, and mouse BUB1B highlighted to indicate experimentally determined phosphorylation sites (blue) and predicted phosphorylation sites (red).


PRO in iPTMnet Framework

PTM network of enzyme-substrate relationships and protein-protein interactions => iPTMnet with rich relations

Data Mining: iProClass database for molecular and omics data integration

Text Mining: RLIMS-P/eFIP system for knowledge extraction from literature

Ontology: PRO for knowledge representation of PTM forms

Web portal linking data and analysis/visualization tools for scientific queries (http://proteininformationresource.org/iPTMnet)


PTM Enzyme-Substrate Database

• Literature-curated kinase-substrate data

– PhosphoSitePlus, Phospho.ELM, HPRD

– PhosphoGRID

– P3DB, PhosPhAt

– UniProtKB, PRO

• Database content

– Substrates: 28,000; P-Sites: 126,000; Kinases: 700

– Substrate/site-kinase pairs: 13,000

– Covering: human, mouse, rat, other vertebrates, Drosophila, C. elegans, yeast, and plants

– Curated phosphorylation papers: 10,000

• Full-scale processing of PubMed abstracts: 22 million

– Phosphorylation papers identified by RLIMS-P: 143,000

– Phosphorylation-PPI related papers identified by eFIP: 10,000


iPTM Network: Exploring Relations

• Substrate-centric:

What PTM forms of a protein and their modifying enzymes are known?

• Enzyme-centric:

What substrates are known for a given PTM enzyme?

• Interaction:

What interacting partners are known for each PTM form of a given protein?

• Pathway:

What modifications and enzymes are known in a given signaling pathway?

Coupled with functional annotation and biological context (homology, disease, tissue/cell, ...)

=> Hypothesis generation and discovery
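The substrate-centric, enzyme-centric, and interaction questions above can be sketched as lookups over two edge lists. This is an illustrative data model, not the iPTMnet schema: the tuple layouts and function names are assumptions, and the sample edges are limited to the BUB1B/Phos:2 relations stated elsewhere in this deck. The pathway-level question is not covered here.

```python
from collections import defaultdict

# (enzyme, substrate protein, modified form) edges; data from the BUB1B slide.
MODIFICATIONS = [
    ("CDK1", "BUB1B", "BUB1B/Phos:2"),
    ("PLK1", "BUB1B", "BUB1B/Phos:2"),
]

# (modified form, interacting partner) edges; data from the BUB1B slide.
INTERACTIONS = [
    ("BUB1B/Phos:2", "PPP2R5A"),
]


def forms_and_enzymes(protein, modifications=MODIFICATIONS):
    """Substrate-centric: map each PTM form of a protein to its known enzymes."""
    result = defaultdict(set)
    for enzyme, substrate, form in modifications:
        if substrate == protein:
            result[form].add(enzyme)
    return {form: sorted(enzymes) for form, enzymes in result.items()}


def substrates_of(enzyme, modifications=MODIFICATIONS):
    """Enzyme-centric: list the known substrates of a PTM enzyme."""
    return sorted({s for e, s, _ in modifications if e == enzyme})


def partners_of(form, interactions=INTERACTIONS):
    """Interaction: list the known partners of one PTM form."""
    return sorted({p for f, p in interactions if f == form})
```

For example, `forms_and_enzymes("BUB1B")` returns `{'BUB1B/Phos:2': ['CDK1', 'PLK1']}` and `partners_of("BUB1B/Phos:2")` returns `['PPP2R5A']`.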


Human BUB1B Phosphorylation Network

• 73 nodes

• 24 phosphorylated forms

• 9 protein kinases

• 10 phospho-specific PPIs

• BUB1B/Phos:2 interacts specifically with PPP2R5A

• BUB1B/Phos:2 is phosphorylated by two important mitotic kinases: CDK1 and PLK1

• BUB1B interacts with both phosphorylated and unphosphorylated CDC27

• Phosphorylation on CDC27/Phos:1 sites does not regulate CDC27 interaction with BUB1B

Construction of protein phosphorylation networks by data mining, text mining, and ontology integration: analysis of the spindle checkpoint. Ross et al. (2013) Database (Oxford) (in press).

BR Signaling

• Brassinosteroids (BRs): a class of growth-promoting hormones that play a role in plant growth and development.

• BR signaling is highly integrated with the light, gibberellin, and auxin pathways, and cross-talks with other receptor kinase pathways to modulate stomatal development and innate immunity.

BR signaling curation

Step 1: Search RLIMS-P with core genes (bri1, bak1, bin2, bsu1, bzr1, bes1) and “brassinosteroid mediated signaling pathway” to identify papers with phosphorylation information (kinase, substrate, site)

Step 2: Use RACE-PRO to curate phosphorylated protein forms, their kinases, PPIs, and associated GO functions, processes, and cellular components


BR Signaling Pathway

Core proteins and other associated proteins annotated with GO terms related to the BR signaling pathway (blue)


SCF complexes & Auxin/Jasmonate Signaling

Cullin-1 Rubylated

• SCF Complexes formed in response to auxin and jasmonate signaling

• Link to ChEBI for small molecule-containing complexes

