US_HUPO_2006 - Protein Information Resource

The iProXpress Knowledge System for Proteomic Data Analysis
116
Zhang-Zhi Hu1; Hongzhan Huang1; Peter McGarvey1; An Chi2; Julio Valencia3; Cathy H. Wu1
1Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007; 2 University of Virginia Department of
Chemistry, Charlottesville, VA 22904; 3Laboratory of Cell Biology, National Cancer Institute, Bethesda, MD 20892
Introduction
Large-scale proteomic profiling of biological samples such as cells, organelles or biological fluids has led to the discovery of
numerous key and novel proteins involved in many biological and disease processes, as well as to the identification of novel
disease biomarkers and potential therapeutic targets. Bioinformatics infrastructure and systems play instrumental roles in
such data analysis and discovery. iProXpress (integrated Protein eXpression system) is a prototype analysis
system designed to help analyze proteomic and genomic data, such as protein/peptide and gene profiles from MS
proteomic and microarray gene expression experiments. It has been applied to several studies, including the expression
profile analysis of hormone-induced changes in tumor cells, and is currently being adopted for analyses of pathogen/host
genomic and proteomic data produced by the NIAID Biodefense Proteomics Program. We present a case study in which
proteomes of various stages of melanosomes from human melanoma cells were analyzed using the iProXpress system to
illustrate its utility in facilitating a better understanding of the pathways of melanin synthesis and melanosome biogenesis.
Figure 1. An Overview of the iProXpress System (iProXpress: integrated Protein eXpression Analysis System)
A Case Study – Organellar proteomes of various stages of melanosomes from human melanoma cell lines
• Mapping to known mouse coat color genes led to identification of 17 essential human melanosome-related proteins;
• Identification of possible stage-specific melanosome proteins for validation;
• Comparison of melanosome proteome with those of several other organelles.
IP/2D/MS Proteomic Data
Figure 2. Electron microscopy images showing the morphological characteristics of melanosomes
during maturation. In pigmented cells such as MNT1, stage I melanosomes are relatively spherical
organelles with an amorphous matrix; at stage II, the organelles elongate, the internal fibrils form regular
parallel arrays, and melanin synthesis starts at granular sites studding the internal walls of the melanosome;
at stage III, deposition of melanin on the melanosomal matrix becomes evident; by stage IV, the
organelle is fully mature, with a heavy accumulation of melanin. In non-pigmented mutant cells
such as SkMel-28, melanosomes remain at early developmental stages and melanin synthesis is halted because
maturation is dysfunctional, so a large number of early-stage melanosomes accumulate, as
Gene/Protein ID list Peptide Sequence
UniProt
Protein Mapping
System Designs
• iProXpress consists of three major components:
• The PIR (Protein Information Resource) data warehouse with integrated protein information,
• Analytical tools for sequence analysis and functional annotation, and
• A graphical user interface for categorization and visualization of expression data.
• Major Functionalities
• Gene/Peptide to Protein Mapping. Gene or protein lists are mapped to the corresponding entries in the UniProt
Knowledgebase (UniProtKB) of all known proteins, based on gene/protein IDs, names or sequences.
• Protein Information Matrix. A comprehensive matrix is generated that summarizes salient features, including gene
ontology (GO) terms and pathways, either retrieved from the underlying PIR protein databases with annotated experimental
literature information or inferred from sequence similarity.
• Protein Data Analysis. By iterative categorization and sorting of proteins in the information matrix, users can
correlate expression/interaction patterns with protein properties for pathway and network discovery.
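The categorization and sorting described above can be illustrated with a minimal Python sketch. It is not the iProXpress implementation: the field names (`ac`, `name`, `function`, `group`) and the helper functions are hypothetical, and the rows are a handful of entries transcribed from Table 1 of this poster, with the stage labels lightly normalized.

```python
from collections import defaultdict

# Rows transcribed from Table 1 of this poster. The field names are
# illustrative only; they are not the real iProXpress matrix schema.
MATRIX = [
    {"ac": "P14679", "name": "Tyrosinase precursor",
     "function": "melanogenic enzyme", "group": "Unique MNT1"},
    {"ac": "P17643", "name": "5,6-dihydroxyindole-2-carboxylic acid oxidase precursor",
     "function": "melanosomal enzyme/stabilizing factor", "group": "Unique MNT1"},
    {"ac": "P40967", "name": "Pmel 17 precursor",
     "function": "melanosomal matrix protein", "group": "Common all stage"},
    {"ac": "Q14956", "name": "Transmembrane glycoprotein NMB precursor",
     "function": "apparent melanosomal component", "group": "Common all stage"},
    {"ac": "P51159", "name": "Ras-related protein Rab-27A",
     "function": "melanosome transport", "group": "Common all stage"},
    {"ac": "P57729", "name": "Ras-related protein Rab-38",
     "function": "targeting of Tyrp1 protein to the melanosome",
     "group": "Unique Skmel28"},
]


def categorize(rows, attribute):
    """Group accessions by the value of one matrix attribute."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row[attribute]].append(row["ac"])
    return {key: sorted(acs) for key, acs in clusters.items()}


def sort_clusters(clusters):
    """Order clusters by size (largest first), then alphabetically by label."""
    return sorted(clusters.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    for label, accessions in sort_clusters(categorize(MATRIX, "group")):
        print(f"{label}: {', '.join(accessions)}")
```

Calling `categorize` again on a different attribute (for example `"function"`) gives the iterative regrouping the text describes; a real matrix would carry many more attributes, such as GO terms and pathways.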
Melanosome Proteome Analysis
• ID mapping. A total of 2298 gi numbers were submitted to the UniProt/PIR ID mapping service, which converts common
gene/protein IDs (e.g., gi numbers) to UniProtKB AC/ID and vice versa; 1253 of them were mapped to UniProtKB sequences.
• Peptide mapping. Peptides from MS data were matched against all human sequences in UniProtKB with a two-step
procedure: direct mapping, or mapping using UniRef90 clusters (sequences with 90% or more identity are grouped in one
cluster), giving 1506 mapped proteins. Combining both ID and peptide mappings, 1936 (84%) of the 2298 proteins were
mapped to 1532 UniProtKB sequences.
• Protein information matrix. A comprehensive protein information matrix was generated for the melanosome proteins from
the underlying PIR data warehouse (UniProt/iProClass) or inferred based on sequence homology. Attributes in the
matrix include protein name, family, domain/motif/site, isoform, post-translational modification, GO, function/functional
category, structure/structural classification, pathway, protein interaction and complex, etc.
• Melanosome proteome analysis. The melanosome proteome data sets were partitioned into 12 groups, and
categorization and sorting functions were provided for each group or for all data sets, based on the protein information
matrices, especially gene ontology (GO) terms and pathways (KEGG and BioCarta). Iterative categorization and sorting
of proteins by function, pathway, and/or other attributes generated various protein clusters,
from which interesting unique or common proteins at different stages of melanosome biogenesis could be identified in
combination with manual examination.
• Comparative analysis of organelle proteomes. Melanosome proteomes of early or late stages were compared with
other organelle proteomes such as lysosome, synaptosome and endosome. This comparative analysis, coupled with
other bioinformatic analyses, was aimed at deducing a set of signature proteins characteristic of the melanosome.
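One possible reading of the two-step peptide mapping is sketched below in Python. The poster does not spell out the procedure, so this is an assumption: a peptide is first searched in the human sequences, and if that fails it is searched in other members of UniRef90-style clusters and then assigned to the human members of the matching cluster. All accessions, sequences and cluster names are invented placeholders; only the coverage figures (1936 of 2298) come from the text.

```python
# All accessions, sequences and clusters below are invented placeholders.
HUMAN_SEQS = {
    "HUMAN_A": "MKTAYIAKQRQISFVKSHFSRQ",
    "HUMAN_B": "MSDNGELEDKPPAPPVRMSS",
}
# A non-human sequence that shares a UniRef90-style cluster with HUMAN_A.
OTHER_SEQS = {"MOUSE_A": "MKTAYLAKQRQISFVKSHFSRQ"}
UNIREF90 = {
    "CLUSTER_1": ["HUMAN_A", "MOUSE_A"],
    "CLUSTER_2": ["HUMAN_B"],
}


def exact_hits(peptide, sequences):
    """Return accessions whose sequence contains the peptide exactly."""
    return {ac for ac, seq in sequences.items() if peptide in seq}


def map_peptide(peptide):
    """Step 1: direct match to human sequences. Step 2: match via a cluster."""
    direct = exact_hits(peptide, HUMAN_SEQS)
    if direct:
        return sorted(direct), "direct"
    relatives = exact_hits(peptide, OTHER_SEQS)
    via_cluster = set()
    for members in UNIREF90.values():
        if relatives.intersection(members):
            via_cluster.update(ac for ac in members if ac in HUMAN_SEQS)
    if via_cluster:
        return sorted(via_cluster), "cluster"
    return [], "unmapped"


def coverage_percent(mapped, total):
    """Percentage of input proteins that were mapped, rounded to an integer."""
    return round(100 * mapped / total)


if __name__ == "__main__":
    for peptide in ("AYIAKQ", "AYLAKQ", "WWWW"):
        print(peptide, map_peptide(peptide))
    # Figures from the text: 1936 of 2298 proteins mapped.
    print(coverage_percent(1936, 2298))
```

The last call reproduces the 84% coverage quoted above from the reported counts.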
Gene Expression
iProClass
demonstrated on the panels to the right.
Figure 1 panel labels: Protein Information Matrix; Functional Annotation; Expression Profiling; GO tree visualization; Pathway Map; Function Categorization Chart; Interaction Map; Two-Way Comparison Matrix
Screenshot captions: Sorting of data sets and display of protein information matrix; Functional categorization based on gene ontology

Table 1. Mapping of mouse color genes to human melanosome proteins

| Gene Symbol | Murine Locus | Function in Pigmentation | Human Melanosome Proteins | Human Disease (OMIM) |
|---|---|---|---|---|
| Ednrb | piebald spotting (s) | melanoblast differentiation | P24530 : ET-B (Common stage1 & MNT1 stage2) | Waardenburg-shah syndrome [277580]… |
| Sfxn1 | flexed tail | Tricarboxylate carrier | Q9H9B4 : Sideroflexin1 (Common early stage) | |
| *Rab38 | chocolate (cht) | Targeting of Tyrp1 protein to the melanosome | P57729 : Ras-related protein Rab-38 (unique Skmel28) | 606281 |
| *si | silver (si) | melanosomal matrix protein | P40967 : Pmel 17 precursor (Common all stage) | Oculocutaneous albinism [155550] |
| *Tyr | albino, color (c) | melanogenic enzyme | P14679 : Tyrosinase precursor (Unique MNT1) | OCA1 [203100]; OCA1B [606952]… |
| *Tyrp1 | brown (b) | melanosomal enzyme/stabilizing factor | P17643 : 5,6-dihydroxyindole-2-carboxylic acid oxidase precursor (Unique MNT1) | Rufous albinism, ROCA [115501]; OCA3 [203290]; Precocious graying of hair [278400] |
| *Gpnmb | iris pigment dispersion (ipd) | Apparent melanosomal component | Q14956 : Transmembrane glycoprotein NMB precursor (Common all stage) | Glaucoma-related pigment dispersion syndrome-1 [604368] |
| *Matp | underwhite (uw) | transporter | Q9UMX9 : Melanoma antigen AIM1; Q6P2P0 : Membrane-associated transporter protein, isoform b (MNT1 stage1 & 2) | OCA4 [606574] |
| Rab27a | ashen (ash) | melanosome transport | Q6IAS8 : RAB27A protein; P51159 : Ras-related protein Rab-27A (Common all stage) | Griscelli syndrome, type 2 [607624] |
| gdn | golden (gdn) | Causes delayed and reduced development of melanin pigmentation | Q71RS6 : Ion transporter JSX (Unique late stage) | SLC24A5, a human skin color gene. Science 16 December 2005: 1754-1755. |

Table 2. Partial list of stage-specific melanosome proteins
MNT1 Stage I (77)
P36955 : Pigment epithelium-derived factor precursor (PEDF)
Q14254 : Flotillin-2 (Epidermal surface antigen)
P07093 : Glia derived nexin precursor (GDN)
P24390 : ER lumen protein retaining receptor 1 (KDEL receptor 1) (TM)
O14880 : Microsomal glutathione S-transferase 3 (TM)

MNT1 Stage II (112)
Q12846 : Syntaxin-4 (TM) (interact with O75379 Vesicle-associated membrane protein 4; Q15836 Vesicle-associated membrane protein 3)
Q04656 : Copper-transporting ATPase 1 (=mouse Atp7a)
Adaptor proteins:
O95782 : AP-2 (~mouse Ap3bl) (also Skmel)
Q96EL6 : Adaptin (~mouse Ap3d)
Vacuolar protein sorting:
Q96A65 : Exocyst complex component Sec8
P46459 : Vesicle-fusing ATPase (EC 3.6.4.6) (interact)
P14415 : Sodium/potassium-transporting ATPase beta-2 chain (TM)

MNT1 Stage IV (267)
Adaptor protein:
Q9Y6Q5 : Adaptor protein complex AP-1 mu-2 subunit (interact with P63010: AP2B1)
Motor proteins:
Q14203 : Dynactin-1 (Progressive lower motor neuron disease [OMIM:607641]) (interact with P18669: Phosphoglycerate mutase 1)
Q9H193 : KINESIN-13A2
Transport:
Q99747 : Gamma-soluble NSF attachment protein (also Skmel)
Q99698 : Lysosomal trafficking regulator (=mouse Lyst)
Vacuolar protein sorting:
Q9H444 : VPS32 (~mouse Vps33a) (interact) (also Skmel)
Q9NZZ3 : VPS60 (~mouse Vps33a)

Skmel28 unique (143)
P57729 : Ras-related protein Rab-38 (=mouse Rab38)
P51810 : G-protein coupled receptor 143 (=mouse Gpr143)
P53794 : Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter)

All stage IV-specific membrane proteins:
P50443 : Sulfate transporter (Solute carrier family 26 member 2) (OMIM: 600972)
Q9NZ45 : Protein C10orf70
P33121 : Long-chain-fatty-acid--CoA ligase 1 (LACS 1) (OMIM: 152425) (also in ER)
Q8NCC2 : Hypothetical protein FLJ90355 (Solute carrier family 2
Q8IWB8 : CCR4-NOT transcription complex, subunit 1, isoform b
P27449 : Vacuolar ATP synthase 16 kDa proteolipid subunit (EC 3.6.3.14) (OMIM: 108745)
Q71RS6 : Ion transporter JSX [Homo sapiens] (human skin color gene)
Q6ZTT7 : Hypothetical protein FLJ44232 [Homo sapiens]
Q16444 : Phosphoglycerate kinase (Fragment) [Homo sapiens] (47 aa)

Both melanosome-specific proteins, Tyrosinase and TYRP1, are absent from the Skmel28 data set, suggesting that their
absence can partially account for the lack of melanin production in Skmel28.

Note to Table 1: This is a partial list of the 17 mapped genes. The others include Lyst, Ostm1, Dct, Atp7a, Gpr143, Myo5a and
Krt2-17. For the complete list, go to http://pir.georgetown.edu/~huz/datamining/proteomics/

NIAID Biodefense Proteomics Program
Master catalog of proteins from NIAID Biodefense Proteomics Research Centers.
Figure 3A. Trafficking and sorting of proteins to melanosomes. Hearing 2005.
Diagram labels: ER; cis-Golgi; trans-Golgi Network; Early Endosome; Late Endosome; Lysosome; Stage I; Stage II; Stage III; Stage IV; AP1; AP2; AP3; AP4; APx; Tyr; Tyrosinase; Tyrp1, Dct and MART1; gp100 (90%); gp100 (10%)

Screenshot captions: Two-way comparative matrix; A catalog of proteins from Harvard University, cDNA clones of which are
used for protein chip assays to detect protein-protein interactions in situ; Browsing selected complete proteome(s) with
protein links to the Proteomic Center data.

Proposal: a list of proteins characteristic of the melanosome that may consist of:
• previously known melanosome-specific proteins;
• the current short list of melanosome stage-specific proteins proposed in Table 2;
• some constituent proteins commonly found in several other organelles, such as those in Table 3.

Conclusion
The prototype iProXpress system enables the analysis of large-scale expression data through its
functionalities in ID/peptide mapping, sequence analysis, functional annotation, and functional
profiling. The case study on melanosome proteome analysis illustrates that the use of the system greatly
facilitates the study of melanosome biogenesis and pigmentation-related human diseases. The system
should be applicable to a broad range of genomic and proteomic profiling studies.

Table 3. Summary of the comparison of organellar proteomes

| Organelle | # Protein reported | # Entries mapped | Common with melanosome * | References |
|---|---|---|---|---|
| Mouse ER (ER) | ~141 | 131 | 57 (M:51, S:36); 19 (33%) all stages | Knoblach et al, 2003 |
| Human neuromelanin granules (NG) | 72 | 72 | 43 (M:38, S:36); 22 (51%) all stages | Tribl et al, 2005 |
| Rat synaptosome (SY) | 200 | 88 | 43 (M:35, S:27); 14 (33%) all stages | Witzmann et al, 2005 |
| Rat lysosome (LY) | 215 | 116 | 49 (M:40, S:38); 13 (27%) all stages | Bagshaw et al, 2005 |
| Human platelet (PL) | ~93 | 71 | 33 (M:26, S:22); 6 (18%) all stages | Martens et al, 2005 |
| Human exosome (EX) | ~56 | 55 | 42 (M:42, S:32); 18 (43%) all stages | Mears et al, 2004 |

Proteins Present in at Least 3 Other Organelles in Either Early Stages or All Stages of
Melanosomes (1-ER, 2-NG, 3-SY, 4-EX, 5-PL, 6-LY)

| Accession | Protein | Melanosome data sets | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|---|
| P30101 | Protein disulfide-isomerase A3 precursor | + + + + + | + | + | - | + | - | - |
| P35232 | Prohibitin | + + - - + | + | - | + | + | - | + |
| P60709 | Actin, cytoplasmic 1 (Beta-actin) | + + + - - | + | - | + | + | - | - |
| Q5FVE7 | Voltage-dependent anion channel 1 | + + - - - | + | - | + | - | - | + |
| P04406 | Glyceraldehyde-3-phosphate dehydrogenase | + + + + + | - | + | + | - | - | + |
| P04843 | protein glycosyltransferase | + + + + + | - | + | - | - | + | + |
| P06576 | ATP synthase beta chain, mitochondrial | + + + - + | + | + | + | - | - | - |
| P10809 | 60 kDa heat shock protein, mitochondrial | + + + + + | - | + | + | + | - | - |
| P11142 | Heat shock cognate 71 kDa protein | + + + + + | - | + | + | + | - | + |

* M: MNT1 cells; S: Skmel28 cells. The percentage of common proteins found in all stages of MNT1 melanosomes is also indicated.
The complete listing of the common proteins is at http://pir.georgetown.edu/~huz/datamining/proteomics/ .
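The figures in Table 3 can be checked with a short Python sketch. It is not part of iProXpress. The presence patterns are read from the six organelle columns (1-ER, 2-NG, 3-SY, 4-EX, 5-PL, 6-LY) of the listing above, and pairing each pattern with an accession by row order is an assumption made here. The `COMMON` counts are the "common with melanosome" and "all stages" numbers from Table 3; the variable and function names are hypothetical.

```python
ORGANELLES = ["ER", "NG", "SY", "EX", "PL", "LY"]

# Presence (+) or absence (-) in the six other organelle proteomes.
# Pairing patterns with accessions by row order is an assumption.
PRESENCE = {
    "P30101": "++-+--",
    "P35232": "+-++-+",
    "P60709": "+-++--",
    "Q5FVE7": "+-+--+",
    "P04406": "-++--+",
    "P04843": "-+--++",
    "P06576": "+++---",
    "P10809": "-+++--",
    "P11142": "-+++-+",
}

# (proteins common with melanosome, of which found in all stages), from Table 3.
COMMON = {
    "ER": (57, 19), "NG": (43, 22), "SY": (43, 14),
    "EX": (42, 18), "PL": (33, 6), "LY": (49, 13),
}


def organelles_containing(accession):
    """List the organelles whose proteome contains the given protein."""
    return [name for name, flag in zip(ORGANELLES, PRESENCE[accession]) if flag == "+"]


def widely_shared(minimum=3):
    """Accessions present in at least `minimum` of the other organelles."""
    return sorted(ac for ac in PRESENCE if len(organelles_containing(ac)) >= minimum)


def percent_all_stages(organelle):
    """Share of the common proteins that occur in all melanosome stages."""
    common, all_stages = COMMON[organelle]
    return round(100 * all_stages / common)


if __name__ == "__main__":
    print(widely_shared())
    for name in ORGANELLES:
        print(name, percent_all_stages(name))
```

Under these assumptions, all nine listed proteins satisfy the "at least 3 other organelles" criterion, and the recomputed percentages agree with those printed in Table 3.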
Slominski, A. et al. Physiol. Rev. 84: 1155-1228 2004; Copyright ©2004 American Physiological Society
Figure 3B. Schematic of melanosomal proteins (Hypothetical depiction)