About OMICS Group

OMICS Group International is an amalgamation of Open Access publications and worldwide international science conferences and events. Established in 2007 with the sole aim of making information on science and technology Open Access, OMICS Group publishes 400 online open access scholarly journals covering all aspects of science, engineering, management and technology. OMICS Group has been instrumental in taking knowledge of science and technology to the doorsteps of ordinary men and women. Research scholars, students, libraries, educational institutions, research centers and industry are the main stakeholders that have benefited greatly from this knowledge dissemination. OMICS Group also organizes 300 international conferences annually across the globe, where knowledge transfer takes place through debates, round table discussions, poster presentations, workshops, symposia and exhibitions.

About OMICS Group Conferences

OMICS Group International is a pioneer and leading science event organizer, which publishes around 400 open access journals and conducts over 300 medical, clinical, engineering, life sciences and pharma scientific conferences all over the globe annually, with the support of more than 1,000 scientific associations, 30,000 editorial board members and 3.5 million followers. OMICS Group has organized 500 conferences, workshops and national symposiums in locations including San Francisco, Las Vegas, San Antonio, Omaha, Orlando, Raleigh, Santa Clara, Chicago, Philadelphia, Baltimore, United Kingdom, Valencia, Dubai, Beijing, Hyderabad, Bengaluru and Mumbai.

NINDS workshop, 02/28/2013

Translating multidimensional cancer omics data into biological insights

Bing Zhang, Ph.D.
Department of Biomedical Informatics
Vanderbilt University School of Medicine
bing.zhang@vanderbilt.edu

Omics opportunities and challenges

[Figure: "Elephant" illustration relating omics data types to clinical applications]
• DNA: CNV, mutation, methylation
• mRNA: expression, splicing
• Protein: expression, modification
• Applications: risk, diagnosis, prognosis, therapeutics

The Cancer Genome Atlas (TCGA) Colon and Rectal Cancer (CRC) Project

Nature (2012) 487: 330-337
• Hypermutator genotype
• Somatic mutations
• Copy number alterations
• Translocations
• Transcriptomic subtypes

How do genomic alterations drive cancers?
Clinical Proteomic Tumor Analysis Consortium (CPTAC)

Global proteomics analysis: Samples

Clinical characteristics of the 90 colorectal cancer (CRC) samples

[Figure: per-sample annotation map for the 90 TCGA samples, with sample IDs on one axis (TCGA-A6-3807 through TCGA-AG-A036 as listed on the slide). Legend categories: Female, Male; Colon, Rectum; Deceased, Living; Hyp, Non-Hyp; MSI-H, MSI-L, MSS, NA; Stage I, Stage II, Stage III, Stage IV, NA.]

Data types, with the count shown for each on the slide:
• miRNA.Seq (74)
• RNA.Seq (87)
• RPPA (47)
• CNV (88)
• Methylation (85)
• AgilentArray (73)
• Exome.Seq (79)
• Shotgun (90)

Global proteomics analysis: Vanderbilt shotgun proteomics platform

[Figure: example MS/MS spectrum, relative abundance vs. m/z, with annotated b- and y-ions]

Workflow stages: tissue, peptides, MS/MS spectra
• Digestion (TFE, trypsin)
• bRPLC: 15 concatenated fractions
• LC-MS/MS: Thermo Orbitrap-Velos
• Preprocessing: MSConvert
• Peptide identification: MyriMatch, Pepitome, MS-GF+
• Protein assembly: IDPicker

Global proteomics analysis: Summary of the proteomics data
• 90 tumors
• 6,299,756 identifiable spectra
• 124,823 distinct peptides (1% FDR)
• 7,526 proteins (2.6% FDR)
• 3,899 quantifiable proteins (0.43% FDR)
684 GB
https://cptac-data-portal.georgetown.edu

What is the added value of proteome profiling in human cancer studies?
• Cancer biomarker discovery
• Oncogenic driver prioritization
• Molecular subtype classification

Mutant proteins as cancer biomarkers
• Proteins resulting from somatic mutations are ideal cancer biomarkers
  – Specificity
    • Unlike CEA and PSA
  – Possible drivers
    • Not only association
  – Established targeted proteomics technologies
    • Selected Reaction Monitoring (SRM)
    • Multiple Reaction Monitoring (MRM)
• Many somatic mutations have been identified in the TCGA studies
• Can we prioritize somatic mutations for targeted analysis?
TCGA.
Nature, 2012

Personalized protein database from RNA-Seq data
• Increased sensitivity
• Reduced ambiguity
• Variant peptides
customProDB is available in Bioconductor
Wang et al., J Proteome Res, 2012
Wang & Zhang, Bioinformatics, 2013

Personalized database search results
• 1,065 variant peptides
• 796 unique Single Amino Acid Variants (SAAVs)
  – 64 in TCGA reported somatic variations
  – 101 in the COSMIC database
  – 526 in the dbSNP database
• 647 variant proteins
  – Known cancer genes: KRAS, CTNNB1, SF3B1, ALDH2, FH, etc.
  – Targets of FDA approved drugs: ALDH2, HSD17B4, PARP1, P4HB, TST, GAK, SLC25A24, SUPT16H, etc.

What is the added value of proteome profiling in human cancer studies?
• Cancer biomarker discovery
• Oncogenic driver prioritization
• Molecular subtype classification

Candidate drivers in regions of copy number alteration (CNA)
• The TCGA study identified 17 focal amplification regions and 28 focal deletion regions
• CNA-mRNA correlation has been widely used for the prioritization of candidate drivers
  – Wang et al. Clin Cancer Res, 2013
• What can proteomics tell us about these regions and candidate drivers?
TCGA.
Nature, 2012

Proteomics data enable prioritization of CNA regions and candidate drivers

[Figure: genome-wide maps of CNA-mRNA correlation and CNA-protein correlation, plotted as gene location vs. CNA location across chromosomes 1-22, X and Y. Gene counts shown in the panels: 23,125 genes and 3,764 genes. Additional panels summarize focal amplifications/deletions ("Common" vs. "Specific"), with 20q and HNF4A highlighted. p-values shown: p = 1.0e-5, p < 2.2e-16, p = 4.0e-6.]

Two genes in the focal amplification region 20q13.12

[Figure: per-sample CNA, mRNA and protein tracks for STK4 and HNF4A across the TCGA CRC samples. Correlation values as extracted from the slide: STK4: 0.67, 0.06, Cor = -0.02; HNF4A: 0.79, Cor = 0.51, 0.43. The pairing of each value with a specific comparison (CNA-mRNA, mRNA-protein, CNA-protein) is not recoverable from the extracted text.]

Proteomics data help narrow down candidate oncogenic drivers

Higher HNF4A dependency in CRC cell lines

[Figure: shRNA score for HNF4A in colon cancer cell lines vs. other cancer cell lines, p = 6.9e-20]
Data from the Achilles project
http://www.broadinstitute.org/achilles

What is the added value of proteome profiling in human cancer studies?
• Cancer biomarker discovery
• Oncogenic driver prioritization
• Molecular subtype classification

TCGA transcriptomic subtypes
• The TCGA study identified three transcriptomic subtypes: MSI/CIMP (microsatellite instability/CpG island methylator phenotype), Invasive, and CIN (chromosomal instability)
• The MSI/CIMP subtype is enriched with hypermutated tumors
TCGA.
Nature, 2012

mRNA level does not reliably predict protein level

[Figure: probability density of gene-wise Spearman's correlation between mRNA and protein levels. Annotations: Mean = 0.23; a second value, 0.47, whose label is not recoverable from the extracted text.]
• 89.4% positive correlation
• 32% significant positive correlation (FDR < 0.01)

Can we rediscover or redefine CRC subtypes using proteomics data?

Five proteomic subtypes

[Figure: protein expression heat map of the five proteomic subtypes (A to E), with annotation tracks for transcriptomic subtype (MSI/CIMP, Invasive, CIN), methylation subtype (CIMP-H, CIMP-L, Cluster 3, Cluster 4) and genomic features (hypermutation, MSI-high, POLE, BRAF, KRAS, TP53 mutation, 18q loss, stage). A "Wound response" label also appears on the slide.]
• Subtype A (n=15)
• Subtype B (n=9)
• Subtype C (n=25)
• Subtype D (n=11)
• Subtype E (n=19)

[Figure: survival curves by proteomic subtype, percentage of survival vs. survival months (0 to 70), p = 0.734]

Summary
• Integrated proteogenomic analysis may enable new advances in cancer biology and management.
  – Identifying mutant peptides as candidate cancer biomarkers.
  – Prioritizing oncogenic drivers in focal amplification regions.
  – Revealing molecular subtypes that cannot be distinguished using mRNA expression data.
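The gene-wise comparison of mRNA and protein levels reported above (Spearman's correlation, with significance called at FDR < 0.01) could be sketched roughly as follows. This is a minimal illustration in Python, not the authors' pipeline: the function names, the toy gene profiles, and the choice of a one-sided permutation test followed by Benjamini-Hochberg adjustment are all assumptions, since the slides do not state how p-values or FDR were computed.

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant profile (assumption)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """One-sided permutation p-value for a positive Spearman correlation."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = pearson(rx, ry)
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(rx, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = n - pos  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy profiles across 8 samples; gene names and values are invented.
    mrna = {
        "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        "GENE_B": [2.1, 0.5, 3.3, 1.2, 4.8, 2.9, 0.7, 3.9],
        "GENE_C": [5.0, 4.0, 6.0, 3.0, 7.0, 2.0, 8.0, 1.0],
    }
    protein = {
        "GENE_A": [0.9, 2.2, 2.8, 4.1, 5.3, 5.9, 7.2, 8.4],
        "GENE_B": [1.0, 1.4, 0.8, 1.9, 1.1, 1.6, 1.3, 0.9],
        "GENE_C": [1.5, 2.5, 1.0, 3.5, 0.5, 4.5, 0.2, 5.5],
    }
    genes = sorted(mrna)
    rhos = [spearman(mrna[g], protein[g]) for g in genes]
    pvals = [permutation_pvalue(mrna[g], protein[g]) for g in genes]
    fdr = benjamini_hochberg(pvals)
    print("mean rho:", round(mean(rhos), 2))
    print("fraction positive:", sum(r > 0 for r in rhos) / len(rhos))
    for g, r, q in zip(genes, rhos, fdr):
        call = "significant" if r > 0 and q < 0.01 else "not significant"
        print(g, round(r, 2), call)
```

The same per-gene routine would apply to the CNA-mRNA and CNA-protein comparisons by swapping in copy number values for one of the two profiles. With real data, a library routine for Spearman's correlation and its p-value would likely replace the permutation loop.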
Making data accessible and reusable

NetGestalt CRC portal (http://crc.netgestalt.org)

[Figure: overview diagram of the portal, organized into experimental data, knowledge, and analysis features]

Experimental data
• Sample type: tissue, cell lines
• Data type: mutation, mutation (targeted), CNA, mRNA expression, protein expression, DNA methylation, clinical data, genetic vulnerability, drug response
• Data processing levels
  – Annotation data, SBT
  – Integrated results, SCT/SBT/SST
  – Gene by sample matrices, CCT/CBT
  – Significant calls for individual genes, SBT
  – Statistical results for individual genes, SCT
  – Sample annotations, TSI

Knowledge
• GO biological process
• GO cellular component
• GO molecular function
• KEGG pathway
• Reactome pathway
• NCI-Nature pathway
• Cancer Cell Map
• HumanCyc pathway
• PPI networks

Analysis features
• Data query/upload
  – Track menu: browse systems tracks, search systems tracks, upload track file, enter gene symbols
  – View menu: select system networks, upload user networks
• Integration
  – Horizontal integration: order statistics, Boolean operations
  – Vertical integration: summary tracks, correlations, Boolean operations
  – Network integration: module-based approach, pathway-based approach, direct neighbor approach
• Visualization (tracks and views)
  – Overview: linear network layout, heat maps, bar charts, data transformation, summary tracks
  – Zoom and filter: zoom-in and zoom-out, pan, filter
  – Details-on-demand: node-link diagrams, detailed supporting data

Shi et al. Nature Methods, 2013

Acknowledgement
• Jing Wang
• Qi Liu
• Xiaojing Wang
• Jing Zhu
• Dan Liebler
• Rob Slebos
• Dave Tabb
• Zhiao Shi
• CPTAC Network
• TCGA Network

Funding:
• SAIC RFP S13-029
• NCI U24 CA159988
• NIGMS R01 GM088822
• NCI P50 CA095103
• NIMH P50 MH096972

Let Us Meet Again

We welcome you all to our future conferences of OMICS Group International.
Please visit:
www.omicsgroup.com
www.conferenceseries.com
www.pharmaceuticalconferences.com