Network analysis of proteomes
Peter Andras
School of Computing Science
University of Newcastle
peter.andras@ncl.ac.uk

Overview
- Introduction
- The data
- Network analysis
- Protein interaction networks
- Analysis of protein interaction networks
- Computational drug target discovery

Motivation
- Search for new antibiotics, drugs for genetic and prion diseases
- Destroying and restoring the functionality of cells
- How to do this?
- eXSys project

Cells – 1

Cells – 2

Analysing cells
- Analysing and understanding cells by analysing their protein interaction network
- Ideally: dynamic analysis
- Simplified version: static analysis

Proteomics data
- Yeast two-hybrid data
- Gene co-expression based predicted data
- Other experimental data

Web databases
- DIP (Database of Interacting Proteins): experimentally validated data, mostly for yeast
- STRING (Search Tool for the Retrieval of Interacting Genes/Proteins): large amount of predicted data based on gene expression data
- KEGG: metabolic cycle descriptions
- EBI Proteome: full proteomes
- Swiss-Prot: general protein information

Data collection
- eXSys data management engine
  - Automatically collects and updates data from web databases
  - Extracts information about protein interactions and stores it in a proprietary format
  - Retrieves specific data about selected proteins from web databases

Networks
- Graphs: nodes and edges

Scale-free networks

Damaging scale-free networks
- Robustness to random damage
- High sensitivity to targeted damage

Important nodes
- Hubs: high connectivity nodes
- Bottlenecks: nodes connecting clusters

Other important nodes
- Elementary cycle number of nodes
- Effect of deletion on the characteristic polynomial

Integrity measures
- Average minimum path length: how close the nodes of the graph are to each other on average
- Clustering coefficient: how densely clustered the graph is
- Number of isolated clusters

Other integrity measures
- Comparison of characteristic polynomials of the damaged and non-damaged networks
- Informational measures

Comparative integrity measures
- Integrity measures calculated for well-specified targeted damage or random damage
- E.g., top 10% of hub nodes deleted, compared with the average damage when 10% of randomly selected nodes are deleted

Network analysis
- Evaluation and categorisation of nodes
- Evaluation of damaging capacity of nodes and node combinations
- Selection of nodes to achieve a desired level of damage

Protein interaction systems
- Protein interaction systems can be viewed as networks
- Static picture of the cell; ignores the temporal activation of sub-networks of the full protein interaction network

Protein interaction networks
- E. coli
- P. aeruginosa

Analysis of protein interaction networks – 1
- Protein interaction networks are scale-free networks
- High sensitivity to targeted damage, low sensitivity to random damage
- Earlier work shows that hub proteins are likely to be essential proteins

Analysis of protein interaction networks – 2
- Conjecture: graph theoretic network integrity is related to functional integrity of the protein interaction system
- Objective: determine important nodes and node combinations that can cause significant integrity damage

Analysis of protein interaction networks – 3
- Lists of hubs, bottlenecks, elementary cycle nodes and other important nodes
- Calculation of comparative damage measures

Analysis of protein interaction networks – 4
- Calculation of optimal combinations of nodes that have damage potential above a pre-specified limit
- Cocktails of target proteins: blocking the activity of the target proteins causes significant integrity damage to the protein interaction network

Analysis of protein interaction networks – 5
- Checking potential targets for toxicity
- BLAST comparison of targets with important proteins of the host organism
- Selection of admissible targets and target combinations

Protein network analysis
- eXSys network analysis engine
  - Takes data files generated by the eXSys data management engine
  - Performs network analysis and generates suggested target protein cocktails of admissible targets

Analysis of B. subtilis

B. subtilis

Important nodes for B. subtilis – 1
Hub nodes

Id | Swiss-Prot Id | Protein Name | Gene Name | Significance | Function
355 | P35164 | Sensor protein resE | RESE | | Member of the two-component regulatory system resD/resE involved in the global regulation of aerobic and anaerobic respiration. Probably phosphorylates resD.
378 | P16497 | Sporulation kinase A | KINA | | Phosphorylates the sporulation-regulatory proteins spo0A and spo0F. It also autophosphorylates in the presence of ATP.
391 | Q45614 | Sensor protein yycG | YYCG | Essential | Member of the two-component regulatory system yycG/yycF involved in the regulation of the ftsAZ operon. Probably phosphorylates yycF.
392 | O31661 | YKRQ protein | YKRQ | |
393 | P39764 | Sporulation kinase C | KINC | | Phosphorylates the sporulation-regulatory protein spo0A.

Important nodes for B. subtilis – 2
Bottleneck nodes

Id | Swiss-Prot Id | Protein Name | Gene Name | Significance | Function
121 | P05652 | DNA gyrase subunit B | GYRB | Essential | DNA gyrase negatively supercoils closed circular double-stranded DNA in an ATP-dependent manner and also catalyzes the interconversion of other topological isomers of double-stranded DNA rings, including catenanes and knotted rings.
122 | Q45066 | Topoisomerase IV subunit A | PARC/GRLA | Essential | Topoisomerase IV is essential for chromosome segregation. It has relaxation of supercoiled DNA activity. Performs the decatenation events required during the replication of a circular DNA molecule.
123 | P05653 | DNA gyrase subunit A | GYRA | Essential | DNA gyrase negatively supercoils closed circular double-stranded DNA in an ATP-dependent manner and also catalyzes the interconversion of other topological isomers of double-stranded DNA rings, including catenanes and knotted rings.
124 | Q59192 | Topoisomerase IV subunit B | PARE | Essential | Topoisomerase IV is essential for chromosome segregation. It has relaxation of supercoiled DNA activity. Performs the decatenation events required during the replication of a circular DNA molecule.
453 | O07622 | Hypothetical protein yhfW | YHFW | |

Important nodes for B. subtilis – 3
Other important nodes

Id | Swiss-Prot Id | Protein Name | Gene Name | Significance | Function
52 | P16336 | Preprotein translocase secY subunit | SECY | Essential | Involved in protein export. Interacts with secA and secE to allow the translocation of proteins across the plasma membrane, by forming part of a channel.
34 | P42920 | 50S ribosomal protein L3 | RPLC | Essential | This protein binds directly to 23S ribosomal RNA and may participate in the formation of the peptidyltransferase center of the ribosome.
35 | P42921 | 50S ribosomal protein L4 | RPLD | Essential | This protein binds directly and specifically to 23S rRNA.
36 | P42924 | 50S ribosomal protein L23 | RPLW | Essential | Binds to a specific region on the 23S rRNA.
37 | P42919 | 50S ribosomal protein L3 | RPLC | Essential | This protein binds directly to 23S ribosomal RNA and may participate in the formation of the peptidyltransferase center of the ribosome.

Target list for B. subtilis
Target nodes validated against human proteome

Id | Swiss-Prot Id | Protein Name | Gene Name | Significance
55 | P05647 | 50S ribosomal protein L34 | RPMH | Essential
56 | O06492 | Glutamyl tRNA amidotransferase subunit C | GATC | Essential
374 | Q45614 | Sensor protein yycG | YYCG | Essential
410 | P42924 | Preprotein translocase secY subunit | SECY | Essential
776 | P42060 | 50S ribosomal protein L22 | RL22 | Essential

eXSys proteome analysis system – 1
- Components:
  - eXSys data management engine
  - eXSys network analysis engine
  - eXSys user interface and network visualisation tool
- Performs data collection and analysis of protein interaction networks, and provides a user interface and network visualisation

eXSys proteome analysis system – 2
- Computational search for new antibiotic targets
- Bacterial proteome + host proteome
- Analysis of bacterial proteome with BLAST validation against the host proteome
- List of potential antibiotic targets that can cause significant damage to the bacteria while being unlikely to damage the host

New antibiotics
- Usual antibiotics target a single protein or a related class of proteins (e.g., penicillin targeting PBPs, ribosomal antibiotics targeting ribosomal subunits)
- New antibiotics: multiple target proteins, achieving effect by combined damage

Computational search for drug targets for prion and genetic diseases – 1
- Prions and mutated genes produce wrong protein interactions within the protein interaction network
- Restoring the functionality of the cells might be done by adding proteins or changing existing proteins such that the functional integrity of the protein interaction system is restored

Computational search for drug targets for prion and genetic diseases – 2
- Analysing protein interaction systems of diseased cells can lead to the prediction of likely interventions that may restore the functional integrity of the protein interaction system

Summary – 1
- Cells can be perceived as protein interaction systems
- Protein interaction systems can be analysed as networks
- Protein interaction networks are scale-free networks, which are resistant to random damage but highly sensitive to targeted damage

Summary – 2
- The eXSys protein interaction network analysis system can collect data about proteomes and analyse them to detect potential new drug target proteins
- Computational drug target discovery may lead to new antibiotics and new drugs to restore the functionality of diseased cells

eXSys project team
- Project leaders: Peter Andras, Malcolm P Young
- Project members: Olusola Idowu, Steven Lynden, Panos Periorellis
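
Illustrative sketch: comparative integrity measures
The comparison described in the slides (top 10% of hubs deleted versus the average over random deletions of 10% of nodes) can be sketched roughly as follows. This is a minimal illustration and not the eXSys implementation: the toy hub-and-spoke network, all function names, the number of random trials, and the choice to average only the cluster count for random damage are assumptions made for the example.

```python
import random
from collections import deque

def bfs_distances(adj, start):
    """Hop distances from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def components(adj):
    """Connected clusters of the graph, as a list of node sets."""
    seen, comps = set(), []
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            comps.append(comp)
    return comps

def avg_path_length(adj, nodes):
    """Average minimum path length over all node pairs of one cluster."""
    total = pairs = 0
    for s in nodes:
        dist = bfs_distances(adj, s)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def clustering(adj):
    """Mean local clustering coefficient (degree < 2 counts as 0)."""
    if not adj:
        return 0.0
    acc = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
            acc += 2 * links / (k * (k - 1))
    return acc / len(adj)

def integrity(adj):
    """The three integrity measures listed in the slides, plus largest cluster size."""
    comps = components(adj)
    largest = max(comps, key=len) if comps else set()
    return {"components": len(comps), "largest": len(largest),
            "avg_path": avg_path_length(adj, largest),
            "clustering": clustering(adj)}

def remove_nodes(adj, removed):
    """Copy of the graph with the given nodes (and their edges) deleted."""
    removed = set(removed)
    return {u: {v for v in nbrs if v not in removed}
            for u, nbrs in adj.items() if u not in removed}

def top_hubs(adj, fraction):
    """Highest-degree nodes; ties are broken by node id."""
    k = max(1, round(fraction * len(adj)))
    return sorted(adj, key=lambda u: (-len(adj[u]), u))[:k]

def random_damage(adj, fraction, trials=200, seed=0):
    """Average number of clusters after deleting a random fraction of nodes."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(adj)))
    nodes = sorted(adj)
    total = sum(integrity(remove_nodes(adj, rng.sample(nodes, k)))["components"]
                for _ in range(trials))
    return total / trials

def build_toy_network():
    """Hypothetical network: node 0 links hubs 1-3, each hub has 5 leaf nodes."""
    edges = [(0, 1), (0, 2), (0, 3)]
    leaf = 4
    for hub in (1, 2, 3):
        for _ in range(5):
            edges.append((hub, leaf))
            leaf += 1
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

network = build_toy_network()
baseline = integrity(network)
hubs = top_hubs(network, 0.10)
targeted = integrity(remove_nodes(network, hubs))
random_avg = random_damage(network, 0.10)
print("baseline:", baseline)
print("top 10% hubs removed", hubs, ":", targeted)
print("average clusters after random 10% removal:", random_avg)
```

On this toy network, deleting the two highest-degree hubs splits the graph into 11 clusters, while random deletions of the same size mostly hit leaf nodes and fragment it far less. A real protein interaction network would be loaded from interaction data instead of being built by hand.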