Interactions and more interactions
Rob Russell
Cell Networks, University of Heidelberg
Russell Group, Protein Evolution

[Figure]
Aloy & Russell, Nature Rev Mol Cell Biol, 2006

"But instead of a cell dominated by randomly colliding individual protein molecules, we now know that nearly every major process in a cell is carried out by assemblies of 10 or more protein molecules"
Bruce Alberts, Cell, 1998

Yeast two-hybrid system
Fields & Song, Nature, 340, 245, 1989
[Figure panels]
a. Native GAL4 (UASG, GAL1-lacZ)
b. Individual hybrids with GAL4 domains (GAL4 DNA-binding domain + X; GAL4 activating region + Y)
c. Interaction between hybrids reconstitutes GAL4 activity (GAL4 DNA-binding domain, X, Y, UASG, GAL1-lacZ)
Applied to the whole yeast genome: Uetz et al, Nature, 403, 623, 2000. Ito et al, PNAS, 98, 4569, 2001.

Interaction discovery I
The two-hybrid system
Binary interactions (x1000s):
Bait    Prey
FUS3    DIG2
DIG2    FUS3
LSM2    PAT1
CKS1    CLB1
NPL4    UFD1
NPL4    CDC48
NPL4    FUC1
NPL4    SUA7
...     ...
Uetz et al, Nature, 2000. (Yeast)
Ito et al, PNAS, 2001. (Yeast)
Rain et al, Nature, 2002. (H. pylori)
Giot et al, Science, 2003. (D. melanogaster)
Li et al, Science, 2004. (C. elegans)

The system works, but how?
[Figure: panel c, "Interaction between hybrids reconstitutes GAL4 activity" (GAL4 DNA-binding domain, X, Y, UASG, GAL1-lacZ), next to a hypothetical arrangement. Labels: Gal-4 (N), Gal-4 (C), (hypothetical), Native GAL4, CDC28, Cyclin A, CKS, S12, L22]

Two datasets in Yeast
See: Ito et al, PNAS, 2001 (comparing to Uetz et al, Nature, 2000)

Interaction discovery II
Affinity purification (e.g. TAP/MS), x1000s
[Mass spectrum figure; y-axis: Relative Intensity [%]]
Complexes (bait, then co-purification partners): FUS3 DIG2 DIG1 DIG3 DIG2 FUS3 DIG2 NPL4 UFD1 CDC48 FUC1… (Etc.)
[Mass spectrum figure, continued; x-axis: m/z]
Gavin et al, Nature, 2002. (Yeast)
Ho et al, Nature, 180, 2002. (Yeast)

Trying to define binary interactions from purification data
[Figure panels: Reality, Purification, Spoke, Matrix]
Purifications report only a collection of proteins; they give no information about precisely who interacts with whom. There are thus two models for representing binary interactions from complexes: the spoke model (the bait paired with each co-purified protein) and the matrix model (every pair of proteins in the purification). Neither reflects reality.
Hakes et al, Comp Funct Genomics, 2006

Different worlds
[Table: datasets Homology, Aloy, Ito, Uetz, Ho, Gavin, grouped under 3D structure, Two-hybrids and Affinity purification. Columns: Total; Intersection with 3D (Transient, Total, Complexes); Intersection with each other]
Comparing interactions to known 3D structures shows that the original yeast two-hybrid datasets contain more transient interactions, whereas affinity purification datasets contain more stable complexes (e.g. of 25 Uetz et al interactions with structures, 23 are transient and 2 are dedicated or stable).
Aloy & Russell, Trends Biochem Sci, 2003

Error rates in interaction discovery
False negatives: interactions known to occur that are missed by a screen
- To assess this, one needs a reference set of positives (i.e. known interactions) among the set of proteins being screened. The fraction of these that is missed is the false-negative rate.
  Relatively simple: normally one has a set of previously determined interactions, or “gold standard”.
False positives: interactions reported by a screen that are incorrect
- To assess this, one needs a set of interactions that are known not to occur; any of these seen in a screen are false positives.
  Very difficult to obtain: how can you know that two proteins definitely do not interact?
- Tricks include taking pairs of proteins presumed never to see each other (e.g. proteins in different cellular compartments).
Von Mering et al, Nature, 2002

Error rates in interaction discovery: the old view
[Figure]
Von Mering et al, Nature, 2002

Error rates in interaction discovery: the new view
[Figure]
Yu et al, Nature, 2002

Sociological bias affects the perceived performance
Interactions determined on a protein-by-protein basis are focused on what the investigator wants to study, and are thus biased towards areas of biology that are currently popular. High-throughput techniques are used precisely to find new interactions. Using previously determined networks as a “gold standard” is therefore likely to judge them unfairly.
Braun et al, Nature Methods, 2008

Interaction data: predictions I
- Proteins in the same bacterial operon are typically functionally associated, and often physically interacting.
- Groups of proteins that are entirely absent from one or more organisms in a closely related set are often functionally or physically associated.
- Proteins that are separate in some organisms and fused in others are likely to interact physically.
Aloy & Russell, Nature Rev Mol Cell Biol, 2006

Interaction data: predictions II
- Pairs of proteins homologous to pairs seen to interact in known 3D structures can interact in the same way.
- A pair of domains often seen in interacting proteins can be used to infer an interaction between proteins that contain those domains but have not been observed to interact.
- The presence of a linear motif can indicate interactions with proteins known to bind this motif.
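The linear-motif idea above can be sketched as a pattern scan over a sequence. This is only a minimal illustration, assuming the simplified PxxP core that is often used to describe SH3-domain ligands; the function name `find_motif` and the example sequence are hypothetical, and a pattern hit suggests a candidate site rather than a confirmed interaction.

```python
import re

# Simplified PxxP core (assumption for illustration; curated motif
# definitions are more specific than this pattern).
# The lookahead lets overlapping occurrences be reported as well.
PXXP = re.compile(r"(?=(P..P))")


def find_motif(sequence, pattern=PXXP):
    """Return (1-based start position, matched text) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical sequence, not a real protein.
example = "MSAPLTPAKPPLPRS"
for start, text in find_motif(example):
    print(start, text)
```

Scanning with a lookahead rather than a plain pattern is a design choice: proline-rich stretches often contain overlapping candidate sites, and a non-overlapping scan would silently skip some of them.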
Aloy & Russell, Nature Rev Mol Cell Biol, 2006

Interaction databases
- Resources are very different in appearance and content.
- Efforts to provide a unified search/view are underway, but not complete.
- For now, one therefore needs to look at several sites to check whether an interaction is known.
- Some hold primary content (e.g. IntAct, MINT); others are processed and augmented (e.g. STRING) with additional predicted/inferred interactions.

Interaction networks
[Figure: example network with nodes Grb-2, Sos-1, RGS-4, Gα/q and RGS-3; labels: Node, Edge]

Biological interaction networks
[Figure: two nodes joined by an edge]
Nodes:
• Proteins
• Genes
• Chemicals
• Effects(?)
Edges:
• Physical interaction (e.g. yeast two-hybrid)
• Co-expression (e.g. microarrays)
• Same operon
• Regulation of gene expression (protein to gene)
• Catalysis (e.g. metabolic networks)

Interaction networks
Biological networks tend to be scale-free: most nodes (e.g. proteins) are connected to only a few others, while a handful of “hubs” make many more interactions. They are also “small-world”, in that any pair of nodes tends to be connected via a relatively small number of intermediate nodes.
Jeong et al, Nature, 2001

“Hubs” in networks
- Deleting a hub is more likely to be lethal. Jeong et al, Nature, 2001
- Hubs are more likely to be disordered. Haynes et al, PLoS Comp Biol, 2006

p53: the promiscuous transcription factor
[Figure]

Linear motifs in p53
[Figure: p53 sequence map with an IUPred disorder prediction]
Domains: DNA binding domain (95-289); Tetramerization domain (323-356)
Other labels: NES, MDM2, CYCLIN, USP7, S 386
Sites marked P (residue: enzymes):
9: Unknown
15: DNA-PK, RSK2, ATM
18: CK1s
20: CHK2
33: GSK-3s, CDK7, CDKs
37: DNA-PK/ATM
46: HIPK2
55: MAPKs
215: AuroraA
315: AuroraA, CDKs
371, 376, 378: CDK7
392: CDK2s, CDK7, EIF2AK2
Russell & Gibson, FEBS Lett, 2008
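The node, edge and hub vocabulary from the network slides can be sketched with a few lines of code. The pairs below are the bait/prey examples listed on the two-hybrid slide; the function names and the degree cut-off of 3 are assumptions chosen only for illustration, since the slides do not define a threshold for calling a node a hub.

```python
from collections import defaultdict

# Bait/prey pairs as listed on the two-hybrid slide.
PAIRS = [
    ("FUS3", "DIG2"), ("DIG2", "FUS3"), ("LSM2", "PAT1"), ("CKS1", "CLB1"),
    ("NPL4", "UFD1"), ("NPL4", "CDC48"), ("NPL4", "FUC1"), ("NPL4", "SUA7"),
]


def build_network(pairs):
    """Undirected adjacency map; reciprocal bait/prey hits collapse to one edge."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-pairs
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def degrees(neighbours):
    """Number of distinct partners per node."""
    return {node: len(partners) for node, partners in neighbours.items()}


def hubs(neighbours, min_degree):
    """Nodes with at least min_degree partners (the cut-off is arbitrary)."""
    return sorted(n for n, d in degrees(neighbours).items() if d >= min_degree)


network = build_network(PAIRS)
print(degrees(network))
print(hubs(network, min_degree=3))
```

In this toy set NPL4 is the only node with three or more partners; every other protein has a single partner, which mirrors, on a very small scale, the "few hubs, many poorly connected nodes" picture described for real networks.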