I. Searching for human splice-regulatory motifs & II. Network analysis of synthetic-lethal interactions
Fritz Roth
Harvard Medical School, Dept. of Biological Chemistry & Molecular Pharmacology
IPAM Workshop, Jan 2006

Outline
• Human alternative-splicing motif search
  • Review mRNA splicing
  • Splice-junction expression data
  • Sequence neighborhoods
  • Clustering splice-junctions by usage
  • Results
• Yeast synthetic-lethal network analysis

Brief review of canonical mRNA splicing
[Figure: adapted from Molecular Cell Biology, Lodish et al.]

Importance of alternative splicing
• Estimated number of human genes: ~100,000 (antiquity), 35,000 (2001), 25,000 (2004)
• Estimated fraction alternatively spliced: from 5% to ~70% of human multi-exon genes

From correlation to regulatory mechanism
Roth et al., 1998; Tavazoie et al., 1999; Hughes et al., 2000. Slides adapted from S. Tavazoie.

Sequence neighborhood of a splice junction
Conservation: constitutive vs. alternative
• 5’ donor neighborhood
• 3’ acceptor neighborhood
(Sorek & Ast, Genome Research, 2003)

Previous splicing motif searches
• Canonical splicing enhancers/repressors not associated with specific tissues (e.g. Brudno et al., Burge et al., Chasin et al.)
• Based on small (curated literature) sets of alternatively spliced genes (e.g. Brudno et al., Stamm et al., Fedorov et al.)
• Based on expressed sequence tag (EST) datasets (Xu et al.), which are biased towards 3’ ends of genes and can contain artificial splice variants due to UniGene clustering (Modrek & Lee, 2002)

Exon-exon splice junction expression data
[Figure: pre-mRNA with a cassette exon and its alternative mature mRNAs; each junction probe spans 18 nt + 18 nt]
• Every splice junction for ~11K reference mRNAs for ~10K genes
• ~100K probe sequences on five arrays
• 49 tissues & cell lines run in fluor-reversed pairs
• Full-length mRNA amplification
Johnson et al. Science. 2003 Dec 19;302(5653):2141-4.
Castle et al. Genome Biology. 2003;4(10):R66.
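The 18 nt + 18 nt junction probes can be illustrated with a minimal sketch. It assumes a probe target is simply the last 18 nt of the upstream exon joined to the first 18 nt of the downstream exon, in mRNA sense; the slides do not say whether the arrayed oligos are sense or antisense, and the function names and toy exon sequences below are hypothetical.

```python
def junction_probe(upstream_exon: str, downstream_exon: str, flank: int = 18) -> str:
    """Return the junction-spanning target: last `flank` nt of the upstream
    exon followed by the first `flank` nt of the downstream exon."""
    if len(upstream_exon) < flank or len(downstream_exon) < flank:
        raise ValueError("exon is shorter than the requested flank length")
    return upstream_exon[-flank:] + downstream_exon[:flank]


def cassette_junctions(exon1: str, cassette: str, exon3: str, flank: int = 18) -> dict:
    """Three junctions that report on a cassette exon: two for the
    inclusion isoform and one for the skipping isoform."""
    return {
        "include_5p": junction_probe(exon1, cassette, flank),
        "include_3p": junction_probe(cassette, exon3, flank),
        "skip": junction_probe(exon1, exon3, flank),
    }


# Toy sequences (hypothetical, 24 nt each), only to show the mechanics.
exon1 = "ATGGCCAAGCTGACCGGTTCAGTC"
cassette = "GGATCCTTAGCAATCGGTACCTGA"
exon3 = "TTCGAGCTCAAGGTCATCGGATAA"

probes = cassette_junctions(exon1, cassette, exon3)
for name, seq in probes.items():
    print(name, len(seq), seq)
```

Under this assumption, a cassette exon that is skipped in a tissue would show signal at the "skip" junction and reduced signal at the two inclusion junctions.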
Tissue-specific probes & data

Flowchart
• Splice junction usage data

Tissue-specific splice junction expression: NPTB
[Figure: probes (5’ to 3’) × tissues, shown in three steps: original intensities, rescaled intensities, relative splicing]

Tissue-specific splice junction expression: Synexin (ANXA7)
[Figure]

Flowchart
• Splice junction usage data
• Probe set choice (all, cassette exons, alternative donors / acceptors), giving clustered splice junctions

Results: heart & muscle
• Cassette exons skipped in heart & skeletal muscle

Flowchart
• Splice junction usage data
• Probe set choice (all, cassette exons, alternative donors / acceptors), giving clustered splice junctions
• Sequence neighborhood choice (all, intronic, exonic, proximal to donor, proximal to acceptor), giving sequence neighborhoods

Sequence neighborhood of a splice junction
Alternative splice junction neighborhoods
1. Immediate neighborhood
2. Neighborhood of junctions with shared splice site
3. Neighborhood of other competing junctions

Flowchart
• Splice junction usage data
• Probe set choice (all, cassette exons, alternative donors / acceptors), giving clustered splice junctions
• Sequence neighborhood choice (all, intronic, exonic, proximal to donor, proximal to acceptor), giving sequence neighborhoods
• Pattern finding (word-based, motif-based), giving enriched sequence patterns

Results: heart & muscle
Cassette exons skipped in heart & skeletal muscle
ACTAAC at end of intron
• 8 out of 39 probes
• 14-fold enrichment
• Strong position bias (often nearby)

Results: brain
Cassette exons skipped in brain
• 26% of probes (9-fold enrichment)

Results: ileum
• 65% (15-fold)
• 76% (3-fold)

Results: jejunum, liver, pancreas
• Cluster: HNRPL
• HNRPL domains: RRM1, RRM2, RRM3, RRM4
• PTB domains: RRM1, RRM2, RRM3, RRM4
• Protein interaction: PTB and HNRPL

Summary, Part I
• Tissue-specific alternative splicing
• Candidate cis-regulatory motifs
• Candidate cis-regulatory secondary structure
• Validated cis-regulatory motifs
• Map to trans-acting splicing factors

Acknowledgments, Part I
• Adnan Derti
• George Church & Lab
• Roth Lab
• Rosetta/Merck
• Jason Johnson
• John Castle
• Lee Lim
• Adrian Krainer

Outline
• Human alternative-splicing motif search
• Yeast synthetic-lethal network analysis
  • Background
  • Overlap with other biological relationships
  • Network motifs
  • Predicting synthetic lethality
  • Role of transcription compensation in mutational robustness
  • SSL vs. protein interaction in predicting function

What is Synthetic Lethality?
• Gene X mutated, Gene Y intact: cells live
• Gene X intact, Gene Y mutated: cells live
• Gene X and Gene Y both mutated: cells die

What is Synthetic Sickness/Lethality (SSL)?
• Gene X mutated, Gene Y intact: cells live
• Gene X intact, Gene Y mutated: cells live
• Gene X and Gene Y both mutated: cells die or grow slowly

An engine can run without one cylinder
(from http://www.cs.unc.edu/~geom/collide/videos.shtml)

Scenarios resulting in synthetic genetic interaction
• Partially redundant genes
• 2 partially redundant pathways
• 3 compensatory pathways, 2 required
• Protein complex tolerating 1 but not 2 mutations
[Figure: pathway and complex diagrams with nodes A to M and SSL edges]

A known sub-network of SSL interactions
• A Canadian consortium (Boone et al.) has made many double mutants
• As of 2001: 8 query genes × 4500 nonessential “array” genes ≈ 36,000 tested pairs (Tong et al., Science, 2001)

The known sub-network circa 2001
[Figure] (Tong et al., Science, 2001)

The known SSL sub-network circa 2004
• 160 query × 4500 nonessential ≈ 700,000 tested pairs (≈4% of pairs)
• ~3800 interactions
(Tong et al., Science, 2004)

Overlap between synthetic lethality & other “interactions”
S: pair is synthetic sick/lethal; C: pair has the characteristic.

Characteristic (C)                          S+C   S only   C only   Neither   P(C|S)   Odds   P(S|C)   P-value
Similar GO annotation                       779     1746    27256    698373     0.31      8     0.03    3E-460
Same MIPS phenotype                         389     2136     9453    716176     0.15     12     0.04    1E-273
Same GO annotation                          385     2140     8361    717268     0.15     13     0.04    6E-288
Same subcellular localization               113     2412     3500    722129     0.04      9     0.03     7E-68
Same MIPS complex                            82     2443     1445    724184     0.03     16     0.05     1E-67
Physical interaction                         96     2429     2586    723043     0.04     11     0.04     5E-63
Sequence homology (BLAST Eval < 10E-3)       47     2478     1769    723860     0.02      8     0.03     1E-25
Same MIPS complex (with no subcomplexes)     20     2505      166    725463     0.01     35     0.11     1E-23
Same MIPS protein class                      29     2496     1023    724606     0.01      8     0.03     4E-17
Same MCODE complex                           26     2499      906    724723     0.01      8     0.03     1E-15
Common physical interaction partner          33     2492     2911    722718     0.01      3     0.01     1E-08
Correlated mRNA expr (Cho, CC > 0.7)         46     2479     6171    719458     0.02      2    7E-03     3E-06
Physical interaction: APMS                   14     2511     1032    724597     0.01      4     0.01     3E-05
Physical interaction: APMS (HMS-PCI)          9     2516      699    724930    4E-03      4     0.01     1E-03
Physical interaction: Y2H                     5     2520      290    725339    2E-03      5     0.02     4E-03

[An additional overlaid row label, Phys U APMS "spoke", appears near the last rows; its row assignment is not recoverable.]
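The columns of the overlap table can be recomputed from the four pair counts. The sketch below is an illustration under stated assumptions, not the analysis behind the slides: it reads S as "pair is SSL" and C as "pair has the characteristic", takes the "Odds" column to be P(C|S) / P(C|not S) (this reproduces the rounded values for the rows checked), and uses a one-sided hypergeometric tail for the P-value. The slides do not name the test that was used, so the P-value here need not match the table. The function names are hypothetical.

```python
import math


def log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def overlap_stats(both: int, s_only: int, c_only: int, neither: int) -> dict:
    """Summarize a 2x2 overlap between SSL pairs (S) and pairs with a
    characteristic (C)."""
    n_s = both + s_only                      # all SSL pairs
    n_c = both + c_only                      # all pairs with the characteristic
    total = both + s_only + c_only + neither

    p_c_given_s = both / n_s
    p_s_given_c = both / n_c
    p_c_given_not_s = c_only / (c_only + neither)
    ratio = p_c_given_s / p_c_given_not_s    # assumed meaning of "Odds"

    # One-sided hypergeometric tail P(overlap >= both), summed in log
    # space because the probabilities underflow ordinary floats.
    log_denominator = log_choose(total, n_s)
    log_terms = [
        log_choose(n_c, k) + log_choose(total - n_c, n_s - k) - log_denominator
        for k in range(both, min(n_s, n_c) + 1)
    ]
    largest = max(log_terms)
    log_p = largest + math.log(sum(math.exp(t - largest) for t in log_terms))

    return {
        "p_c_given_s": p_c_given_s,
        "p_s_given_c": p_s_given_c,
        "ratio": ratio,
        "log10_p": log_p / math.log(10),
    }


# Counts from the "Similar GO annotation" row.
stats = overlap_stats(both=779, s_only=1746, c_only=27256, neither=698373)
print(stats)
```

For the "Similar GO annotation" row this gives P(C|S) ≈ 0.31, P(S|C) ≈ 0.03 and a ratio of about 8, in line with the table. The P-value is reported as log10 because values this small cannot be represented as ordinary floating-point numbers.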
Scenarios resulting in synthetic interaction

• Partially redundant genes
• 2 partially redundant pathways
• 3 partially redundant pathways, 2 required
• Protein complex tolerating 1 but not 2 destabilizing mutations

[Diagrams: one schematic per scenario, with genes labelled A to M and a paralog pair C1/C2. Slide annotations: "< 2%" and "SSL < 4% *".]

Outline

Human alternative-splicing motif search
Yeast synthetic-lethal network analysis
  Background
  Overlap with other biological relationships
  Network motifs
  Predicting synthetic lethality
  Role of transcription compensation in mutational robustness
  SSL vs. protein interaction in predicting function

“Network Motifs: Simple Building Blocks of Complex Networks”
R. Milo, S. Shen-Orr, ..., U. Alon, Science (2002).

The synthetic lethal network has many triangles
(Xiaofeng Xin, Boone Lab)

Motifs in an integrated S. cerevisiae network

Link types:
  S: Synthetic sickness or lethality
  H: Sequence homology
  X: Correlated expression
  P: Stable physical interaction
  R: Transcriptional regulation

[Figure: Motif Sets A to H. For each motif the slide gives the count in the real network (Nreal) and in randomized networks (Nrand), together with an example subgraph ("a network motif") and the larger structure it belongs to ("a network theme").]

One gene synthetic lethal with a complex?
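The Nreal versus Nrand comparison in the motif slides above can be illustrated for the simplest case, triangles in an undirected network. This is only a sketch under stated assumptions: the slides do not say how the randomized networks were generated, so degree-preserving edge swaps are assumed here; the toy edge list and all function names are hypothetical; and the real analysis covers several link types (S, H, X, P, R), not just one.

```python
import random
from collections import Counter
from statistics import mean, pstdev

# Hypothetical toy network with two triangles (A-B-C and C-D-E).
TOY_EDGES = [("A", "B"), ("B", "C"), ("A", "C"),
             ("C", "D"), ("D", "E"), ("C", "E"), ("E", "F")]


def degree_sequence(edges):
    """Number of links per node."""
    return Counter(node for edge in edges for node in edge)


def count_triangles(edges):
    """Count triangles in a simple undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization (assumed null model): swap edge endpoints."""
    edge_list = [tuple(sorted(e)) for e in edges]
    present = set(edge_list)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 == new2 or new1 in present or new2 in present:
            continue
        present -= {edge_list[i], edge_list[j]}
        present |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


def motif_enrichment(edges, n_random=100, swaps_per_edge=10, seed=0):
    """Return (Nreal, mean of Nrand, standard deviation of Nrand) for triangles."""
    rng = random.Random(seed)
    n_real = count_triangles(edges)
    n_rand = [count_triangles(rewire(edges, swaps_per_edge * len(edges), rng))
              for _ in range(n_random)]
    return n_real, mean(n_rand), pstdev(n_rand)


n_real, rand_mean, rand_sd = motif_enrichment(TOY_EDGES)
print(f"Nreal = {n_real}, Nrand = {rand_mean:.2f} +/- {rand_sd:.2f}")
```

Swapping endpoints keeps every node's number of links fixed, so any excess of triangles in the real network cannot be explained by hub genes alone.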
Motif Set H (link types R, P, X, H, S)

Motif   Nreal      Nrand
H1      6.1×10^2   (8.0±2.3)×10^1
H2      1.9×10^3   (5.3±0.5)×10^2
H3      1.1×10^3   (6.6±1.0)×10^2
H4      3.2×10^3   (2.5±0.2)×10^3
H5      8.2×10^2   (4.0±0.4)×10^2
H6      5.8×10^1   (1.8±0.6)×10^1

Motif Set G (link types S, H, P, X)

Motif   Nreal      Nrand
G1      1.7×10^3   (9.0±1.6)×10^1
G2      2.9×10^2   (7.2±1.2)×10^1
G3      8.6×10^3   (1.7±0.1)×10^3
G4      1.5×10^3   (6.1±0.5)×10^2
G5      1.2×10^3   (8.3±0.6)×10^2
G6      6.7×10^4   (3.1±0.1)×10^4

[Figure: example network motifs and the network themes they belong to. Genes shown: Sec72, Yke2, Gim3, Gim4, Gim5, Pac10. Link labels: S, S/H, P/X, P,X.]

Pair of synthetic lethal complexes?

[Figure: motifs and themes built from S/H and P/X links. Genes shown: Sec62, Sec63, Sec66, Sec72, Gim3, Gim4, Gim5, Pac10, Yke2. Link labels: S, P, P,X.]

Thematic map of synthetic-lethal complexes

Mapping pairs of synthetic-lethal complexes

Predicting Synthetic Lethality: Why?

Especially if the consortia led by Boone and Boeke are testing all yeast gene pairs?
Simple Answer: Even finished, this project is one strain, one organism, one phenotype (growth), and one growth condition.
Predicting Synthetic Lethality: Many weak predictors

(Same table of query characteristics as under “Overlap between synthetic lethality & other ‘interactions’”. The highest P(S|C) for any single characteristic is 0.11.)

Predicting Synthetic Lethality: Probabilistic decision trees

Predicting Synthetic Lethality: Cross-validation success

[Figure: ROC curves, True Positive Rate (Sensitivity) vs. False Positive Rate, for "xval: all characteristics", "xval: no 2hop characteristics", and "random".]

~80% sensitivity by testing ~20% of gene pairs (80/20 Rule!)

Upregulation of compensatory genes?

[Diagram: Gene M and Gene G joined by an SSL link.]
Anecdotally, this is rare (Lesage et al, 2004)

Upregulation of compensatory genes: Data sets examined

• mRNA expression, mutant vs. wild type (Rosetta compendium, Hughes et al, 2000)
• SSL genetic interactions (Tong et al.)

Upregulation of compensatory genes: Distribution of log(G_mutant_M / G_wt)

[Figure: histogram of log ratios (x axis, -2 to 2) against fraction of genes (y axis, 0 to 0.1) for 116,863 non-SSL pairs and 935 SSL pairs.]

Upregulation of compensatory genes: How common is it?

• Of 935 SSL M:G pairs examined, only thirteen went up significantly
• That is only four more than expected given the fraction that went up in ~120,000 non-SSL pairs

Thirteen examples of compensatory upregulation

Six previously observed examples
  FKS1   RLM1      Zhao, 1998
  FKS1   SLT2      Lesage, 2004
  GAS1   CHS3      Lesage, 2004
  GAS1   KRE11     Lesage, 2004
  GAS1   SLT2      Lesage, 2004
  GAS1   YAL053W   Lesage, 2004

Seven new examples
  BNI1   SLT2
  CDC42  GIC2      (translational compensation; Jaquenoud, 1998)
  FKS1   KAI1
  FKS1   PAL2
  GAS1   YMR316C-A
  SHE4   ARC40
  SHE4   CHS7

Transcriptional compensation for gene loss

• exists for a few SSL pairs
• is extremely rare

Rationalization

Regulatory apparatus to detect gene loss…
• Provides only a weak benefit if gene loss is rare
• Large mutational target
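The "four more than expected" comparison in the compensatory-upregulation slides above amounts to scaling a background rate to the 935 SSL pairs. The sketch below makes that arithmetic explicit. Two things are assumptions and not from the slides: the background rate is back-calculated from the stated excess (13 observed minus 4 gives 9 expected out of 935), since the number of upregulated non-SSL pairs is not given, and the binomial tail is simply one reasonable way to ask how surprising 13 is.

```python
from math import comb

N_SSL_PAIRS = 935        # from the slide
OBSERVED_UP = 13         # from the slide
STATED_EXCESS = 4        # "only four more than expected"

# ASSUMPTION: background rate back-calculated from the stated excess,
# because the slide does not give the non-SSL count directly.
ASSUMED_BACKGROUND_RATE = (OBSERVED_UP - STATED_EXCESS) / N_SSL_PAIRS


def expected_hits(n_pairs, background_rate):
    """Number of pairs expected to go up if SSL pairs behave like non-SSL pairs."""
    return n_pairs * background_rate


def binomial_upper_tail(k_observed, n, p):
    """P(X >= k_observed) for X ~ Binomial(n, p), via the complement of the lower tail."""
    lower = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_observed))
    return 1.0 - lower


expected = expected_hits(N_SSL_PAIRS, ASSUMED_BACKGROUND_RATE)
p_tail = binomial_upper_tail(OBSERVED_UP, N_SSL_PAIRS, ASSUMED_BACKGROUND_RATE)
print(f"expected = {expected:.1f}, observed = {OBSERVED_UP}")
print(f"P(X >= {OBSERVED_UP}) under the assumed background rate = {p_tail:.3f}")
```

With the real non-SSL count in hand, ASSUMED_BACKGROUND_RATE would be replaced by (upregulated non-SSL pairs) / 116,863.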
SSL vs protein interaction: The arena

[Diagram: overlapping sets of gene pairs: tested for genetic intxn, tested for protein intxn, tested for both.]
Data sets: 1 genetic, 5 protein

SSL vs protein interaction

Phys data set (overlapping w/ genetic)   Gene pairs
TAP spoke                                104409
TAP matrix                               104409
HMS-PCI spoke                            168114
HMS-PCI matrix                           168114
Y2H                                      566423

GO characteristic    Predictive interaction   Accuracy   Sensitivity
shared Process       physical                 0.42       0.006
                     genetic                  0.19       0.041
shared Function      physical                 0.31       0.008
                     genetic                  0.15       0.061
shared Component     physical                 0.29       0.012
                     genetic                  0.08       0.048

Protein intxns are more accurate. SSL intxns are more sensitive.

SSL vs protein interaction: combining with other relationships

• Common regulator
• Gene co-occurrence
• Gene fusion
• Gene neighborhood
• Homology
• mRNA coexpression
• Chromosomal distance
• Same localization
• Same phenotype
• Many ‘2-hop’ relationships

SSL vs protein interaction: combining with other relationships

[Figure: ROC curves, True Positive Rate (Sensitivity) vs. False Positive Rate (1 – Specificity), for: all characteristics; no P; no G; no P, G; P only; G only; P, G only; random.]

Summary, Part II

• SSL & other biological relationships
• Motifs in an integrated S. cerevisiae network
• Map of compensatory complexes
• Predicting synthetic lethality
• Transcriptional compensation plays a minor role in robustness to gene loss
• SSL vs. protein interaction in predicting function: SSL wins but is complementary

Acknowledgments, Part II

Roth Lab: Sharyl Wong, Lan Zhang, Oliver King, Debra Goldberg, Gabriel Berriz, Frank Gibbons, others
Data Sources: SGD, MIPS, YPD
Canadian Synthetic Lethal Team: Charlie Boone, Amy Tong, Guillaume Lesage, Howard Bussey, Brenda Andrews, Xiaofeng Xin, Gary Bader, Zhijian Li, others
Marc Vidal

Probabilistic Decision Trees

• Model the conditional probability of one variable given a combination of others.
• Are capable of integrating many variables (built-in feature selection).
• Provide intuition about why predictions are made.
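The three properties listed for probabilistic decision trees can be seen in a very small example. The sketch below is a toy illustration under loud assumptions, not the model used in the talk: the three binary features, the eight training pairs, the entropy-based split rule, the min_split stopping rule, and the Laplace-smoothed leaf probabilities are all hypothetical choices made here for clarity.

```python
from collections import Counter
from math import log2

# Hypothetical binary characteristics of a gene pair.
FEATURES = ["same_go", "physical_interaction", "shared_ssl_partner"]

# Hypothetical training pairs: (same_go, physical_interaction, shared_ssl_partner, is_ssl).
TOY = [(1, 0, 1, 1), (1, 1, 1, 1), (0, 0, 1, 1), (0, 1, 1, 0),
       (1, 0, 0, 0), (0, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0)]
ROWS = [dict(zip(FEATURES, t[:3])) for t in TOY]
LABELS = [t[3] for t in TOY]


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values()) if n else 0.0


def best_split(rows, labels):
    """Pick the feature with the largest information gain (built-in feature selection)."""
    base, best_gain, best_feature = entropy(labels), 0.0, None
    for f in FEATURES:
        yes = [y for r, y in zip(rows, labels) if r[f]]
        no = [y for r, y in zip(rows, labels) if not r[f]]
        if not yes or not no:
            continue
        gain = base - (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        if gain > best_gain + 1e-12:
            best_gain, best_feature = gain, f
    return best_feature


def grow(rows, labels, min_split=4):
    """Grow a tree whose leaves store a smoothed estimate of P(SSL | path)."""
    f = best_split(rows, labels) if len(labels) >= min_split else None
    if f is None:
        # Laplace smoothing (a choice made for this sketch) avoids 0 or 1 at tiny leaves.
        return {"p_ssl": (sum(labels) + 1) / (len(labels) + 2)}
    yes = [(r, y) for r, y in zip(rows, labels) if r[f]]
    no = [(r, y) for r, y in zip(rows, labels) if not r[f]]
    return {"feature": f,
            "yes": grow([r for r, _ in yes], [y for _, y in yes], min_split),
            "no": grow([r for r, _ in no], [y for _, y in no], min_split)}


def predict(tree, row):
    """Conditional probability of SSL given the pair's characteristics."""
    while "feature" in tree:
        tree = tree["yes"] if row[tree["feature"]] else tree["no"]
    return tree["p_ssl"]


def explain(tree, row):
    """The sequence of tests that led to the prediction."""
    path = []
    while "feature" in tree:
        value = bool(row[tree["feature"]])
        path.append((tree["feature"], value))
        tree = tree["yes"] if value else tree["no"]
    return path


TREE = grow(ROWS, LABELS)
query = {"same_go": 1, "physical_interaction": 0, "shared_ssl_partner": 1}
print(predict(TREE, query), explain(TREE, query))
```

The path returned by explain is what gives the intuition mentioned in the last bullet: each prediction can be read back as a short list of yes/no tests on the pair's characteristics.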