Predicting Highly Connected Proteins in PIN using QSAR
Art Cherkasov, UBC / VGH
artc@interchange.ubc.ca
Apr 14, 2011
THE UNIVERSITY OF BRITISH COLUMBIA

What is Chem(o)informatics ???
Chemical Space: Navigation (Grouping), i.e. sorting the hits from the GARBAGE.
Target Structure?
   If not (-): LIGAND-BASED METHODS - QSAR, FP similarity, Clustering, MolFields, etc.
   If so (+): STRUCTURE-BASED METHODS - Docking works? De Novo works?
Traditional (Conventional) Drug Design Modes
Cheminformatics !!!

Predictive QSAR Modeling Workflow* is complex
1. Original Dataset
2. Split into Training, Test and External Validation sets (Multiple Training Sets, Multiple Test Sets)
3. Combi-QSAR Modeling, with Y-randomization
4. Activity Prediction: only accept models that have Q^2 > 0.6, R^2 > 0.6, etc.
5. Validated Predictive Models with High Internal & External Accuracy
6. External validation Using Applicability Domain
7. Database Screening Using Applicability Domain
8. Experimental Validation
*Tropsha, A. Best Practices for QSAR Model Development, Validation, and Exploitation. Mol. Inf., 2010, 29, 476-488
CHEMBENCH.MML.UNC.EDU

QSAR: “Quantitative Structure-Activity Relationships”
[Chart: PubMed Citations. Series: QSAR papers in PubMed; Protein structures in PDB (x100); compounds in CAS (x100k). Vertical axis: 0 to 1000.]
from A. Cherkasov & A. Tropsha, Nature Drug Discovery Reviews, 2011 (in progress)

Specifics of the talk:
1. Chemical Space: Quantification (Modeling) and Navigation (Grouping)
   a. Ligand QSAR: Concept: Consensus Modeling
   b. Ligand QSAR: Examples: BML Model, Antibiotics
2. Peptide QSAR: Example: Antimicrobial Peptides
3. Protein QSAR: Example: “Hubs” in PINs

Principles of QSAR modeling: Compounds, Descriptors, Functions, Activity
[Figure: COMPOUNDS (chemical structures) are linked through DESCRIPTORS to a column of numeric ACTIVITY values (0.613, 0.380, -0.222, 0.708, ...): Quantitative Structure Activity Relationships. The same scheme with PROPERTY in place of ACTIVITY gives Quantitative Structure Property Relationships. Slide by A. Tropsha, 2010]

Compounds: Chemical Universe
10^40 to 10^120 compounds with C, H, O, N, P, S, F, Cl, Br, I, and MW < 500 ??
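As a rough illustration of how a compound can be turned into numbers, the sketch below derives a few toy descriptors (molecular weight, heavy-atom count, heteroatom count) from a molecular formula and checks it against the slide's element set and MW < 500 limit. The function names, the formula-string input and the choice of descriptors are assumptions made for this example only; descriptors used in practice are computed from full structures rather than formulas.

```python
import re

# Standard atomic masses (g/mol) for the elements named on the slide.
ATOMIC_MASS = {
    "C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007, "P": 30.974,
    "S": 32.06, "F": 18.998, "Cl": 35.45, "Br": 79.904, "I": 126.904,
}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C2H6O'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def toy_descriptors(formula):
    """Translate a formula into a small, purely illustrative descriptor set."""
    counts = parse_formula(formula)
    unknown = set(counts) - set(ATOMIC_MASS)
    if unknown:
        raise ValueError(f"elements outside the slide's set: {sorted(unknown)}")
    mw = sum(ATOMIC_MASS[element] * n for element, n in counts.items())
    heavy = sum(n for element, n in counts.items() if element != "H")
    return {
        "mw": round(mw, 3),
        "heavy_atoms": heavy,
        "hetero_atoms": heavy - counts.get("C", 0),
    }


def in_slide_universe(formula, mw_limit=500.0):
    """True if the formula uses only the listed elements and has MW below the limit."""
    return toy_descriptors(formula)["mw"] < mw_limit


print(toy_descriptors("C2H6O"))      # ethanol
print(in_slide_universe("C40H80"))   # a larger hydrocarbon
```

Each compound becomes a fixed-length numeric record, which is the form the modeling functions discussed later in the talk operate on.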
Descriptors: “Inductive” etc.
[Equations: definitions of the inductive descriptors, written as sums over atoms i, j of terms in R_j^2, R_i^2 and r_{j-i}^2, together with the quantities Q_j, s_MOL and 1/s_MOL]

Functions: MLR, PLS, kNN, SVM, ANN, Binary Regression, Decision Tree, RandomForest, PCA, Hybrid Methods, LDA, etc.
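To make one of the listed functions concrete, here is a minimal sketch of kNN used as a regression function: the activity of a query compound is predicted as the mean activity of its k nearest training compounds in descriptor space. The function name, the Euclidean metric, the unweighted mean and the toy data are all assumptions for illustration, not the settings used in the studies discussed in the talk.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict activity as the mean activity of the k nearest training compounds."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training compounds")
    # Pair every training compound's distance to the query with its activity.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical two-descriptor training set with measured activities.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [1.0, 1.2, 0.8, 3.0, 3.4]

print(knn_predict(train_x, train_y, (0.2, 0.1), k=3))
print(knn_predict(train_x, train_y, (5.5, 5.0), k=2))
```

The other functions on the slide (MLR, PLS, SVM, ANN and so on) play the same role: each maps a descriptor vector to a predicted activity.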
Activity: Continuous, Binary
Molecular structure gets translated into numbers (descriptors):
f(Descriptors) ~ Activity, with Activity either Continuous or Binary

Chemical Space: Activity (Property) Quantification
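The relation f(Descriptors) ~ Activity can be sketched with the simplest possible f: a straight line fitted to one descriptor. The example below fits the line, computes a leave-one-out Q2 on the training compounds and an R2 on held-out test compounds, and applies the Q2 > 0.6, R2 > 0.6 acceptance rule from the workflow slide. The data are invented, the function names are hypothetical, and evaluating R2 on a held-out test set is one assumed reading of that rule.

```python
def fit_line(xs, ys):
    """Ordinary least squares for activity ~ slope * descriptor + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def loo_q2(xs, ys):
    """Leave-one-out Q2: each compound is predicted by a model fitted without it."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return r_squared(ys, predictions)


def accept_model(q2, r2, cutoff=0.6):
    """Acceptance rule quoted on the workflow slide."""
    return q2 > cutoff and r2 > cutoff


def to_binary(activities, threshold):
    """Turn continuous activities into a binary active (1) / inactive (0) label."""
    return [1 if a >= threshold else 0 for a in activities]


# Invented continuous data: one descriptor, one measured activity per compound.
train_x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
train_y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
test_x = [0.75, 2.25]
test_y = [1.6, 4.4]

slope, intercept = fit_line(train_x, train_y)
q2 = loo_q2(train_x, train_y)
r2 = r_squared(test_y, [slope * x + intercept for x in test_x])
print(q2, r2, accept_model(q2, r2))
print(to_binary(train_y, threshold=3.0))
```

For a binary activity, the same descriptors would feed a classifier instead; the thresholding helper only shows how a continuous endpoint can be recast as a binary one.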
EXAMPLE: QSAR TOX CONSENSUS MODELING: 2008

Overview of QSAR modeling approaches employed by six cheminformatic groups involved in this study.
Group ID | Modeling Techniques | Descriptor Type | Applicability Domain Definition
UNC | kNN, SVM | MolconnZ, Dragon | Euclidean distance threshold between a test compound and compounds in the modeling set
ULP | MLR, SVM, kNN | Fragments (ISIDA), Molecular (CODESSA-Pro) | Euclidean distance threshold between a compound and compounds in the modeling set; bounding box
UI | MLR/OLS | Dragon | Leverage approach
UK | PLS | Dragon | Residual standard deviation and leverage within the PLSR model
VCCLAB | ASNN | E-state indices | Maximal correlation coefficient of the test molecule to the training set molecules in the space of models
UBC | MLR, ANN, SVM, PLS | IND_I | Range of independent variable values in the training set +/- 15%

Model | Group ID | Modeling Set (n=644): Q2abs | MAE | Coverage (%) | Validation Set I (n=339): R2abs | MAE | Coverage (%) | Validation Set II (n=110): R2abs | MAE | Coverage (%)
kNN-Dragon | UNC | 0.92 | 0.22 | 100 | 0.85 | 0.27 | 80.2 | 0.72 | 0.33 | 52.7
kNN-MolconnZ | UNC | 0.91 | 0.23 | 99.8 | 0.84 | 0.30 | 84.3 | 0.44 | 0.39 | 53.6
SVM-Dragon | UNC | 0.93 | 0.21 | 100 | 0.81 | 0.31 | 80.2 | 0.83 | 0.27 | 52.7
SVM-MolconnZ | UNC | 0.89 | 0.25 | 100 | 0.83 | 0.30 | 84.3 | 0.55 | 0.37 | 53.6
ISIDA-kNN | ULP | 0.77 | 0.37 | 100 | 0.73 | 0.36 | 78.5 | 0.63 | 0.37 | 42.7
ISIDA-SVM | ULP | 0.95 | 0.15 | 100 | 0.76 | 0.32 | 100 | 0.38 | 0.50 | 100
ISIDA-MLR | ULP | 0.94 | 0.20 | 100 | 0.81 | 0.31 | 95.9 | 0.65 | 0.41 | 51.8
CODESSA-MLR | ULP | 0.72 | 0.42 | 100 | 0.71 | 0.44 | 100 | 0.58 | 0.47 | 100
OLS | UI | 0.86 | 0.30 | 92.1 | 0.77 | 0.35 | 97.0 | 0.59 | 0.43 | 98.2
PLS | UK | 0.88 | 0.28 | 97.7 | 0.81 | 0.34 | 96.1 | 0.59 | 0.40 | 95.5
ASNN | VCCLAB | 0.83 | 0.31 | 83.9 | 0.87 | 0.28 | 87.4 | 0.75 | 0.32 | 71.8
PLS-IND_I | UBC | 0.76 | 0.39 | 100 | 0.74 | 0.39 | 99.7 | 0.45 | 0.54 | 100
MLR-IND_I | UBC | 0.77 | 0.39 | 100 | 0.75 | 0.40 | 99.7 | 0.46 | 0.53 | 100
ANN-IND_I | UBC | 0.77 | 0.39 | 100 | 0.76 | 0.39 | 99.7 | 0.46 | 0.53 | 100
SVM-IND_I | UBC | 0.79 | 0.31 | 100 | 0.79 | 0.35 | 99.7 | 0.53 | 0.46 | 100
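The applicability domain (AD) definitions and the Coverage (%) columns above can be illustrated with a short sketch. This is a hypothetical Python illustration, not any group's actual code. `range_domain` is one reading of the UBC rule, assuming that "+/- 15%" means widening each descriptor's training range by 15% of that range. `distance_domain` is one possible form of a Euclidean distance threshold; the nearest-neighbour criterion and the cutoff parameter `z` are assumptions. `coverage` is the percentage of compounds that fall inside the domain.

```python
import numpy as np

def range_domain(X_train, X_test, margin=0.15):
    """True for test rows whose descriptors all lie inside the training
    range, widened on each side by `margin` times that range (assumed rule)."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    lo = X_train.min(axis=0)
    hi = X_train.max(axis=0)
    pad = margin * (hi - lo)
    return np.all((X_test >= lo - pad) & (X_test <= hi + pad), axis=1)

def distance_domain(X_train, X_test, z=0.5):
    """True for test rows whose nearest training compound is closer than
    mean + z * std of the training set's nearest-neighbour distances
    (the cutoff form is an assumption made for illustration)."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Pairwise Euclidean distances inside the training set.
    d_train = np.linalg.norm(X_train[:, None, :] - X_train[None, :, :], axis=2)
    np.fill_diagonal(d_train, np.inf)  # ignore self-distances
    nn = d_train.min(axis=1)
    threshold = nn.mean() + z * nn.std()
    # Distance from each test compound to its nearest training compound.
    d_test = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    return d_test.min(axis=1) <= threshold

def coverage(mask):
    """Percentage of compounds inside the applicability domain."""
    return 100.0 * float(np.mean(mask))

X_train = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
X_test = [[1.0, 1.0], [2.2, 0.5], [3.0, 0.0]]
in_range = range_domain(X_train, X_test)
in_dist = distance_domain(X_train, X_test)
print(in_range, coverage(in_range))
print(in_dist, coverage(in_dist))
```

A stricter domain lowers coverage, which is the trade-off visible in the table: models with high accuracy on Validation Set II often predict only about half of its compounds.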
Model | Modeling Set (n=644): Q2abs | MAE | Coverage (%) | Validation Set I (n=339): R2abs | MAE | Coverage (%) | Validation Set II (n=110): R2abs | MAE | Coverage (%)
Consensus Model Ia | 0.92 | 0.23 | 100 | 0.85 | 0.29 | 100 | 0.67 | 0.39 | 100
Consensus Model IIb | 0.92 | 0.22 | 100 | 0.87 | 0.27 | 100 | 0.70 | 0.34 | 100
Consensus Model IIBc | 0.92 | 0.22 | 100 | 0.87 | 0.27 | 100 | 0.70 | 0.36 | 100
Consensus Model IIId | 0.92 | 0.22 | 100 | 0.86 | 0.28 | 99.7 | 0.70 | 0.34 | 98.2
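The consensus rows can be read against a minimal sketch of consensus prediction. The slides do not define how Consensus Models I to III were built, so the code below is not a reconstruction of them. It shows one common scheme, assumed here for illustration: average the predictions of those individual models whose applicability domain covers the compound. The function names are hypothetical, and `evaluate` assumes that R2abs denotes 1 - SS_res / SS_tot.

```python
import numpy as np

def consensus_predict(preds, in_domain):
    """Average predictions over the models whose domain covers each compound.

    preds:     (n_models, n_compounds) array of predicted activities
    in_domain: boolean array of the same shape (True = inside that model's AD)
    Returns NaN for compounds that no model covers.
    """
    preds = np.asarray(preds, dtype=float)
    in_domain = np.asarray(in_domain, dtype=bool)
    counts = in_domain.sum(axis=0)
    sums = np.where(in_domain, preds, 0.0).sum(axis=0)
    out = np.full(preds.shape[1], np.nan)
    covered = counts > 0
    out[covered] = sums[covered] / counts[covered]
    return out

def evaluate(y_true, y_pred):
    """R2 (as 1 - SS_res / SS_tot), MAE, and coverage (%) over predicted compounds."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    covered = ~np.isnan(y_pred)
    yt, yp = y_true[covered], y_pred[covered]
    ss_res = np.sum((yt - yp) ** 2)
    ss_tot = np.sum((yt - yt.mean()) ** 2)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": float(np.mean(np.abs(yt - yp))),
        "coverage": 100.0 * float(covered.mean()),
    }

# Two toy models, three compounds; the third compound is outside both domains.
preds = [[1.0, 2.0, 3.0], [2.0, 4.0, 5.0]]
in_domain = [[True, True, False], [True, False, False]]
y_cons = consensus_predict(preds, in_domain)
stats = evaluate([1.0, 2.5, 9.0], y_cons)
print(y_cons, stats)
```

Because each compound only needs to be covered by at least one model, a consensus of this kind can approach 100% coverage even when individual models cover far less, which matches the pattern in the table.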
QSAR CONSENSUS MODELING: 2010
Geography of collaboration: 40 scientists, 15 institutions
Slide by A. Tropsha, 2011

Chemical Space: Navigation (Grouping)

THE UNIVERSITY OF BRITISH COLUMBIA MEDICINE INFECTIOUS DISEASES

[Figure: two panels, "Retrieval of Antibiotic Compounds" and "Retrieval of Human Metabolites". Y axis: retrieval percentage (0 to 100). X axis: query number / query points (0 to 700). Curves for R=0.1, R=0.15, and R=0.2, each with a matching random baseline.]
ATCACACAGTCATACACGTTCTAACTCCAGACTGACTGTTGAGAAAGCCTCTGGGTAAGGGAATTCCTGGGAAACACACTGTTTTCATGCATCCTCTGGAAGATGAGGCCTGAAGTTACCAGGGTCTCTGTTTGCTGATGC TGATGATCCACATTTTCTAGCCCACTCTGCTTCTCTGACACCTTTAGTCTTGAGGATCCATGNTCTGTGAAGGAATCCAAGCTCTCATTTCGCACTCACCTTGGCCCTGGCTCTGTCTCCAGGACCTCTTCTACTACAAAA TCCTAAAGCTCTGGGAGCTGGGTGTCAACCTGTGCCCGAGGAAATCATACAGTTACTGTGGACTTTCCAGTTTGCTGTCTTCTAGTATTCCATTGTAGCTCTTGGGTATTTTCCCATCCACCCCAAGATCCAGCTGGAAAT CAGTGAACACACTTGATGGGAGTTTTCCTGCATGTGCTCTGGGCATTGACAGTAGAAGGGTGTTCAGAATGTCTGCTGTGCCCTCATGGAGGAAGAGNGCTCAGTGTACATGCTCTGGGTCAGTAGGTGCCCTTGAGCCCA GCTTTGGGAGCAATGTTGGATGAGTGAAGGAGGGATCCAGGGCAAAGCAGGCACGACAGAGTGGAGACGGCGCTGCTGGCTCTCAGGGGAATGGGCATGGAGTGGGTAGGAGATCCACCTAAGGAGGCTGGCTGGCTGGAC GAGTCAGGAGCCCCTTCCAAGGGTGGACACTGACAGGCCCCCAGTCTTGGTCTCCTGCATGCCAGAGGTACCAGCCCATCTTTTTTCCTAAACTTGATGACCTAGGGCTAGGGGCATGTTGAATCTCAGCCTCGCCCACTG GCGCTGGACTTGGTACACAGGGTGGGGCAAAGTGGGTACTGGATCCTGATCATCCCTATCCCTGGGGTGTGGCTTCTTGCTGCACAGTCAGCTTCTAGTTCTGTAGCCCCAGCTGCTCCTGCGGTGGAGGGAGCTACACAT CAGGCTCTGACCCCCTCCAGGTGGGGCCTTCGCGTGAGGGGAGTCAGCACGCATCAGCAGCTGGGCCCAGGGAGTTGCCCCACTGAGCACTGCGGGCTGACCTGCTCCCAACCAGGGAGATGGAGCTTCCCCCTTGAGTCG GGCTGCTGAAGGGGGGTAGGGGATGGAAACAGTGCGTTTGCAGGAGTAAGGGTGCAGTTGGGTCCCTGCGAGAAAATGTCTCAGTTGTGGCAACTGATTGGTGACCTGGGGGGCGTTTCTGAGCCCACAGTGCTGGCATCA GGACTCAGGTGTGAGGTGCCCCAGACCCTCCCCTTGCCAGTAATTAGCTGATGGCTCGGTGATGCCCAGGGTGAAGGAAGACTTGATTTTGGGAGGGGAGTTCTCTCGTAATGACACTGAGGATG CCTTCAAGTTGGGCTT CTGGCATGTTCTGCCCTCGCTCCCCTTCTGTAGTCACCTTGGCCCTCGTGTTGCTGAGCTGTGTGTGGGAGCGGGAAGCGCGTCAGTGGGCGGAGGGAGCGGGAAGCGCGTCAGTGGGCGGAGTATTTGAGAACATTTCAC AAGCCGCTGTTGAGGTTCAGAATCAACCAGCAGATACAGAAACATATTTCGGAGCGTGGGGACCCTTGGGTGAGCTGCCACATGAAGCAGCCCCAGGACCTCCCTGGCTCAAGGAGTGACAGCGAGTTTGTCTGAGGTGAG GGCACAGGCCTGGCGAAGCCTCGTGTGTGGGTGAGACCTGCCCGACCCCAGTGCCTTACCCGAGGAGCTACTGGCCCAGTGGGGGAGGCATTCAGGTGGGCAGAGTCAGGGAGACTCATGAGGCCGTTGAGGCCAGGGGCA TAGAGCTGGCCAAGGAGCCATGGCTCACTAACGTGTTGTATGGGGCTCCTTCCCTTCAGGTCCAGGCTCCTGCGTGAAGTGATGCTCCTCTTTGCCTTACTCCTAGCCATGGAGCTCCCATTGGTGGCAGCCAGTGCCACC 
ATCGCGCTCAGTGTAAGTATCATTCCCTCTCACTGTCCTGGAGAGGACGAGAATTCCACCTGCCAGTGCCTTACCCGAGGAGCTACTGGCCCAGTGGGGGAGGCATTCAGGTGGGCAGAGTCAGGGAGACTCATGAGGCCG TTGAGGCCAGGGGCATAGAGCTGGCCAAGGAGCCATGGCTCACTAACGTGTTGTATGGGGCTCCTTCCCTTCAGGTCCAGGCTCCTGCGTGAAGTGATGCTCCTCTTTGCCTTACTCCTAGCCATGGAGCTCCCATTGGTG GCAGCCAGTGCCACCATCGCGCTCAGTGTAAGTATCATTCCCTCTCACTGTCCTGGAGAGGACGAGAATTCCACCTGGAGATTCTGGGCCACTTTGGTTCCCCATGAGCCAAGACGGCACTTCTAATTTGCATTCCCTACC GGAGTCCCTGTCTGTAGCCAGCCTGGCTTTCAGCTGGTGCCCAAAGTGACAAATGTATCTGCAATGACAAAGGTACCCTGGAAGGGCTCGCCCTCTGCGGAATTTCAGTTCATGCAGGCCTTGGTGCTTCCACATCTGTCC AAGGGCCTTTCAAATGTGACTTTTAACTCTGTGGATTGATTTGCCCGGTTGTCACATTCTGAGCAGCCACAACCTACTGCATCCCATGTAGAAGTGGAAGTGACCTGATTTTTTCCTGCTTTTCAAGGCTGTATGTTTACA TTTGCCTCCAATCATTCCTATGGGAATTCCTTGGGAGTCTAACTTGGAGATTTTGTTTCTTCTGCCTTTGCTCCTGGGGGCTTAATCACTTCTGTGCCTCTGGTTATCTGTGGCACATTTGTATTTGTCATTAGTCAACCG GAGACTCGGGGTCTGAGTGGAGGGTATGTCCCCCTCCAGTGATGGTTTCTGTTGGCTTCCCAGGGTGAGGATGACTCATGACCACTTGCAAGTGGTTTTTGTGTCTGGGGTTTATGATCACACAGTCATACACGTTCTAAC TCCAGACTGACTGTTGAGAAAGCCTCTGGGTAAGGGAATTCCTGGGAAACACACTGTTTTCATGCATCCTCTGGAAGATGAGGCCTGAAGTTACCAGGGTCTCTGTTTGCTGATGCTGATGATCC ACATTTTCTAGCCCAC TCTGCTTCTCTGACACCTTTAGTCTTGAGGATCCATGNTCTGTGAAGGAATCCAAGCTCTCATTTCGCACTCACCTTGGCCCTGGCTCTGTCTCCAGGACCTCTTCTACTACAAAATCCTAAAGCTCTGGGAGCTGGGTGT CAACCTGTGCCCGAGGAAATCATACAGTTACTGTGGACTTTCCAGTTTGCTGTCTTCTAGTATTCCATTGTAGCTCTTGGGTATTTTCCCATCCACCCCAAGATCCAGCTGGAAATCAGTGAACACACTTGATGGGAGTTT TCCTGCATGTGCTCTGGGCATTGACAGTAGAAGGGTGTTCAGAATGTCTGCTGTGCCCTCATGGAGGAAGAGNGCTCAGTGTACATGCTCTGGGTCAGTAGGTGCCCTTGAGCCCAGCTTTGGGAGCAATGTTGGATGAGT GAAGGAGGGATCCAGGGCAAAGCAGGCACGACAGAGTGGAGACGGCGCTGCTGGCTCTCAGGGGAATGGGCATGGAGTGGGTAGGAGATCCACCTAAGGAGGCTGGCTGGCTGGACGAGTCAGGAGCCCCTTCCAAGGGTG GACACTGACAGGCCCCCAGTCTTGGTCTCCTGCATGCCAGAGGTACCAGCCCATCTTTTTTCCTAAACTTGATGACCTAGGGCTAGGGGCATGTTGAA db50 THE UNIVERSITY OF BRITISH COLUMBIA MERCK Database QSAR annotation as Antibiotics and BML 
Lead compound / Metabolite analogue   Merck annotation                   BML score   Antibiotic-likeness (30-10-2 ANN)
Lovastatin / NP-007587                Antilipemic                        0.71        0.56                                CONFIRMED
Olivomycin A / NP-009248              Antineoplastic; Cytostatic agent   0.72        0.99                                CONFIRMED
Gentisic acid / NP-001423             Analgesic; Antiinflammatory        0.74        0.66                                CONFIRMED
QSAR models for: distinguishing Antimicrobials from Drugs; distinguishing Antimicrobials from Drug-likes; Antimicrobials versus all others; Drugs versus Drug-likes; Antimicrobials from Bacterial Metabolites.

         Model 1        Model 2        Model 3        Model 4        Model 5
         Train  Test    Train  Test    Train  Test    Train  Test    Train  Test
T_P      327    130     332    140     294    124     270    89      360    139
T_N      631    248     841    342     1490   621     1486   644     792    347
F_P      49     33      7      14      32     20      17     14      39     26
F_N      33     35      30     23      66     41      108    58      48     19
SPEC     0.93   0.88    0.99   0.96    0.98   0.97    0.99   0.98    0.95   0.93
SENS     0.91   0.79    0.92   0.86    0.82   0.75    0.71   0.61    0.88   0.88
ACCUR    0.92   0.85    0.97   0.93    0.95   0.92    0.93   0.91    0.93   0.92
PPV      0.87   0.80    0.98   0.91    0.90   0.86    0.94   0.86    0.90   0.84
NPV      0.95   0.88    0.97   0.94    0.96   0.94    0.93   0.92    0.94   0.95

A number of QSAR models have been elaborated to separate individual clusters within the dataset of 958 human therapeutics, 519 antimicrobials, 1202 drug-like chemicals, as well as 1102 human-, 551 bacterial-, 2351 plant- and 825 fungal metabolites.
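The SPEC, SENS, ACCUR, PPV and NPV rows of the table follow from the T_P, T_N, F_P and F_N counts by the standard confusion-matrix definitions. The sketch below recomputes them for the first training column (T_P = 327, T_N = 631, F_P = 49, F_N = 33). The class name `Confusion` and its `metrics` method are illustrative names chosen here and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary classifier (hypothetical helper, not from the talk)."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives

    def metrics(self) -> dict:
        total = self.tp + self.tn + self.fp + self.fn
        return {
            "SPEC": self.tn / (self.tn + self.fp),   # specificity
            "SENS": self.tp / (self.tp + self.fn),   # sensitivity (recall)
            "ACCUR": (self.tp + self.tn) / total,    # overall accuracy
            "PPV": self.tp / (self.tp + self.fp),    # positive predictive value
            "NPV": self.tn / (self.tn + self.fn),    # negative predictive value
        }


# First training column of the table above.
train_model_1 = Confusion(tp=327, tn=631, fp=49, fn=33)
rounded = {name: round(value, 2) for name, value in train_model_1.metrics().items()}
print(rounded)
```

Rounded to two decimals, the values agree with the first training column of the table (0.93, 0.91, 0.92, 0.87, 0.95).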
Separation of various classes of substances in the chemical space
[Figure: clusters labelled General drugs, Bacterial metabolites, Inactive Chemicals and Antibacterials]

The two acyl hydrazone-based in silico hits with potent selective inhibitory activity towards MRSA Pyruvate Kinase.
FP search of ZINC db with BML scoring

Table columns: Compound; Structure; IC50 (mM); MRSA Growth inhibition (%); PK Human M1; PK Human M2; PK Human R; PK Human L; PK S. aureus; HeLa
IS-63:  0.85, 450, 519, 450, 38, 10, 13
IS-130: 0.091, 375, 125, 350, 350, 35, 0
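The "FP search" mentioned above is a fingerprint similarity search. The slides do not say which fingerprints or similarity measure were used, so the following is only a minimal sketch, assuming binary fingerprints stored as sets of "on" bit indices and the widely used Tanimoto coefficient. The compound names and bit sets are made up for illustration, and the subsequent BML scoring step is not modelled.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query: frozenset, library: dict) -> list:
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each set lists the indices of bits that are on.
query_fp = frozenset({1, 4, 7, 9})
library = {
    "cand_A": frozenset({1, 4, 7, 9, 12}),  # shares 4 of 5 bits with the query
    "cand_B": frozenset({2, 3, 5}),         # shares no bits with the query
    "cand_C": frozenset({1, 4, 20, 21}),    # shares 2 of 6 bits with the query
}

ranking = rank_by_similarity(query_fp, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a real screen the top-ranked database entries would then be passed on to further scoring and experimental testing.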
TTGGGTATTTTCCCATCCACCCCAAGATCCAGCTGGAAATCAGTGAACACACTTGATGGGAGTTTTCCTGCATGTGCTCTGGGCATTGACAGTAGAAGGGTGTTCAGAATGTCTGCTGTGCCCTCATGGAGGAAGAGNGCT CAGTGTACATGCTCTGGGTCAGTAGGTGCCCTTGAGCCCAGCTTTGGGAGCAATGTTGGATGAGTGAAGGAGGGATCCAGGGCAAAGCAGGCACGACAGAGTGGAGACGGCGCTGCTGGCTCTCAGGGGAATGGGCATGGA GTGGGTAGGAGATCCACCTAAGGAGGCTGGCTGGCTGGACGAGTCAGGAGCCCCTTCCAAGGGTGGACACTGACAGGCCCCCAGTCTTGGTCTCCTGCATGCCAGAGGTACCAGCCCATCTTTTTTCCTAAACTTGATGAC CTAGGGCTAGGGGCATGTTGAATCTCAGCCTCGCCCACTGGCGCTGGACTTGGTACACAGGGTGGGGCAAAGTGGGTACTGGATCCTGATCATCCCTATCCCTGGGGTGTGGCTTCTTGCTGCACAGTCAGCTTCTAGTTC TGTAGCCCCAGCTGCTCCTGCGGTGGAGGGAGCTACACATCAGGCTCTGACCCCCTCCAGGTGGGGCCTTCGCGTGAGGGGAGTCAGCACGCATCAGCAGCTGGGCCCAGGGAGTTGCCCCACTGAGCACTGCGGGCTGAC CTGCTCCCAACCAGGGAGATGGAGCTTCCCCCTTGAGTCGGGCTGCTGAAGGGGGGTAGGGGATGGAAACAGTGCGTTTGCAGGAGTAAGGGTGCAGTTGGGTCCCTGCGAGAAAATGTCTCAGTTGTGGCAACTGATTGG TGACCTGGGGGGCGTTTCTGAGCCCACAGTGCTGGCATCAGGACTCAGGTGTGAGGTGCCCCAGACCCTCCCCTTGCCAGTAATTAGCTGATGGCTCGGTGATGCCCAGGGTGAAGGAAGACTTGATTTTGGGAGGGGAGT TCTCTCGTAATGACACTGAGGATGCCTTCAAGTTGGGCTTCTGGCATGTTCTGCCCTCGCTCCCCTTCTGTAGTCACCTTGGCCCTCGTGTTGCTGAGCTGTGTGTGGGAGCGGGAAGCGCGTCAGTGGGCGGAGGGAGCG GGAAGCGCGTCAGTGGGCGGAGTATTTGAGAACATTTCACAAGCCGCTGTTGAGGTTCAGAATCAACCAGCAGATACAGAAACATATTTCGGAGCGTGGGGACCCTTGGGTGAGCTGCCACATGAAGCAGCCCCAGGACCT CCCTGGCTCAAGGAGTGACAGCGAGTTTGTCTGAGGTGAGGGCACAGGCCTGGCGAAGCCTCGTGTGTGGGTGAGACCTGCCCGACCCCAGTGCCTTACCCGAGGAGCTACTGGCCCAGTGGGGGAGGCATTCAGGTGGGC AGAGTCAGGGAGACTCATGAGGCCGTTGAGGCCAGGGGCATAGAGCTGGCCAAGGAGCCATGGCTCACTAACGTGTTGTATGGGGCTCCTTCCCTTCAGGTCCAGGCTCCTGCGTGAAGTGATGCTCCTCTTTGCCTTACT CCTAGCCATGGAGCTCCCATTGGTGGCAGCCAGTGCCACCATCGCGCTCAGTGTAAGTATCATTCCCTCTCACTGTCCTGGAGAGGACGAGAATTCCACCTGGAGATTCTGGGCCACTTTGGTTCCCCATGAGCCAAGACG GCACTTCTAATTTGCATTCCCTACCGGAGTCCCTGTCTGTAGCCAGCCTGGCTTTCAGCTGGTGCCCAAAGTGACAAATGTATCTGCAATGACAAAGGTACCCTGGAAGGGCTCGCCCTCTGCGG AATTTCAGTTCATGCA GGCCTTGGTGCTTCCACATCTGTCCAAGGGCCTTTCAAATGTGACTTTTAACTCTGTGGATTGATTTGCCCGGTTGTCACATTCTGAGCAGCCACAACCTACTGCATCCCATGTAGAAGTGGAAGTGACCTGATTTTTTCC 
[Figure: distribution plot with curves labeled "Power law", "Weibull analytical", "Weibull by plotting" and "experiment"; y-axis 0 to 1, x-axis 0 to 500]
f(GO) = a · GO^b
Pareto's inequality law, introduced more than a century ago (Pareto, 1897): economic, professional, sexual and social networks; airline routing; power line connections; language networks; internet hyperlinks; protein interactomes; brain organization; metabolic pathways; food and ecological webs

Most common distinct molecular scaffolds classified for the studied groups of chemical substances
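The power-law form named on the Pareto slide, f(x) = a · x^b, becomes a straight line on log-log axes, so a and b can be estimated by ordinary least squares on the logarithms. The sketch below is only an illustration of that idea, not the fitting procedure used in the talk; the function name `fit_power_law` and the data points are made up for the example.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b on log-log axes; returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    # Slope and intercept of the straight line log(y) = log(a) + b * log(x).
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b

# Synthetic data generated from a = 2.0, b = -1.5 (hypothetical, not from the talk).
xs = [1, 2, 5, 10, 50, 100, 500]
ys = [2.0 * x ** -1.5 for x in xs]
a, b = fit_power_law(xs, ys)
print(round(a, 3), round(b, 3))
```

A log-log fit weights the sparse tail of the distribution heavily, so for real data a maximum-likelihood estimate may be preferable.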
Most common distinct substituents classified for the studied groups of chemical substances

Going Bigger -> Peptide QSAR: Antimicrobial Peptides

Bad Bugs Need Drugs: IDSA, March 2006, Antimicrobial Availability Task Force
Widespread prevalence of MDR bacteria in hospitals
Few drugs in the pipeline; urgent need for R&D
Experts Fear Increase in Drug-resistant Infections Here: Globe and Mail, March 2006
MRSA, a treatment-resistant form of bacteria that spreads through direct contact, is called a greater threat to public health than SARS or bird flu.
The Boston Globe, August 21, 2006

Antimicrobial Peptides (AMP): Modes of Action
Oren and Shai (Biopolymers 1998)
AMPs: 8-50 AA long, gene-coded, often contain a leader sequence, part of innate immunity

Factors influencing activity of AMPs: usually helical, but can be beta, cyclic, irregular, induced
Example sequence: IKWLKIFL

Factors influencing activity of AMPs: hydrophobicity, positive charge, two-phased
Example sequence: IKWLKIFL
BUT: 20^9 possible sequence variants!!!
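The two factors named on this slide, hydrophobicity and positive charge, are the kind of simple sequence descriptors a peptide QSAR model can start from, and the size of the sequence space shows why exhaustive testing is not an option: a peptide of n residues drawn from the 20 standard amino acids has 20^n possible sequences. The sketch below is an illustration only. It assumes the Kyte-Doolittle hydropathy scale and a crude charge rule (+1 for K/R, -1 for D/E, ignoring His, termini and pH); neither choice comes from the talk, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the talk does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def sequence_space(length, alphabet=20):
    """Number of distinct sequences of the given length."""
    return alphabet ** length

def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E."""
    return sum((aa in "KR") - (aa in "DE") for aa in seq)

def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over all residues."""
    return sum(KD[aa] for aa in seq) / len(seq)

peptide = "IKWLKIFL"  # example sequence shown on the slide
print(sequence_space(9))          # 512000000000
print(net_charge(peptide))        # 2
print(mean_hydropathy(peptide))
```

Real descriptor sets are much richer than these two numbers, but even this pair separates cationic, hydrophobic sequences such as IKWLKIFL from neutral or acidic ones.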
THE UNIVERSITY OF BRITISH COLUMBIA Sources of antibiotic peptides GGAGATTCTGGGCCACTTTGGTTCCCCATGAGCCAAGACGGCACTTCTAATTTGCATTCCCTACCGGAGTCCCTGTCTGTAGCCAGCCTGGCTTTCAGCTGGTGCCCAAAGTGACAAATGTATCTGCAATGACAAAGGTAC CCTGGAAGGGCTCGCCCTCTGCGGAATTTCAGTTCATGCAGGCCTTGGTGCTTCCACATCTGTCCAAGGGCCTTTCAAATGTGACTTTTAACTCTGTGGATTGATTTGCCCGGTTGTCACATTCT GAGCAGCCACAACCTA CTGCATCCCATGTAGAAGTGGAAGTGACCTGATTTTTTCCTGCTTTTCAAGGCTGTATGTTTACATTTGCCTCCAATCATTCCTATGGGAATTCCTTGGGAGTCTAACTTGGAGATTTTGTTTCTTCTGCCTTTGCTCCTG GGGGCTTAATCACTTCTGTGCCTCTGGTTATCTGTGGCACATTTGTATTTGTCATTAGTCAACCGGAGACTCGGGGTCTGAGTGGAGGGTATGTCCCCCTCCAGTGATGGTTTCTGTTGGCTTCCCAGGGTGAGGATGACT CATGACCACTTGCAAGTGGTTTTTGTGTCTGGGGTTTATGATCACACAGTCATACACGTTCTAACTCCAGACTGACTGTTGAGAAAGCCTCTGGGTAAGGGAATTCCTGGGAAACACACTGTTTTCATGCATCCTCTGGAA GATGAGGCCTGAAGTTACCAGGGTCTCTGTTTGCTGATGCTGATGATCCACATTTTCTAGCCCACTCTGCTTCTCTGACACCTTTAGTCTTGAGGATCCATGNTCTGTGAAGGAATCCAAGCTCTCATTTCGCACTCACCT TGGCCCTGGCTCTGTCTCCAGGACCTCTTCTACTACAAAATCCTAAAGCTCTGGGAGCTGGGTGTCAACCTGTGCCCGAGGAAATCATACAGTTACTGTGGACTTTCCAGTTTGCTGTCTTCTAGTATTCCATTGTAGCTC TTGGGTATTTTCCCATCCACCCCAAGATCCAGCTGGAAATCAGTGAACACACTTGATGGGAGTTTTCCTGCATGTGCTCTGGGCATTGACAGTAGAAGGGTGTTCAGAATGTCTGCTGTGCCCTCATGGAGGAAGAGNGCT CAGTGTACATGCTCTGGGTCAGTAGGTGCCCTTGAGCCCAGCTTTGGGAGCAATGTTGGATGAGTGAAGGAGGGATCCAGGGCAAAGCAGGCACGACAGAGTGGAGACGGCGCTGCTGGCTCTCAGGGGAATGGGCATGGA GTGGGTAGGAGATCCACCTAAGGAGGCTGGCTGGCTGGACGAGTCAGGAGCCCCTTCCAAGGGTGGACACTGACAGGCCCCCAGTCTTGGTCTCCTGCATGCCAGAGGTACCAGCCCATCTTTTTTCCTAAACTTGATGAC CTAGGGCTAGGGGCATGTTGAATCTCAGCCTCGCCCACTGGCGCTGGACTTGGTACACAGGGTGGGGCAAAGTGGGTACTGGATCCTGATCATCCCTATCCCTGGGGTGTGGCTTCTTGCTGCACAGTCAGCTTCTAGTTC TGTAGCCCCAGCTGCTCCTGCGGTGGAGGGAGCTACACATCAGGCTCTGACCCCCTCCAGGTGGGGCCTTCGCGTGAGGGGAGTCAGCACGCATCAGCAGCTGGGCCCAGGGAGTTGCCCCACTGAGCACTGCGGGCTGAC CTGCTCCCAACCAGGGAGATGGAGCTTCCCCCTTGAGTCGGGCTGCTGAAGGGGGGTAGGGGATGGAAACAGTGCGTTTGCAGGAGTAAGGGTGCAGTTGGGTCCCTGCGAGAAAATGTCTCAGTTGTGGCAACTGATTGG 
TGACCTGGGGGGCGTTTCTGAGCCCACAGTGCTGGCATCAGGACTCAGGTGTGAGGTGCCCCAGACCCTCCCCTTGCCAGTAATTAGCTGATGGCTCGGTGATGCCCAGGGTGAAGGAAGACTTGATTTTGGGAGGGGAGT TCTCTCGTAATGACACTGAGGATGCCTTCAAGTTGGGCTTCTGGCATGTTCTGCCCTCGCTCCCCTTCTGTAGTCACCTTGGCCCTCGTGTTGCTGAGCTGTGTGTGGGAGCGGGAAGCGCGTCAGTGGGCGGAGGGAGCG GGAAGCGCGTCAGTGGGCGGAGTATTTGAGAACATTTCACAAGCCGCTGTTGAGGTTCAGAATCAACCAGCAGATACAGAAACATATTTCGGAGCGTGGGGACCCTTGGGTGAGCTGCCACATGAAGCAGCCCCAGGACCT CCCTGGCTCAAGGAGTGACAGCGAGTTTGTCTGAGGTGAGGGCACAGGCCTGGCGAAGCCTCGTGTGTGGGTGAGACCTGCCCGACCCCAGTGCCTTACCCGAGGAGCTACTGGCCCAGTGGGGGAGGCATTCAGGTGGGC AGAGTCAGGGAGACTCATGAGGCCGTTGAGGCCAGGGGCATAGAGCTGGCCAAGGAGCCATGGCTCACTAACGTGTTGTATGGGGCTCCTTCCCTTCAGGTCCAGGCTCCTGCGTGAAGTGATGCTCCTCTTTGCCTTACT CCTAGCCATGGAGCTCCCATTGGTGGCAGCCAGTGCCACCATCGCGCTCAGTGTAAGTATCATTCCCTCTCACTGTCCTGGAGAGGACGAGAATTCCACCTGGAGATTCTGGGCCACTTTGGTTCCCCATGAGCCAAGACG GCACTTCTAATTTGCATTCCCTACCGGAGTCCCTGTCTGTAGCCAGCCTGGCTTTCAGCTGGTGCCCAAAGTGACAAATGTATCTGCAATGACAAAGGTACCCTGGAAGGGCTCGCCCTCTGCGGAATTTCAGTTCATGCA GGCCTTGGTGCTTCCACATCTGTCCAAGGGCCTTTCAAATGTGACTTTTAACTCTGTGGATTGATTTGCCCGGTTGTCACATTCTGAGCAGCCACAACCTACTGCATCCCATGTAGAAGTGGAAGTGACCTGATTTTTTCC TGCTTTTCAAGGCTGTATGTTTACATTTGCCTCCAATCATTCCTATGGGAATTCCTTGGGAGTCTAACTTGGAGATTTTGTTTCTTCTGCCTTTGCTCCTGGGGGCTTAATCACTTCTGTGCCTCTGGTTATCTGTGGCAC ATTTGTATTTGTCATTAGTCAACCGGAGACTCGGGGTCTGAGTGGAGGGTATGTCCCCCTCCAGTGATGGTTTCTGTTGGCTTCCCAGGGTGAGGATGACTCATGACCACTTGCAAGTGGTTTTTGTGTCTGGGGTTTATG ATCACACAGTCATACACGTTCTAACTCCAGACTGACTGTTGAGAAAGCCTCTGGGTAAGGGAATTCCTGGGAAACACACTGTTTTCATGCATCCTCTGGAAGATGAGGCCTGAAGTTACCAGGGTCTCTGTTTGCTGATGC TGATGATCCACATTTTCTAGCCCACTCTGCTTCTCTGACACCTTTAGTCTTGAGGATCCATGNTCTGTGAAGGAATCCAAGCTCTCATTTCGCACTCACCTTGGCCCTGGCTCTGTCTCCAGGACCTCTTCTACTACAAAA TCCTAAAGCTCTGGGAGCTGGGTGTCAACCTGTGCCCGAGGAAATCATACAGTTACTGTGGACTTTCCAGTTTGCTGTCTTCTAGTATTCCATTGTAGCTCTTGGGTATTTTCCCATCCACCCCAAGATCCAGCTGGAAAT CAGTGAACACACTTGATGGGAGTTTTCCTGCATGTGCTCTGGGCATTGACAGTAGAAGGGTGTTCAGAATGTCTGCTGTGCCCTCATGGAGGAAGAGNGCTCAGTGTACATGCTCTGGGTCAGTA GGTGCCCTTGAGCCCA 
GCTTTGGGAGCAATGTTGGATGAGTGAAGGAGGGATCCAGGGCAAAGCAGGCACGACAGAGTGGAGACGGCGCTGCTGGCTCTCAGGGGAATGGGCATGGAGTGGGTAGGAGATCCACCTAAGGAGGCTGGCTGGCTGGAC GAGTCAGGAGCCCCTTCCAAGGGTGGACACTGACAGGCCCCCAGTCTTGGTCTCCTGCATGCCAGAGGTACCAGCCCATCTTTTTTCCTAAACTTGATGACCTAGGGCTAGGGGCATGTTGAATCTCAGCCTCGCCCACTG GCGCTGGACTTGGTACACAGGGTGGGGCAAAGTGGGTACTGGATCCTGATCATCCCTATCCCTGGGGTGTGGCTTCTTGCTGCACAGTCAGCTTCTAGTTCTGTAGCCCCAGCTGCTCCTGCGGTGGAGGGAGCTACACAT CAGGCTCTGACCCCCTCCAGGTGGGGCCTTCGCGTGAGGGGAGTCAGCACGCATCAGCAGCTGGGCCCAGGGAGTTGCCCCACTGAGCACTGCGGGCTGACCTGCTCCCAACCAGGGAGATGGAGCTTCCCCCTTGAGTCG GGCTGCTGAAGGGGGGTAGGGGATGGAAACAGTGCGTTTGCAGGAGTAAGGGTGCAGTTGGGTCCCTGCGAGAAAATGTCTCAGTTGTGGCAACTGATTGGTGACCTGGGGGGCGTTTCTGAGCCCACAGTGCTGGCATCA GGACTCAGGTGTGAGGTGCCCCAGACCCTCCCCTTGCCAGTAATTAGCTGATGGCTCGGTGATGCCCAGGGTGAAGGAAGACTTGATTTTGGGAGGGGAGTTCTCTCGTAATGACACTGAGGATGCCTTCAAGTTGGGCTT CTGGCATGTTCTGCCCTCGCTCCCCTTCTGTAGTCACCTTGGCCCTCGTGTTGCTGAGCTGTGTGTGGGAGCGGGAAGCGCGTCAGTGGGCGGAGGGAGCGGGAAGCGCGTCAGTGGGCGGAGTATTTGAGAACATTTCAC AAGCCGCTGTTGAGGTTCAGAATCAACCAGCAGATACAGAAACATATTTCGGAGCGTGGGGACCCTTGGGTGAGCTGCCACATGAAGCAGCCCCAGGACCTCCCTGGCTCAAGGAGTGACAGCGAGTTTGTCTGAGGTGAG GGCACAGGCCTGGCGAAGCCTCGTGTGTGGGTGAGACCTGCCCGACCCCAGTGCCTTACCCGAGGAGCTACTGGCCCAGTGGGGGAGGCATTCAGGTGGGCAGAGTCAGGGAGACTCATGAGGCCGTTGAGGCCAGGGGCA TAGAGCTGGCCAAGGAGCCATGGCTCACTAACGTGTTGTATGGGGCTCCTTCCCTTCAGGTCCAGGCTCCTGCGTGAAGTGATGCTCCTCTTTGCCTTACTCCTAGCCATGGAGCTCCCATTGGTGGCAGCCAGTGCCACC ATCGCGCTCAGTGTAAGTATCATTCCCTCTCACTGTCCTGGAGAGGACGAGAATTCCACCTGCCAGTGCCTTACCCGAGGAGCTACTGGCCCAGTGGGGGAGGCATTCAGGTGGGCAGAGTCAGGGAGACTCATGAGGCCG TTGAGGCCAGGGGCATAGAGCTGGCCAAGGAGCCATGGCTCACTAACGTGTTGTATGGGGCTCCTTCCCTTCAGGTCCAGGCTCCTGCGTGAAGTGATGCTCCTCTTTGCCTTACTCCTAGCCATGGAGCTCCCATTGGTG GCAGCCAGTGCCACCATCGCGCTCAGTGTAAGTATCATTCCCTCTCACTGTCCTGGAGAGGACGAGAATTCCACCTGGAGATTCTGGGCCACTTTGGTTCCCCATGAGCCAAGACGGCACTTCTA ATTTGCATTCCCTACC GGAGTCCCTGTCTGTAGCCAGCCTGGCTTTCAGCTGGTGCCCAAAGTGACAAATGTATCTGCAATGACAAAGGTACCCTGGAAGGGCTCGCCCTCTGCGGAATTTCAGTTCATGCAGGCCTTGGTGCTTCCACATCTGTCC 
SWISS-PROT database: ftp://ftp.ebi.ac.uk/pub/databases/swissprot/release/sprot42.dat
University of Nebraska Medical Center: http://aps.unmc.edu/AP/main.php
Biochemistry Department, University of Trieste, Italy: http://www.bbcm.units.it/~tossi/pag5.htm
National Library of Health Sciences, TERKKO, University of Helsinki: http://oma.terkko.helsinki.fi:8080/~SAPD/login
School of Crystallography, Birkbeck, University of London: http://www.cryst.bbk.ac.uk/peptaibol/peptaibol_database_1lettercodes.htm

Examples of typical gene-coded AMPs:
beta-Defensins, alpha-Defensins
Cecropins, Cholecystokinins
Stomoxyns, Gastrins, Transferrins, Magainins, Brevinins, Xenopsins
Dermaseptins, Provicilins
Cupiennines, Vicilins, Corticostatins, Apidaecin, Cathelicidin, Statherins, Histatins
Bombinins, Dermaseptins, Maximins, Dermadistinctins, Maculatina, Caerins, Aureins, Citropin, Waglerins, Gastrins, Cholecystokinins, Magainins, Xenopsins

non-TOXIC
non-IMMUNOGENIC
do not cause RESISTANCE
fast and broadly ACTIVE
BIOINFORMATICS APPROACHES TO MODELING AMPs ALL FAILED!
[Figure labels as extracted: 100,000; 2; 0]

Designed Cheminformatics pipeline for AMPs

Trained statistics for AMPs QSAR models
Training set   Top % set as actives   Accuracy   Specificity   Sensitivity   Positive Predictive Value
A              5%                     0.96       0.98          0.62          0.58
A              10%                    0.93       0.94          0.76          0.39
A              25%                    0.78       0.78          0.85          0.17
B              5%                     0.94       0.97          0.33          0.30
B              10%                    0.88       0.90          0.33          0.12
B              25%                    0.77       0.77          0.80          0.12
A+B            5%                     0.95       0.97          0.47          0.47
A+B            10%                    0.91       0.92          0.54          0.27
A+B            25%                    0.76       0.77          0.66          0.13

Validation label as extracted: 10 cross
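The four statistics reported for the AMP QSAR models (accuracy, specificity, sensitivity, positive predictive value) all follow from a binary confusion matrix once the top fraction of peptides is labeled as active. The sketch below only illustrates those standard definitions. The function names `label_top_fraction` and `classification_stats` and the toy data are hypothetical and are not taken from the talk, and the way the authors actually thresholded their data is an assumption here.

```python
import math


def label_top_fraction(activities, fraction):
    """Label the top `fraction` of items, ranked by activity, as actives (True).

    Assumption: higher activity values mean more active peptides.
    """
    n_active = max(1, round(len(activities) * fraction))
    ranked = sorted(range(len(activities)), key=lambda i: activities[i], reverse=True)
    active_idx = set(ranked[:n_active])
    return [i in active_idx for i in range(len(activities))]


def classification_stats(y_true, y_pred):
    """Compute accuracy, specificity, sensitivity and PPV from boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "ppv": ratio(tp, tp + fp),          # positive predictive value (precision)
    }


if __name__ == "__main__":
    # Toy example: 10 peptides, 2 true actives, 2 predicted actives.
    y_true = [True, True] + [False] * 8
    y_pred = [True, False, True] + [False] * 7
    print(classification_stats(y_true, y_pred))
```

When only a small fraction of the set is labeled active, accuracy and specificity are dominated by the inactives: a model that predicts "inactive" for every peptide already reaches an accuracy of 0.95 at a 5% threshold. Sensitivity and positive predictive value are therefore the more informative columns for judging how well actives are recovered.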
100,000 peptides were designed using random sequence composition with ongoing enrichment for key amino acids, then subjected to QSAR; 20 AMPs were synthesized and tested.

[Figure: amino acid fraction (y-axis, 0 to 0.45) per amino acid (x-axis: A R N D C Q E G H I L K M F P S T W Y V) for Set A, Set B, Q1, Q2, Q3, Q4]
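A rough sketch of how a virtual peptide library with a biased composition could be generated, and how the per-residue "amino acid fraction" plotted on this slide could be computed. Everything specific here is an assumption made for illustration: the function names, the peptide length of 9, the enrichment weights for W, R and K, and the single-pass weighting (the slide describes "ongoing" enrichment, which this sketch does not reproduce).

```python
import random
from collections import Counter

# The 20 standard amino acids, in the order used on the slide's x-axis.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def random_peptides(n, length, weights, rng):
    """Draw `n` random peptides of `length` residues.

    `weights` maps a one-letter code to a relative sampling weight;
    residues not listed default to weight 1.0 (no enrichment).
    """
    letters = list(AMINO_ACIDS)
    w = [weights.get(a, 1.0) for a in letters]
    return ["".join(rng.choices(letters, weights=w, k=length)) for _ in range(n)]


def amino_acid_fractions(peptides):
    """Return the fraction of each amino acid over all residues in `peptides`."""
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    return {a: counts[a] / total for a in AMINO_ACIDS}


if __name__ == "__main__":
    rng = random.Random(0)                       # fixed seed for repeatability
    enrichment = {"W": 4.0, "R": 4.0, "K": 4.0}  # hypothetical "key" residues
    library = random_peptides(1000, 9, enrichment, rng)
    fractions = amino_acid_fractions(library)
    for aa in AMINO_ACIDS:
        print(aa, round(fractions[aa], 3))
```

Comparing such fraction profiles between the training sets and the predicted activity quartiles is one simple way to see which residues a model favors.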
STANDARD USED: Compound MX-226 (aka MBI-226, omiganan) ILRWPWPWRRK
- prevention of wounds
- burn
- device-related infections (central venous catheter related infections)
In Phase III-b clinical trials by MIGENIX©

EX VIVO against 12 bacterial strains (µM)
100,000 randomly designed, 20 tested from Q1-Q4 (predicted high-, medium-, and low-actives)
Pseudomonas aeruginosa, Pseudomonas maltophilia, Staphylococcus aureus, Enterobacter cloacae
THE UNIVERSITY OF BRITISH COLUMBIA MEDICINE INFECTIOUS DISEASES

[Figure: bar charts over Set A, Set B, Q1, Q2, Q3, Q4; value axes run 0 to 6 and 0 to 400]

MEDIAN PROPERTIES DISTRIBUTIONS AMONG HIGH-, MEDIUM- AND LOW-ACTIVES
[Figure: bar charts over Set A, Set B, Q1, Q2, Q3, Q4. Panel and axis labels: MEDIAN ACTIVITY; MEDIAN MIC [µM]; Formal charge; MEDIAN CHARGE; Hydrophobic fraction; H2O-phobicity; Hydrophobic moment; H2O-phobic moment]
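The three sequence properties named on this slide (formal charge, hydrophobic fraction, hydrophobic moment) can be approximated directly from a peptide's one-letter sequence. The sketch below is a simplified illustration and not the descriptor set used in the talk. Its assumptions: charge counts only K/R as +1 and D/E as -1 (histidine and termini ignored); the "hydrophobic" residue set is a hypothetical choice; the hydrophobicity values are the commonly tabulated Eisenberg consensus scale and should be checked against a primary source before real use; the moment assumes an ideal alpha-helix with 100 degrees per residue.

```python
import math
from statistics import median

# Commonly tabulated Eisenberg consensus hydrophobicity values (assumption:
# verify against a primary source before relying on them).
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}

HYDROPHOBIC = set("AVILMFWC")  # hypothetical definition of "hydrophobic"


def formal_charge(seq):
    """Net charge counting K/R as +1 and D/E as -1 (His and termini ignored)."""
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")


def hydrophobic_fraction(seq):
    """Fraction of residues that belong to the HYDROPHOBIC set."""
    return sum(1 for a in seq if a in HYDROPHOBIC) / len(seq)


def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment per residue for an assumed helical periodicity."""
    delta = math.radians(angle_deg)
    s = sum(EISENBERG[a] * math.sin(i * delta) for i, a in enumerate(seq))
    c = sum(EISENBERG[a] * math.cos(i * delta) for i, a in enumerate(seq))
    return math.hypot(s, c) / len(seq)


if __name__ == "__main__":
    reference = "ILRWPWPWRRK"  # sequence as written on the slide
    print("charge:", formal_charge(reference))
    print("hydrophobic fraction:", round(hydrophobic_fraction(reference), 3))
    print("hydrophobic moment:", round(hydrophobic_moment(reference), 3))
    # Median of a property over a (toy) group of peptides:
    group = [reference, "KKWWRRLLI", "GGSSDDEEA"]
    print("median charge:", median(formal_charge(p) for p in group))
```

Taking the median of each property within Set A, Set B and each predicted quartile gives per-group values of the kind summarized in the bar charts.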
Untreated
TOXICITY

[Figure: % of survived red blood cells after 24 h peptide administration at 50 µg/ml. y-axis: % viable cells (0 to 200); x-axis: individual HHC peptides (HHC-8, HHC-9, HHC-10, HHC-20, HHC-36, ...) and Bac2a]

IN VIVO

Ability of new antimicrobial peptides HHC-10 and HHC-36 to protect mice against S. aureus infections. Bacterial loads in the peritoneal lavage from individual mice after 24 h of infection. Dead animals were assigned the highest CFU count obtained in the experiment. The solid line represents the arithmetic mean for each group.
[Figure: y-axis: CFU/ml; x-axis: Treatment Group (saline, HHC-10, HHC-36)]

Going Even Bigger -> Protein QSAR

Protein interaction networks are scale-free networks
  The web of human sexual contacts (Liljeros et al., Nature 411 (2001) 907)
  The food network
  Neuron connections

MRSA Protein Interaction Network
2D representation of the developed MRSA PIN. Hub proteins are marked in yellow and non-hubs in blue. Conventional antimicrobial targets are marked in red if they are non-hubs and in pink if they are hubs.

MRSA Protein Interaction Network
TASK: to sample the network with the fewest experiments? HUBS!

A summary of protein interaction data used in the training and testing of the hub classifiers

Training / Testing set   # of proteins   # of hubs (10% of   # of non-hubs (90% of   # of protein    minimum # of
                                         total proteins)     total proteins)         interactions    interactions per hub
E. coli                  2860            286                 2574                    13888           20
S. cerevisiae            5397            535                 4862                    37167           33
D. melanogaster          6935            628                 6307                    19994           16
H. sapiens               6592            620                 5972                    19115           13

Hub protein conservation among species

                                                      Subject species
Query species                                         E. coli   S. cerevisiae   D. melanogaster   H. sapiens
E. coli
  % of hubs with similar proteins                     -         18.18%          15.03%            18.18%
  % of non-hubs with similar proteins                 -         8.00%           5.67%             5.75%
  % of conserved hubs                                 -         4.20%           1.05%             2.80%
  % of conserved non-hubs                             -         6.72%           5.56%             5.36%
S. cerevisiae
  % of hubs with similar proteins                     7.48%     -               34.02%            39.44%
  % of non-hubs with similar proteins                 3.78%     -               10.98%            11.74%
  % of conserved hubs                                 3.55%     -               6.36%             10.28%
  % of conserved non-hubs                             2.88%     -               10.22%            10.26%
D. melanogaster
  % of hubs with similar proteins                     1.27%     12.26%          -                 23.89%
  % of non-hubs with similar proteins                 1.93%     9.75%           -                 20.64%
  % of conserved hubs                                 1.11%     6.69%           -                 6.69%
  % of conserved non-hubs                             1.43%     7.23%           -                 17.82%
H. sapiens
  % of hubs with similar proteins                     2.58%     22.10%          37.90%            -
  % of non-hubs with similar proteins                 2.28%     12.34%          24.55%            -
  % of conserved hubs                                 1.94%     10.00%          9.35%             -
  % of conserved non-hubs                             1.62%     8.98%           21.78%            -

Index    QSAR descriptors
1        number of residues
2        molecular weight
3-22     fraction of each residue in sequence
23       fraction of polar residues in sequence
24       fraction of hydrophobic residues
25       fraction of charged residues
26       net charge at pH = 7.0
27       average hydrophobicity, Gtrans (kcal/mol)
28       average "hydrophilicity", Gapp (kcal/mol)
29       fraction of surface residues in sequence
30       estimated surface area
31       estimated volume
32-51    fraction of each residue at surface
52       fraction of polar residues at surface
53       fraction of hydrophobic residues at surface
54       fraction of charged residues at surface
55       net surface charge
56       average surface hydrophobicity
57       average surface hydrophilicity
58       ratio of average surface hydrophobicity to average hydrophobicity for sequence
59       ratio of average surface "hydrophilicity" to average hydrophilicity for sequence
60       isoelectric point (elementary charge unit)
61       isoelectric point of surface (elementary charge unit)
62       fraction of random coil residues
63       fraction of α-helix residues
64       fraction of β-sheet residues
65       helix-to-coil ratio for surface - helix-to-coil ratio for sequence
66       average surface polarizability (kcal/mol)
67       average surface MEP (kcal/mol)
68       average surface ionization potential (kcal/mol)
69       average surface electron affinity (kcal/mol)
70       average surface electronegativity (kcal/mol)
71       number of coil stretches > 4 residues in length
72       length of longest contiguous coil stretch
73       fraction of flexible coil residues in sequence
74       fraction of flexible residues at surface
75       average flexibility index for coil residues at the surface

Hub classifiers: four-fold cross-validation average performance

Hub classifier     Set        sensitivity   specificity   accuracy   PPV      NPV
E. coli            Training   86.71%        91.60%        91.11%     53.41%   98.41%
E. coli            Testing    51.40%        88.19%        84.51%     32.59%   94.23%
S. cerevisiae      Training   84.36%        88.99%        88.53%     45.74%   98.10%
S. cerevisiae      Testing    62.99%        86.16%        83.86%     33.37%   95.49%
D. melanogaster    Training   74.95%        87.24%        86.12%     36.90%   97.22%
D. melanogaster    Testing    41.24%        83.86%        80.00%     20.28%   93.48%
H. sapiens         Training   51.77%        91.31%        87.59%     38.21%   94.80%
H. sapiens         Testing    26.61%        88.78%        82.93%     19.76%   92.10%

MRSA Protein Interaction Network
Bait coverage summary and conserved interactions for MRSA and other PIN datasets. nr = non-redundant. *Percentages were calculated with respect to the subject species.
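The hub classifier slides report sensitivity, specificity, accuracy, PPV and NPV. The talk does not say how these were computed, so the following is only a minimal sketch using the standard confusion-matrix definitions, with hubs treated as the positive class. The names `BinaryMetrics`, `metrics_from_counts` and `derived_from_rates` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # fraction of true hubs predicted as hubs
    specificity: float  # fraction of true non-hubs predicted as non-hubs
    accuracy: float     # fraction of all proteins classified correctly
    ppv: float          # fraction of predicted hubs that are true hubs
    npv: float          # fraction of predicted non-hubs that are true non-hubs


def metrics_from_counts(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Standard statistics from confusion-matrix counts (hub = positive)."""
    return BinaryMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / (tp + fp + tn + fn),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


def derived_from_rates(sensitivity: float, specificity: float, prevalence: float):
    """Accuracy, PPV and NPV implied by sensitivity and specificity
    when a fraction `prevalence` of all proteins are hubs."""
    tp = prevalence * sensitivity
    fn = prevalence * (1.0 - sensitivity)
    tn = (1.0 - prevalence) * specificity
    fp = (1.0 - prevalence) * (1.0 - specificity)
    return tp + tn, tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # E. coli training row: sensitivity 86.71%, specificity 91.60%,
    # 286 hubs among 2860 proteins (10% prevalence).
    acc, ppv, npv = derived_from_rates(0.8671, 0.9160, 286 / 2860)
    print(f"accuracy={acc:.4f} ppv={ppv:.4f} npv={npv:.4f}")
```

With only about 10% of proteins labelled as hubs, even a specificity near 90% yields many false positives relative to true positives, which is why the tabulated PPV values are much lower than the NPV values. Feeding the E. coli training sensitivity and specificity into `derived_from_rates` gives figures that agree, to within rounding, with the accuracy, PPV and NPV in that row.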
Take-Home Messages
-> QSAR allows sampling and navigating through Chemical Space as well as modeling complex molecular properties
-> When done properly, QSAR can handle even unconventional systems like peptides
-> QSAR methodology can/should substitute sequence-based ideology (bioinformatics) on many levels
Lab: Michael Hsing, Simon Chan, Nels Thorstein, Chris Fjell, Fuqiang Ban, Melian Huang, Ken Bydler, Osvaldo Santos-Filho, P. Axiero, Evgeny Maksakov
UBC Microbiology: REW Hancock, K Hilpert, H Jenssen
U.Sask VIDO: L Babuick & team
SFU Computer Sciences: C. Sahinalp, E. Karakoc, F. Hormozdiari

CIHR
V.I.D.O., U.Sask., Saskatoon, SK
CIHR/MSFHR Bioinformatics
UBC I.D., Microbiology
Genome Canada, Genome BC
UBC/VGH Prostate Centre