An efficient visualization tool for the analysis of protein mutation matrices
Maria Pamela C. David, Carlo M. Lapid and Vincent Ricardo M. Daria
Computational Science Research Center
University of the Philippines, Diliman

Outline
- Visualization
- A crash course on protein structure, function and engineering
- Protein mutation matrices
- Generation
- Data Visualization
- Applications of visualization

Visualization
“Our species relies on vision over all other senses” (about half of the human brain is devoted directly or indirectly to vision)
“We learn about 11% audibly and 83% visually”
~50% of our brain gets involved in visual processing
Other perceptual channels can be used:
- They work in parallel as long as they don’t conflict
- In a conflict, the visual channel will dominate

Crash course on protein structure and engineering
Protein mutations?

Protein mutation/substitution matrices
- Analysis of evolving sequences/sequences with high variability
- Would have applications in artificial protein design

Matrix generation
[Figure: example aligned sequences; residue letters as extracted: A D G E G G G S T W K I Y E L R K T D R]
Physico-chemical properties:
- Hydrophilicity
- Size and polarizability
- Charge and polarity
Example substitutions: T→I, D→G, R→E, S→L, S→K, W→R, T→D
Note:
1. Alignment should be good
2. Primer-derived sequences should be removed

Sample mutation matrix
Rows: replaced residues. Columns: replacement residues. Both axes are ordered by increasing hydrophilicity (W to D).

     W   L   F   I   M   V   Y   P   A   T   C   Q   G   K   S   H   N   E   R   D
W    0   0   0   0   1  37   3   0  36   0   2   1   4   0   0  35  25   0   0   0
L    2   0  59 225 155  76   0  26  73  36  37 110  40   4 149   2   4 110   7  76
F   36   6   0   1   0   3   3   0  46   0   0   0  79   2   4  38  37  39   0   3
I   41  92   7   0 148 235  35   0   0 143   0   0  39  49  24   1  74   0  73   2
M   36  32  72  15   0  50   0   0   3   3   0  37  26  10   5   0   2   1  64   4
V    1  84  36   0   2   0  73   0  15  41   0   1   4  39  43 266  75  60   0   0
Y    0  36   0   2   0   1   0   1  53   2   2  38   0  39  97  41  35   4  37  37
P    0   1   0   0   0   0   2   0   0   0   0   1   0   0 166   0   0  36   2   0
A    0   8  68  47  34 130   0   1   0  74   0   0  43   6  68   0   0 119  37  91
T    0  37   3  50   0  80   2  35  51   0   0   3   0   4  40  35  97  95   1   2
C    0   0   1  26   0  36  35   0   0   0   0   0   0   0  26   0   4   5   0   0
Q    0   3   0  36   1  38  36   1   3   0  24   0   1   3   4   2  39 150  13  70
G   25   7  75  39   0   0  79   0  87   2  25   3   0  37  44   3  32  41   1  39
K    0  55  49  47  29   4  36   0   0  29   0 114  81   0   7   0  47 140 394  40
S    2 103   0   5   0   4  40 110 112  96 135   5 114  42   0   0 239   5  63 111
H    0   0   0   4   2  88   4   0   2   0   2 194   0   6   2   0  33  37   0   8
N    0 160   2  43   3   2   3   0   1   8   4  32   9  90 159  10   0 151  76 124
E    1 149   1   6   4  38   2   0   3  75   0  41 159 137  39   2  85   0   9 135
R    0  43  39   2   0   1   1   0   1  54   8   2  97 236  93   0  78  49   0  58
D    4  37   2   3  39   4  76   3   2  73   1   1 125  93 122   4 123  44   4   0

Image equivalent
[Figure: the sample matrix rendered as an image; axes run from lower to higher hydrophilicity]

Tool 1: Matrix scaling
[Figure: scaled matrix image; axes labeled W L F I M V Y P A T C Q G K S H N E R D]

Tool 2: Matrix comparison
Find mutations exclusive to either matrix
[Figure: images of Matrix 1 and Matrix 2; axes labeled W L F I M V Y P A T C Q G K S H N E R D]
[Figure: comparison image; legend: Matrix 2 only, Matrix 1 only]

Application 1: antibody engineering
- Complementarity Determining Region (CDR)
- Framework (FR)
[Figure: matrix images of CDR mutations and FR mutations; axes labeled W L F I M V Y P A T C Q G K S H N E R D]
[Figure: engineered antibody, with residue letters labeled on the structure]
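
The matrix generation and comparison steps above can be illustrated with a small sketch. The slides do not show the actual implementation, so this is only one plausible reading: it assumes substitutions are counted position by position against a single reference sequence in a pre-computed alignment, and that "exclusive" means a substitution has a non-zero count in one matrix and a zero count in the other. The function names `mutation_matrix` and `exclusive_mutations` and the toy sequences are hypothetical.

```python
# Residue order taken from the matrix axes on the slides
# (increasing hydrophilicity, W to D).
RESIDUES = "WLFIMVYPATCQGKSHNERD"


def mutation_matrix(reference, variants):
    """Count substitutions of each reference residue by each variant residue.

    Rows are replaced (reference) residues, columns are replacement residues,
    both in RESIDUES order. Assumption: every sequence is already aligned to
    the reference and has the same length.
    """
    index = {aa: i for i, aa in enumerate(RESIDUES)}
    size = len(RESIDUES)
    matrix = [[0] * size for _ in range(size)]
    for variant in variants:
        if len(variant) != len(reference):
            raise ValueError("sequences must be aligned to the same length")
        for old, new in zip(reference, variant):
            # Skip unchanged positions and any symbol that is not one of the
            # 20 residues (for example a gap character).
            if old == new or old not in index or new not in index:
                continue
            matrix[index[old]][index[new]] += 1
    return matrix


def exclusive_mutations(matrix_1, matrix_2):
    """Return the substitutions that occur in only one of the two matrices."""
    only_1, only_2 = [], []
    for i, old in enumerate(RESIDUES):
        for j, new in enumerate(RESIDUES):
            count_1, count_2 = matrix_1[i][j], matrix_2[i][j]
            if count_1 > 0 and count_2 == 0:
                only_1.append((old, new))
            elif count_2 > 0 and count_1 == 0:
                only_2.append((old, new))
    return only_1, only_2


# Toy data, made up for illustration only.
matrix_a = mutation_matrix("STWKD", ["SIWKG"])  # contains T->I and D->G
matrix_b = mutation_matrix("STWKD", ["LTWKD"])  # contains S->L
print(exclusive_mutations(matrix_a, matrix_b))
```

Keeping the rows and columns in the same hydrophilicity order as the slides means the resulting nested list can be passed directly to any image routine to obtain the "image equivalent" of the matrix.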

Application 2: vaccine design
[Figure: matrix image; axes run from lower to higher hydrophilicity]

Application 3: biosensor design
[Figure: matrix image; axes run from smaller to larger residues]

Other advantages of visualization

Conclusion
- Visualization plays a key role in analysis
- Key applications of protein mutation matrix visualization
- Quicker elucidation of patterns
- Find those that were not immediately obvious
- Evolution studies
- Protein engineering
- Image manipulation and analysis techniques may be applied to images

Current activities + future targets
- Matrix generation tool development
- Prototype hosted at the CSRC site
- Migrate from MATLAB to open source software (i.e. Scilab/Octave + Perl/Tcl/Tk)
- Offer full matrix generation + visualization package online

THANK YOU! QUESTIONS?