Inter-species sequence conservation and intra-species sequence diversity
Apratim Mitra

Background
- Sequence conservation across species is well documented.
- Genes coding for the same or similar proteins, even in evolutionarily distant organisms, have been observed to have remarkable similarities.

Background (Contd.)
- At the same time, proteins with similar functions, even in the same species, can show a bewildering diversity.
- E.g., immunoglobulins (commonly called ‘antibodies’).

Aim of this project
- To demonstrate intra-species sequence diversity and inter-species sequence conservation using various web-based resources and tools, e.g., NCBI, GenBank, ClustalW.
- To investigate a new way of visualizing multiple alignment results.

Cumulative alignment profile
- We produce a pairwise alignment with an alignment program such as ClustalW or MUSCLE.
- Using BLOSUM/PAM substitution matrices and gap opening/extension penalties, we build a cumulative alignment score profile from this alignment.
- In addition to global sequence similarity, this profile would include spatial information, i.e., where along the alignment the score is gained or lost.

Plan of Action
1. Pool sequences of the same or similar genes from different species, and of proteins (e.g., IgG) from the same species that exhibit diversity.
2. Run multiple alignment and clustering programs to obtain phylogenetic trees hinting at evolutionary relationships.
3. Transform the alignment results into a cumulative alignment profile that indicates spatial features.
4. Cluster these profiles using correlation measures and obtain phylogenetic trees.
5. Compare the two results.

Schematic
- Collect sequences from online libraries.
- Align using ClustalW, MUSCLE, etc.
- Convert alignment scores into a ‘profile’ that indicates spatial information about the alignment.
- Cluster these profiles and compare with the phylogenetic tree obtained at the earlier step.

Why do it?
- Global pairwise alignment scores look at the entire alignment at once.
- An alignment profile that carries some spatial information would be a way of ‘improving’ interpretation.
- Sequences that have a high degree of similarity can be differentiated on the basis of ‘patterns’ of dissimilarity in their profiles.

Further Uses/Extensions
- This method could be useful when trying to find:
  - Differences between closely related species
  - Similarities between distant species
- It can be easily extended to multiple alignments, although the results might be hard to interpret.
- The cumulative profiles can also be analyzed by frequency or time-frequency methods such as Fourier transforms or wavelet analysis for feature extraction.

Thank You
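
The cumulative alignment profile and the correlation-based comparison of profiles described in the slides could be sketched roughly as follows. This is only a minimal illustration, not the project's actual implementation: the function names (`simple_score`, `cumulative_profile`, `pearson`) are hypothetical, the match/mismatch scorer is a stand-in for a real BLOSUM or PAM lookup, and the gap penalties are arbitrary example values.

```python
from math import sqrt


def simple_score(x, y):
    # Hypothetical stand-in for a BLOSUM/PAM lookup: +1 match, -1 mismatch.
    return 1 if x == y else -1


def cumulative_profile(aligned_a, aligned_b, score=simple_score,
                       gap_open=-2, gap_extend=-1):
    """Running alignment score at each column of a gapped pairwise alignment.

    Both inputs are rows of the same alignment (equal length, '-' for gaps).
    Penalty values are illustrative assumptions, not recommended settings.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    profile, total, gap_side = [], 0, None
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            step = 0  # column empty in both rows: contributes nothing
        elif x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            # Continuing a gap in the same row costs less than opening one.
            step = gap_extend if gap_side == side else gap_open
            gap_side = side
        else:
            step = score(x, y)
            gap_side = None
        total += step
        profile.append(total)
    return profile


def pearson(p, q):
    """Pearson correlation between two equal-length profiles."""
    if len(p) != len(q) or len(p) < 2:
        raise ValueError("profiles must have equal length >= 2")
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    var_p = sum((a - mp) ** 2 for a in p)
    var_q = sum((b - mq) ** 2 for b in q)
    if var_p == 0 or var_q == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / sqrt(var_p * var_q)


if __name__ == "__main__":
    print(cumulative_profile("ACGT-A", "AC-TTA"))  # [1, 2, 0, 1, -1, 0]
```

The final value of the profile equals the usual global alignment score, while dips in the curve mark where mismatches and gaps occur. A distance such as `1 - pearson(p, q)` could then feed a standard clustering routine. Profiles taken from alignments of different lengths would first need to be brought to a common length, which this sketch does not attempt.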