Project in Bioinformatics
Variability of Membrane Proteins of Different HIV Strains
By Emad Nimer, Wisam Kadry

HIV groups & subtypes
HIV types: HIV-1, HIV-2
• HIV-2: less easily transmitted; longer period between infection and illness
• HIV-1 groups: M (predominant), O (Cameroon)
• M subtypes: A ... J (Europe)

The 20 chosen strains:
• HIV-1: O (1 strain), M (13 strains from subtypes A, D, E, F, G, J)
• HIV-2 (6 strains)

HIV Infection

The Membrane Protein gp160
gp160: gp120 + gp41

Project goal
Explain the regions of conservation and variability in the gp160 protein across different HIV strains.

Methodology & Tools
1. Extract the gp160 sequences: HIV sequence database, http://hiv-web.lanl.gov/
2. Detect conserved/variable residues: multiple alignment with MultAlin/ClustalW
3. Detect motifs: MEME
4. Detect conserved 2D (secondary) structure: PSIPRED

Expected Results
gp160 domain map (residue numbers):

gp120 (1-509)
  SIG     1-36
  V1/V2   131-185
  V3      345-370
CS        510-511
gp41 (512-856)
  FD      512-527
  HR1     546-579
  HR2     628-655
  TM      685-705
  CYT D   706-856

SIG: signal peptide; V1/V2/V3: loops; CS: cleavage site; FD: fusion domain; HR1/2: heptad repeats; TM: transmembrane domain; CYT D: cytoplasmic domain

Results: conserved regions

Domain         Residues                     Explanation
gp120          37-50, 124-130, 280, 425     CD4 binding sites
gp120          120/121, 418/419, 421/422    CCR5 binding sites
gp120          88, 198, 241, 339            Glycosylation sites
gp120/gp41     511                          Cleavage site
gp41           685-705                      Transmembrane domain

Conserved regions within groups

Domain   Residues               Explanation
gp120    1-36                   Signal peptide: targeting to, and translocation across, cell membranes
gp120    230-340                Bridging sheet: likely includes components of the CCR5 binding site
gp120    283, 368, 370, 472-474 CD4 binding sites
gp41     512-527                Fusion domain
gp41     546-579, 628-655       HR1 and HR2: two heptad-repeat motifs (see the motif results)
gp41     706-856                Cytoplasmic domain: contains sequences critical for CD4 degradation (different groups show different levels of CD4 degradation)

Results: variable regions

Domain   Residues            Explanation
gp120    131-185 (V1, V2)    V1/V2 loops, part of the CD4 binding site. Their variability prevents antibodies from blocking CD4 binding. Loops have a flexible structure, which explains the low consensus.
gp120    345-370 (V3)        V3 loop; includes CCR5 binding sites.

[Figure: MultAlin alignment, SIG region]
[Figure: MultAlin alignment, CD4 binding site and V1/V2 loops]

MEME: finding motifs
[Figure: MEME motif map, numbered motif blocks per strain, residue axis 475-800]

2D Structure Prediction of gp160
[Figure: predicted secondary structure (α, β, C) drawn over the gp160 domain map: gp120 with V1/V2 (131-185) and V3 (345-370); cleavage site at 511; gp41 with HR1 (546-579), HR2 (628-655), TM (685-705) and Cyto (to 856)]

Conclusions
1. Conserved regions/residues play an important role in the function of gp160, for example the transmembrane domain (high affinity to the lymphocyte cell membrane), the CD4/CCR5 binding sites and the glycosylation sites.
2. Regions conserved within groups account for group-specific traits, such as different levels of infection. Why is HIV-1 more dominant?
• More binding sites
• Higher level of CD4 degradation (cytoplasmic domain)
• Different pre-fusion complexes (HIV-1 has 6 heptads and HIV-2 has 3)
3. Variable regions are loops near binding sites; their variability disrupts antibody performance (that is why HIV is so dangerous).

Future research
1. Investigate the variable regions V1/V2/V3: is mutation random?
2. Our results showed that HIV-1 has 6 heptad repeats while HIV-2 has only 3. Is that true? (Try other tools.)
3. Include more strains to obtain better sampling.
4. Investigate the conserved/variable regions of the Gag protein. Is there any interaction or correlation between the Gag protein and gp160?
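
Appendix: scoring conservation per alignment column
Step 2 of the methodology (detecting conserved/variable residues from the multiple alignment) could be approximated by scoring every alignment column. The sketch below is not the procedure used in this project, which read conservation off the MultAlin/ClustalW output. It uses per-column Shannon entropy instead, and the function names `column_entropy` and `classify_columns`, the 0.5-bit cutoff and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps ('-')."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # p * log2(1/p) summed over the residue types seen in this column
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def classify_columns(aligned, max_entropy=0.5):
    """Return (position, entropy, label) for each column of a gapped alignment.

    aligned     -- list of equal-length strings, one per strain
    max_entropy -- illustrative cutoff in bits (an assumption, not a project value)
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for pos, column in enumerate(zip(*aligned), start=1):
        h = column_entropy(column)
        label = "conserved" if h <= max_entropy else "variable"
        result.append((pos, h, label))
    return result


# Toy alignment invented for illustration; these are NOT real gp160 sequences.
toy_alignment = [
    "MKVCA-T",
    "MKVCS-T",
    "MRVCGQT",
    "MKVCTQT",
]

for pos, h, label in classify_columns(toy_alignment):
    print(f"{pos:3d}  H={h:.2f}  {label}")
```

With real data, the gapped gp160 sequences exported from the alignment tool would replace `toy_alignment`. Runs of low-entropy columns would then be candidates for conserved regions such as the binding sites, and runs of high-entropy columns for the variable loops. The cutoff needs tuning, since ignoring gaps makes gap-rich columns look more conserved than they are.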