IMMUNOINFORMATICS: How structural bioinformatics impacts immunology
Julia Ponomarenko, jpon@sdsc.edu
PHARM 201, November 18, 2011

Julia Ponomarenko, Ph.D.
San Diego Supercomputer Center, SSPPS
• M.Sc., Physics, Novosibirsk, Russia
• Ph.D., Biology, Novosibirsk, Russia
• 2001 – GlaxoSmithKline Pharmaceuticals, USA
• 2002 – Research Scientist, Prof. Bourne’s Lab, UCSD
• 2004 – Lead of the structural group at UCSD developing the Immune Epitope Database (IEDB) (PI: Alex Sette, LIAI)
• 2008 – PI for NIH:
    Transcriptional regulation of stimulus-responsive gene expression programs (with Alex Hoffmann, Biochemistry)
    Data integration and tools for systems biology (with M. Baitaluk, SDSC)

Swine-origin H1N1 influenza virus (S-OIV)
• In mid-April 2009, 2 cases of a unique combination of North American and Eurasian swine-lineage H1N1 influenza virus occurred in California in people not exposed to pigs
• Neutralizing antibodies against S-OIV were found exclusively in persons born before 1957
• That raised the concern that little protective immune memory exists in the general human population
(The New York Times, 2009)

• April 2009: Examined pre-existing immunity against S-OIV, i.e., whether epitopes that were present in the seasonal H1N1 flu strains of 1988-2008 are also present in the S-OIV strains
• May 2009: Found that 69% (54/78) of the epitopes recognized by CD8+ T cells were completely invariant
• August 2009: Confirmed experimentally that memory T-cell immunity against S-OIV is present in the adult population and is of similar magnitude to the pre-existing memory against seasonal H1N1 influenza

Immunoinformatics is the application of (bio)informatics techniques to study the immune system
Figure labels: BIOINFORMATICS, IMMUNOINFORMATICS

National Institutes of Health, FY 2009
Total budget: $30.3B
Institute                                                Budget   Share
National Cancer Institute                                $4.97B   16%
National Institute of Allergy and Infectious Diseases    $4.70B   16%
National Heart, Lung, and Blood Institute                $3.02B   10%
National Institute of General Medical Sciences           $2.00B    7%

Immunoinformatics goals
• Understanding and modeling the immune system at the levels of cells, tissues, whole organisms, and populations
• Designing medical diagnostics and vaccines for cancers, allergies, infectious and autoimmune diseases

Immunoinformatics: Areas of Study
• Immunological databases
• Epitope discovery: antigen recognition
• Evolution of the immune system
• Evolution of pathogens and co-evolution of host and pathogen
• Modeling of host-pathogen interactions
• Regulatory networks in the cells of the immune system
• Mathematical modeling of immunological memory
• Computational models of the immune system

Roadmap
• Vaccines
• Immune system
• Therapeutic Vaccines
• Epitope Discovery

Roadmap: Vaccines

Vaccine types
• ATTENUATED: Live but weakened whole virus or bacterium. Minimal reproduction extends immune cells’ exposure to antigen without causing disease. Examples: measles, Mycobacterium tuberculosis
• INACTIVATED: Whole but “killed” and unable to reproduce or to cause disease. Examples: rabies, flu
• SUBUNIT and recombinant: Fragments of the pathogen
    Toxoids (inactivated pathogen toxins): tetanus, diphtheria
    Recombinant viral capsid proteins: hepatitis B, HPV
    Purified bacterial polysaccharides: meningococcal meningitis
    Pathogen antigens conjugated with toxoid: Haemophilus influenzae type b meningitis
(Forbes.com)

Plasmodium falciparum malaria vaccine
RTS,S/AS is a recombinant vaccine based on the hepatitis B surface antigen virus-like particle (VLP) platform, genetically engineered to include the carboxy terminus (amino acids 207-395) of the P.
falciparum circumsporozoite (CS) antigen. CS covers the entire surface of sporozoites, the form of the malaria parasite inoculated into humans by female anopheline mosquitoes. The asparagine-alanine-asparagine-proline (NANP) amino acid repeat sequence forms the immunodominant B-cell epitope of the P. falciparum CS antigen. This sequence is species-specific but highly conserved among isolates of each species. RTS,S/AS01 induces very high IgG concentrations against the NANP CS repeat in vaccinated humans. In addition, this vaccine induces moderate to high CD4+ Th1 responses against flanking region peptides.

Vaccines have been made for 36 of >1,400 human pathogens (+HPV)
Emerg Infect Dis. 2005;11(12):1842

Alternative vaccines
• Peptide-based
    Quimi-Hib (Cuba, 2003): the first human vaccine against Haemophilus influenzae type b (Hib), a bacterium that causes meningitis and pneumonia in children. In the USA a conjugate vaccine is used (1988; very expensive)
    Peptide vaccine against canine parvovirus (causes enteritis and myocarditis in dogs and minks)
• Experimental technologies
    Recombinant vector
    DNA vaccines

Roadmap: Immune system

Vaccines mimic infection to avert it

How many lymph nodes do humans have?
Vaccines mimic infection to avert it
Figure: Xenoreactive complex AHIII 12.2 TCR bound to P1049 (ALWGFFPVLS)/HLA-A2.1 (PDB: 1lp9). Labels: T-cell receptor (Vα, Vβ), MHC class I, β2-microglobulin.

MHC class I pathway
Figure labels: intracellular pathogen (virus, mycobacteria), cytosolic protein, proteasome, peptides, TAP, ER, MHC I, CD8 epitope, TCR, CD8, CTL (TCD8+); presenting cell: any cell.

Figure: Complex of a human TCR, influenza HA antigen peptide (PKYVKQNTLKLAT), and MHC class II. Labels: T-cell receptor (Vα, Vβ), MHC class II.

MHC class II pathway (shown next to the MHC class I pathway)
Figure labels: extracellular protein, endosome, peptides, MHC II, CD4 epitope, TCR, CD4, TCD4+; presenting cell: B-cell, macrophage, or dendritic cell. One step in the diagram is marked "?".

During embryonic development, regions of V genes combine with D, J, and C genes to produce 1.0E+15 different antibodies
Figure: IgG2a intact mouse antibody, Mab231 (PDB ID 1igt). Labels: Fab fragment, Fv fragment, Fc fragment, VL, VH, CL, CH, light chain, heavy chain.

Interaction between APC and Th
http://www.youtube.com/watch?v=M48qu5c7Cfg&NR=1

Antibody affinity maturation (great video)
http://www.youtube.com/watch?v=qGsyBwDVnTU&feature=related

Epitope
Figure: HIV-1 envelope protein gp120 (core fragment), CD4 (N-terminal two-domain fragment), and antibody 17b (Fab fragment), with the 17b epitope marked. PDB: 1gc1

Roadmap: Therapeutic Vaccines

Immunotherapy: Monoclonal Antibodies
• Alemtuzumab: for leukemia
• Infliximab: for Crohn’s disease and rheumatoid arthritis
• Rituximab: for non-Hodgkin’s lymphoma
• Trastuzumab (Herceptin): for breast cancer
• Basiliximab and daclizumab: block the IL-2 receptor; immunosuppressives for transplants
Movie showing how rituximab works: http://www.youtube.com/watch?v=UtNeImBmQCM&feature=related 3m 40
sec

Cancer immunotherapy

A therapeutic patient-targeted prostate cancer vaccine: Provenge
• Approved by the FDA in 2010
• The median survival time for treated patients was 25.8 months, compared with 21.7 months for placebo-treated patients
Video: http://www.provenge.com/how-provenge-works.aspx (3 min)

Re-engineered T-cells kill B-cells affected by chronic lymphocytic leukemia
• Tiny magnetic beads force the larger T-cells to divide before they are infused into the patient
http://www.nytimes.com/2011/09/13/health/13gene.html?pagewanted=all

Re-engineered T-cells kill B-cells affected by chronic lymphocytic leukemia
• To survive without B-cells, the patients need periodic infusions of IVIG (intravenous immunoglobulin): pooled IgG antibodies extracted from the plasma of over one thousand blood donors
• IVIG's effects last between 2 weeks and 3 months
• IVIG is an infusion of IgG antibodies only. Therefore, peripheral tissues that are defended mainly by IgA antibodies, such as the eyes, lungs, gut, and urinary tract, are not fully protected by the IVIG treatment.
http://www.nytimes.com/2011/09/13/health/13gene.html?pagewanted=all

More to read about cancer immunotherapies:
http://www.ncbi.nlm.nih.gov/pubmed/20706612
http://www.ncbi.nlm.nih.gov/pubmed?term=20187092

Roadmap: Epitope Discovery

Three types of epitopes
• T-cell, MHC class I. Figure: Xenoreactive complex AHIII 12.2 TCR bound to P1049 (ALWGFFPVLS)/HLA-A2.1 (PDB: 1lp9); labels: T-cell receptor (Vα, Vβ), MHC class I, β2-microglobulin
• T-cell, MHC class II. Figure: Complex of a human TCR, influenza HA antigen peptide (PKYVKQNTLKLAT), and MHC class II; labels: T-cell receptor (Vα, Vβ), MHC class II
• B-cell or antibody epitopes. Figure: HIV-1 envelope protein gp120 (core fragment), CD4 (N-terminal two-domain fragment), and antibody 17b (Fab fragment), with the 17b epitope marked (PDB: 1gc1)

B-cell (magenta, orange) and T-cell (blue, green, red) epitopes of lysozyme. PDB: 1dpx

Why know epitopes?
• Vaccines: the epitope should be able to elicit a T-cell response and/or the production of antibodies neutralizing the pathogen
• Diagnostics: the epitope should bind in vitro the antibody being tested for
    Early diagnostics of infectious diseases: SARS (2004), malaria, Chagas' disease, leishmaniasis (2003), Lyme disease (2005)
    Autoimmune diseases: lupus, rheumatoid arthritis
    Allergic reactions

Data for epitope discovery: Pathogen Databases
• HIV databases
• Pathogen Database at Los Alamos National Laboratory
    Sequences of oral pathogens (18 bacteria and 5 viruses)
• Influenza Virus Resource at NCBI
• NMPDR, the National Microbial Pathogen Data Resource
    Sequences of 670 bacterial, 44 archaeal, and 29 eukaryotic genomes
• Airborne Pathogen Database: 27 pathogens
• Six Bioinformatics Resource Centers (BRCs)
• Virulence Factor Databases

Data for epitope discovery: Immune Genes and Diseases
• IMGT/GENE-DB: IG and TR genes from human, mouse, rat, and rabbit
• IMGT/LIGM-DB: IG and TR genes, > 250 species
• IPD databases (@EBI): other genes
• IPD-MHC (includes IMGT/MHC-NHP) (@EBI): MHC alleles for non-human species
• IMGT/HLA (@EBI): HLA (human MHC) class I and II alleles
• HPTAA: potential tumor-associated antigens
• Allele Frequency Database

MHC (Major Histocompatibility Complex), aka HLA (Human Leukocyte Antigen) in humans
• The HLA complex contains more than 220 genes
• Most humans are heterozygous and express two copies each of three MHC class I molecules (two alleles of the HLA-A, -B, and -C genes) and three MHC class II molecules (HLA-DR, HLA-DP, HLA-DQ), inherited from both parents
• Different species have different numbers of active MHC genes; e.g., the rhesus macaque has 22 MHC class I genes
• HLA genes are the most polymorphic in the human genome

How many HLA alleles are known?
                Current   A year ago   Two years ago
All HLA         7,059     5,674        4,161
HLA class I     5,468     4,383        3,007
HLA class II    1,591     1,291        1,154
Data from IMGT/HLA @EBI

Populations differ by allele frequencies
Why?
MHC polymorphism affects a population's susceptibility to a wide range of diseases and pathogens

Why know allele frequencies by population?
• Population-optimized diagnostic tests: designing reagents for HLA typing, such as primers or probes
• Population-optimized epitope-based vaccines:
    A vaccine should be effective for a sufficiently large percentage of a given population
    At the same time, it should contain a minimum number of epitopes to limit the cost of approval, quality control, production, etc.

Most of the highly polymorphic MHC residues are in the peptide binding pocket
• A*02 vs A*24: 14 of 26 polymorphic residues bind peptide
• Phe9 in A*02 interacts with Ile of GILGFVFTL
• Ser9 in A*24 interacts with Tyr of VYGFVRACL
• The mutation from bulky Phe9 in A*02 to small Ser9 in A*24 makes the HLA binding pocket deeper and able to accommodate bulky Tyr
Figure: binding pockets, A*02 vs A*24.

Data for epitope discovery: Immune Epitopes
• IEDB database
• AntiJen database
• HIV Los Alamos database

Rotation Student Project: Further Development of EpitopeViewer
(Beaver J., Bourne P., Ponomarenko J. BMC Immunome Research 2008)
• For the structures of TCR-MHC-peptide complexes, visualize interactions between TCR-MHC, TCR-peptide, and MHC-peptide
• Visualize CDR regions of antibodies and TCR
• Visualize the user’s submitted data through the web
• Make it in JMOL

Prediction of MHC class I epitopes
• Proteasomal cleavage sites (several methods exist, based on a small amount of in vitro data)
• Peptide-TAP binding (ibid.)
• Peptide-MHC binding
• Prediction of pMHC-TCR binding
Figure: MHC class I pathway. Labels: intracellular pathogen (virus, mycobacteria), cytosolic protein, proteasome, peptides, TAP, ER, MHC I, CD8 epitope, TCR, CD8, CTL (TCD8+); presenting cell: any cell.

Measuring and predicting MHC class I binding
Peptide sequence   IC50
QIVTMFEAL          3.6
LKGPDIYKG          308
NFCNLTSAF          50,000
AQSQCRTFR          38,000
CTYAGPFGM          143
CFGNTAVAK          50,000
...
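The measured values in the table above span four orders of magnitude, so they are easier to compare on a log scale. The following is a minimal Python sketch, not the lecture's own code. It assumes the IC50 values are in nM (the table does not state a unit) and borrows the 500 nM binder cutoff quoted on the benchmark slides; the names `measured_ic50`, `BINDER_CUTOFF_NM`, and `log_ic50` are illustrative.

```python
import math

# Peptide sequences and measured IC50 values copied from the table above.
# Assumption: values are in nM; the slide does not state the unit.
measured_ic50 = {
    "QIVTMFEAL": 3.6,
    "LKGPDIYKG": 308,
    "NFCNLTSAF": 50_000,
    "AQSQCRTFR": 38_000,
    "CTYAGPFGM": 143,
    "CFGNTAVAK": 50_000,
}

# Assumption: binder cutoff taken from the benchmark slides (IC50 < 500 nM).
BINDER_CUTOFF_NM = 500


def log_ic50(ic50_nm):
    """Return log10 of an IC50 value; a lower value means higher affinity."""
    return math.log10(ic50_nm)


# Sort from strongest to weakest binder and label each peptide.
ranked = sorted(measured_ic50.items(), key=lambda item: item[1])
labels = {seq: ic50 < BINDER_CUTOFF_NM for seq, ic50 in ranked}

for seq, ic50 in ranked:
    kind = "binder" if labels[seq] else "non-binder"
    print(f"{seq}  IC50={ic50:>8}  log10={log_ic50(ic50):.2f}  {kind}")
```

Under these assumptions three of the six listed peptides fall below the cutoff; a different cutoff or unit would change the labels but not the ranking.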
Predicting binding peptides means finding a function F_i such that F_i(Sequence) ≈ Affinity
• log(IC50) ~ binding free energy
• Low IC50 means high affinity
• The half maximal inhibitory concentration (IC50) is a measure of the effectiveness of a compound (here, a peptide) in inhibiting a biological or biochemical function (here, binding to MHC). It indicates how much of the compound is needed to inhibit that function by half.

Calculate scoring matrix from affinities
• Function F is a matrix
• F(sequence) = sum of the matrix entries selected by the residues of the sequence S
• Find the matrix that minimizes the differences F(S) - Affinity(S)

log(IC50)   Peptide
0.50        FQPQNGSFI
0.72        ISVANKIYM
2.37        RVYEALYYV
3.42        FQPQSGQFI
3.46        LYEKVKSQL
4.07        FKSVEFDMS
4.18        FQPQNGQFH
4.24        VLMLPVWFL
4.39        YMTLGQVVF
4.40        EDVKNAVGV
4.90        VFYEQMKRF
…

HLA A*0201 scoring matrix (rows: peptide position; columns: amino acid)
Pos     A     C     D     E     F     G     H     I     K     L     M     N     P     Q     R     S     T     V     W     Y
1    -0.3   0.2   0.8   0.6  -1.3  -0.2   1.1  -0.4  -0.3   0.0  -0.7  -0.1   1.2   0.4  -0.2  -0.3  -0.2  -0.1   0.0  -0.3
2     0.8   0.9   0.9  -0.4   0.5   0.1   0.9  -0.7   0.0  -1.9  -1.2   0.3   0.5  -1.1   0.9   0.1  -0.5  -0.9   0.7   0.2
3    -0.3   0.0  -0.4   0.7  -0.5   0.3  -0.1  -0.4   1.1  -0.4  -0.7   0.1   0.6   0.0   1.0   0.1   0.1  -0.1  -0.5  -0.6
7     0.0   0.1   0.4  -0.2  -0.3   0.3   0.0  -0.5   0.9   0.0   0.0   0.0  -0.4  -0.3   0.7  -0.2   0.2   0.1  -0.3  -0.4
8     0.0   0.2   0.3  -0.2  -0.4  -0.1   0.2   0.5   0.2  -0.1   0.0   0.2  -0.5   0.2   0.0  -0.1   0.0   0.1  -0.1  -0.3
9    -0.9   0.4   0.6  -0.5  -0.8   0.2   0.8  -1.4   0.9  -1.1  -0.8   0.7   0.7   0.7   0.9   0.2  -0.1  -1.9   0.4   0.8

Positions 4, 5, 6 (values in source order, three per row):
   4     5     6
-0.3  -0.2  -0.3
 0.3  -0.5  -0.1
-0.3   0.3   0.2
-0.2   0.1  -0.4
 0.1  -0.1   0.0
-0.1   0.0   0.4
 0.4   0.1   0.2
 0.1  -0.1  -0.4
 0.1   0.1   0.6
-0.2   0.0  -0.2
 0.2  -0.6   0.0
-0.3  -0.1  -0.3
-0.3   0.4   0.0
-0.1   0.4  -0.2
 0.3   0.1   0.4
-0.4   0.1   0.3
 0.4   0.1  -0.5
 0.2   0.0  -0.3
-0.2  -0.1   0.2
 0.2   0.0   0.4

Second matrix (not labeled in the source), positions 1-3
Pos     A     C     D     E     F     G     H     I     K     L     M     N     P     Q     R     S     T     V     W     Y
1    -0.3   0.3   0.8   0.3   0.4   0.2  -0.3  -0.4  -0.7  -0.4  -0.6   0.2   0.6   0.0  -0.7  -0.3   0.3  -0.1   0.5   0.2
2    -0.2   0.4   0.4   0.3   0.7   0.4   0.1  -0.7   0.9  -0.7  -1.0   0.4   0.5  -0.7   0.9  -0.5  -1.2  -0.5   0.3   0.3
3     0.1   0.0   0.6   0.4  -0.5   0.2   0.1  -0.3   0.5  -0.3  -0.5  -0.4   0.4  -0.1   0.2   0.0   0.3   0.0  -0.5  -0.5

Performance measure for prediction methods
Figure: peptides ordered by predicted score (binding affinity value); a score threshold splits them into TP and FP on one side and FN and TN on the other. Second panel: ROC curve.
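The score-threshold picture above can be made concrete with a small counting sketch. This is an illustration only: the scores and labels are made-up toy values (chosen to give TP = 6, FN = 1, TN = 6, FP = 2), the function name `confusion_counts` is hypothetical, and the convention that a score at or above the threshold means "predicted binder" is an assumption.

```python
def confusion_counts(scores, is_binder, threshold):
    """Count TP, FP, FN, TN for one score threshold.

    Assumed convention: a peptide is a predicted binder when its
    score is at or above the threshold.
    """
    tp = fp = fn = tn = 0
    for score, actual in zip(scores, is_binder):
        predicted = score >= threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Toy data (hypothetical): 7 actual binders followed by 8 actual non-binders.
scores = [9.1, 8.7, 8.2, 7.9, 7.5, 6.8, 3.0,
          6.5, 6.1, 4.9, 4.4, 3.8, 3.1, 2.5, 1.2]
is_binder = [True] * 7 + [False] * 8

tp, fp, fn, tn = confusion_counts(scores, is_binder, threshold=6.0)
sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # 1 - false positive rate
print(tp, fp, fn, tn, round(sensitivity, 2), round(specificity, 2))
# prints: 6 2 1 6 0.86 0.75
```

Sweeping the threshold from high to low and plotting sensitivity against 1 - specificity at each step traces out the ROC curve.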
• TP + FN: actual binders (based on a defined threshold on binding affinity values)
• TN + FP: actual non-binders (ibid.)
• Sensitivity = TP / (TP + FN) = 6/7 = 0.86
• Specificity = TN / (TN + FP) = 6/8 = 0.75
Figure: ROC curve. x-axis: false positive rate, FP / (FP + TN); y-axis: true positive rate, TP / (TP + FN); the area under the curve is the AUC (or AROC).

Benchmarking predictions of peptide binding to MHC I
(Peters et al. PLoS Comput Biol. 2006 Jun 9;2(6):e65)
• 48 MHC class I alleles
• Length of peptides: 8-11 aa
• 48,828 data points {peptide, affinity value}
• 20 different methods were evaluated

Performance evaluation
Figure: measured IC50 [nM] versus predicted score for syfpeithi (r2 = 0.29) and bimas (r2 = 0.48).
Figure: ROC curves for predicting binders with IC50 < 500 nM (true positive rate versus false positive rate): bimas (AUC = 0.920), syfpeithi (AUC = 0.871), random (AUC = 0.5).

Consensus Rank
             IC50            Rank
Peptide      ann     smm     ann   smm   consensus
LTDLGLLYT    2       3       1     1     1
CSANNSHHY    20      33      2     2     2
LSIRGNSNY    80      137     3     4     3.5
FSDQIEQEA    189     89      4     3     3.5
QSSINISGY    200     4920    5     6     5.5
LSDSSGVEN    1400    403     6     5     5.5

Consensus works best
MHC           SMM     ANN     ANN+SMM
H-2_Db        0.912   0.933   0.933
H-2_Dd        0.853   0.925   0.910
H-2_Kd        0.936   0.939   0.949
H-2_Kk        0.770   0.790   0.796
HLA_A-0201    0.952   0.957   0.956
HLA_A-3001    0.941   0.947   0.952
HLA_A-6802    0.898   0.899   0.903
HLA_B-0702    0.964   0.965   0.966
HLA_B-0801    0.943   0.955   0.959
HLA_B-1501    0.952   0.941   0.952
HLA_B-2705    0.940   0.938   0.944
HLA_B-3501    0.889   0.875   0.889
HLA_B-5101    0.868   0.886   0.888
HLA_B-5301    0.882   0.899   0.902
HLA_B-5401    0.921   0.903   0.921
HLA_B-5801    0.964   0.961   0.966

Summary on peptide-MHC class I binding
• Large, quantitative peptide-MHC binding datasets are available
• The consensus approach gives AUC > 0.90
• The top 1% of predicted binders are actual MHC class I epitopes (based on testing of predicted epitope binding against CD8+ T-cell response in mice infected by vaccinia
virus)

MHC class II epitope prediction: Challenges
• The epitope length is 9-37 aa
• The peptide may have a nonlinear conformation
• The MHC binding groove is open at both ends, and residues outside the groove are known to affect peptide binding
Figure: Complex of a human TCR, influenza HA antigen peptide (PKYVKQNTLKLAT), and MHC class II. Labels: T-cell receptor (Vα, Vβ), MHC class II.

Benchmarking predictions of peptide binding to MHC II
(Wang et al. PLoS Comput Biol. 2007)
• 16 alleles
• 10,017 data points {peptide, affinity value}
• 9 different methods were evaluated: 6 matrix-based, 2 SVM, 1 QSAR-based
• AUC values varied from 0.5 to 0.83
• Comparison with 29 X-ray structures of peptide-MHC II complexes (14 different alleles): the success rate of binding core recognition was 21%-62%

Ab initio structure-based prediction of peptide-MHC class II binding
• Statistical pair potential (Zhang, DTU)
• Molecular dynamics (Wang, LIAI)
• Contact maps (Nikitas Papangelopoulos, UCSD)
• Benchmarked on 3,882 experimentally measured peptide-HLA DRB1*0101 binding affinities

Ab initio structure-based prediction of peptide-MHC class II binding
(Zhang et al., PLOS One 2010)
The low performance may be due to the complex nature of peptide-MHC class II interactions:
• Long peptides
• Contribution of flanking amino acids to binding
• Antigen processing

MD for peptide-MHC class II binding
• See review: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2981876
• Example: HLA-DP2-peptide binding (http://www.ncbi.nlm.nih.gov/pubmed/21898654): 247 peptide-HLA complexes (all possible mutations for 13 positions) were simulated to obtain the parameters of the binding matrix (20 aa x 13 peptide positions); applied to 457 peptides with known binding affinities to HLA-DP2, the method gave better predictions than existing sequence-based methods

Summary on peptide-MHC class II binding
• Large, quantitative peptide-MHC binding datasets are available
• But prediction is still poor
• New methods relying on both structural and peptide-MHC binding data should
improve the prediction.

Prediction of peptide-MHC I binding from sequence, binding, and structural data
(Jojic et al., Bioinformatics, 2006)
Method: threading of a peptide sequence onto the 3D structure of a complex of another peptide with the same or a similar (by sequence) HLA molecule, combined with machine learning on binding data
Results:
• The method outperformed all other sequence-based methods, except the ANN method (Nielsen et al. 2003) for some alleles.
• The method outperformed ANN when the available training data for an allele was small; e.g., for the B*4002 allele (119 data points) it gave an AUC of 0.82 vs. 0.75 (ANN).

Methods for antibody epitope prediction
• Sequence-based (suitable for linear epitopes only)
    Maximum sensitivity of sequence-based methods is 59%; maximum AUC is ~0.60
• Structure-based (antibody binding site prediction for a protein of a given 3D structure)
• Epitope mapping using peptide libraries, followed by reconstruction of the epitope on the surface of the protein 3D structure (if known or can be modeled)

Prediction methods: performance measures
TP = 6, FN = 10, TN = 127, FP = 13
Sensitivity = TP / (TP + FN) = 0.38
Specificity = 1 - FP / (TN + FP) = 0.91
Area under the ROC curve (AUC) = 0.5 * (sensitivity + specificity) = 0.64
Figure: ROC curve. x-axis: false positive rate, FP / (FP + TN); y-axis: true positive rate, TP / (TP + FN).

Benchmark of the methods on 42 X-ray structures of antibody-protein complexes
(Ponomarenko et al., BMC Bioinformatics, 2008)
Method                     AUC
Random method              0.50
PatchDock 1st model        0.58
DOT 1st model              0.59
CEP average                0.54
DiscoTope                  0.60
PEPITO                     0.63
Rubinstein et al., 2008    0.65
ElliPro average            0.53
ElliPro best               0.73
new method                 0.75
The chart also shows an "Ideal" method bar (AUC axis from 0 to 1).

ElliPro prediction for Plasmodium vivax ookinete surface protein Pvs25 [PDB:1Z3G, chain A]

The method basics
• Actual epitope: taken from the structure of the antibody-protein complex
• Generated epitope: surface residues inside the sphere of
radius R with the center at the actual epitope
• Non-epitopes are generated randomly on the rest of the protein surface with a sphere of radius R

Propensity of polar residues discriminated epitopes versus non-epitopes
Figure: distributions for epitopes and non-epitopes. The two ratios shown are [number of A in the epitope]/[number of all residues in the epitope] and [number of A on the surface]/[number of all residues on the surface].

Naïve Bayes classifier
Figure: the benchmark AUC chart again, with the new method at 0.75.

Low success rate of epitope predictions: Reasons & Perspectives
• The assumption “The epitope is a property of the antigen” is wrong
    Epitopes cover ~75% of a lysozyme surface (figure: two views of lysozyme rotated by 180°)
• The epitope needs to be considered in the context of a specific antibody
• There is not enough data to carry out statistical analysis of protein residue - antibody residue preferences
Figure: the number of X-ray structures of protein-antibody complexes in PDB, 1997-2012 (y-axis 0 to 350). Annotations: < 100 representative structures; < 200 representative structures.

Framework for predicting epitopes in the context of specific antibodies
• Select a pathogen(s)
• Obtain the panel of mAbs targeting the pathogen
• For each mAb:
    Identify antigens
    Measure affinity
    Determine epitope(s), utilizing functional and structural assays
    Determine neutralization capacity in vitro and in vivo
• Analyze data and develop algorithms for
predicting epitope and paratope for a given antigen-antibody pair

Converging on an HIV Vaccine

Critical Ab-antigen interactions are similar

Summary
• Knowledge of epitopes is essential for the development of vaccines and diagnostics
• The problem of epitope prediction is far from solved

Supplement slides

Videos
1. http://www.youtube.com/watch?v=M48qu5c7Cfg&NR=1 (2 min)
2. http://www.youtube.com/watch?v=qGsyBwDVnTU&feature=related (3m 30s)
3. http://www.youtube.com/watch?v=UtNeImBmQCM&feature=related (3m 30s)
4. http://www.provenge.com/how-provenge-works.aspx (~3 min)

Recommended Books
• Immunological Bioinformatics, Ole Lund et al., MIT Press, 2005
• Immunoinformatics: Predicting Immunogenicity In Silico, Ed.: Darren Flower, Humana Press, 2007
• In Silico Immunology, Eds.: Darren Flower & Jon Timmis, Springer, 2007
• Bioinformatics for Vaccinology, Darren Flower, Wiley-Blackwell, 2008

Recommended Journals
• Immunome Research
• Nucleic Acids Research
• BMC Immunology
• Journal of Molecular Recognition
• Immunogenetics
• Vaccine
• Journal of Immunology
• Molecular Immunology
• Bioinformatics
• Drug Discovery Today
• BMC Bioinformatics
• Applied Bioinformatics
• BMC Structural Biology
• In Silico Biology
• PLoS Computational Biology
• International Journal of Immunogenetics
• Immunity
• Methods in Molecular Biology
• PLoS One
• Biosystems

Immunological synapse
T-cell-antigen recognition and the immunological synapse. Johannes B. Huppa & Mark M.
Davis. Nature Reviews Immunology 3, 973-983 (December 2003)

Immunological synapse: 25–30 peptide–MHC complexes were required at the interface to induce T cells
Immunology, Volume 133, Issue 4, pages 420-425, 1 June 2011. DOI: 10.1111/j.1365-2567.2011.03458.x
http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2567.2011.03458.x/full#f1

Large-scale molecular dynamics simulation of a membrane-embedded TCR–pMHC–CD4 complex at the atomistic level
• The TCR–pMHC–CD4 complex is composed of two crystal structures: the CD4 four-domain molecule [PDB:1WIO] and the TCR–pMHCII complex [PDB:1FYT].
• The CHARMM and VMD packages were used for preparing the initial molecular models and analyzing the simulation data.
• Modeller was used to build the transmembrane and extracellular loops missing in the X-ray structures.
• Explicit-solvent molecular dynamics simulations were performed using NAMD.
• The computed structural and thermodynamic properties were in fair agreement with experiment.
http://www.ncbi.nlm.nih.gov/pubmed/17980430